RAID 10 clean, degraded

  • Geaves, if you'll allow me to interrupt ...

    That disk seems to be fine (although a little hot at 40 °C). I would do the same with the others and compare the number of starts and/or the number of hours of operation. Assuming they were all purchased at the same time, a significantly lower count on this disk could be the result of a hardware failure preventing it from starting.
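    One way to collect those numbers for comparison is via smartmontools (a sketch; it assumes the usual smartctl -A attribute table, where the raw value is the last column, and the device names are just examples):

    ```shell
    # Print power-on hours for each given disk so the values can be compared.
    # Assumes smartmontools' "smartctl -A" attribute table format.
    poh() {
        for d in "$@"; do
            hours=$(smartctl -A "$d" | awk '$2 == "Power_On_Hours" {print $10}')
            echo "$d: ${hours:-n/a} hours"
        done
    }
    # e.g. poh /dev/sda /dev/sdb /dev/sde /dev/sdf
    ```

    A disk with a much lower figure than its siblings either joined the set later or has been dropping out.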

    The others... (2/3)

    OMV 5.6.24-1 (Usul) - Debian 10.11 (Buster)
    ASRock J5005-ITX - 16GB DDR4 - 4x WD RED 12TB (Raid10), 2x WD RED 4TB (Raid1) [OFF], Boot from SanDisk Ultra Fit Flash Drive 32GB - Fractal Design Node 304

  • The others... (3/3)



  • The disk in question has 11,303 hours of use and the other three have 15,197, 15,198 and 15,198.

    Did you create the Raid10 with new discs?

    Never, but last year something happened. I don't know if it is relevant; you can see here: OMV Raid Missing after reboot


    It is also possible that I have been using 3 disks instead of 4 for about 5 months ... or not? Isn't there a way to be notified if one of the 4 disks in the RAID10 goes missing or has problems? I only noticed it by chance while checking some settings. The S.M.A.R.T. test is performed every morning at 6:00 am on every disk, but I never got any alerts. Thank you.
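    Regarding the notification question: mdadm's own monitor daemon can mail an alert the moment an array degrades or a member fails. A sketch of the relevant config fragment, assuming the stock Debian mdadm package (which normally starts the monitor automatically); the address is a placeholder:

    ```shell
    # In /etc/mdadm/mdadm.conf: where the monitor daemon sends event mail
    # (events such as DegradedArray or Fail). Placeholder address:
    MAILADDR admin@example.com

    # One-off check that mail delivery works (sends a TestMessage per array):
    mdadm --monitor --scan --test --oneshot
    ```

    This is independent of the S.M.A.R.T. checks, which only watch the disks themselves, not array membership.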


  • Did you create the Raid10 with new discs?

    I'm sorry, my correct answer is: "yes, originally I created the Raid10 with 4 new disks". Thank you.


  • Initially, the drives appear to be fine, so mdadm most likely 'threw' the drive due to some intermittent fault; the drive certainly did not remove itself, otherwise the raid would have become inactive.


    I'm leaning toward an intermittent fault, which leaves either the SATA cable and/or the SATA port the drive is connected to.

    Raid is not a backup! Would you go skydiving without a parachute?


    OMV 6x amd64 running on an HP N54L Microserver

  • I'm leaning toward an intermittent fault, which leaves either the SATA cable and/or the SATA port the drive is connected to.

    I agree. Replacing the cable is definitely worth doing to start. But before shutting down to take care of the hardware inside the case, I think it is necessary to fix the array, or can we fix it later?


  • But before touching the hardware of an already degraded RAID you should take the time to do a backup (yes, it takes a long time given the amount of storage).

    If you got help in the forum and want to give something back to the project click here (omv) or here (scroll down) (plugins) and write up your solution for others.

  • I changed the cable; after reboot I got "Failed to start File System Check on /dev/disk/by-label/REDRAID4X12".


    What is the best option to continue? Thank you.



  • What is the best option to continue

    Wasn't expecting that; it would suggest there is a problem with that array, or at least its file system. The norm is to run fsck manually, but that is on a clean array; I have no idea what effect this might have on a degraded array, particularly a Raid10.


    So, log in as root and run fsck /dev/md0 and accept everything with a y (yes), or run fsck -y /dev/md0, which will correct errors without user input. I have no idea how long this will take given the size of the drives.


  • So, log in as root and run fsck /dev/md0 and accept everything with a y (yes), or run fsck -y /dev/md0, which will correct errors without user input. I have no idea how long this will take given the size of the drives.

    Thank you geaves, it's OK: fixed, the system is up and everything seems to be working fine. Obviously there is still the problem of the fourth disk not present in the array.


    These are the check commands:


    Code
    root@pandora:~# cat /proc/mdstat
    Personalities : [raid10] [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] 
    md0 : active raid10 sda[0] sdb[1] sde[2]
          23437508608 blocks super 1.2 512K chunks 2 near-copies [4/3] [UUU_]
          bitmap: 83/175 pages [332KB], 65536KB chunk
    
    unused devices: <none>
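    The [4/3] [UUU_] line above is the degraded-array indicator: 4 raid devices, only 3 active. A sketch of scripting such a check, assuming the /proc/mdstat format shown here:

    ```shell
    # Flag md arrays whose active-device count is below the raid-device
    # count, i.e. status tokens like [4/3] on the "blocks" lines.
    check_mdstat() {
        grep -E '\[[0-9]+/[0-9]+\]' "$1" | awk '{
            for (i = 1; i <= NF; i++)
                if (match($i, /^\[[0-9]+\/[0-9]+\]$/)) {
                    split(substr($i, 2, length($i) - 2), n, "/")
                    if (n[1] != n[2]) print "DEGRADED: " $0
                }
        }'
    }
    # e.g. check_mdstat /proc/mdstat
    ```

    Against the output above this would flag md0, since [4/3] shows one member missing; a healthy array reports something like [4/4] [UUUU] and prints nothing.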
    Code
    root@pandora:~# blkid
    /dev/sda: UUID="8b767a7d-c52c-068d-c04f-1a3cfd8d4c5f" UUID_SUB="3904f2f1-fe1f-bde3-a965-d9dbe0074f66" LABEL="pandora:Raid4x12TBWdRed" TYPE="linux_raid_member"
    /dev/sdb: UUID="8b767a7d-c52c-068d-c04f-1a3cfd8d4c5f" UUID_SUB="a6bb8aa8-4e9b-7f90-b105-45a9301acbce" LABEL="pandora:Raid4x12TBWdRed" TYPE="linux_raid_member"
    /dev/sdc1: UUID="2218-DC43" TYPE="vfat" PARTUUID="09f69470-ba7b-4b6b-9456-c09f4c6ad2ee"
    /dev/sdc2: UUID="87bfca96-9bee-4725-ae79-d8d7893d5a49" TYPE="ext4" PARTUUID="3c45a8f0-3106-4ba8-89bc-b15d22e81144"
    /dev/sdc3: UUID="856b0ba6-a0a9-49f2-81ef-27e24004aa98" TYPE="swap" PARTUUID="fda4b444-cf82-4ae8-b916-01b8244acee3"
    /dev/md0: LABEL="REDRAID4X12" UUID="5fd65f52-b922-45e3-a940-eb7c75460446" TYPE="ext4"
    /dev/sde: UUID="8b767a7d-c52c-068d-c04f-1a3cfd8d4c5f" UUID_SUB="6c9c5433-6838-c39f-abfa-7807205a3238" LABEL="pandora:Raid4x12TBWdRed" TYPE="linux_raid_member"
    /dev/sdf: UUID="8b767a7d-c52c-068d-c04f-1a3cfd8d4c5f" UUID_SUB="a0287edc-2404-a0cc-735b-3c99f2f923af" LABEL="pandora:Raid4x12TBWdRed" TYPE="linux_raid_member"
    Code
    root@pandora:~# fdisk -l | grep "Disk "
    Disk /dev/sda: 10,9 TiB, 12000138625024 bytes, 23437770752 sectors
    Disk /dev/sdb: 10,9 TiB, 12000138625024 bytes, 23437770752 sectors
    Disk /dev/sdc: 28,7 GiB, 30752636928 bytes, 60063744 sectors
    Disk identifier: 51328880-3F36-4C4F-A18D-76E5CF56DD7D
    Disk /dev/sdd: 100 MiB, 104857600 bytes, 204800 sectors
    Disk /dev/sde: 10,9 TiB, 12000138625024 bytes, 23437770752 sectors
    Disk /dev/md0: 21,8 TiB, 24000008814592 bytes, 46875017216 sectors
    Disk /dev/sdf: 10,9 TiB, 12000138625024 bytes, 23437770752 sectors
    Code
    # definitions of existing MD arrays
    ARRAY /dev/md0 metadata=1.2 name=pandora:Raid4x12TBWdRed UUID=8b767a7d:c52c068d:c04f1a3c:fd8d4c5f
    root@pandora:~# 
    root@pandora:~# 
    root@pandora:~# mdadm --detail --scan --verbose
    ARRAY /dev/md0 level=raid10 num-devices=4 metadata=1.2 name=pandora:Raid4x12TBWdRed UUID=8b767a7d:c52c068d:c04f1a3c:fd8d4c5f
       devices=/dev/sda,/dev/sdb,/dev/sde

    In Raid Management I have sda, sdb and sde, but not sdf. In Recovery -> Devices only the mysterious sdd is listed...


    What to do? Thank you so much for your support.


  • Although the system seems to work correctly, I have read the logs and there was a problem at boot: "quotaon: cannot find /srv/dev-disk-by-label-REDRAID4X12/aquota.user on /dev/md0 [/srv/dev-disk-by-label-REDRAID4X12]", and therefore "Failed to start Enable File System Quotas". Can this help us?


  • In Raid Management I have sda, sdb and sde, not sdf. In Recovery -> Devices there is listed only the mysterious sdd

    The recovery will not work in this case, as blkid has identified /dev/sdf with a raid signature; to use recovery you would have to wipe the drive first, a long process given the drive size. However, mdadm --add /dev/md0 /dev/sdf should add the drive back to the array.

    Can this help us?

    Yes, but I've never had to deal with this personally, and AFAIK this was disabled, though that may have been in OMV4. I'll tag a couple of the other mods who know more about this than I do: crashtest, macom


  • Although the system seems to work correctly, I have read the logs and there was a problem at boot: "quotaon: cannot find /srv/dev-disk-by-label-REDRAID4X12/aquota.user on /dev/md0 [/srv/dev-disk-by-label-REDRAID4X12]", and therefore "Failed to start Enable File System Quotas". Can this help us?

    There may be a number of errors found in the logs that are harmless. If this is not affecting you in any tangible way, other than the log entries, I wouldn't worry about it.

    Otherwise, you could give the following a try.
    ___________________________________________________________

    Turn the quota service off.


    sudo /etc/init.d/quota stop



    (In the following examples, substitute the appropriate labels for your drives.)


    sudo quotaoff --user --group /srv/dev-disk-by-label-DATA

    sudo quotaoff --user --group /srv/dev-disk-by-label-RSYNC

  • The recovery will not work in this case, as blkid has identified /dev/sdf with a raid signature; to use recovery you would have to wipe the drive first, a long process given the drive size. However, mdadm --add /dev/md0 /dev/sdf should add the drive back to the array.

    I made a backup (very long) and then wiped the 4th disk. At that point I did the raid recovery. Everything went smoothly, thanks.


    I then upgraded from OMV 4 to OMV 5, with a simultaneous upgrade from Debian 9 to Debian 10, following these instructions from dleidert. It all worked out for the best.


    I really appreciate the help from the forum. Thank you all.


  • zerozenit

    Added the "resolved" label.
