OpenMediaVault5 NAS [RAID Management] "State: clean, degraded", [File Systems] "Status: Missing"

  • Hello lovely people,


    I have a NAS running OpenMediaVault5 (version 5.5.13-1 (Usul)) with 11 HDDs in RAID5, which had 12TB of available space until today. This morning, all of a sudden, it started showing only 65GB available, which I suspect is actually the drive that OpenMediaVault5 is installed on. To avoid wasting space, I installed OpenMediaVault5 on an old 80GB (74.53GB real space) SATA drive.


    I suspect that something went wrong with the RAID5 management; here's why:

    - Storage -> Disks: all the disks are listed, so they are seen by the system.

    - Storage -> S.M.A.R.T. -> Devices: all the HDDs are shown and their status is "OK" (the green dot).

    Screenshot: https://i.ibb.co/6J7Z7kd/01.png


    - Storage -> Logical Volume Management -> Physical Volumes: only one device is shown (Available: 1.82TiB, Used: 1.82TiB), although there should be more devices than this one.

    Screenshot: https://i.ibb.co/C6h6pRT/02.png


    - Storage -> Logical Volume Management -> Volume Groups: there is one volume group (Available: 9.10TiB, Free: 0.00B, Physical volumes: [unknown] /dev/md126).


    Screenshot: https://i.ibb.co/JFs5J2n/03.png


    - Storage -> Logical Volume Management -> Logical Volumes: only one, with Capacity: 9.10TiB and Active: No. In my opinion there should be an additional logical volume, and for this existing one, Active should be "Yes" instead of "No".

    Screenshot: https://i.ibb.co/ydCYbZC/04.png



    - Storage -> RAID Management: there is only one RAID device, although in fact there should be a couple of them. For the existing one, the "State" is "clean, degraded" and the Capacity is 1.82TiB.

    Screenshot: https://i.ibb.co/YfDZ445/05.png


    - Storage -> File Systems: one out of the 6 file system devices has Status: Missing. In the "Mounted" column it looks like only one out of 4 (the 80GB HDD) is mounted (the other 3 are not, showing "No").

    Screenshot: https://i.ibb.co/5BhRnNF/06.png



    I don't know what to do at the moment in order to:

    a. diagnose and fully understand the problem and what caused it.

    b. keep the data on these 11 HDDs safe while I'm trying to sort this out.

    c. repair and remount the missing RAID devices without losing any data, to get back access to all HDDs.


    For the past month, I have to confess, I've been turning OFF my OMV5 NAS only by long-pressing the power button, and my fear is that, in the medium to long term, this can cause some internal or software errors that lead to the RAID management problems.

    What do you think could be the cause of the problem, and how can it be solved?

    Thank you in advance for your help.

  • ryecoaaron

    Approved the thread.
  • geaves

    Added the label OMV 5.x.
    • Official post

    with 11 HDDs in RAID5

    there is only one RAID device, although in fact there should be a couple of them

    The second comment contradicts the first: to have a single RAID5, all drives must be part of that single array. However, the RAID Management image shows an array using /dev/sdf2 and /dev/sdj2. That suggests partitions were used to create that array, so the array was not created using OMV's GUI, as OMV uses the whole drive when creating RAID arrays.
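    A quick way to confirm this from SSH (standard Linux tools, nothing OMV-specific) is to list the block devices; lsblk shows each drive, its partitions and, nested under them, the md array they belong to:

    Code
    lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
    cat /proc/mdstat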

    For the past month, I have to confess, I've been turning OFF my OMV5 NAS only by long-pressing the power button, and my fear is that, in the medium to long term, this can cause some internal or software errors that lead to the RAID management problems.

    What do you think,

    Well, at least you've noted your own error, and yes, it can cause corruption both to the file system and to OMV


    Follow this sticky and post each output in a code box (the </> symbol on the forum menu bar); this will make it easier to read
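    For reference, the outputs that sticky asks for are roughly the following, all run as root:

    Code
    cat /proc/mdstat
    blkid
    fdisk -l | grep "Disk "
    cat /etc/mdadm/mdadm.conf
    mdadm --detail --scan --verbose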


    Is this recoverable? Unknown. 11 drives in a RAID5 is not a good idea: RAID5 allows for one drive failure, and if 2 drives fail the array is toast. As you have used partitions, the missing partition in the image could be related to a drive in the other array.


    BTW OMV5 is EOL

  • Quote

    The second comment contradicts the first: to have a single RAID5, all drives must be part of that single array...

    Geaves, first of all, a BIG thank you for your time and effort to help me; I really appreciate it.

    Sorry for any confusion; I should've been clearer about the HDD configuration and the RAID.


    I will do my best to remember how they were connected almost 3 years ago, when this NAS was created.

    1. 1 x 80GB HDD (ST3802110AS, 74.53GB) used only for the OS (OMV5), standalone.

    2. 10 of the 11 HDDs were part of the same LVM (Logical Volume Manager) setup, distributed as follows:

    a). 2 x 6TB HDD in RAID1 (ST6000VX0023-2EF110 , ST6000VN001-2BB186 with 5.46TB, each)

    b). 3 x 3TB HDD in RAID5 (HUS724030ALA640, ST3000DM008-2DM166, WDC WD30EFRX-68EUZN0, with 2.73TB, each).

    c). 2 x 2 TB HDD in RAID 1 (2 x ST2000DM001-1CH164)

    d). 3 x 500 GB HDD in RAID 5 (ST3500312CS, ST3500418AS, WDC WD5000AADS-00S9B0, 465.76 GB, each)


    Quote


    [...] the RAID Management image shows an array using /dev/sdf2 and /dev/sdj2. That suggests partitions were used to create that array, so the array was not created using OMV's GUI, as OMV uses the whole drive when creating RAID arrays.

    You are right. A friend of mine, who's a DevOps Team Lead and a Linux expert, discovered that OMV is based on Debian and helped me put the HDDs in RAID, create the LVM and everything else needed, using PuTTY. He tried to configure the HDDs in such a way as to get the most from my disks and keep my data as safe as possible. Because I was a total novice in OMV and didn't know how to use it properly, I accepted his help.


    Quote

    Well, at least you've noted your own error, and yes, it can cause corruption both to the file system and to OMV

    I've learned my lesson now.

    Quote

    Follow this sticky and post each output in a code box (the </> symbol on the forum menu bar); this will make it easier to read


    I did follow it, and here's the output:


    cat /proc/mdstat


    blkid


    fdisk -l | grep "Disk "


    cat /etc/mdadm/mdadm.conf


    mdadm --detail --scan --verbose

    Code
    root@BlackT:/# mdadm --detail --scan --verbose
    INACTIVE-ARRAY /dev/md127 num-devices=5 metadata=1.2 name=BlackT.local:r5-2tb UUID=89f0f506:35179a63:ad97870b:92e66241
       devices=/dev/sdb1,/dev/sde1,/dev/sdf1,/dev/sdi1,/dev/sdj1
    ARRAY /dev/md/BlackT.local:r5-1tb level=raid5 num-devices=3 metadata=1.2 spares=1 name=BlackT.local:r5-1tb UUID=137bec62:c8faf58a:f8793c53:72c5b132
       devices=/dev/sdb2,/dev/sdf2,/dev/sdj2
    INACTIVE-ARRAY /dev/md2 num-devices=1 metadata=1.2 name=localhost.localdomain:2 UUID=c3912ec0:9fd283df:d5ff048b:9effd01b
       devices=/dev/sdc1
    root@BlackT:/#

    I do hope that you will be able to confirm that this mess I'm in can be sorted out without losing data.


    Quote

    BTW OMV5 is EOL

    I've seen now that there's an OMV6, but to be honest, I've always had this fear that at some point I will need to update my OMV to a new version and my NAS will fall apart, probably because I will end up doing something wrong. Hopefully, if I'm able to restore my NAS, I will consider updating, after educating myself about how to do it.


    Thank you so much, geaves, once again (and everyone else who will go through my post) for your help.

    • Official post

    You are right. A friend of mine, who's a DevOps Team Lead and a Linux expert, discovered that OMV is based on Debian and helped me put the HDDs in RAID, create the LVM and everything else needed, using PuTTY. He tried to configure the HDDs in such a way as to get the most from my disks and keep my data as safe as possible. Because I was a total novice in OMV and didn't know how to use it properly, I accepted his help.

    Whilst your friend is obviously knowledgeable about Linux, OMV is based upon the KIS principle (Keep It Simple); everything he has done could have been completed from OMV's GUI


    Doing what he has done may make sense to him, but you have come to the forum to resolve an issue that is technically not OMV-related due to the way it was set up.


    Look at the output from cat /proc/mdstat: the array md126 contains drive references /dev/sdf2 and /dev/sdj2, and the array md127 contains drive references /dev/sdf1 and /dev/sdj1

    So the drives /dev/sdf and /dev/sdj are being used across 2 arrays due to the partitioning, something I would never do, because if there is an issue with the physical drive then it's going to be reflected across both arrays


    Going back to cat /proc/mdstat, md2 is referencing 1 partition. I have no idea if this is a RAID1 or RAID5; if it's a RAID5 it could be toast
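    If you want to check what md2 was, mdadm can usually read it from the member's metadata; something like this (sdc1 is the only member your --detail --scan output shows for md2):

    Code
    mdadm --detail /dev/md2
    mdadm --examine /dev/sdc1    # the "Raid Level" and "Raid Devices" lines show what it was created as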


    md126 might be fixable as it's in an (auto-read-only) state; as root from SSH, try mdadm --readwrite /dev/md126


    md127 is inactive, probably due to the powering off. Try mdadm --assemble --force --verbose /dev/md127 /dev/sdb1 /dev/sdf1  dev/sdi1 /dev/sdj1 /dev/sde1 which might reassemble the array, but at this moment in time I have no idea
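    Whatever the result, check it afterwards with:

    Code
    cat /proc/mdstat
    mdadm --detail /dev/md127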

  • Quote

    Look at the output from cat /proc/mdstat: the array md126 contains drive references /dev/sdf2 and /dev/sdj2, and the array md127 contains drive references /dev/sdf1 and /dev/sdj1


    So the drives /dev/sdf and /dev/sdj are being used across 2 arrays due to the partitioning, something I would never do, because if there is an issue with the physical drive then it's going to be reflected across both arrays


    Yes, you are right about the disks. I remember that we did that (my friend and I) in order to maximize the available space, with the intention of changing it once new, larger drives were bought in the future, which happened after a while, but I never got the chance to find time and change it.


    Quote

    Going back to cat /proc/mdstat, md2 is referencing 1 partition. I have no idea if this is a RAID1 or RAID5; if it's a RAID5 it could be toast


    So I think there are 1 or 2 HDDs in there (I can't remember, because they were plugged in a long time ago, in the hope that I would make time to add them) that are not part of any array. Thank you for pointing that out.


    Quote

    md126 might be fixable as it's in an (auto-read-only) state; as root from SSH, try mdadm --readwrite /dev/md126

    I've tried that and:

    Code
    root@BlackT:/# mdadm --readwrite /dev/md126
    mdadm: failed to set writable for /dev/md126: Device or resource busy
    Quote

    md127 is inactive, probably due to the powering off. Try mdadm --assemble --force --verbose /dev/md127 /dev/sdb1 /dev/sdf1  dev/sdi1 /dev/sdj1 /dev/sde1 which might reassemble the array, but at this moment in time I have no idea

    Code
    root@BlackT:/# mdadm --assemble --force --verbose /dev/md127 /dev/sdb1 /dev/sdf1  dev/sdi1 /dev/sdj1 /dev/sde1
    mdadm: looking for devices for /dev/md127
    mdadm: /dev/sdb1 is busy - skipping
    mdadm: /dev/sdf1 is busy - skipping
    mdadm: dev/sdi1 is busy - skipping
    mdadm: /dev/sdj1 is busy - skipping
    mdadm: /dev/sde1 is busy - skipping

    It looks like I didn't have too much luck this time. :(

    Do you have any other suggestions for me, please?

    • Official post

    It looks like I didn't have too much luck this time

    At least you're learning what not to do :)


    OK, the problem is the partitions across different arrays.


    Try this one first:


    mdadm --stop /dev/md127 and you should get a message that the array has stopped, then:


    mdadm --assemble --force --verbose /dev/md127 /dev/sdb1 /dev/sdf1  dev/sdi1 /dev/sdj1 /dev/sde1


    If that works, then try stopping the array for md126 and then run the --readwrite option from post #4


    I'm not hopeful about any of this at this moment in time; normally the --readwrite option can be run on an active array without it erroring

  • Quote

    At least you're learning what not to do :)

    Yes, true. Thank you for your help, very much appreciated.


    Quote

    mdadm --stop /dev/md127 and you should get a message that the array has stopped

    It looks like it really did. I got this in return:

    Code
    root@BlackT:/# mdadm --stop /dev/md127
    mdadm: stopped /dev/md127
    Quote
    Code
    mdadm --assemble --force --verbose /dev/md127 /dev/sdb1 /dev/sdf1  dev/sdi1 /dev/sdj1 /dev/sde1

    After running it, I've got:


    Quote

    If that works, then try stopping the array for md126 and then run the --readwrite option from post #4

    I think you meant to run the command:

    Code
    mdadm --stop /dev/md126

    didn't you? But I'm not sure what "the --readwrite option from post #4" means.

    Code
    root@BlackT:/# mdadm --stop /dev/md126
    mdadm: Cannot get exclusive access to /dev/md126:Perhaps a running process, mounted filesystem or active volume group?

    Is there anything else that I need to do?

  • After yesterday's post, I turned off my NAS; now it's back on and I've run the commands you advised (I'm mentioning this because I don't know if it could affect the answers I've got now, or how yesterday's commands relate to today's).


    Most of the time, when I don't need my NAS, I turn it off to save energy, because I only use it to store data and I don't need it daily.


    I know OMV has an option to spin down/turn off HDDs that are not in use, but I don't know how to use it, and fear, I think, made me stay away from things I don't know yet.

    Quote


    OK try stopping md127, then run the readwrite command for md126, so mdadm --readwrite /dev/md126

    Code
    root@BlackT:/# mdadm --stop /dev/md127
    mdadm: Cannot get exclusive access to /dev/md127:Perhaps a running process, mounted filesystem or active volume group?
    root@BlackT:/# mdadm --readwrite /dev/md126
    mdadm: failed to set writable for /dev/md126: Device or resource busy
    • Official post

    TBH there is only one option left for you to try: each version of OMV has the capability to install systemrescuecd, I think in V5 it's in omv-extras. This installs and boots once into effectively a command-line live CD.


    This works without the knowledge (best way to describe it) of OMV, but the mdadm commands will work. The "Device or resource busy" is related to the cross usage of partitions and each array pointing to shares. On a normally configured OMV system this is not usually a problem, but the way your system is configured, it is.


    I would suggest you try that; if that doesn't work, I honestly don't know what to suggest other than contacting the person who set this up
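    If you want to see what is holding the arrays before going the systemrescuecd route, the "Cannot get exclusive access" / "Device or resource busy" messages usually mean a mounted filesystem or an active LVM volume group is still sitting on top of them; a rough check (the volume group name and mount path below are only examples, use whatever vgs and lsblk report on your system) would be:

    Code
    lsblk /dev/md126 /dev/md127    # shows what is stacked on top of each array
    vgs                            # list volume groups
    lvs                            # list logical volumes
    umount /srv/<your-mount>       # unmount anything that comes from that volume group first
    vgchange -an <vgname>          # deactivate the volume group so mdadm can get exclusive access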

  • Geaves, I just noticed that my NAS is now showing 9.09TB, instead of the 2TB from a couple of days ago, and most of the data is there. I can't yet figure out what's missing, but I still think a couple of TB are missing. I think the whole space on it should've been 12TB, if I remember right.


    Going back a bit through the first posts and your first advice, and comparing the commands and replies I've got in PuTTY, it seems that both md126 and md127 are now working (active), compared with the first time, when only md126 was active.


    Quote

    THEN

    cat /proc/mdstat



    NOW

    md2 still stubborn, still inactive. :(


    Quote


    TBH there is only one option left for you to try: each version of OMV has the capability to install systemrescuecd, I think in V5 it's in omv-extras. This installs and boots once into effectively a command-line live CD.

    I will consider this option as well. If I face any challenges with systemrescuecd, I will come back with questions.



    Geaves, thank you so much for your support, time and dedication to solving my problem.

  • I got the chance to spend some time with my friend who helped me configure my NAS in the beginning, and these are our findings about all the hard drives in my system:




    So, based on these, our understanding is that there's a problem with all the md2 HDDs, or at least a part of them, like 'sdh1' and 'sdd1', for which we get "has no superblock - assembly aborted". But it looks like for 'sda' we don't get that, and we were wondering if there is something we can do with this one in order to sort things out with the other 2? Any suggestion on how we can bring back the last part of my NAS, 'md2'?


    At this moment it's even hard to determine whether there was anything important on them or not (because at first look, without testing whether all the files open OK, everything looks fine on my NAS, nothing missing). Is there a way to find out?


    Thank you a million times for all your help and time, geaves!


    It seems all the other HDDs are working, and at the moment, because the 2 x 6TB drives were empty, I'm using them to back up all my data from all the other NAS drives that have data on them.


    Any suggestions (based on the hard drives I have available) on how to reconfigure all the HDDs in my NAS, to make things more reliable and avoid similar situations in the future?

    • Official post

    Any suggestion on how we can bring back the last part of my NAS, 'md2'

    No, not possible... a RAID5 allows for one drive (partition) failure within an array; with 2 the array is toast. Drive failure could be the drive going dead, intermittent I/O errors, or superblock errors.

    Normally there is a backup superblock on the drive/partition which one can attempt to restore to recover the superblock, but I have found that attempting this with other users on here has never worked. In your case this line:

    mdadm: No super block found on /dev/sdd1 (Expected magic a92b4efc, got 00000000)

    gives zeros, which is something I've never seen before on here or from real-world experience.

    You could try searching for "mdadm no super block found" and try the various suggestions you will find, but IMHO md2 is toast and any data on it has gone.
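    If you or your friend want to double-check before writing md2 off, mdadm --examine prints whatever RAID metadata is still present on each member; the device names below are the ones from your findings (they may differ after a reboot):

    Code
    mdadm --examine /dev/sdd1
    mdadm --examine /dev/sdh1
    mdadm --examine /dev/sda
    # if two members of a Raid5 have no readable superblock, the array cannot be assembled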

    ---------------------------------------------------------------------------------------------------------------------------------------------------

    If you have backed up your data from the other two arrays to one of your 6TB drives, at least that's a step in the right direction.


    The problem you have is your drive size mismatch; from the information you have supplied:


    3 x 500GB

    2 x 2TB

    2 x 3TB

    2 x 6TB


    I can see now why your friend did what he did and used partitions to maximise the space available, but as you've discovered, when things go wrong it spreads across all the arrays.


    OMV uses the full block device (drive) to create an array using the GUI (OMV uses the KIS principle)


    A RAID5 is made up of a minimum of 3 drives. Let's say you create a RAID5 from 2 x 2TB and 2 x 3TB: the array will be created based upon the smallest drive(s) within the array, so 2 x 2TB and 2 x 3TB gives (4 - 1) x 2TB = 6TB of data capacity. But what you could do in the future is replace each of the 2TB drives with 3TB drives and grow the array, giving you (4 - 1) x 3TB = 9TB of data capacity.
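    As a rough outline, that replace-and-grow would look something like this from the command line (the array name and drive letters are only examples, and each rebuild must finish before you touch the next drive):

    Code
    # replace one 2TB drive at a time: fail it, remove it, add the new 3TB drive
    mdadm /dev/md0 --fail /dev/sdX --remove /dev/sdX
    mdadm /dev/md0 --add /dev/sdY      # new 3TB drive, wait for the rebuild to complete
    cat /proc/mdstat                   # watch the rebuild progress
    # once all members have been replaced, grow the array and then the file system (ext4 example)
    mdadm --grow /dev/md0 --size=max
    resize2fs /dev/md0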


    Your 6TB drives would then be used for data backup; this is a common failing amongst home users, who believe that because they are using a RAID, a backup is not necessary.


    As for the 500GB drives, they are really not worth using. I use a 300GB laptop drive in my system, but that is purely for docker, docker compose and container configs, as I use ZFS.


    Another option is to use mergerfs; this would pool the 2 x 2TB and 2 x 3TB drives, giving you 10TB of space. Most use mergerfs with snapraid; however, snapraid was written primarily for use with large media files, where the data is not being changed on a regular basis. This is another learning curve and should not be adopted just because 'that sounds like a good idea'.


    I don't know enough about mergerfs, but I'll tag a couple of users who may be able to help: crashtest chente. One caveat with mergerfs: you will need to use one of those 500GB drives for docker, docker compose and docker configs, rather than pointing the docker configs to a single drive within the pool. The 6TB drives would still be your backup drives, and this can easily be set up using rsync.

    • Official post
    • 2 pools mergerfs + rsync


    A simple setup could be to create two pools of the same size and make copies from one to the other with rsync, like this:


    pool_1 formed by:

    Data: 6TB + 6TB + 500GB + 500GB = 13TB

    pool_2 formed by:

    Data: 3TB + 3TB + 3TB + 2TB + 2TB = 13TB


    Use pool_1 for your data and use pool_2 to make regular rsync copies of existing data in pool_1


    You would have 13TB of data capacity and there is still another 500GB disk available that can be used for docker.
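    Those regular copies from pool_1 to pool_2 can be a scheduled rsync job; under the hood it comes down to something like this (the mount paths are only examples, yours will differ):

    Code
    rsync -a --delete /srv/pool_1/ /srv/pool_2/    # --delete makes pool_2 an exact mirror of pool_1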


    • 2 pools mergerfs + SnapRaid + rsync


    If you want to complicate it a little more you can add SnapRaid to this configuration. You could do something like this:


    pool_1 formed by:

    Data: 3TB + 3TB + 2TB + 2TB + 500GB + 500GB = 11TB

    Parity: 3TB

    pool_2 formed by:

    Data: 6TB + 6TB = 12TB


    In this case you would have parity in pool_1 and a capacity of 11TB for data. You could set up regular syncs with rsync to pool_2. The last 500GB drive could be used for docker. If you are not going to use docker you can add this drive to pool_1 and you would have 11.5TB of capacity.
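    For orientation only, that SnapRaid layout boils down to a configuration roughly like the one below; the openmediavault-snapraid plugin writes this for you, and the paths and disk names here are made up for the example:

    Code
    # /etc/snapraid.conf (sketch)
    parity /srv/parity1/snapraid.parity    # the 3TB parity drive
    content /var/snapraid.content          # keep at least two copies of the content file
    content /srv/data1/snapraid.content
    data d1 /srv/data1    # 3TB
    data d2 /srv/data2    # 3TB
    data d3 /srv/data3    # 2TB
    data d4 /srv/data4    # 2TB
    data d5 /srv/data5    # 500GB
    data d6 /srv/data6    # 500GB
    # after the data is in place: "snapraid sync" builds the parity, "snapraid scrub" checks it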


    Regarding regular copies with rsync: it is a simple way to make backups. It can be optimized through specialized backup applications such as openmediavault-borgbackup, which make versioned backups, and other options in the same space.


    Regarding SnapRaid (or any type of RAID used to add parity), as time goes by I become more convinced that it is a waste of time. I don't think I've ever read anyone say they have problems with corrupt files. That is a decision you must make, but I can tell you that SnapRaid has quite a serious learning curve for novice users. mergerfs, on the other hand, has no secrets; it is very easy to configure and use.
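    To give an idea of how little there is to it: outside the plugin, a pool is just a single mergerfs mount over the individual data filesystems. A minimal sketch with example paths (the plugin generates the equivalent for you):

    Code
    mergerfs -o allow_other,category.create=mfs,minfreespace=10G /srv/disk1:/srv/disk2:/srv/disk3 /srv/pool_1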


    Here you have the documentation you might need.

    omv6:omv6_plugins:mergerfs [omv-extras.org]

    omv6:omv6_plugins:snapraid [omv-extras.org]

    • Official post

    pool_1 formed by:

    Data: 3TB + 3TB + 2TB + 2TB + 500GB + 500GB = 11TB

    Parity: 3TB

    Rereading this, I see that it should be optimized a little more. A single parity drive would probably not be enough for 6 data drives; it would be advisable to have at least two parity drives.

    • Official post

    A single parity drive would probably not be enough for 6 data drives; it would be advisable to have at least two parity drives.

    I'd agree with this. At a minimum, I'd use the newest (healthiest) 3TB drive for Parity. There's no restore without it.

    One caveat with mergerfs: you will need to use one of those 500GB drives for docker, docker compose and docker configs, rather than pointing the docker configs to a single drive within the pool.

    Pointing Dockers to the mount point of a single physical drive is a workaround for Dockers and SQL DB files. However, if the "Balance Tool" is used once (even accidentally), it might scatter the Docker folder over all the drives in the array, ruining the Docker install. It's far better to dedicate a small drive, outside of the mergerfs pool, to Dockers and other utility purposes. If SATA ports are lacking, these days 256GB USB3 thumb drives are reasonably priced.

    • Official post

    However, if the "Balance Tool" is used once (even accidentally), it might scatter the Docker folder over all the drives in the array, ruining the Docker install.

    Ah, I didn't know that. I just assumed one could use a single drive within the pool if a separate one was not available; I won't mention that one again :)
