Filesystem is missing / RAID disappeared

  • Hi


    I need some help for rebuilding my RAID1.

    I have been running OMV for many years. For a couple of weeks now one HDD has been unhealthy (S.M.A.R.T. values), so I bought a replacement disk.


    Two days ago I had very high CPU usage and many warnings about it, so I rebooted OMV without investigating further. After the reboot the RAID had disappeared.


    What can I do to get the RAID back or rebuild it directly with my replacement disk?

    Code
    root@centaurus:~# cat /proc/mdstat 
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
    md127 : inactive sdc[1](S)
          3906887512 blocks super 1.2
           
    unused devices: <none>

    That output is weird. I'm very sure that my RAID was a RAID1 with two devices.


    Code
    root@centaurus:~# blkid
    /dev/sdc: UUID="c0a1e9a7-7c39-191e-3222-1cfbcd271d52" UUID_SUB="1111d18a-71a4-f9b8-3c83-29b9734bcd56" LABEL="magellan:DatenNAS" TYPE="linux_raid_member"
    /dev/sdb1: UUID="7AA4-2D1C" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="d8798485-2f81-4c7b-9c72-f8430423b800"
    /dev/sdb2: UUID="4cc12b30-85e1-459e-8c29-9d5fcaa471bc" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="454b6af2-bd1a-4570-81b1-1ae0f4fb8e98"
    /dev/sdb3: UUID="7dffdff3-c63a-48c7-b635-0b1adbd8d8ca" TYPE="swap" PARTUUID="90b31067-837b-432c-9300-d3dfc84c2b43"

    I don't know why blkid doesn't show /dev/sda. In the WebUI it is there:


    Code
    root@centaurus:~# fdisk -l | grep "Disk "
    Disk /dev/sdc: 3.64 TiB, 4000787030016 bytes, 7814037168 sectors
    Disk model: WDC WD40EFRX-68N
    Disk /dev/sda: 3.64 TiB, 4000787030016 bytes, 7814037168 sectors
    Disk model: WDC WD40EFRX-68N
    Disk /dev/sdb: 232.89 GiB, 250059350016 bytes, 488397168 sectors
    Disk model: Samsung SSD 870 
    Disk identifier: 0A7E0373-9E5E-40D3-8230-F064D704264F


    /dev/sda has some S.M.A.R.T. errors.
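
    For completeness, the S.M.A.R.T. status can also be checked from the shell. A minimal sketch, assuming smartmontools is installed (OMV uses it for its S.M.A.R.T. monitoring):

    Code
    smartctl -H /dev/sda
    smartctl -A /dev/sda | grep -iE 'reallocated|pending|uncorrectable'

    Non-zero reallocated or pending sector counts would back up what the WebUI is reporting.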

  • Magellan

    Added the Label OMV 6.x
    • Official Post

    /dev/sda has some S.M.A.R.T. errors.

    That could be the reason why it has dropped from the array and why blkid cannot locate it; the array is currently inactive.


    ssh into OMV and try:


    mdadm --stop /dev/md127 then


    mdadm --assemble /dev/md127
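
    If the assemble goes through, the state can be verified with read-only commands, for example:

    Code
    cat /proc/mdstat
    mdadm --detail /dev/md127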

    Raid is not a backup! Would you go skydiving without a parachute?


    OMV 6x amd64 running on an HP N54L Microserver

  • Thanks.


    I tried that:

    Quote

    root@centaurus:~# mdadm --stop /dev/md127

    mdadm: stopped /dev/md127

    Quote

    root@centaurus:~# mdadm --assemble /dev/md127

    mdadm: /dev/md127 not identified in config file.


    That seems not to work.
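
    For context: a bare mdadm --assemble /dev/md127 only looks at the ARRAY entries in the mdadm config, so this error just means md127 is not listed there. A quick, read-only way to check (standard Debian/OMV config path assumed):

    Code
    grep ARRAY /etc/mdadm/mdadm.conf

    Once the array is running again, mdadm --detail --scan prints a fresh ARRAY line that could be used to update that file.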


    So I tried it more explicitly:


    Quote

    root@centaurus:~# mdadm --assemble --verbose /dev/md127 /dev/sd[ab]

    mdadm: looking for devices for /dev/md127

    mdadm: no recogniseable superblock on /dev/sda

    mdadm: /dev/sda has no superblock - assembly aborted


    What does that mean? Is my /dev/sda broken? Is there any chance to rebuild the RAID with my replacement disk?


    mdadm.conf for information:

    • Official Post

    1) What does that mean?

    2) Is my /dev/sda broken?

    3) Is there any chance to rebuild the RAID with my replacement disk?

    1) It means that mdadm cannot find/locate the superblock on that drive

    2) Don't know, at this moment mdadm does not recognise the drive as being part of an array

    3) Probably


    Question: why did you run this line -> mdadm --assemble --verbose /dev/md127 /dev/sd[ab], and what is wrong with it? Look at the output from your first post: cat /proc/mdstat, blkid, and the screenshot of Storage -> Disks.
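
    A read-only way to see which disks still carry md metadata before assembling anything would be mdadm --examine, for example (device names as they currently appear on this system):

    Code
    mdadm --examine /dev/sda
    mdadm --examine /dev/sdc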

  • Question: why did you run this line -> mdadm --assemble --verbose /dev/md127 /dev/sd[ab]

    I thought it would be a good idea to specify which disks to use. Maybe a bad idea. And yes, it is wrong! It should be mdadm --assemble --verbose /dev/md127 /dev/sd[ac]


    Should I try that?

  • Code
    root@centaurus:~# mdadm --assemble --verbose /dev/md127 /dev/sd[ac]
    mdadm: looking for devices for /dev/md127
    mdadm: no recogniseable superblock on /dev/sda
    mdadm: /dev/sda has no superblock - assembly aborted

    What's next? :)

  • Code
    root@centaurus:~# mdadm --stop /dev/md127
    mdadm: error opening /dev/md127: No such file or directory
    
    root@centaurus:~# mdadm --assemble --force --verbose /dev/md127 /dev/sd[ac]
    mdadm: looking for devices for /dev/md127
    mdadm: no recogniseable superblock on /dev/sda
    mdadm: /dev/sda has no superblock - assembly aborted


    Same output as in post #7.


    Any ideas how to go further? I'm quite lost.
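
    One thing that may help before the next attempt: device letters like sda/sdc are not guaranteed to be stable across reboots, so it is worth mapping them to the physical disks by model and serial first. A minimal, read-only sketch using lsblk:

    Code
    lsblk -o NAME,SIZE,MODEL,SERIAL

    That way the suspect WD disk and the healthy one can be told apart even if the kernel renames them after another reboot.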

  • Code
    root@centaurus:~# cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    unused devices: <none>
    
    root@centaurus:~# blkid
    /dev/sdc: UUID="c0a1e9a7-7c39-191e-3222-1cfbcd271d52" UUID_SUB="1111d18a-71a4-f9b8-3c83-29b9734bcd56" LABEL="magellan:DatenNAS" TYPE="linux_raid_member"
    /dev/sdb1: UUID="7AA4-2D1C" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="d8798485-2f81-4c7b-9c72-f8430423b800"
    /dev/sdb2: UUID="4cc12b30-85e1-459e-8c29-9d5fcaa471bc" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="454b6af2-bd1a-4570-81b1-1ae0f4fb8e98"
    /dev/sdb3: UUID="7dffdff3-c63a-48c7-b635-0b1adbd8d8ca" TYPE="swap" PARTUUID="90b31067-837b-432c-9300-d3dfc84c2b43"
    • Official Post

    Bummer, that would explain this line -> mdadm: error opening /dev/md127: No such file or directory from #9


    At this moment it appears the system/mdadm is not finding any arrays, which means there's nothing that can be done. Try a reboot and run the above commands again.

  • OK, I rebooted OMV.


    After the reboot the disk with S.M.A.R.T. errors is /dev/sdc.


    /dev/md127 is available again:

    Code
    root@centaurus:~# cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md127 : inactive sda[1](S)
          3906887512 blocks super 1.2
    
    unused devices: <none>

    commands from above:

    Code
    root@centaurus:~# mdadm --stop /dev/md127
    mdadm: stopped /dev/md127
    Code
    root@centaurus:~# mdadm --assemble --force --verbose /dev/md127 /dev/sd[ac]
    mdadm: looking for devices for /dev/md127
    mdadm: Cannot read superblock on /dev/sdc
    mdadm: no RAID superblock on /dev/sdc
    mdadm: /dev/sdc has no superblock - assembly aborted


    Still no luck with reassembling the RAID.


    I think this disk with errors is done for. Is there any chance to include my new replacement disk in the RAID?
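
    One detail worth noting: "Cannot read superblock" (as opposed to the earlier "no RAID superblock") usually points at a real read error on /dev/sdc. If so, it should show up in the kernel log; a read-only check, with the grep pattern only as a suggestion:

    Code
    dmesg | grep -iE 'sdc|i/o error'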

    • Official Post

    Is there any chance to include my new replacement disk in the RAID

    No. Not until you can actually get the array started can you add another disk.


    What makes no sense here is that the array will not start with just one drive, which technically it should. At this moment, for some reason, this is not recoverable.


    What's the output of blkid?

    I thought I created a RAID1. But as mdadm --detail /dev/md127 shows (first post), it is a RAID0. Right?


    Does it not start with one disk because it is a RAID0?


    Output of blkid:

    Code
    root@centaurus:~# blkid
    /dev/sda: UUID="c0a1e9a7-7c39-191e-3222-1cfbcd271d52" UUID_SUB="1111d18a-71a4-f9b8-3c83-29b9734bcd56" LABEL="magellan:DatenNAS" TYPE="linux_raid_member"
    /dev/sdb1: UUID="7AA4-2D1C" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="d8798485-2f81-4c7b-9c72-f8430423b800"
    /dev/sdb2: UUID="4cc12b30-85e1-459e-8c29-9d5fcaa471bc" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="454b6af2-bd1a-4570-81b1-1ae0f4fb8e98"
    /dev/sdb3: UUID="7dffdff3-c63a-48c7-b635-0b1adbd8d8ca" TYPE="swap" PARTUUID="90b31067-837b-432c-9300-d3dfc84c2b43"
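
    For what it's worth, the level question can be settled directly once the array assembles, since mdadm --detail reports it (read-only):

    Code
    mdadm --detail /dev/md127 | grep -E 'Raid Level|Raid Devices|State'

    A RAID1 can run degraded on a single disk, while a RAID0 cannot, which is why the level matters here.
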
  • I've seen that before with users with a raid5

    OK.


    /dev/sda is the healthy disk after reboot:


    Examine /dev/sdc:

    Code
    root@centaurus:~# mdadm --examine /dev/sdc
    mdadm: No md superblock detected on /dev/sdc.


    Thanks for your help, by the way!

    • Official Post

    Well, that answers some questions, but what makes no sense is why the assembly aborts when attempting to reassemble with one drive that is OK.


    Try this;


    mdadm --stop /dev/md127


    mdadm --assemble --force --verbose /dev/md127 /dev/sda


    cat /proc/mdstat


    DO NOT RUN ANYTHING ELSE :)

  • DO NOT RUN ANYTHING ELSE

    Got it.


    Code
    root@centaurus:~# mdadm --stop /dev/md127
    mdadm: error opening /dev/md127: No such file or directory


    Code
    root@centaurus:~# mdadm --assemble --force --verbose /dev/md127 /dev/sda
    mdadm: looking for devices for /dev/md127
    mdadm: /dev/sda is identified as a member of /dev/md127, slot 1.
    mdadm: no uptodate device for slot 0 of /dev/md127
    mdadm: added /dev/sda to /dev/md127 as 1
    mdadm: /dev/md127 has been started with 1 drive (out of 2).

    That looks good so far...

    Code
    root@centaurus:~# cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md127 : active (auto-read-only) raid1 sda[1]
          3906887488 blocks super 1.2 [2/1] [_U]
          bitmap: 0/30 pages [0KB], 65536KB chunk
    
    unused devices: <none>

    Does that mean we have a RAID1 again with 1 working disk?


    Seems like:

    Next?
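
    Assuming the degraded array mounts and the data checks out, the usual next steps would look roughly like this. This is only a sketch: /dev/sdX stands for whatever name the new replacement disk gets (check with lsblk first), and the old failing disk should stay out of the array.

    Code
    # the array started auto-read-only; it switches to read-write on the
    # first write, or explicitly:
    mdadm --readwrite /dev/md127

    # add the empty replacement disk (placeholder device name)
    mdadm --add /dev/md127 /dev/sdX

    # watch the resync progress
    cat /proc/mdstat

    After the resync finishes, refreshing the ARRAY line in /etc/mdadm/mdadm.conf (mdadm --detail --scan) and running update-initramfs -u keeps the array configuration consistent across reboots.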
