Data loss during rebuild

  • Hello,


    One of the disks in my RAID 5 reported more and more read errors, and after some time its SMART status turned red, so I decided to replace it.

    I marked the disk as failed and removed it via mdadm, powered down the system, swapped the disk, and added the new disk to the md device with mdadm.

    The rebuild started automatically, nothing thrilling so far.


    After a few hours I became a bit impatient and copied ~100 GB of data to the shared volume while the underlying md device was still recovering.

    The copy job finished fine and the free space shrank as expected from 7.6 to 7.5 TB, so nothing to worry about.
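
    Side note for readers: md does accept writes while a rebuild is running, so copying data at this point is not wrong in itself. For anyone who would rather wait, here is a minimal sketch of a check for an ongoing rebuild. `md_busy` is a made-up helper name, and the pattern assumes the usual `recovery = ...`, `resync = ...`, or `reshape = ...` progress line in /proc/mdstat.

```shell
#!/bin/sh
# Return 0 if the given mdstat file shows a recovery, resync, or reshape
# in progress (or pending), and non-zero otherwise.
# $1: path to the mdstat file, defaults to /proc/mdstat.
md_busy() {
    mdstat="${1:-/proc/mdstat}"
    grep -Eq '(recovery|resync|reshape) *=' "$mdstat"
}

# Possible usage on the real system:
#   if md_busy; then echo "rebuild still running"; fi
# Alternatively, `mdadm --wait /dev/md0` blocks until the activity is done.
```

    The check looks at the whole file, so on a system with several md devices it reports "busy" if any one of them is rebuilding.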


    Then, a few minutes after the copy job had finished successfully and the data was accessible, the free space suddenly jumped from 7.5 to 10.3 TB.

    Even after the rebuild finished several hours later, it was still at 10.3 TB.


    cat /proc/mdstat

    shows that the RAID 5 is fine


    /usr/share/mdadm/checkarray -a -i /dev/md0

    shows no issues


    xfs_repair /dev/md0

    shows no issues


    According to the syslog, none of the disks had any issues at the time the problem occurred.


    However, I am missing ~3 TB, which is ~300 files.
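
    For reference, the jump in free space from 7.5 to 10.3 TB is about 2.8 TB, which lines up with the ~3 TB reported missing. A minimal sketch of that arithmetic, where `free_space_delta` is a made-up helper name:

```shell
#!/bin/sh
# Print the difference between two free-space readings (in TB),
# rounded to one decimal place.
# $1: free space before, $2: free space after.
free_space_delta() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", after - before }'
}

# Free space before and after the sudden jump, as reported above.
free_space_delta 7.5 10.3
```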


    Any idea what happened with this RAID 5?

    At the moment I feel a bit uncomfortable with this storage...


    Does anyone have an idea how I can get the missing data back?

    At the moment I am thinking about xfs_undelete (https://github.com/ianka/xfs_undelete) or maybe UFS Explorer RAID Recovery.



    Best Regards

    Roland

  • Hello there.

    Did you really replace the failed disk or did you add another disk to your array? Seems to me you simply extended your array.

  • Hi,


    I definitely replaced it:


    mdadm --manage /dev/md0 --fail /dev/sdc

    mdadm --manage /dev/md0 --remove /dev/sdc


    shutdown and replacement of the faulty disk


    mdadm --manage /dev/md0 --add /dev/sdc
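
    One way to check whether the array was grown rather than rebuilt is to compare the "Raid Devices" count reported by `mdadm --detail` with the number of disks the array had before the swap. A minimal sketch, where `raid_device_count` is a made-up helper name and the parsing assumes the usual `Raid Devices : N` line in that output:

```shell
#!/bin/sh
# Read `mdadm --detail` output on stdin and print the "Raid Devices" count.
raid_device_count() {
    awk -F: '/^[[:space:]]*Raid Devices/ {
        gsub(/[[:space:]]/, "", $2)   # strip padding around the number
        print $2
        exit
    }'
}

# Possible usage on the real system (needs root):
#   mdadm --detail /dev/md0 | raid_device_count
```

    If the count is the same as before the replacement, the `--add` filled the free slot as intended and the array was not extended.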

  • Hello.

    This is the first time I have seen something like this. The only thing I do differently is that I always recreate the partition layout before I add the drive.

  • I don't have any partitions on the disks. That was a decision made years ago, when I thought they were not that important if I use the complete disk anyway.

    By now I think partitions would be better, simply in case I want to use LVM in the future.

    Anyhow, it should make no difference in the rebuild process or stability of the RAID.
