RAID5 inactive - Need help please

  • Hi,


    I have a RAID 5 with 4 disks. Two of the disks (sda and sdd) have SMART errors, so I wanted to replace them with new disks. I started with sdd: I replaced the disk and added it to the RAID via the OMV web interface, but I got a read error on sda. Next I put the old sdd disk back, replaced the sda disk with another new one, and the RAID was no longer visible. I got the RAID back thanks to mdadm --assemble --force /dev/md127 /dev/sd[a-d]


    What is the right way to replace my disks without losing my data?


    Here is the system information:



    cat /proc/mdstat 
    Personalities : [raid6] [raid5] [raid4] 
    md127 : active (auto-read-only) raid5 sda[0] sdc[4] sdb[1]
      5860538880 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]


    blkid
    /dev/sdb: UUID="f32f2b91-1c43-ee2a-f2ec-e6be62c39a89" UUID_SUB="a09edadb-246f-7a15-f5ec-b5138b98dfb2" LABEL="Amelia:NasRaid5" TYPE="linux_raid_member"
    /dev/sdc: UUID="f32f2b91-1c43-ee2a-f2ec-e6be62c39a89" UUID_SUB="d67bf01d-6c82-af37-3a4a-5b8a2c4de855" LABEL="Amelia:NasRaid5" TYPE="linux_raid_member" 
    /dev/sdd: UUID="f32f2b91-1c43-ee2a-f2ec-e6be62c39a89" UUID_SUB="417ab984-bece-f637-a267-fe7817430ba8" LABEL="Amelia:NasRaid5" TYPE="linux_raid_member"
    /dev/sda: UUID="f32f2b91-1c43-ee2a-f2ec-e6be62c39a89" UUID_SUB="c2041b86-fc04-258d-d907-40281f45c82a" LABEL="Amelia:NasRaid5" TYPE="linux_raid_member" 
    /dev/md127: LABEL="data" UUID="aeef4793-13f1-40ef-8704-03cc7f4bed77" TYPE="ext4"


    fdisk -l | grep "Disk "
    Disk /dev/sda doesn't contain a valid partition table
    Disk /dev/sdb doesn't contain a valid partition table
    Disk /dev/sdc doesn't contain a valid partition table
    Disk /dev/sdd doesn't contain a valid partition table
    Disk /dev/md127 doesn't contain a valid partition table
    Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
    Disk identifier: 0x00000000
    Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
    Disk identifier: 0x00000000
    Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
    Disk identifier: 0x00000000
    Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
    Disk identifier: 0x00000000
    Disk /dev/sde: 40.0 GB, 40020664320 bytes
    Disk identifier: 0x00081194
    Disk /dev/md127: 6001.2 GB, 6001191813120 bytes
    Disk identifier: 0x00000000


    cat /etc/mdadm/mdadm.conf
    # mdadm.conf
    #
    # Please refer to mdadm.conf(5) for information about this file.
    #



    # by default, scan all partitions (/proc/partitions) for MD superblocks.
    # alternatively, specify devices to scan, using wildcards if desired.
    # Note, if no DEVICE line is present, then "DEVICE partitions" is assumed.
    # To avoid the auto-assembly of RAID devices a pattern that CAN'T match is
    # used if no RAID devices are configured.
    DEVICE partitions



    # auto-create devices with Debian standard permissions
    CREATE owner=root group=disk mode=0660 auto=yes



    # automatically tag new arrays as belonging to the local system
    HOMEHOST <system>



    # definitions of existing MD arrays
    ARRAY /dev/md/NasRaid5 metadata=1.2 spares=1 name=Amelia:NasRaid5 UUID=f32f2b91:1c43ee2a:f2ece6be:62c39a89



    mdadm --detail --scan --verbose
    ARRAY /dev/md127 level=raid5 num-devices=4 metadata=1.2 name=Amelia:NasRaid5 UUID=f32f2b91:1c43ee2a:f2ece6be:62c39a89
      devices=/dev/sda,/dev/sdb,/dev/sdc

  • Hi,


    Thanks for your answer. That is what I am doing. I don't want to lose my data, but I don't have enough space to back everything up, and the backup is very slow because the very small CPU is working hard to compute the missing parts of the files. Once all my precious, preeeeciiiousss data is saved somewhere else, I'll be back to ask for the best way to replace my disks.

  • I cannot back up anymore: too many I/O errors.


    I will start by replacing sdd because the array no longer sees it. I will replace the disk and run
    mdadm /dev/md127 -a /dev/sdd
    Is this the best way to replace the disk? What should I do if the recovery fails?
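
    For reference, a common way to swap one member of an md array is to mark it failed, remove it, and add the replacement, then watch the rebuild. The sketch below only prints the commands (a dry run) so the plan can be reviewed first. The helper name replace_member_plan is hypothetical, and the device names are the ones from this thread and would need checking against the actual system. If the old disk has already dropped out of the array, the first two steps may not apply.

```shell
# Hypothetical helper: prints the usual mdadm replacement steps
# instead of running them, so nothing touches the array.
replace_member_plan() {
    array=$1   # e.g. /dev/md127
    old=$2     # member being retired, e.g. /dev/sdd
    new=$3     # replacement disk (often the same name after the swap)
    echo "mdadm --manage $array --fail $old"
    echo "mdadm --manage $array --remove $old"
    echo "# swap the physical disk (power down if needed), then:"
    echo "mdadm --manage $array --add $new"
    echo "cat /proc/mdstat"
}
```

    Usage: replace_member_plan /dev/md127 /dev/sdd /dev/sdd prints the five lines to review; the rebuild progress then shows up in /proc/mdstat.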

  • Reading data for a backup is far easier on failing disks than a recovery/rebuild would be. If you can't back up files anymore, I doubt that it's possible to successfully recover a disk.
    (But I'll cross my fingers for you.)

    Thanks for your help and for crossing your fingers for me ;-).


    I had to power off to replace the disk and, as expected, the array didn't start at boot: "not enough drives to start the array".
    Then I tried to run

    Code
    mdadm --assemble --force /dev/md127 /dev/sd[b-d]
    mdadm: forcing event count in /dev/sdd(3) from 3594801 upto 3676860
    mdadm: clearing FAULTY flag for device 2 in /dev/md127 for /dev/sdd
    mdadm: Marking array /dev/md127 as 'clean'
    mdadm: /dev/md127 assembled from 3 drives - not enough to start the array.

    Is there a way to rebuild the array while accepting some data loss? Or is it really over, and do I have to create a new array?
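
    The forced assembly above raised the event count of /dev/sdd from 3594801 to 3676860, which suggests that disk had fallen well behind the rest of the array. A sketch for comparing event counts across members before forcing anything, assuming mdadm --examine prints an "Events :" line as it does for 1.2 metadata; event_count is a hypothetical helper name.

```shell
# Hypothetical helper: read `mdadm --examine` output on stdin and
# print the number from its "Events :" line.
event_count() {
    awk -F: '/^[[:space:]]*Events[[:space:]]*:/ {
        gsub(/[[:space:]]/, "", $2)   # strip padding around the number
        print $2
        exit
    }'
}

# Usage (needs root and the real member devices):
#   for d in /dev/sd[a-d]; do
#       printf '%s ' "$d"; mdadm --examine "$d" | event_count
#   done
```

    Members whose counts match are in sync with each other; a member with a much lower count holds older data.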

  • I have 4 TB and can only back up 1 TB. Not all of the 4 TB would be painful to lose.


    Good news: recovery of one disk is OK. Bad news: the data is still unreachable.


    Do you think I should try to repair the filesystem, or recover the second failed disk first?

  • I decided to repair the filesystem before changing the 2nd disk.


    The RAID array is clean now, but the filesystem does not mount.
    When I mount it I get this error:

    Code
    #mount -a
    mount: wrong fs type, bad option, bad superblock on /dev/md127,


    dmesg gives me the real error:


    Code
    JBD2: no valid journal superblock found


    I repaired this error with



    Code
    mke2fs -t ext4 -O ^has_journal /dev/md127


    Then the mount is OK, but there is still no data. I am currently checking the filesystem with

    Code
    #fsck.ext4 -Dcf -C 0 /dev/md127
    e2fsck 1.42.5 (29-Jul-2012)
    Checking for bad blocks (read-only test):  11.55% done, 47:26 elapsed. (0/0/0 errors)

    I hope it will give good results....
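
    A note on the JBD2 error above: mke2fs creates a new, empty filesystem, which would explain a mount that succeeds but shows no data. A less destructive route, sketched here under the assumption that only the journal is damaged, is to let e2fsck and tune2fs deal with the journal. The helper name journal_repair_plan is hypothetical, it only prints the commands, and whether tune2fs accepts the change depends on the state of the filesystem.

```shell
# Hypothetical helper: prints a journal-only repair sequence for an
# ext4 device instead of running it. Assumes the data blocks are intact.
journal_repair_plan() {
    dev=$1   # e.g. /dev/md127, unmounted
    echo "e2fsck -fn $dev"                 # read-only check, changes nothing
    echo "tune2fs -O ^has_journal $dev"    # drop the journal, keep the data
    echo "e2fsck -f $dev"                  # full check without a journal
    echo "tune2fs -j $dev"                 # re-create the journal
}
```

    Usage: journal_repair_plan /dev/md127 prints the four steps; tune2fs may need -f if it refuses to remove a journal it considers in need of recovery.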
