Lost all drives

  • root@helios4:~# cat /proc/mdstat
    Personalities : [raid10]
    md0 : active (auto-read-only) raid10 sdc[6] sdd[5] sda[7] sdb[4]
    15627790336 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
    bitmap: 0/117 pages [0KB], 65536KB chunk


    unused devices: <none>

  • root@helios4:~# cat /proc/mdstat
    Personalities : [raid10]
    md0 : active raid10 sdc[6] sdd[5] sda[7]
    15627790336 blocks super 1.2 512K chunks 2 near-copies [4/3] [U_UU]
    bitmap: 10/117 pages [40KB], 65536KB chunk


    unused devices: <none>


    THE FILES ARE BACK!!!


    Thank you, thank you, thank you!


    I did notice that as the system came back up it ran an fsck on the raid and corrected errors.


    Now that it's back, can you give me a synopsis of what happened? Note: I was a UNIX admin *years and years* ago, but I got away from it, and from UNIX/Linux in general, for a while, so I'd really like to understand this better. Again, thank you.

    • Official Post

    I did notice that as the system came back up it ran an fsck on the raid and corrected errors.

    Sometimes a reboot just resolves what appears to be the obvious.


    Now that it's back, can you give me a synopsis of what happened?

    The power outage caused the problem, probably combined with poorly seated connections. Because mdadm is software RAID, when something goes wrong it can't tell you what happened the way a hardware RAID controller can, so you have to run a few commands to find the cause. The main one here is cat /proc/mdstat, which gives you the state of the array; blkid tells you which drives are part of it. From there you can make a start on putting it back together, roughly as sketched below.
    The most common causes of RAID trouble are power loss, and a user assuming that mdadm is 'hot swap', which it isn't: a drive swapped out without first being removed from mdadm leaves the array inactive.
    In the UK power cuts are rare, but I know from US users that a UPS is a recommended necessity.
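
    A minimal sketch of that diagnostic sequence, assuming the four members are /dev/sda-/dev/sdd as in this thread (commands are illustrative, not a script to paste blindly):

    # 1. Kernel's view of the array: state, missing members, resync progress
    cat /proc/mdstat

    # 2. Which disks carry the md superblock for this array
    blkid | grep linux_raid_member
    mdadm --examine /dev/sd[abcd]

    # 3. If the array did not come up on boot, try assembling it from its members
    mdadm --assemble --scan

    # 4. Detailed view of the assembled array
    mdadm --detail /dev/md0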


    But at least it's running again :thumbup:

  • The sad part is that the system IS on a UPS. Apparently I have it set wrong, as it shut down within seconds of the power blip (see the note on UPS shutdown timers below).


    Any way to get that B drive back in without wiping it first?
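
    A note on the UPS shutdown timer: going down within seconds of a blip usually means the daemon is configured to shut the host down as soon as it sees "on battery", rather than at a low-battery threshold. Purely as an illustration, and assuming the classic apcupsd daemon (NUT-based setups, such as the openmediavault NUT plugin, expose equivalent options), the relevant knobs in /etc/apcupsd/apcupsd.conf look like this:

    # Illustrative values only, not taken from this system
    TIMEOUT 0          # 0 = never shut down merely because N seconds passed on battery
    BATTERYLEVEL 10    # shut down when the battery charge drops below 10%
    MINUTES 5          # shut down when the estimated runtime drops below 5 minutes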

  • root@helios4:~# blkid
    /dev/mmcblk0p1: UUID="1f489a8c-b3a3-4218-b92b-9f1999841c52" TYPE="ext4" PARTUUID="7fb57f23-01"
    /dev/sda: UUID="d1e18bf2-0b0e-760b-84be-c773f4dbf945" UUID_SUB="9495186e-6df6-a7b1-c67b-4fd4ca1d6468" LABEL="helios4:Store" TYPE="linux_raid_member"
    /dev/sdb: UUID="d1e18bf2-0b0e-760b-84be-c773f4dbf945" UUID_SUB="253f9091-6914-fe71-ab40-68961aa3dbb6" LABEL="helios4:Store" TYPE="linux_raid_member"
    /dev/sdc: UUID="d1e18bf2-0b0e-760b-84be-c773f4dbf945" UUID_SUB="3186ee11-0837-b283-c653-37e39d1923d8" LABEL="helios4:Store" TYPE="linux_raid_member"
    /dev/md0: UUID="GmgEll-khiX-a7DB-5HNZ-KGRm-5vGq-1vPV4w" TYPE="LVM2_member"
    /dev/sdd: UUID="d1e18bf2-0b0e-760b-84be-c773f4dbf945" UUID_SUB="0da721df-e67c-8141-cc93-afe7e2e66f7a" LABEL="helios4:Store" TYPE="linux_raid_member"
    /dev/mapper/Store-Store: LABEL="Store" UUID="6c7b4b44-4cae-4169-95fe-d9a14d04e814" TYPE="ext4"
    /dev/zram0: UUID="3537eff4-7cb1-46ad-8814-b3d735002195" TYPE="swap"
    /dev/zram1: UUID="a00f6e11-8359-4182-beae-058c4ccb0375" TYPE="swap"
    /dev/mmcblk0: PTUUID="7fb57f23" PTTYPE="dos"
    /dev/mmcblk0p2: PARTUUID="7fb57f23-02"


    root@helios4:~# mdadm --detail /dev/md0
    /dev/md0:
    Version : 1.2
    Creation Time : Sun Feb 18 14:53:39 2018
    Raid Level : raid10
    Array Size : 15627790336 (14903.82 GiB 16002.86 GB)
    Used Dev Size : 7813895168 (7451.91 GiB 8001.43 GB)
    Raid Devices : 4
    Total Devices : 3
    Persistence : Superblock is persistent


    Intent Bitmap : Internal


    Update Time : Sun Jan 26 10:07:54 2020
    State : clean, degraded
    Active Devices : 3
    Working Devices : 3
    Failed Devices : 0
    Spare Devices : 0


    Layout : near=2
    Chunk Size : 512K


    Name : helios4:Store (local to host helios4)
    UUID : d1e18bf2:0b0e760b:84bec773:f4dbf945
    Events : 116671


    Number   Major   Minor   RaidDevice State
       6       8       32        0      active sync set-A   /dev/sdc
       -       0        0        1      removed
       7       8        0        2      active sync set-A   /dev/sda
       5       8       48        3      active sync set-B   /dev/sdd
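
    As a hedged next step before touching anything (device names taken from the output above): comparing the dropped disk's superblock with a healthy member shows whether /dev/sdb is still recognisable as part of this array and how far its event counter lags. With the internal write-intent bitmap shown above, a small lag normally means only the changed blocks need resyncing.

    # Compare the dropped member against a healthy one
    mdadm --examine /dev/sdb | grep -E 'Array UUID|Events|State'
    mdadm --examine /dev/sdc | grep -E 'Array UUID|Events|State'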

    • Official Post

    I've had this happen before and from the gui I can only get the missing drive back in if I wipe it first.

    That's the correct way to do it, but we could try this first:


    mdadm --stop /dev/md0


    mdadm --add /dev/md0 /dev/sdb
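
    For completeness, a hedged sketch of how that sequence typically plays out (device names taken from this thread; --re-add reuses the existing superblock and bitmap where possible, so usually only the blocks written since the drop get resynced):

    # If the array was stopped, assemble it again first; it will come up degraded
    mdadm --assemble /dev/md0 /dev/sd[acd]

    # Put the old member back; plain --add also works but may trigger a full rebuild
    mdadm --re-add /dev/md0 /dev/sdb

    # Watch the recovery
    cat /proc/mdstat
    mdadm --detail /dev/md0 | grep -E 'State|Rebuild'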
