RAID 1 degraded, no apparent reason, unit cannot reboot or shutdown, SMART looks fine for both HDDs

    • Hello, I am struggling with an issue that has just shown up, after OMV had been running without problems for quite a while.

      All of a sudden, I cannot access OMV from one Win7 computer, while Win10 apparently works fine. When checking system status via the GUI, I found some pending updates that I tried to install several times, to no avail. I was not too concerned about the updates, so I kept looking for potential issues, and I found that even though the SMART analysis looks fine for both HDDs, the RAID status shows /dev/md0 as active, degraded.
      I have no clue what led to this situation, as I have a UPS to protect against brownouts and power loss.

      root@OMV:~# cat /proc/mdstat
      Personalities : [raid1]
      md0 : active raid1 sdb[0](F) sdc[1]
      2930135360 blocks super 1.2 [2/1] [_U]
      unused devices: <none>
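
      (Reading the mdstat line: the (F) after sdb marks it as the member the kernel has flagged faulty, and [2/1] [_U] means only one of the two mirror slots is up. For a fuller picture, mdadm's standard detail query can be used:

      mdadm --detail /dev/md0

      It should report "State : clean, degraded" and show which slot is listed as faulty or removed.)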

      root@OMV:~# fdisk -l | grep "Disk "
      Disk /dev/md0 doesn't contain a valid partition table
      Disk /dev/sdd doesn't contain a valid partition table
      Disk /dev/sde doesn't contain a valid partition table
      Disk /dev/sda: 160.0 GB, 160041885696 bytes
      Disk identifier: 0x00071626
      Disk /dev/md0: 3000.5 GB, 3000458608640 bytes
      Disk identifier: 0x00000000
      Disk /dev/sdd: 3000.6 GB, 3000592982016 bytes
      Disk identifier: 0x00000000
      Disk /dev/sde: 3000.6 GB, 3000592982016 bytes
      Disk identifier: 0x00000000
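
      (Note that mdstat lists sdb/sdc as the mirror members while fdisk shows the two 3 TB drives as sdd and sde; the kernel can renumber drives between boots, so the sdX names alone are not a reliable identity. Assuming a stock Debian/OMV install with udev, the persistent names tie each sdX node to a drive model and serial:

      ls -l /dev/disk/by-id/

      Matching the serial in the symlink name against the label printed on the physical drive tells you which disk is which.)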


      root@OMV:~# cat /etc/mdadm/mdadm.conf
      # mdadm.conf
      #
      # Please refer to mdadm.conf(5) for information about this file.
      #
      # by default, scan all partitions (/proc/partitions) for MD superblocks.
      # alternatively, specify devices to scan, using wildcards if desired.
      # Note, if no DEVICE line is present, then "DEVICE partitions" is assumed.
      # To avoid the auto-assembly of RAID devices a pattern that CAN'T match is
      # used if no RAID devices are configured.
      DEVICE partitions
      # auto-create devices with Debian standard permissions
      CREATE owner=root group=disk mode=0660 auto=yes
      # automatically tag new arrays as belonging to the local system
      HOMEHOST <system>
      # definitions of existing MD arrays
      ARRAY /dev/md0 metadata=1.2 name=OMV:VolumeOne UUID=3dc20369:61bff7ce:f7e09590:7aee7194



      root@OMV:~# mdadm --detail --scan --verbose
      ARRAY /dev/md0 level=raid1 num-devices=2 metadata=1.2

      I have two WD RED 3 TB drives, plus a small 160 GB Hitachi drive for the OS.


      It also seems pretty odd that I cannot reboot or shut down the unit, either from the web GUI or from a terminal. I am not very skilled in Linux, but I do know how to reboot or shut down from the command line.


      Your guidance will be appreciated, as I have no idea what is going on here.


      Regards


      Alberto
    • mdadm --stop /dev/md0
      mdadm --assemble --force --verbose /dev/md0 /dev/sd[ed]

      If that starts the rebuild, then run omv-mkconf mdadm to regenerate mdadm.conf.
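
      If the forced assembly kicks off a resync, progress can be watched with a plain mdstat poll (nothing OMV-specific):

      watch -n 5 cat /proc/mdstat

      This refreshes every five seconds and shows a recovery percentage until the mirror is back in sync.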
      omv 4.1.11 arrakis | 64 bit | 4.15 proxmox kernel | omvextrasorg 4.1.11
    • Hello, thank you for the feedback.

      I ran mdadm to no avail, as follows:

      root@OMV:~# mdadm --stop /dev/md0
      mdadm: Cannot get exclusive access to /dev/md0:Perhaps a running process, mounted filesystem or active volume group?
      root@OMV:~# sudo mdadm --stop /dev/md0
      mdadm: Cannot get exclusive access to /dev/md0:Perhaps a running process, mounted filesystem or active volume group?

      In case I can somehow stop the RAID, how can I identify the supposedly missing (damaged) drive in order to run the next command? (the [ed] between the brackets)

      mdadm --assemble --force --verbose /dev/md0 /dev/sd[ed]
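
      (On the [ed] part: /dev/sd[ed] is ordinary shell globbing and expands to /dev/sdd /dev/sde, i.e. both 3 TB members, so there is no need to know in advance which drive dropped out. Assuming both drives still respond, the superblocks show which copy is stale:

      mdadm --examine /dev/sdd /dev/sde | grep -iE "events|state"

      The member with the lower event count is the one that fell out of the array; --assemble --force resyncs it from the up-to-date copy.)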


      Thank you!


      Alberto
    • I forgot that since the array is still assembled, just degraded, the filesystem will still be mounted. You need to unmount the filesystem before running those commands. Use the umount command.

      alocam wrote:

      In case I can somehow stop RAID, how can I identify the supposedly missing (damaged) drive to run next command? (the [ed] between brackets).
      You are assuming the drive is damaged. Just because an array didn't assemble doesn't mean the drive is damaged. This is a big reason I would normally tell someone not to use RAID. For a mirror, I think rsyncing the two drives on a regular basis is better than using mdadm mirroring.
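
      For reference, the rsync approach could be as simple as a daily cron job along these lines (a sketch only; the /srv/... paths are hypothetical placeholders for the two data disks' mount points):

      rsync -a --delete /srv/disk-primary/ /srv/disk-backup/

      -a preserves permissions, ownership, and timestamps; --delete keeps the second disk an exact mirror, which also means deletions on the primary propagate on the next run, so it is not a substitute for real backups.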
    • Hello,

      I tried umount:

      root@OMV:~# umount /dev/md0
      umount: /media/3156d0bd-44c3-4046-a2d0-ecf6331ef93f: device is busy.
      (In some cases useful info about processes that use
      the device is found by lsof(8) or fuser(1))
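
      (As the message hints, fuser or lsof will show what is holding the mount. A sketch using the mount point from the error above:

      fuser -vm /media/3156d0bd-44c3-4046-a2d0-ecf6331ef93f
      lsof /media/3156d0bd-44c3-4046-a2d0-ecf6331ef93f

      fuser -vm lists every process with an open file or working directory on that filesystem; on OMV these are typically smbd, nfsd, or similar services, which would need to be stopped before umount can succeed.)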

      I guess the disks are OK then, as you said in your previous message. Once I get this thing back on track, I will seriously evaluate whether to keep RAID or switch to the rsync mirroring you propose. RAID seems more complex to fix when things go wrong.

      Regards

      Alberto
    • alocam wrote:

      I tried umount:
      You probably have services using that filesystem. Booting into a rescue distro like SystemRescueCd is probably the best way to do this.
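
      From a rescue environment nothing from OMV is using the array, so the earlier commands should go through. A sketch of the sequence (the sdX names may change under the rescue kernel, so re-check them with fdisk -l first):

      mdadm --stop /dev/md0
      mdadm --assemble --force --verbose /dev/md0 /dev/sdd /dev/sde
      cat /proc/mdstat

      Once mdstat shows the recovery running, you can reboot back into OMV and let the resync finish there.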