Strange filesystem notifications after drive replacement

  • Yesterday I had to replace a drive in my RAID5 due to some SMART errors I didn't like the look of. Since doing so, I have been getting filesystem monitoring notifications stating the following:


    Service: mountpoint_srv_dev-disk-by-uuid-4d26adbd-029a-4d74-a81f-c69bf193a099

    Event: Status failed

    Description: status failed (32) -- /srv/dev-disk-by-uuid-4d26adbd-029a-4d74-a81f-c69bf193a099 is not a mountpoint


    and:


    Service: filesystem_srv_dev-disk-by-uuid-4d26adbd-029a-4d74-a81f-c69bf193a099

    Event: Does not exist

    Description: unable to read filesystem '/srv/dev-disk-by-uuid-4d26adbd-029a-4d74-a81f-c69bf193a099' state

    This triggered the monitoring system to: restart


    This mountpoint is indeed the one used by the RAID5 and it does exist.


    This only started after replacing the drive.


    Any help or ideas are welcome.

    Asrock B450M, AMD 5600G, 64GB RAM, 6 x 4TB RAID 5 array, 2 x 10TB RAID 1 array, 100GB SSD for OS, 1TB SSD for docker and VMs, 1TB external SSD for fsarchiver OS and docker data daily backups

    • Official Post

    You could try redeploying the monit configuration with

    omv-salt deploy run monit


    If that does not solve the issue, check the configuration file

    /etc/monit/conf.d/openmediavault-filesystem.conf
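
    For reference, the checks OMV writes into that file follow normal monit syntax. Below is a rough, illustrative sketch of what the mountpoint and filesystem stanzas can look like; the exact commands, thresholds and actions OMV generates may differ, so treat this only as a reading aid:

        check program mountpoint_srv_dev-disk-by-uuid-4d26adbd-029a-4d74-a81f-c69bf193a099
            with path "/bin/mountpoint -q /srv/dev-disk-by-uuid-4d26adbd-029a-4d74-a81f-c69bf193a099"
            # non-zero exit means the path is not a mountpoint; after repeated failures
            # monit can try to remount it (illustrative action only)
            if status != 0 for 2 cycles then exec "/bin/mount /srv/dev-disk-by-uuid-4d26adbd-029a-4d74-a81f-c69bf193a099"

        check filesystem filesystem_srv_dev-disk-by-uuid-4d26adbd-029a-4d74-a81f-c69bf193a099
            with path /srv/dev-disk-by-uuid-4d26adbd-029a-4d74-a81f-c69bf193a099
            # example threshold only
            if space usage > 85% then alert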


    Did you check whether /srv/dev-disk-by-uuid-4d26adbd-029a-4d74-a81f-c69bf193a099 is actually a mountpoint and not just a plain directory? Could the filesystem on the RAID5 now be mounted somewhere else?
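
    A quick way to verify that from a shell (standard util-linux tools; the md device name at the end is just an example):

        # exits 0 and prints "... is a mountpoint" only if something is mounted there
        mountpoint /srv/dev-disk-by-uuid-4d26adbd-029a-4d74-a81f-c69bf193a099

        # show what (if anything) is mounted at that path
        findmnt /srv/dev-disk-by-uuid-4d26adbd-029a-4d74-a81f-c69bf193a099

        # and where the array's filesystem is mounted, looked up by source device (example name)
        findmnt -S /dev/md0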

  • Thanks,


    I'll try the monit-related things you mentioned.


    The mountpoint is correct. I can see all of the array's data there, and if the array is unmounted, nothing is visible there.


  • I just did the monit re-deploy and a reboot.


    The array seems to have mounted where it should, but I think it was slow to mount: my Nextcloud LXC tried to start before the mount was in place and could not run because it couldn't find the data directory I pass through to it.
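
    If it's useful to anyone, the timing can be checked from the journal and systemd's boot analysis, roughly like this (the grep pattern is just an example matching my array/UUID):

        # when the array and its mount showed up during this boot
        journalctl -b | grep -iE 'md[0-9]+|4d26adbd'

        # which units were slow and what the boot critical path looked like
        systemd-analyze blame | head -n 20
        systemd-analyze critical-chain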


    It's rather perplexing.


    Here are the outputs. The mount one I had to attach as a text file because it is too big.



    • Official Post

    Hmmm, looks good IMHO. The reason you get the monit notification is that the filesystem was not available at the time monit executed the check. When the check fails 2 times, monit remounts the filesystem. That might explain why the filesystem is now shown as mounted.

  • Yeah, it's a little strange. It seems as though the array assembly is slow when starting the system up. I have had to replace drives before without a problem; this time, however, something is a bit strange.


    I will have to do a bit of digging into mdadm, I guess, to see if I can find any problems with the assembly, or perhaps run omv-salt deploy run mdadm in case something is a bit "out of sync" for some reason.
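
    The sort of digging I have in mind is roughly this (the md device name is just an example, adjust to the real one):

        # current assembly/rebuild state
        cat /proc/mdstat

        # detailed array state
        mdadm --detail /dev/md0

        # compare the live array definition with the saved config
        mdadm --detail --scan
        cat /etc/mdadm/mdadm.conf

        # kernel messages about the assembly from the current boot
        journalctl -k -b | grep -i 'md'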


  • UPDATE


    I ran omv-salt deploy run mdadm to ensure the mdadm config was correct. I also reseated the data cables to the drives, just in case they were causing a problem.


    The array now assembles quickly and is mounted before Docker and KVM start, and I am getting no more mdadm/mountpoint errors on boot. The filesystem passthrough to KVM still doesn't seem to be in place before my Nextcloud LXC starts, so that particular quirk remains. It is likely related to a more recent KVM/libvirt update, but I can't really pinpoint when it started, as my system is only shut down/rebooted when needed.


    As a workaround, I set up a bash script on an at-boot cron job in the LXC that checks for the existence of a .ncdata file I created in the Nextcloud data directory and reboots the LXC if the file does not exist.
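
    A minimal sketch of that check, assuming an example data directory path and an @reboot cron entry (the short retry loop is an extra precaution so it doesn't reboot-loop if the share never appears):

        #!/bin/bash
        # Run from cron inside the LXC with:  @reboot /usr/local/bin/check-ncdata.sh  (example path)
        # .ncdata is a marker file created in the Nextcloud data directory; it is only
        # visible when the passthrough filesystem is actually mounted.

        MARKER="/srv/nextcloud-data/.ncdata"   # example path, adjust to the real data directory

        # give the host a little time to finish mounting before deciding it failed
        for _ in {1..6}; do
            [ -e "$MARKER" ] && exit 0
            sleep 10
        done

        # marker still missing: the passthrough never showed up, reboot the container
        /sbin/reboot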


  • BernH added the label resolved
  • BernH added the label OMV 7.x
