MDADM and other Queries

      Hi all,
      It seems that, for some reason, the RAID6 array on my Thecus 5200Pro decided to resync. This hammered the machine for about a day while it resynced 5.5TB of, somewhat ironically, empty space.

      I have a few questions that I was hoping you could help me with:
      1.) I think, but I'm not sure, that this happened because I set advanced power management (APM) to 1 on each of the 5 drives. The drives spun down, I did a reboot, something timed out, and the machine rebooted before the drives had spun up and synced. Can you think of a timeout that would cause this? It literally takes about 90 seconds for all of the drives to spin up. I power them down because this box is only supposed to be used for backups.

      2.) I don't recall getting an email about it. I checked mdadm.conf and it has my email address set, and I do get emails from Monit (every 5 fricking seconds, about the CPU bouncing off the ceiling during the resync). Running ps -aux, I can see the process "/sbin/mdadm --monitor --scan". I'm guessing that if the issue is already present at bootup then it doesn't count as a change of status, so no email is sent?
      I was thinking of adding mdadm to Monit with a rule along these lines: check file mdstat with path /proc/mdstat if match "\[.*_.*\]" then alert. Do you think this is a good idea?
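
To make the idea in question 2 concrete, here is a minimal sketch of the same test written as shell functions rather than as a Monit content match. The function names and the optional file argument are made up for illustration. It assumes the usual /proc/mdstat layout, where each array has a status map such as [UUUUU] and an underscore marks a missing member. Note that a resync of an otherwise healthy array would not show an underscore, so a second function looks for the words "resync" or "recovery" instead.

```shell
#!/bin/sh
# Sketch only: the function names and the optional file argument are
# illustrative assumptions, not part of mdadm or Monit.

# Succeeds (exit 0) if any array's status map contains a missing member.
mdstat_degraded() {
    # Status maps look like [UUUUU]; an underscore marks a missing member.
    # Limiting the bracket contents to U and _ avoids matching other
    # bracketed text such as device indexes or the progress bar.
    grep -Eq '\[[U_]*_[U_]*\]' "${1:-/proc/mdstat}"
}

# Succeeds (exit 0) if any array reports a resync or recovery line.
mdstat_resyncing() {
    grep -Eq 'resync|recovery' "${1:-/proc/mdstat}"
}
```

A wrapper script that exits non-zero when either function succeeds could then, in principle, be run from Monit or cron and trigger the alert from its exit status. Whether the installed Monit version supports running a program check like that is something I have not verified for this box.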

      3.) Would it be difficult to set up Monit to send me an email every time the server is restarted, so that I can perform a health check and/or investigate why?
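
As an illustration of the kind of restart notice meant in question 3, here is a rough sketch that builds a short health report as plain text. It does not use Monit at all; the function name boot_report and its optional path argument are hypothetical, and the report contents are just a guess at what would be useful after a reboot.

```shell
#!/bin/sh
# Sketch only: boot_report and its optional mdstat path argument are
# illustrative names, not an existing tool.
boot_report() {
    mdstat="${1:-/proc/mdstat}"
    # uname -n prints the host name; date stamps when the report was built.
    echo "Host $(uname -n) started; report generated $(date)"
    echo
    echo "--- RAID status ($mdstat) ---"
    if [ -r "$mdstat" ]; then
        cat "$mdstat"
    else
        echo "not readable"
    fi
}
```

Assuming a working mail command, the output could be piped to it once at startup, for example from an @reboot cron entry or rc.local: boot_report | mail -s "server restarted" admin@foo.bar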

      4.) It would be nice if Monit could give me an idea of what is using all the CPU time when it emails me to say that the CPU is being battered. Something like one of the following?
      • then exec "/bin/bash -c 'top -bn1 | mail -s top admin@foo.bar'"
      • then exec "/bin/bash -c 'ps -Ao user,uid,comm,pid,pcpu,tty --sort=-pcpu | head -n 6 | mail -s top admin@foo.bar'"
      5.) In retrospect, I get a LOT of general communication errors with the drives, even now that I've set APM on them to 128. Any idea why? Are there timeouts I can adjust?

      6.) I disabled the session timeout, but I still keep getting booted out. Any idea why?

      Thanks in advance for your replies. I know a bit about Linux, but not enough to understand fully how the architecture hangs together. I'm getting most of my education from looking in the .deb file with 7zip ;)