snapraid + drives issue

    • Official post

    (What's your OMV version?)


    Why would you think it's a software bug? You're aware that some USB interfaces filter drive ATA commands, right? (And, while I might be wrong about this, with the box thrown in there, you might have two USB interfaces in a chain.) I'd be looking at the "box".


    What have you tested so far? Have you directly connected the new drive to a USB port on the Odroid? If so, what happened?

    • Official post

    I don't want to step on @flmaxey's toes, but looking at that output, the errors mainly relate to sdd, not sdc. Since most are I/O errors, it's either the drive failing or the external USB unit; the latter could explain the sdc1 error.
    However, line 175 relating to sdc1, and any subsequent sdc1 lines, seem to relate to a problem with mergerfs on sdc in that file system.

  • @geaves is it possible that smartctl shows everything OK for that drive (sdd) even though the dmesg log points to a drive failure? Also, I'm not sure why sdc still has an issue with mergerfs; even after I ran fsck etc., I still get these errors after some time.


    Also, is it really possible that the USB box is dying?

    • Official post

    is it possible that smartctl shows everything OK for that drive (sdd) even though the dmesg log points to a drive failure?

    Yes, you need to look at the drive information. Just because it passes a test doesn't mean the drive is not failing. Checking the information on 2 of my 4 drives, they need replacing even though the test returns OK (on my to-do list).


    If you look at line 175 regarding sdc, EXT4-fs error (device sdc1): htree_dirblock_to_tree:990: inode #151259368: block 605037128: comm mergerfs: bad entry in directory: inode out of bounds - offset=0(0), inode=4180225166, rec_len=3460, name_len=6, it points to a file system error in mergerfs on that drive, but I have no idea what it means or how to fix it.
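    For anyone who wants to pull the useful fields out of a line like that, here is a rough Python sketch. It only handles the message layout quoted above; the pattern and the function name parse_ext4_error are my own for illustration, not part of any tool. As far as I know, the comm field is the name of the process that was accessing the file system when the kernel logged the error.

```python
import re

# Pattern for the EXT4-fs error layout quoted in the post above.
# Other kernel versions or error types may format the line differently.
EXT4_ERROR = re.compile(
    r"EXT4-fs error \(device (?P<device>\w+)\): "
    r"(?P<function>\w+):(?P<line>\d+): "
    r"inode #(?P<inode>\d+): .*?comm (?P<comm>[^:]+):"
)

def parse_ext4_error(log_line):
    """Return device, kernel function, inode and process name, or None."""
    m = EXT4_ERROR.search(log_line)
    if m is None:
        return None
    return {
        "device": m.group("device"),
        "function": m.group("function"),
        "inode": int(m.group("inode")),
        "comm": m.group("comm"),
    }

LINE = (
    "EXT4-fs error (device sdc1): htree_dirblock_to_tree:990: "
    "inode #151259368: block 605037128: comm mergerfs: bad entry in "
    "directory: inode out of bounds - offset=0(0), inode=4180225166, "
    "rec_len=3460, name_len=6"
)

print(parse_ext4_error(LINE))
```

    Feeding it every line of saved dmesg output and counting results per device would show at a glance whether the file system errors are confined to sdc1.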


    The I/O errors could be the drive, the drive's connection in that box, or the hardware that runs that external box on that specific port. I don't know whether you could simply swap two drives' ports and have mergerfs + SnapRAID still function, or remove sdd temporarily and add another drive to test the connection the drive is on. But my bet would be on the drive failing.

    • Official post

    I don't want to step on @flmaxey's toes

    Ouch! :) (Kidding - glad you chimed in.)
    _____________________________________________


    @lenovomi ;
    If you have the hardware to do it (a single drive dock?), I'd still like to see the result of moving the drive out of the box and connecting to the 2nd USB port. (The merged volume should be fine.)


    Note that, unless the drive is the same make and model as the other drives in the box, its firmware will be different. Also of note is that some of those USB multi-drive enclosures and single drive docks have drive size limits. Is the new drive larger than the rest? Just be aware that multi-drive external USB enclosures and other SATA/USB interfaces bring additional variables into the picture.

  • Yes, you need to look at the drive information. Just because it passes a test doesn't mean the drive is not failing. Checking the information on 2 of my 4 drives, they need replacing even though the test returns OK (on my to-do list).




    Hi, but what exactly is this drive information? What has to be checked apart from the drive's smartctl output? Thank you.


    @flmaxey that's the issue: I have no way to do any tests :((

    • Official post

    Hi, but what exactly is this drive information? What has to be checked apart from the drive's smartctl output? Thank you.

    From the CLI, run smartctl -a /dev/sdd and paste the output into a code or spoiler block from the toolbar. @flmaxey will shed his wisdom on the output :) but my guess is it won't be good.
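    If you would rather have the numbers in a form you can compare later, a minimal Python sketch like the one below can pull the raw values out of the attribute table that smartctl -a prints for ATA drives. The regular expression and the function name parse_smart_attributes are assumptions of mine, and the sample rows use the values quoted later in this thread.

```python
import re

# One row of the ATA attribute table: ID, name, flag, value, worst,
# thresh, type, updated, when_failed, raw value (leading integer only).
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from smartctl attribute output."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            attrs[m.group(2)] = int(m.group(10))
    return attrs

SAMPLE = """\
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       81
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       49
"""

print(parse_smart_attributes(SAMPLE))
```

    On a live system the text would come from something like subprocess.run(["smartctl", "-a", "/dev/sdd"], capture_output=True, text=True).stdout, run with root privileges.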

    • Official post

    @flmaxey is better than me at evaluating those, but these stand out:


    05/02:   1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
    23/02:   1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 81


    05/02: # 1 Extended offline Completed without error 00% 50298 -
    23/02: # 1 Extended offline Completed: read failure 90% 50727 2185233880


    05/02: 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
    23/02: 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 49


    To me that points to the errors seen in your first post.
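    The same before/after comparison can be done mechanically. This is only a sketch: the function name changed_attributes is made up, and the two dictionaries hold just the raw values quoted above for 05/02 and 23/02.

```python
def changed_attributes(before, after):
    """Return {name: (old, new)} for raw values that differ between snapshots."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }

# Raw values quoted in the post above (05/02 vs 23/02).
before = {"Raw_Read_Error_Rate": 0, "Current_Pending_Sector": 0}
after = {"Raw_Read_Error_Rate": 81, "Current_Pending_Sector": 49}

for name, (old, new) in changed_attributes(before, after).items():
    print(f"{name}: {old} -> {new}")
```

    Any attribute that shows up in that result, especially 197, deserves a closer look.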

    • Official post

    While read errors are corrected by firmware:


    Based on attribute 197 alone, I'd replace that drive ASAP. In a span of just 428 hours, as @geaves noted, this attribute incremented from 0 to 49. (197 is a harbinger of drive failure.) The drive has close to 6 years in powered-on hours. They don't last much longer than that.
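    For reference, the arithmetic behind those figures, using the lifetime hours from the two extended self-test entries quoted earlier. The self-test log gives a gap of 429 hours, in line with the roughly 428 hours cited; the small difference may simply come from reading a different counter. Variable names are mine.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap days

first_test_hours = 50298   # lifetime hours at the 05/02 extended test
second_test_hours = 50727  # lifetime hours at the 23/02 extended test

elapsed = second_test_hours - first_test_hours
years_powered_on = second_test_hours / HOURS_PER_YEAR

print(elapsed)                     # 429
print(round(years_powered_on, 2))  # 5.79
```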
    _______________________________________________________


    If you can avoid it, don't SYNC SNAPRAID, or add content, until this drive is replaced and SNAPRAID is used to restore it. You might want to look closely at your other drives, if they're all that old.


    BTW: Capturing stats on a regular basis, for comparison, is a good idea. Being able to see "before and after" provides a much better picture.
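    One possible way to do that, sketched in Python under a few assumptions: smartctl is installed and the script runs as root, and the file naming scheme and the function names snapshot_name and save_snapshot are invented for this example. It could be called from a scheduled job.

```python
import subprocess
from datetime import datetime
from pathlib import Path

def snapshot_name(device, when):
    """Build a file name from the device and timestamp (naming is arbitrary)."""
    return f"{Path(device).name}_{when:%Y-%m-%d_%H%M}.txt"

def save_snapshot(device, out_dir, run=subprocess.run, now=datetime.now):
    """Write the output of 'smartctl -a <device>' to a timestamped file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # smartctl uses its exit status as a bit mask, so a non-zero status is
    # not treated as fatal here; the text output is saved either way.
    result = run(["smartctl", "-a", device], capture_output=True, text=True)
    target = out_dir / snapshot_name(device, now())
    target.write_text(result.stdout)
    return target
```

    Calling save_snapshot("/dev/sdd", "/some/log/dir") once a day or once a week would leave a trail of files that can be compared whenever a drive starts to misbehave. The output directory here is a placeholder.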

    • Official post

    Does it mean drive failure, and that it has to be replaced by a new one?

    Absolutely, if you want to safeguard your data. The sooner the better and there's no such thing as too soon. Look at this as good fortune. Some drives fail with no warning at all.


    @flmaxey is better than me at evaluating those but these stand out;

    Somehow, someway, if you really dug down deep :D , I'm pretty sure you would have made the same call. (No consult required.) :)


    It's fortunate that OMV will do e-mail notifications based on SMART stat changes. It's good to catch drive problems before they corrupt data or the drive dies completely.
