OMV: RAID5 extended and receiving "blk_update_request: I/O error" during reshaping

  • Hello community


    Since there was no reply to my German post, I thought it might help to post my issue in English as well :) (I'm really struggling)


    I have a problem with my RAID5 and really can't get any further, which is why I am contacting you.

    I searched for posts about this problem here as well as in other forums, but I couldn't draw a clear conclusion from the information I found, or rather I wasn't sure whether it applied to my problem.


    By the way - for other problems/topics I have researched here in the forum several times and could actually always find a solution (until now) :)


    Information:


    OMV Version: 5.4.7-1 (Usul)

    Note: The system, including the RAID, had always been stable, i.e. no dropouts, errors, etc.


    Problem:

    Using the web UI, I wanted to expand my RAID5 (4x4TB) with another 4TB HDD to 5x4TB.

    Everything went well until the reshape, when the process got stuck at 71.3% and from then on I only got the "blk_update_request: I/O error" error.


    Short excerpt:

    What I've already checked:


    • All connection cables, including replacing them with new ones - no success, this wasn't the issue
    • Whether the disk is spinning (oscillations, vibrations) - the disk is running, so that's fine


    In the RAID management of the web UI, the RAID is no longer displayed to me - a pop-up occurs with the error message "communication failure".

    The strange thing, though, is that not even the RAID1 (md1) is displayed in the web UI RAID management...


    The SMART test for /sdc gave the following result:


    Now the question is whether /sdc is broken, or whether something can still be done to get the reshape process to complete.

    If it is broken: since /sdc is "in the middle" of the reshape process, can the HDD be replaced by another disk (I already have a brand-new one next to me) and the reshape continued from the same point (71.3%)?


    What other information would you need from me in order to help me out?


    I thank you very much in advance for your help!

  • Hi community


    I set /sdc as faulty.

    Code
    mdadm --manage --fail /dev/md0 /dev/sdc

    At least the disk was skipped and the reshape process continued.

    Code
    root@NAS:~# cat /proc/mdstat
    
    Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
    md0 : active raid5 sde[2] sdf[6] sdc[3](F) sdb[4] sdd[5]
          11720658432 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/4] [UUU_U]
          [==============>......]  reshape = 71.5% (2795992432/3906886144) finish=655.7min speed=28235K/sec
          bitmap: 3/8 pages [12KB], 262144KB chunk

    Hopefully this is the solution, but let's wait another 10hrs until it's finished, I'll keep you posted :)
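
    In the meantime, something like this keeps the status refreshing automatically (a minimal sketch, assuming a standard shell on the NAS):

    Code
    # re-run "cat /proc/mdstat" every 60 seconds to follow the reshape
    watch -n 60 cat /proc/mdstat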


    Cheers

    • Official Post

    Since there was no reply to my German post, I thought it might help to post my issue in English as well

    :) It doesn't make any difference, Google Translate does a reasonable job + whilst there are those that use RAID, very few know what to do when it goes wrong.

    Now the question is whether /sdc is broken

    Either the drive has failed/is failing, or it's the cable, or it's the connection to the hardware (SATA port). The clue is in the error: "Unrecovered read error - auto reallocate failed" - that in itself would point to one of the three above.
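
    To narrow that down, re-checking the drive with smartmontools is a common next step (a sketch only, assuming the package is installed and the drive is still /dev/sdc):

    Code
    # run a short self-test, then dump the report once it completes (~2 min)
    smartctl -t short /dev/sdc
    smartctl -a /dev/sdc | grep -Ei 'reallocated|pending|uncorrect|result'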


    Using the web UI, I wanted to expand my RAID5 (4x4TB) with another 4TB HDD to 5x4TB

    "The rule of thumb" is once you go over the drive mark of 4 you should move to raid6

    I set /sdc as faulty

    :thumbup: that was your only option.


    Once it's rebuilt you could start testing sdc: either run DBAN on it, which will reset the drive to zeros, or use dd, create a file system, then try some stress tests. If it's new, raise a return with the supplier.
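
    As a rough sketch of the dd route (destructive - this wipes the disk; the device name is assumed to still be /dev/sdc, so double-check it first):

    Code
    # DESTRUCTIVE: overwrite the whole disk with zeros
    dd if=/dev/zero of=/dev/sdc bs=1M status=progress
    # destructive write-mode surface test; reports any bad blocks it finds
    badblocks -wsv /dev/sdc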

  • Hi geaves


    Thanks for your support, I've managed to rebuild the RAID.

    Everything seems to work, but when I enter a specific folder, I receive this error message (one line as an example):


    Code
    [ 583.452336] EXT4-fs error (device md0): ext4_lookup: 1705: inode #5505025: comm smbd: deleted inode referenced: 124911637

    Then, when I enter a subfolder and head back to the main folder, this appears (one line as an example):


    Code
    [ 1116.506002] EXT4-fs error (device md0): ext4_lookup: 1701: inode #124911643: comm smbd: iget: checksum invalid

    I think this is because of the errors /sdc was giving me during the reshape.

    Is there any way I can "easily" delete this folder? Trying to do it over SMB is giving me an endless loop.
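
    From my searching so far, something like this might at least locate the path belonging to the damaged inode, though I'm not sure it's the right approach (the mount point below is just a guess for my setup):

    Code
    # look up the path for the inode number from the EXT4-fs error
    find /srv/dev-disk-by-label-RAID5 -inum 124911637 2>/dev/null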


    Thanks!

  • Heads up:


    I was able to delete all damaged folders, except for one.

    This is what I get:


    Code
    EXT4-fs error (device md0): htree_dirblock_to_tree:997: inode #60563458: block 2: comm smbd: Directory block failed checksum
    EXT4-fs error (device md0): htree_dirblock_to_tree:997: inode #60563458: block 2: comm rm: Directory block failed checksum

    I turned off SMB, so this error isn't appearing anymore, but that didn't help me any further.


    Would somebody have any suggestion on how to solve this?

    I found something related to "fsck", but I'm not confident about how to use it.


    Thanks!

    • Official Post

    I found something related to "fsck", but I'm not confident about how to use it

    You can run fsck across the array, I think it's fsck /dev/md0. Basically what it's doing is repairing the file system; if it finds an error, it will ask if you want to fix it, and the answer is obviously yes.
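
    Roughly, a run would look like this (a sketch - it assumes the filesystem can be unmounted first, which means stopping SMB and anything else using the array):

    Code
    # unmount the array's filesystem before checking it
    umount /dev/md0
    # force a full check; -y answers yes to every repair prompt
    e2fsck -f -y /dev/md0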

  • Solved it with fsck :) - everything's working now, lost a couple of files but that's fine (nothing important)


    Code
    RAID5: ***** FILE SYSTEM WAS MODIFIED *****
    RAID5: 18962/976723968 files (14.3% non-contiguous), 2572652203/3906886144 blocks
    root@NAS:~# fsck /dev/md0
    fsck from util-linux 2.33.1
    e2fsck 1.45.5 (07-Jan-2020)
    RAID5: clean, 18962/976723968 files, 2572652203/3906886144 blocks


    Many thanks for the support!
