ZFS: zpool offline fails to change state of faulty drive

  • Shortly after setting up my zpool, one Seagate drive started showing bad sectors ... so it is being returned for a replacement.


    The procedure for replacing the drive seems to be:


    • zpool offline <pool> <bad drive>
    • zpool replace <pool> <bad drive> <new drive>
    • wait for resilver


    Unfortunately, zpool offline does not have the expected result of changing the state to 'OFFLINE' -- the drive still shows up as 'FAULTED'.


    I have tried all the variations of 'zpool offline' I could think of:

    • zpool offline data ata-ST12000NM0007-2A1101_ZJV2LDGN
    • zpool offline data /dev/disk/by-id/ata-ST12000NM0007-2A1101_ZJV2LDGN
    • zpool offline data 3630560290011746901 (GUID)
    • zpool offline -f data 3630560290011746901 (GUID)

    There are no errors reported from the 'zpool offline' commands, but also no change in the state of the drive.
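
    For anyone in the same spot, a quick way to double-check the drive's current state and GUID before retrying the offline (a sketch; the status line below is a made-up sample, not output from this pool):

```shell
# On the real system, the inputs would come from:
#   zpool status data      (device name and state columns)
#   zpool status -g data   (per-vdev GUIDs, usable with zpool offline)
# Here we just parse a sample status line to pull out the state column.
sample='ata-ST12000NM0007-2A1101_ZJV2LDGN  FAULTED      0     0     0  too many errors'
state=$(echo "$sample" | awk '{print $2}')
echo "$state"
```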


    Will this cause me trouble when the time comes to 'zpool replace' this drive with a new drive or should I just chill? :-)


    Thanks for any insights or shared experiences!

  • New drive arrived. 'zpool replace' worked without a hitch, even though the old drive was showing as FAULTED instead of OFFLINE.


    For reference, this is what worked for me:

    Code
    /etc/init.d/zfs-zed stop
    zpool replace data ata-ST12000NM0007-2A1101_ZJV2LDGN /dev/disk/by-id/ata-ST12000NM0007-2A1101_ZJV1T4YF
    /etc/init.d/zfs-zed start
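
    While waiting for the replace to finish, a small helper like this can watch for the completed scan line in 'zpool status' output (a sketch; the sample lines are modeled on typical status output, not captured from this pool):

```shell
# resilver_done: succeeds once `zpool status <pool>` reports a finished resilver.
# Feed it the status output, e.g.: resilver_done "$(zpool status data)"
resilver_done() {
    echo "$1" | grep -q 'scan: resilvered'
}

# Illustrative samples of the two phases:
in_progress='scan: resilver in progress since Fri Jul 20 13:52:01 2014'
finished='scan: resilvered 11.6G in 0h5m with 0 errors on Fri Jul 20 13:57:25 2014'

resilver_done "$in_progress" && echo "done" || echo "still resilvering"
resilver_done "$finished"    && echo "done" || echo "still resilvering"
```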


    In summary: Just chill, everything will work out fine. :-)

  • In mirror vdevs you can detach one drive per vdev while the pool stays online. I also had to send one drive in for warranty and came back with the new one.


    This was on Proxmox, so:


    zpool detach poolname /dev/disk/by-id/ata-ST12000NM0007-2A1101_ZJV2LDGN


    zpool attach poolname /dev/disk/by-id/ata-ST12000NM0007-2A1101_ZJV24C09 /dev/disk/by-id/ata-ST12000NM0007-2A1101_ZJV1T4YF


    Then it starts resilvering.


    BTW, those IronWolf drives seem to last only about 2-2.5 years. I just returned two in the last 4 months that were purchased a week apart. In this case they were returned because SMART started showing an increasing Offline Uncorrectable count.
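
    For spotting that trend early, the raw count can be pulled out of smartctl's attribute table (a sketch; the attribute line below is a made-up sample, and the usual command is smartctl -A on the device):

```shell
# On the real drive:
#   smartctl -A /dev/sdX | grep -i Offline_Uncorrectable
# The raw value is the last column of the attribute line; a sample is parsed here.
sample='198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       24'
raw=$(echo "$sample" | awk '{print $NF}')
echo "$raw"
```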

  • I found this

    https://docs.oracle.com/cd/E53394_01/html/E54801/ghzvz.html


    • Physically connect the replacement disk.
    • Attach the new disk to the root pool: # zpool attach root-pool current-disk new-disk. Here current-disk becomes old-disk, to be detached at the end of this procedure. The correct disk labeling and the boot blocks are applied automatically. Note: if the disks have SMI (VTOC) labels, make sure to include the slice when specifying the disk, such as c2t0d0s0.
    • View the root pool status to confirm that resilvering is complete. If resilvering has completed, the output includes a message similar to: scan: resilvered 11.6G in 0h5m with 0 errors on Fri Jul 20 13:57:25 2014
    • Verify that you can boot successfully from the new disk.
    • After a successful boot, detach the old disk: # zpool detach root-pool old-disk. Here old-disk is the current-disk of Step 2. Note: if the disks have SMI (VTOC) labels, make sure to include the slice when specifying the disk, such as c2t0d0s0.
    • If the attached disk is larger than the existing disk, enable the ZFS autoexpand property: # zpool set autoexpand=on root-pool
