ZFS: zpool offline fails to change state of faulty drive

    • OMV 5.x (beta)
    • Resolved
    • ZFS: zpool offline fails to change state of faulty drive

      Shortly after setting up my zpool, one Seagate drive started showing bad sectors ... so it is being returned for a replacement.

      The procedure for replacing the drive seems to be the following (concrete commands for my pool are sketched below):

      • zpool offline <pool> <bad drive>
      • zpool replace <pool> <bad drive> <new drive>
      • wait for resilver
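
      For my pool I assume that would look something like this (the replacement disk ID here is just a placeholder, since the new drive hasn't arrived yet):

      zpool offline data ata-ST12000NM0007-2A1101_ZJV2LDGN
      zpool replace data ata-ST12000NM0007-2A1101_ZJV2LDGN /dev/disk/by-id/ata-NEW_DISK_ID
      zpool status data    # wait here until the resilver completes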


      Unfortunately, zpool offline does not have the expected result of changing the state to 'OFFLINE' -- the drive still shows up as 'FAULTED'.

      Source Code

      root@omv-nas:~# zpool status
        pool: data
       state: DEGRADED
      status: One or more devices are faulted in response to persistent errors.
              Sufficient replicas exist for the pool to continue functioning in a
              degraded state.
      action: Replace the faulted device, or use 'zpool clear' to mark the device
              repaired.
        scan: none requested
      config:

              NAME                                   STATE     READ WRITE CKSUM
              data                                   DEGRADED     0     0     0
                mirror-0                             ONLINE       0     0     0
                  ata-ST12000NM0007-2A1101_ZJV310R4  ONLINE       0     0     0
                  ata-ST12000NM0007-2A1101_ZJV3B4T7  ONLINE       0     0     0
                mirror-1                             DEGRADED     0     0     0
                  ata-ST12000NM0007-2A1101_ZJV24C09  ONLINE       0     0     0
                  ata-ST12000NM0007-2A1101_ZJV2LDGN  FAULTED      0     0     0  external device fault
                mirror-2                             ONLINE       0     0     0
                  ata-WDC_WD80EFZX-68UW8N0_VK1B10SY  ONLINE       0     0     0
                  ata-WDC_WD80EFZX-68UW8N0_VK1E4SPY  ONLINE       0     0     0
                mirror-3                             ONLINE       0     0     0
                  ata-WDC_WD80EFZX-68UW8N0_VK1E696Y  ONLINE       0     0     0
                  ata-WDC_WD80EFZX-68UW8N0_VLH2U7GY  ONLINE       0     0     0

      errors: No known data errors

      I have tried all the variations of 'zpool offline' I could think of:
      • zpool offline data ata-ST12000NM0007-2A1101_ZJV2LDGN
      • zpool offline data /dev/disk/by-id/ata-ST12000NM0007-2A1101_ZJV2LDGN
      • zpool offline data 3630560290011746901 (GUID)
      • zpool offline -f data 3630560290011746901 (GUID)
      There are no errors reported from the 'zpool offline' commands, but also no change in the state of the drive.
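
      (For reference, I got the GUID from 'zpool status -g', which as far as I understand prints vdev GUIDs in place of the device names:)

      zpool status -g data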

      Will this cause me trouble when the time comes to 'zpool replace' this drive with a new drive, or should I just chill? :)

      Thanks for any insights or shared experiences!


    • New drive arrived. 'zpool replace' worked without a hitch, even though the old drive was showing as FAULTED instead of OFFLINE.

      For reference, this is what worked for me:

      Source Code

      /etc/init.d/zfs-zed stop
      zpool replace data ata-ST12000NM0007-2A1101_ZJV2LDGN /dev/disk/by-id/ata-ST12000NM0007-2A1101_ZJV1T4YF
      /etc/init.d/zfs-zed start
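
      To keep an eye on the resilver while it runs, plain 'zpool status' is enough (it shows progress and an estimated completion time):

      zpool status data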

      In summary: Just chill, everything will work out fine. :)
    • davidknudsen wrote:

      New drive arrived. 'zpool replace' worked without a hitch, even though the old drive was showing as FAULTED instead of OFFLINE.
      In mirror vdevs you can detach one drive per vdev with the pool online. I also had to take one drive in for warranty and came back with a new one.

      This was in Proxmox, so:

      zpool detach poolname /dev/disk/by-id/ata-ST12000NM0007-2A1101_ZJV2LDGN

      zpool attach poolname /dev/disk/by-id/ata-ST12000NM0007-2A1101_ZJV24C09 /dev/disk/by-id/ata-ST12000NM0007-2A1101_ZJV1T4YF

      Then it starts resilvering.
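
      Once it is done you can check that the vdev is healthy again and, if needed, reset old error counters, something like:

      zpool status poolname    # new disk should show ONLINE once the resilver completes
      zpool clear poolname     # optional: clears stale per-device error counts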

      BTW those IronWolf drives seem to last only about 2 to 2.5 years; I just returned two in the last 4 months that were purchased a week apart. In this case they were returned because SMART started showing increasing offline uncorrectable sectors.
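
      If you want to watch that attribute yourself, smartmontools will show it (the device path is just an example, adjust it for your disk):

      smartctl -A /dev/sda | grep -i -e Offline_Uncorrectable -e Reallocated_Sector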
    • I found this

      Replacing Disks in a ZFS Root Pool

      You might need to replace a disk in the root pool for the following reasons:

      • The root pool is too small and you want to replace it with a larger disk
      • The root pool disk is failing. In a non-redundant pool, if the disk is failing and the system no longer boots, boot from another source such as a CD or the network. Then, replace the root pool disk.

      You can replace disks by using one of two methods:

      • Using the zpool replace command. This method involves scrubbing and clearing the root pool of dirty time logs (DTLs), then replacing the disk. After the new disk is installed, you apply the boot blocks manually.
      • Using the zpool detach|attach commands. This method involves attaching the new disk and verifying that it is working properly, then detaching the faulty disk.

      If you are replacing root pool disks that have the SMI (VTOC) label, ensure that you fulfill the requirements described in the full procedure at docs.oracle.com/cd/E53394_01/html/E54801/ghzvz.html

      1. Physically connect the replacement disk.
      2. Attach the new disk to the root pool: # zpool attach root-pool current-disk new-disk. Here current-disk becomes the old-disk to be detached at the end of this procedure. The correct disk labeling and the boot blocks are applied automatically. Note - If the disks have SMI (VTOC) labels, make sure that you include the slice when specifying the disk, such as c2t0d0s0.
      3. View the root pool status to confirm that resilvering is complete. If resilvering has completed, the output includes a message similar to the following: scan: resilvered 11.6G in 0h5m with 0 errors on Fri Jul 20 13:57:25 2014
      4. Verify that you can boot successfully from the new disk.
      5. After a successful boot, detach the old disk: # zpool detach root-pool old-disk, where old-disk is the current-disk of Step 2. Note - If the disks have SMI (VTOC) labels, make sure that you include the slice when specifying the disk, such as c2t0d0s0.
      6. If the attached disk is larger than the existing disk, enable the ZFS autoexpand property: # zpool set autoexpand=on root-pool
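
      Put together as plain commands, using the same placeholder names as the doc, the sequence is roughly:

      zpool attach root-pool current-disk new-disk    # labeling and boot blocks are applied automatically
      zpool status root-pool                          # wait until the resilver has completed
      zpool detach root-pool current-disk             # only after a successful boot from new-disk
      zpool set autoexpand=on root-pool               # only if new-disk is larger than current-disk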

