Replacing a Hard Drive in a ZFS Pool
Currently, the openmediavault-zfs plugin does not allow replacing a hard drive in a pool from the GUI. Therefore, the best way to do it is through the CLI.
1. Quick Explanation of the Process
There are several ways to replace a hard drive in a pool. Here, I'll explain one that works for virtually any situation. The steps are summarized as follows:
- Identify the serial number of the drive you want to replace.
- Shut down your server and physically replace the drive with another of the same size or larger.
- Start the server, identify the newly inserted drive, and run the CLI command that will replace the hard drive in the pool. ZFS will handle the entire process.
2. Real case: Replacing a Hard Drive in a Raidz1
We start with a Raidz1 consisting of three 12TB hard drives. After some time, one of them starts showing errors. ZFS appears to correct them on its own, but I've been wary of that drive for a while, so I'm going to replace it with a new one. The current status of the pool is as follows:
~# zpool status
pool: DATA
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
scan: resilvered 2.35T in 05:16:26 with 0 errors on Sun Apr 20 22:30:28 2025
config:
NAME                                   STATE     READ WRITE CKSUM
DATA                                   ONLINE       0     0     0
  raidz1-0                             ONLINE       0     0     0
    ata-ST12000VN0008-2YS101_WV7009YL  ONLINE       0     0     0
    ata-ST12000VN0008-2PH103_WR8008CY  ONLINE       0     0     0
    ata-TOSHIBA_HDWG21C_90M0A01HFP8F   ONLINE       0     0     1
errors: No known data errors
ZFS is already warning that something is wrong. The Toshiba hard drive is the one to replace. In this case it's easy to identify, since the other two are Seagate, but in any case the device name in the pool already gives us the serial number: 90M0A01HFP8F, which we can use to physically locate the drive in the server.
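Since the serial number is the last underscore-separated field of the by-id device name, it can also be pulled out with plain shell parameter expansion. A minimal sketch, using the device name from the pool above:

```shell
# The by-id device name encodes vendor/model and serial; the serial
# is everything after the last underscore.
id="ata-TOSHIBA_HDWG21C_90M0A01HFP8F"
serial="${id##*_}"   # strip everything up to and including the last underscore
echo "$serial"       # prints 90M0A01HFP8F
```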
After doing nothing more than updating my backup, I shut down the server and physically replace the Toshiba drive with a new Seagate, in this case the same model as the other two in the pool, although that wouldn't be necessary.
The pool status after starting the server is as follows:
~# zpool status
pool: DATA
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
scan: resilvered 2.35T in 05:16:26 with 0 errors on Sun Apr 20 22:30:28 2025
config:
NAME                                   STATE     READ WRITE CKSUM
DATA                                   DEGRADED     0     0     0
  raidz1-0                             DEGRADED     0     0     0
    ata-ST12000VN0008-2YS101_WV7009YL  ONLINE       0     0     0
    ata-ST12000VN0008-2PH103_WR8008CY  ONLINE       0     0     0
    1055532610611156137                UNAVAIL      0     0     0  was /dev/disk/by-id/ata-TOSHIBA_HDWG21C_90M0A01HFP8F-part1
errors: No known data errors
The pool is degraded because a disk is missing, but it's still working. The disk I removed from the server now appears in the pool only by its numeric GUID: 1055532610611156137
To identify the newly inserted hard drive, I run ls -1 /dev/disk/by-id/ and the output is this:
ls -1 /dev/disk/by-id/
ata-ST12000VN0008-2PH103_WR8008CY
ata-ST12000VN0008-2PH103_WR8008CY-part1
ata-ST12000VN0008-2PH103_WR8008CY-part9
ata-ST12000VN0008-2PH103_ZLW2KVBM
ata-ST12000VN0008-2YS101_WV7009YL
ata-ST12000VN0008-2YS101_WV7009YL-part1
ata-ST12000VN0008-2YS101_WV7009YL-part9
... (Here are the rest of the devices in the system that I have removed from the post)
The new hard drive is easy to spot: it's the only Seagate in my system without partitions. It's this one: ata-ST12000VN0008-2PH103_ZLW2KVBM, whose serial number matches the one printed on its label: ZLW2KVBM
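The same "only disk without partitions" check can be scripted: keep only the by-id names that never appear with a -partN suffix. A minimal sketch over a captured listing (the names are the ones from this post; on a real system, pipe `ls -1 /dev/disk/by-id/` into the awk filter instead):

```shell
# Simulated 'ls -1 /dev/disk/by-id/' output (abbreviated from the listing above).
listing="ata-ST12000VN0008-2PH103_WR8008CY
ata-ST12000VN0008-2PH103_WR8008CY-part1
ata-ST12000VN0008-2PH103_WR8008CY-part9
ata-ST12000VN0008-2PH103_ZLW2KVBM
ata-ST12000VN0008-2YS101_WV7009YL
ata-ST12000VN0008-2YS101_WV7009YL-part1
ata-ST12000VN0008-2YS101_WV7009YL-part9"

# Keep only whole-disk names that never show up with a -partN suffix:
unpartitioned=$(printf '%s\n' "$listing" | awk '
  /-part[0-9]+$/ { sub(/-part[0-9]+$/, ""); has_part[$0] = 1; next }
  { disks[$0] = 1 }
  END { for (d in disks) if (!(d in has_part)) print d }')
echo "$unpartitioned"   # prints ata-ST12000VN0008-2PH103_ZLW2KVBM
```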
All that's left to do is run the command for ZFS to replace the missing disk, now identified as 1055532610611156137, with the new disk, which in my case will be /dev/disk/by-id/ata-ST12000VN0008-2PH103_ZLW2KVBM. Therefore, the command to run is:
zpool replace -f DATA 1055532610611156137 /dev/disk/by-id/ata-ST12000VN0008-2PH103_ZLW2KVBM
Where:
zpool replace -f = the command that tells ZFS to replace the drive; -f forces the use of the new disk even if it appears to be in use
DATA = pool name (adjust it to your configuration)
1055532610611156137 = ID of the missing hard drive (take it from the zpool status output)
/dev/disk/by-id/ata-ST12000VN0008-2PH103_ZLW2KVBM = ID of the new hard drive
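Put together as a reusable template (the variable names are mine; substitute your own pool name, old-disk GUID, and new by-id path):

```shell
# Build the replace command from the three pieces identified above.
POOL="DATA"
OLD_ID="1055532610611156137"                                  # GUID shown by 'zpool status' for the missing disk
NEW_DEV="/dev/disk/by-id/ata-ST12000VN0008-2PH103_ZLW2KVBM"   # by-id path of the new disk
CMD="zpool replace -f $POOL $OLD_ID $NEW_DEV"
echo "$CMD"   # print it first, review it, then run it
```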
A short while after running this command, the pool status is as follows:
~# zpool status
pool: DATA
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Fri Apr 25 19:14:20 2025
3.37T / 20.1T scanned at 759M/s, 2.36T / 20.1T issued at 530M/s
805G resilvered, 11.73% done, 09:44:38 to go
config:
NAME                                     STATE     READ WRITE CKSUM
DATA                                     DEGRADED     0     0     0
  raidz1-0                               DEGRADED     0     0     0
    ata-ST12000VN0008-2YS101_WV7009YL    ONLINE       0     0     0
    ata-ST12000VN0008-2PH103_WR8008CY    ONLINE       0     0     0
    replacing-2                          DEGRADED     0     0     0
      1055532610611156137                UNAVAIL      0     0     0  was /dev/disk/by-id/ata-TOSHIBA_HDWG21C_90M0A01HFP8F-part1
      ata-ST12000VN0008-2PH103_ZLW2KVBM  ONLINE       0     0     0  (resilvering)
errors: No known data errors
ZFS is replacing the old Toshiba drive with the new Seagate and automatically resilvering the pool's data onto the new drive. The process will take several hours to complete.
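As a rough sanity check on the estimate ZFS prints, the remaining time can be derived from the issued totals and rate shown in the status output above (awk handles the floating-point math; the figures are the ones zpool status reported):

```shell
# 20.1T to issue in total, 2.36T already issued, at 530M/s (from the status above).
eta=$(awk 'BEGIN {
  remaining = 20.1 - 2.36                      # TiB still to issue
  secs = int(remaining * 1024 * 1024 / 530)    # seconds at 530 MiB/s
  printf "%dh %02dm", secs / 3600, (secs % 3600) / 60
}')
echo "$eta"   # close to the "09:44:38 to go" that zpool reported
```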
The operation ultimately took about 12 and a half hours, and the final result was this:
~# zpool status
pool: DATA
state: ONLINE
scan: resilvered 6.70T in 12:30:21 with 0 errors on Sat Apr 26 07:44:41 2025
config:
NAME                                   STATE     READ WRITE CKSUM
DATA                                   ONLINE       0     0     0
  raidz1-0                             ONLINE       0     0     0
    ata-ST12000VN0008-2YS101_WV7009YL  ONLINE       0     0     0
    ata-ST12000VN0008-2PH103_WR8008CY  ONLINE       0     0     0
    ata-ST12000VN0008-2PH103_ZLW2KVBM  ONLINE       0     0     0
errors: No known data errors
Done. The hard drive has been replaced, the pool status is ONLINE, and everything is in order. The entire process ran without any problems, driven by a single command.
3. References
Official OpenZFS documentation -> https://openzfs.github.io/open…er/8/zpool-replace.8.html
Guide to replacing a hard drive in a ZFS pool -> https://jordanelver.co.uk/blog…led-disk-in-a-zfs-mirror/
Useful information -> RE: Help to understand proper ZFS disk replacement process
omv-extras wiki -> The openmediavault-zfs wiki document is currently under development. When completed, it will likely describe this process among many others.