Replacing a disk in a ZFS pool
I have a ZFS pool on Debian Linux where I store my stuff. It is a rather silly arrangement: an array of 6 disks paired into mirrors. I should have gone with a single RAID-something, but I had no idea what I was doing or what I wanted when I created it.
One of the HDDs started to report SMART problems. It first happened in September last year. No detected or corrected ZFS issues were showing up, so I was slow to replace it. The weekly zpool scrub was fine. The only annoying thing was the emails from the SMART daemon, reporting the same 11 sectors. Finally, a week ago, when zpool status showed a corrected problem, I decided to replace the disk.
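For reference, keeping an eye on such a disk looks roughly like this; the device name /dev/sdb and the pool name tank below are placeholders, not the actual names from my setup:

    # query the SMART attributes of the suspect disk (device name is an example)
    smartctl -a /dev/sdb | grep -E 'Reallocated_Sector|Current_Pending_Sector|Offline_Uncorrectable'

    # run a manual scrub and check the result afterwards (pool name is an example)
    zpool scrub tank
    zpool status tank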
The webshop where I ordered a new disk sent me a CPU by mistake. Eventually, the right disk arrived.
Time to open the case, clean out all the dust that had collected there since the last time I opened it, and find the old disk among an array of identical-looking metal boxes.
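Telling the drives apart is easier if the serial number is printed on the drive's label, which it usually is; the serial as seen from the system can be read like this (the device name is again a placeholder):

    # the by-id names usually embed the model and serial number
    ls -l /dev/disk/by-id/ | grep -v part

    # or ask the disk directly for its serial number
    smartctl -i /dev/sdb | grep -i serial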
Once the retired disk was physically replaced, I booted into restore mode. Only a bare minimum is started at this point, and ZFS pools are not imported.
First, zpool import <poolname>. Then zpool status. Sure enough, the old disk is missing and the pool is marked as degraded, because the mirror is missing one half. A better approach would have been to attach the new disk before removing the old one, but I was short on space for an extra disk inside the case.
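In that restore environment, the sequence looked roughly like this, with tank standing in for the real pool name:

    # import the pool by name; it comes up even with one mirror half missing
    zpool import tank

    # the missing disk is listed by a long numeric id rather than a device name,
    # and the pool as a whole is reported as DEGRADED
    zpool status tank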
The missing disk is now identified by a long number instead of a symbolic name. Using this number in zpool replace <poolname> <old disk number> <new disk id> does the trick. The pool is still in a degraded state, but resilvering of the data has started.
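Spelled out, the command looks something like this; the pool name, the numeric id, and the by-id path of the new disk are placeholders:

    # replace the missing device (identified by its numeric id in zpool status)
    # with the new disk, referenced here by its stable /dev/disk/by-id name
    zpool replace tank 1234567890123456789 /dev/disk/by-id/ata-EXAMPLE_DISK_SERIAL

    # resilvering starts immediately; progress shows up in zpool status
    zpool status tank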
A slightly scary thing is that the replacement operation had to be forced with the -f flag. The new disk did not have any EFI partitioning on it, and that made zpool suspicious.
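For the curious, this is roughly how the situation can be inspected, and how the forced variant looks, again with placeholder names:

    # a brand-new disk typically has no partition table at all; PTTYPE shows up empty
    lsblk -o NAME,SIZE,PTTYPE /dev/disk/by-id/ata-EXAMPLE_DISK_SERIAL

    # force the replacement despite the missing EFI/GPT label;
    # ZFS partitions the disk itself as part of the replace
    zpool replace -f tank 1234567890123456789 /dev/disk/by-id/ata-EXAMPLE_DISK_SERIAL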
After 7+ hours, the pool is back to normal operation.
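The resilver can be watched from another shell; once it finishes, the pool returns to the ONLINE state (pool name is again a placeholder):

    # re-check every few minutes until the resilver is done
    watch -n 300 zpool status tank

    # afterwards the status should report the pool as ONLINE with 0 errors
    zpool status tank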