Replacing disk in ZFS pool

I have a ZFS pool on Debian Linux where I store my stuff. It is a rather silly arrangement of 6 disks, paired in mirrors. I should have gone with a single RAID-something, but I had no idea what I was doing or what I wanted when I created it.

One of the HDDs has started to report SMART problems. It first happened in September last year. ZFS showed no detected or corrected errors, so I was slow to replace it. The weekly zpool scrub was fine. The only annoying thing was the emails from the SMART daemon, which kept reporting the same 11 sectors. Finally, a week ago, when zpool status showed a corrected problem, I decided to replace the disk.
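The health checks mentioned above can be sketched roughly as follows. This is only an illustration: the pool name tank and the device /dev/sdX are placeholders, and it assumes the 11 sectors were counted under the Current_Pending_Sector attribute, which may not be the attribute the emails actually referred to.

```shell
# Placeholders, not taken from the actual system.
POOL="tank"
DISK="/dev/sdX"

# Extract the raw value of Current_Pending_Sector from `smartctl -A` output
# read on stdin. In the ATA attribute table the raw value is the 10th column.
pending_sectors() {
    awk '$2 == "Current_Pending_Sector" { print $10 }'
}

# Run the checks for real; requires root, smartmontools and ZFS.
check_health() {
    smartctl -A "$DISK" | pending_sectors   # sectors waiting to be remapped
    zpool scrub "$POOL"                      # starts a scrub and returns at once
    zpool status "$POOL"                     # scrub progress and per-disk error counters
}
```

Keeping the parsing in a separate function makes it possible to try it on saved smartctl output without touching a real disk.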

The webshop where I ordered a new disk sent me a CPU by mistake. Later, a new disk arrived.

Time to open the case, clean out all the dust that had collected there since the last time I opened it, and find the old disk among an array of identical-looking metallic boxes.

Once the retired disk was physically replaced, I booted into recovery mode. Only the bare minimum is started at this point, and ZFS pools are not imported.

First, zpool import <poolname>. Then zpool status. Sure enough, the old disk is missing and the pool is marked as degraded, because the mirror is missing one half. A better way would have been to attach the new disk without removing the old one, but I was short on space for an extra disk inside the case.
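As a rough sketch of this step, assuming a pool called tank (a placeholder name): the helper below picks out the devices that zpool status lists as UNAVAIL. The exact state string can differ (REMOVED or FAULTED are also possible), so treat the filter as an assumption rather than a rule.

```shell
POOL="tank"   # placeholder pool name

# Print the names of devices whose state column reads UNAVAIL.
# Reads `zpool status` output on stdin.
missing_vdevs() {
    awk '$2 == "UNAVAIL" { print $1 }'
}

# Import the pool and show what is missing; run as root.
import_and_check() {
    zpool import "$POOL" || return 1
    zpool status "$POOL"                   # full picture, pool shown as DEGRADED
    zpool status "$POOL" | missing_vdevs   # just the absent device(s)
}
```

The name printed by missing_vdevs is what the replace step needs as the old device.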

The missing disk is now identified by a long number instead of a symbolic name. Using this number in zpool replace <poolname> <old disk number> <new disk id> does the trick. The pool is still in the degraded state, but resilvering of the data has started.

A slightly scary thing is that I had to force the replacement with the -f flag. The new disk did not have any EFI partitioning on it, and that made zpool suspicious.
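The replacement could be scripted roughly like this. It is a minimal sketch: the function only prints the zpool replace command so it can be reviewed before being run as root, and the function name and its "force" argument are made up for illustration.

```shell
# Build the replace command as text so it can be checked before running.
# $1 = pool, $2 = number shown for the missing disk, $3 = new disk,
# $4 = optional word "force" to add the -f flag.
replace_cmd() {
    if [ "${4:-}" = "force" ]; then
        printf 'zpool replace -f %s %s %s\n' "$1" "$2" "$3"
    else
        printf 'zpool replace %s %s %s\n' "$1" "$2" "$3"
    fi
}
```

For example, replace_cmd tank 1234567890123456789 /dev/disk/by-id/ata-NEW_DISK force prints the forced variant; all three values are invented. Trying the command without -f first seems the safer order, since the flag overrides a check that exists to protect a disk that may be in use.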

After 7+ hours, the pool is back to normal operation.
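Waiting for the resilver does not need manual checking. A small polling loop along these lines would do, assuming the status output contains the phrase "resilver in progress" while the copy is running; the function names and the 10-minute interval are arbitrary choices.

```shell
# Succeeds while `zpool status` output on stdin reports an active resilver.
resilver_running() {
    grep -q 'resilver in progress'
}

# Poll every 10 minutes until the resilver finishes, then show the result.
# $1 = pool name.
wait_for_resilver() {
    while zpool status "$1" | resilver_running; do
        sleep 600
    done
    zpool status "$1"
}
```

Once the loop exits, the final zpool status should show the pool as ONLINE again if everything went well.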

Written by Grigory Rechistov in Uncategorized on 27.06.2020. Tags: zfs

Copyright © 2020 Grigory Rechistov