Unable to import pool due to mdadm #11349
-
The Linux system was working and a VM was running on the zpool before reboot. After the reboot the zpool did not import (state: UNAVAIL). It's not the end of the world to lose this data (I even have some older backups), but it's a pain nonetheless. If the zpool was running fine before the reboot, would it have been impossible for ZFS to take whatever in-memory state was keeping the pool running, copy it to some non-ZFS partition, and reuse that information to re-import the pool after the reboot? Even a warning that the zpool was in trouble and might not survive a reboot would have been helpful. I (wrongly?) assumed that ZFS would actively warn us of any serious problems.
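For anyone hitting the same symptom, a minimal first round of diagnostics, before assuming the data is gone, might look like the sketch below; it only reads state and modifies nothing:

```sh
# List pools that are visible but not yet imported, with per-device status
zpool import

# Show which disks and partitions the kernel currently sees,
# and what signatures (zfs_member, linux_raid_member, ...) they carry
lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT
```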
Replies: 6 comments
-
To me it looks like sda3 and sdb3 are not available. Can you check whether sda and sdb are present and whether each has a third partition? Also, it looks like you are using a striped pool. While that is doable, I would recommend frequent backups, since losing one disk means losing all the data.
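One way to check that, assuming the devices really are /dev/sda and /dev/sdb as reported by the pool:

```sh
# Confirm both disks and their third partitions exist
lsblk /dev/sda /dev/sdb

# Print the partition tables, including the type codes
fdisk -l /dev/sda /dev/sdb

# Show what on-disk signatures the partitions carry
blkid /dev/sda3 /dev/sdb3
```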
-
Thanks for the reply. I actually thought UNAVAIL meant the data was corrupted and the two halves couldn't be put back together. Also, smartctl on sda was saying FAILED, i.e. replace immediately, going to die within 24 hours. But guess what? smartctl still says sda is dying, yet I was able to import the pool with no problem after all. So what happened? Apparently, after the reboot, mdadm snapped up sda3 and sdb3 and assembled them into a striped array! Luckily, after simply stopping that unused mdadm stripe array, which I hadn't noticed, zpool import worked.
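For reference, a sketch of that sequence (the md device name and the pool name `tank` are placeholders; use whatever `/proc/mdstat` and `zpool import` actually report):

```sh
# See what mdadm auto-assembled after the reboot
cat /proc/mdstat
mdadm --detail --scan

# Stop the unwanted array so the partitions are released
mdadm --stop /dev/md0        # md0 is a placeholder

# With the partitions free again, the pool imports normally
zpool import tank            # "tank" is a placeholder pool name
```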
-
Presumably sda3 and sdb3 were previously part of an md striped array, which is why mdadm claimed them. Both md and ZFS want exclusive access to the block devices, which is why they ended up being reported as UNAVAIL.
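A non-destructive way to check that theory is to look for a leftover md superblock on the partitions; both commands below only read metadata:

```sh
# Report any md RAID superblock present on the partitions
mdadm --examine /dev/sda3 /dev/sdb3

# List every filesystem/RAID signature found (zfs_member, linux_raid_member, ...)
wipefs /dev/sda3 /dev/sdb3
```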
-
I don't believe they were previously part of an mdadm array (though I can't 100% rule it out). When the partitions were made, perhaps the partition type label had something to do with it. What partition type should be used for ZFS (or which numerical ID, as per fdisk)? (Update: it seems the partitions were set to type "fd", Linux RAID autodetect, not "83".)
Also, could ZFS somehow be impacting network speed? After the reboot, before bringing the zpool back with zpool import, bandwidth was back to the usual 88+ MB/s; now, after zpool import, it's back to a strange 1.1 MB/s limit. I'm not sure whether this is related to the zpool having a failing disk in the array, but could ZFS affect network bandwidth in some strange way, or is it related to smartctl saying sda is failing? FYI, for testing I was using wget -O /dev/null http... It seemed strange that bandwidth was good before the zpool import, but perhaps it's a coincidence and some other issue.
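For what it's worth, MBR type "fd" is Linux RAID autodetect while "83" is plain Linux, and ZFS itself does not require a particular MBR type code (when given whole disks it labels its own GPT partitions). A sketch of the cleanup, assuming the goal is just to stop md from grabbing these partitions again (device names and config path are assumptions):

```sh
# Show the current type codes ("fd" = Linux RAID autodetect, "83" = Linux)
fdisk -l /dev/sda /dev/sdb

# Change partition 3's type code non-interactively (same effect as fdisk's 't');
# this only rewrites the type byte, not the partition contents
sfdisk --part-type /dev/sda 3 83
sfdisk --part-type /dev/sdb 3 83

# Optionally tell mdadm not to auto-assemble arrays that are not listed in its
# config (Debian/Ubuntu path shown; other distros use /etc/mdadm.conf, and the
# initramfs copy may need regenerating for this to apply at boot)
echo 'AUTO -all' >> /etc/mdadm/mdadm.conf
```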
-
It is most likely the disk. Remember, you have a STRIPED pool, so every read touches both disks, and a failing disk may need to retry defective sectors over and over to return data. My recommendation: back up all the important data NOW, do not waste time, and then replace the disk. Chances are high that you will encounter unreadable sectors, which will then corrupt some data.
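A rough sketch of what that looks like once a replacement disk is physically installed (the pool name `tank` and the new device `/dev/sdc` are placeholders). Note that on a striped pool, `zpool replace` has to copy from the failing disk itself, so any sectors it cannot read will show up as permanent errors:

```sh
# Confirm the SMART verdict and the raw error counters
smartctl -H /dev/sda
smartctl -A /dev/sda

# Swap the failing device out of the pool and watch the copy/resilver
zpool replace tank /dev/sda3 /dev/sdc3   # placeholders: pool "tank", new disk sdc
zpool status -v tank
```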
-
I was busy, so I forgot to update: yeah, you're right that write or even read operations on the failing disk could have caused an issue that slowed down networking (via a general slowdown). I did confirm that zpool import itself does not kill the bandwidth, so it must have been coincidental and secondary to load or other issues related to the dying disk. I'm finishing up backups before replacing the drive (it's in a datacenter, so I can't just add a spare quickly).
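If it happens again, one way to confirm the dying disk (rather than ZFS or the network) is the bottleneck, assuming sysstat is installed for iostat:

```sh
# Per-device utilisation and latency while the slowdown is happening;
# a drive stuck retrying bad sectors tends to sit near 100% util with huge await
iostat -x sda sdb 1

# Kernel-level I/O errors from the failing disk usually show up here too
dmesg | grep -iE 'sda|i/o error' | tail -n 20
```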
I'm trying to copy over the files now (getting some I/O errors) before swapping out the failing sda drive.
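In case it helps someone else, a sketch of that copy step: rsync reports files it cannot read and keeps going rather than aborting, and `zpool status -v` lists the files ZFS already knows are damaged (the paths and the pool name `tank` are placeholders):

```sh
# Copy whatever is still readable; unreadable files are reported at the end
# and rsync exits non-zero (code 23, partial transfer) instead of stopping
rsync -aHv --progress /tank/vm/ /backup/vm/     # placeholder paths

# List the files ZFS has recorded permanent errors for
zpool status -v tank                            # placeholder pool name
```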