Linked by Pobrecito Hablador on Mon 2nd Nov 2009 21:19 UTC
Sun Solaris, OpenSolaris
One of the advantages of ZFS is that it doesn't need a fsck. Replication, self-healing and scrubbing are a much better alternative. After a few years of ZFS life, can we say it was the correct decision? The reports in the mailing list are a good indicator of what happens in the real world, and it appears that once again, reality beats theory. The author of the article analyzes the implications of not having a fsck tool and tries to explain why he thinks Sun will add one at some point.
Thread beginning with comment 393000
blu28
Member since:
2009-11-05

The "bad hardware" problems are pretty weak as well, I can't recall hearing NTFS devs or Ext3 devs complaining about it.


Of course not. Suppose that this kind of bad hardware accounted for 1% of FS corruption. It is unlikely that anybody even knows about it, because it is lost in the noise. But now ZFS comes along and gets rid of the other 99%. Now bad hardware is responsible for 100% of the corruption on ZFS, a FS that is designed to have none. That's a big deal. In reality, bad disks are probably responsible for even more corruption on other FSs, but since you have already accepted a bit of corruption with each crash, you can't see the difference. Remember, fsck does not get you back what you lost, and arbitrarily large amounts of data corruption can still occur. In general, though, the amount lost and corrupted is small, and everybody has learned to live with it. But now we have ZFS, and it guarantees consistent and complete data, though possibly a few milliseconds out of date after a crash, assuming the underlying disks follow the standards. Compare that to fsck, where one file may be up to date, a bunch more are a few milliseconds behind, another is corrupted and another is deleted.
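To put numbers on the 1% argument, here is a small Python sketch. The event counts are made up for illustration (1 hardware-caused corruption out of every 100), and `hardware_share` is just a helper name I picked, not anything from ZFS.

```python
def hardware_share(hardware_events, other_events):
    """Fraction of all corruption events that are caused by bad hardware."""
    total = hardware_events + other_events
    return hardware_events / total if total else 0.0

# Hypothetical numbers: 1 hardware-caused event per 100 corruption events.
traditional = hardware_share(hardware_events=1, other_events=99)

# Assumption: the new file system eliminates the other 99 causes,
# but cannot do anything about the one caused by bad hardware.
zfs_like = hardware_share(hardware_events=1, other_events=0)

print(f"traditional FS: {traditional:.0%} of corruption is hardware")
print(f"ZFS-like FS:    {zfs_like:.0%} of corruption is hardware")
```

The absolute number of hardware-caused events is the same in both cases. Only the share changes, which is why nobody noticed the problem before.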

Up until now with ZFS, that 1% caused by bad hardware has left the FS unusable. But with the zpool recovery just added, in that 1% of cases you end up losing a couple of seconds of data and the file system recovers virtually instantaneously. With fsck, by contrast, you scan and rescan and patch to get back to an inconsistent state, with some data from a few seconds ago and some up to date.
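The difference between the two recovery styles can be sketched with a toy model in Python. This assumes recovery works by stepping back to the most recent state that is still intact, which is how I understand the new zpool recovery. It is not how ZFS is actually implemented: `recover_by_rollback`, `all_files_intact` and the `history` list are hypothetical names and data, and a real pool would verify checksums on disk rather than look for `None`.

```python
def recover_by_rollback(txgs, is_valid):
    """Return the newest transaction group that passes validation.

    txgs: list of (txg_id, state) tuples, oldest first, where each state
    is a complete snapshot of the file system (copy-on-write style).
    is_valid: hypothetical checker standing in for checksum verification.
    """
    for txg_id, state in reversed(txgs):
        if is_valid(state):
            return txg_id, state
    raise RuntimeError("no valid transaction group found")

# Made-up history: the last transaction group was damaged by bad hardware.
history = [
    (101, {"a.txt": "v1", "b.txt": "v1"}),
    (102, {"a.txt": "v2", "b.txt": "v1"}),
    (103, {"a.txt": "v2", "b.txt": None}),  # None marks a damaged write
]

def all_files_intact(state):
    """Treat a state as valid only if every file in it is readable."""
    return all(content is not None for content in state.values())

txg_id, state = recover_by_rollback(history, all_files_intact)
print(txg_id, state)
```

In this model every file comes back from the same point in time, slightly out of date but consistent with the others. An fsck-style repair would instead patch the damaged state in place, so some files would be current and others old or missing.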

Score: 1