Another way to look at it is to ask whether or not LVM needs an fsck, since the LVM-equivalent layer is the part of the ZFS storage stack that's being worked on.
ZFS filesystems themselves rarely need fixing (I've never come across one, and haven't read about any online, but I've only been using ZFS for a year). They take care of that automatically using self-healing via checksums and redundancy, transactions, and copy-on-write.
The storage pool could become unimportable, but that was usually fixable via arcane voodoo magic commands. Now it's a lot simpler (via the code implemented in the PSARC mentioned above; PSARC is like a support case, or bug report, in Sun-speak).
There are tools for fixing LVM, though. And now there are tools to fix things at the storage pool layer in ZFS.
Asking for "fsck" doesn't make sense, though, as that's the wrong layer in the stack.
PSARC has nothing to do with support cases or bug reports. PSARC stands for Platform Software Architecture Review Committee. That's a group of people in the OpenSolaris design process who discuss and vote on new additions to Solaris whenever a change alters external interfaces or opens new ones (ABI, command-line commands, et al.). It looks bureaucratic at first, but in the end it's responsible for things like the effectiveness of the binary compatibility guarantee and for systemic features such as the tight coupling of containers, ZFS snapshots and the new networking stack (aka Crossbow).
Yeah. The problem/blessing with ZFS is that it detects many more errors than other filesystems, because its checksumming is end-to-end. ZFS being more sensitive than other filesystems is a good thing. Which other filesystem could have caught this?
http://blogs.sun.com/elowe/entry/zfs_saves_the_day_ta
And the problem was not ZFS's fault. ZFS is just the messenger. Don't shoot the messenger.






Sun did the computing world a *huge* disservice by calling it "ZFS, the filesystem". They really should have called it what it actually is: "ZSMS, the Zettabyte Storage Management System". That would have solved so many of these kinds of issues for people.
Once you start looking at it from a storage management position, instead of an "it's just a fancy fs" position, it becomes a lot simpler to understand and work with.
Unfortunately, it's too late now, and these kinds of misunderstandings and misconceptions are just going to continue to get worse.
ZFS "the filesystem" doesn't need an fsck tool. It has features that make sure data is either written correctly, or not written at all. And if a specific block can't be read or doesn't match the checksum, then it pulls it from a different copy.
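To make the "pulls it from a different copy" part concrete, here's a toy sketch in Python. It is not ZFS code: the function names are made up for illustration, SHA-256 is just a stand-in for whatever checksum the pool is configured with, and a plain list of byte strings stands in for the mirrored copies of one block.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Stand-in checksum (assumption); only the idea matters here.
    return hashlib.sha256(data).hexdigest()


def self_healing_read(copies: list, expected: str) -> bytes:
    """Return a copy that matches the expected checksum and
    overwrite any copies that do not match with the good data."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise IOError("no intact copy of this block is left")
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good  # repair the bad copy in place
    return good


# Hypothetical two-way mirror where the first copy was silently corrupted.
block = b"important data"
expected = checksum(block)
mirror = [b"important dat\x00", block]

data = self_healing_read(mirror, expected)
```

The reader gets the good data back, and the bad copy is rewritten as a side effect of the read. The expected checksum is kept apart from the data it describes, which is why a block that was corrupted on disk can't vouch for itself.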
ZFS "the storage pool manager" manages all the storage transactions. If something goes wrong, it can lead to an unimportable storage pool (i.e., all the filesystems and volumes above it are inaccessible). Previously, one had to manually muck around with dd, zdb, and voodoo to tell the storage pool to load from a previous transaction group. Now, one can do that automatically. No filesystem checking is done. It just picks an older point in time (transaction group), and loads from there. All your data (up to that point in time) is intact.
Once the pool is imported, and all the filesystems and volumes are available, you have the option of running a background scrub on the pool (the entire pool, not individual filesystems and volumes) to make sure that the data is intact. The scrub will verify the checksum of every single block in the pool, and repair any that are bad via redundant copies.
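Roughly, the "pick an older point in time" step looks like the toy model below. Again, this is only a sketch and not the real import code: `TxgRoot`, its `intact` flag and `pick_import_txg` are hypothetical names, and a real pool decides whether a transaction group is usable by validating on-disk structures, not by reading a boolean.

```python
from dataclasses import dataclass


@dataclass
class TxgRoot:
    txg: int      # transaction group number
    intact: bool  # assumption: "the tree under this root still validates"


def pick_import_txg(roots):
    """Walk backwards from the newest transaction group and
    return the first one whose root is still usable."""
    for root in sorted(roots, key=lambda r: r.txg, reverse=True):
        if root.intact:
            return root.txg
    raise RuntimeError("no usable transaction group found")


# Hypothetical history: the newest group (102) was damaged mid-write.
history = [TxgRoot(100, True), TxgRoot(101, True), TxgRoot(102, False)]
chosen = pick_import_txg(history)
```

Here the import falls back to group 101. Because of copy-on-write, the blocks that group 101 points at were not overwritten by the later, damaged write, which is why everything up to that point is still there.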
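A scrub, in the same toy style: walk every block in the pool, check every copy against its checksum, and fix what can be fixed. The dict layout, the `scrub` function and the report fields are all invented for this sketch, and SHA-256 is once more just a stand-in checksum.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Stand-in checksum (assumption), as in any toy model of this kind.
    return hashlib.sha256(data).hexdigest()


def scrub(pool: dict) -> dict:
    """Check every copy of every block; repair bad copies from a
    good one and report blocks that have no good copy left."""
    report = {"checked": 0, "repaired": 0, "unrecoverable": []}
    for block_id, (expected, copies) in pool.items():
        good = next((c for c in copies if checksum(c) == expected), None)
        for i, c in enumerate(copies):
            report["checked"] += 1
            if checksum(c) != expected and good is not None:
                copies[i] = good
                report["repaired"] += 1
        if good is None:
            report["unrecoverable"].append(block_id)
    return report


# Hypothetical pool: block id -> (expected checksum, mirrored copies).
pool = {
    "a": (checksum(b"alpha"), [b"alpha", b"alpha"]),  # healthy
    "b": (checksum(b"beta"), [b"beta", b"bXta"]),     # one bad copy
    "c": (checksum(b"gamma"), [b"gXmma", b"gaXma"]),  # both copies bad
}
report = scrub(pool)
```

Block "b" gets repaired from its good copy, while block "c" can only be reported, since nothing intact is left to repair it from. That's the part that depends on redundancy: without a second copy, a scrub can detect damage but not fix it.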
Thus, a filesystem-specific tool that checks one filesystem's on-disk metadata (aka fsck) is not needed. Tools are already available that give a better end result ... just from a different direction.
Edited 2009-11-08 03:25 UTC