> The key thing to note is that all future Solaris work is being done assuming ZFS as the primary file system... so everything else (for the local machine anyway) is legacy.
> The first release of OpenSolaris installs upon ZFS only, and that OS is going to be the basis of whatever Sun cobbles together to call Solaris 11, if that even happens.
It would be advisable to stay on topic and edit out any snipey and unprofessional off-topic asides like the above quoted material. This article is supposed to be about "Solaris Filesystem Choices". Please talk about Solaris filesystems.
Aside from some understandable concerns about layering, I think most "Linux folks" recognize that ZFS has some undeniable strengths.
I hope that this Article Contest does not turn into a convenient platform from which authors feel they can hurl potshots at others.
Edited 2008-04-21 20:25 UTC
Both of those quoted sentences are factual, and I think it's important to understand that technology and politics are never isolated subjects.
However, I understand the spirit of your sentiment. In my defense, I wrote the article both to educate and to entertain. If a person just wants to know about Solaris filesystems, the Sun docs are way better than anything I might write.
Let's not confuse facts with speculation.
You wrote: "It has also gotten some derision from Linux folks who are accustomed to getting that hype themselves."
In interpretive writing, you can establish that "[ZFS] has gotten some derision from Linux folks" by providing citations (which you did not provide, actually).
But appending "... who are accustomed to getting that hype themselves" is tacky and presumptuous. Do you have references to demonstrate that Linux advocates deride ZFS specifically because they are not "getting hype"? If not, this is pure speculation on your part. So don't pretend it is fact.
Moreover, referring to "Linux folks" in this context is to make a blanket generalization.
+1. The author of this article is clearly a tacky, presumptuous speculator, short on references and long on partisanship.
Seriously, I know I shouldn't reply here, but in the light of the above revelation, I will. It is extremely silly to turn this into some semantic argument on whether I can find documentation on what is in someone's heart. If I could find just two 'folks' who like linux and resent non-linux hype relating to ZFS, it would make my statement technically a fact. Are you willing to bet that these two people don't exist?
Yet, would this change anything? No, it would be complete foolishness. Having spent my time in academia, I am tired of this kind of sophistry of "demonstrating facts". I read, try things, form opinions, write about it. You have the same opportunity.
I figure that with popularity comes envy of that popularity. And with that comes potshots. Ask any celebrity. As Morrissey sings, "We Hate It When Our Friends Become Successful".
http://www.oz.net/~moz/lyrics/yourarse/wehateit.htm
It's probably best to simply expect potshots to be taken at Linux and Linux users and accept them with good grace. Politely pointing out the potshots is good form. Drawing them out into long flame-threads (as has not yet happened here) is annoying to others and is thus counterproductive. It just attracts more potshots.
Edited 2008-04-22 18:41 UTC
Don't twist words. My comments were quite obviously in reference to a particular sentence. (I'd add that I enjoyed the majority of your essay.)
Facts are verifiable through credible references. This is basic Supported Argument 101.
Good god, man. What academic world do you come from where you don't have to demonstrate facts? You're the one insisting that your statements are fact.
Sure you're entitled to your opinion. But don't confuse facts with speculation. That is all.
edit: added comment
Edited 2008-04-22 19:54 UTC
> ...demonstrate that Linux advocates deride ZFS specifically because they are not "getting hype"?
I don't think he said one causes the other or is a result of jealousy. He merely pointed out that Linux advocates usually receive hype rather than derision but in this case many gave derision. I fail to see the "specifically because."
Could you elaborate on that? I have some concerns about mixing the fs and raid layers. But the self healing features are attractive. And the admin utilities are a dream. I only wish that I had such a nice command-line interface to manipulate fdisk/mdadm/lvm/mkfs in the Linux world. There is no reason that this could not be done. But the fact is that, in all these years, it hasn't been done. People point me at EVMS when I speak along these lines. But EVMS really doesn't cut it. In fact, every time I check it out, I come away wondering what it is really for, and what problem it actually solves.
Furthermore, I can imagine cases where plain UFS is the best solution (i.e., where ZFS would be "too much of a good thing"), for example on systems with fewer resources or where extending the storage "pool" won't happen. UFS is a very stable and fast file system (the article mentions this), and along with the well-known UNIX mounting operations, it can still be very powerful. For example, FreeBSD uses the UFS2 file system with "soft updates". But well, these settings usually aren't places where Solaris comes into use.
But remember, kids: This doesn't obsolete your accurate backups. :-)
Once you have taken the time to read the zfs manpages, these utilities are very welcome. In particular, the central zfs service program interface makes formatting and mounting very easy. It has advantages over the relatively static /etc/vfstab.
And nice to see that the Veritas volume manager has been mentioned in the article. IN VINUM VERITAS. See vinum(8) manpage. =^_^=
Edited 2008-04-21 21:38 UTC
Wiser words are rarely spoken
Wow, never heard that before, that's a great pun. Anyway, I am a fan of VxVM / FS. I heard somewhere that it may be open sourced; I hope this is true even if it sounds somewhat unlikely. VxFS on *BSD would be good stuff.
How about these: *Test* your backups from time to time. Restore an entire machine (on spare hardware!) to make sure you can. It's easy to say 'we take daily backups', but not nearly as easy to take those daily backups and make them into a running system.
Can't say I disagree. The layering violations are more important than some people realise, and what's worse is that Sun didn't need to do it that way. They could have created a base filesystem and abstracted out the RAID, volume management and other features while creating consistent looking userspace tools.
The all-in-one philosophy makes it that much more difficult to create other implementations of ZFS, and BSD and Apple will find it that much more difficult to do - if at all really. It makes coexistence with other filesystems that much more difficult as well, with more duplication of similar functionality. Despite the hype surrounding ZFS by Sun at the time of Solaris 10, ZFS still isn't Solaris' main filesystem by default. That tells you a lot.
a) There are no layering violations. The Linux camp keeps claiming that because it's implemented completely differently from how they do their stuff. ZFS works differently, period.
b) So, what's inconsistent with zpool and zfs?
A filesystem, a volume manager, a software RAID manager and bad block detection and recovery code with functionality not unlike smartd, along with various other things, all in one codebase? That's a (unnecessary) layering violation in anybody's book, so saying the above isn't going to make what you've written true.
Nothing. It's about the only real advantage of ZFS.
> A filesystem, a volume manager, a software RAID manager and bad block detection and recovery code with functionality not unlike smartd, along with various other things, all in one codebase?
So, having network drivers, sockets support, file descriptors, IP support, TCP support, UDP support, and HTTP support all in one codebase means the Linux kernel's network stack is full of rampant layering violations?
Please stop parroting one Linux developer's view. Go look at the ZFS docs. ZFS is layered. Linux developers talk crap about everything that is not Linux. Classic NIH syndrome.
ZFS was designed to make volume management and filesystems easy to use and bulletproof. What you and linux guys want defeats that purpose and the current technologies in linux land illustrate that fact to no end.
That's just plain wrong. ZFS is working fine on BSD and OS X. ZFS doesn't make coexistence with other filesystems difficult. On my Solaris box I have UFS and ZFS filesystems with zero problems. In fact I can create a zvol from my pool and format it with UFS.
Feel free to describe what those layers are and what they do. It certainly isn't layered into a filesystem, volume manager and RAID subsystems.
When it's been around as long as the Veritas Storage System, or indeed, pretty much any other filesystem, volume manager or software RAID implementation, give us a call.
I don't see lots of Linux users absolutely desperate to start ditching what they have to use ZFS.
I'm afraid you've been at the Sun koolaid drinking fountain. ZFS is not implemented in a working fashion in any way shape or form on OS X (Sun always seems to get very excited about OS X for some reason) or FreeBSD. They are exceptionally experimental, and pre-alpha, and integrating it with existing filesystems, volume managers and RAID systems is going to be exceptionally difficult unless they just go ZFS completely.
So what? You're sitting on a Solaris box. When you have HPFS, LVM, RAID and other partitions on your system and you're working out how to consolidate them (or you're a non-Solaris OS developer trying to work that out), give us a call.
So that means it isn't layered.... hmmm what are you smoking. It's different so it must be bad. I get it.
Breaking that layering was intentional because that layering adds nothing but more points of failure.
It's like saying electric cars are broken and rampant violations because they are not powered by gas. Which is utter nonsense.
Huh?? WTF does that have to do with anything? Whether it is easy to use has no bearing on how long something has been on the market.
Your condition will never be true. Call me when Linux has been around for as long as Unix System V has been around. Unix System V has been around since 1983. Linux since 1992. Linux will never be around longer than Unix SV unless Unix dies at a particular time and linux continues.
BTW ZFS has been around longer than ReiserFS 4. Wait but ReiserFS 4 is completely useless.
The first comment on Jeff Bonwick's blog post that was linked in an earlier post had some guy running a 70TB linux storage solution who was waiting to dump it for ZFS.
It is just being ported and is unstable. That doesn't mean it is impossible to port, as you claimed, because ZFS isn't layered.
It all depends on how many resources Apple wants to put into ZFS and on their business plan. Your claim was directly in relation to some rubbish quote by Andrew Morton. You then, based on ill-conceived conjecture, claimed ZFS is not portable because of "rampant layering" violations. Which is just nonsense.
You can create a RAID volume and ZFS can add it to a pool. You can then create a zvol from that pool and format it with other filesystems. You can create an LVM volume and add it to a ZFS pool, as long as it is a block device. You can even take a file on a filesystem and ZFS can use it in a pool. You have no idea what you are talking about.
http://www.csamuel.org/2006/12/30/zfs-on-linux-works
The link above is about some guys using LVM in linux with ZFS.
You should stop drinking the Anti-Sun kool-aid. It's no secret that you are an anti-Sun troll on OSnews.
> So what? You're sitting on a Solaris box. When you have HPFS, LVM, RAID and other partitions on your system and you're working out how to consolidate them (or you're a non-Solaris OS developer trying to work that out), give us a call.
WTF are you on about again? You claimed ZFS can't coexist with other filesystems because of its design. When you have figured out basic software layering and architecture or have at least learned how to look at some HTML code, give us a call.
Edited 2008-04-25 21:40 UTC
> > ZFS was designed to make volume management and filesystems easy to use and bulletproof.
> When it's been around as long as the Veritas Storage System, or indeed, pretty much any other filesystem, volume manager or software RAID implementation, give us a call.
When other file systems like VxFS can detect hardware data corruption and not silently corrupt data, give us a call.
case in point:
http://www.opensolaris.org/jive/thread.jspa?messageID=81720
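For anyone wondering what "detect corruption and not silently return bad data" means in practice, here is a rough Python sketch of the general idea: keep a checksum apart from the data, verify it on every read, and repair a bad copy from a good one. This is not ZFS code. The `MirroredBlock` class, the two in-memory "disks" and the choice of SHA-256 are all assumptions made for illustration only.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum used to verify a block (SHA-256 chosen for this sketch)."""
    return hashlib.sha256(data).hexdigest()


class MirroredBlock:
    """Hypothetical block stored as two copies, with the checksum kept separately."""

    def __init__(self, data: bytes):
        # Two copies stand in for two disks of a mirror.
        self.copies = [bytearray(data), bytearray(data)]
        # The expected checksum lives apart from the data it protects.
        self.expected = checksum(data)

    def read(self) -> bytes:
        for i, copy in enumerate(self.copies):
            if checksum(bytes(copy)) == self.expected:
                # Found a good copy: rewrite any sibling that fails verification.
                for j, other in enumerate(self.copies):
                    if j != i and checksum(bytes(other)) != self.expected:
                        self.copies[j] = bytearray(copy)
                return bytes(copy)
        # No copy verifies: report an error instead of returning bad data.
        raise IOError("all copies failed checksum verification")
```

Flipping a byte in one copy and then calling `read()` returns the original data and rewrites the damaged copy. If every copy is damaged, the read fails loudly rather than handing back garbage.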
What are the layering violations? Could someone point me toward some good links (or search terms) for the info?
I'm just curious if this is a "the tools aren't split out into separate fs, raid, volume, disk management tools" issue or a "source code is unreadable as everything is lumped together in one big lump" issue, or what. What are the layers of ZFS on Solaris, for example, as compared to the same layers in Linux? What's so different about FreeBSD that a single developer was able to get basic ZFS support working in under two weeks, and yet there's still no ZFS support on Linux?
Urh, you do realise that ZFS is already available in FreeBSD 7.0, right?
Export a zvol and put whatever filesystem you want on top (okay, maybe only UFS is directly supported, but you can use iSCSI and such to export the zvol to other systems and then put whatever FS you want on top).
Errrrr, no it isn't. It's extremely experimental and barely functional, and on limited architectures at that. Hell, even running it on 32-bit systems will leave you with something exceptionally borked. ZFS also needs exceptional tuning to work with non-Solaris kernels. There is a huge class of hardware it simply will not run on - probably ever.
I've seen some people wandering around assuming that they can just run ZFS in FreeBSD, and run it in production. That's just.........scary.
Edited 2008-04-25 02:56 UTC
Thank you. I kept the size of the ZFS portion reasonable for symmetry but much, much more could be written about it. In particular, now that the iSCSI target is in production Solaris, it would have been interesting to discuss the extensive integration of ZFS and the target.
Good point about the dev features. I just checked my fresh Solaris 10 Update 5 (latest) install and the pool version is 4. The current dev pool version appears to have been 10 since November 2007. Since then, they have changed the zpool on-disk spec to enable gzip compression, use of NVRAM devices for acceleration, quotas, booting, CIFS, and other things.
See:
http://opensolaris.org/os/community/zfs/version/10/
And cycle the number of the URL from 1 to 10.
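If you would rather not edit the URL by hand, a few lines of Python will print all ten. This is just a convenience sketch: the base URL is the one given above, and `version_urls` is a made-up helper name.

```python
# Base URL taken from the link above; only the trailing number changes.
BASE = "http://opensolaris.org/os/community/zfs/version/{n}/"


def version_urls(first=1, last=10):
    """Return the description URLs for pool versions first..last, inclusive."""
    return [BASE.format(n=n) for n in range(first, last + 1)]


if __name__ == "__main__":
    for url in version_urls():
        print(url)
```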
Will we see more background articles on OSnews in the future? These would be welcome. And it would be even better if such articles were written by people with real-life experience. Although the article about Solaris filesystems is well written, some important facts are missing. A few examples:
* There is an important difference between ZFS in Solaris and ZFS in OpenSolaris. OpenSolaris has all the latest and greatest ZFS features, Solaris not yet.
* Sun does support Solaris, but does not offer support for the OpenSolaris builds. My opinion is that with regard to the risk of hitting bugs that the OpenSolaris developer edition builds can be compared to Linux kernel release candidates.
* If there are four filesystems to choose from, this means not every filesystem is suited for every workload. The article does not mention, e.g., that when using a single disk, UFS performs better for database workloads than ZFS. This is not a coincidence but is due to the fundamental characteristics of these filesystems (see also Margo Seltzer et al., File System Logging Versus Clustering: A Performance Comparison, USENIX 1995, http://www.eecs.harvard.edu/~margo/usenix.195).
Are you, sir, suggesting that I lack real world experience? Well I never...!!
But seriously, there are a few points here which are worth responding to; a careful eye picks out some of the things I could have talked about, but cut to avoid the problem of too many themes in one article:
This is true, but frankly I find it unimportant in the last year or so since most of the ZFS features I consider important are in mainline Solaris, since, say, U4. A lot of the dev stuff is either icing, like gzip compression, or far-off alpha stuff, like on-disk encryption.
Sure. I, personally, would never recommend that anyone runs a dev version of an OS in a commercial setting. It's rarely worth the hassle.
I did allude to this. But I am not sure that the Seltzer paper is an applicable reference because I don't think that the designs of ZFS and LFS are close enough, i.e., ZFS is not a pure-play log structured FS and requires no cleaner. I am not even sure the FFS comparison stands because journaling changes a lot, performance-wise.
Still, the kernel of the argument is probably that ZFS and LFS both often change sequential IO to random and vice-versa, and this can cause weird performance characteristics. I think that this is true. However, experience (!) has taught me that single disk performance is rarely important, for the simple reason that almost any modern FS can max out a single disk on sequential and random. Putting it another way, if you require performance, you probably need more disks.
> That's an interesting statement. Can you tell me where I can find more information about the fact that ZFS doesn't require a cleaner?
Check this out:
http://blogs.sun.com/nico/entry/comparing_zfs_to_the_41
By saying that it doesn't require a cleaner, what I had in mind is that it isn't background garbage collected and does not treat the disk as a circular log of segments that are either clean or dirty. It is more like a versioned tree of blocks. By cleaner I mean an asynchronous daemon that eventually "gets around to" cleaning segments that are dirty.
However, I am always willing to learn something. Do you think this evaluation is incorrect?
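To make the "versioned tree of blocks" idea concrete, here is a toy Python sketch. It is a heavy simplification and not how ZFS is actually coded: the `Pool` class, the two-level tree and the one-transaction-group-per-write rule are all assumptions for illustration. The point it tries to show is that an overwritten block can be released at write time by comparing its birth transaction group with that of the latest snapshot, so nothing has to scan a log for dead segments later.

```python
class Pool:
    """Toy copy-on-write store: a two-level tree (root -> leaves), one txg per write."""

    def __init__(self, nleaves):
        self.blocks = {}               # block id -> (birth_txg, payload)
        self.next_id = 0
        self.txg = 0                   # current transaction group number
        self.last_snapshot_txg = -1    # -1 means no snapshot exists yet
        leaves = [self._alloc(b"") for _ in range(nleaves)]
        self.root = self._alloc(tuple(leaves))

    def _alloc(self, payload):
        """Store a new block stamped with the current txg and return its id."""
        bid = self.next_id
        self.next_id += 1
        self.blocks[bid] = (self.txg, payload)
        return bid

    def _release(self, bid):
        """Free a replaced block at once unless a snapshot can still see it."""
        birth, _ = self.blocks[bid]
        if birth > self.last_snapshot_txg:
            del self.blocks[bid]       # born after the last snapshot: free now
        # Otherwise the block stays, because the snapshot still references it.

    def write(self, index, data):
        """Never overwrite in place: build a new leaf and a new root."""
        self.txg += 1
        leaves = list(self.blocks[self.root][1])
        old_leaf, old_root = leaves[index], self.root
        leaves[index] = self._alloc(data)
        self.root = self._alloc(tuple(leaves))
        self._release(old_leaf)
        self._release(old_root)

    def snapshot(self):
        """Remember the current txg and hand back the root as the snapshot."""
        self.last_snapshot_txg = self.txg
        return self.root

    def read(self, index, root=None):
        """Read leaf `index` from the live tree or from a snapshot root."""
        root = self.root if root is None else root
        return self.blocks[self.blocks[root][1][index]][1]
```

In this model, without a snapshot every write leaves the block count unchanged, since the old leaf and old root are dropped immediately. After `snapshot()`, the blocks that the snapshot can see are kept, and only blocks born later are freed when they are replaced.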
As is well known, filesystems like ZFS and the Sprite LFS write data to a huge log. This log contains both data and metadata. I'm not a ZFS expert, but how can ZFS discover unreferenced blocks without rereading the metadata that was written earlier to the log (assuming that not all metadata fits in RAM)?
What does this mean then? http://developers.sun.com/sxde/support.jsp
Indiana is going to be named OpenSolaris (before that, OpenSolaris was just a code base). The developer edition (SXDE: "Solaris Express Developer Edition" built from OpenSolaris) is supported. The developer preview (aka Indiana) is just a preview.
Following up on your wording, SXDE is an OpenSolaris build.
I got what you meant, but the wording is incorrect.
> What does this mean then? http://developers.sun.com/sxde/support.jsp
I was referring to end-user support, not to support for those who want to write OpenSolaris kernel or userland code.
"Remember, though, that this is just one developer's aesthetic opinion."
It's not just one developer's aesthetic opinion. Layered design is superior to monolithic design.
ZFS is good, but layered ZFS would be better, for many reasons. Can ZFS do reiserfs over LVM over RAID over NFS, SMB and gmailfs? You would be surprised how some people use the technology sometimes.
Yes, ZFS is a great file system. However, it's still got room to improve. I would like to see it GPL'ed (dual license or just GPL Solaris). I believe there are plans for that. Make it layered if it can be and rename it agrouffs.
Edited 2008-04-22 14:34 UTC
> It's not just one developer's aesthetic opinion. Layered design is superior to monolithic design.
My interpretation of the original comments was that they were about the customary layering of subsystems in the kernel, which is not something that users see or care about. In other words, I think that he meant that "ZFS does not map directly onto the storage management architecture in the Linux kernel and thus would be a pain to implement in Linux" and just said it in a sensationalistic way.
Yes! There is no rule saying that ZFS cannot be used over a traditional volume manager, on a loopback file (over NFS), or that a zvol cannot be formatted with a traditional filesystem like UFS. It is just not common.
It's not just that. It's maintainability. When features get added to the wrong layer, it means code redundancy, wasted developer effort, wasted memory, messy interfaces, and bugs that get fixed in one filesystem, but remain in the others.
It does make a difference just how many filesystems you care about supporting. The Linux philosophy is to have one that is considered standard, but to support many. If Sun is planning for ZFS to be the "be all and end all" filesystem for *Solaris, it is easy to see them coming to a different determination regarding proper layering. Neither determination is wrong. They just have different consequences.
Perhaps btrfs will someday implement all of ZFS's goodness in the Linux Way. I confess to being a bit impatient with the state of Linux filesystems today. But not enough to switch to Solaris. I guess one can't expect to have everything.
This is a good, balanced explanation. I think the question is whether the features provided by ZFS are best implemented in a rethought storage stack. In my opinion, the naming of ZFS is a marketing weakness. I would prefer to see something like "ZSM", expanding to "meaningless letter storage manager". Calling it a FS makes it easy for people to understand, but usually to understand incorrectly.
I see ZFS as a third generation storage manager, following partitioned disks and regular LVMs. Now, if the ZFS feature set can be implemented on a second generation stack, I say, more power to the implementors. But the burden of proof is on them, and so far it has not happened.
I too am impatient with the state of Linux storage management. For better or worse, I just don't think it is a priority for the mainline kernel development crew, or Red Hat, which, like it or not, is all that matters in the commercial space. I think ext3 is a stable, well-tuned filesystem, but I find LVM and MD to be clumsy and fragile. Once ext4 is decently stable, I would love to see work on a Real Volume Manager (tm).
I think you're wrong here. I do think the Linux kernel community is aware of the desperate need for a 'kick-ass-enterprise-ready-filesystem-like-ZFS'. A lot of people were waiting for the arrival of Reiser4, but we all know how that ended :-)
Ext4 is just a 'let's be pragmatic' solution: we need something better than Ext3.
ZFS for Linux is (besides license issues) a 'no-go' because of the (VFS) layering stuff.
But I think that there's hope: BTRFS. It doesn't sound as sexy as ZFS, but it has a lot to offer when it becomes stable and available. I'm following the development closely, and I get the idea that Chris Mason makes sure not to fall into the 'reiser trap' by communicating in a constructive manner with the rest of the kernel community.
Although not ready in the near future (read 2008), I personally have high expectations of BTRFS. And I believe it will become the default filesystem for many distributions when it arrives.
Regards Harry
The problem with BTRFS is that a whole team worked for several years on ZFS to get it to the current point, while BTRFS still has only one guy behind it (as far as I know). While there may be first workable prototypes pretty soon, the fleshing out of details (usually things work nice on paper, but practically prove to be crappy and need to be redesigned) and fine-tuning is going to need LOTS of time.
> Ext4 is just a 'let's be pragmatic' solution: we need something better than Ext3.
I hope you are right. To be clear, I think ext4 could turn out to be an excellent incremental improvement.
Yes, I hope this turns out as well, since it seems to bypass some of the MD and LVM cruft. What I find interesting about BTRFS is that it is an Oracle project, and yet better storage for unstructured data is not really in Oracle's interest--what they would like is to get your unstructured dat









that you can be sure of. once OpenSolaris has had a few general availability builds and gets out there more. i would say a year off or so.