Hans Reiser of the ReiserFS project created benchmarking tests designed to be fairly representative of the file-size distribution of most users. “Reiser4 does quite well on all benchmarks,” he said. “With [Reiser4], we took five different technical gambles, and all of them worked. The result is very high.
Is it just me, or are there no actual benchmarks shown, or even linked??
Anyone have real numbers to show?
–Joe
http://www.namesys.com/benchmarks.html
But I’d like to see the plugins designed for it. It would also be great if the storage developers could link minds with the reiserFS developers. I think they share similar goals.
All of my systems have been Reiser only file systems for years now.
If 4 times faster than NTFS then also at least 4 times faster than WinFS as WinFS is built on NTFS?
> If 4 times faster than NTFS then also at least 4 times faster than WinFS as WinFS is built on NTFS?
Good question. I’m sure we will see as soon as WinFS is available which it is afaik not at the moment.
(As always: please correct me if I’m wrong.)
It seems that there are a lot of different file systems available for Linux, but none of them have any kind of support for VMS-like versioning. In other words, if I have a file foo.txt, and then modify it and save it, I have foo.txt;1 and foo.txt;2, with the highest-numbered version always being the latest (and the default when performing file operations). This seems like such a good idea, yet it seems to be in VMS only (is there a patent on it?). I would love to see this feature in a Linux-based fs as I would be able to revert to a working inetd.conf after I’ve fiddled with it one too many times.
In the age of terabyte disks from LaCie, I can’t imagine anyone’s really worried about the disk space, so what could be stopping this from being developed?
From what is stated in the article, it sounds like it will be relatively simple to write a module to do this, once Reiser4 is released as stable.
Hi
There was some discussion of version control on top of a traditional filesystem, but I believe reiserfs is the more feasible base, and its locks are very fine-tuned.
It would be easy to add plugins once it’s in the actual 2.6 tree. Andrew Morton has said that he is ready to include it for testing.
Unlike the WinFS SQL model, reiserfs is more flexible and available right now.
You probably wouldn’t want to encode it in the filenames though 😉
It is funny that you mention this.
I just recently discovered that NTFS supports versioning. It implements each version as a separate stream, and uses the same naming semantics as VMS (file1.txt:0, file1.txt:1, etc). This is not surprising given NT’s heritage. You can find more information about this at sysinternals. They provide a streams utility that reports on the number of streams for a given file.
http://www.sysinternals.com/ntw2k/source/misc.shtml#streams
Unfortunately, the WIN32 API doesn’t implement any support for using this, which means most/all applications don’t support it either. It’s a real shame!!
It is about time. Linux was lagging big time in that respect!
Thanks Hans 🙂
Reiserfs is really nice. It is a much better technical base for a file system with database features than NTFS, since it has excellent support for very small files and (as of reiser4) even real transactions.
But one thing that is missing for BeOS-style live queries on Linux is a fast way to get notified when a file changes. There is the dnotify mechanism, but it only works for a single directory and does not even tell you which file has changed.
Once such a mechanism is in place, it would be relatively easy to add live queries to e.g. KDE as a KIOSlave.
Hi
Yes, dnotify is a hackish solution and there are people working on replacing it. I heard there is a GNOME effort on this.
rml might be a good choice
Could you post a URL for this replacement project for dnotify?
I wrote my own version of dnotify which works recursively and gives you a lot of information about the type of change, and I would really like to get in touch with other people interested in this. I posted a very preliminary version to the LKML in December. What I have now is completely different, but backwards compatible with the old dnotify mechanism.
I have a very nasty bug that only appears if you try to watch the / directory and all its subdirectories, but other than that it works just fine.
it becomes the default for most of the popular distros instead of ext3.
I have a simple question: is it easy to upgrade from ReiserFS v3 to 4?
I ask because my main distro is Fedora Core 1, and you can install it on ReiserFS, for those who didn’t know.
http://fedoranews.org/krishnan/tips/tip010.shtml
So is it easy going from 3 to 4?
It is very easy to install. I use Gentoo and have installed the love-sources ( http://www.linuxmall.us/~lovepatch/love-sources/ ) as my kernel. The love-sources support reiser4 out of the box. However, I was not able to get reiser4 working as the filesystem for my root partition: I was able to format the root partition as reiser4, but my kernel did not mount it at boot time.
To get the root partition onto reiser4, I archived everything on the root partition and saved the archive on an NFS server.
After that I booted with RIP ( http://www.tux.org/pub/people/kent-robotti/looplinux/rip/ ), formatted the root partition with reiser4, and then extracted the archive from the NFS server back onto the reiser4 partition.
Everything worked fine, except that the kernel did not like the root being reiser4 (needless to say, I had compiled reiser4 support into the kernel, but this did not help).
cheers
SteveB
Reiser4 is very fast. I have used it on my systems. But beyond all that speed, reliable storage is the main purpose of a filesystem (at least for me).
To be honest, I don’t really care whether it is 4 times faster than NTFS or any other file system. The most important thing is that it is reliable and mature.
In the past I have used XFS as my main file system for servers. I chose XFS because I need a file system that is reliable, mature, not dog slow, does not have many limits (file size, directory size, etc.) and, very importantly, has enough tools to manage daily needs.
The tools were one of the reasons I did not choose reiserfs. At that time (some years ago) I missed tools to dump the complete disk and save it on a backup medium.
The plugin architecture for reiser4 looks very promising in that direction. I hope that after some time we will see a lot of plugins for reiser4.
Reiser4 looks to be a very nice file system, and the drives where I have a reiser4 file system look to be very fast.
Hans is doing incredible work. I am very positively surprised by reiser4.
cheers
SteveB
libfam and famd
Take a look at them.
LibFam and famd are higher level services that offer network transparent file change notification. But they use the dnotify mechanism or resort to polling. This is suboptimal to say the least. LibFam would probably be the first application to benefit from a better kernel infrastructure for file change notification.
That feature is interesting, but when you do research you also realize it’s the cause of many kinds of problems.
In fact, it’s very possible it’ll get phased out of NTFS altogether. The basic problem was that virus writers could embed entire binaries into a secondary stream of a text file, for example. Because Microsoft doesn’t think about these (blatantly obvious) things, commands like dir and the properties dialog in Explorer would see a text file as being only 4k, let’s say, while the stream pointed to a file that maybe measured in megabytes. Not only did this allow a very clever virus that filled up your disk (write to a file stream until I/O errors occurred) without letting you know how your disk space was being used up, but it also allowed hidden files in a whole new way.
So, maybe good intentions… bad implementation.
I think Hans Reiser is right to say that file streams or arbitrary attributes are unnecessary and non-orthogonal.
A file containing multiple file streams can easily be represented as a directory if you have a file system (such as reiserfs) that permits efficient storage of small files.
Instead of an mp3 file with attributes title, album, track, genre etc. you would have a directory containing many small files for the attributes and one large file for the data.
In a graphical file manager you would just hide these small files and present them as attributes, and if you have to send it over the net you just zip the directory. Command line tools like tar would work just fine, and you could use echo to set attributes for your files…
> In a graphical file manager you would just hide these small files and present them as attributes, and if you have to send it over the net you just zip the directory. Command line tools like tar would work just fine, and you could use echo to set attributes for your files…
So basically you’d be treating file system objects the way OpenOffice.org treats its files – as zipped XML objects. And, if I remember correctly, from a DDJ article from about 1995, the way Windows treats *.HLP files – as embedded file systems. And I think, the way the Mac treated its files, and likewise BeOS. So it’s not that new. But hardly a bad idea for all that.
Done right, it would be an excellent way to fully internationalize computer use – if you have a (Macintosh-type) resource fork with appropriate I18N info as one of the (internally zipped) files.
Which then links up with the I18N resources you may have already installed – fonts, BIDI info, etc – and lo and behold, you have a fully internationalized setup from go to woe.
Since WinFS is actually built on top of NTFS, it would probably be slower than NTFS (which is what I’ve heard from all the Longhorn beta testers). A lot slower.
> So basically you’d be treating file system objects the way
> OpenOffice.org treats its files – as zipped XML objects.
> And, if I remember correctly, from a DDJ article from
> about 1995, the way Windows treats *.HLP files – as
> embedded file systems. And I think, the way the Mac
> treated its files, and likewise BeOS. So it’s not that
> new. But hardly a bad idea for all that.
The directory containing the data and metadata would only be zipped for transport. During normal use it would be just a normal directory. And there would be no XML (shudder) involved. Just many tiny files where the name of the file is the name of the attribute and the content of the file is the value of the attribute.
The idea is not new, but with older file systems this would be extremely inefficient since even tiny files needed a block. So an attribute file with name “Artist” and content e.g. “The Pixies” would consume more than 4096 bytes. With reiserfs the overhead for small files is not that big, so you can afford having many small files.
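To make the "one tiny file per attribute" idea concrete, here is a minimal Python sketch. The function names and the payload file name `data` are made up for illustration and are not part of reiserfs or any existing tool. The last function only does the block arithmetic for a filesystem that gives every non-empty file whole 4096-byte blocks and does no tail packing.

```python
import math
import os
import tempfile


def write_attributes(directory, attrs):
    """Store each attribute as a tiny file: file name = attribute name, content = value."""
    os.makedirs(directory, exist_ok=True)
    for name, value in attrs.items():
        with open(os.path.join(directory, name), "w", encoding="utf-8") as f:
            f.write(value)


def read_attributes(directory, payload="data"):
    """Read every small file back into a dict, skipping the (hypothetical) payload file."""
    attrs = {}
    for name in sorted(os.listdir(directory)):
        if name == payload:
            continue
        with open(os.path.join(directory, name), encoding="utf-8") as f:
            attrs[name] = f.read()
    return attrs


def block_allocated_size(n_bytes, block_size=4096):
    """Bytes used if every non-empty file occupies whole blocks (assumed: no tail packing)."""
    return math.ceil(n_bytes / block_size) * block_size


if __name__ == "__main__":
    song_dir = os.path.join(tempfile.mkdtemp(), "song")
    write_attributes(song_dir, {"Artist": "The Pixies"})
    print(read_attributes(song_dir))
    value = "The Pixies".encode("utf-8")
    print(len(value), "bytes of content,", block_allocated_size(len(value)), "bytes of blocks")
```

The point of the arithmetic is only that a ten-byte value still costs a full block on a block-per-file layout, which is why the scheme needs a file system that stores small files efficiently.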
The problem you raise is caused by the Win32 API not providing any kind of interface to NTFS streams. If it did, then the shell could easily display all stream information for a file. I don’t think they should remove this functionality, just provide the system calls (Win32 APIs) to work with it.
My biggest beef with Unix (I haven’t tried them all) has always been the limited permission schemes on its filesystems. Novell NetWare (and NTFS to a lesser extent) has awesome ones that make administration easier, with rights beyond read, write, execute, sticky and setuid. Things like create, modify, delete, access, display.
I hope that Novell will help put these things into Linux now that they’re in the game. Can someone tell me if ReiserFS4 will add more attributes? And what about ACLs?
There are external patches to Linux to support ACLs. They work with reiserfs, ext3, and XFS. Other Unixes (e.g. Solaris) support them out of the box.
Because of the constant additional maintenance needed to ensure that the disk does not run full. OpenVMS therefore also supports limiting the number of versions, so that when a new version is created the oldest version is automatically deleted.
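A version limit like that is easy to sketch in user space. The Python below is only an illustration, not how OpenVMS implements it: it borrows the `name;N` naming from earlier in the thread, assumes the highest number is the newest version, and the function names and the default `limit=3` are invented.

```python
import os
import re
import tempfile


def list_versions(path):
    """Return the version numbers that exist for `path` (files named `path;N`), ascending."""
    directory, base = os.path.split(path)
    pattern = re.compile(re.escape(base) + r";(\d+)")
    numbers = []
    for name in os.listdir(directory or "."):
        match = pattern.fullmatch(name)
        if match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


def save_version(path, data, limit=3):
    """Write `data` as the next version and delete the oldest ones beyond `limit`."""
    versions = list_versions(path)
    next_no = versions[-1] + 1 if versions else 1
    with open(f"{path};{next_no}", "w", encoding="utf-8") as f:
        f.write(data)
    versions.append(next_no)
    # Purge from the oldest end until only `limit` versions remain.
    while len(versions) > limit:
        os.remove(f"{path};{versions.pop(0)}")
    return next_no


if __name__ == "__main__":
    conf = os.path.join(tempfile.mkdtemp(), "inetd.conf")
    for i in range(5):
        save_version(conf, f"revision {i}\n")
    print(list_versions(conf))
```

With a limit in place the disk cost is bounded per file, at the price of losing the ability to go back further than `limit` saves.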
Still, the Windows API does not expose this feature of NTFS – I think because MS did not want the phone calls for “why can’t I save my file…”. Besides, it might also have deepened DLL hell.
@Tim:
> I just recently discovered that NTFS supports versioning. It implements each version as a separate stream, and uses the same naming semantics as VMS (file1.txt:0, file1.txt:1, etc). This is not surprising given NT’s heritage. You can find more information about this at sysinternals. They provide a streams utility that reports on the number of streams for a given file.
It’s interesting that you mention that, because I have a feeling that’s how Microsoft SharePoint works its magic. It allows versioning (with full tracking, etc.) of files. Pretty interesting.
> Still, the Windows API does not expose this feature of NTFS – I think because MS did not want the phone calls for “why can’t I save my file…”. Besides, it might also have deepened DLL hell.
I imagine you’ll find Win32 doesn’t expose it because DOS-based versions of Windows can’t do it. There’s no reason such a feature need use up disk space without it being made clear to the end user.
Incidentally, it’s not specifically for file versioning, it’s just an arbitrary ability to attach multiple data streams to a “single” file. For example, NT uses it to support the Mac’s Resource Forks with its Appleshare support.
Windows Server 2003 has a feature called Shadow Copies that (from my understanding) comes close to a versioning filesystem. You can take snapshots (of files, folders, …), and all changes to these files afterwards do not overwrite the original file contents, but allocate new space. So you can rollback to an earlier version afterwards.
See: http://www.robertmoir.co.uk/win/ShadowCopiesFAQ.html
Here are some random comments on the earlier comments:
1) Streams are something completely different from versioning. All streams are accessed at any time, and they are not created automatically.
2) You can’t compare WinFS to NTFS. WinFS is an indexing service that runs on NTFS. Accessing files through it is necessarily slower, but performance is not the reason to use it.
3) There is no upgrade path from Reiser 3.x to Reiser4. There is a utility on Linux that can change any filesystem to any other, so you can use that. But this sort of operation would be very unwise to carry out without fresh backups, so just mkfs’ing and reading back from backup is probably just as easy.
4) Shadow copying is not snapshotting. Linux has snapshotting too. However, it is not versioning either, since it is not fully automated. Having versioning on Linux would be very neat, but there is no filesystem going there right now, I’m afraid.
ACLs are in 2.6 “out of the box” too.
I have been using reiserfs for years now (Alpha and ia32 versions) and love it, but version 4 was supposed to have crypto and compression plugins.
Where are they?
I would love to have a crypto plugin on my laptop, but there are no links to this once-touted feature….
Donaldson
I guess it’s a matter of terminology – what Microsoft calls a snapshot might differ from what other people might consider being a snapshot.
Hi
That’s right. MS uses its own terminology for everything.
For example:
Common Internet File System, for something that’s neither common nor for the internet.
Reading all the comments on versioning, NTFS streams (not the same as SVR4 STREAMS!), etc., and the inevitable sanity of VMS’ versioning/aging cut-off, I’m forced to ask: if there is a difference between file foo.bar and foo.bar+1, and you don’t want to clog up the file system, surely it would make sense to save only the difference?
So you have the versioning=1 in the config files for such and such a file system, which means that every time you add something to such and such a file in said file system, you write the parts that have changed, into the base file, and have the versioning info as a pointer (inode-style) to the previous saved material and only the previous saved which has now been superseded. And which now gets written out into a linked diff.
A diff-and-patch operation, in effect, with control of the diffing and patching being with the file system.
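As a rough user-space illustration of that diff-and-patch idea (not a filesystem implementation), the Python sketch below keeps the newest version in full and stores only the lines needed to rebuild the previous one. It uses the standard `difflib.SequenceMatcher`; the function names and the tuple format of the delta are invented for this example.

```python
from difflib import SequenceMatcher


def reverse_delta(old_lines, new_lines):
    """Build a delta that rebuilds `old_lines` from `new_lines`.

    Unchanged runs are stored as ("copy", start, end) ranges into the new
    version; only lines that no longer exist there are stored verbatim.
    """
    matcher = SequenceMatcher(a=new_lines, b=old_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))
        else:
            # replace / delete / insert: keep whatever the old version had here
            delta.append(("lines", old_lines[j1:j2]))
    return delta


def restore_previous(new_lines, delta):
    """Apply a delta from reverse_delta() to recover the previous version."""
    result = []
    for op in delta:
        if op[0] == "copy":
            result.extend(new_lines[op[1]:op[2]])
        else:
            result.extend(op[1])
    return result


if __name__ == "__main__":
    old = ["chapter 1\n", "draft intro\n", "chapter 2\n"]
    new = ["chapter 1\n", "final intro\n", "chapter 2\n", "chapter 3\n"]
    delta = reverse_delta(old, new)
    print(delta)
    print(restore_previous(new, delta) == old)
```

Keeping the latest version whole means ordinary reads cost nothing extra, and only a rollback has to walk the chain of deltas.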
One place where I think it would make a lot of sense is book writing and publishing: you (obviously 🙂) don’t want to save every step in the book’s writing, particularly not in the early stages when you are putting the first chapters down. Saving each new + old chapter/s as a complete new file would be a complete waste of space – and believe me, I know.
A complex project, but an interesting one.
Wesley Parish
4 years on Reiser3 and zero problems. ReiserFS wins Hans down.
> But one thing that is missing for BeOS style live queries on linux is a fast way to get notified when a file changes. There is the dnotify mechanism, but it only works for a single directory and does not even tell you what file has changed.
First, the mechanism isn’t “dnotify”, it’s the F_NOTIFY fcntl option. dnotify is the name of a program which uses the F_NOTIFY mechanism.
Second, F_NOTIFY can be used to monitor multiple directories by opening multiple file descriptors and assigning them to the same POSIX realtime signal queue. The matching descriptor can be read using sigwaitinfo().
Finally, the BSD kqueue() mechanism is much more computationally efficient than BeOS Live Queries. While it does require an open file descriptor for each file/directory being monitored (which is not impractical for monitoring an entire volume… OS X Panther does it successfully), each event can be associated with a pointer to a particular resource, whereas fs_read_query() merely returns a dirent structure. Consequently kqueue() allows for fully O(1) directory event to data structure resolution, whereas BeOS is O(log n) minimum. So, while kqueue() may have higher initialization overhead, it is far more efficient in operation. Live Queries can only be used to track changes volume wide (similar to Win32 change journals), whereas F_NOTIFY and kqueue() can be used to monitor individual files or directories. Consequently, any program using Live Queries will be bombarded with events it doesn’t particularly care about.
Linux’s best bet for better directory change notification would be to implement kqueue(), but they’ve already gone and created their own Linux-specific interface for O(1) file descriptor event monitoring (epoll), and the problem domain overlap between epoll and kqueue would probably be enough to prevent this.
kqueue really has few disadvantages besides recursive VFS monitoring. It supports both edge and level triggered monitoring, can monitor file descriptors, VFS events, timers, signals, processes, is fully O(1), has a remarkably simple interface, and can transparently pass user pointers with events, allowing for maximum computational efficiency.