libferris Winds its Way Towards 1.0.0

libferris is a virtual filesystem (VFS) that runs in the user address space. This means that applications using libferris use the shared library's API to access the filesystem, which may then delegate to the kernel through libc to perform the desired actions. Operating in the user address space allows libferris to mount things that one would generally not want the kernel to mount. For example, libferris can mount Berkeley database files, ftp sites, XML files, rpm files, sockets, sysv IPC, mysql databases and remote computers (using ssh) as a filesystem.

What is it

libferris abstracts the main idea of a filesystem being a hierarchical structure of directories (container only) holding files (content only). With libferris both dirs and files are just “contexts”. A context can have both byte content and contain other contexts at the same time. libferris also attaches Extra Attributes (EA) to each context. The EA can come from a variety of places, including a native kernel filesystem (e.g. ext3 or XFS), a composite file mounted as a filesystem (e.g. XML, db), or the byte content of the context that the EA is attached to (e.g. jpeg width and height).
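To make the "file + dir = context" idea concrete, here is a minimal Python sketch. It is not the libferris API (which is C++); the `Context` class, its fields and the `walk()` helper are hypothetical names chosen only to show one object carrying byte content, child contexts and EA at the same time.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Hypothetical stand-in for a libferris context (not the real C++ class)."""
    name: str
    content: bytes = b""                           # byte content, like a file
    children: dict = field(default_factory=dict)   # sub-contexts, like a directory
    ea: dict = field(default_factory=dict)         # attributes attached to this context

    def add(self, child):
        """Attach a sub-context and return it so calls can be chained."""
        self.children[child.name] = child
        return child

    def walk(self, prefix=""):
        """Yield (path, context) pairs for this context and everything below it."""
        path = f"{prefix}/{self.name}"
        yield path, self
        for child in self.children.values():
            yield from child.walk(path)


# An XML file is a natural example: it has bytes of its own and can also
# be listed like a directory of elements.
doc = Context("config.xml", content=b"<config><user>ben</user></config>")
user = doc.add(Context("config")).add(Context("user", content=b"ben"))
doc.ea["size"] = len(doc.content)   # an EA derived from the byte content

paths = [path for path, _ in doc.walk()]
print(paths)
```

The point of the sketch is only that nothing forces a node to be either a container or a blob of bytes; the same object answers both kinds of request.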

Some of the EA provided by libferris present the content of a context in a more directly usable format. For example, when viewing a directory with png files in it, libferris will not only create EA for width, height and depth information about the png files, but also provide the decoded image data through the “rgba-32bpp” EA. libferris can also decode mpeg2 video to the “pgmpipe” EA and demultiplex many composite media formats, presenting the many streams of video and audio data as individual EA.

libferris can have its abilities extended at any time by installing new modules for mounting contexts or accessing EA.

libferris uses URLs to locate its contexts. Indeed, each context has a url EA that one can read and pass to the ferris Resolve() function to get that context back again in the future. libferris deviates from other userland VFS solutions (such as gnome-vfs) in its handling of filesystem stacking. Filesystem stacking occurs when one reads a tar.gz file as a filesystem: the native disk access filesystem module cannot access the individual files in the tar.gz, so a special module is stacked on top that can mount this data as a filesystem. In libferris one can read a tar.gz file by just listing it.

$ mkdir ~/test && cd ~/test
$ date >testfile1
$ date >testfile2
$ tar czvf archive.tar.gz testfile*
$ ferrisls -lh ~/test/archive.tar.gz 
-rw-rw----        ben        ben                      testfile1 
-rw-rw----        ben        ben                      testfile2
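The stacking idea in the listing above can be illustrated outside libferris with Python's standard tarfile module. This is only a sketch of the concept: a lower layer hands over raw bytes (here an in-memory buffer standing in for the native disk module) and an upper layer interprets those bytes as a directory. The function names are hypothetical and say nothing about how libferris implements its modules.

```python
import io
import tarfile


def build_demo_archive():
    """Create a small tar.gz in memory, standing in for archive.tar.gz on disk."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name in ("testfile1", "testfile2"):
            data = b"demo\n"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_archive(archive_bytes):
    """Upper layer: treat the raw bytes as a directory and return its entries.

    Mode "r:*" lets tarfile detect the compression (gz, bz2, ...) itself,
    so the caller never has to say how the data is packed.
    """
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        return sorted(member.name for member in tar.getmembers())


names = list_archive(build_demo_archive())
print(names)
```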

The basic philosophy with URLs in libferris is that they should contain the location of the data and, where possible, avoid containing how to get to that location. Thus in the above example libferris worked out for itself which module to use to mount tar.gz/bz2 files as a filesystem and automatically stacked that module for you. In keeping with this philosophy, authentication data is stored in a netrc(5) like file in ~/.ferris. This allows libferris to mount mysql databases without having usernames and passwords in the URL string. Authentication for ssh is assumed to be using an ssh-agent(1).
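As a rough illustration of keeping credentials out of the URL, the sketch below looks up a host in a netrc(5) style file using Python's standard netrc module. The exact layout of the file libferris keeps under ~/.ferris is not described here, so the plain netrc syntax, the file name and the host name are all assumptions made for the example.

```python
import netrc
import os
import tempfile


def lookup_credentials(path, host):
    """Return (login, password) for host from a netrc-style file, or None."""
    entry = netrc.netrc(path).authenticators(host)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical credentials file; a real one would live in the user's home.
    path = os.path.join(tmp, "demo-netrc")
    with open(path, "w") as handle:
        handle.write("machine db.example.org login ben password s3cret\n")

    found = lookup_credentials(path, "db.example.org")
    missing = lookup_credentials(path, "other.example.org")

print(found, missing)
```

The URL then only has to name the host; the login details are found by host name at mount time.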

Currently I have reimplemented many of the fileutils programs from scratch using libferris. For the console there are ferrisls, ferrismv, ferrisrm, ferriscp and fcat. Many of these commands include additional functionality. For example, ferrisls includes options to set the record and field separators for output, a custom filter string, custom sorting, filesystem monitoring, recommended output mode and XML output mode.

Ferris filtering uses an extension of LDAP filter strings (see RFC 2254). Filters have been extended to include regex filters using the =~ operator. Some examples of ferris filters include:

(name=fred)
(&(size<100)(name=~^my.*e$))

Where the first matches only contexts with names (filenames) of fred and the second filter shows contexts smaller than 100 bytes with a name starting in my and ending in e.
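For readers unfamiliar with LDAP-style filters, the following Python sketch evaluates the two examples above against a few dictionaries of EA. It is not ferris code: the parser is a toy, the `|` and `!` combinators are taken from RFC 2254 rather than from anything shown here, the `<=`, `>` and `>=` comparisons are assumptions, and a regex containing ")" would confuse it.

```python
import re

# key, operator, value; "=~" and the two-character operators are tried first
TERM = re.compile(r"(?P<key>[^=<>]+)(?P<op>=~|<=|>=|=|<|>)(?P<value>.*)", re.S)


def _comparison(term):
    """Turn a single term such as 'size<100' into a predicate over an EA dict."""
    match = TERM.fullmatch(term)
    if match is None:
        raise ValueError(f"cannot parse term: {term!r}")
    key, op, value = match.group("key", "op", "value")

    def test(ea):
        if key not in ea:
            return False
        actual = ea[key]
        if op == "=":
            return str(actual) == value
        if op == "=~":
            return re.search(value, str(actual)) is not None
        number, limit = float(actual), float(value)
        return {"<": number < limit, ">": number > limit,
                "<=": number <= limit, ">=": number >= limit}[op]

    return test


def parse_filter(text):
    """Parse a filter string into a function that accepts a dict of EA."""
    pos = 0

    def node():
        nonlocal pos
        if text[pos:pos + 1] != "(":
            raise ValueError(f"expected '(' at offset {pos}")
        pos += 1
        if text[pos:pos + 1] in ("&", "|", "!"):
            op = text[pos]
            pos += 1
            parts = []
            while text[pos:pos + 1] == "(":
                parts.append(node())
            if not parts or text[pos:pos + 1] != ")":
                raise ValueError(f"malformed group at offset {pos}")
            pos += 1
            if op == "&":
                return lambda ea: all(part(ea) for part in parts)
            if op == "|":
                return lambda ea: any(part(ea) for part in parts)
            return lambda ea: not parts[0](ea)
        end = text.find(")", pos)
        if end == -1:
            raise ValueError("missing ')'")
        term = text[pos:end]
        pos = end + 1
        return _comparison(term)

    predicate = node()
    if pos != len(text):
        raise ValueError("trailing text after filter")
    return predicate


contexts = [
    {"name": "fred", "size": 10},
    {"name": "myfile", "size": 50},
    {"name": "mybigfile", "size": 5000},
    {"name": "myfilx", "size": 5},
]
by_name = parse_filter("(name=fred)")
small_my_e = parse_filter("(&(size<100)(name=~^my.*e$))")
first = [c["name"] for c in contexts if by_name(c)]
second = [c["name"] for c in contexts if small_my_e(c)]
print(first, second)
```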

The sorting string format has evolved over time. Optional modifiers may be given at the start of the string, delimited before and after by ':'. Modifiers include (!) for reverse sort, (#) for numeric sort, (VER) for version sort like ls -lv and (CIS) for case insensitive sorting. The name of the EA to sort with is then given. Examples include:

name
:!#:size

where the first line sorts by name case sensitive and the second sorts by size in reverse numerical order.
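The sort string format can be sketched in a few lines of Python. This is an illustration only, under some assumptions: that modifiers are simply concatenated inside the leading ':' pair (as in ":!#:size"), and that a version sort can be approximated by comparing runs of digits numerically. The function names are hypothetical.

```python
import re


def parse_sort_spec(spec):
    """Split a sort string into (set of modifiers, EA name)."""
    modifiers = set()
    if spec.startswith(":"):
        prefix, separator, ea_name = spec[1:].partition(":")
        if not separator:
            raise ValueError(f"unterminated modifier prefix in {spec!r}")
        modifiers.update(re.findall(r"VER|CIS|!|#", prefix))
    else:
        ea_name = spec
    return modifiers, ea_name


def sort_contexts(contexts, spec):
    """Return contexts (dicts of EA) ordered as the sort string asks."""
    modifiers, ea_name = parse_sort_spec(spec)

    def key(ea):
        value = ea[ea_name]
        if "#" in modifiers:
            return float(value)                    # numeric sort
        text = str(value)
        if "CIS" in modifiers:
            text = text.lower()                    # case insensitive
        if "VER" in modifiers:
            # Assumed approximation of a version sort: digit runs compare
            # as numbers, everything else as text.
            return [(0, int(part), "") if part.isdigit() else (1, 0, part)
                    for part in re.findall(r"\d+|\D+", text)]
        return text

    return sorted(contexts, key=key, reverse="!" in modifiers)


contexts = [
    {"name": "b.txt", "size": "9"},
    {"name": "A.txt", "size": "100"},
    {"name": "c.txt", "size": "20"},
]
by_name = [c["name"] for c in sort_contexts(contexts, "name")]
by_size_desc = [c["size"] for c in sort_contexts(contexts, ":!#:size")]
print(by_name, by_size_desc)
```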

ferrisls' monitor option allows one to monitor contexts for changes. Note that the underlying module needs to emit change events for this mode to work. The native filesystem module uses fam to allow monitoring.

The two new output formats in ferrisls make it much easier to view and convert non-traditional filesystems. ferrisls --xml will output an XML document, which allows one to create web directory listings showing all of the context's EA. ferrisls -0 shows what the context has nominated as its "recommended-ea". A good example of this is mounting a mysql database, where the column names of the table or query mounted form the recommended-ea.
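To show why an XML listing is convenient for generating web pages, here is a small Python sketch that renders a list of contexts, each with its EA, as XML. The element and attribute layout is invented for the example; the actual schema emitted by ferrisls --xml is not described here and may well differ.

```python
import xml.etree.ElementTree as ET


def listing_to_xml(dir_name, contexts):
    """Render contexts (dicts of EA) as one XML document.

    The element names "listing" and "context" are hypothetical.
    """
    root = ET.Element("listing", {"name": dir_name})
    for ea in contexts:
        # Every EA becomes an attribute, so a stylesheet can pick any of them.
        ET.SubElement(root, "context", {key: str(value) for key, value in ea.items()})
    return ET.tostring(root, encoding="unicode")


rows = [
    {"name": "testfile1", "size": 29},
    {"name": "testfile2", "size": 29},
]
document = listing_to_xml("archive.tar.gz", rows)
parsed = ET.fromstring(document)
print(document)
```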

Ferris also provides support for cursors and file-clipboards. There are command line clients to perform cut, copy and paste of files.

$ cd ~/test 
$ date >1;
$ fclipcut 1
$ cd /tmp
$ fclippaste .

These commands make libferris easy to integrate into existing file managers like nautilus.

Enter GTK+2

Most of the code to perform the fileutils clients' job is part of libferris. I have thus created graphical clients using GTK+2 and the same codebase as the command line cp, mv, and rm commands. These commands are prefixed with 'gf' to differentiate them from the console ferris clients and the standard fileutils clients.

Because I implemented the functionality of the fileutils code from scratch, I added in libsigc++ events that are fired at interesting times, which both the console and graphical clients connect to. This allows the graphical client to show progress of an individual file copy in its interface. This event model also allows the graphical client to present more options than are convenient in an interactive console tool. For example, when replacing a context, gfcp will present many EA from both the source and destination object and will allow you to create a predicate for future file replace dialogs based on any of the presented EA.
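The event model can be pictured with a tiny signal/slot sketch in Python. libsigc++ is a C++ library and its API is not reproduced here; the `Signal` and `Copier` classes below are hypothetical and only show the shape of the design: the copy code emits progress events, and any number of front ends (console, GUI) subscribe without the copy code knowing about them.

```python
import io


class Signal:
    """Minimal signal: remembers callbacks and calls each one on emit()."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        self._slots.append(slot)

    def emit(self, *args):
        for slot in self._slots:
            slot(*args)


class Copier:
    """Copies a stream in chunks and reports progress through a signal."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.progress = Signal()   # emitted with (bytes_done, bytes_total)

    def copy(self, source, destination, total):
        done = 0
        while True:
            chunk = source.read(self.chunk_size)
            if not chunk:
                break
            destination.write(chunk)
            done += len(chunk)
            self.progress.emit(done, total)
        return done


data = b"0123456789"
console_lines = []   # what a console client might print
gui_fractions = []   # what a progress bar might be set to

copier = Copier(chunk_size=4)
copier.progress.connect(lambda done, total: console_lines.append(f"{done}/{total}"))
copier.progress.connect(lambda done, total: gui_fractions.append(done / total))

target = io.BytesIO()
copied = copier.copy(io.BytesIO(data), target, len(data))
print(console_lines, gui_fractions)
```

Both listeners see the same events, which is what lets one codebase drive both the console and the graphical clients.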

$ cd /tmp && date >fileA && date >fileB
$ gfcp -avi fileA fileB

The yes and no options act just as you would expect. The auto options will respond for you the next time this dialog is needed (during this session). Your response is worked out by matching the values of the EA for each row you select in the dialog and comparing future conflict files with those values. I know that currently this auto feature is a little "user unfriendly", and I again challenge GUI experts to post gimp files and descriptions to the witme/libferris mailing list.
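One way to read the auto behaviour is as a predicate built from the EA rows ticked in the dialog. The Python sketch below follows that reading. It is a guess at the idea rather than gfcp's actual logic, and it simplifies by looking at a single dictionary of EA per conflict; `make_auto_rule` and its arguments are hypothetical names.

```python
def make_auto_rule(selected_ea, reference, answer):
    """Remember the values of the chosen EA and reuse answer when they match.

    selected_ea: names of the EA rows ticked in the dialog
    reference:   EA of the file involved in the current conflict
    answer:      what the user chose this time (for example "yes" or "no")
    """
    expected = {name: reference[name] for name in selected_ea}

    def rule(candidate):
        if all(candidate.get(name) == value for name, value in expected.items()):
            return answer
        return None   # no match: the dialog has to be shown again

    return rule


# The user ticks "owner" and "mimetype" on the first conflict and says yes.
first_conflict = {"name": "fileB", "owner": "ben", "mimetype": "text/plain", "size": 29}
rule = make_auto_rule(["owner", "mimetype"], first_conflict, "yes")

same_kind = rule({"name": "notes", "owner": "ben", "mimetype": "text/plain", "size": 812})
different = rule({"name": "photo", "owner": "ben", "mimetype": "image/png", "size": 4096})
print(same_kind, different)
```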

ferriscreate allows you to create new context objects. This is handy because you can provide metadata about the object at creation time, for example the width and height of a new png file.

Using ferriscreate provides a uniform interface for creating new objects instead of having to become familiar with many dialogs that create the same object type. For more details on ferriscreate see its paper.

Getting it

Some of the C++ features that libferris uses require gcc 3.1 to function properly. Because of the gcc and library requirements of a full ferris install, I have created a "mini-distro" of rpm/srpm files that will install libferris onto a Red Hat 7.3 machine. These will hopefully be hosted by sourceforge.net soon. If you would like to host these files (about 100MB), then drop me a line on the main list.

If you just want the tar.bz2 files, get them from libferris' home page and after installing use:

$ ferris-first-time-user --setup-defaults
$ ferris-first-time-user --logging-none

to set up a default ~/.ferris for each user.

Developer details

libferris was written making use of modern C++ design such as the STL and templates. Ferris is currently at release 0.9.21. Once it gets to 1.0.0, I plan to also create an LD_PRELOAD hack that will replace functions like fopen(3) with functions that use libferris to operate on files.

I will probably create a little tutorial on creating EA modules depending on community interest.

Future

For libferris I can see new context modules for cached stacking, LDAP, mng, xindice and other formats. I'd also like to abstract the sorting and filtering to use modules so that new sorting methods can be added (and cascaded stacked sorting again!). A php4 wrapper is also planned. For the details see the TODO in the tarball.

I currently have a GTK+2 graphical client providing both gevas and gtktreeview filesystem views. This client is the evolution of witme2, though it is still in alpha state and currently not publicly available.

About the author:
Ben Martin started coding a file manager called dzdir for the Amiga in high school shortly after learning C. He spent 5 years at Queensland University of Technology (QUT) studying databases, general programming, networks and security, receiving a Bachelors and Masters of Info Tech for his troubles. He has been coding filesystem code in one way or another for over ten years.

6 Comments

2002-07-03 7:59 pm

Anonymous
This looks to be like what GNOME is working on with gnome-vfs and gnome-vfs-extras. It treats all connections ftp, samba, tarballs, bzip tarballs, sockets, as a simple layer for handling different URIs.
2002-07-04 12:34 am

Anonymous
I decided to start libferris instead of joining g-vfs for many reasons:

* many people thought that the VFS should never decode/encode data (such as png, jpg, video and audio).

* the use of EA to anywhere near the extent used in libferris would have been such a long talk on mailing lists.

* AFAIK the stacking of URLs in gnome-vfs is still using the ugly embed the handler with # like syntax, once again there were many folks on #gnome who were saying that this wouldn’t be changing.

* The roundtrip of context object <–> DOM and close integration with xslt.

* Different vision for sorting and filtering.

* The merge of file+dir=context.

* I plan to support the reiserfs EA of the new reiser (which will make for very interesting times).

* The use of C++: templates and STL

* For fun 😉

That said, libferris uses glib2 and I plan to write a gnome-vfs module to mount libferris so that the GNOME community have another way to use libferris.
2002-07-04 2:56 am

Anonymous
Well, what can I say other than libferris is a great project. Good problem space, good thinking, and a good stab at coding it all up. Thank you for making it happen.

As luck would have it, I was just at the libferris site a couple days ago. I will be having fun with it as soon as my Linux box is up and going.

My first exposure to integrating disparate data worlds came at Z-Code in 1995 when we did some investigation on evolving Z-Mail into what we called “Active Mail”. At the time, Mosaic was still pretty basic and Yahoo! still fit on a couple pages of HTML 😉 This is still an area that is quite undeveloped. Much as the net and the web did for shared computing, there is still much to be done.

In the NT world, I’ve been happy with 4NT which gives me the ability to work with FTP sites as files.

Cheers for now and I hope to see libferris continue!

#m
2002-07-04 10:14 am

Anonymous
this is how it should be done! finally i can mount my home dir from work despite the fact that NFS doesnt work over a WAN. just libferris-ssh and its all there.

its a pity this can’t be used by older applications (ls,cat,etc…) but that would be expecting a little much. its also a brilliant starting point for anyone wanting to write a microkernel fs server.
2002-07-05 12:29 am

Anonymous
I’m pretty sure this is mentioned many times, but I’ll do it again here. Installing fsh prior to building libferris makes ferris use fsh instead of ssh directly. fsh is a tool for establishing an ssh tunnel for remote execution of commands without requiring an ssh authentication on every connection: http://freshmeat.net/projects/fsh/?topic_id=44%2C150

Makes doing ferrisls ssh:// many times much faster.
2002-07-05 12:32 am

Anonymous
I’m currently using a subscription scheme to try to code ferris full time again. Basically ferris is GPL relying on folks who are interested enough in ferris to step forward and support it.