Linked by Thom Holwerda on Thu 4th Nov 2010 22:40 UTC, submitted by rhyder
Linux "For a fairly scruffy looking guy, I have a surprisingly healthy approach to organising my files. However, I'm constantly pushing up against the limitations of a system that is based around directories. I'm convinced that Linux needs to make greater use of tagging, but I'm also beginning to wonder if desktop Linux could abandon the hierarchical directory structure entirely."
Thread beginning with comment 448565
Not just tagging
by Zifre on Thu 4th Nov 2010 23:34 UTC
Zifre Member since:
2009-10-04

Tagging is nice, but file systems need a complete overhaul.

First, we need transactional file systems. There is really no good reason not to have a transactional file system. It would make things like updates, installations, and removals much simpler. It would also make a lot of the common synchronization hacks unnecessary. The thing is, this really isn't that hard. I created a very primitive transactional file system prototype for Linux some months ago, but I haven't had time to finish it (I plan on basing it on Btrfs). Any user could do transactions, and they would never block. The basic algorithm was that if a transaction wanted to write to something that was being read, it would be cancelled, and if it wanted to read something that was being written, it would be cancelled.

Second, we need indexing of extended attributes. BFS got this right. My music should just be a folder with a bunch of files that have metadata. There should be no database. I should be able to search for songs with complex logical queries, not just simple text searches like you would find in a standard music player (e.g. iTunes, Rhythmbox).

Personally, I believe tagging is secondary to all of this. My mind naturally categorizes things hierarchically, but I have had times when I wished a file could be in two folders.

I am quite sure that none of these ideas have been implemented not because they are hard, but because people stopped caring. File systems have hardly changed since the 1980s (the interface, not the implementations). I think the biggest problem with Linux is that most people are focused on creating a shiny interface, when the system below is inelegant and full of hacks. Of course, every major OS is like this, but I think it shows more in Linux. This is an area where Linux could really innovate and be better than Windows and Mac OS X.

Score: 5

RE: Not just tagging
by koorogi on Fri 5th Nov 2010 00:09 in reply to "Not just tagging"
koorogi Member since:
2010-11-04

My mind naturally categorizes things hierarchically, but I have had times when I wished a file could be in two folders.


This has been possible for ages. It's supported by all POSIX-compliant OSes, plus Windows NT4+. It's called a hard link.

Score: 4

RE[2]: Not just tagging
by Zifre on Fri 5th Nov 2010 12:30 in reply to "RE: Not just tagging"
Zifre Member since:
2009-10-04

This has been possible for ages. It's supported by all POSIX-compliant OSes, plus Windows NT4+. It's called a hard link.

Yes, I know about hard links. However, I have to get out the command line to create them.

Also, I would generally like it so that if I delete the file from one directory, it would disappear from all the others too. Hard links don't work like that. (You could do that with symbolic links, but you would be left with broken links.)
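A minimal sketch of the behaviour being discussed, assuming a POSIX system and a file system that supports both link types. The directory and file names are made up for illustration. It shows that removing one name of a hard-linked file leaves the other name usable, while a symbolic link to the removed name is left dangling:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Two hypothetical "folders" that should both contain the same song.
    music = os.path.join(root, "music")
    favorites = os.path.join(root, "favorites")
    os.mkdir(music)
    os.mkdir(favorites)

    original = os.path.join(music, "song.ogg")
    with open(original, "w") as f:
        f.write("audio data")

    hard = os.path.join(favorites, "song-hard.ogg")
    soft = os.path.join(favorites, "song-soft.ogg")
    os.link(original, hard)      # a second name for the same inode
    os.symlink(original, soft)   # a separate file that only stores a path

    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    links_before = os.stat(original).st_nlink

    os.remove(original)          # "delete the file from one directory"

    # The data is still reachable through the hard link ...
    hard_survives = os.path.exists(hard)
    # ... while the symbolic link still exists but points at nothing.
    soft_is_broken = os.path.islink(soft) and not os.path.exists(soft)
```

Neither link type gives the "delete once, gone everywhere, no broken links" behaviour asked for above; that would need the file system to track references itself.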

Score: 2

RE: Not just tagging
by modmans2ndcoming on Fri 5th Nov 2010 00:53 in reply to "Not just tagging"
modmans2ndcoming Member since:
2005-11-09

Unless you have come up with a magical method of concurrency, there will always be blocking unless you take on multi-versioning, but that brings with it its own issues to think through.

Score: 2

RE[2]: Not just tagging
by Zifre on Fri 5th Nov 2010 12:33 in reply to "RE: Not just tagging"
Zifre Member since:
2009-10-04

Unless you have come up with a magical method of concurrency, there will always be blocking unless you take on multi-versioning, but that brings with it its own issues to think through.

Nope, there is no blocking. Whenever a transaction would normally block, it is aborted. If two transactions are competing, the one with the higher priority always wins. Regular file operations are treated as transactions with infinite priority, so they are never aborted or blocked for transactions.
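The abort-instead-of-block rule can be sketched in a few lines. This is only an in-memory illustration under stated assumptions, not the prototype mentioned above: the `Manager`, `Transaction`, and `plain_access` names are hypothetical, ties are resolved against the requester, and the actual data writes and rollback are left out. It only tracks who holds which path and who gets cancelled.

```python
import math

class Aborted(Exception):
    """Raised when a transaction loses a conflict and is cancelled."""

class Transaction:
    def __init__(self, manager, name, priority):
        self.manager, self.name, self.priority = manager, name, priority
        self.active = True

    def read(self, path):
        self.manager.access(self, path, writing=False)

    def write(self, path):
        self.manager.access(self, path, writing=True)

    def commit(self):
        if not self.active:
            raise Aborted(self.name)
        self.manager.release(self)
        self.active = False

class Manager:
    def __init__(self):
        self.readers = {}  # path -> set of transactions reading it
        self.writers = {}  # path -> transaction writing it

    def begin(self, name, priority=0):
        return Transaction(self, name, priority)

    def access(self, tx, path, writing):
        if not tx.active:
            raise Aborted(tx.name)
        # A read conflicts with another writer; a write also conflicts with readers.
        rivals = set()
        writer = self.writers.get(path)
        if writer is not None and writer is not tx:
            rivals.add(writer)
        if writing:
            rivals |= self.readers.get(path, set()) - {tx}
        # Never wait: either the requester or every rival is cancelled.
        if any(r.priority >= tx.priority for r in rivals):
            self.abort(tx)
            raise Aborted(tx.name)
        for rival in rivals:
            self.abort(rival)
        if writing:
            self.writers[path] = tx
        else:
            self.readers.setdefault(path, set()).add(tx)

    def plain_access(self, path, writing):
        """A regular file operation: infinite priority, holds nothing afterwards."""
        tx = Transaction(self, "plain", math.inf)
        self.access(tx, path, writing)
        self.release(tx)

    def abort(self, tx):
        tx.active = False
        self.release(tx)

    def release(self, tx):
        for holders in self.readers.values():
            holders.discard(tx)
        for path in [p for p, w in self.writers.items() if w is tx]:
            del self.writers[path]

mgr = Manager()
low = mgr.begin("installer", priority=1)
high = mgr.begin("updater", priority=5)

low.read("/etc/app.conf")
high.write("/etc/app.conf")          # conflict: the higher priority wins
low_cancelled = not low.active

late = mgr.begin("late", priority=1)
try:
    late.read("/etc/app.conf")       # reading something that is being written
    late_cancelled = False
except Aborted:
    late_cancelled = True

mgr.plain_access("/etc/app.conf", writing=True)  # never blocked, never aborted
high_cancelled = not high.active
```

The cost of never blocking is that a cancelled transaction has to be retried by whoever started it, which is where the concurrency objection above still has some force.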

Score: 2

RE: Not just tagging
by Zifre on Fri 5th Nov 2010 12:53 in reply to "Not just tagging"
Zifre Member since:
2009-10-04

One thing I forgot to mention: file names. We really need to stop relying on names to locate files. Something like a UUID would be much better. The name would be solely for display purposes, and would just be a regular indexed extended attribute. Links would reference the UUID, not the name. The entire file system would essentially be a giant database. You could query the file system based on any attributes, and the result would be a list of UUIDs. You could then open a file through the UUID. Directory structures could be implemented using a parent attribute that would refer to the "directory" (really a file) containing a file. To get a listing of the files in a directory, you would query for all files with a parent attribute equal to the directory's UUID. Tagging would be implemented in a similar way.
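As a rough illustration of that model, here is a toy in-memory store. Everything in it (the `AttrStore` class, the attribute names, the sample files) is hypothetical, and the query is a plain linear scan where a real design would use an index:

```python
import uuid

class AttrStore:
    """Toy 'file system' keyed by UUID; the name is just another attribute."""

    def __init__(self):
        self.files = {}  # UUID -> {"attrs": dict, "data": bytes}

    def create(self, data=b"", **attrs):
        file_id = uuid.uuid4()
        self.files[file_id] = {"attrs": dict(attrs), "data": data}
        return file_id

    def query(self, **wanted):
        """Return the UUIDs of files matching every requested attribute."""
        return [fid for fid, f in self.files.items()
                if all(f["attrs"].get(k) == v for k, v in wanted.items())]

    def open(self, file_id):
        return self.files[file_id]["data"]

    def listdir(self, dir_id):
        # A directory listing is just a query on the parent attribute.
        return self.query(parent=dir_id)

store = AttrStore()
music = store.create(name="Music")
song = store.create(b"audio", name="track01.ogg", parent=music,
                    artist="Example Artist")
notes = store.create(b"todo", name="notes.txt", parent=music)

listing = sorted(store.files[f]["attrs"]["name"] for f in store.listdir(music))
by_artist = store.query(artist="Example Artist")

# Renaming changes only a display attribute; the UUID still opens the file.
store.files[song]["attrs"]["name"] = "renamed.ogg"
still_found = store.open(song) == b"audio"
```

Tags would fit the same shape: one more attribute to query on, with no change to how files are opened.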

Unfortunately, this is a bit harder to implement. The major problem is dealing with broken links. If you delete a file, do all the references to it go away, or stay broken? Would it be possible to create a file with a specific UUID in order to fix a broken link? These problems are a lot harder to solve, so I would not expect to see a system like this for a long time. It is somewhat similar to WinFS. Does anyone know how WinFS solves these problems?

Score: 2

RE[2]: Not just tagging
by sbalmos on Fri 5th Nov 2010 20:09 in reply to "RE: Not just tagging"
sbalmos Member since:
2008-01-31

Unfortunately, you've pretty much described an inode, and how filesystems generally work already. Especially a directory being a special file that contains other file ID (sorry, inode) references.

Score: 2

RE: Not just tagging
by phoenix on Fri 5th Nov 2010 16:32 in reply to "Not just tagging"
phoenix Member since:
2005-07-11

First, we need transactional file systems.


Only if by "we" you mean Linux. ;) Non-Linux systems have had transactional filesystems for years now (ZFS, HAMMERFS), and support for versioning in the filesystem (VMS).

There is really no good reason not to have a transactional file system. It would make things like updates, installations, and removals much simpler.


You're right, it does. ;) ZFS snapshot your filesystem(s), do your updates. If it fails, roll-back the snapshot and carry on. If it succeeds, you either keep the snapshot just-in-case, or you delete it. Works beautifully, even across full OS upgrades.

Second, we need indexing of extended attributes. BFS got this right. My music should just be a folder with a bunch of files that have metadata. There should be no database.


Uhm, what do you call your index, if not a database?

Personally, I believe tagging is secondary to all of this. My mind naturally categorizes things hierarchically, but I have had times when I wished a file could be in two folders.


Some kind of tagging or EA system would be nice, for just this reason. After using GMail and Zimbra for the past couple of years, it's nice being able to physically store messages in a hierarchical manner, but also access them via multiple "folders"/tags where appropriate. And having saved searches (virtual folders) that refresh each time you go into them is absolutely wonderful; something I've missed from GUI file managers like Dolphin.

Score: 3

RE[2]: Not just tagging
by jonas.kirilla on Fri 5th Nov 2010 22:26 in reply to "RE: Not just tagging"
jonas.kirilla Member since:
2005-07-11

When people think "database", they might think of a userland process, some kind of metadata storage on -top- of a traditional filesystem and some periodic indexing process. BFS indices (in BeOS and in Haiku) are an integral part of the filesystem. Indexing happens in the filesystem (is done by the filesystem) at the exact time when attributes are created/altered. There is no periodic indexing process, and there is no separate metadata storage. (Which could potentially get out of sync with the target files.)
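The difference from a periodic indexer can be sketched like this. It is not how BFS is implemented; the `IndexedAttrs` class and the sample attributes are hypothetical, and the point is only that the index is updated in the same step as the attribute write, so there is nothing to drift out of sync:

```python
from collections import defaultdict

class IndexedAttrs:
    def __init__(self):
        self.attrs = defaultdict(dict)  # file -> {attribute: value}
        # attribute -> value -> set of files carrying that value
        self.index = defaultdict(lambda: defaultdict(set))

    def set_attr(self, file, key, value):
        # Drop the stale index entry, if any, as part of the same write.
        if key in self.attrs[file]:
            self.index[key][self.attrs[file][key]].discard(file)
        self.attrs[file][key] = value
        self.index[key][value].add(file)

    def find(self, key, value):
        return set(self.index[key][value])

fs = IndexedAttrs()
fs.set_attr("a.ogg", "genre", "jazz")
fs.set_attr("a.ogg", "year", 1959)
fs.set_attr("b.ogg", "genre", "jazz")
fs.set_attr("b.ogg", "year", 1975)
fs.set_attr("c.ogg", "genre", "rock")
fs.set_attr("c.ogg", "year", 1975)

# Logical queries are set operations over index lookups.
jazz_not_1975 = fs.find("genre", "jazz") - fs.find("year", 1975)

# Changing an attribute is visible to queries immediately.
fs.set_attr("c.ogg", "genre", "jazz")
rock_now = fs.find("genre", "rock")
jazz_now = fs.find("genre", "jazz")
```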

Score: 2

RE[2]: Not just tagging
by Zifre on Sat 6th Nov 2010 20:20 in reply to "RE: Not just tagging"
Zifre Member since:
2009-10-04

Only if by "we" you mean Linux. Non-Linux systems have had transactional filesystems for years now (ZFS, HAMMERFS), and support for versioning in the filesystem (VMS).

Nope, that's an entirely different type of transaction. The only "real" transactional file systems (i.e. allow multiple user-level transactions that can be cancelled individually) that I am aware of are TxF for Windows Vista/7, and TxOS for Linux: http://www.cs.utexas.edu/~porterde/txos/

You're right, it does. ZFS snapshot your filesystem(s), do your updates. If it fails, roll-back the snapshot and carry on. If it succeeds, you either keep the snapshot just-in-case, or you delete it. Works beautifully, even across full OS upgrades.

That works fine when you only need to do one transaction at a time. There is no reason why a file manager shouldn't be able to do atomic copies or atomic unpacking of archives. Snapshotting the entire file system is not a very general or elegant way to solve the problem.
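For comparison, this is roughly the kind of workaround an application has to use today to approximate an atomic unpack without file system transactions: extract into a staging directory, then publish it with a single rename. It is a sketch under assumptions, not a substitute for real transactions. The `unpack_atomically` helper is hypothetical, the destination must not already exist, and the rename is only atomic on POSIX when staging and destination are on the same file system.

```python
import os
import shutil
import tempfile
import zipfile

def unpack_atomically(archive_path, dest_dir):
    """Extract a zip so that dest_dir appears all at once or not at all."""
    parent = os.path.dirname(os.path.abspath(dest_dir))
    # Stage next to the destination so the final rename stays on one file system.
    staging = tempfile.mkdtemp(prefix=".unpack-", dir=parent)
    try:
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(staging)
        os.rename(staging, dest_dir)  # the single "commit" step
    except BaseException:
        # Any failure before the rename leaves no partial destination behind.
        shutil.rmtree(staging, ignore_errors=True)
        raise

with tempfile.TemporaryDirectory() as root:
    archive_path = os.path.join(root, "demo.zip")
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("a.txt", "alpha")
        archive.writestr("sub/b.txt", "beta")

    dest = os.path.join(root, "unpacked")
    unpack_atomically(archive_path, dest)

    unpacked = sorted(os.listdir(dest))
    leftovers = [n for n in os.listdir(root) if n.startswith(".unpack-")]
```

This covers only the "create a new tree" case. Atomically replacing files scattered across existing directories is where the hack stops working and per-transaction support in the file system would be needed.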

Uhm, what do you call your index, if not a database?

It is a database, but it's part of the file system (i.e. not updated by applications). Look at BFS on Haiku or BeOS.

Score: 2

RE: Not just tagging
by abraxas on Sun 7th Nov 2010 15:45 in reply to "Not just tagging"
abraxas Member since:
2005-07-07

Personally, I believe tagging is secondary to all of this. My mind naturally categorizes things hierarchically, but I have had times when I wished a file could be in two folders.


You can have that with a hard link although it does have its limitations.

Score: 2