Linked by Mo Mckinlay on Fri 31st Oct 2003 17:35 UTC
There's been much discussion over the past few months about the marriage of databases and filesystems - with Microsoft's Longhorn reportedly sporting the
Yukon integrated SQL Server, and GNOME Storage in heated debate, if not development, there's been lots to talk about.
A long time ago, there was an operating system on IBM big iron called MTS (Michigan Terminal System) whose filesystem was a lot like a simple database: each record had a fixed 4-byte key, a 2-byte (signed) length, and 1 to 32767 octets of data.
<p>
Typically, the 4-byte key was treated as a signed integer line number, except that the first line was (decimal) 1000, displayed as "1.000" (with trailing zero blanking.)
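<p>
As a rough illustration of that display convention, here is a small Python sketch. It assumes the key is simply the line number in thousandths and that "trailing zero blanking" means dropping trailing zeros (and a dangling decimal point); the function name and the `blank_zeros` flag are made up for this example and are not from MTS.

```python
def format_line_number(key: int, blank_zeros: bool = True) -> str:
    """Render a signed integer key as a line number in thousandths.

    Assumption: key 1000 is line "1.000"; with blanking on, trailing
    zeros and a dangling decimal point are dropped.
    """
    sign = "-" if key < 0 else ""
    whole, frac = divmod(abs(key), 1000)
    text = f"{sign}{whole}.{frac:03d}"
    if blank_zeros:
        # The decimal point stops the first strip from eating zeros
        # that belong to the whole part (e.g. "10.000" -> "10").
        text = text.rstrip("0").rstrip(".")
    return text


if __name__ == "__main__":
    for key in (1000, 1123, 1500, -2000):
        print(key, "->", format_line_number(key))
```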
<p>
The advantages of this system were numerous. All userids were 1 to 4 characters (uppercase), which mapped directly into that 4-byte key, making various databases keyed on userid easy. New lines could be inserted into existing files without causing subsequent line numbers to change, which made it easier to review printed listings (this was when time on the computer was strictly rationed and printouts were very useful). Documentation for a program could be put in the "negative" line numbers. There were other advantages too, but I'd rather not drone on too much.
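<p>
The userid-to-key mapping can be sketched in Python as below. This is only an illustration under assumptions: it pads short userids with spaces and uses ASCII, whereas an IBM mainframe of that era would presumably have used EBCDIC, and the actual padding rule is not stated above. The function names are hypothetical.

```python
import struct


def userid_to_key(userid: str) -> int:
    """Pack a 1-4 character uppercase userid into a signed 4-byte key.

    Assumptions (not from the original system): ASCII encoding and
    right-padding with spaces.
    """
    if not 1 <= len(userid) <= 4:
        raise ValueError("userid must be 1 to 4 characters")
    if not userid.isascii() or userid != userid.upper():
        raise ValueError("userid must be uppercase ASCII")
    raw = userid.encode("ascii").ljust(4, b" ")
    # ">i" = big-endian signed 32-bit integer
    return struct.unpack(">i", raw)[0]


def key_to_userid(key: int) -> str:
    """Inverse of userid_to_key under the same assumptions."""
    return struct.pack(">i", key).decode("ascii").rstrip(" ")


if __name__ == "__main__":
    key = userid_to_key("MO")
    print(hex(key), key_to_userid(key))
```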
<p>
Of course, there were disadvantages too. Until you renumbered the file (changed the keys), you could find yourself unable to insert a new line between lines 1.123 and 1.124, because no key was left between them. There was no "end-of-line" character. And under this particular implementation, it wasn't possible to have a zero-length line. (Writing a zero-length line at a particular key was the DELETE operation.) And, of course, the length limit of 32767 was a problem too.
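<p>
Those constraints are easier to see in a toy model. The Python sketch below is not how MTS was implemented; the class and method names are invented, and `insert_between` assumes its two arguments are adjacent existing keys. It only mimics the behaviour described above: integer keys, zero-length write as delete, the 32767 limit, and the need to renumber when no key is free.

```python
import bisect

MAX_LEN = 32767  # largest length a signed 2-byte field can hold


class LineFile:
    """Toy keyed-line file: sorted integer keys mapped to byte strings."""

    def __init__(self):
        self._keys = []   # kept sorted
        self._data = {}   # key -> bytes

    def write(self, key: int, data: bytes) -> None:
        if len(data) > MAX_LEN:
            raise ValueError("line longer than 32767 octets")
        if not data:
            # A zero-length write is the DELETE operation.
            if key in self._data:
                del self._data[key]
                self._keys.remove(key)
            return
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = data

    def insert_between(self, low: int, high: int, data: bytes) -> int:
        """Insert between two adjacent keys, returning the new key."""
        if not data:
            raise ValueError("zero-length lines cannot be stored")
        if high - low < 2:
            raise ValueError("no free key between lines; renumber first")
        key = (low + high) // 2
        self.write(key, data)
        return key

    def renumber(self, start: int = 1000, step: int = 1000) -> None:
        """Reassign keys so there is room between lines again."""
        old = [self._data[k] for k in self._keys]
        self._keys = [start + i * step for i in range(len(old))]
        self._data = dict(zip(self._keys, old))

    def lines(self):
        return [(k, self._data[k]) for k in self._keys]
```

With lines at keys 1123 and 1124, `insert_between(1123, 1124, ...)` fails until `renumber()` spreads the keys out again.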
<p>
Basically, there is a problem when you transfer files between structured file systems (like these MTS files, OS/400, or the old Macintosh file system) and unstructured file systems, where a file is simply a stream of bytes (probably addressable). And even unstructured filesystems usually have some structure, such as how text is represented, end-of-line conditions, etc.
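<p>
One concrete instance of the transfer problem, sketched in Python: a record-structured file has no end-of-line character, so a record may contain any byte, but a byte-stream export has to pick a delimiter. The helper names are hypothetical and the newline delimiter is an assumption chosen for illustration.

```python
def to_stream(records):
    """Flatten records into a byte stream, one newline-terminated line each."""
    return b"".join(record + b"\n" for record in records)


def from_stream(stream):
    """Split a newline-delimited byte stream back into records."""
    parts = stream.split(b"\n")
    if parts[-1] == b"":
        parts.pop()  # drop the empty piece after the final newline
    return parts


if __name__ == "__main__":
    plain = [b"one", b"two"]
    print(from_stream(to_stream(plain)) == plain)   # round-trips

    tricky = [b"x\ny"]  # a single record that happens to contain 0x0A
    print(from_stream(to_stream(tricky)))           # comes back as two records
```

Records that contain the delimiter byte do not survive the round trip, which is the kind of mismatch that shows up when moving data between the two worlds.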
<p>
Having a structured filesystem can ease certain problems, but it increases the complexity of the system too. And I can't help but think that with a filesystem where the schema is configurable (e.g. user-defined attributes, forks, etc.), archiving such a mess and inspecting and validating the filesystem after a crash can be a real pain. Of course, if the filesystem is a database, then the OS must handle simultaneous updates in a rigorous way.