Linked by Thom Holwerda on Wed 25th Jul 2012 22:18 UTC
The article I'm about to link to, by Oliver Reichenstein, is pretty terrible, but it's a good way for me to bring up something I've been meaning to talk about. First, the article: "Apple has been working on its file system and with iOS it had almost killed the concept of folders - before reintroducing them with a peculiar restriction: only one level! With Mountain Lion it brings its one folder level logic to OSX. What could be the reason for such a restrictive measure?" So, where does this crusade against directory structures (not file systems, as the article aggravatingly keeps stating) come from?

For anyone still reading - and with profuse apologies in advance for mad length - I've been giving the topic a bit more thought. Being an IPC weenie, I will tend to cast any given solution in terms of "I know, let's use some IPC." (And now we have two problems.) So I'm coming at everything from a "it's a communication problem" perspective, just so you're aware.

First, let's reiterate the root requirement: how to keep a user's data safe, secure, and quickly and easily accessible - and all of this in a highly heterogeneous, networked environment - without the user having to do any of the tedious clerical work that involves?

If we phrase this in terms of a contract between computers and users, then straight off the top of my head:

1. This has to work across:

- individual user devices (not just general-purpose PCs but also phones, tablets, AV systems, games consoles, and anything else you can think of)
- home and business intranet-based NAS and server boxes
- big internet-based commercial cloud storage services.

2. It'll require behaviours such as:

- automatic replication and redundancy (including all the challenges that come with synchronising user changes across all data stores)
- full revision history
- security and trust (not only must accidental data loss never occur, but all data must be heavily protected against malicious access)
- transparency and automation-based ease of use (since it must cater to users from all walks of life).

Not a comprehensive list by any means, but enough to illustrate. It's no small request either, so any attempt at addressing it is going to have to work extremely hard at keeping a lid on complexity.
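To make requirement 2 a little more concrete, here's a toy sketch of the "full revision history" behaviour - every write creates a new immutable revision, so nothing is ever silently overwritten. All the names (`VersionedStore`, `put`, `get`) are invented for illustration, not any real system's API:

```python
import hashlib
import time

class VersionedStore:
    """Toy sketch: writes never overwrite, so the complete
    history of every item is always recoverable."""

    def __init__(self):
        self._blobs = {}      # content hash -> bytes
        self._history = {}    # key -> list of (timestamp, hash)

    def put(self, key: str, data: bytes) -> str:
        h = hashlib.sha256(data).hexdigest()
        self._blobs[h] = data                  # immutable blob
        self._history.setdefault(key, []).append((time.time(), h))
        return h

    def get(self, key: str, revision: int = -1) -> bytes:
        # revision=-1 gives the latest; 0 the very first write.
        _, h = self._history[key][revision]
        return self._blobs[h]

store = VersionedStore()
store.put("notes.txt", b"first draft")
store.put("notes.txt", b"second draft")
print(store.get("notes.txt"))      # b'second draft'
print(store.get("notes.txt", 0))   # b'first draft'
```

Content-addressing the blobs by hash also happens to give you the replication and deduplication hooks from requirement 1 almost for free, which is exactly the kind of concept reuse argued for below.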

Sounds daunting, I know, but this is actually something that the early Unix world had a fantastic knack for: take a large, complicated, scary problem and solve it by the highly creative, insightful and relentlessly parsimonious application of what few crazily limited resources were available to the task. It's only later, as growing wealth and maturity generate comfort and complacency, that such careful habits become non-essential, and the old talent and tricks for efficient, effective problem solving are forgotten or lost in the middle-age spread.

For example, think about what the hierarchical Unix file system [plus its expanded and refined Plan9 descendant] represents. From a default perspective (such as the one Thom is seeing), it's just a simple 1:1 representation of the contents of one or more disk-type storage devices wired to the local system.

However, those old Unix guys were terrific at spotting opportunities for reusing concepts and behaviours, so it also works as a collection of endpoints onto various services - device drivers, network-mounted drives, etc. - thanks to stuff like Unix domain sockets, which are in turn only a tail's shake away from Internet sockets. Imagine the increased load and complexity on early Unix had they simply ploughed in with a wealth of tools and manpower at their disposal, creating a completely new interaction model as each new requirement presented itself. (They still missed some opportunities - hence Plan9 - but for a first crack at the problem it was pretty damn good.)
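You can see that concept reuse in miniature from any POSIX system today: the file descriptor is the shared abstraction, so the very same read/write calls work on an on-disk file and on a Unix domain socket. A small sketch (Python here purely for brevity; the underlying syscalls are the point):

```python
import os
import socket
import tempfile

# Same os.read()/os.write() calls, two very different endpoints.

# 1. An ordinary on-disk file.
with tempfile.NamedTemporaryFile() as f:
    os.write(f.fileno(), b"hello")
    os.lseek(f.fileno(), 0, os.SEEK_SET)
    print(os.read(f.fileno(), 5))      # b'hello'

# 2. A Unix domain socket pair - no disk involved at all,
#    yet the descriptor-based calls are identical.
a, b = socket.socketpair()
os.write(b.fileno(), b"hello")
print(os.read(a.fileno(), 5))          # b'hello'
a.close()
b.close()
```

One interaction model, many kinds of endpoint - which is precisely the parsimony being praised here.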

This parsimonious, holistic approach to all-over system design seems to have been all too often squandered or forgotten over the years (e.g. the failure of Linux DEs to follow Unix Philosophy [1]), but this is surely the right time to get back to it. Like it or not, the entire concept of what a computing system is and what it should be is rapidly and radically changing.

Just as the original primary (RAM) vs secondary (disk) storage kludge has created all sorts of awkward divisions and clumsiness, so too has local vs remote storage. We have one protocol for accessing local data (the hierarchical file system) and a whole plethora of them for accessing remote data (RESTful HTTP being just one example). These are divides drawn along technological lines - originally out of necessity, but now through inertia, laziness and lack of vision. Why people use technology has been lost amid a microscopic focus on how they currently do it, with no regard to whether that's still optimal or not.
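The divide is easy to demonstrate: the "same" act - fetch some bytes by name - goes through two entirely unrelated APIs depending on where the bytes happen to live. (The `fetch` helper below is my own illustrative wrapper, not an existing API.)

```python
from pathlib import Path
from urllib.request import urlopen

def fetch(name: str) -> bytes:
    """Fetch bytes by name, papering over the local/remote divide
    that applications are normally forced to handle themselves."""
    if name.startswith(("http://", "https://")):
        with urlopen(name) as resp:        # remote: the full HTTP stack
            return resp.read()
    return Path(name).read_bytes()         # local: file system calls
```

That every networked application ends up hand-rolling some variant of this shim is itself evidence that the divide sits in the wrong place.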

Such well-worn ruts may have become familiar and comfortable - and even a position of power for those most practised in negotiating them - but they are going to become a liability that even the most conservative nerds/geeks will not be able to afford to ignore.

What's needed is to step all the way back and try slicing the entire problem along new and different (even novel) lines, i.e. according to [ideal] usage. And then redefine the technology so that in future it fits the way that users should interact with their data, rather than forcing users to adapt themselves to the current technology with all its legacy baggage and myriad complexities and faults.

Change is already underway, of course, but even from my largely uneducated viewpoint it looks completely piecemeal, with no clear overarching strategy. For example, Mountain Lion's Core Data (data storage) framework can now use iCloud as its backing store, but this is at the application level, and just one particular framework on one particular OS using one particular cloud. In fact, ML now has no fewer than three different ways to interact with iCloud.

Now, I am all for 'let a thousand flowers bloom' as far as research projects go, but when it comes to production systems to be used by millions, a coherent overarching strategy is absolutely essential if complexity is to be managed at all. Such as: pushing the functionality as far down in the system as it'll possibly go (i.e. right down in the bowels of the OS, alongside the file and network subsystems), defining [open] standards and protocols to be used all the way throughout, and ruthlessly consolidating and eliminating duplication of effort and needless redundancy (cf. Unix's 'everything is a file' idea that instantly provided all clients with a powerful, clearly defined IPC mechanism essentially for free).

Obviously, for OSes and applications, the traditional device-centric patterns work well as a whole, providing a more than adequate benefit-vs-drawback ratio, so they will no doubt continue to rely on the existing hierarchical file system to manage their own resources.

OTOH, what's needed for user data is a user data-centric, not device-centric, approach, and that means decoupling user data management from the nitty-gritty implementation-dictated details of file systems, databases, etc. and trying as much as possible to create a single, standard interaction model for accessing user data regardless of how and where it is stored.

The more you think like this, the more you realise just how far beyond the local file system this goes. For example, what is LDAP if not a half-assed reinvention of the 'Unix file system as namespace' principle? And what are the 'everything is a file' and REST interaction philosophies, if not two sides of the same damn coin? All this and more is just crying out for massive consolidation.

So the hierarchical file system will still be required; it just won't be something users interact with directly any more. Instead of accessing file and network subsystems directly, userland processes will talk to a single standard 'data management' subsystem whenever they need to read or write user data. Once that decoupling is completed, the system is free to deliver on all of the wonderful promises made in the contract above. Plus, most of the file system-imposed problems currently bedevilling users (backup hell, iOS data exchange, etc.) simply cease to exist!
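As a rough sketch of what that 'data management' front door might look like - every name here is hypothetical, invented purely to illustrate the shape of the idea - applications speak one small set of verbs, while pluggable backends (local disk, NAS, cloud) hide where and how the bytes actually live:

```python
from abc import ABC, abstractmethod

class Backend(ABC):
    """Anything that can hold bytes: disk, NAS box, cloud service."""
    @abstractmethod
    def read(self, name: str) -> bytes: ...
    @abstractmethod
    def write(self, name: str, data: bytes) -> None: ...

class MemoryBackend(Backend):
    """In-memory stand-in for any real store, for demonstration."""
    def __init__(self):
        self._data = {}
    def read(self, name):
        return self._data[name]
    def write(self, name, data):
        self._data[name] = data

class DataManager:
    """The single front door: userland never touches a backend
    (or a file system path) directly."""
    def __init__(self, *backends: Backend):
        self._backends = backends          # replicate to all of them

    def write(self, name: str, data: bytes) -> None:
        for b in self._backends:           # automatic redundancy
            b.write(name, data)

    def read(self, name: str) -> bytes:
        for b in self._backends:           # first reachable copy wins
            try:
                return b.read(name)
            except KeyError:
                continue
        raise KeyError(name)

dm = DataManager(MemoryBackend(), MemoryBackend())
dm.write("holiday-photos/img1", b"\x89PNG...")
print(dm.read("holiday-photos/img1")[:4])   # b'\x89PNG'
```

Note that replication, revision history and access control can all live behind that one interface, which is exactly why pushing it down into the OS pays off: every application gets those guarantees without reimplementing them.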

Ultimately then, it's all a matter of good communication. Admittedly, before the mechanical machine-to-machine challenges are addressed, some additional effort may still be needed on the geek-to-geek front. But with luck Thom & co will become believers yet... ;)
