Interview with Dr Gary Kimura, co-developer of
NTFS
Dr Gary Kimura is a professor in the Department of Computer Science & Engineering at the University of Washington.
What was your background before you started?
I was a member of the Windows NT team, the kernel group. We
also wrote FAT and HPFS for NT, for the first NT shipment.
What tools were used to create NTFS?
It was a new OS, so there were few tools: just the usual kernel debugger, DBG.
How long did it take?
It took a few years longer than we anticipated. The group formed and started meeting in 1989, and the first SDK was released in 1992. So that's about 3 years.
How many people were involved?
Myself, Tom Miller, Brian Andrew, and David Goebel.
What were the biggest challenges?
It was easier because we could create our own APIs and had already hashed out FAT for NT. There were nuances to interfacing with NT: the code had to work closely with the cache manager and memory manager.
NTFS was self-hosted, and that was a challenge for a file system. It's unlike other software, where if you get corrupt memory you can just reboot; if you corrupt the file system it's a total loss, because the file system's data persists across reboots. We had to be very mindful of that.
Do you recall any particularly difficult bugs?
Nothing in particular, since it was our third file system for NT. Journaling required a lot of work. If the system crashed, the NTFS recovery log had to be played back before we could continue. In that version only the metadata for system files was transacted, not the user data.
Do you have any lessons learned from the experience?
In hindsight, the architecture was done for what we thought
was a really big machine. A PC didn’t even have 1 GB of RAM yet. Can you
imagine that? The largest disk was small, not as big as today. We designed NTFS
to stretch out to 64-bit but the rest of the OS wasn’t designed for that yet.
So they didn’t have to re-architect that part.
Have you kept up with recent NTFS features?
No, not really.
What are your recent interests?
I enjoy teaching at the undergrad level. I try to embark the
hard lessons learned from my own experience.
Do you have any advice for people developing operating
systems today?
No technical advice. Just keep your life in balance. It is
seductive work and can be all-consuming. It can consume too much of your time
and life.
Do you keep in touch with the original NTFS team?
Yes, we get together to socialize and drink beer. Once in
a while I get them to come over for a guest lecture.
Summary
NTFS is complex and yet has a highly functional elegance. It
has benefited from years of testing and continued development.
References
- PC Guide to NTFS
- PC Guide History of NTFS
- PC Guide NTFS Architecture
- Wikipedia entry for NTFS
- Paragon Software NTFS
- Soft Ambulance History of NTFS
- Toolbox for IT NTFS
- Wapedia NTFS Wiki
- NTFS.com
- Microsoft TechNet Sysinternals
- Russinovich, Mark E.; Solomon, David A.; Ionescu, Alex (2009). "Storage Management". Windows Internals (5th ed.). Microsoft Press. ISBN 978-0-7356-2530-3.
- Custer, Helen (1994). "Inside the Windows NT File System". Microsoft Press. ISBN 155615660X.
- Nagar, Rajeev (1997). "Windows NT File System Internals: A Developer's Guide". O'Reilly Media. ISBN 1565922492.
To me, by far the most interesting feature that sets NTFS apart from other great file systems like ZFS and Btrfs is the support for transactions.
I really don’t understand why programs don’t make greater use of it. I would really like to see it used for things like installs, etc.
I know at least Btrfs has transactions. I would guess ZFS does too.
Yes, of course. Transactions are one of the big selling points of ZFS.
Both have transactions, but not like TxF. The transactions in Btrfs and ZFS are system level (like snapshots) and they block all other operations while they are running. (Correct me if I'm wrong, but this was how it worked, last I heard.)
No, most operating systems can support transactions if you block all other operations. The selling point of real transaction support is that you don't need to block everything else.
That’s my point. As far as I’m aware, ZFS and Btrfs transactions do block everything else, making them useless except in a few very specific scenarios.
Uhm, what?
Every single write operation in ZFS is a transaction, and is atomic: either the write happened and the new data is available, or the write failed and the old data is available. You never get partially written data.
If something does go wrong, and there’s a problem with the pool at import, you use zpool import -F <poolname> to roll back one transaction. If that still fails, you keep going, rolling back transactions until you get an importable pool.
Transaction support has been built into ZFS from day one, and is one of the core principles of ZFS.
Maybe I haven’t been very clear. The transactions in ZFS are system level. Individual applications can’t create isolated transactions on different parts of the file system.
With TxF, an installer could, for example, create a transaction for the install process so that if you lost power, the install would not be half done. However, a file you saved between the start of the install and the power loss would still be there.
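Roughly, the TxF pattern looks like this; a minimal C sketch (the paths and payload are made up for illustration, and most error handling is trimmed):

```c
#define _WIN32_WINNT 0x0600
#include <windows.h>
#include <ktmw32.h>   /* CreateTransaction, CommitTransaction; link with KtmW32.lib */
#include <stdio.h>

int main(void)
{
    wchar_t desc[] = L"demo install";
    /* One kernel transaction covers every file touched below. */
    HANDLE tx = CreateTransaction(NULL, NULL, 0, 0, 0, 0, desc);
    if (tx == INVALID_HANDLE_VALUE) return 1;

    const wchar_t *files[] = { L"C:\\Demo\\app.dll", L"C:\\Demo\\app.ini" };
    const char payload[] = "example contents";

    for (int i = 0; i < 2; i++) {
        /* Writes through a transacted handle stay invisible to other
           readers until the commit. */
        HANDLE f = CreateFileTransactedW(files[i], GENERIC_WRITE, 0, NULL,
                                         CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL,
                                         NULL, tx, NULL, NULL);
        if (f == INVALID_HANDLE_VALUE) {
            RollbackTransaction(tx);  /* a crash or power loss ends the same way */
            CloseHandle(tx);
            return 1;
        }
        DWORD written;
        WriteFile(f, payload, sizeof(payload) - 1, &written, NULL);
        CloseHandle(f);
    }

    /* Either both files appear with their new contents, or neither does. */
    BOOL ok = CommitTransaction(tx);
    CloseHandle(tx);
    printf(ok ? "committed\n" : "commit failed\n");
    return ok ? 0 : 1;
}
```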
I am pretty sure the tree structures in filesystems like ZFS, Btrfs or the old Reiser4 enable transactions, but it might not be high on the list of things to implement.
If you support efficient snapshotting then transactions are not far behind.
NTFS rocks, don't let anybody fool you. Transparent compression, encryption, rich ACLs and a lot of other stuff I don't know about. Too bad some features are not exposed in the interface (hard/soft links, for example).
And only after 20 years do the cracks start to appear. Throw a few hundred thousand files in a directory and it breaks down. But other than that, I have little to complain about.
The absolute crap Microsoft builds on top of it is an insult to the original designers. We need more metadata in NTFS, but this is not the way to go. NTFS did its job, and it did it well, but the time for plain filesystems has passed.
We need something like ZFS for the volume management and BeFS for the metadata and we are good for another 20 years. No rewrite needed.
Did you disable short name generation?
I don’t think the article mentioned this, but NTFS generates two links per file: the name you use (long name) plus an autogenerated short name for DOS applications if the name you use is not 8.3 compliant.
Unfortunately the 8.3 namespace tends to congest pretty quickly, leading to painful times to insert and delete entries as NTFS searches for a free name.
Turning off shortname generation dramatically improves scalability at the filesystem level. Explorer et al may still struggle, but that’s a different issue.
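For anyone curious what the second name looks like, here is a small C sketch that lists both names for each entry in a directory (the path is just an example). On the admin side, I believe fsutil's disable8dot3 behavior option is the usual way to turn generation off.

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    WIN32_FIND_DATAW fd;
    /* Enumerate a directory and print both the long name and the 8.3 name. */
    HANDLE h = FindFirstFileW(L"C:\\Some Long Directory Name\\*", &fd);
    if (h == INVALID_HANDLE_VALUE) return 1;

    do {
        /* cAlternateFileName holds the generated short name; it is empty when
           the long name is already 8.3-compliant or generation is disabled. */
        wprintf(L"%-40ls  %ls\n", fd.cFileName,
                fd.cAlternateFileName[0] ? fd.cAlternateFileName : L"(none)");
    } while (FindNextFileW(h, &fd));

    FindClose(h);
    return 0;
}
```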
– M
Turning off short filename generation on a stable production server is a bit of a no-go. But you might be right, since the files started with the same characters. At first I suspected improper hashing or something like that, but it was a long time ago. Listing the directory took about 15 minutes before it even started printing in a DOS box, and deleting the files was a pain.
Explorer struggles even with a few hundred files, which is what I meant by the crap they build on top of it. I really look forward to the day 8.3 is disabled by default and Explorer is usable.
Now I'm a ZFS fanboy, which has different imperfections.
I admit, I view NTFS as a relatively nice and very reliable filesystem. I don’t recall ever losing a file on it as a result of a problem with the filesystem itself. It beats the living hell out of FAT, which would always manage to “lose” critical system files and require copying them back over from the OS CD back in the Win9x days (though to be fair, this could have equally been caused by these OSes’ tendency to crash and burn constantly, but IMO likely a combination of poor filesystem *and* operating system design…).
But one thing I will never praise NTFS for: performance. Sure, NTFS can take a hell of a lot more fragging than FAT ever could and slow down less because of it, but it still fragments to hell and back in very little time of normal use and will noticeably slow down in a short amount of time. Just like its predecessor, it constantly needs defragging: once every week (two at the very most) back when I used to run Windows XP on NTFS. And no, from what I've seen of Vista/7, it's not improved; the filesystem still seems happy to scatter pieces of files all over the disk.
When nice and clean (no or few excess fragments) NTFS is very snappy. It’s just too bad it can’t stay that way for much more than a week.
Yet, most DVB set-top-box manufacturers still ship their PVRs with FAT32 instead. Your recordings are split into 4GB chunks. Why don’t they pay a license fee to use NTFS?! (or even an open-source filesystem)
Because nearly every OS supports FAT32.
My guess is it makes it seem somewhat future-proof in the eyes of management.
“We can change everything and that drive will still work!”
I would never use NTFS in a set-top box, because most of those run some kind of Linux, and ntfs-3g is _usable_, but I avoid writing to NTFS volumes like the plague.
Most of the problems I had with it were years ago, but partly because I've only had to deal with NTFS once since then (a Mac user bought a SATA drive from me and wanted some data from me as well, but FAT32 didn't support many of the filenames).
Nice to see someone going into detail now and then. Thanks.
I enjoyed the article somewhat; I like technical stuff about filesystems. What I don't understand is what audience the author is trying to reach... if you are talking about using file streams or file system journalling, is it really necessary to point out that there is a button in the lower left of the Windows desktop that is called "Start"?
Some comments about NTFS's problems would have been fair too, like the fact that it is very prone to fragmentation and that it cannot handle directories with a large number of files inside.
And I don’t really think Stacker was big in the last decade… he’s talking about the 90s.
Perhaps the audience you’re in? 😉
Yeah, if it's so advanced, why does it still suffer from extreme fragmentation? One of the biggest advantages of moving to Linux is that you don't have to defragment your hard drive.
More like one of the (smaller) disadvantages of running Linux is that you can't defragment your hard drive (short of copying the whole filesystem to another drive and then back).
If you use XFS, you can defragment.
xfs_fsr improves the organization of mounted filesystems. The reorganization algorithm operates on one file at a time, compacting or otherwise improving the layout of the file extents (contiguous blocks of
file data).
I don't like the master file table, because it doesn't seem to be duplicated anywhere, although it's possible to do that manually. The reason is that I've seen what happens when it gets corrupt, especially when the information about free and used space becomes inconsistent. This gives you the disappearing file effect, whereby some of your files get accidentally replaced with newly created files because the MFT had the wrong information. Usually this is fixable in the sense that you can stop the corruption and return the information to a consistent state, but no amount of fixing gets you those files back again. The MFT should auto-duplicate every so often so that if the main one becomes inconsistent the FS can look to the backups and automatically bring it back into line. It's possible to do this manually, but it should be done constantly.
This post seems to be confusing a lot of different issues.
Firstly, the MFT doesn’t track free space; the volume bitmap does. If you have two files that exist on a volume that both believe they own the same cluster, that’s corruption in the volume bitmap, not the MFT.
Making a file disappear requires a lot of changes. It needs to clear an MFT bitmap entry, the MFT record, the index entry/entries, etc. There is some level of redundancy here: if an MFT bitmap entry is marked as available, NTFS will try to allocate it, then realize the MFT record is still in use and will abort the transaction.
If I had to speculate, I'd say you're seeing the effects of NTFS self-healing, which attempts to resolve on-disk inconsistencies automatically. If it really annoys you, you might want to turn it off.
Oh, and NTFS does duplicate the first four MFT records. Originally (back in Gary’s time) this spanned the whole MFT, but that didn’t seem like a valuable way to spend space. Most notably, there’s always the issue of knowing when information is wrong – having two copies is great, but you still need to know which copy (if any) to trust. Reading both would be very expensive, so this really needs a checksum, and you start getting into substantial changes after that.
– M
Very similar to Files-11/RMS on VMS.
People can rip on Windows and Microsoft all they want, but throughout its entire lifecycle NTFS has been a very good and stable file system.
A File System of Files – almost all file systems are.
Fill Your Quota – most server-oriented file systems have this. FAT (FAT12/16/32, exFAT, etc) are not server-oriented. Unix manages this with additional, quota-specific software; but the file system still has the ability to manage it – even ext2/3/4.
Shadow Copy – this is NOT a file system function; but an additional service Windows installs – see Services.msc for Volume Shadow Copy. As such, any other file system can have this same function with the equivalent service/daemon.
File Compression Made Easy – can’t say anything here; but I’m sure there are equivalents
Alternate Data Streams – less like a feature, more like a security hole. KISS please. (Can you say Bloat?!)
File Screening – Why do you need to be so anal to your users?
Volume Mount Points – most other operating systems have this available for ALL file systems, even FAT.
Hard Links/Soft Links/Junction Points – As you pointed out, long available on other systems. NTFS has had it for a long time too, but no one used them until recently. Microsoft now uses them to “enhance” DLL compatibility
Fun With Logs – available in other systems long before NTFS existed.
Encrypting File System – again, can you say bloat?! This isn’t a file system feature. It’s a function of a higher level piece of software. Other OS’s achieve this by using the “Volume Mount Points” and better file system support for doing this with any file system by layering the two. KISS please!
Advanced Format Drives – NTFS is not alone.
WinFS – it was never going to work at the performance levels necessary.
Is the title of this article “Features unique to NTFS”?
No?
Your comment is fail.
File-by-file compression is, I think, planned for Btrfs, but otherwise it is unknown in Linux.
Alternate streams are common on the Mac. They sound useful, but without awareness of them, they are also quite scary.
I was mostly surprised by the description of hard links:
When you change the link's attributes, the target's attributes change; OK.
When you delete the link, the target is deleted... Wait, what? This is not how it works in Linux and POSIX. In POSIX filesystems the target is an inode and it is ref-counted, and only deleted when the last link to it disappears. Deleting a hard link doesn't delete the file. In fact you can delete the link's 'target' and the so-called link will still work.
I think deletion depends on the API/tool you use.
If you use the basic remove function, it probably only removes the link.
However, if you delete it from a tool that does not recognize links (say, Explorer), it will try to "clean up" the directory first by removing all the contents, then rmdir it, which might keep the original directory in its place, but now empty instead of full.
Lol, I didn't know hard links have been possible since Vista; don't care about them either.
File Screening, Shadow Copy, etc. shouldn't be mentioned here anyway.
"File Compression Made Easy": it's actually useful on ramdisks. Ramdisk tools don't provide on-the-fly compression.
"Alternate Data Streams": good that it got mentioned in the article, but what's funny is that programmers were taking advantage of this long before hackers made big news out of it. Around that time I had to check why some expensive trial commercial tool couldn't be reinstalled without reinstalling Windows. Well, it didn't store the check key in the registry but wrote its serial out to an alternate data stream.
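For the record, stashing data in a stream really is that simple; a small C sketch (the file, stream and serial here are invented):

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* "file:streamname" addresses an alternate data stream on NTFS. */
    const wchar_t *stream = L"C:\\Temp\\app.dat:serial";
    const char key[] = "TRIAL-1234-5678";
    char buf[64] = {0};
    DWORD n;

    /* Write the hidden stream; the main file's visible size is unaffected. */
    HANDLE h = CreateFileW(stream, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
                           FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) return 1;
    WriteFile(h, key, sizeof(key) - 1, &n, NULL);
    CloseHandle(h);

    /* Read it back through the same odd-looking path. */
    h = CreateFileW(stream, GENERIC_READ, FILE_SHARE_READ, NULL,
                    OPEN_EXISTING, 0, NULL);
    if (h == INVALID_HANDLE_VALUE) return 1;
    ReadFile(h, buf, sizeof(buf) - 1, &n, NULL);
    CloseHandle(h);
    printf("stream contents: %s\n", buf);
    return 0;
}
```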
"Encrypting File System": another dumb feature that MS probably only put in because its competitors have alternatives. There are way too many drive encryption programs now; this is a completely unnecessary feature to implement at the FS level.
A worse nightmare than NTFS in the Windows world over the last 10 years has been the partition table fuckups made by various Windows versions, backup programs, Partition Magic and other crap. MS failed to provide anything for this over the years, but for Linux users 'dd' has always been there.
Nice article.
Btrfs and ZFS now support having multiple disks appear as a single logical disk, with data redundancy and integrity checks built in, and disks can be added and removed on a "live system". Any idea when NTFS will support these features, if at all?
It is my understanding that Linux started Btrfs with the above-mentioned features, among others, because the ext architecture had pretty much started hitting the upper limits of its feature expansion capabilities. NTFS, like ext and HFS, is a file system of yesterday; ZFS and Btrfs are file systems of tomorrow. When will NTFS offer the features they offer? Will Microsoft add these features to NTFS, or will they start another one to try to catch up?
Last I checked (a few months ago) the ZFS file system cannot shrink volumes. The ground work was in OpenSolaris but not anywhere near production ready. So as long as that hasn’t changed and you didn’t mean a concat volume then you are correct about ZFS.
Volumes (aka zvols) can be grown, shrunk, added, removed as needed. Volume size is just a set of properties (reservation, quota).
I believe you mean that a storage pool (aka zpool, or just pool) cannot be shrunk, in that you cannot remove root vdevs. This is true.
You can remove cache devices (aka L2ARC) and you can remove log devices (aka slogs) and you can remove spare devices.
But you cannot, currently, remove (or alter) root vdevs (bare device, mirror, raidz).
Someday, when the legendary “block-pointer rewrite” feature is completed, it will be possible to remove root vdevs; along with migrating between vdev types (mirror -> raidz; raidz1 -> raidz2), and with expanding raidz vdevs (4-drive raidz1 -> 5-drive raidz1).
Yes I meant that.
"Any idea when NTFS will support these features, if at all?"
Microsoft would not reveal any future NTFS features.
Nice article. I have never had any real trouble with NTFS that wasn't related to hardware problems. I have seen some hair-raising MFT statistics though, e.g. an 800 GB MFT on a 7 TB volume, as reported by a Sysinternals tool. The volume contained maybe 35 million files of a few dozen bytes each (among other things), and I believe they were being inlined into the MFT. I'm not complaining though, since it did work despite these crazy numbers.
I don't know if the MFT buffering algorithm was smart enough to separate inlined, cold files from useful metadata (probably not), but I do believe that not being able to buffer much, if any, of the MFT kills performance. Questions like these are why I really wish that the OS that runs most of the world's computers was open source: Windows is pretty good, but there is a lot of stuff you just can't easily find out about its internals.
I am sure you can; you can purchase an NTFS Internals Book. Most major operating systems have internals books… for the OS, file systems, etc. I used to have my VMS Internals Books, until I moved this last time (5 years ago or so) and I had too many books to move so I tossed them. I gave away, or threw away, too many good books.
Shadow copy was actually first released with Windows XP.
The version that was included in Windows XP does not use NTFS for doing this. If I remember correctly, it just keeps some backup files and logfiles (so it can keep changes while a backup is running).
So it is not ‘compatible’ and the API is different.
__
I think it would have been a good idea to explain junction points.
__
Just a quick comment about File screening, this is done by extension, not filetype. While Microsoft/Windows might be confused about the difference I would think OSNews would know the difference.
__
I wouldn't call NTFS advanced, though it has many features. I think most people would consider something like ZFS advanced.
__
I always felt speed was the problem of NTFS, I think that is why Microsoft wanted to replace it with winfs and that is why SQL-Server, Exchange and some other programs don’t use the normal file-API.
__
Don’t get me wrong, I applaud the effort. 🙂
Some comments on your comments:
Re: Shadow Copy, Windows XP NTFS is the same as Windows Server 2003 NTFS.
My impression is that junction points are mostly deprecated in favor of links.
Under Windows, extensions pretty much equal file types.
While I would certainly agree that ZFS is an advanced file system, I don’t think that you can argue that NTFS is not an advanced file system 🙂 I would also add that while ZFS is an excellent server file system, you wouldn’t necessarily want it on your laptop.
And yes, databases like SQL Server and Oracle circumvent standard file systems for performance reasons.
I believe it is more about structure than performance. The file system is hierarchical whereas Oracle and SQL Server are relational. They still rely on the filesystem, but they store the tables and the records in one big file, usually mapped to a tablespace. They can't map the tables and the records directly to the filesystem because they just are not hierarchical, not to mention the space wasted by filesystem overhead. Interestingly, Oracle 11g implements a file system on top of its database: DBFS.
I don't agree with you.
The other day I screwed up my Solaris 11 Express installation. I was toying with Zones (virtual machines) as root, and by mistake typed the command on the real computer.
I just rebooted and chose an earlier snapshot in GRUB and that was it. I deleted the old snapshot which was screwed, and continued to work.
What happens if you screw your Linux install? Or Windows install? Then you are screwed. With ZFS, you just reboot into an earlier snapshot, which takes one minute.
Yes, you certainly want ZFS on your desktop – lest you want to reinstall and reconfig, etc for hours.
And by the way, ZFS protects your data, which no other filesystem does well enough (according to researchers).
“Yes, you certainly want ZFS on your desktop”
Good point, but let me put it this way. If you were selling inexpensive laptops to the uneducated masses, would ZFS be easy enough or transparent enough for Joe Average to use? Would people who don't know or don't want to know file systems be able to do basic work without being impeded by ZFS features and functionality?
If the “inexpensive” laptop has 2 GB or more of RAM, then you can get away with using ZFS without worrying too much about manual tuning, so long as the OS supports auto-tuning of ZFS memory settings (FreeBSD and Solaris do, no idea about Linux).
However, it’s not the cost of the laptop that matters, it’s the integration of the ZFS features with the OS. Currently, the only real integration is done in Solaris where GNOME (via Nautilus) provides access to the automatic snapshots feature via TimeSlider, and where system updates create bootable snapshots (Boot Environments), and stuff like that.
Until non-Solaris systems integrate ZFS features into the OS (especially the GUI layers), you probably won’t be able to get “the unwashed masses” to use ZFS.
I have used ZFS on OpenSolaris on a Pentium 4 machine with 1 GB of RAM. It worked, but not fast; I got something like 30 MB/sec. That is because ZFS is 128-bit and the P4 is 32-bit. If you use a 64-bit CPU, you get full speed.
Hence, you don't need 2 GB of RAM.
Regarding whether ZFS is suitable for the average user: in my opinion, yes, because ZFS is VERY easy to use. Much easier than, for instance, NTFS. It is also far more flexible.
In NTFS you have to partition your hard drive, and that is static. In ZFS you have dynamically growing partitions (filesystems). You create a new filesystem whenever you want, and it takes less than a second. You create a ZFS RAID with ONE command. And there is no formatting that takes hours; no formatting at all, just use it right away. Etc., etc. There are numerous advantages, simplicity being one of them.
The most important is that ZFS protects your data.
I don't think this is true. If you run ZFS on a 64-bit CPU, you get access to some 64-bit program counter registers, but the system bus, buffer size, etc. are still the same.
That’s good to know. Maybe some day we will see it in consumer devices. Although I don’t see average consumers creating RAIDs 🙂
What is not true? That ZFS needs 64 bit cpu to reach full speed? Well, everybody says so. And I think it is also stated in the ZFS guides.
Many people (including me) report low speeds with a 32-bit CPU. Many people (including me) report much faster speeds with a 64-bit CPU.
For those who don't know, I spent most of the last five years working on NTFS. This article is a good overview of this and other related technologies, but I thought I would just add my nits for the record.
Max FS Size
NTFS on disk allows for 2^63 cluster volumes, and files of up to 2^63 bytes. The current implementation is limited to 2^32 cluster volumes, which at the default cluster size comes to 16Tb, although it can go to 256Tb with a larger cluster size. The real point here is that the 2Tb limit referenced in the article is a limit associated with MBR partitions, not NTFS.
File system files
One key omission here is the USN journal. It’s a nice feature that has existed since Windows 2000 and does not have a direct equivalent on other platforms. Then again, it might warrant an article all on its own.
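For reference, user code reaches the journal through ordinary volume ioctls; a rough C sketch of querying it (the drive letter is an example, and the volume handle generally needs administrator rights):

```c
#include <windows.h>
#include <winioctl.h>
#include <stdio.h>

int main(void)
{
    /* Open the volume itself, not a file on it. */
    HANDLE vol = CreateFileW(L"\\\\.\\C:", GENERIC_READ,
                             FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                             OPEN_EXISTING, 0, NULL);
    if (vol == INVALID_HANDLE_VALUE) return 1;

    USN_JOURNAL_DATA jd;
    DWORD bytes;
    /* Returns the journal ID, the next USN to be assigned, and size limits. */
    if (DeviceIoControl(vol, FSCTL_QUERY_USN_JOURNAL, NULL, 0,
                        &jd, sizeof(jd), &bytes, NULL)) {
        printf("journal id %llx, next USN %lld\n",
               (unsigned long long)jd.UsnJournalID, (long long)jd.NextUsn);
    }
    CloseHandle(vol);
    return 0;
}
```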
File compression
The on-disk representation of NTFS compression requires a lot of synchronization and reallocation, which leads to fragmented files that are very slow to access. This feature should only be used when space is more important than speed (which is becoming increasingly rare.)
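For reference, the per-file switch behind the properties checkbox is a single FSCTL; a short C sketch (the path is an example):

```c
#include <windows.h>
#include <winioctl.h>

int main(void)
{
    HANDLE f = CreateFileW(L"C:\\Temp\\big.log", GENERIC_READ | GENERIC_WRITE,
                           0, NULL, OPEN_EXISTING, 0, NULL);
    if (f == INVALID_HANDLE_VALUE) return 1;

    USHORT state = COMPRESSION_FORMAT_DEFAULT;  /* COMPRESSION_FORMAT_NONE turns it back off */
    DWORD bytes;
    /* The data is (re)compressed in fixed-size compression units, which is
       where the fragmentation described above comes from. */
    BOOL ok = DeviceIoControl(f, FSCTL_SET_COMPRESSION, &state, sizeof(state),
                              NULL, 0, &bytes, NULL);
    CloseHandle(f);
    return ok ? 0 : 1;
}
```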
Alternate data streams
ADSes do not have unique cluster sizes (which is an attribute of the volume); nor independent file type associations. In Windows, file associations are extension based, and the file name is shared between all streams. Again, in another article it might be valuable to consider different approaches OSes have taken to file type associations; I still really like OS/2’s EA approach.
Hard links, soft links and junction points
Symlinks do not replace junction points; mklink allows creation of both. These have different properties with respect to security and how they are resolved over networks. Note that the \Documents and Settings hierarchy in Vista is made up of junctions, despite the existence of symlinks.
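For completeness, a small C sketch of creating a symlink programmatically (the paths are examples; on Vista/Win7 this needs the symbolic-link privilege, which usually means an elevated prompt). Creating a junction instead goes through a reparse-point FSCTL and is omitted here:

```c
#define _WIN32_WINNT 0x0600
#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* Third argument: 0 for a file symlink, SYMBOLIC_LINK_FLAG_DIRECTORY
       for a directory symlink. */
    if (CreateSymbolicLinkW(L"C:\\Temp\\link.txt",
                            L"C:\\Temp\\target.txt", 0)) {
        printf("symlink created\n");
        return 0;
    }
    printf("failed: %lu\n", GetLastError());
    return 1;
}
```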
Advanced Format Drives
Vista’s support for these drives was a good first step, but not complete. If you use one of these and have Win7, please consider using either Service Pack 1 prereleases or the patch located at http://support.microsoft.com/kb/982018/ .
Transactional NTFS
Back when Vista was under development, we had Beta releases that allowed CMD scripts to start transactions, launch processes, and commit from a script. Alas, it was not to be.
Edit: Versioning
The NTFS version was 3.0 in Windows 2000, and 3.1 ever since. The version of the NTFS on disk format has not changed since XP. New features have been added in a purposely compatible way, so a Win7 volume should be usable in Windows 2000 and vice-versa.
– M
Malxau,
Thanks for your input; you are 100% correct with your nits 🙂 One challenging aspect of writing an article on NTFS is relying on 2nd- and 3rd-hand resources. Only when you get into the actual development resources do you have access to the primary sources.
The other instance I can think of, where compression can be beneficial, is to minimize the number of writes to the underlying storage medium. In the old days of sub-200MHz CPUs and non-DMA drive controllers, the time saved on disk I/O was a win over the extra CPU work of compression and decompression.
Today, the same might be useful for increasing flash longevity. Even with wear leveling, flash can take only so many writes. (Then again, one must question the wisdom of using any journaling FS on flash…)
It really depends on the meaning of “I/O”. In terms of bytes transferred, compression should be a win (assuming compression does anything at all.) The issue with NTFS (or perhaps, filesystem level) compression is that it must support jumping into the middle of a compressed stream and changing a few bytes without rewriting the whole file. So NTFS breaks the file up into pieces. The implication is that when a single large application read or write occurs, the filesystem must break this up into several smaller reads or writes, so the number of operations being sent to the device increase.
Depending on the device and workload, this will have different effects. It may be true that flash will handle this situation better than a spinning disk, both for reads (which don't have a seek penalty) and writes (which will be serialized anyway). But more IOs increase the load on the host when compared to a single DMA operation.
Although flash devices typically serialize writes, this alone is insufficient to provide the guarantees that filesystems need. At a minimum we’d need to identify some form of ‘transaction’ so that if we’re halfway through creating a file we can back up to a consistent state. And then the device needs to ensure it still has the ‘old’ data lying around somewhere until the transaction gets committed, etc.
There’s definitely scope for efficiency gains here, but we’re not as close to converging on one “consistency manager” as many people seem to think.
The bigger problem with journaling on flash, is that so many FS operations need two writes: one to the journal, and then one to commit. Not good for flash longevity. In fact, early flash storage without wear-leveling got destroyed fairly quickly by journaling, when constant writing to the journal wore out the underlying flash cells in a matter of days.
Well, if you’re going to be nit-picky, you might want to correct your usage of Tb to TB, since Tb is terabits.
I’m surprised he didn’t mention sparse files.
I find those quite useful.
Microsoft’s Gordon Letwin developed HPFS for the joint Microsoft and IBM OS/2 project, and I used it for years here with very few issues.
I’ve always found it interesting that MS allowed that filesystem to completely disappear from its radar. It was pretty good tech…
AFAIK it was IBM that owned HPFS. This probably explains why MS didn’t want to use it: Why use a FS designed by people employed by your competitor when you have a perfectly good in-house FS?
“WinFS was likely sacrificed to get a ship date for Vista.”
Perhaps, but it was cancelled in 2006 and has yet to appear so I assume there must have been more to it then that. Pity, as I was rather looking forward to it at the time.
Seemingly, the more time passes and the more data even individuals have, the more useful it would have been.
It also didn't have very good performance. Longhorn in general was bad, but with WinFS... yikes, it was scary slow. At the time it was blamed on it being a debug build; then it was just dropped. It's basically been in development since the early 90s (as a part of Cairo).
I think it's just a bad idea for an implementation. My 2 cents is that a more specific, transparent database for specific filetypes is a better idea. Take a look at your typical computer. Count all of the files on the hard disks. How many of them have been created by the user, and how many are just for the OS and applications? If you could somehow apply indexing/database storage to the user-created files, you'd have something scary fast and very useful. You can either build some crazy intelligence into the FS, so every application gets it with no new build (very difficult, IMHO), or create a new open-file function in the OS that designates a file as "user generated" and get all app developers to use that (easier to implement, not going to work for a while…).
It would be trivial to mark files with metadata indicating that they are user-generated, then consume that metadata in any indexing tools. It’s just that this would probably be so abused as to make the feature useless.
When you are thinking of adding meta data, are you thinking that programs would do it or the OS? I don’t think that would be trivial either way.
Programs would do it. What’s an OS? Certainly the kernel couldn’t do this. It would be something each program would have to do, probably transparently to the user and programmer as part of a standard file API.
And, yes, it would be trivial.
I was really wondering whether you meant it would be transparent to the programmer or not. I don't think it would be trivial to have it identify user-created files in the API. In order for it to be transparent, the function calls to save a config file (a non-user-created file) and a document (a user-created file) would have to be the same, while the API would have to figure out with some sort of algorithm which was created by the user and which was not.
As a programmer, some of the documents I create are indistinguishable from config files, because they are config files as well as documents I created. I’d want those to be all metadated up, but not any that were tweaked for me by an existing application I didn’t write.
See the complexity of that? It would be easier to simply have a different API call for the two file writes, one with metadata and one without, but then you have to get third-party developer buy-in.
WinFS was sacrificed because it’s a good idea that just cannot be done practically.
Be tried to do the same thing with its early releases and found that they couldn’t get it together, either, which prompted the rush to build BeFS before 1.0 shipped. It’s unquestionably a hard problem and there is some question as to whether or not it can be solved in a way which is reliable and performant.
You will probably not see a database FS any better than BeFS any time soon.
If this isn't a half-baked first-year student report, I am a bright red tomato. Nice summary: "NTFS is complex and yet has a highly functional elegance. It has benefited from years of testing and continued development." Holy balls in my face.
And I have never heard embark used that way. Maybe you meant impart.
That's quite a heavy feature list! Now I understand better why it took so much time before NTFS was properly supported on Linux.
In my opinion, a FS should focus on ensuring high performance and reliability, and the rest should be left to higher-level OS components, as these are OS-specific features which people who implement NTFS do not necessarily all want to support. But well…
NTFS on Linux only supports a small number of basic NTFS features 😐
Indeed. I avoid write operations on NTFS volumes.
Well, I just avoid NTFS volumes…
Windows 95 OSR2 was the first consumer OS to support FAT32 (as opposed to Windows 95 vanilla; they were different products, if essentially the same runtime executive). OSR2 was OEM-only and was also the first Windows version to support USB.
Good times.
To be pedantic, it was Win95b that first supported USB. Win95a did not support USB. And Win95c was the best one of the bunch (aka OSR2.1 or something like that).
“OSR2” was an umbrella term that covered Win95a through Win95c. Supposedly, there was also a Win95d, but I never saw it in the wild.
Gotta love how MS names things, eh?
While it is a good and informative article, I thought I'd point out in the first graph that DOS was NOT written by Bill Gates; it was originally a CP/M clone called QDOS (Quick and Dirty Operating System) that was bought by MSFT for a one-time payment of $50,000. While MSFT improved upon it later, what they sold to IBM that became PC-DOS 1.0 was pretty much line-for-line QDOS.
I believe the table was saying not that DOS was written by billg but that FAT12 was, which is not something I can attest to but is reasonable.
Yes, Bill Gates wrote the original FAT12 implementation. FAT12 actually predates Tim Patterson’s 86-DOS. FAT12 was implemented in Microsoft Disk BASIC-86 which he used for his S-100 micro that 86-DOS was built on, hence he used it for his file system.
I know this, I just don’t know if it was billg who wrote it or Paul Allen (or even someone else being paid by MS at the time).
Uhm, “MS at the time” was just Bill G and Paul A. There were no “employees” until later.
NTFS is not safe from a data integrity point of view. It might corrupt your data. The worst case of corruption is when the data is corrupted silently, i.e. neither the hardware nor the software ever notices that data has been corrupted. Thus, you get no warning at all. You think your data is safe, but it has been silently altered on disk:
http://www.zdnet.com/blog/storage/how-microsoft-puts-your-data-at-r…
Using hardware RAID is no better; it does not give you data protection. But ZFS gives you data protection. Read more on this here:
http://en.wikipedia.org/wiki/Zfs#Data_Integrity
It is technically good, but it is wasted by closedness, patents and copyrights. Its value is greatly diminished by the fact that its structure is not published. You don't know if your disk will be readable in 10 years, and you can't trust implementations based on reverse engineering. That is why FAT32 will remain the de-facto standard for removable drives. FAT32 is also closed, but it is far simpler and reverse engineering it is more reliable.
NTFS could have been much more useful but it has been wasted for very dumb business reasons. That is a shame.
Considering that it has been around for 17 years, and given Microsoft’s long track record on providing backwards-compatibility, I don’t think NTFS support will come anywhere near disappearing in the next 10 years.
Whether or not that’s a good thing is a different argument.
If you had some documents written with an earlier version of MS Office, you would say something different.
Every time I read about what NTFS can do I wonder why MS always tries to cripple it. It is limited by the software on top.
Apparently it could have had:
- Paths longer than 255 characters (I have had to shorten a lot of directories and filenames)
- Automatic backup while the system is running
- File deduplication with hardlinks or something similar
- Smarter algorithms to avoid fragmentation
- Rolling back the system to a clean install (without losing user data)
- Growing the only partition on the system by just installing a new hard drive and linking it in
- A tagging system built in (for an extra organization method on top of the filesystem)
I still find it hard to believe that MS could have solved 99% of the problems people have with Windows by using functionality already present in the filesystem…
The reasons why the best features in NTFS don’t get used widely, I think, is mainly for the sake of compatibility.
For example, NTFS supports case sensitivity, but it is disabled by default because Win32 does not support case sensitivity for file names.
‘case sensitivity’, a good thing?
Actually, for anything other than passwords, probably not.
I think the way Windows handles it is best, from a usability standpoint, with case being preserved, but not mandatory.
Allows two files with similar names to exist in the same folder, at the least.
Yes.
If you insist on having case insensitivity for e.g. file names entered by user, do it in a library – open_file_case_insensitive(“/TmP/Hello.TXt”). Don’t break the whole operating system down to the very core for it.
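A rough sketch of what such a library call could look like on top of a case-sensitive filesystem (the function name follows the hypothetical open_file_case_insensitive above; for brevity this version only resolves the final name, not every path component):

```c
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <strings.h>

/* Open dir/name for reading, matching 'name' without regard to case. */
int open_file_case_insensitive(const char *dir, const char *name)
{
    DIR *d = opendir(dir);
    if (!d)
        return -1;

    int fd = -1;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (strcasecmp(e->d_name, name) == 0) {   /* ignore case here only */
            char path[4096];
            snprintf(path, sizeof(path), "%s/%s", dir, e->d_name);
            fd = open(path, O_RDONLY);
            break;
        }
    }
    closedir(d);
    return fd;   /* -1 if nothing matched */
}

int main(void)
{
    int fd = open_file_case_insensitive("/tmp", "Hello.TXt");
    printf("fd = %d\n", fd);
    return 0;
}
```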
While the Windows kernel is probably becoming OK these days, it's still crippled by various user-visible warts like:
– drive letters
– backslash recommended as path separator
– CR LF
– inability to delete/move files that are in use
– case insensitivity
Any pretense of elegance elsewhere is stained by these issues. Unix has warts too (terminal handling, whatever), but people rarely see those, while the Windows ones you hit several times a day.
Excuse me, but that does not sound like hard links as they have been known in Unix-land. What you have just said is that changes to the hard link file change the target, which is true since they are the same file, but it is certainly not true on Unix filesystems that deleting a hard link deletes the file it's linked to.
Having just tested it on NTFS confirms my suspicions: NTFS doesn’t do it this way either. Deleting all links to a file when you delete any one link would be crazy behavior. NTFS hard links work like Unix hard links.
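For anyone who wants to repeat the test, a small C sketch along those lines (the paths are examples; assume the original file already exists with some content):

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* Add a second name for an existing file... */
    if (!CreateHardLinkW(L"C:\\Temp\\second.txt",
                         L"C:\\Temp\\original.txt", NULL))
        return 1;

    /* ...delete the original name; this only drops one reference... */
    DeleteFileW(L"C:\\Temp\\original.txt");

    /* ...and the data is still reachable through the remaining link. */
    HANDLE h = CreateFileW(L"C:\\Temp\\second.txt", GENERIC_READ,
                           FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
    if (h == INVALID_HANDLE_VALUE) return 1;

    char buf[64] = {0};
    DWORD n;
    ReadFile(h, buf, sizeof(buf) - 1, &n, NULL);
    CloseHandle(h);
    printf("still readable: %s\n", buf);
    return 0;
}
```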
Apart from inaccuracies like this I found the article usefully informative. I would like to see a comparison, by someone who knows filesystems better than I do, between NTFS and NSS, which was produced somewhat later, to contrast the different feature sets. I know that NTFS has relatively weak performance characteristics, possibly due to its complexity, and I know that NSS is pretty good in that area while having, AFAIK, an equivalent feature set.
A real article about operating systems on OSNews.
Congratulations!!!
The chart on the front page is missing Windows 7 and the features added... hmmm. How long has Windows 7 been out? 2 years?
Is this a 2+ year old article that was found and posted? I can't tell…
Windows 7 has the same NTFS as Windows Server 2008, NTFS 6.0.
Good article, Andrew!
I was looking at the EFS problems, and most of these (maybe none?) don't strike me as problems at all, but one of them is flat-out incorrect:
# If your PC is not on a domain and someone gets physical access to it, it’s very simple to reset your Windows password and log in.
This is true; however, resetting an account password from outside of the account (as opposed to logging in and changing the password) will invalidate the EFS keys bound to that account. The password for the user account is used to decrypt the relevant EFS keys, and thus, by changing it from outside the account, access to the EFS keys for that account can't be gained and will be permanently lost once the password is forcibly reset.
Windows will in fact warn you of this when you try to do so, for example through Local Users & Groups:
“Resetting this password might cause irreversible loss of information for this user account. For security reasons, Windows protects certain information by making it impossible to access if the user’s password is reset. This data loss will occur the next time the user logs off.”
This makes sense, as the EFS keys that are being held in their decrypted form in memory will cease to exist on log-off and their encrypted equivalents on disk will no longer be possible to decrypt.
In short, you'd have to brute-force the password to guarantee access to EFS-encrypted user data, not reset it.
Disclaimer: Windows 2000 has some flaws in the EFS implementation and default settings that make some of the above less true. XP and newer are not vulnerable; Vista and newer even less so.
Thanks for that correction. It’s good to know considering how easy it is to reset a Windows account password.