Parallel file system worlds

Recently, someone asked me why, exactly, I didn’t think much of NTFS. I couldn’t come up with anything much more specific than “It’s old! and sucks!”, so I decided to go and deliberately read about NTFS.

Now, you can’t trust marketing material on file systems, but I was nonetheless amazed to read a slide describing automatic incremental online file system check in NTFS (slide 18 in this slide deck). “Wow,” I thought, “Maybe NTFS is actually way ahead of everything else in file systems and I’m just another narrow-minded UNIX developer.” Then I flipped to the next slide, which described the fabulous new innovation of “symbolic links.” Whichever is true – NTFS is wildly advanced or terribly backward – we are clearly living in parallel file system development worlds.

I’m no expert on NTFS, but I thought I’d summarize what I learned in an hour of half-hearted skimming. My conclusion is that NTFS is a thoroughly modern file system in many ways but oddly crippled in others. At the buzzword level, NTFS uses B+ trees and extents in places, has metadata journalling, supports encryption and compression, and stores small files inline with metadata. On the other hand, the majority of file system metadata is still stored in one big chunk at the beginning of the partition (no block/cylinder/allocation groups to spread the metadata around the file system and get it closer to file data). NTFS does support sparse files now, but from what I understand the application must deliberately mark the file as sparse and specify which ranges should not be allocated (see the sketch below). Symlinks are a wild crazy new feature limited to privileged users only, while directory junctions (which behave much like directory hard links) are permitted. I tried to find another reference to “chkdsk on the fly” but failed; I’m going to assume that’s typical marketing hyperbole until I find a more reliable reference.
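
For concreteness, here’s a minimal sketch of what that looks like from C on Windows – the file name and sizes are made up, and error handling is mostly omitted. On a typical UNIX file system you get a hole for free just by seeking past the end of the file and writing; on NTFS the application has to opt in explicitly:

    /* sparse.c – mark a file sparse and punch a hole in it (Windows). */
    #include <windows.h>

    int main(void)
    {
        DWORD bytes;
        HANDLE h = CreateFileA("big.dat", GENERIC_WRITE, 0, NULL,
                               CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h == INVALID_HANDLE_VALUE)
            return 1;

        /* Give the file some length to begin with (2 GiB). */
        LARGE_INTEGER size;
        size.QuadPart = 2LL << 30;
        SetFilePointerEx(h, size, NULL, FILE_BEGIN);
        SetEndOfFile(h);

        /* Step 1: a file is not sparse until you explicitly say so. */
        DeviceIoControl(h, FSCTL_SET_SPARSE, NULL, 0, NULL, 0, &bytes, NULL);

        /* Step 2: explicitly deallocate a range (here, the first 1 GiB). */
        FILE_ZERO_DATA_INFORMATION zero;
        zero.FileOffset.QuadPart = 0;
        zero.BeyondFinalZero.QuadPart = 1LL << 30;
        DeviceIoControl(h, FSCTL_SET_ZERO_DATA, &zero, sizeof(zero),
                        NULL, 0, &bytes, NULL);

        CloseHandle(h);
        return 0;
    }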

I’d summarize my overall impression of NTFS as all the latest coolest bits and pieces shoehorned into an ancient on-disk format (more ancient than usual) and never given a good high-level optimization pass. Real-world performance work bears this out; NTFS quickly fragments under load.

NB: 2.6 includes an in-kernel NTFS driver, which has had limited write support for quite some time now. I find that a lot of people (including me) assume that NTFS access is still only possible through a limited read-only out-of-kernel driver or through FUSE.

16 thoughts on “Parallel file system worlds”

  1. (hello, here via a roundabout route from LinuxChix…)

    NB: 2.6 includes an in-kernel NTFS driver, which has had limited write support for quite some time now.
    Yeah, I vaguely recall seeing an option for that when I was still compiling kernels on Gentoo. I got the impression that the write support was still on the level of “don’t do this at home, kids!” Or maybe I just dismissed it because limited write support didn’t seem much better than no write support.

  2. Part of the reason they don’t support symbolic links is that no existing software understands ’em. I’d bet, for example, that if you had a directory with a symlink to ../.., virus scanners etc. would loop forever when scanning your disk.

  3. Yeah, exactly – the software has to be able to deal with symlinks. What I also realized is that symlinks make package management and library versioning and all sorts of things much easier – imagine if you had to know the exact path to a library and couldn’t name things with different versions and then create symlinks to them (see the sketch below)… Hey, wait, now I understand why people are always complaining about Windows software installing multiple copies of the same DLL (library). Lightbulb!
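
    For the curious, a minimal sketch of the versioning trick in C, with made-up library names (this is the standard UNIX shared-library convention, nothing NTFS-specific – programs link against a stable name, which is just a symlink to whichever version is installed):

        /* versioning.c – the UNIX shared-library symlink convention. */
        #include <unistd.h>

        int main(void)
        {
            /* The real file carries the full version number
               (libfoo.so.1.2.3 is assumed to already exist). */

            /* The stable names are just symlinks pointing at it: */
            symlink("libfoo.so.1.2.3", "libfoo.so.1"); /* runtime name   */
            symlink("libfoo.so.1", "libfoo.so");       /* link-time name */
            return 0;
        }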

  4. Last I checked (which admittedly is Sat May 31 15:49:34 MDT 2008, kernel 2.6.25.4), the NTFS in-kernel driver is woefully inadequate for write support:

    The only supported operation is overwriting existing files, without
    changing the file length.  No file or directory creation, deletion or
    renaming is possible.  Note only non-resident files can be written to
    so you may find that some very small files (<500 bytes or so) cannot
    be written to.

    So, if you want actual, useful write support, you’ve still got to use FUSE.

  5. I hear tell that the guy who wrote NTFS doesn’t work for MS anymore and no-one else there is quite sure what to do with it, hence just layering more stuff on top of the old structure.

    I have always found the symlink/hardlink situation in NTFS “utterly confusing” at best 8)

  6. Hm, the available documentation is contradictory. The description in Documentation/filesystems/ntfs.txt agrees with what you say, but the latest (1/23/09) announcement about NTFS support says:

    * Full core functionality implemented: full index operations, unlimited file and directory creation and transparent UTF-8 support. Moreover shared writable mmap and NFS support via the FUSE kernel module.

    Which seems to imply that the kernel module supports all forms of writes other than shared writable mmap…

    Announcement archived here:

    http://lwn.net/Articles/316661/

  7. The Microsoft NTFS driver was written by a team of developers. NTFS is a fairly complex design. According to a university lecture by one of the NTFS architects, NTFS was 500k source lines in 2000. It has grown considerably since then.

    AFAIK, Microsoft has a fairly small file system/storage team, only about 20 people. That, coupled with being closed source, is probably the reason they are rapidly losing ground and have many functionality and performance problems. They indeed had to hire back one of the main NTFS architects/coders recently because they couldn’t fix some serious corruption bugs for a long period of time and the media started to write about the problems regularly.

    The layering is true, though this is a consequence of the file system being designed to be extensible.

    The symlink/hardlink comment is spot on :-)

    Szaka

  8. Well I’m glad the ntfs-3g guys know what they’re doing, you guys have saved my life (or, more properly, the lives of all the people who foist their broken laptops on me) more times than I can count :D

  9. The upstream project has had full write support working for something like 3 years. There were a lot of bugs to work out, but they did it and had a release back in 2007:

    http://sourceforge.net/forum/forum.php?forum_id=740182

    My guess is they didn’t have that much easier a time getting their code into Linus’s tree than the squashfs people did…

    The project seems to have moved from sourceforge to http://www.linux-ntfs.org now. (I haven’t been following it particularly closely, I no longer personally know anybody who still uses Windows. Everybody seems to have switched to macs…)

  10. That’s true of most microsoft technologies now.

    The 1998 antitrust trial had a significant morale impact within the company and cost them something like three dozen senior executives (both Myhrvolds, Silverberg, etc). Then msft responded to the permatemps lawsuit by firing _all_ temporary workers (many of whom had been there for a decade), which turned out to hose their build system so badly they stopped being able to compile Windows 2000 shortly after its release. (XP was literally the result of the Windows Millennium development team being tasked with taking the NT4 source, backporting as much of Windows 2000 as they could get to compile, and making it pretty.) Then msft stock tanked in 2000 (because they copied Cisco’s pooling method of acquisitions and triggered an SEC rule that prevented their stock buyback program from disguising the dilution inherent in their stock option income tax benefit) and has been flat ever since (really, the stock’s peak was back at the start of 2000), so everybody who was there for the money left. Then Google started seriously raiding them and hiring away everybody with a brain around 2003. Then Gates saw the 4 gig memory wall and the corresponding switch to 64 bit hardware coming (with a corresponding 8 bit cp/m -> 16 bit dos -> 32 bit windows operating system transition) and decided to retire rather than fight it.

    The Vista death march (taking 7 years to ship crap) was a side effect of all this, but of course it made things worse by burning out most of their remaining competent engineers. Then the attempted Yahoo acquisition caused everybody to lose faith in Ballmer, so a big “employees vs management” vibe cropped up (blaming him for everything from the stock price to the loss of towels in the company gym). Then the economy cratered and MS started missing its numbers for the first time in over 20 years.

    Earlier this year, even mini-msft announced he was considering leaving the company. From a human resources perspective, they are _deeply_ screwed, and it’s not a new thing. It’s been festering for about ten years now.

    Of course none of this affects their real source of power, which is their lock on the distribution channels. Nobody wants Vista, but you can’t buy a machine from Fry’s that hasn’t got it preinstalled. (You can’t even upgrade most of ’em to XP; you MUST buy Vista if you buy that hardware.) That’s the heart of the microsoft monopoly: preinstalls. Always has been.

  11. Fascinating — didn’t know about the permatemps lawsuit and its impact on Win2k/XP before. Are there references for this? (Not that it doesn’t sound believable; it does.)

  12. Transactions?

    There’s only one NTFS feature so far that I’ve not seen replicated by any Unix file system (even ZFS/btrfs/Hammer): transaction support (see the sketch below). There have been times when using yum that I wished an aborted transaction could trigger a rollback that leaves the filesystem in its original state.

    (Note: the aforementioned three filesystems allow snapshotting, but reverting to an earlier snapshot would undo changes made *outside* the updating operation as well)
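
    For reference, a minimal sketch of what that looks like through NTFS’s transaction API (TxF, Vista and later) – the file name and contents here are made up, and error handling is abbreviated:

        /* txf.c – transacted file update via the Kernel Transaction Manager.
           Compile with: cl txf.c ktmw32.lib */
        #include <windows.h>
        #include <ktmw32.h>

        int main(void)
        {
            /* Start a kernel transaction. */
            HANDLE tx = CreateTransaction(NULL, NULL, 0, 0, 0, 0,
                                          L"demo update");
            if (tx == INVALID_HANDLE_VALUE)
                return 1;

            /* File operations tied to tx commit or roll back as a group. */
            HANDLE f = CreateFileTransactedW(L"config.dat", GENERIC_WRITE,
                                             0, NULL, CREATE_ALWAYS,
                                             FILE_ATTRIBUTE_NORMAL, NULL,
                                             tx, NULL, NULL);
            if (f == INVALID_HANDLE_VALUE) {
                RollbackTransaction(tx);
                CloseHandle(tx);
                return 1;
            }

            DWORD written;
            WriteFile(f, "new contents", 12, &written, NULL);
            CloseHandle(f);

            /* Either the whole update becomes visible, or none of it. */
            if (!CommitTransaction(tx))
                RollbackTransaction(tx);

            CloseHandle(tx);
            return 0;
        }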

  13. Probably. A quick google pulls up the wikipedia article on “permatemps”, which uses this as an example; apparently the case was called “Vizcaino v. Microsoft” and was filed in 1996. (I think I heard about it in 1999 because it was coming to a head.)

    A couple other prominent google hits for “microsoft permatemps”:

    http://sylviavasilik.com/2006/02/microsoft-permatemps-fiasco.html
    http://news.cnet.com/Microsoft-permatemp-checks-finally-arrive/2100-1022_3-5907352.html
    http://washtech.org/news/industry/display.php?ID_Content=5016
    http://www.aflcio.org/aboutus/thisistheaflcio/publications/magazine/corp_challenging.cfm

    Unfortunately, the old stuff from the time is harder to access:
    http://articles.latimes.com/2000/mar/01/business/fi-4247

    The big policy change after the suit is that employers instituted a mandatory 90 day break after each year of work, and most contractors found other work rather than be unemployed that long. (And most employers found other people to fill the positions because they couldn’t afford to leave them unstaffed for three months.) The permatemps weren’t _exactly_ fired (which might have been grounds for another suit if it was punitive), but they were made to leave the company anyway.

    Some of my co-workers at the time were ex-microsoft employees, so I asked them some questions, and I had a friend who worked at Dell who interfaced with microsoft people daily as part of his job, and I asked _him_ questions. Here’s what I learned by talking to these guys:

    At Microsoft in the late 90’s, the build system was considered a low status job, and microsoft staffed most of the low status jobs with temp workers (who became permatemps because “low status” doesn’t necessarily mean “low skill”). The source code to the OS itself was rigorously guarded and checked into source control, treated as IP crown jewels. But the build system wasn’t considered important. (Obviously, if you have the source code it’s trivial to figure out how to _compile_ it, right?)

    But over the years windows grew to millions of lines and hundreds of different packages, and grew layers of source control security over who was authorized to view what code, and needed nightly test builds because it just didn’t _work_ very well and was forced to function at all via heroic testing efforts and brute force. So their build system grew into this huge cluster-based monstrosity that ran on a specific set of servers. It also became littered with lots of little binary only programs that did little format conversions and fixups and so on. The people who _wrote_ this code knew what it did and how it worked, but they’d never been asked to document it because the build system wasn’t important.

    Oh, the other interesting bit of Microsoft corporate policy: right after the antitrust trial, policies went into effect that when an employee left, their machine was immediately recycled. Specifically, the drives got wiped pronto (presumably to make sure any incriminating emails they’d saved local copies of couldn’t surface as evidence in antitrust appeals).

    So the permatemps leave en masse, taking all knowledge of how the Windows 2000 build system works with them, and their machines are formatted and any copies of the source code to the weird little build tools they had are erased, and their email history (which might also explain why they did what they did) is purged from the servers.

    And then they find they can’t make _any_ changes to the Windows 2000 codebase, or the build breaks. The weird little binary-only tools the build is riddled with segfault, and nobody knows why they’re there or what they do; the people who wrote them are gone, their machines and email history wiped…

    And that’s why some people claim Windows 2000 was the best OS microsoft ever released.

  14. Fascinating. Of course, given that even the most popular .NET build tool, nant, was a port of Apache’s ant, I’d guess that even in 2002 they still did not think build systems were important…

  15. NTFS is obsolete. It’s a file system from the ’90s.

    It’s one of the reasons why Windows has abysmal file system performance. In some of my tests Windows is 100 _times_ slower than ext3 on Linux. Hell, even ntfs-3g on Linux is sometimes _faster_ than the native NTFS in Windows.
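
    (For context, the kind of workload where gaps like that tend to show up is small-file metadata churn. A minimal sketch in standard C, with a made-up file count – the tiny payload means create/delete metadata costs dominate, not raw throughput:)

        /* churn.c – small-file metadata microbenchmark (C11). */
        #include <stdio.h>
        #include <time.h>

        #define NFILES 10000 /* made-up count; adjust to taste */

        int main(void)
        {
            char name[64];
            struct timespec t0, t1;
            timespec_get(&t0, TIME_UTC);

            /* Create NFILES tiny files... */
            for (int i = 0; i < NFILES; i++) {
                snprintf(name, sizeof(name), "f%05d.tmp", i);
                FILE *f = fopen(name, "w");
                if (!f)
                    return 1;
                fputs("x", f);
                fclose(f);
            }
            /* ...then delete them all. */
            for (int i = 0; i < NFILES; i++) {
                snprintf(name, sizeof(name), "f%05d.tmp", i);
                remove(name);
            }

            timespec_get(&t1, TIME_UTC);
            printf("%.2f seconds\n",
                   (t1.tv_sec - t0.tv_sec) +
                   (t1.tv_nsec - t0.tv_nsec) / 1e9);
            return 0;
        }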

Comments are closed.