First major public exploitation of a compare-by-hash based system

In December, a team of hackers took the time to implement a full exploit of SSL certificates signed with a broken hash function (MD5). The paper is entitled MD5 Considered Harmful Today (a charming reference to the classic MD5 Considered Harmful Someday).

This piece of news has been in my “to-blog” queue for the better part of a month but I’ve been too depressed to write about it. For some years, I’ve tried – and failed – to convince systems programmers that they need to plan for the obsolescence of any cryptographic hash function they have incorporated into their software. Why?

Cryptographic hash functions don’t actually produce universally unique signatures/addresses/identifiers from their inputs; they simply do so with very high probability for the few years between their definition and the inevitable break of that hash function. When I wrote my first paper on the subject back in 2003, MD5 was already considered broken by cryptographers, but since none of them had bothered to find two colliding inputs, most programmers believed MD5 was still safe to use. When MD5 was completely broken the next year, I thought that the debate was over, but programmer practice remained unchanged, in part because most people had already migrated to SHA-1. Then SHA-0, a close cousin to SHA-1, was completely broken. Again, nothing changed. Then SHA-1 was significantly weakened, to the point that Bruce Schneier described it as:

SHA-1 has been broken. Not a reduced-round version. Not a simplified version. The real thing.

Still nothing. Of course, programmers are taking exactly the same attitude towards the weakening of SHA-1 as they did towards MD5 – it ain’t broken until you show me the collision, so I’m gonna keep using it – but this time we have a significant body of software using SHA-1. And now we know that even supposedly security-centered organizations continue to use known broken cryptographic hashes right up until someone demonstrates a collision on their exact system and creates a media storm around it.

I still recommend that designers of software in which the cryptographic “address space” is shared by untrusted users should plan for upgrading their cryptographic hash functions to the current state of the art every few years. I just have no expectation that this will happen.
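For the curious, here is roughly what "planning for the upgrade" can look like in practice. This is a minimal sketch of my own, not taken from any particular system: the address format ("algorithm:hexdigest"), the names PREFERRED, SUPPORTED, make_address, verify, needs_upgrade, and upgrade, and the choice of SHA-256 as the current preferred hash are all assumptions made up for illustration. The only idea that matters is that every stored address records which hash function produced it, so old addresses stay verifiable while new ones move to a stronger function.

```python
import hashlib

# Hypothetical policy for this sketch: which algorithm new addresses use,
# and which older algorithms we still accept when reading existing ones.
PREFERRED = "sha256"
SUPPORTED = {"sha1", "sha256"}


def make_address(data: bytes, algorithm: str = PREFERRED) -> str:
    """Return a self-describing address of the form '<algorithm>:<hexdigest>'."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    return f"{algorithm}:{hashlib.new(algorithm, data).hexdigest()}"


def verify(data: bytes, address: str) -> bool:
    """Check data against an address using the algorithm named in the address."""
    algorithm, _, digest = address.partition(":")
    if algorithm not in SUPPORTED:
        return False
    return hashlib.new(algorithm, data).hexdigest() == digest


def needs_upgrade(address: str) -> bool:
    """True if the address was made with something other than the preferred hash."""
    return address.partition(":")[0] != PREFERRED


def upgrade(data: bytes, address: str) -> str:
    """Re-address data under the preferred hash after verifying the old address."""
    if not verify(data, address):
        raise ValueError("data does not match its recorded address")
    return make_address(data)
```

With this shape, retiring a hash function means changing PREFERRED, re-addressing existing data in the background with upgrade(), and eventually removing the old name from SUPPORTED. Note that the sketch does nothing about data an attacker already slipped in under the weak hash; it only shows that the migration path can exist from day one.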

I have some pretty color graphics showing the lifetime of cryptographic hash functions in an earlier post.

Finally, in a vain attempt to forestall the inevitable flame wars, I will point out that my objections do not apply to systems in which the hash address space is shared only with trusted users. In other words, hash-based source control is for the most part fine sticking with SHA-1 and could indeed use a cheaper hash like MD5 without any practical trouble. My hatred of git is based entirely on the user interface.

Update:

I should have read my Google News Alerts before posting this. Apparently a new document archive system from U Washington was just announced that does take into account migration to new hash functions. From the New York Times article by John Markoff:

The University of Washington researchers now use a modern hash algorithm called SHA-2, but they have designed the system so that it can be easily replaced with a more advanced algorithm.

But I can’t find any primary sources and the article has a technical error elsewhere: “At the heart of the system is an algorithm that is used to compute a 128-character number known as a cryptographic hash from the digital information in a particular document.” Probably a garbled reference to 128-bit MD5 checksums. Nonetheless, encouraging news.

64-bit e2fsprogs publicly available

I finished off the last major patch implementing support for creating, fscking, etc. ext4 file systems with more than 2^32 blocks about 12 hours before my shoulder surgery. Hurrah! It will at some point be pulled into Ted Ts’o’s mainline repository, but until then it is available in my repo on kernel.org:

git://git.kernel.org/pub/scm/fs/ext2/val/e2fsprogs.git

The branch is “64bit”. For testing only – don’t put data you care about in file systems made with these tools!

Email bugs to:

linux-ext4@vger.kernel.org

FAST ’09

Yo! Forget not! The next FAST (USENIX File systems And Storage Technologies) conference is coming up February 24 – 27:

http://www.usenix.org/events/fast09/

The best part: It’s in downtown San Francisco! Compare and contrast with the location of the last two FAST conferences in “downtown” San Jose.

I have several personal reasons for attending. First, it’s only a bus ride away, so my travel budget comes to about $12. Second, I had the honor to be on the program committee for FAST this year, an event which is unlikely to repeat itself. A third and directly related reason is that I’m very interested in the papers this year, in particular “Generating Realistic Impressions for File-System Benchmarking” from Nitin Agrawal, et alia at UW Madison. I can’t wait for this tool to be open sourced – it will be an enormous boost to file systems developers and testers.

Party at the science museum

It’s like a real-life XKCD comic: The California Academy of Sciences is turning into a nightclub every Thursday night starting February 12th:

http://www.calacademy.org/events/nightlife/

Stick Figure Girl would love it. I already ordered my tickets.

Tickets are only $10, as compared to the usurious $25 charged for a day pass to the museum. Sure, you’ll miss the insect feeding, but the martini in your hand will be some consolation.

Latest LWN article: Coccinelle semantic patches

You may remember my first post about semantic patches, which was mainly about how difficult Coccinelle (spatch) was to use. The authors got in touch with me and fixed most of the problems I was having as of version 0.1.4. I used it to great effect in both e2fsprogs and autofs. My article on Coccinelle for LWN appeared this week. As usual, it’s pay-only for a week and then free, and subscriptions start at $2.50/month.

Calling all video bloggers

SuperHappyDevHouse is happening again, Saturday January 31st at the Sun Menlo Park campus. If you don’t already know what SHDH is about, you can watch a video describing it, in which yours truly makes a particularly dorky appearance. The intro frame is of me, too, which means that for more than two years, my nerdly visage has been the “face” of SHDH. A wildly inappropriate state of affairs, you’ll agree, but when I suggested replacing it, the organizers pointed out that they didn’t have anything better to replace it with.

So I make my plea: please, please, please, if you have a video camera and rudimentary editing skills, attend SHDH and make a video of it. If pity for me does not move you, think of what you could gain: fame, fortune, and increased employability in the Bay Area dot com sector!