Semantic patches

I’ve been porting the ext2fs file system utilities to work with 64-bit file systems. A couple weeks ago, I created, mounted, and fscked my first 33-bit ext4 file system! w00t! Now I have a lot of bugs to fix. One bug halves the free blocks and free inodes on mount, which I haven’t yet figured out. But the second bug, which results in corrupt block group descriptors, has a clear and obvious fix – it just involves a hell of a lot of search-and-replace. Basically, I have to take every instance of something like:

fs->group_desc[i].bg_flags |= EXT2_BG_BLOCK_UNINIT;

And replace it with something like:

ext2fs_bg_flag_set (fs, i, EXT2_BG_BLOCK_UNINIT);

A preliminary grep:

[val@fsbox e2fsprogs]$ grep -r 'group_desc\[' . | wc -l
[val@fsbox e2fsprogs]$ grep -rl 'group_desc\[' . | wc -l

And I have *already* replaced all instances of nearly 60 other patterns in the 110,000 LOC that is e2fsprogs. (It could be worse; Jose R. Santos did a lot of the preliminary work for 64-bit already.)

I whined about this to Ted and he pointed me at the Coccinelle semantic patch tool (or spatch, which is a wildly overloaded name). It looks pretty awesome; you can tell it things like “For every function that is somewhere assigned to this function pointer field in this kind of struct, change its arguments to include a void * pointer named arg and then replace the allocation of this other struct inside the function with a cast of this void * pointer. And, oh yeah, remove the free() later on.” My first semantic patch was a wild success on the first file I tested it on:

@@
identifier fs;
expression group, flag;
@@

-fs->group_desc[group].bg_flags |= flag
+ext2fs_bg_flag_set(fs, group, flag)

But it barfed on the next file I tried (didn’t like the EXT2FS_ATTR(()) define). I spent some time screwing around trying to get it to read header files, but then I wrote a trivial test case and spatch barfed on *that* too. No problem, I can debug this.

Did I mention that spatch was written in OCaml by French grad students?

I’m writing me a mess of sed scripts tonight.
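
For the one pattern shown earlier, such a script might look roughly like the sketch below. It assumes each statement sits on a single line and that the file system handle is a plain identifier; rewrite_bg_flag_set is a hypothetical helper name, not something from e2fsprogs. Unlike a semantic patch, it knows nothing about C, so anything it misses still has to be hunted down with grep.

```shell
#!/bin/sh
# Sketch only: turn direct "|=" updates of bg_flags into accessor calls.
# rewrite_bg_flag_set is a hypothetical name used for illustration.
rewrite_bg_flag_set() {
    # Reads C source on stdin and writes the rewritten source to stdout.
    #   \1 = file system handle (a plain identifier)
    #   \2 = group index (anything up to the closing bracket)
    #   \3 = flag expression (anything up to the semicolon)
    sed -E 's/([A-Za-z_][A-Za-z0-9_]*)->group_desc\[([^]]+)\]\.bg_flags[[:space:]]*\|=[[:space:]]*([^;]+);/ext2fs_bg_flag_set(\1, \2, \3);/g'
}
```

Feeding it the example line, as in `printf '%s\n' 'fs->group_desc[i].bg_flags |= EXT2_BG_BLOCK_UNINIT;' | rewrite_bg_flag_set`, prints `ext2fs_bg_flag_set(fs, i, EXT2_BG_BLOCK_UNINIT);`. Lines that do not match pass through unchanged, so the output is easy to diff against the original file before committing to anything.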

Postscript: I’d like to note that spatch seems like the beginnings of a compiler checker tool comparable to Coverity, in the hopes of getting some more developer time on this thing.

Post postscript: The Coccinelle developers contacted me and we’re working on this (yay!). And they aren’t French grad students, they are French postdocs and an American professor working at the University of Copenhagen.

5 thoughts on “Semantic patches”

  1. “Did I mention that spatch was written in OCaml by French grad students?”

    Then count yourself lucky it doesn’t have a top level GUI element whose sole purpose is to cause the application to crash and dump core.

    No, seriously.

  2. At the risk of being redundant, this sed expression (extended regex syntax, as with sed -E) should handle the general pattern:
    /group_desc.*bg_flags \|=/s~(\w+)->group_desc\[(\w+)\]\.bg_flags \|= (\w+);~ext2fs_bg_flag_set (\1, \2, \3);~g
