Git notes

You may have seen that I’ve made some Git-related posts lately (1, 2, 3) and if you’ve been watching you’ll have already noticed that I’ve been adding lots of Git content to the Knowledge Base. I’ve now done a shallow import (no history) of one public Subversion repository to Git, and last night got gitweb up and running. I’ve also done a non-shallow (full history) import locally; I’ve yet to decide whether I’ll run with that one or the shallow one.

I was pleased to see that Michael Tsai is jumping ship as well:

Then I heard about Git when I happened to watch a video of its creator, Linus Torvalds. It had not occurred to me that CVS and Subversion were fundamentally broken. Torvalds is undeniably a smart guy, but he’s also known for his bluster. I’d heard Git mentioned a few times before, usually in the context of it being difficult to use, something only a kernel developer could love. So I was skeptical but interested enough to try it out. What I found is that Torvalds’s bragging is justified. Learning about Git after using other version control systems is somewhat like learning a new programming language that’s radically different from what you’ve known before. Even if you might not need the more unusual features most of the time, you feel as though your eyes have been opened, your mind expanded.

Sounds like Michael followed a similar trajectory to mine, migrating from CVS to Subversion a few years ago (a "no brainer" at the time), and now to Git. Since 2004, distributed version control has gotten big, and soon enough moving to it will be a "no brainer" too.

Along the way, I took a detour via SVK, and in fact I am still using it, but I want to migrate everything to Git eventually. SVK is an awesome system, but it’s fragile, complex and slow compared to Git, and because it is just a layer over the top of existing (Subversion) infrastructure it must carry with it the baggage and the ideas that go with it. Git, on the other hand, wipes the slate clean and starts from scratch with some basic ideas which are at once stunningly simple and revolutionary. It’s hard not to use that kind of language when you talk about Git.

As I noted yesterday, the most fundamental and important of these ideas is that metadata is an unnecessary distraction. You don’t need to track renames or merges or anything else. Subversion makes this mistake (tracking renames) and SVK does too (tracking merges), even though it’s distributed and could be "smarter". Git, on the other hand, explicitly tracks nothing and absolutely everything is inferred from the changes in state of the tree. That’s the only thing that matters. As Michael notes, the structural design of the repository itself is beautifully clean; and that’s why Git doesn’t need to pack metadata into tags in order to remember when merges and copies took place (like SVK does): the parent (or parents) of every commit is clearly recorded and everything else is implicitly encoded by the structure (the Directed Acyclic Graph) of the history. This in turn leads to a clean, understandable codebase whose simplicity inspires confidence.
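
To make that idea a little more concrete, here is a rough Python sketch of my own. It is a toy model, not Git's real object format or its actual rename detection (which, as far as I understand it, also uses similarity heuristics for files that changed while moving). The names `Commit`, `snapshot`, `infer_renames` and `is_merge` are all made up for illustration. The point is only that a commit needs to record a snapshot of the tree plus its parents, and that renames and merges can be worked out afterwards from that alone.

```python
import hashlib
from dataclasses import dataclass


def blob_id(content: bytes) -> str:
    """Toy content address: a plain SHA-1 of the file body (not Git's exact scheme)."""
    return hashlib.sha1(content).hexdigest()


def snapshot(files: dict) -> dict:
    """Map each path to the id of its content: the state of the tree, nothing more."""
    return {path: blob_id(body) for path, body in files.items()}


@dataclass
class Commit:
    tree: dict           # path -> blob id
    parents: tuple = ()  # no parents = root commit, two or more = merge


def is_merge(commit: Commit) -> bool:
    """A merge is not flagged anywhere; it is simply a commit with several parents."""
    return len(commit.parents) > 1


def infer_renames(old: dict, new: dict) -> list:
    """Pair each path that vanished with a path that appeared holding identical content."""
    removed = {p: b for p, b in old.items() if p not in new}
    added = {p: b for p, b in new.items() if p not in old}
    renames = []
    for old_path, blob in sorted(removed.items()):
        for new_path in sorted(added):
            if added[new_path] == blob:
                renames.append((old_path, new_path))
                del added[new_path]  # each new path can be matched only once
                break
    return renames


main_c = b"int main(void) { return 0; }\n"
first = Commit(snapshot({"README": b"hello\n", "main.c": main_c}))
second = Commit(snapshot({"README": b"hello\n", "src/main.c": main_c}), parents=(first,))

# Nobody recorded a rename, yet it falls out of comparing the two trees.
print(infer_renames(first.tree, second.tree))
```

Nothing in `second` says "main.c was moved"; the move is recovered purely by comparing two snapshots, and a merge would be recognisable in the same way, just by counting parents.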

I get the impression that I’ve probably followed Git a little more closely than Michael has, ever since it was first mentioned on Slashdot (I’ve had a long-running interest in version control systems). I too saw the video that Michael mentions and didn’t really learn anything new (although it was the first time I heard that Git could track the movement of blocks of text from one file to another, and that made me start wondering about Git again). It’s really my dissatisfaction with SVK’s speed that pushed me to start playing with Git, but the deeper I dug the more impressed I was.

Yes, Git is hard to learn. Yes, you have to read a lot of documentation in order to "get" it. But the good thing is, the documentation is available and it is very, very good. The time you invest in reading it, learning about Git’s underpinnings, and understanding the why of it all, is time well spent. Git has some very smart people working on it, scarily, intimidatingly good programmers, and the mailing list archive is full of helpful material. If you look at it carefully, chances are you’ll want to switch to Git too.

And you’ll see that although Git is hard to learn, it’s dead easy to use. Once you know how it works you can do stuff that you’d never dream of doing with a version control system like Subversion; and you’ll do stuff with confidence that you might hesitate to try with a system like SVK.

So there are plenty of big changes in the air. Leopard is coming. New APIs are on the way. Objective-C 2.0 is here. And we could well be on the brink of a sea change in version control. Seems like as good a time as any to make a radical departure.