04 October 2012

Rewriting Git History for Fun

For starters, if you're looking for a good JDBC connection pool, look no further than c3p0 [github, doc].

A few years back, commons-dbcp wasn't very stable under load, so I went looking for an alternative and landed on c3p0.  Ever since, Steve Waldman has kept making it better.  Apart from a two-year hiatus when he wasn't working on it, he's been very responsive and willing to accept feedback and make improvements.  I'm seriously impressed by the project.

Anyway, since I've been a long-time user and fan, when I saw that Steve had put his software on GitHub, I went to take a look.

Here is what I saw:

But what about all those prior releases in source-snapshot form that I was used to seeing on SourceForge?

I realized that perhaps I could contribute a little to the project, so I grabbed all the source release zips from SourceForge, created a local git repo, created a commit for each release, tagged it, and spliced Steve's recent work on top.

Then I submitted a GitHub issue to ask Steve what he thought -- it wasn't really a pull request, because it was a completely disconnected history.

Here is the top of the new history:

And here is the earliest part of that history:

The tools used were:
- git config user.name - to give credit where credit was due
- git config user.email
- curl - to download all the releases
- unzip | sed - to figure out what the commit date should be
- git commit --date "[release date]" - to create the commits
- vi .git/info/grafts - to temporarily splice the new history onto the old
- git filter-branch --tag-name-filter - to rewrite the new history permanently on top
- git tag -f - to repoint the existing release tags at the rewritten history
- git push --tags - to push it up to GitHub
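
The steps above could be scripted roughly as follows. This is a minimal sketch in Python, not the exact commands used for c3p0. It makes several assumptions: the release date is taken from the newest file timestamp inside each zip (standing in for the unzip | sed step), the author is passed with --author rather than set through git config, and the helper names (release_date, commit_release, graft_line, splice) are made up for illustration.

```python
import os
import shutil
import subprocess
import zipfile
from datetime import datetime


def git(repo, *args, env=None):
    """Run one git command inside the given repository."""
    subprocess.run(["git", "-C", repo, *args], check=True, env=env)


def release_date(zip_path):
    """Assumed heuristic: newest entry timestamp in the zip, as ISO 8601."""
    with zipfile.ZipFile(zip_path) as zf:
        newest = max(info.date_time for info in zf.infolist())
    return datetime(*newest).isoformat()


def commit_release(repo, zip_path, version, author):
    """Replace the working tree with one release, then commit and tag it."""
    # Clear everything except .git so files dropped in this release vanish.
    for name in os.listdir(repo):
        path = os.path.join(repo, name)
        if name == ".git":
            continue
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(repo)
    date = release_date(zip_path)
    git(repo, "add", "-A")
    # --date sets the author date; the committer date comes from the env.
    env = dict(os.environ, GIT_COMMITTER_DATE=date)
    git(repo, "commit", "-q", "--author", author, "--date", date,
        "-m", f"Release {version}", env=env)
    git(repo, "tag", version)


def graft_line(child_sha, parent_sha):
    """One .git/info/grafts line: a commit followed by its pretend parent."""
    return f"{child_sha} {parent_sha}\n"


def splice(repo, new_root_sha, last_release_sha):
    """Graft the newer history onto the release history, then make it real."""
    grafts = os.path.join(repo, ".git", "info", "grafts")
    os.makedirs(os.path.dirname(grafts), exist_ok=True)
    with open(grafts, "w") as fh:
        fh.write(graft_line(new_root_sha, last_release_sha))
    # "--tag-name-filter cat" keeps tag names while moving them to new commits.
    git(repo, "filter-branch", "--tag-name-filter", "cat", "--", "--all")
    os.remove(grafts)
```

Calling commit_release once per zip, oldest first, builds the release history; splice then takes the root commit of the newer history and the last release commit. Real release archives often wrap everything in a versioned top-level directory, which you would want to strip before committing so that each release diffs cleanly against the previous one.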

If I could consult for projects / companies to do this kind of VCS conversion work and actually get paid for it -- wouldn't that be awesome!