Migrate SourceForge CVS repository to git

Updated to include promoting and pushing tags.

I recently needed to migrate some SourceForge CVS repositories to git. I’ll admit I’m no git expert, so I Googled around for advice on the process. What I ended up doing was sufficiently distinct from any other guide that I feel it is worth recording here.

The SourceForge wiki page on git is a good start. It explains that you should log into the Project’s Admin page, go to Features, and tick to enable git. Although it’s not made clear, there’s no problem having both CVS and git enabled concurrently.

Enabling git for the first time will initialize a bare git repository for your project. You can have multiple repositories; the first is named the same as the project itself. If you screw things up, it’s OK to delete the repository (via an SSH login) and initialize a new one.

Just like the SourceForge documentation, I’ll use USERNAME, PROJECTNAME and REPONAME within commands. As just mentioned, the latter two are initially the same, and only differ once you add further git repositories.

Let’s begin by grabbing a copy of the CVS repository with complete history, using the rsync utility. When you rsync, there will be a directory containing CVSROOT (which can be ignored) and one subdirectory per module:

mkdir cvs && cd cvs
rsync -av rsync://PROJECTNAME.cvs.sourceforge.net/cvsroot/PROJECTNAME/* .
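
To see which modules the mirror contains before pointing cvs2git at one of them, a small helper can list the subdirectories while skipping CVSROOT. This is a minimal sketch; the function name list_cvs_modules is made up for illustration and is not part of any tool mentioned here.

```shell
# List CVS module directories inside a mirrored cvsroot, skipping CVSROOT.
# list_cvs_modules is a hypothetical helper name, not an existing tool.
list_cvs_modules() {
  root=${1:-.}
  for dir in "$root"/*/; do
    # If the glob matched nothing it stays literal, so test for a directory
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    [ "$name" = "CVSROOT" ] || printf '%s\n' "$name"
  done
}
```

Run it as list_cvs_modules . from inside the cvs directory created above.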

Grab the latest cvs2git code and copy the default options file. Change the run_options.set_project setting to point to your project’s module subdirectory:

svn export --username=guest http://cvs2svn.tigris.org/svn/cvs2svn/trunk cvs2svn-trunk
cp cvs2svn-trunk/cvs2git-example.options cvs2git.options
vi cvs2git.options
# edit the string after run_options.set_project, to mention cvs/PROJECTNAME

Also in the options file, set the committer name mappings in the author_transforms settings. This is needed because CVS logs only show usernames but git commit logs show human name and email – a mapping can be used during import to create a sensible git history.

vi cvs2git.options
# read the comments above author_transforms and make changes

But how do you know what CVS usernames need mapping? One solution is to run through this export and git import without a mapping, then run git shortlog -se to dump the committers. Blow the new git repo away, and re-import after configuring cvs2git author_transforms.
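
The shortlog output from that trial import can be turned into skeleton author_transforms entries to paste into the options file. The sketch below assumes each line looks like "count, tab, name, email in angle brackets" and that the name field of an unmapped import carries the CVS username; the function name and the FULL NAME and EMAIL placeholders are hypothetical and need filling in by hand.

```shell
# Turn `git shortlog -se` output into skeleton author_transforms entries.
# Assumption: each input line looks like "    12<TAB>username <something>".
shortlog_to_transforms() {
  # Strip the leading commit count, then the trailing <email> part
  sed -E -e 's/^[[:space:]]*[0-9]+[[:space:]]+//' \
         -e 's/[[:space:]]*<[^>]*>[[:space:]]*$//' |
  while IFS= read -r user; do
    # FULL NAME and EMAIL are placeholders to be edited by hand
    printf "    '%s' : ('FULL NAME', 'EMAIL'),\n" "$user"
  done
}

# Usage, from inside the trial repository:
#   git shortlog -se | shortlog_to_transforms
```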

The cvs2git utility works by generating the input files used by git’s fast-import command:

cvs2svn-trunk/cvs2git --options=cvs2git.options --fallback-encoding utf-8
git clone ssh://USERNAME@PROJECTNAME.git.sourceforge.net/gitroot/PROJECTNAME/REPONAME
cd REPONAME
cat ../cvs2svn-tmp/git-{blob,dump}.dat | git fast-import
git reset --hard
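
Before piping anything into git fast-import, it can be worth confirming that cvs2git actually produced both files. This is a minimal sketch, assuming the output lands in ../cvs2svn-tmp as in the commands above; the function name is invented for illustration.

```shell
# Check that both cvs2git output files exist and are non-empty.
# check_cvs2git_dumps is a hypothetical helper name.
check_cvs2git_dumps() {
  dir=${1:-../cvs2svn-tmp}
  # Same order as the import: blobs first, then the dump that refers to them
  for f in git-blob.dat git-dump.dat; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      return 1
    fi
  done
}

# Usage:
#   check_cvs2git_dumps ../cvs2svn-tmp && cat ../cvs2svn-tmp/git-{blob,dump}.dat | git fast-import
```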

At this point, if you’re going to continue using this new git repository for work, remember to set your user.name, user.email and color.ui options.

Now you’re ready to push the repo back to SourceForge. I tested that disabling so-called developer access to the repo in the SourceForge Project Member settings page does in fact prevent write access, as expected.

git push origin master

If you had tags on the CVS repo (git tag -l), they’ll have been imported as lightweight tags. Best practice is always to use annotated tags, so this short script will promote them for you:

git config user.name "Firstname Lastname"
git config user.email "me@example.com"
git tag -l | while read ver;
  do git checkout $ver;
  git tag -d $ver;
  GIT_COMMITTER_DATE="$(git show --format=%aD | head -1)" git tag -a $ver -m "prep for $ver release";
  done
git checkout master

Verify the tags are as you want, using something like:

git tag -l | while read tag; do git show $tag | head -3; echo; done
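
Another quick check is to ask git what kind of object each tag name points at: git cat-file -t reports "tag" for an annotated tag and "commit" for a lightweight one. A minimal sketch, with a made-up function name:

```shell
# Print each tag followed by "annotated" or "lightweight".
# tag_kinds is a hypothetical helper name.
tag_kinds() {
  git tag -l | while IFS= read -r tag; do
    # An annotated tag is its own object of type "tag";
    # a lightweight tag resolves straight to a commit
    case "$(git cat-file -t "$tag")" in
      tag) echo "$tag annotated" ;;
      *)   echo "$tag lightweight" ;;
    esac
  done
}
```

Any tag still reported as lightweight after the promotion loop was missed and can be redone by hand.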

And then push them to the repository with:

git push --tags

Something you might want to do is set up a hook that sends an email after each push. For this you SSH to SourceForge, and if you have multiple projects remember to connect to the right one!

ssh -t USERNAME,PROJECTNAME@shell.sourceforge.net create
cd /home/scm_git/P/PR/PROJECTNAME
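
In that path, P and PR stand for the first one and first two letters of the project name. A small sketch that builds the path from a project name follows; the function name is hypothetical, and the layout is simply the one described in this post, which SourceForge may change.

```shell
# Build the repository directory path from a project name,
# following the /home/scm_git/P/PR/PROJECTNAME layout used in this post.
# sf_git_path is a hypothetical helper name.
sf_git_path() {
  project=$1
  p=$(printf '%s' "$project" | cut -c1)      # first letter
  pr=$(printf '%s' "$project" | cut -c1-2)   # first two letters
  printf '/home/scm_git/%s/%s/%s\n' "$p" "$pr" "$project"
}

# Usage:
#   cd "$(sf_git_path PROJECTNAME)"
```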

Download the post-receive-email script and place it in the hooks subdirectory; make it executable. Also set the permissions to have group-write, so your project colleagues can alter it if required. Set the necessary git options to allow the script to email someone after a commit. Season to taste.

curl -L http://tinyurl.com/git-post-commit-email > hooks/post-receive
chmod +x hooks/post-receive
chmod g+w hooks/post-receive
git config hooks.emailprefix "[git push]"
git config hooks.emailmaxlines 500
git config hooks.envelopesender noreply@sourceforge.net
git config hooks.showrev "t=%s; printf 'http://PROJECTNAME.git.sourceforge.net/git/gitweb.cgi?p=PROJECTNAME/REPONAME;a=commitdiff;h=%%s' \$t; echo;echo; git show -C \$t; echo"
git config hooks.mailinglist PROJECTNAME-COMMITS@lists.sourceforge.net
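
The showrev value deserves a closer look, because its quoting is easy to get wrong. As far as I can tell (treat this as an assumption about the hook script), the value is used as a printf format with the revision as its argument, and the result is then evaluated as shell. So %s becomes the revision, %%s becomes a literal %s for the inner printf, and the stored value needs a literal $t after the URL. If the shell expands $t to nothing while you run git config inside double quotes, the emailed URL stops at h=. The sketch below reproduces both cases with a made-up URL and function name:

```shell
# Expand a showrev-style format for one revision, then run the result.
# Assumption: this mirrors how the hook script treats hooks.showrev.
# expand_showrev and the example.invalid URL are hypothetical.
expand_showrev() {
  format=$1
  rev=$2
  # printf turns %s into the revision and %%s into a literal %s
  eval "$(printf "$format" "$rev")"
}

# With $t kept literal (single quotes), the URL ends in the revision
expand_showrev 't=%s; printf "http://example.invalid/?h=%%s" $t; echo' abc123

# With $t lost, the inner printf has no argument and the URL ends at h=
expand_showrev 't=%s; printf "http://example.invalid/?h=%%s"; echo' abc123
```

Checking the stored value with git config hooks.showrev should show $t in both places.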

Remember to subscribe noreply@sourceforge.net to your announce list, if needed. Finally, set a friendly description on the repository for use by the git web-based repo browser:

echo 'PROJECTNAME git repository' > description

One other thing I did was enable an SSH key on my SourceForge account, as this makes life with SSH-based git much smoother :-) If you have the need to create additional git repositories, or even to replace the one created automatically, then it’s just a case of issuing the git command:

cd /home/scm_git/P/PR/PROJECTNAME
git --git-dir=REPONAME init --shared=all --bare

Good luck with your own migrations, and happy coding!


7 Responses to Migrate SourceForge CVS repository to git

  1. Hi,

    this helped me a lot to set up my SF git repos. However, my hook script is still giving me problems displaying the right gitweb URL. If I use the line you suggested for showrev, then I get a truncated URL in the email, i.e., nothing beyond the h=, like so:

    http://PROJECTNAME.git.sourceforge.net/git/gitweb.cgi?p=PROJECTNAME/REPONAME;a=commitdiff;h=

    Thanks for this great material.

    • Thanks for the feedback :-)

      Sorry to hear you had a little trouble with the commit messages. I suggest you SSH in and cat the git config file and check it really has the right command in, with all the quoting, escaping etc. For instance the one we have on the Netdisco project looks like:


      showrev = "t=%s; printf 'http://netdisco.git.sourceforge.net/git/gitweb.cgi?p=netdisco/netdisco;a=commitdiff;h=%%s' $t; echo;echo; git show -C $t; echo"

  2. Dale Visser says:

    Nice post!

    Where can I find the cvs2git default options file? I installed it using “sudo apt-get install” on Ubuntu.

    • Hi Dale,

      I recommend getting the latest code for the cvs2git tool as shown in my post, and then the options file will be in the downloaded tree, as shown.

      However, in the Ubuntu cvs2svn package (which I guess is what you installed), you’ll find an example file in /usr/share/doc/cvs2svn/examples/, which you should copy and edit.

      Good luck with your migration!

      • Dale Visser says:

        Oliver:

        Thanks! I realized shortly after I posted my question that your command-line example in the post showed precisely how to do it. I have 3 cvs repositories to convert over to Git, and my largest and most important one is completed already, thanks to your detailed post.

        Best regards,
        Dale

  3. Rafa Carmona says:

    Fantastic job! Thank you.

  4. Benj says:

    To make it work, I had to use the following path instead of the path mentioned in your post.

    /home/git/p/PROJECTNAME/MOUNTPOINT.git/

    Some additional information here : https://sourceforge.net/p/allura/tickets/5470/

    Cheers,
    Benj