A succesful Git branching model considered harmful
Update 2018-07. The branching model described here is called trunk based development. I and other people who I collaborated with did not know about the articles that used this name. Nowadays there are excellent web resources about the subject, like trunkbaseddevelopment.com. They have a lot of material about this and other key subjects revolving around the area of efficient software development.
When people start to use git and get introduced to branches and to the ease of branching, they may do couple of Google searches and very often end up on a blog post about A successful Git branching model. The biggest issue with this article is that it comes up as one of the first ones in many git branching related searches when it should serve as a warning how not to use branches in software development.
What is wrong with “A successful Git branching model”?
To put it bluntly, this type of development approach where you use shared remote branches for everything and merge them back as they are is much more complicated than it should be. The basic principle in making usable systems is to have sane defaults. This branching model makes that mistake from the very beginning by not using the master
branch for something that a developer who clones the repository would expect it to be used, development.
Using individual (long lived) branches for features also make it harder to ensure that everything works together when changes are merged back together. This is especially pronounced in today’s world where continuous integration should be the default practice of software development regardless how big the project is. By integrating all changes together regularly you’ll avoid big integration issues that waste a lot of time to resolve, especially for bigger projects with hundreds or thousands of developers. This type of development practice where every feature is developed in its own shared remote branch drives the process naturally towards big integration issues instead of avoiding them.
Also in “A successful Git branching model” merge commits are encouraged as the main method for integrating changes. I will explain next why merge commits are bad and what you will lose by using them.
What is wrong with merge commits?
“A successful Git branching model” talks how non-fast-forward merge commits can be thought as a way to keep all commits related to a certain feature nicely in one group. Then if you decide that a feature is not for you, you can just revert that one commit and have the whole feature removed. I would argue that this is a really rare situation that you revert a feature or that you even get it done completely right on the first try.
Merges in git very often create additional commits that begin with the message that looks like following: “Merge branch 'some-branch' of git://git.some.domain/repository/
”. That does not provide any value when you want to see what has actually changed. You need go to the commit message and read what happens there, probably in the second paragraph. Not to mention going back in history to the branch and trying to see what happens in that branch.
Having non-linear history also makes git bisect harder to do when issues are only revealed during integration. You may have both of the branches good individually but then the merge commit fails because your changes don’t conflict. This is not even that hard to encounter when one developer changes some internal interface and other developer builds something new based on the old interface definition. These kind of can be easy or hard to figure out, but having the history linear without any merge commits could immediately point out the commit that causes issues.
Something more simple
Let me show a much more simple alternative, that we can call the cactus model. It gets the name from the fact that all branches branch out from a wide trunk (master
branch) and never get merged back. Cactus model should reflect much better the way that comes up naturally when working with git and making sure that continuous integration principles are used.
In figure 1 you can see the principle how the cactus branching model works and following sections explain the reasoning behind it. Some principles shown here may need Gerrit or similar integrated code review and repository management system to be fully usable.
All development happens on the master branch
master
branch is the default that is checked out after git clone
. So why not also have all development also happen there? No need to guess or needlessly document the development branch when it is the default one. This only applies to the central repository that is cloned and kept up to date by everyone. Individual developers are encouraged to use local branches for development but avoid shared remote branches.
Developers should git rebase
their changes regularly so that their local branches would follow the latest origin/master
. This is to make sure that we do not develop on an outdated baseline.
Using local branches
Cactus model does not to discourage using branches when they are useful. Especially an individual developer should use short lived feature branches in their local repository and integrate them with the origin/master
whenever there is something that can be shared with everyone else. Local branches are just to make it more easy to move between features while commits are tested or under code review.
Figures 2 and 3 show the basic principle of local branches and rebases by visualizing a tree state. In figure 2 we have a situation with two active local development branches (fuchsia circles) and one branch that is under code review (blue circles) and ready to be integrated to origin/master
(yellow circles). In figure 3 we have updated the origin/master
with two new commits (yellow-blue circles) and submitted two commits for code review (blue circle) and consider them to be ready for integration. As branches don’t automatically disappear from the repository, the integrated commits are still in the local repository (gray circles), but hopefully forgotten and ready to be garbage collected.
Shared remote branches
As a main principle, shared remote branches should be avoided. All changes should be made available on origin/master
and other developers should build their changes on top of that by continuously updating their working copies. This ensures that we do not end up in integration hell that will happen when many feature branches need to be combined at once.
If you use staged code review system, like Gerrit or github, then you can just git fetch
the commit chain and build on top of that. Then git push
your changes to your own repository to some specific branch, that you have hopefully rebased on top of the origin/master
before pushing.
Releases are branched out from origin/master
Releases get their own tags or branches that are branched out from origin/master
. In case we need a hotfix, just add that to the release branch and cherry-pick it to the master branch, if applicable. By using some specific tagging and branching naming scheme should enable for automatic releases but this should be completely invisible to developers in their daily work.
There are only fast-forward merges
git merge
is not used. Changes go to origin/master
by using git rebase
or git cherry-pick
. This avoids cluttering the repository with merge commits that do not really provide any real value and avoids christmas tree look on the repository. Rebasing also makes the history linear, so that git bisect
is really easy to use for finding regressions.
If you are using Gerrit, you can also use cherry-pick submit strategy. This simply enables putting a collection of commits to origin/master
at any desired order instead of having to settle for the order decided when commits were first put for a code review.
Concluding remarks
Git is really cool as a version control system. You can do all kinds of nifty stuff with it really easily that was hard or impossible to do. Branches are just pointers to certain commits and that way you can create a branch really cheaply from anything. Also you can do all kinds of fancy merges and this makes using and mixing branches very easy. But as with all tools, branches should be used appropriately to make it more easy for developers to their daily development tasks, not harder by default.
I have also seen these kind of scary development practices to be used in projects with hundreds of developers when moving to git from some other version control systems. Some organizations even take this as far as they fully remove the master
branch from the central repository and create all kinds of questions and obstacles by not having a sane default that is expected by anyone who has used git anywhere else.