When people start to use git and get introduced to branches and to the ease of branching, they may do a couple of Google searches and very often end up on a blog post about A successful Git branching model. The biggest issue with this article is that it comes up as one of the first results in many git branching related searches when it should serve as a warning of how not to use branches in software development.
What is wrong with “A successful Git branching model”?
To put it bluntly, this type of development approach where you use
shared remote branches for everything and merge them back as they are
is much more complicated than it should be. The basic principle in
making usable systems is to have sane defaults. This branching model
gets that wrong from the very beginning by not using the master
branch for what a developer who clones the repository would
expect it to be used for: development.
Using individual (long lived) branches for features also makes it harder to ensure that everything works together when changes are merged back together. This is especially pronounced in today’s world where continuous integration should be the default practice of software development regardless of how big the project is. By integrating all changes together regularly you’ll avoid big integration issues that waste a lot of time to resolve, especially for bigger projects with hundreds or thousands of developers. This type of development practice where every feature is developed in its own shared remote branch drives the process naturally towards big integration issues instead of avoiding them.
Also in “A successful Git branching model” merge commits are encouraged as the main method for integrating changes. I will explain next why merge commits are bad and what you will lose by using them.
What is wrong with merge commits?
“A successful Git branching model” talks about how non-fast-forward merge commits can be thought of as a way to keep all commits related to a certain feature nicely in one group. Then if you decide that a feature is not for you, you can just revert that one commit and have the whole feature removed. I would argue that it is really rare to revert a whole feature, or even to get a feature completely right on the first try.
Merges in git very often create additional commits that begin with a
message that looks like the following:
“Merge branch 'some-branch' of git://git.some.domain/repository/”.
That does not provide any value when you want to see what has actually
changed. You need to go to the commit message and read what happens there,
probably in the second paragraph. Not to mention going back in history to
the branch and trying to see what happens in that branch.
Having a non-linear history also makes git bisect harder to use when issues are only revealed during integration. Both branches may be good individually, yet the merge commit fails even though the changes do not conflict textually. This is not even that hard to encounter: one developer changes some internal interface and another developer builds something new based on the old interface definition. These kinds of issues can be easy or hard to figure out, but a linear history without any merge commits could immediately point out the commit that causes them.
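The interface scenario can be reproduced in a throwaway repository. The following is only a minimal sketch: the file names (lib.sh, a.sh, b.sh, check.sh), the function names greet and salute, and the branch names are all made up for the illustration. One branch renames an internal function, the other adds a new caller of the old name. Both branches pass the check on their own, and once the second branch is rebased on top of the first, git bisect run should point at the rebased commit.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: repository path, file names and function names are
# invented for this illustration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master

# Stand-in for a test suite: load the library, then run every caller that exists.
cat > check.sh <<'EOF'
. ./lib.sh
for f in a.sh b.sh; do
    if [ -f "$f" ]; then . "./$f" || exit 1; fi
done
EOF
echo 'greet() { echo hello; }' > lib.sh
echo 'greet >/dev/null' > a.sh
git add .
git commit -q -m "Initial library and caller"
git tag base

# Developer 1 changes the internal interface and updates the existing caller.
git checkout -q -b rename-interface
echo 'salute() { echo hello; }' > lib.sh
echo 'salute >/dev/null' > a.sh
git commit -q -am "Rename greet to salute"
sh check.sh   # passes on this branch

# Developer 2 builds a new caller on top of the old interface.
git checkout -q -b new-caller base
echo 'greet >/dev/null' > b.sh
git add b.sh
git commit -q -m "Add second caller"
sh check.sh   # passes on this branch as well

# Linearize: replay the new caller on top of the interface change.
git rebase -q rename-interface

# Bisect between the known good base and the now broken tip.
git bisect start new-caller base >/dev/null
git bisect run sh check.sh >/dev/null 2>&1
first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

git log -1 --format='first bad commit: %s' "$first_bad"
```

With a merge commit instead of the rebase, the breakage would only exist in the merge itself, and neither parent would look wrong on its own.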
Something simpler
Let me show a much simpler alternative that we can call the
cactus model. It gets the name from the fact that all branches branch
out from a wide trunk (master branch) and never get merged
back. The cactus model should reflect much better the way of working that
comes up naturally when working with git and making sure that continuous
integration principles are used.
In figure 1 you can see the principle of how the cactus branching model works, and the following sections explain the reasoning behind it. Some principles shown here may need Gerrit or a similar integrated code review and repository management system to be fully usable.
All development happens on the master branch
The master branch is the default that is checked out after
clone. So why not have all development happen there as well? No
need to guess or needlessly document the development branch when it is
the default one. This only applies to the central repository that is
cloned and kept up to date by everyone. Individual developers are
encouraged to use local branches for development but avoid shared
ones, and to git rebase their changes regularly so that their
local branches follow the latest origin/master. This is to
make sure that we do not develop on an outdated baseline.
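As a rough sketch of that routine, the script below builds a throwaway central repository and one developer clone, lets the central master move ahead, and then brings the local branch on top of the latest origin/master with git fetch and git rebase. All paths, the branch name feature and the file names are assumptions made for the example.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: every path, branch and file name here is made up.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# save <repo> <file> <message>: write a file and commit it.
save() {
    echo "$3" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "$3"
}

# Central repository with one commit on master.
git init -q "$sandbox/seed"
git -C "$sandbox/seed" symbolic-ref HEAD refs/heads/master
save "$sandbox/seed" README "Initial commit"
git clone -q --bare "$sandbox/seed" "$sandbox/central.git"

# A developer starts a local branch from the current master.
git clone -q "$sandbox/central.git" "$sandbox/dev"
git -C "$sandbox/dev" checkout -q -b feature
save "$sandbox/dev" feature.txt "Add feature"

# Meanwhile somebody else lands a change on the central master.
save "$sandbox/seed" other.txt "Unrelated change"
git -C "$sandbox/seed" push -q "$sandbox/central.git" master

# Bring the local branch on top of the latest origin/master.
git -C "$sandbox/dev" fetch -q origin
git -C "$sandbox/dev" rebase -q origin/master

# The branch now contains origin/master plus the one local commit.
git -C "$sandbox/dev" log --oneline feature
```

In daily work only the last fetch and rebase pair matters; the rest of the script just sets up a situation where the baseline has moved.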
Using local branches
The cactus model does not discourage using branches when they are
useful. An individual developer especially should use short lived
feature branches in their local repository and integrate them with
origin/master whenever there is something that can be shared with
everyone else. Local branches are just there to make it easier to move
between features while commits are tested or under code review.
Figures 2 and 3 show the basic
principle of local branches and rebases by visualizing a tree
state. In figure 2 we have a situation with two
active local development branches (fuchsia circles) and one branch
that is under code review (blue circles) and ready to be integrated to
origin/master (yellow circles). In figure 3 origin/master has been
updated with two new commits (yellow-blue circles), and two more commits
have been submitted for code review (blue circles) and are considered
ready for integration. As branches don’t automatically disappear from the
repository, the integrated commits are still in the local repository
(gray circles), but hopefully forgotten and ready to be
Shared remote branches
As a main principle, shared remote branches should be avoided. All
changes should be made available on
origin/master and other
developers should build their changes on top of that by continuously
updating their working copies. This ensures that we do not end up in
integration hell that will
happen when many feature branches need to be combined at once.
If you use a staged code review system, like Gerrit or GitHub, then you
can git fetch the commit chain and build on top of that. Then
git push your changes to your own repository to some specific
branch that you have hopefully rebased on top of the latest origin/master.
Releases are branched out from origin/master
Releases get their own tags or branches that are branched out from
origin/master. In case we need a hotfix, just add that to the
release branch and cherry-pick it to the master branch, if
applicable. A specific tag and branch naming scheme
should enable automatic releases, but this should be completely
invisible to developers in their daily work.
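A hotfix round trip could look roughly like the sketch below. The naming scheme (release-1.0, v1.0.0, v1.0.1) and the file names are only an example, not something the cactus model prescribes. The fix is committed on the release branch and then carried over to master with git cherry-pick, so no merge commit is created.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: the branch, tag and file names are an example scheme.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master

# save <file> <message>: write a file and commit it on the current branch.
save() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

save app.txt "Feature 1"

# Cut the release from master.
git branch release-1.0
git tag v1.0.0

# Development continues on master.
save more.txt "Feature 2"

# The hotfix is made on the release branch and tagged there...
git checkout -q release-1.0
save fix.txt "Fix crash on startup"
git tag v1.0.1

# ...and then carried over to master as a new commit, without a merge.
# -x records the original commit id in the new commit message.
git checkout -q master
git cherry-pick -x release-1.0 >/dev/null

git log --oneline --graph --all
```

The release branch and master stay separate lines in the graph; only the content of the fix travels between them.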
There are only fast-forward merges
git merge is not used. Changes go to origin/master by using
git cherry-pick. This avoids cluttering the repository
with merge commits that do not provide any real value and avoids the
christmas tree look in the repository. Rebasing also makes the history
linear, so that git bisect is really easy to use for finding regressions.
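Without a code review system in between, the same effect can be approximated by hand. The sketch below assumes a throwaway central repository and a local branch named topic (both hypothetical): the local master is reset to the latest origin/master, the topic commits are replayed with git cherry-pick, and the result is pushed as a plain fast-forward.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: every path, branch and file name here is made up.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# save <repo> <file> <message>: write a file and commit it.
save() {
    echo "$3" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "$3"
}

# Central repository with one commit on master.
git init -q "$sandbox/seed"
git -C "$sandbox/seed" symbolic-ref HEAD refs/heads/master
save "$sandbox/seed" README "Initial commit"
git clone -q --bare "$sandbox/seed" "$sandbox/central.git"

# A developer clone with two commits on a local topic branch.
git clone -q "$sandbox/central.git" "$sandbox/dev"
git -C "$sandbox/dev" checkout -q -b topic
save "$sandbox/dev" one.txt "Topic: step 1"
save "$sandbox/dev" two.txt "Topic: step 2"

# Somebody else lands a change on the central master in the meantime.
save "$sandbox/seed" other.txt "Unrelated change"
git -C "$sandbox/seed" push -q "$sandbox/central.git" master

# Integrate: move local master to the latest origin/master, replay the
# topic commits on top of it, and push the result as a fast-forward.
cd "$sandbox/dev"
git fetch -q origin
git checkout -q -B master origin/master
git cherry-pick origin/master..topic >/dev/null
git push -q origin master

# The shared history has no merge commits and stays linear.
merges=$(git rev-list --merges --count origin/master)
echo "merge commits on origin/master: $merges"
git log --oneline --graph origin/master
```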
If you are using Gerrit, you can also use the cherry-pick submit
strategy. This simply enables putting a collection of commits to
origin/master in any desired order instead of having to settle for
the order decided when the commits were first put up for code review.
Git is really cool as a version control system. You can really easily do all kinds of nifty stuff with it that used to be hard or impossible to do. Branches are just pointers to certain commits, and that way you can create a branch really cheaply from anything. You can also do all kinds of fancy merges, and this makes using and mixing branches very easy. But as with all tools, branches should be used appropriately to make it easier for developers to do their daily development tasks, not harder by default.
I have also seen these kinds of scary development practices used
in projects with hundreds of developers when moving to git from some
other version control system. Some organizations even take this so
far as to fully remove the master branch from the central repository,
which creates all kinds of questions and obstacles by not having a sane
default that is expected by anyone who has used git anywhere else.