
Automating Your Java Project Workflow with a Modified Gitflow Branching Model

Key Takeaways

  • Gitflow is a collaborative branching model that attempts to exploit the power, speed and simplicity of Git branching. The technique worked well in the situation we describe here, but others have noted that using Gitflow brings its own challenges.
  • Documentation for using Gitflow within a deployment pipeline is nebulous at best.
  • Features are isolated within branches. You can manage your own feature changes in isolation. This approach is in contrast with trunk-based development, where every developer commits to the master branch at least once every 24 hours.
  • Feature segregation using isolated branches lets you decide which features to include within each release. The tradeoff here can be challenging merges.
     

Update 13 Feb 2019: The initial rendition of this article met with a large response, mostly positive, some not so much. The main point of contention was our use of the term "continuous delivery" in an environment where releases have a manual component. If you are on a team that deploys hundreds of releases per day, our framework might not work for you. However, if you are in a tightly regulated industry like finance, as we are, where releases are more controlled, and you want to make the best of feature branching, automated integration, automated deployment, and versioning, then this solution might just work for you as well as it does for us.

 

A thousand years ago I was at a technology conference, where I came upon a newfangled gizmo on exhibit called “Git”. I learned that it was a next generation source control tool, and my initial reaction was - why do we need that, we already have SVN? That was then. Today development teams are moving to Git in droves, and a huge ecosystem has evolved around middleware and plugins.

Gitflow is a collaborative branching model that attempts to exploit the power, speed, and simplicity of Git branching. As written before on InfoQ, this approach does come with its own set of challenges, particularly in relation to continuous integration, but this was exactly the problem we were looking to solve. Introduced by Vincent Driessen in his classic 2010 blog post “A Successful Git Branching Model”, Gitflow takes the pain out of collaborative development by allowing teams to isolate new development from completed work within individual branches, allowing you to cherry-pick features for release, while still encouraging frequent commits and automated testing. As a by-product we have found that this produces cleaner code by promoting regular code reviews during merges, even self-code reviews, thereby exposing bugs, opportunities for refactoring, and optimizations.

But when it comes to implementing Gitflow in an automated deployment pipeline, the particulars are very specific to your development environment, and there are infinite possibilities. Consequently the documentation is sparse. Given the well-known branch names - master, develop, feature, etc., which branches do we build, which do we test, which do we deploy in our team as snapshots, which deploy releases, and how do we automate deployments to Dev, UAT, Prod, etc.?

These are frequently asked questions at conferences where we have presented, and in this article we would like to share a solution we have developed in our own work at a large financial technology firm.

The project described here uses Java and Maven, but we believe that any environment could be similarly adapted. We use GitLab CI with custom run scripts, but Jenkins or a GitHub CI plugin could also be used; we use Jira for issue tracking, IntelliJ IDEA as our IDE, Nexus as our dependency repository, and we use Ansible for our automated deployment, but any similar tools can be substituted.

First, let’s see how we got here from there.

Evolution

In antediluvian times, developers would spend weeks or months building an application feature, whereupon they would hand off the “completed” work to an “integrator”, a well-meaning and dedicated human chap, who would take all such features, integrate them, resolve conflicts, and prepare them for release. The integration process was a daunting, error-fraught endeavor, unpredictable in schedule and consequence, breeding the well-deserved appellation “integration hell”. Then around the turn of the century Kent Beck released his seminal book “Extreme Programming Explained”, which advocated the concept of “continuous integration”: the practice of each developer building and integrating code into a master/mainline branch and running tests in an automated fashion, every few hours, and certainly no less often than once a day. Not long after that, Martin Fowler’s Thoughtworks open-sourced Cruise Control, one of history’s first CI automation tools.

Enter Gitflow

Gitflow advocates the use of feature branches for developing individual features, and separate branches for integration and release, as we will see. The following graphic reprinted from Vincent Driessen’s blog is now very familiar to Git development teams.

As Git users, we are all familiar with the branch called “master”; this is the mainline or “trunk” branch created by Git by default when we first initialize any Git project. Before adopting Gitflow, you’ve most likely been committing to your master branch. 

Kicking Off Gitflow

To get a project started with Gitflow, there is a one-time initialization step where you create a branch off master called “develop”. From then on, develop becomes the catch-all branch, where all of your code is deposited and tested, essentially becoming your primary “integration” branch.

As a developer, you will never commit directly to the develop branch, and you will never, never commit directly to master. Master is referred to as a “stable” branch - containing only work that is production ready, either released or ready for release. If it’s in master, it is a past or future production release. 

The develop branch is referred to as “unstable”; perhaps a bit of a misnomer - it is stable in that it contains code that is destined for release; and it must compile and tests must pass; just that it contains work that may or may not be complete, and therefore “unstable”. 

So where do we do our work? That’s where the rest of the picture starts to materialize: 

You get a new Jira issue to work on. Immediately you create a feature branch, typically from develop if that is at a stable point, or else from master.

We have agreed on the convention of naming our feature branches as “feat-” followed by the Jira issue number. (If there is more than one Jira issue, just use the Epic or Parent task, or one of the main issue numbers, followed by a very brief description of the feature.) For example “feat-SDLC-123-add-name-field”. The “feat-” prefix provides a pattern that the CI server can use to identify this as a feature branch. We will see why this is important in a few paragraphs. In this example, SDLC-123 is our Jira issue number, which gives us a visual link to the driving issue, and the remaining description gives us a brief narrative describing the feature.
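To make the naming convention concrete, here is a minimal Java sketch that builds a branch name from a Jira key and a short description, then checks it against a "feat-" pattern. The class and method names are hypothetical, and the pattern is our assumption of what a CI-side check could look like; note that it lists the hyphen explicitly, because `\w` on its own does not match `-`.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical helper; class and method names are illustrative only. */
final class FeatureBranches {
    // Assumed pattern: "feat-" followed by word characters or hyphens.
    // \w alone does not match '-', so the hyphen is listed explicitly.
    static final Pattern FEATURE = Pattern.compile("^feat-[\\w-]+$");

    /** Builds "feat-<JIRA-KEY>-<slug>" from an issue key and a short description. */
    static String nameFor(String jiraKey, String description) {
        String slug = description.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "-")  // collapse other characters into single hyphens
                .replaceAll("^-+|-+$", "");     // drop leading and trailing hyphens
        return "feat-" + jiraKey + "-" + slug;
    }

    /** True if the branch name follows the "feat-" convention. */
    static boolean isFeatureBranch(String branch) {
        return FEATURE.matcher(branch).matches();
    }

    public static void main(String[] args) {
        String branch = nameFor("SDLC-123", "Add name field");
        System.out.println(branch + " is a feature branch: " + isFeatureBranch(branch));
    }
}
```

Keeping the Jira key in its original case preserves the visual link to the issue, while the description is lowercased and hyphenated so the name stays shell-friendly.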

Development now proceeds in parallel, with everyone working on their feature branches concurrently, some teams working on the same branch completing that feature, others working on different features. We have found that by frequently merging to the develop branch, the team has reduced the amount of time spent in “merge hell”.

Releases, Snapshots, and Shared Repositories

Let’s spend a few words to clarify this. In most enterprises, there is a single dependency repo such as Sonatype Nexus. This repo contains two kinds of binaries. “SNAPSHOT” binaries are usually named using a semver (three-part dot-delimited) version, followed by the word “-SNAPSHOT”, (e.g. 1.2.0-SNAPSHOT). Release binaries are versioned with the same name except without the “-SNAPSHOT” suffix (e.g. 1.2.0). Snapshot builds are unique in that any time you build a binary with that snapshot version, it replaces any previous binary that had that same name. Release builds are not that way; once you build a release build you can take it to the bank that the binary associated with that version will never be changed in Nexus.

Now, imagine you are working on feature X and your companion team is working on feature Y. You both branched off develop at the same time, so you both have the same base version in your POM (say 1.2.0-SNAPSHOT). Now let’s say you run your build and deploy your feature branch to Nexus, and soon after that, your companion team runs their build and deploys that to Nexus. In such a scenario you would never know which feature binary was in Nexus, since 1.2.0-SNAPSHOT would refer to two different binaries corresponding to the two distinct feature branches (or more if there are more such feature branches!). This is a very commonly occurring conflict.
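The conflict can be illustrated with a toy model. The sketch below is a hypothetical in-memory stand-in for an artifact repository, not how Nexus is implemented; it only assumes the two rules stated above, namely that a snapshot version may be overwritten and a release version may be written once.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory stand-in for an artifact repository (hypothetical, for illustration). */
final class ToyRepository {
    // version -> name of the branch the stored binary was built from
    private final Map<String, String> artifacts = new HashMap<>();

    /** Snapshots replace any earlier binary; a release version is accepted only once. */
    void deploy(String version, String builtFrom) {
        boolean snapshot = version.endsWith("-SNAPSHOT");
        if (!snapshot && artifacts.containsKey(version)) {
            throw new IllegalStateException("release " + version + " already exists");
        }
        artifacts.put(version, builtFrom);
    }

    /** Returns the branch whose build currently sits under this version, or null. */
    String builtFrom(String version) {
        return artifacts.get(version);
    }
}
```

If two feature branches both deploy 1.2.0-SNAPSHOT, `builtFrom("1.2.0-SNAPSHOT")` reports only whichever deployed last, which is exactly the ambiguity described above.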

GitLab CI

Yet we instruct teams to commit often and early! So how do we avoid such conflicts? The answer is to tell GitLab CI to build, but just not deploy to Nexus, by associating the “feat-” branches with a Maven verify lifecycle step (which builds locally and runs all tests), rather than a Maven deploy step (which would send the snapshot binary to Nexus).

GitLab CI is configured by defining a file (called .gitlab-ci.yml) in the project root, which contains the exact CI/CD execution steps. The beauty of this feature is that the run script is then associated with your commit, so you can vary it based on a commit or a branch.

We have configured GitLab CI with the following job, containing regex and script for building feature branches:

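feature-build:
  stage: build
  script:
    - mvn clean verify sonar:sonar
  only:
    # "feat-" followed by word characters or hyphens, e.g. feat-SDLC-123-add-name-field
    # (\w alone does not match the hyphens in the description)
    - /^feat-[\w-]+$/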
Teams are encouraged (nay, mandated!) to commit frequently. Each commit runs your tests in isolation, ensuring that your current feature work has not broken anything, and allowing you to add tests to the code you have changed.

Coverage Driven Development

Now would be a good time to discuss test coverage. IntelliJ IDEA has a “coverage” run mode that allows you to run your test code with coverage (either in debug or run mode), and paints your margins green or pink, depending on whether that code was covered or not. You can (and should) also add a coverage plugin (such as JaCoCo) to Maven, so that you can receive coverage reports as part of your integration build. If you are using an IDE that does not color the margins, you can pull these reports to locate swaths of uncovered code. 

[One sidebar note on this - unfortunately there are still many professional development teams who although pontificating an intractable orthodoxy on automation and development, yet for one reason or another have been remiss in amplifying their test coverage. Now, we won’t be the ones to tell such teams to go back and add tests to every uncovered piece of code; but as good developer citizens, we consider it our duty to introduce tests at least for the code we have added or modified. By reviewing the coverage-color coded margins against the code we introduced, we can quickly identify opportunities for introducing new tests.]

Tests are executed as part of the Maven build. The Maven test phase executes unit tests (by default, classes in files whose names start with Test or end with Test.java, Tests.java, or TestCase.java). The Maven verify phase (which requires the Maven Failsafe plugin) also executes integration tests. A call to mvn verify triggers the build, followed by a pipeline of lifecycle phases including test and verify. We also recommend installing SonarQube and the Maven SonarQube plugin for static code analysis during the test phase. In our model, every branch commit or merge executes all of these tests.
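As a rough illustration of how file names decide where a test runs, the sketch below mirrors the default include patterns of the Surefire plugin (unit tests) and the Failsafe plugin (integration tests). The class and method names are hypothetical, and a project can override these defaults in its POM, so treat this as a reading aid, not as the plugins' actual implementation.

```java
import java.util.regex.Pattern;

/** Hypothetical classifier mirroring the plugins' default file-name includes. */
final class TestNaming {
    // Surefire defaults: Test*.java, *Test.java, *Tests.java, *TestCase.java
    private static final Pattern UNIT =
            Pattern.compile("^(Test.*|.*Test|.*Tests|.*TestCase)\\.java$");
    // Failsafe defaults: IT*.java, *IT.java, *ITCase.java
    private static final Pattern INTEGRATION =
            Pattern.compile("^(IT.*|.*IT|.*ITCase)\\.java$");

    /** Returns the lifecycle phase in which a file would be picked up, or "none". */
    static String phaseFor(String fileName) {
        if (UNIT.matcher(fileName).matches()) return "test";
        if (INTEGRATION.matcher(fileName).matches()) return "integration-test";
        return "none";
    }
}
```

Failsafe runs the integration tests in the integration-test phase and checks their results in verify, which is why mvn verify is enough to exercise both kinds of test.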

Integrating Our Work

Back to Gitflow. We have now done some more work on our feature, we have been committing to our feature branch, but in the spirit of “integration” we want to ensure it plays nicely with all of the other team feature commits. So by policy we agree that all development teams merge to the develop branch at least once per day.

We also have a policy, enforced inside GitLab, that we cannot merge into develop without a code review in the form of a merge request.

Depending on your SDLC policy you can force developers to do a code review with someone else by endowing your merges with a list of approvers. Or you can enforce a more relaxed strategy by allowing developers to perform their own code reviews after viewing their own merge request. This strategy works nicely in that it encourages developers to at least review their own code, but like any honor system, it comes with obvious risks. Note that since the binary will never be deployed to Nexus or otherwise shared, the POM version contained in the develop branch is irrelevant. You can call it 0.0.0-SNAPSHOT, or just leave the original POM version from whence it was branched.

Finally, after some number of days, the feature is complete, it is fully merged into develop and declared stable, and there are a number of such features that are ready for release. Remember that at this point, we have run our verification tests on every commit, but we have not yet deployed so much as a SNAPSHOT to Nexus. That is then our next step.

At this point we branch a release branch off develop. But in a slight departure from traditional Gitflow, we don’t call it release; rather we name the branch after the release version number. In our case, we use three-part semantic versioning, so for a major release (new features or breaking changes) we increment the major (first) number, for a minor release the minor (second) number, and for a patch the third. So if the previous release was 1.2.0, the coming release might be 1.2.1, and the snapshot POM version would then be 1.2.1-SNAPSHOT. Our branch would accordingly be named 1.2.1.
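The version arithmetic can be sketched in a few lines of Java. The names below are hypothetical, and resetting the lower-order numbers to zero on a major or minor bump is the usual semver convention, which we assume here since the text does not spell it out.

```java
/** Hypothetical helper for choosing the next release branch name. */
final class SemVer {
    enum Bump { MAJOR, MINOR, PATCH }

    /** Returns the next release version, e.g. 1.2.0 with a PATCH bump gives 1.2.1. */
    static String next(String previous, Bump bump) {
        String[] parts = previous.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.patch: " + previous);
        }
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        int patch = Integer.parseInt(parts[2]);
        switch (bump) {
            case MAJOR: return (major + 1) + ".0.0";              // assumed: lower parts reset
            case MINOR: return major + "." + (minor + 1) + ".0";  // assumed: patch resets
            default:    return major + "." + minor + "." + (patch + 1);
        }
    }

    /** The POM version used while the release branch is still being tested. */
    static String snapshotOf(String release) {
        return release + "-SNAPSHOT";
    }
}
```

For the example in the text, `SemVer.next("1.2.0", SemVer.Bump.PATCH)` yields the branch name, and `snapshotOf` of that value yields the matching POM version.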

Configuring the Pipeline

We have configured our GitLab CI pipeline to recognize that a release branch has been made (a release branch is identified by its three-part, dot-delimited semver number; in regex-speak: \d+\.\d+\.\d+). The CI/CD runner is configured to extract the release name from the branch name, and to use the versions plugin to change the POM version to include the SNAPSHOT corresponding to this branch name (1.2.1-SNAPSHOT in our example). 

release-build:
  stage: build
  script:
    - mvn versions:set -DnewVersion=${CI_COMMIT_REF_NAME}-SNAPSHOT
    # Now commit the version to the release branch
    - git add .
    - git commit -m "create snapshot [ci skip]"
    - git push
    # Deploy the binary to Nexus
    - mvn deploy
  only:
    - /^\d+\.\d+\.\d+$/
  except:
    - tags

Notice the [ci skip] in the commit message. This is critical for preventing looping, where each commit would trigger a new run and a new commit!

After the CI runner makes the change to the POM, the runner commits and pushes the updated pom.xml (now containing a version that matches the branch name). Now, the remote release branch POM contains the correct SNAPSHOT version for that branch.

GitLab CI, still identifying this release branch by the semantic versioning pattern (/^\d+\.\d+\.\d+$/, for example 1.2.1) of its name, recognizes that a push event has occurred on the branch. The GitLab runner executes mvn deploy to produce a SNAPSHOT build and a deploy to Nexus. Ansible now deploys that to a dev server, where it is available for testing. This step executes for all pushes to the release branch. This way, small tweaks that the developers make to the release candidate trigger a SNAPSHOT build, SNAPSHOT release to Nexus, and deployment of that SNAPSHOT artifact to the development servers.
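The routing the pipeline performs on branch names can be summarized in a small Java sketch. This is only an illustration of the rules described so far; the class, enum, and method names are hypothetical, and the feature pattern is assumed to admit hyphens so that names like feat-SDLC-123-add-name-field match.

```java
import java.util.regex.Pattern;

/** Hypothetical model of how branch names map to pipeline behavior. */
final class PipelineRouter {
    enum Kind { FEATURE, RELEASE, MASTER, OTHER }

    private static final Pattern FEATURE = Pattern.compile("^feat-[\\w-]+$");
    private static final Pattern RELEASE = Pattern.compile("^\\d+\\.\\d+\\.\\d+$");

    static Kind classify(String branch) {
        if (FEATURE.matcher(branch).matches()) return Kind.FEATURE;
        if (RELEASE.matcher(branch).matches()) return Kind.RELEASE;
        if (branch.equals("master")) return Kind.MASTER;
        return Kind.OTHER; // e.g. develop; no dedicated job is described for it here
    }

    /** Feature builds stop at verify; release and master builds deploy to Nexus. */
    static boolean deploysToNexus(Kind kind) {
        return kind == Kind.RELEASE || kind == Kind.MASTER;
    }

    /** POM version the runner sets on a release branch: the branch name plus -SNAPSHOT. */
    static String pomVersion(String releaseBranch) {
        if (classify(releaseBranch) != Kind.RELEASE) {
            throw new IllegalArgumentException("not a release branch: " + releaseBranch);
        }
        return releaseBranch + "-SNAPSHOT";
    }
}
```

Deriving the POM version from the branch name means a given SNAPSHOT version belongs to exactly one release branch, which avoids the shared-snapshot conflict that feature branches would otherwise cause.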

We have omitted the Ansible deployment scripts because those are very personal to your specific deployment model. These can perform whatever actions are needed for artifact deployment, including restarting services after new artifacts are installed, updating cron schedules, and changing application configuration files. You will need to define your Ansible deployment specifically for your particular needs.

And at long last we merge to master, triggering Git to tag the release with the semver release number from the source release branch name, deploy the whole wad to Nexus, and run sonar tests.

Note that in GitLab CI, anything you want to have hang around for the next job step, you need to designate as artifacts. In this case we are going to deploy our jar artifact using Ansible, so we designate that as a GitLab CI artifact.

master-branch-build:
  stage: build
  script:
    # Remove the -SNAPSHOT from the POM version
    - mvn versions:set -DremoveSnapshot
    # Use the Maven help plugin to determine the version. Note the grep -v at the end, to prune out unwanted log lines.
    - export FINAL_VERSION=$(mvn --non-recursive help:evaluate -Dexpression=project.version | grep -v '\[.*')
    # Stage and commit the version change (again using [ci skip] in the comment to avoid cycles)
    - git add .
    - git commit -m "Create release version [ci skip]"
    # Tag the release
    - git tag -a ${FINAL_VERSION} -m "Create release version"
    # --follow-tags pushes the annotated tag along with the commit; a plain git push leaves the tag local
    - git push --follow-tags
    - mvn sonar:sonar deploy
  artifacts:
    paths:
      # List our binaries here for Ansible deployment in the master-branch-deploy stage
      - target/my-binaries-*.jar
  only:
    - master

master-branch-deploy:
  stage: deploy
  dependencies:
    - master-branch-build
  script:
    # We would deploy artifacts (target/my-binaries-*.jar) here, using Ansible
  only:
    - master
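The grep -v step in the master build deserves a word: mvn help:evaluate prints the requested value on its own line, surrounded by log lines such as [INFO] entries, and the filter drops every line containing a bracket. A minimal Java equivalent of that filter, with hypothetical names and sample input that we made up for illustration, looks like this:

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical Java equivalent of piping Maven output through grep -v '\[.*'. */
final class VersionOutput {
    /** Keeps only the lines that contain no '[' character. */
    static List<String> withoutLogLines(List<String> lines) {
        return lines.stream()
                .filter(line -> !line.contains("["))
                .collect(Collectors.toList());
    }
}
```

Any output line without a bracket survives the filter, so this approach relies on the version being the only such line.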

Squashing Bugs

During testing, bugs might be discovered. These are fixed right on the release branch, and then merged back to develop. (Develop always contains everything that was or will be released.)

Finally, the release branch is approved, and it is merged into master. Master has an enforced GitLab policy never to accept merges except from a release branch. The GitLab runner checks out the merged code into master, which still has the release branch SNAPSHOT version. The GitLab runner uses the Maven versions plugin again to execute the versions:set goal with the removeSnapshot parameter set. This goal will remove the “-SNAPSHOT” from the POM version, and the GitLab runner will push this change to the remote master, tag the release, increment the POM version to the next SNAPSHOT version, and deploy that to Nexus. That is deployed to UAT for QA and UAT testing. Once the artifact is approved for release to production, production services teams will take the release artifact and deploy it to production. (This step can be automated as well via Ansible, depending on your corporate policy.)

Patches and Hotfixes

There is one more side workflow we must mention, and that is for patches or hotfixes. These are triggered when an issue is found in production or during testing of a release artifact, for example a bug or performance issue. Hotfixes are similar to release branches. They are named for the release, just like a release branch. The only difference is that they are branched not from develop but from master. 

Work is done to complete the hotfix. Just like a release branch, the hotfix triggers a Nexus SNAPSHOT deploy, and a deploy to UAT. Once this is certified, it is merged back into develop, then it is merged into master in preparation for its release. Master will trigger the release build and deploy the release binary to Nexus.
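The merge rules described across these sections can be collected into one small predicate. The Java sketch below is our own hypothetical summary, not a GitLab feature: it assumes release and hotfix branches are both recognized by their three-part version names, and that feature branches match a "feat-" pattern that admits hyphens.

```java
import java.util.regex.Pattern;

/** Hypothetical summary of which merges the workflow permits. */
final class MergePolicy {
    private static final Pattern FEATURE = Pattern.compile("^feat-[\\w-]+$");
    // Release and hotfix branches are both named after the version, e.g. 1.2.1
    private static final Pattern VERSIONED = Pattern.compile("^\\d+\\.\\d+\\.\\d+$");

    static boolean allowed(String source, String target) {
        boolean versioned = VERSIONED.matcher(source).matches();
        switch (target) {
            case "master":
                // master accepts merges only from release (or hotfix) branches
                return versioned;
            case "develop":
                // develop receives features via merge request, plus fixes merged back
                return versioned || FEATURE.matcher(source).matches();
            default:
                // the text states no rule for other targets, so none is applied here
                return true;
        }
    }
}
```

In practice such rules would be enforced through protected branches and merge request settings; the predicate only makes the intended flow explicit.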

Summary

We can summarize all of this in the following grid:

So there we have our flavor of Gitflow. We encourage development teams of any size to explore and try out this strategy. We believe it has the following merits:

  • Features are isolated. It is easy to manage your own feature changes in isolation, with the usual caveat of feature branching: it has the potential to make team integration more challenging for a very active feature, or if commits are not merged frequently.
  • Feature segregation, which lets you cherry-pick which features to include within a release. An alternative approach is to continually release code related to features that are hidden behind feature flags.
  • The integration and merge process has led to our team performing more disciplined code reviews, which has helped promote clean coding.
  • Automated testing, deployment and release to all environments that meet our team’s requirements and preferred way of working.

Our approach may be a departure from some of the accepted norms in this space, and as such has generated some debate on social media. Indeed, an initial version of this article triggered an analysis and discussion of the approach by Steve Smith. Our intention is to share insight into our way of working, and the caveat is that the process described here won’t suit all teams or every way of working. 

We are keen to hear about your experiences with Gitflow and deployment pipelines, so please do leave your comments below.

More information

For a more traditional treatment of Gitflow using Atlassian Bamboo and BitBucket,  see this.

There is also a nifty Gitflow Maven plugin, actively maintained by Alex Mashchenko, that works much like the Maven release plugin on Gitflow steroids. This could be adapted to accommodate our proposed Gitflow implementation.

About the Authors

Victor Grazi works at Nomura Securities on enterprise infrastructure application development. An Oracle Java Champion, Victor has also served as lead editor on the Java queue at InfoQ, and has served as a member of the Java Community Process Executive Committee.


Bryan Gardner is a recent graduate of Stevens Institute of Technology, where he obtained his BS and MS in computer science. Bryan currently works at Nomura, as a software engineer on the Infrastructure Development team. He primarily spends his day working on Spring Boot backend services or working on big data pipelines with Apache Spark.
