Git 101

Better Together Than Alone: Pull Requests - Part 2

 

Merging Pull Requests

So far we’ve been playing the role of the unknown open-source contributor on the Internet, hoping to have his or her commits accepted by the Ratpack project. Assuming we’ve done great work and have been persuasive in the pull request discussion, the owner of the main repo will be ready to click the merge button. Let’s put that person’s shoes on now, and take a look at the different ways to merge PRs and the various scenarios that surround them.

From the Web

The simplest way to merge a pull request is on the web. If you look at the bottom of the pull request detail page as shown in Figure 7, you might see a bright green bar with a big button in it that says "Merge This Pull Request Automatically." If you see this bar, clicking the button will cause the commits submitted in the pull request to be merged into the target branch. Note that this will actually introduce a new merge commit into the hosted Git repository on GitHub—a commit you’ll have to pull down to your clone later on. (This article assumes you already know how to push and pull from upstream Git repositories, but just in case, we’ll see how to do this in a little while.)

 

Figure 7: GitHub’s built-in support for merging a pull request

 

In some cases, of course, you can’t automatically merge the PR. You’ll know this because the green bar will instead be gray, and will contain text telling you that you can’t merge automatically. Behind the scenes, GitHub has already attempted the merge, and knows that a conflict will result if it proceeds. Since there is no way to resolve that conflict through the web site, you’re going to have to do the merge from the command line.

From the command line

Merging a pull request is ultimately the same as merging any other kind of branch. It differs only in what branch is being merged: most merges are done on local feature branches, but the branch to be merged in the case of a PR comes from another repository entirely. We are forced to do this merge "manually" in the case of a merge conflict, but we might decide to handle unconflicted merges this way as well. Fetching the commits to a local repository gives us the freedom to experiment with the merge in isolation before sharing it with the world through the GitHub repo.

If you already know how to branch and merge in Git, there is really only one new step in this process, and even this step involves a command, fetch, which you're probably already familiar with. git fetch connects to an upstream repo, determines what objects that repo has that the local repo does not, downloads these objects to the local object database, and optionally updates any remote branch names associated with that upstream repo. Importantly, fetch does not create any new commits on any local branches of the repo; that is, it doesn’t merge any of the downloaded content. It merely saves it to the local object database and updates named pointers to the newly-downloaded commits.
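To see for ourselves that fetch leaves local branches alone, here is a minimal sketch that runs entirely in a temporary directory. The repositories named upstream and local are hypothetical stand-ins created only for this demonstration; they are not part of the Ratpack example.

```shell
#!/bin/sh
# Sketch: git fetch downloads commits but does not move local branches.
# "upstream" and "local" are throwaway stand-in repos (assumptions).
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# "upstream" plays the role of the remote repository.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master   # force the branch name used in this article
echo one > file.txt
git add file.txt
git commit -q -m "first commit"

# "local" is our clone; its origin remote points at upstream.
git clone -q "$work/upstream" "$work/local"

# A new commit lands upstream after we cloned.
echo two >> file.txt
git commit -q -am "second commit"

cd "$work/local"
before=$(git rev-parse master)
git fetch -q origin
after=$(git rev-parse master)
remote_tip=$(git rev-parse origin/master)

echo "local master before fetch: $before"
echo "local master after fetch:  $after"
echo "origin/master after fetch: $remote_tip"
```

The first two lines printed should show the same commit ID, while origin/master has moved on to the new commit. Nothing changes on the local master until we merge.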

To fetch the branch from which the pull request was sent, you’ll need a repo URL and a branch name. You can find all the information you need on the pull request detail page. Figure 8 shows where the merge instructions are kept. Clicking the information icon brings up detailed pull request merge instructions, which will always work if you follow them to the letter. However, a lighter-weight procedure is easy to construct from the information we have.

 

Figure 8: Click on the information icon to get pull request merge instructions

 

Using the URL and branch name shown in the dialog box, we can simply copy them both to the clipboard as shown in Figure 9, and we will be ready to do the fetch.

 

Figure 9: The URL and branch from which to fetch to resolve the merge conflict

 

The most common use of fetch relies on a named remote repository, typically called origin. In this case, however, we’ll be fetching directly from the URL we just constructed, specifying the branch containing the commits we want to merge (in this case, master).

 

$ git fetch https://github.com/githubstudent/Ratpack.git master

 

Since this may be a one-time fetch from this repository (we don’t know if this clone will be submitting more PRs or not), we didn’t bother to create a remote. That remote would have served as a handy label for recalling the URL of the requesting repo more easily. Because there is no remote, Git has no remote branch names to update, and instead gives us a temporary pointer to the fetched commits, called FETCH_HEAD. Keep in mind that this label is temporary; the very next fetch will overwrite it, and we would have to repeat the fetch to get it back.
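The behavior of FETCH_HEAD can be checked with a small local experiment. This is a sketch under stated assumptions: the repositories main and fork below are hypothetical stand-ins for the main repo and the contributor's fork, and a local path takes the place of the GitHub URL.

```shell
#!/bin/sh
# Sketch: fetching by URL (here, a local path) sets FETCH_HEAD and creates no remote.
# "main" and "fork" are throwaway stand-in repos (assumptions).
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# The maintainer's repository.
git init -q "$work/main"
cd "$work/main"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m "base"

# The contributor's fork, with one commit to offer.
git clone -q "$work/main" "$work/fork"
cd "$work/fork"
echo contribution >> README
git commit -q -am "PR commit"
fork_tip=$(git rev-parse HEAD)

# Back in the maintainer's repo: fetch straight from the fork, no remote needed.
cd "$work/main"
git fetch -q "$work/fork" master
fetched=$(git rev-parse FETCH_HEAD)
remotes=$(git remote)

echo "FETCH_HEAD points at: $fetched"
echo "Commits offered by the pull request:"
git log --oneline master..FETCH_HEAD
```

Listing master..FETCH_HEAD before merging is a cheap way to review exactly which commits the pull request would bring in.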

Now we’re ready to perform our merge. Since the pull request was targeted to the master branch of our repo, check out the master branch and type the following:

 

$ git merge FETCH_HEAD

 

If we are merging the PR locally because of a merge conflict (and not merely because we like to be more cautious with our merges), then that conflict will arise at this point, and we’ll be able to resolve it through the normal merge conflict resolution procedure.
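For readers who want to rehearse the conflict case, the sketch below manufactures a conflict in a temporary directory and resolves it. The repositories, the file greeting.txt, and the choice to keep both lines are all assumptions made for illustration; a real resolution depends on what the code is supposed to do.

```shell
#!/bin/sh
# Sketch: a conflicting merge of FETCH_HEAD, resolved by hand.
# "main", "fork", and greeting.txt are hypothetical (assumptions).
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

git init -q "$work/main"
cd "$work/main"
git symbolic-ref HEAD refs/heads/master
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "base"

# The contributor changes the line one way...
git clone -q "$work/main" "$work/fork"
cd "$work/fork"
echo "hello from the contributor" > greeting.txt
git commit -q -am "contributor change"

# ...and the maintainer changes the same line another way.
cd "$work/main"
echo "hello from the maintainer" > greeting.txt
git commit -q -am "maintainer change"

git fetch -q "$work/fork" master
if git merge --no-edit FETCH_HEAD >/dev/null 2>&1; then
    outcome="clean"
else
    outcome="conflict"
    # List the files Git could not merge on its own.
    conflicted=$(git diff --name-only --diff-filter=U)
    echo "conflicted files: $conflicted"
    # Resolve by editing the file (here we simply keep both lines),
    # then stage it and conclude the merge.
    printf 'hello from the maintainer\nhello from the contributor\n' > greeting.txt
    git add greeting.txt
    git commit -q --no-edit
fi

# A merge commit lists its own ID followed by two parent IDs.
id_count=$(git rev-list --parents -n 1 HEAD | wc -w)
echo "outcome: $outcome, IDs on merge commit line: $id_count"
```

If a resolution goes badly, git merge --abort returns the working tree to its state before the merge, and we can start over.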

Once the merge is complete, push your work to the upstream repo with a git push from the master branch. This will send the pull-requested commits and the merge commit you just created to GitHub. If you go back to the pull request detail page, you’ll notice that it automatically shows that the PR has been closed. Your work is done.
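The whole fetch, merge, and push sequence can be sketched end to end without touching GitHub. In the script below, a local bare repository named hosted.git stands in for the hosted repo; that name, along with maintainer and fork, is an assumption for this demonstration. The automatic closing of the pull request is GitHub behavior and is not reproduced here.

```shell
#!/bin/sh
# Sketch: fetch a contributor's branch, merge it, and push the result upstream.
# "hosted.git", "maintainer", and "fork" are stand-in repos (assumptions).
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# A bare repo plays the role of the hosted repository.
git init -q --bare "$work/hosted.git"
git --git-dir="$work/hosted.git" symbolic-ref HEAD refs/heads/master

# The maintainer's clone, with the hosted repo as origin.
git init -q "$work/maintainer"
cd "$work/maintainer"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m "base"
git remote add origin "$work/hosted.git"
git push -q origin master

# The contributor's fork adds one commit.
git clone -q "$work/hosted.git" "$work/fork"
cd "$work/fork"
echo feature > feature.txt
git add feature.txt
git commit -q -m "PR commit"
pr_commit=$(git rev-parse HEAD)

# The maintainer has moved on too, so the merge creates a real merge commit.
cd "$work/maintainer"
echo more >> README
git commit -q -am "maintainer work"

git fetch -q "$work/fork" master
git merge -q --no-edit FETCH_HEAD
git push -q origin master

local_tip=$(git rev-parse master)
hosted_tip=$(git ls-remote origin refs/heads/master | cut -f1)
echo "local master:  $local_tip"
echo "hosted master: $hosted_tip"
```

After the push, the two commit IDs printed should match, and the contributor's commit is part of the hosted master's history.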

An Enterprise Use Case

So the basic pull request workflow is clear enough. However, what if your project is not open source, but instead proprietary code belonging to your employer? Pull requests have proved to be an incredibly effective mechanism for simplifying committer lists while inviting increased contributions from various open-source communities, but the concerns of open source development are far from your mind if you are an enterprise user. How, then, do pull requests benefit you? The answer is that the potential benefits are just as large as in the open-source case. Let’s talk through a scenario.

Suppose you work on an internal product with a team of 5-10 other developers. Each of you has push and pull rights on your repository, and you routinely push your work to master and various feature branches as you see fit. It’s not obvious how pull requests could help you manage this workflow. (As it turns out, the GitHub flow uses pull requests even in scenarios like this, but this is beyond the scope of this article.)

However, you also consume other components developed internally in your organization. Suppose you need to use the corporate localization API, a module from the company-wide security team, and some build infrastructure components from the enterprise delivery automation team. Normally your build system obtains these modules and incorporates them into your build, and you receive updates to the components as they are published by their respective authors. But as the consumer of an API, sometimes you have to change that API for reasons of your own, and push your own changes upstream to the source of the component. Sometimes the consumer legitimately becomes the producer.

Forking and pull requesting provide an ideal solution to this common enterprise problem. If you have to make changes to the enterprise localization API to implement a new feature, begin by forking the enterprise localization API repo, and reconfiguring your build to use your forked version, not the "official" one. Once you’ve made your changes to the API and tested them thoroughly—perhaps even deploying them to a localized production environment, depending on your release management policies—you are ready to submit your work back to the enterprise localization repo. The ideal mechanism for doing this is through a pull request, using exactly the same procedures outlined in the open-source use case. You are the developer of a proprietary application that consumes a proprietary component, but the workflow through which you modify that component looks exactly the same as the open-source workflow described above.

Keeping your fork up to date

Regardless of whether you’re an enterprise user or an open-source contributor, at some point you’ll submit a pull request and it will be accepted by the upstream repo. Congratulations! Your code, signed with your name and email address, is now an immutable part of that repo. If you plan to continue your own contributions on your fork to be submitted by future pull requests, you’re going to have to keep your fork up to date with the main repo. This also involves Git commands you probably already know, but may not have used in precisely this formula.

To begin with, you should create a remote to point to the main repo. You already have a remote called origin, which points to your fork on GitHub. You’ll need to add a second remote to point to the main repo, like this:

 

$ git remote add mainrepo https://github.com/tlberglund/Ratpack.git

 

With that remote established, you need only pull from it to keep your repo up to date. From your own master branch, type the following:

 

$ git pull mainrepo master

 

This will keep your master branch up to date with the main repo. As with any merge, there is the potential for this command to generate merge conflicts, which you should be prepared to resolve and commit.
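As a rough sketch of the full round trip, the script below keeps a local clone and its fork in sync with a main repo, all inside a temporary directory. The repositories mainrepo, fork.git, and local are hypothetical stand-ins, and the final push back to origin is an extra step, not mentioned above, that updates the fork itself.

```shell
#!/bin/sh
# Sketch: keep a fork current by pulling from a second remote.
# "mainrepo", "fork.git", and "local" are stand-in repos (assumptions).
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# The main repository.
git init -q "$work/mainrepo"
cd "$work/mainrepo"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m "base"

# The hosted fork (bare) and our local clone of it; origin points at the fork.
git clone -q --bare "$work/mainrepo" "$work/fork.git"
git clone -q "$work/fork.git" "$work/local"

# New work lands in the main repo after we forked.
echo "upstream work" >> README
git commit -q -am "upstream change"
main_tip=$(git rev-parse master)

cd "$work/local"
git remote add mainrepo "$work/mainrepo"
git remote -v                      # shows both origin and mainrepo
git pull -q mainrepo master        # fetch from the main repo and merge into master
git push -q origin master          # extra step: bring the hosted fork up to date too

local_tip=$(git rev-parse master)
fork_tip=$(git --git-dir="$work/fork.git" rev-parse master)
echo "main repo: $main_tip"
echo "local:     $local_tip"
echo "fork:      $fork_tip"
```

Since git pull is a fetch followed by a merge, running git fetch mainrepo master and then git merge FETCH_HEAD gives much the same result, with a chance to inspect the incoming commits in between.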

Concluding thoughts

Effective use of pull requests requires just a little familiarity with some important GitHub UI features, along with basic Git network and merge commands. They are not hard to learn, and they manage to implement the sweet spot between managing contributor rights and encouraging broad-based community contributions in many open-source and enterprise contexts. They are not the only collaborative feature on GitHub, and they will not be the last, but they remain one of GitHub’s most important innovations to date, helping the site to deliver on the promise of making it better to work together on code than alone.

Author Bio: Tim is a full-stack generalist and passionate teacher who loves working with people as much as he loves to code. He is a GitHubber whose mission is to make it easy for everybody in the world to use Git. He is a speaker internationally and on the No Fluff Just Stuff tour in the United States, who loves to speak on Git, Cassandra, and other topics. He is co-president of the Denver Open Source User Group, co-presenter of the best-selling O'Reilly Git Master Class, co-author of Building and Testing with Gradle, a member of the O'Reilly Expert Network and a member of the GigaOM Pro Analyst Network. He occasionally blogs at timberglund.com. He lives in Littleton, CO, USA with the wife of his youth and their three children.

This article previously appeared in JAX Magazine: Pulling Together.

Tim Berglund