Better Together Than Alone: Pull Requests
In this JAX Magazine article, Tim Berglund is our guide in getting to grips with one of GitHub’s core features.
The romping success of both Git and GitHub is impossible to ignore, and Git practices are rapidly becoming a staple of a developer’s working day. To those who still believe forking is a bad word, Githubber Tim Berglund explains the beauty of Pull Requests within both open source and enterprise circles.
GitHub’s mission is to make it easier to work together than alone. Throughout the company’s history, they have worked toward this goal by providing an easy way to host Git repositories online and surrounding those repositories with a growing set of collaborative mechanisms that work in the browser and through Git itself.
Pull Requests may be the most important of these innovations. They have enabled increased open-source contributions, provided new ways for enterprise teams to work together, and offered a full-featured code review mechanism—all at the cost of a few Git commands and a simple web user interface. Let’s take a look at how pull requests work and how to use them in open-source and enterprise environments.
An open source use case
Suppose you are using the open-source Ratpack framework for a lightweight web application you want to build using the Groovy language. For simple apps, this just means you clone the template and code away, but you’ve encountered a missing feature in the framework that’s really getting in your way. (Full disclosure: the author is also the maintainer of Ratpack, and is aware of several missing features in the framework on which he would happily accept pull requests!)
To get your new feature into the framework, you need to download the code, make the changes, test them locally, and then persuade the maintainer to accept them. In the past, this meant submitting a patch to a mailing list, or worse, fighting your way into the inner circle of the project’s committers. Both of those options worked in the past, but they contain just enough friction to dissuade those who are marginally less motivated to contribute back to the project. Pull requests help capture that margin of productive committers who want to submit their contributions, and enable more highly motivated committers to contribute with less wasted time.
The pull request process starts with you making a copy of the project to which you want to contribute. You could simply clone the project to your local disk, and you’d be free to make changes to that clone as you saw fit, but you wouldn’t be able to submit them back to the project. Remember, you aren’t a committer to the project, and you might not have the goal of becoming one. Instead, to get a copy of the project, you have to go to the source and make the project your own. You have to copy it on GitHub and own that copy. You have to fork the project.
Prior to GitHub, forking was a bad word. For an open-source project to fork, it meant that factions had developed within the team writing the code, and one faction was splitting off from the other and taking the codebase in a separate and incompatible direction. On GitHub, forking simply means that you create a copy of the project under your username, maintaining a connection to the original. It means you’ve got a place to do independent work on the repo, with the promise that you’ll easily be able to submit your changes back later on.
Figure 1: The Fork button in the GitHub web UI
In the upper-right corner of the main repo, there’s a button labeled "Fork." Click on this button as shown in Figure 1, and you’ll be treated to a brief animation while GitHub does some work in the background. A few seconds later, you’ll be redirected to what looks like the same repo you just left, except this time you’ll notice that the URL has changed, as in Figure 2. This copy of the repo belongs to you!
Figure 2: A newly forked repo, belonging to githubstudent instead of tlberglund.
To do any serious work in your fork, you’ll have to clone it to your development machine. Following the username we’re using in our example, you’d want to get to a working directory on your machine and type the following:
$ git clone https://github.com/githubstudent/Ratpack.git
You can then make changes to that clone, commit them, and push them back to GitHub. You own the fork, so you have the right to push commits to it whenever you’d like.
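The edit, commit, and push cycle might look something like the sketch below. So that it runs without network access, a throwaway bare repository stands in for the fork on GitHub; the branch name my-feature, the file FEATURE.txt, and the commit identity are hypothetical choices made for illustration, not anything Ratpack requires.

```shell
#!/bin/sh
set -e

# A throwaway bare repo stands in for your fork on GitHub (assumption:
# in real use you would clone the githubstudent/Ratpack URL shown above).
work=$(mktemp -d)
git init --quiet --bare "$work/fork.git"

# Clone the stand-in fork; cloning an empty repo only prints a warning.
git clone --quiet "$work/fork.git" "$work/Ratpack" 2>/dev/null
cd "$work/Ratpack"

# Hypothetical identity so the commit succeeds on a fresh machine.
git config user.name "GitHub Student"
git config user.email "student@example.com"

# Do the work on a topic branch (the name is illustrative).
git checkout --quiet -b my-feature
echo "a missing feature" > FEATURE.txt
git add FEATURE.txt
git commit --quiet -m "Add the missing feature"

# Push the branch back to the fork.
git push --quiet origin my-feature

# Count how many branches named my-feature the fork now has.
pushed=$(git ls-remote --heads origin my-feature | wc -l | tr -d ' ')
echo "branches pushed: $pushed"
```

Working on a topic branch rather than directly on the default branch is a choice, not a requirement, but it keeps the fork’s default branch free to track the original project.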
Once you’ve forked, you have complete control over your own copy of the original project. You can use your fork in your own local builds, push your changes to GitHub, and generally carry on with a private copy of the project in whatever way you see fit. Eventually, though, you’d probably like to get those changes incorporated back into the original project. The easiest way to do this is with a pull request.
A pull request is like a message you send from your fork to the original repository. It has a title, a message body, and a list of commits you want incorporated in the original repo. It’s a way of telling the person or organization who owns the repo that you’ve got work you would like them to merge into their version of the project.
To be precise, the pull request "message" doesn’t really contain a list of commits. In reality, it specifies the branch from which you want the pull-requested commits to come, plus the branch into which you want them merged. As shown in the pull request page screenshot in Figure 3, the source branch is on the right-hand side of the screen, and the destination is on the left. All of the commits that are in the source branch but not yet in the destination branch are a part of the pull request. As we’ll see later on, we can even push new commits to this branch after opening the pull request, and those new commits participate in the request as well.
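You can approximate this branch-to-branch view locally before opening a pull request. The sketch below is an illustration under stated assumptions: it builds a scratch repository with a destination branch called master and a source branch called my-feature (both names hypothetical), then asks Git for the commits reachable from the source but not from the destination. That is roughly the list a pull request between those two branches would show.

```shell
#!/bin/sh
set -e

# Scratch repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "GitHub Student"
git config user.email "student@example.com"

# Name the destination branch explicitly, whatever the local default is.
git symbolic-ref HEAD refs/heads/master
git commit --quiet --allow-empty -m "Existing work"

# The source branch, with two commits that exist only there.
git checkout --quiet -b my-feature
git commit --quiet --allow-empty -m "Add feature"
git commit --quiet --allow-empty -m "Test feature"

# Commits reachable from my-feature but not from master.
pr_commits=$(git log --format=%s master..my-feature)
echo "$pr_commits"
```

The two-dot range master..my-feature is plain Git and knows nothing about GitHub, so treat its output as a preview rather than a guarantee of what the pull request page will list.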
Figure 3: The form used to submit a new pull request
Communicating around pull requests
Once the contributor clicks on the Send Pull Request button, he is redirected to the page showing the pull request detail. Since the destination of the pull request is the tlberglund/Ratpack repo, the PR’s page is at a URL of the form of id. This is the home base for the PR: where the owner of the repo can accept or reject it, others can view its status, and we can collaborate around the proposed code change. That collaboration takes place through three channels of communication: comments on the PR, comments on the PR’s commits, and comments on lines of code in those commits.
If you look at the bottom of the pull request page, you’ll find a comment box like the one in Figure 4. Anyone with read access to the repository can enter a comment here about the PR. Generally, the owner of the repo uses this thread to discuss the proffered changes with the person who submitted the pull request. If there’s something about the submission that doesn’t look right to the owner, he or she can mention it here, and the author of the PR will get notified about the comment. It’s a great way to talk about a submission that the owner doesn’t want to accept, but also doesn’t want to reject outright. Significant collaboration can take place in this part of the page.
Figure 4: The discussion thread associated with a pull request
Since a pull request lists all of the commits that were in the pull-requested branch (but not yet in the destination branch), you can also access those commits directly through the web interface. Each commit’s hash is a link to the commit’s detail page. Click on that link, as shown in Figure 5, and you will again see a comment box at the bottom of the commit detail page. Here you are able to engage in a discussion of a particular commit, as distinct from the entire PR in which the commit participates. If the bulk of the commits in a PR looked good to the repo owner, but he or she wanted to object to a particular commit that contained only whitespace changes, this might be the right place to do it.
Figure 5: Linking to a commit detail page from a pull request
Finally, and perhaps most powerfully, we can focus our online discussion on the diffs introduced by the pull request, and comment on individual lines of code in that diff (as shown in Figure 6). When the conversation must delve down into very low-level details, there is simply no substitute for looking directly at code. To see the aggregate diffs introduced by the pull request, click on the Files Changed tab near the top of the pull request detail page. This view provides a web-based method for conducting this discussion among multiple participants, regardless of where they are located geographically or whether they can all participate in the discussion at the same time.
Figure 6: Commenting on an individual line of code
If you receive negative, but constructive, feedback on a pull request, you’re likely to want to address it to make the PR acceptable to the repo owner rather than just abandon the effort. There’s nothing else you have to do to the PR itself to submit this additional work; simply continue making changes in the branch on which you originally submitted the PR, push those changes to GitHub, and they show up automatically. The pull request itself is a request to submit an entire branch on a given repo, so new commits in that branch on that repo automatically participate, even if they didn’t exist when you first sent the request.
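In Git terms, responding to feedback is just another commit and another push on the same branch. The sketch below again uses a throwaway bare repository in place of the fork on GitHub, with a hypothetical branch name and commit messages; it cannot show the pull request page itself, only that the branch the PR points at now ends with the follow-up commit.

```shell
#!/bin/sh
set -e

# Throwaway bare repo standing in for the fork on GitHub (assumption).
work=$(mktemp -d)
git init --quiet --bare "$work/fork.git"
git clone --quiet "$work/fork.git" "$work/Ratpack" 2>/dev/null
cd "$work/Ratpack"
git config user.name "GitHub Student"
git config user.email "student@example.com"

# The branch the pull request was opened from (name is illustrative).
git checkout --quiet -b my-feature
git commit --quiet --allow-empty -m "Add the missing feature"
git push --quiet origin my-feature

# Feedback arrives; address it with a further commit on the same branch.
git commit --quiet --allow-empty -m "Address review feedback"
git push --quiet origin my-feature

# The branch on the fork now ends with the follow-up commit.
remote_tip=$(git log -1 --format=%s origin/my-feature)
echo "$remote_tip"
```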