Tutorial: Apache Subversion Best Practices
It’s still the most popular software version control system for a reason. Former JAX Magazine editor Jessica Thornsby guides you through the common pitfalls when picking up Subversion.
Version control is a useful tool for software development, particularly when a project involves more than one developer, and Apache Subversion remains one of the world’s most popular open source version control systems. It’s also one of the most established, with a vibrant ecosystem of client tools, GUIs and plugins for all your version control needs.
However, there are some potential pitfalls to watch out for, especially for Subversion newbies. How should you structure your repository? And how do you implement a branching and merging strategy that won’t give you a headache?
In this article, we’ll cover some essential best practices for Apache Subversion, including repository structure, project layout, and finally branching and merging with SVN.
Apache Subversion doesn’t impose a strict file structure, which allows you to optimize the repository layout to suit a project’s particular needs. However, all of this freedom can result in unnecessary admin overhead – implementing the correct project layout from the beginning is crucial.
One of the first questions you’ll need to answer when starting a new project is: should I use a single repository for multiple projects, or a separate repository for each project?
Single repositories are typically best suited to multiple projects that require cross-tracking and cross-referencing. The benefit of a single repository approach is that there’s a single location where all the code can be accessed, and resources (e.g. libraries) can be shared easily between projects. There is also typically less administrative overhead, as new projects don’t require a new repository, and data can be moved between projects without losing any versioning information. However, there are some downsides: performing admin tasks such as dumping and loading one huge repository will be more time-consuming, especially since projects have a tendency to increase in size. Another issue worth bearing in mind is that Subversion applies its revision numbers to entire trees; this means that the revision number of all your projects will increase simultaneously, regardless of which project a change was made in.
Multiple repositories are typically used for multiple, unrelated projects. They give users the freedom to tailor each repository to suit the individual project, and ensure that the revision number is meaningful to each project. However, sharing code can become a problem, as Subversion does not carry history across repositories: code brought in from another repository will appear in the new repo with no history.
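To make the revision-number trade-off concrete, here is a minimal sketch using a throwaway local repository. The project names `billing` and `website` and the temporary paths are assumptions made up for illustration; only standard `svnadmin`, `svnlook` and `svn` subcommands are used.

```shell
# Sketch: two hypothetical projects sharing one repository share one revision counter.
set -eu

WORK="$(mktemp -d)"
svnadmin create "$WORK/shared"
SHARED_URL="file://$WORK/shared"

# r1: create both project roots in a single commit.
svn mkdir -q -m "Create project roots" "$SHARED_URL/billing" "$SHARED_URL/website"

# r2: a change that only touches the billing project.
svn mkdir -q -m "Add billing docs" "$SHARED_URL/billing/docs"

# The repository-wide revision number has moved on for every project...
YOUNGEST="$(svnlook youngest "$WORK/shared")"
echo "repository revision: $YOUNGEST"

# ...even though the last change under website/ is still the first commit.
WEBSITE_LAST_CHANGE="$(svn log -q --limit 1 "$SHARED_URL/website" \
    | sed -n 's/^r\([0-9][0-9]*\) .*/\1/p')"
echo "last change to website: r$WEBSITE_LAST_CHANGE"

# A separate repository keeps its own, independent counter.
svnadmin create "$WORK/website-only"
SEPARATE_YOUNGEST="$(svnlook youngest "$WORK/website-only")"
echo "separate repository revision: $SEPARATE_YOUNGEST"
```

With one repository per project, a jump in the revision number always reflects activity in that project, at the cost of losing shared history between them.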
Once you’ve decided whether to organize your project(s) in a single or multiple repository structure, the next step is to plan your project layout. Putting some thought into planning your layout in advance can save you the administrative hassle of moving files around later.
For most projects, it’s good practice to follow the trunk/branch/tags layout and to make a clear distinction between the code contained in each:
Trunk – this is where you should store current release code – only! The trunk should always be stable and compilable.
Tags – these are used to provide a snapshot of the code at a specific point in your project’s history.
Branches – useful for working on significant changes, variations of code, etc., without disrupting the stable code in the trunk.
Figure 1: An illustration of how a Subversion Repository evolves using branching, tagging and a code trunk
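The layout above could be set up roughly as follows. This is a sketch against a throwaway local repository; the tag name `1.0` and the branch name `search-feature` are made-up examples, not names from any real project.

```shell
# Sketch: create the conventional trunk/branches/tags layout in a scratch repository.
set -eu

WORK="$(mktemp -d)"
svnadmin create "$WORK/repo"
REPO_URL="file://$WORK/repo"

# One commit creates all three top-level directories.
svn mkdir -q -m "Create standard layout" \
    "$REPO_URL/trunk" "$REPO_URL/branches" "$REPO_URL/tags"

# A tag is a copy of the trunk as it stood at a point in time.
svn copy -q -m "Tag release 1.0" "$REPO_URL/trunk" "$REPO_URL/tags/1.0"

# A branch is the same kind of copy, intended for ongoing work.
svn copy -q -m "Branch for the search feature" \
    "$REPO_URL/trunk" "$REPO_URL/branches/search-feature"

LAYOUT="$(svn list "$REPO_URL")"
echo "$LAYOUT"
```

Subversion itself treats tags and branches identically; the distinction is a team convention, so it helps to agree that nothing is ever committed under `tags/` after the copy.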
A quick internet search will uncover countless tales of branching and merging hell, but when used correctly, branching and merging can be invaluable tools for the Subversion user. Following a few best practices can ensure you get the most out of Subversion’s powerful branching and merging functionality.
Merging Best Practices
Always start with a checkout or an update – always ensure you have the latest code from the trunk, and have committed all of your changes before you begin a merge.
Merge on logical checkpoints – in most cases, it makes sense to merge when your branch has reached a certain level of stability and maturity. Never merge when an experimental change has been made to a branch.
Always use log messages – these can be an invaluable source of information a few months down the line, when you need to check when a merge happened, what changes it included, etc. The more relevant information you provide in log messages, the better.
Merge soon – the quicker you can perform a merge and commit your changes back to the repository, the sooner the rest of your team can take your changes into account in their own part of the development effort.
Isolate the merge – it can be tempting to merge and then make some additional changes before performing a commit. However, this will make it difficult to separate which changes came from the merge, and which changes are part of the rest of the development effort, if you ever need to revisit this revision.
Run Frequent ‘SVN Updates’ – even if it seems your branch has little to do with the work going on in the trunk, by the time you’ve finished perfecting your branch, the trunk could have evolved to the point where it’s a struggle to merge. Perform regular updates to ensure any new commits fit with your own changes.
Run into problems? Just start over! – if your merge doesn’t go to plan, you can always discard the changes in your working copy and start over. A revert overwrites any local changes with the last version your working copy received from the repository. It uses the ‘svn revert’ command, followed by the location of the working copy; add the ‘-R’ option to revert a directory recursively (Figure 2):
Figure 2: Revert if needed
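As a rough command-line sketch of the merge practices above: update first, merge, revert if the result looks wrong, then commit the merge on its own with a descriptive log message. The repository, the `feature` branch and the file `app.txt` are all hypothetical and created only for this example.

```shell
# Sketch: sync a hypothetical feature branch with trunk, starting over once via revert.
set -eu

WORK="$(mktemp -d)"
svnadmin create "$WORK/repo"
REPO_URL="file://$WORK/repo"
svn mkdir -q -m "Create layout" "$REPO_URL/trunk" "$REPO_URL/branches"

# Set-up: trunk gets a file, then a feature branch is cut from it.
svn checkout -q "$REPO_URL/trunk" "$WORK/trunk-wc"
echo "version 1" > "$WORK/trunk-wc/app.txt"
svn add -q "$WORK/trunk-wc/app.txt"
svn commit -q -m "Add app.txt" "$WORK/trunk-wc"
svn copy -q -m "Create feature branch" "$REPO_URL/trunk" "$REPO_URL/branches/feature"

# Meanwhile, trunk moves on.
echo "version 2 with a trunk-only change" > "$WORK/trunk-wc/app.txt"
svn commit -q -m "Update app.txt on trunk" "$WORK/trunk-wc"

# 1. Start from a clean, up-to-date working copy of the branch.
svn checkout -q "$REPO_URL/branches/feature" "$WORK/feature-wc"
svn update -q "$WORK/feature-wc"

# 2. Merge the latest trunk changes into the branch working copy.
svn merge -q "$REPO_URL/trunk" "$WORK/feature-wc"

# 3. Not happy with the result? Discard it recursively and start over.
svn revert -q -R "$WORK/feature-wc"
AFTER_REVERT="$(svn status "$WORK/feature-wc")"

# 4. Merge again and commit the merge by itself, with a meaningful log message.
svn merge -q "$REPO_URL/trunk" "$WORK/feature-wc"
svn commit -q -m "Sync merge: bring trunk changes into feature branch" "$WORK/feature-wc"
MERGED_CONTENT="$(cat "$WORK/feature-wc/app.txt")"
echo "$MERGED_CONTENT"
```

Committing immediately after the merge, before making any other edits, keeps the merge revision free of unrelated changes.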
Tip: TortoiseSVN, the popular Subversion client for Windows, has a useful ‘tsvn:logminsize’ property that can ensure all team members leave log messages. To set this property:
Select the properties option from the TortoiseSVN menu. In the subsequent properties dialog, select ‘New’ property and click ‘Log sizes.’ (Figure 3).
Figure 3: Log sizes
Specify the minimum number of characters for a commit message. TortoiseSVN will block any commit whose log message is shorter than the number of characters specified. (Figure 4)
Figure 4: Specify characters for commit message
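Because ‘tsvn:logminsize’ is an ordinary versioned property, it can also be set from the command line and committed so that every TortoiseSVN user picks it up. The sketch below uses a scratch repository and an arbitrary example value of 20 characters. Note that the command-line client itself does not enforce this property; it only stores it.

```shell
# Sketch: set TortoiseSVN's tsvn:logminsize property with the svn command-line client.
set -eu

WORK="$(mktemp -d)"
svnadmin create "$WORK/repo"
REPO_URL="file://$WORK/repo"
svn mkdir -q -m "Create trunk" "$REPO_URL/trunk"
svn checkout -q "$REPO_URL/trunk" "$WORK/wc"

# Ask TortoiseSVN to require at least 20 characters (example value) per log message.
svn propset -q tsvn:logminsize 20 "$WORK/wc"
svn commit -q -m "Require a minimum log message length in TortoiseSVN" "$WORK/wc"

# Read the property back to confirm it was stored.
LOGMINSIZE="$(svn propget tsvn:logminsize "$WORK/wc")"
echo "tsvn:logminsize = $LOGMINSIZE"
```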
Branching Best Practices
- Delete Unwanted Branches – deleting branches in Subversion has no effect on repository size, because the branch is only removed from the revision in question and all subsequent revisions. A deleted branch still exists in all previous revisions, and can be viewed and recovered at any time. Although deleting branches doesn’t save you space, you should still delete unwanted branches if:
  - You’ve successfully performed a merge – after a merge, the branch becomes completely redundant and should be deleted.
  - You’ve run a merge with the ‘--reintegrate’ option – this option instructs Subversion to compare the trunk and a branch, and apply the resulting differences to the trunk. Once a ‘--reintegrate’ merge has been performed, the branch can no longer be used for development, as any future reintegration will be interpreted as a trunk change. To avoid this, branches should always be deleted after a reintegration merge.
  - You’re doing general housekeeping – regularly deleting branches helps to reduce the clutter in the branches directory, making it easy to see active development at a glance.
- Test, Test, Test – consider CI and assertion testing on feature branches, which is a useful way to indicate code maturity and progress.
- Keep track of your branches – if you’re maintaining several concurrent branches, it’s easy to lose track. TortoiseSVN’s ‘Revision Graph’ function helps you keep up with the different branches, by analyzing the revision history and creating a graphical representation of the relationships between branches.
In the Revision Graph, each node represents a revision where something changed, and each change is represented by a different coloured shape. (Figure 5)
Figure 5: Revision Graph
- Added or copied items – an item has either been added to Subversion, or created by copying an existing file/folder
- Branch tip revision – this represents the HEAD revision node for each branch. Note that HEAD here does not refer to the latest revision in the repository, but to the latest revision committed on that path. (Figure 6)
Figure 6: Branch tip revision
- Deleted items – branches that have been deleted.
- (Optional) Working copy revision – if you select the ‘Show WC revision’ option from the ‘View’ menu, the BASE revision will be displayed with a bold outline. (Figure 7)
Figure 7: Deleted items and copy revision
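For those working without TortoiseSVN, the branch housekeeping described under ‘Delete Unwanted Branches’ can be sketched on the command line. The example assumes a scratch repository and a hypothetical branch called `feature`; it deletes the branch, shows that it is still reachable in an older revision, restores it, and lists the revisions on the branch path.

```shell
# Sketch: delete a hypothetical branch, then recover it from an earlier revision.
set -eu

WORK="$(mktemp -d)"
svnadmin create "$WORK/repo"
REPO_URL="file://$WORK/repo"

# r1: layout, r2: branch creation.
svn mkdir -q -m "Create layout" "$REPO_URL/trunk" "$REPO_URL/branches"
svn copy -q -m "Create feature branch" "$REPO_URL/trunk" "$REPO_URL/branches/feature"

# r3: housekeeping, remove the branch from the latest revision.
svn delete -q -m "Remove finished feature branch" "$REPO_URL/branches/feature"
AFTER_DELETE="$(svn list "$REPO_URL/branches")"

# The branch still exists in r2, so a peg revision (@2) can bring it back as r4.
svn copy -q -m "Restore feature branch from r2" \
    "$REPO_URL/branches/feature@2" "$REPO_URL/branches/feature"
AFTER_RESTORE="$(svn list "$REPO_URL/branches")"
echo "$AFTER_RESTORE"

# List the revisions committed on this branch path, stopping where it was copied.
svn log -q --stop-on-copy "$REPO_URL/branches/feature"
```

The `--stop-on-copy` option limits the log to revisions on the branch path itself, which is a text-only approximation of reading one branch line in the Revision Graph.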
If you’re after more Subversion tips, tricks and best practices, WANdisco runs frequent free training webinars for the SVN community. Visit the WANdisco training page for the full programme of upcoming webinars.
Author Bio: Jessica Thornsby is the Technical and Creative Copywriter at WANdisco. She writes regular tutorials on Apache Subversion, TortoiseSVN, uberSVN, and all things version control at www.blogs.wandisco.com. She spends her spare time editing the CD reviews over at www.leedsmusicscene.net, contributing to A Short Fanzine About Rocking, and researching her family tree.
This tutorial originally appeared in JAX Magazine: New Horizons in September. Download that and other issues here.