Open Source 101
FOSS - A Java developer's best friend
The use of open source software has pretty much become a way of life for most of us today. And for most, contributing to open source projects has allowed us to expand our visibility to development teams outside our daily grind, as well as offering a way of engaging in new areas of interest. New projects start every day: some with big backers and lots of visibility, while others only have a handful of dedicated developers trying to solve a local problem.
Another set of highly visible, important projects is hosted in repositories run by open source foundations such as the Linux Foundation, the Apache Software Foundation, the Eclipse Foundation and the Mozilla Foundation. Further projects are available through public repositories, like the Central Repository. Together these repositories represent a great opportunity to leverage hundreds of thousands of projects and billions of lines of code.
However, sifting through the many repositories and projects can be a serious challenge. Furthermore, the noise level on many of these sites is high. Projects are regularly abandoned, morphed into other projects, split, or reborn in other forms. While many have active teams working daily, others limp along and die a long, slow death.
Late last year, we ran an analysis of the ~550,000 projects tracked on Ohloh.net to take a look into how many projects were “active”. To run this analysis, we examined both commits and committers. Here’s what we found:
- 550,000+ projects on Ohloh.
- 271,372 with a code analysis.
- 96,824 with a commit in the past 2 years.
- 46,883 with a commit in the past year.
- 29,303 with a commit in the past 6 months.
- 21,251 with a commit in the past 3 months.
- 12,870 with a commit in the past month.
- 5,629 with a commit in the past week.
- 1,224 with a commit in the past day.
Looking at this in chart form in Figure 1, we saw that only 17.3% of the projects analyzed had any activity in the past 12 months.
Figure 1: Projects with activity in last 12 months
Furthermore, we found that of this 17.3%, only half of those projects had two or more developers actively working on the project. This means approximately 8.5% of all projects were still active, as shown in Figure 2.
Figure 2: Still active projects in the last 12 months
Figure 3: Leader of the pack
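As a quick check on the numbers, here is a minimal Python sketch that re-derives the percentages from the counts listed above. It assumes the denominator is the 271,372 projects with a code analysis, which is the figure that reproduces 17.3%. The names `ANALYZED`, `COMMIT_WINDOWS` and `share` are purely illustrative.

```python
# Counts taken from the list above.
ANALYZED = 271_372  # projects with a code analysis (assumed denominator)

# Projects with at least one commit inside each window.
COMMIT_WINDOWS = {
    "2 years": 96_824,
    "1 year": 46_883,
    "6 months": 29_303,
    "3 months": 21_251,
    "1 month": 12_870,
    "1 week": 5_629,
    "1 day": 1_224,
}


def share(count, total=ANALYZED):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


if __name__ == "__main__":
    for window, count in COMMIT_WINDOWS.items():
        print(f"commit in the past {window}: {share(count)}%")
```

For the one-year window this gives 17.3%. Taking “half of those” literally gives roughly 8.6%, in line with the approximately 8.5% quoted above.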
From this analysis we can see that only a small fraction of created projects ever gain and sustain long-term traction. This might seem obvious, but activity matters. Without an active project team, you have little chance of getting bugs fixed for a project that you depend upon. So be sure to check out the activity levels for any project you are considering using.
Finding the Right Project
With so much choice, finding a project that can help you with your next development effort should be easy, right? Most of the open source forges offer project search capabilities, some with a more faceted search function than others. Here’s a short list of the biggies:
You can also search or post a question on StackOverflow.com to get recommendations from other developers who may have solved similar problems. There are also a number of public open source directories that you can search to come up with options, including:
- Ohloh (550,000 projects)
- OLEX (330,000 projects)
- Ostatic (120,000 projects)
- (Maven) Central Repository
- Free Software Foundation (6,850 projects)
- osalt.com (~500 projects)
- EOS Directory (Enterprise-ready OSS) (~400 projects)
If you are looking for a more specific piece of code, you can alternatively use code search tools to discover projects that meet your needs. Here’s a list of public code search options:
Choosing the “Right” Project
By now, you can see that there are a ton of options for finding open source projects. But how do you go about choosing the “right” project, one that both solves your problem and has the right characteristics for your use case? You’ll need to weigh many considerations as you evaluate options. Below is my list, but you may have others:
- Which languages are used?
- Which license is used for the project?
- How does the documentation look?
- How active is the project?
- How well maintained is the project?
- Is the code widely used in other places?
- How big is the project and how complex is it?
- Are there known security vulnerabilities?
- Any outstanding lawsuits?
- Is commercial support available?
- Do you have export requirements? Does the project use encryption?
- What is the quality of the code?
Some of these are easier to answer than others. You can download the code and inspect it to determine things such as the languages used, size, complexity and code quality. Others are a bit trickier to work out: tracking down the licenses throughout the dependency chain, finding out how active and well maintained the project is, how widely it is used, whether there are lawsuits, and so on. Here’s an attempt to provide you with some direction on how to answer the questions.
The Easy Ones
These you can answer by using one of the code search tools or by downloading the source:
- Which languages are used?
- Which license is used for the project? Or check a project directory like Ohloh, OLEX, etc.
- How is the documentation? Look in the wiki, review the code for comments, or check Ohloh (it counts comments)
- How big is the project and how complex is it?
A Little Harder (but still available)
- Are there known vulnerabilities? (via the National Vulnerability Database)
- How well maintained is the project? Check the bugbase to see how many high priority bugs are open and for how long
- How active is the project? The number of active committers and commit stream will help you here (Ohloh summarizes this data for you)
- Is the code widely used in other places? Search StackOverflow and Google, and check download stats
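If you have a project’s commit history to hand, the activity question can be answered mechanically. Here is a rough Python sketch, assuming you can export the history as (author, commit date) pairs. The function name `activity_summary`, the 365-day window and the sample data are illustrative assumptions. The “active” rule mirrors the analysis earlier in this article: at least one commit in the window and two or more developers.

```python
from datetime import date, timedelta


def activity_summary(commits, today, window_days=365):
    """Summarize recent activity from (author, commit_date) pairs."""
    cutoff = today - timedelta(days=window_days)
    recent = [(author, day) for author, day in commits if day >= cutoff]
    committers = {author for author, _ in recent}
    return {
        "commits": len(recent),
        "committers": len(committers),
        # Assumed rule: a commit in the window and two or more committers.
        "active": len(recent) > 0 and len(committers) >= 2,
    }


if __name__ == "__main__":
    # Made-up commit history, for illustration only.
    history = [
        ("alice", date(2013, 1, 10)),
        ("bob", date(2013, 3, 2)),
        ("alice", date(2011, 6, 30)),
    ]
    print(activity_summary(history, today=date(2013, 6, 1)))
```

Adjust the window and the committer threshold to suit how much risk you are willing to carry on a given dependency.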
The Tougher Questions
- Any outstanding lawsuits? Google search for project name and “lawsuit”
- Is commercial support available? Companies like Credativ and OpenLogic in the US support a subset of FOSS projects.
- Does the project use encryption? Sometimes this is documented on project sites, otherwise explore the project yourself.
- What is the quality of the code? A limited number of projects have code quality audits available from Coverity.
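One simple way to keep track of where you stand with a candidate project is to treat the questions above as a checklist and record answers as you find them. This is only a sketch: `QUESTIONS` and `unanswered` are illustrative names, and the sample answers are made up.

```python
# The evaluation questions from the checklist above.
QUESTIONS = [
    "Which languages are used?",
    "Which license is used for the project?",
    "How does the documentation look?",
    "How active is the project?",
    "How well maintained is the project?",
    "Is the code widely used in other places?",
    "How big is the project and how complex is it?",
    "Are there known security vulnerabilities?",
    "Any outstanding lawsuits?",
    "Is commercial support available?",
    "Do you have export requirements? Does the project use encryption?",
    "What is the quality of the code?",
]


def unanswered(answers):
    """Return the checklist questions that have no recorded answer yet."""
    return [question for question in QUESTIONS if not answers.get(question)]


if __name__ == "__main__":
    # Made-up answers for a hypothetical candidate project.
    answers = {
        "Which languages are used?": "Java",
        "Which license is used for the project?": "Apache License 2.0",
    }
    for question in unanswered(answers):
        print(f"still open: {question}")
```

Whatever is still open when you submit a request is a gap your approval body will have to fill in for itself.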
Your organization likely has policies and guidelines in place surrounding the licenses and types of projects which can be used in your projects. Many organizations have formal approval processes in place (some are more manual than others). The more information you can provide to your approval body, the faster your request will be approved.
It’s important to know your policies in advance, so you don’t end up wasting time with projects which have obvious reasons to be rejected. If you use a lot of open source, explore putting an automated solution in place to speed up the process – it can make a huge difference. Some of the tools available can automatically start the process by detecting new open source components in the build process. Depending on your environment, you may be able to really simplify the whole process.
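At its simplest, detecting new components comes down to comparing what a build pulls in against what has already been approved. The sketch below is an assumption about how such a check might look, not a description of any particular tool. `new_components` and the component names are hypothetical.

```python
def new_components(build_dependencies, approved):
    """Return components found in the build that are not on the approved list."""
    return sorted(set(build_dependencies) - set(approved))


if __name__ == "__main__":
    # Hypothetical component names, for illustration only.
    approved = {"example-logging", "example-json"}
    build = ["example-logging", "example-json", "example-crypto"]
    for name in new_components(build, approved):
        print(f"needs approval: {name}")
```

Run as a build step, a check like this could open an approval request the moment an unfamiliar component appears, rather than weeks later.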
Tracking Your Open Source Use
It is super important to know what FOSS you are using and where it is all being used. Security vulnerabilities are often reported post-production, so you’ll need to be able to quickly and easily find all instances of a specific component across your application portfolio.
Other issues that come up include license issues, new releases and bug fixes. Several options are available for tracking your open source, from the simple list/spreadsheet option, to much more sophisticated catalogues that integrate into the selection and approval process. Depending on the size and complexity of your organization, and the number of open source components and applications you are using, you can decide on the best approach. What is most important is to implement a diligent tracking method first.
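Even the simple list option can answer the key question, “where are we using this component?”, if it is kept as a mapping from application to components. Here is a minimal sketch under that assumption. The `applications_using` function, the application names and the component names are all hypothetical.

```python
def applications_using(inventory, component):
    """Return the applications whose component list includes the component."""
    return sorted(
        app for app, components in inventory.items() if component in components
    )


if __name__ == "__main__":
    # Hypothetical applications and components, for illustration only.
    inventory = {
        "billing": {"example-json", "example-crypto"},
        "storefront": {"example-json"},
        "reporting": {"example-logging"},
    }
    print(applications_using(inventory, "example-json"))
```

When a vulnerability is reported against a component, a lookup like this tells you which applications need attention first.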
Are you an Open Source Freeloader?
Ian Skerrett, VP of Marketing for the Eclipse Foundation, wrote a great blog post titled ‘What to do with Open Source Freeloaders’. He starts the post by saying: “On occasion, people working on open source projects will lament how a lot of organizations are using the output of the open source project but not contributing back.” Turns out that one of the important metrics open source projects use to measure their success is the number of users they have. And let’s face it – “freeloading” is pretty much where everyone starts in the world of open source. So being a freeloader isn’t really a bad thing. But it’s worth looking at why, over time, it is important to contribute back to some of the projects you are using.
People who use open source follow a “life-cycle” as they learn how to leverage open source. It looks something like Figure 4.
Figure 4: The open source life-cycle
Your initial use of open source is likely opportunistic, in that you are looking for a way to save time by using code someone else wrote. On a task-by-task basis, you will make decisions on whether or not to use open source, and the benefits are typically unplanned in overall project development.
After a while, you start to realize that beyond the time savings, using open source frees you up to spend more time innovating code that is really important to your business. You will then likely reach a point where you start to make open source part of your core development strategy.
When you make this decision, you’ll implement more formal policies around open source use and push your development teams to use open source as often as possible in projects. And once you are up and running with your open source strategy, you will quickly realize that helping fix bugs and contributing to key open source components you depend upon can help you stay away from forking projects. This in turn helps reduce maintenance costs for your organization. Otherwise, if you modify and fork the code base, you’re stuck with it forever. Additionally, you’ll quickly find the community can help you refine and advance new features you want.
Starting and Managing Projects – Important versus Strategic
The final stop in your open source journey comes when you realize that starting and managing open source projects can allow you to leverage a broader community to advance technology that is important, but may not be strategic to your business.
Hadoop is a great example of this. Yahoo! needed to be able to manage big data within its organization. While this was important, the company soon realized big data wasn’t strategic to its business, and that other people were interested in the technology too. By open sourcing Hadoop, the company found it was able to advance the Hadoop project more quickly (while still investing in approximately the same way), which in turn allowed it to move faster in building out a broader Yahoo! platform that depended on Hadoop.
If you create a technology that is important to your cause, but not a competitive differentiator, then leading the project into the open source world can help you accelerate the project and avoid getting bogged down in building technology that isn’t strategic to your business.
The Open Source Opportunity
The world of open source is growing at an unprecedented pace. Companies of all sizes are realizing that the use of open source is no longer an option, but instead is a necessity to compete in today’s market.
As a developer, your value depends on your ability to navigate, adopt, manage, and participate in the open source world. Whether you are using, contributing, or managing your own open source project, engaging with the open source community is critical. And if you are only using OSS today, look for opportunities to join the community and start contributing. You’ll not only become more valuable and relevant, but you’ll also learn new approaches to solving software problems and expand your network of developers who can help you increase your value in the market.
Author Bio: Dave Gruber is Black Duck’s Director of Developer Programs. He has an extensive background in software development, with over 30 years’ experience in enterprise application development, IT management, product management and product marketing. Gruber was an early pioneer of web infrastructure and development technologies working at companies like Allaire, Macromedia and Adobe. At Black Duck, Gruber drives go-to-market strategies and programs, with a focus on helping developers gain greater visibility and insights into the world of open source software leading to faster development.
This article appeared in JAX Magazine: Pulling together.