Open Source 101

FOSS – A Java developer’s best friend

Dave Gruber

From March’s JAX Magazine, Black Duck’s Dave Gruber explains everything you need to consider when fully embracing open source projects in the enterprise.

The use of open source software has pretty much become a way of
life for most of us today. And for most, contributing to open
source projects has allowed us to expand our visibility to
development teams outside our daily grind, as well as offering a
way of engaging in new areas of interest. New projects start every
day: some with big backers and lots of visibility, while others
only have a handful of dedicated developers trying to solve a local
problem. 

Open source projects are hosted on many different “forges”
around the globe, on sites like GitHub, SourceForge, Google Code,
Launchpad, RubyForge and CodePlex.

Another set of highly visible, important projects are hosted in
local repositories through open source foundations like the
Linux Foundation, the Apache Foundation, the Eclipse Foundation and
the Mozilla Foundation. Further projects are available through
private repositories, like the Central Repository. Together these
repositories represent a great opportunity to leverage hundreds of
thousands of projects and billions of lines of code.

However, sifting through the many repositories and projects can
be a serious challenge. Furthermore, the noise level on many of
these sites is high. Projects are regularly abandoned, morphed into
other projects, split, or reborn in other forms. While many have
active teams working daily, others limp along and die a long, slow
death. 

Late last year, we ran an analysis of the ~550,000 projects
tracked on Ohloh.net to take a
look into how many projects were “active”. To run this analysis, we
examined both commits and committers. Here’s what we found:

  • 550,000+ projects on Ohloh.
  • 271,372 with a code analysis.
  • 96,824 with a commit in the past 2 years.
  • 46,883 with a commit in the past year.
  • 29,303 with a commit in the past 6 months.
  • 21,251 with a commit in the past 3 months.
  • 12,870 with a commit in the past month.
  • 5,629 with a commit in the past week.
  • 1,224 with a commit in the past day.

Looking at this in chart form in Figure 1, we
saw that only 17.3% of the projects analyzed (46,883 of the 271,372
with a code analysis) had any activity in the past 12 months.

Figure 1: Projects with activity in the last 12 months

Furthermore, we found that only about half of this 17.3% had two
or more developers actively working on the project. This means
approximately 8.5% of all analyzed projects were still active, as
shown in Figure 2.

Figure 2: Still active projects in the last 12 months
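
For readers who want to check the arithmetic, here is a minimal
sketch in Python. It assumes the 17.3% figure is the share of
projects with a commit in the past year among those with a code
analysis, which is what the counts in the list suggest. The 0.5
factor is the rounded “half” from the text, not an exact count, so
the second result lands slightly above the approximate 8.5% quoted.

```python
# Counts taken from the Ohloh list in the article.
PROJECTS_WITH_CODE_ANALYSIS = 271_372
PROJECTS_WITH_COMMIT_IN_PAST_YEAR = 46_883


def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Share of analyzed projects with a commit in the past 12 months.
active_share = share(PROJECTS_WITH_COMMIT_IN_PAST_YEAR,
                     PROJECTS_WITH_CODE_ANALYSIS)

# Assumption: "half" is applied as exactly 0.5; the article only
# gives it as a rounded figure.
multi_developer_share = active_share * 0.5

print(f"{active_share:.1f}% with a commit in the past year")
print(f"{multi_developer_share:.1f}% also with two or more developers")
```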

Examining primary languages used within these active projects in
Figure 3, we see that Java continues to lead the pack. That said,
we also found newer projects were using Python, PHP, and JavaScript
heavily. 

Figure 3: Leader of the pack

From this analysis we can see that only a small fraction of
created projects ever gain and sustain long-term traction. This
might seem obvious, but activity matters. Without an active project
team, you have little chance of getting bugs fixed for a project
that you depend upon.  So be sure to check out the activity
levels for any project you are considering using. 

Finding the Right Project

With so much choice, finding a project that can help you with
your next development effort should be easy, right? Most of the
open source forges offer project search capabilities, some with a
more faceted search function than others. Here’s a short list of
the biggies:

You can also search or post a question on Stackoverflow.com to
get recommendations from other developers who may have solved
similar problems. There are also a number of public open
source directories that you can search to come up with options,
including:

If you are looking for a more specific piece of code, you can
alternatively use code search tools to discover projects that meet
your needs. Here’s a list of public code search options:

Choosing the “Right” Project

By now, you can see that there are a ton of options for finding
open source projects. But how do you go about choosing the “right”
project, one that both solves the problem you are trying to solve
and has the right characteristics to fit your use case? You’ll need
to weigh many considerations as you evaluate options. Below is my
list, but you may have others:

  1. Which languages are used? 
  2. Which license is used for the project? 
  3. How does the documentation look? 
  4. How active is the project? 
  5. How well maintained is the project? 
  6. Is the code widely used in other places?
  7. How big is the project and how complex is it?
  8. Are there known security vulnerabilities?
  9. Any outstanding lawsuits?
  10. Is commercial support available?
  11. Do you have export requirements? Does the project use
    encryption?
  12. What is the quality of the code?

Some of these are easier to answer than others. You can download
the code and inspect it to determine things such as languages used,
size, complexity and code quality. Others are a bit trickier to
work out: tracking down the licenses throughout the dependency
chain, finding out how active and well maintained the project is,
how widely it is used and adopted, whether there are lawsuits, and
so on. Here’s an attempt to provide you with some direction on how
to answer the questions.

The Easy Ones 

These you can answer by using one of the code search tools or by
downloading the source:

  • Which languages are used?
  • Which license is used for the project? Look in the source,
    or check a project directory like Ohloh, OLEX, etc.
  • How is the documentation? Look in the wiki, review the code
    for comments, or check Ohloh (it counts comments).
  • How big is the project and how complex is it?

A Little Harder (but still available)

  • Are there known vulnerabilities? Check the National
    Vulnerability Database (NVD).
  • How well maintained is the project? Check the bugbase to
    see how many high priority bugs are open and for how long
  • How active is the project? The number of active committers
    and commit stream will help you here (Ohloh summarizes this data
    for you)
  • Is the code widely used in other places? Search
    StackOverflow, Google and download stats
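
If you have a project’s commit log to hand, the activity check can
be sketched in a few lines of Python. This is only an illustration
of the definition used earlier in this article (a commit in the
past 12 months and two or more active developers). The function
name, its parameters and the sample data are all hypothetical, and
this is not how Ohloh computes its summaries.

```python
from datetime import date, timedelta


def is_active(commits, today, window_days=365, min_committers=2):
    """Return True if enough distinct people committed recently.

    commits is an iterable of (commit_date, committer_name) pairs.
    """
    cutoff = today - timedelta(days=window_days)
    recent_committers = {name for day, name in commits if day >= cutoff}
    return len(recent_committers) >= min_committers


# Hypothetical commit history for two made-up projects.
today = date(2013, 3, 1)
busy_project = [
    (date(2013, 2, 20), "alice"),
    (date(2013, 1, 5), "bob"),
    (date(2011, 6, 1), "carol"),
]
quiet_project = [
    (date(2012, 11, 2), "dave"),
    (date(2010, 4, 9), "erin"),
]

print(is_active(busy_project, today))   # two recent committers
print(is_active(quiet_project, today))  # only one recent committer
```

Tightening window_days or raising min_committers gives a stricter
notion of “active” if your project depends heavily on the code.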

The Tougher Questions

  • Any outstanding lawsuits? Google search for project name
    and “lawsuit”
  • Is commercial support available? Companies like Credativ
    and OpenLogic in the US support a subset of FOSS
    projects.
  • Does the project use encryption? Sometimes this is
    documented on project sites, otherwise explore the project
    yourself.
  • What is the quality of the code? A limited number of
    projects have code quality audits available from
    Coverity.

Approvals

Your organization likely has policies and guidelines in place
surrounding the licenses and types of projects which can be used in
your projects. Many organizations have formal approval processes in
place (some are more manual than others). The more information you
can provide to your approval body, the faster your request can be
approved.

It’s important to know your policies in advance, so you don’t
end up wasting time with projects which have obvious reasons to be
rejected. If you use a lot of open source, explore putting an
automated solution in place to speed up the process – it can make a
huge difference. Some of the tools available can automatically
start the process by detecting new open source components in the
build process. Depending on your environment, you may be able to
really simplify the whole process.
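
To make the idea concrete, here is a rough Python sketch of what
“detecting new open source components in the build” could look
like. It is not any particular tool’s behavior. The approved lists,
component names and the shape of the build manifest are all
hypothetical, and a real setup would read them from your build
system and approval records.

```python
# Hypothetical policy data; a real organization would load these from
# its own approval records.
APPROVED_LICENSES = {"Apache-2.0", "MIT", "BSD-3-Clause"}
APPROVED_COMPONENTS = {"example-logging:1.2", "example-json:2.0"}


def components_needing_review(build_components):
    """List components in a build that have not been approved yet.

    build_components maps a component id to its declared license.
    Returns (component, license, reason) tuples sorted by component id.
    """
    needs_review = []
    for component, license_name in sorted(build_components.items()):
        if component in APPROVED_COMPONENTS:
            continue  # already approved, nothing to do
        if license_name in APPROVED_LICENSES:
            reason = "new component, license already approved"
        else:
            reason = "new component, license not on approved list"
        needs_review.append((component, license_name, reason))
    return needs_review


# Hypothetical build manifest.
build = {
    "example-logging:1.2": "Apache-2.0",
    "example-xml:0.9": "MIT",
    "example-crypto:3.1": "GPL-3.0",
}

for entry in components_needing_review(build):
    print(entry)
```

Splitting the result by reason lets the easy cases (a new component
under an already approved license) move through faster than the
ones that need a closer look.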

Tracking Your Open Source Use

It is essential to know which FOSS components you are using and
where each one is used. Security vulnerabilities are often reported
post-production, so you’ll need to be able to quickly and easily
find all instances of a specific component across your application
portfolio.

Other issues that come up include license issues, new releases
and bug fixes. Several options are available for tracking your open
source, from the simple list/spreadsheet option, to much more
sophisticated catalogues that integrate into the selection and
approval process. Depending on the size and complexity of your
organization, and on the number of open source components and
applications you are using, you can decide on the best approach.
What matters most is to put a diligent tracking method in place
first.
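
Even the simple list option can answer the “where is this
component used?” question if it is kept in a structured form. The
following Python sketch assumes a hypothetical inventory that maps
each application to the components and versions it uses. The names
and data are made up for illustration.

```python
from collections import defaultdict


def build_index(portfolio):
    """Invert an application -> components mapping.

    portfolio maps an application name to a list of (component, version)
    pairs. The result maps each component name to the (application,
    version) pairs that use it.
    """
    index = defaultdict(list)
    for application, components in portfolio.items():
        for component, version in components:
            index[component].append((application, version))
    return index


def where_used(index, component):
    """Return the sorted (application, version) pairs using a component."""
    return sorted(index.get(component, []))


# Hypothetical portfolio of three applications.
portfolio = {
    "billing": [("example-json", "2.0"), ("example-logging", "1.2")],
    "storefront": [("example-json", "1.8")],
    "reports": [("example-logging", "1.2")],
}

index = build_index(portfolio)
print(where_used(index, "example-json"))
```

When a vulnerability is reported against a component, a lookup like
this tells you which applications to patch and which versions they
are running.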

Are you an Open Source Freeloader?

Ian Skerrett, VP of Marketing for the Eclipse Foundation, wrote
a great blog post titled ‘What to do with Open Source Freeloaders’.
He starts the post by saying: “On occasion, people working on open
source projects will lament how a lot of organizations are using
the output of the open source project but not contributing back.”
It turns out that one of the important metrics open source projects
use to measure their success is the number of users they have. And
let’s face it: “freeloading” is pretty much where everyone starts
in the world of open source. So being a freeloader isn’t really a
bad thing. But it’s worth looking at why, over time, it is
important to contribute back to some of the projects you are using.

People who use open source follow a “life-cycle” as they learn
how to leverage open source. It looks something like Figure 4.

Figure 4: The open source life-cycle

Your initial use of open source is likely opportunistic, in that
you are looking for a way to save time by using code someone else
wrote. On a task-by-task basis, you will make decisions on whether
or not to use open source, and the benefits are typically unplanned
in overall project development.

After a while, you start to realize that beyond the time
savings, using open source frees you up to spend more time
innovating code that is really important to your business. You will
then likely reach a point where you start to make open source part
of your core development strategy. 

When you make this decision, you’ll implement more formal
policies around open source use and push your development teams to
use open source as often as possible in projects. And once you are
up and running with your open source strategy, you will quickly
realize that helping fix bugs and contributing to key open source
components you depend upon can help you stay away from forking
projects. This in turn helps reduce maintenance costs for your
organization. Otherwise, if you modify and fork the code base,
you are stuck maintaining that fork yourself. Additionally, you’ll
quickly find the community can help you refine and advance new
features you want.

Starting and Managing Projects – Important versus Strategic

The final stop in your open source journey comes when you
realize that starting and managing open source projects can allow
you to leverage a broader community to advance technology that is
important, but may not be strategic to your business. 

Hadoop is a great example of this. Yahoo! needed to be able to
manage big data within its organization. The company soon realized
that, while important, big data wasn’t strategic to its business,
and that other people were interested too. By open sourcing Hadoop,
the company found it was able to advance the Hadoop project more
quickly while still investing in approximately the same way, which
in turn allowed it to move faster in building out a broader Yahoo!
platform that depended on Hadoop.

If you create a technology that is important to your cause, but
not a competitive differentiator, then leading the project into the
open source world can help you accelerate the project and avoid
getting bogged down in building technology that isn’t strategic to
your business. 

The Open Source Opportunity

The world of open source is growing at an unprecedented pace.
Companies of all sizes are realizing that the use of open source is
no longer an option, but instead is a necessity to compete in
today’s market. 

As a developer, your value depends on your ability to navigate,
adopt, manage, and participate in the open source world. Whether
you are using, contributing, or managing your own open source
project, engaging with the open source community is critical. And
if you are only using OSS today, look for opportunities to join the
community and start contributing. You’ll not only become more
valuable and relevant, but you’ll also learn new approaches to
solving software problems and expand your network of developers who
can help you increase your value in the market. 

Author Bio: Dave Gruber is Black Duck’s
Director of Developer Programs. He has an extensive background in
software development, with over 30 years’ experience in enterprise
application development, IT management, product management and
product marketing. Gruber was an early pioneer of web
infrastructure and development technologies working at companies
like Allaire, Macromedia and Adobe. At Black Duck, Gruber drives
go-to-market strategies and programs, with a focus on helping
developers gain greater visibility and insights into the world of
open source software leading to faster development.

This article appeared in JAX Magazine: Pulling together.

Teaser image courtesy of dlofink
