Charting a flexible course

Cory Isaacson: “MapDB is a pure Java database, for Java developers”

Lucy Carey

Inside the database that carves out its own agile paradigm around application needs.

Cory Isaacson, CEO of CodeFutures (vendors of database
scalability suite dbShards, which provides a true “shared nothing”
architecture for relational DB) speaks to JAXenter about MapDB –
the Apache-licensed open source database especially for Java
developers, tipped to become the Java storage engine of the
future.

JAX: Can you give our readers an overarching
view of what MapDB is all about?

Isaacson: MapDB is a pure Java
database, for Java developers. It is natural to use, all based on
the Java Collections API (Maps, Lists, Sets).

The key to MapDB is that developers can create a
database structure in a new agile paradigm, exactly matching
application needs. It is somewhat like creating a schema in a
typical database, but goes well beyond what you can do with a
typical key-value store. For example, MapDB allows you to create
related maps, supporting object semantics with built-in data
relationships. This makes it very intuitive to create the ideal
structure for the data needed by the application – without the
burden of complex ORM (object-relational mapping) frameworks. Just
create your maps, bind them together, and use the same semantics
and syntax you use today.
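To make the idea of "related maps" concrete, here is a minimal sketch that uses only JDK collections. It is not MapDB's binding API; the class name `RelatedMapsSketch` and its methods are hypothetical, and the two maps are kept in sync by hand, which is the bookkeeping a database with built-in data relationships would presumably take over.

```java
import java.util.Map;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical illustration: two related in-heap maps kept in sync by hand.
// It uses only JDK collections and is NOT MapDB's own API.
final class RelatedMapsSketch {
    // Primary map: user id -> city.
    private final Map<Integer, String> cityByUser = new ConcurrentHashMap<>();
    // Related map: city -> ids of the users living there (a secondary index).
    private final Map<String, NavigableSet<Integer>> usersByCity = new ConcurrentHashMap<>();

    // Insert or move a user, updating both maps so they stay consistent.
    synchronized void put(int userId, String city) {
        String previous = cityByUser.put(userId, city);
        if (previous != null) {
            NavigableSet<Integer> old = usersByCity.get(previous);
            if (old != null) {
                old.remove(userId);
            }
        }
        usersByCity.computeIfAbsent(city, c -> new ConcurrentSkipListSet<>()).add(userId);
    }

    // Look up the related side: all users in a city, in ascending id order.
    NavigableSet<Integer> usersIn(String city) {
        return usersByCity.getOrDefault(city, new ConcurrentSkipListSet<>());
    }

    String cityOf(int userId) {
        return cityByUser.get(userId);
    }

    public static void main(String[] args) {
        RelatedMapsSketch db = new RelatedMapsSketch();
        db.put(1, "Prague");
        db.put(2, "Denver");
        db.put(3, "Prague");
        db.put(1, "Denver"); // user 1 moves; both maps are updated together
        System.out.println(db.usersIn("Prague"));
        System.out.println(db.usersIn("Denver"));
    }
}
```

The application only ever calls `put`, `cityOf` and `usersIn`; the relationship between the two maps stays an internal detail.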

It is very easy to get going with MapDB. All a
developer needs to do is add a single jar to the classpath, modify
the Map creation syntax, and everything else just works. The
project is incredibly flexible and powerful, offering performance
comparable to native C-language embedded databases (like BerkeleyDB
and LevelDB). With MapDB you can access collections in the 10s to
100s of GBs – the same way you access small in-memory object
stores.
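The claim that only the Map creation changes can be sketched as follows. This is an assumption-laden illustration using JDK classes only: the hypothetical factory method `newCounters` is the one place where the map is created, and the rest of the code depends solely on the `ConcurrentNavigableMap` interface. Swapping in a database-backed map would then mean changing that one method, assuming the replacement implements the same interface.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch: application code written against a JDK interface,
// with map creation isolated in a single factory method.
final class MapCreationSketch {
    // The only line that would change when moving from an in-heap map to a
    // database-backed one (assumption: the replacement implements the same
    // ConcurrentNavigableMap interface).
    static ConcurrentNavigableMap<String, Long> newCounters() {
        return new ConcurrentSkipListMap<>();
    }

    // Application logic: knows nothing about where the data is stored.
    static void recordHit(ConcurrentNavigableMap<String, Long> counters, String page) {
        counters.merge(page, 1L, Long::sum);
    }

    public static void main(String[] args) {
        ConcurrentNavigableMap<String, Long> counters = newCounters();
        recordHit(counters, "/home");
        recordHit(counters, "/about");
        recordHit(counters, "/home");
        System.out.println(counters);
    }
}
```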

MapDB is extremely configurable, using a simple
Builder-style API. You can configure caching, the type of store,
durability guarantees and many other features. This way you can
select the right balance of features and performance needed for
your application.

What problems are you trying to solve with
this software?

That is a big question – MapDB can be used for
many common use cases and problems. The main focus is to offer a
natural way for Java developers to access large data stores in a
very agile paradigm, with a schema that precisely matches
application needs.

One common problem many applications suffer is
running out of Java heap memory, or excessive Garbage Collection
from attempting to cram too many objects into the application
runtime. This is almost always the result of large memory
collections (Maps, Lists, etc.) with a lot of “churn”. Convert
those to MapDB and now you have the bulk of the data in a durable
form on disk – with an automatic in-memory cache – with exactly the
same API used for your existing collection code.
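The general pattern described here, a large store fronted by a small in-memory cache, can be illustrated with a conceptual sketch. It does not show how MapDB implements its cache: the class `ReadThroughCacheSketch` is hypothetical, and an ordinary `Map` stands in for the on-disk store so the example stays self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Conceptual sketch only: a bounded LRU cache in front of a larger store.
// The "store" is a plain Map standing in for data held on disk.
final class ReadThroughCacheSketch<K, V> {
    private final Map<K, V> backingStore;
    private final Map<K, V> cache;
    private int storeReads = 0; // counts how often the cache was bypassed

    ReadThroughCacheSketch(Map<K, V> backingStore, final int capacity) {
        this.backingStore = backingStore;
        // accessOrder = true makes LinkedHashMap track least-recently-used order.
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity; // evict once the cache is over capacity
            }
        };
    }

    // Serve from the cache when possible, otherwise read the store and cache the value.
    synchronized V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            storeReads++;
            value = backingStore.get(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    // Write to the store and keep the cache up to date.
    synchronized void put(K key, V value) {
        backingStore.put(key, value);
        cache.put(key, value);
    }

    synchronized int storeReads() {
        return storeReads;
    }
}
```

Only the most recently used entries occupy heap memory; everything else lives in the backing store and is fetched on demand.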

Another big problem is how to perform many
common database tasks in an easy manner (sorts, iterating through
collections, transactions). MapDB supports all of these, using the
native Java concurrent APIs plus a few easy-to-learn
extensions.
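As a rough illustration of the "native Java concurrent APIs" mentioned here, the sketch below uses the JDK's `ConcurrentNavigableMap` interface with an in-heap `ConcurrentSkipListMap`. The class name `NavigableOpsSketch` is hypothetical, and transactions are not shown because they belong to the extensions rather than to the JDK interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Sketch of sorted iteration, range views and atomic updates using only the
// JDK's ConcurrentNavigableMap interface (in-heap implementation shown).
final class NavigableOpsSketch {
    static ConcurrentNavigableMap<Integer, String> sample() {
        ConcurrentNavigableMap<Integer, String> m = new ConcurrentSkipListMap<>();
        m.put(30, "thirty");
        m.put(10, "ten");
        m.put(20, "twenty");
        return m;
    }

    public static void main(String[] args) {
        ConcurrentNavigableMap<Integer, String> m = sample();

        // Iteration follows key order, so no explicit sort step is needed.
        for (Map.Entry<Integer, String> e : m.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }

        // Range view: keys from 10 (inclusive) to 30 (exclusive).
        System.out.println(m.subMap(10, 30).keySet());

        // Atomic conditional updates from the concurrent API.
        m.putIfAbsent(20, "ignored");  // key exists, so the value is kept
        m.replace(10, "ten", "TEN");   // replaced only if the current value matches
        System.out.println(m);
    }
}
```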

What makes you so convinced that MapDB has a
good chance to become “the de facto standard Java storage
engine”?

Because MapDB is flexible, fast and freely
available for use in any type of project, under the Apache 2.0
License. In many cases there is no other competition, except
writing your own solution. It is trivial to plug MapDB into an
existing project, and instantly gain all of the power of many other
database offerings in one single package (most of which are either
commercial closed source projects or offered with restrictive
“viral” open source licenses).

The upshot is that MapDB is powerful, agile,
flexible – and freely available to use and distribute in any way
the developer sees fit.

What’s the history of MapDB’s
development?

DBM (Database Manager) was a simple database
engine written by Ken Thompson for UNIX – basically a hash table on
disk. The JDBM project (a Java port) was started around 2000 by a
group of developers, with JDBM 1.0 released in 2005. The project
languished with not much active support or interest, yet the
potential of this type of data structure was immense.

Jan Kotek worked on persistence for an
astronomical application. Originally he modified the H2 database, but
SQL had major overhead. In 2010 he spent several weeks doing
astronomical observations in a remote region of the Chilean Andes, and by
some stroke of luck had the JDBM source code on his laptop. To beat
the long boring days (all astronomy activities are at night of
course), he started modifying and improving JDBM. And as they say,
the rest is history.

Jan soon released JDBM2, and subsequently JDBM3.
These libraries were widely used by many companies.

Realizing the potential for a full-fledged,
powerful and agile Java database relying on native APIs, Jan
renamed the project MapDB and left his “day job” to dedicate
himself to the project full time in early 2013. At CodeFutures we
have long used the JDBM and MapDB projects, and love the database
and its potential. In November 2013 CodeFutures brought Jan on the
team on a full-time basis, giving him the freedom and economic
support needed to fully dedicate his efforts toward making MapDB
the leading Java database in the world.

Are there any disadvantages to MapDB being so
generic?

There are some disadvantages, oddly enough tied
to the agile nature of MapDB’s flexible features. It takes a bit of
learning to understand the various configuration options, and the
many ways you can structure the data to meet application needs.
This learning curve is comparable to, and perhaps easier than, that of other new
database options (such as MongoDB and Redis). There are many ways
to use MapDB and we are hard at work improving the documentation to
address these issues, including how-to recipes for common use
cases.

How close would you say MapDB is to meeting
its original design goals?

The idea was to make a database natural for Java
developers, one that is agile and has the ability to support data
structures needed by a wide array of applications. In this regard,
Jan has done a great job of meeting these goals.

What’s on the roadmap ahead for
MapDB?

Bug fixing, bug fixing, bug fixing.

There are many new features being considered,
with a pending TODO list of 400 improvements. Topping the list are
append-only file stores, improved snapshots, incremental backups
and faster commits. Another very exciting new feature on the
list is to fully support new Lambda expressions in Java 8; this
will enable true parallel processing within large maps,
accelerating capabilities such as complex aggregations.
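For readers unfamiliar with what lambda-driven parallel aggregation over a map looks like, here is a sketch using only the Java 8 JDK (streams and `Collectors`). It does not show MapDB's planned feature, whose API is not described in this interview; the `Order` type and `totalsByRegion` method are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Sketch of a parallel aggregation over a map's values with Java 8 lambdas.
// JDK-only illustration; it is not MapDB's API.
final class ParallelAggregationSketch {
    // Hypothetical value type stored in the map.
    static final class Order {
        final String region;
        final long amount;

        Order(String region, long amount) {
            this.region = region;
            this.amount = amount;
        }
    }

    // Sum order amounts per region, letting the stream split the work across threads.
    static Map<String, Long> totalsByRegion(Map<Integer, Order> orders) {
        return orders.values().parallelStream()
                .collect(Collectors.groupingByConcurrent(
                        (Order o) -> o.region,
                        Collectors.summingLong((Order o) -> o.amount)));
    }

    public static void main(String[] args) {
        Map<Integer, Order> orders = new ConcurrentHashMap<>();
        orders.put(1, new Order("EU", 10));
        orders.put(2, new Order("US", 5));
        orders.put(3, new Order("EU", 30));
        System.out.println(totalsByRegion(orders));
    }
}
```

The aggregation logic is a pair of small lambdas, and the decision to run in parallel is a single method call, which is the style of processing the roadmap item points toward.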

What attracted CodeFutures to MapDB, and where
is the company using it?

CodeFutures used both JDBM3 and MapDB in its
products, and always found them very fast, powerful and easy to
implement. The focus of the company has always been on
high-performance data engines (we don’t offer our own database, we
make other engines better) – so it was natural to team up with Jan
to expand MapDB’s capabilities.

What would you say are the headline features
of MapDB – and which are you personally most excited
about?

- Agile data structures, meeting exact needs of
Java applications

- Drop-in replacement for Java
collections

- Full transaction support

Is there an active community around
MapDB?

Yes, there is a mailing list, and many users
report bugs and send test cases and patches. We are seeing an
expanding group of users and companies taking advantage of MapDB,
and we will help to expand the community even further going
forward.
