Hibernate OGM: Lowering the Barrier of Entry to NoSQL
NoSQL tools and big data in general will revolutionize what we do in our applications.
Earlier this month, the public got their first look at a brand new, NoSQL-orientated Hibernate project: Hibernate Object Grid Mapping (OGM). This project aims to provide a JPA engine that stores data in NoSQL stores, and the first Alpha release was used to power the 2011 JBoss World Keynote. JAXenter spoke to JBoss platform architect and Hibernate Search and Hibernate Validator founder, Emmanuel Bernard, to find out more about this new project.
JAXenter: The first public alpha release of new Hibernate project Hibernate OGM has just been announced. What is the OGM project?
Emmanuel Bernard: Hibernate OGM stands for Object / Grid Mapper. The idea is to offer the full JPA API and semantics (or their Hibernate native counterpart). But instead of storing data in a relational database like Hibernate Core does, we store it in a NoSQL datastore. We also hope to offer a subset of JP-QL. We are talking about a full-fledged JPA engine: same API, same semantics (cascading, associations etc.), same query language.
The project started as a solution to offer JPA APIs on top of
Infinispan (JBoss’s data grid). While working on the project, we
realized that the choices we made:
* could be generalized to other key/value stores
* fit nicely with other NoSQL families, in particular document-oriented ones
So we have decided to make Hibernate OGM datastore agnostic (which in retrospect fits nicely into Hibernate Core’s philosophy).
In Hibernate OGM, we try very hard to make the underlying data storage independent of your application. The side effect is that the same data will be readable by other platforms like Ruby, .NET or whatever the next big thing ends up being. It’s quite critical for your data to be portable. Data easily outlives applications tenfold.
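To make the portability point concrete, here is a minimal sketch of how a domain object could be flattened into a neutral set of column/value pairs before it reaches a key/value store. Everything in it is an assumption made for illustration: the `TupleSketch` class, the `User` object, the `"table:id"` key format and the in-memory `Map` standing in for the datastore are hypothetical and are not taken from Hibernate OGM itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from Hibernate OGM.
class TupleSketch {

    // A plain domain object, as a domain-model-driven application would define it.
    static final class User {
        final long id;
        final String name;
        final int age;

        User(long id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }
    }

    // Assumed key format: table name and identifier joined by a colon.
    static String keyFor(String table, long id) {
        return table + ":" + id;
    }

    // Flatten the object into basic column/value pairs. Only strings and
    // numbers are stored, with no Java-specific serialization, so another
    // platform could read the same entry.
    static Map<String, Object> toTuple(User user) {
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("id", user.id);
        tuple.put("name", user.name);
        tuple.put("age", user.age);
        return tuple;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a key/value datastore.
        Map<String, Map<String, Object>> store = new LinkedHashMap<>();

        User user = new User(1L, "Alice", 30);
        store.put(keyFor("User", user.id), toTuple(user));

        System.out.println(store);
    }
}
```

The design choice worth noting is that the stored value contains no application classes, only basic types, which is what would let a non-Java client read it.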
JAXenter: How does Hibernate OGM aim to lower the barrier of entry for NoSQL?
Emmanuel: When you choose one NoSQL product over
another (or over a relational database), many choices are at stake:
* the programmatic API
* the (non-)query engine
* the transaction / throughput / availability / partitioning semantics
* the tools to facilitate ops and support
The idea behind Hibernate OGM is to let you reuse the same
programmatic API, same object lifecycle semantics and (to a certain
extent) the same query engine. There is a difference between a
developer that can use APIs that are already familiar and well
integrated with his programmatic model, versus using an entirely
different set of APIs and programmatic model. You will still need
to focus on which engine best fits your data storage needs, but
assuming your application is domain model driven, Hibernate OGM
will be a useful tool to limit leakage between your app and
datastore. You will also be able to choose your NoSQL datastore later in the application lifecycle just like Hibernate Core dialects let you decide which relational engine to use later in your development cycle. To be honest, it won’t be as abstracted for Hibernate OGM: NoSQL engines are vastly different.
Fundamentally, Hibernate OGM is a tool that reduces the barrier of entry to NoSQL solutions. We think, like many, that NoSQL tools and big data in general will revolutionize what we do in our applications. We want people to try and explore new data patterns without having to invest a lot in time and money.
JAXenter: What technologies are at work in Hibernate OGM?
Emmanuel: That’s one of the beauties of the project. Instead of writing a JPA engine from scratch, we reuse most of the mature Hibernate Core engine. We simply replace the two components that interact with the datastore (Persisters and Loaders). To be honest, I did not believe Hibernate Core was flexible enough for this but I was wrong and the engine fits quite well.
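The architecture described here, a shared engine with only the datastore-facing pieces swapped out, can be sketched roughly as follows. This is an illustration under stated assumptions, not Hibernate code: the `Persister` and `Loader` interfaces below are drastically simplified, hypothetical stand-ins that only borrow the names used above, and the `Engine` class and the in-memory map playing the role of the data grid are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: simplified stand-ins, not the Hibernate Core contracts.
class EngineSketch {

    // Writes entity state to the datastore.
    interface Persister {
        void write(String key, Map<String, Object> state);
    }

    // Reads entity state back from the datastore.
    interface Loader {
        Map<String, Object> read(String key);
    }

    // The shared engine: it only talks to the two interfaces and
    // knows nothing about where the state actually lives.
    static final class Engine {
        private final Persister persister;
        private final Loader loader;

        Engine(Persister persister, Loader loader) {
            this.persister = persister;
            this.loader = loader;
        }

        void save(String key, Map<String, Object> state) {
            persister.write(key, state);
        }

        Map<String, Object> find(String key) {
            return loader.read(key);
        }
    }

    // Wires the engine to an in-memory map standing in for a data grid.
    // Supporting another store would mean supplying another pair of
    // implementations, leaving Engine untouched.
    static Engine inMemoryEngine(Map<String, Map<String, Object>> grid) {
        Persister persister = (key, state) -> grid.put(key, new HashMap<>(state));
        Loader loader = grid::get;
        return new Engine(persister, loader);
    }

    public static void main(String[] args) {
        Engine engine = inMemoryEngine(new HashMap<>());

        Map<String, Object> state = new HashMap<>();
        state.put("name", "Alice");
        engine.save("User:1", state);

        System.out.println(engine.find("User:1"));
    }
}
```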
On the query side, Hibernate OGM uses Lucene and Hibernate Search to build indexes and keep them up to date. The query engine converts JP-QL queries into one or several full-text queries. That’s our first step and will let us do JP-QL queries with restrictions (where clauses) and simple *-to-one joins. Once this is stabilized, we will reuse Teiid, a database federation engine that queries several datasources as if they were one and computes joins (doing the aggregation and join work if needed). The Teiid team is working on an embedded version of their query engine as we speak.
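As a rough illustration of what turning a restricted JP-QL query into a full-text query can look like, the sketch below rewrites simple equality restrictions into a string in Lucene's classic query syntax, where a leading `+` marks a required clause. It is a toy under stated assumptions and not how Hibernate OGM or Hibernate Search build their queries: the `QuerySketch` class and its `toFullTextQuery` method are hypothetical, only `alias.field = value` restrictions joined by `and` are handled, and string literals containing the word "and" or an equals sign are not supported.

```java
import java.util.Locale;

// Hypothetical sketch: a toy translation, not the Hibernate OGM query engine.
class QuerySketch {

    // Turns "... where a.x = 'v' and a.y = 1" into "+x:\"v\" +y:1".
    static String toFullTextQuery(String jpql) {
        String marker = " where ";
        int where = jpql.toLowerCase(Locale.ROOT).indexOf(marker);
        if (where < 0) {
            return "*:*"; // no restriction: match every document
        }

        String[] restrictions =
                jpql.substring(where + marker.length()).split("(?i)\\s+and\\s+");

        StringBuilder query = new StringBuilder();
        for (String restriction : restrictions) {
            String[] sides = restriction.split("=", 2);
            if (sides.length != 2) {
                throw new IllegalArgumentException(
                        "Only equality restrictions are supported: " + restriction);
            }

            // Drop the alias prefix ("u.name" becomes "name").
            String path = sides[0].trim();
            String field = path.substring(path.indexOf('.') + 1);

            // Turn a single-quoted JP-QL literal into a double-quoted phrase.
            String value = sides[1].trim();
            if (value.length() >= 2 && value.startsWith("'") && value.endsWith("'")) {
                value = "\"" + value.substring(1, value.length() - 1) + "\"";
            }

            if (query.length() > 0) {
                query.append(' ');
            }
            query.append('+').append(field).append(':').append(value);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(toFullTextQuery(
                "select u from User u where u.name = 'Alice' and u.age = 30"));
    }
}
```

A real engine would work on a parsed query tree and build query objects rather than strings; the string form is used here only because it is easy to read.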
Finally, our initial NoSQL engine is Infinispan which is the evolution of JBoss Cache. As you can see, we reuse many mature projects and add the additional layers on top when needed. The idea is to get a very rich feature set very quickly.
JAXenter: What do you have planned for future releases of Hibernate OGM?
Emmanuel: We have just released Alpha 2, so don’t move your production data to Hibernate OGM just yet!
We do have fairly mature CRUD support (Create Read Update Delete) even though we are still exploring the best way to store data (especially associations). JP-QL support for simple queries is what we are working on at the moment and we hope to get it out as soon as possible. Once that is out, we will test Hibernate OGM for performance and stability improvements. From there we will be ready for a GA release. Support for more complex JP-QL queries will come next with Teiid’s integration.
In parallel to this feature-oriented roadmap, we are exploring various NoSQL engines. We do support Infinispan but we have also worked on the abstraction layer. It is already quite advanced but needs more improvements. The EhCache team is working on a prototype which will help us refine the contract some more. We also have community proposals to work on MongoDB, CouchDB and Redis dialects. Once we are happy with the abstraction contract, we will reach out to them for contributions.
Hibernate OGM is still very young and in a state where every contribution is shaping the project. This is extremely exciting! If you are interested, please contact us: we have many things to do!