Rocking your NoSQL foundations
FoundationDB on uniting NoSQL and ACID compliance
FoundationDB, an ACID-compliant, high-performance NoSQL database, had its GA launch on August 20. We caught up with Nick Lavezzo and Dave Scherer, two co-founders of the low-cost platform, to find out more about this innovative project.
JAXenter: What were the big challenges you initially set out to solve with this platform?
Lavezzo: We looked at the database market, and realised that almost every application being developed needs a database, and there was a real problem with what was available in terms of choice. Application developers were being forced to choose. On one side there were traditional relational databases like MySQL, but the problem with those databases was that they could not scale a single database beyond one machine. So they were really built for a time period before cloud architectures and parallel computing became the norm. With the explosion of all this data coming off social media sites, they were just becoming overwhelmed and not feasible for any large or even medium scale web implementation.
And then there was the NoSQL camp. Emerging NoSQL databases were really great and built with this new frame of mind of scalability and distributed computing, but what they gave up to get there was data consistency, that is, the level of data consistency previously provided by relational databases.
And so the choice was between a modern scalable design and data consistency. And basically, everyone had convinced themselves that it was just an essential compromise that had to be made. We looked at the problem and said, you know, it seems like it should be possible to build a database that combines the properties of horizontal scalability and high fault tolerance, but still retains the strong data consistency guarantees provided by ACID transactions.
So FoundationDB emerged as a bit of a conceptual leap?
Lavezzo: Yeah, and it turns out most people thought we were crazy. And even we weren’t actually sure it was possible, because it was very much the conventional wisdom that it wasn’t. So when we started out, we thought it would work, but we really weren’t sure. It was probably about one year into the company that we were sure it was possible.
We spent two years just working on our own, just making a bet with ourselves that it would work. We saw how popular NoSQL was; it was growing despite the limitations around consistency. We thought that if we could make a database that had the best of both worlds, there would be people who were really interested in that.
Scherer: Because combining transactions and scalability is so hard, what we decided to do was build our product with a very simple data model without a lot of external input, whereas the development of data models on top of that can be much more collaborative and accessible.
Lavezzo: It’s not just us. When we’re developing these things we are just working with the APIs of our core data storage products. And so other people can do it too. And that’s our vision. The world of things that need to store data is way too big for one company to ever develop. But we can provide the foundation for an ecosystem of things that do that.
What would you say makes FoundationDB unique?
Lavezzo: The technological thing that makes us stand out among other NoSQL databases is that we are the only one which supports ACID transactions across the entire database, in a high performance manner, at least, in the sense that ACID transactions have been defined for the past 30 years.
It looks like Google has something similar internally that’s just for them, but it requires atomic clocks and crazy stuff like that, not exactly in reach of the average person or company. Because we can truly support these ACID transactions, it gives us the ability to expose not just one data model.
If you’re using MongoDB, you’re getting just one data model. With MySQL you’re getting SQL, with Riak you’re getting a key value store. With all these, you’re getting just one thing.
But because we have ACID transactions, we can efficiently map operations that are in a document storage layer into our core key value store.
We can also map SQL operations from our SQL layer into a key value store, and you can have three, four, or five different layers that are exposing different data models to your developers. With FoundationDB, they can all be running on the same cluster, and be doing operations between different data models perfectly consistently, and that’s something that no other database technology on the market can do, or is even claiming to be able to do.
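To make the layer idea concrete, here is a minimal sketch in Python of how a document layer and a row layer might both map onto one transactional key-value store. It is not FoundationDB's API: `ToyStore`, `Transaction`, `save_document`, `load_document` and `insert_row` are hypothetical names invented for illustration, the store lives in memory, and it only models all-or-nothing commits, not isolation between concurrent transactions or distribution across machines.

```python
class Transaction:
    """Hypothetical transaction: buffers writes until commit."""

    def __init__(self, store):
        self.store = store
        self.writes = {}  # key tuple -> value, not yet visible in the store

    def set(self, key, value):
        self.writes[key] = value

    def get_prefix(self, prefix):
        # Read committed data overlaid with this transaction's own writes,
        # returned in key order like a range read on an ordered store.
        merged = {**self.store.data, **self.writes}
        hits = [(k, v) for k, v in merged.items() if k[:len(prefix)] == prefix]
        return sorted(hits, key=lambda kv: kv[0])

    def commit(self):
        self.store.data.update(self.writes)


class ToyStore:
    """Hypothetical in-memory key-value store; keys are tuples of strings."""

    def __init__(self):
        self.data = {}

    def transact(self, fn):
        # Writes are applied only if fn returns without raising.
        txn = Transaction(self)
        result = fn(txn)
        txn.commit()
        return result


def flatten(doc, path=()):
    """Yield (path, value) pairs for every leaf of a nested dict."""
    for field, value in doc.items():
        if isinstance(value, dict):
            yield from flatten(value, path + (field,))
        else:
            yield path + (field,), value


def save_document(txn, collection, doc_id, doc):
    """Document layer: one key per leaf field."""
    for path, value in flatten(doc):
        txn.set(("doc", collection, doc_id) + path, value)


def load_document(txn, collection, doc_id):
    """Document layer: rebuild the nested dict from a prefix read."""
    prefix = ("doc", collection, doc_id)
    doc = {}
    for key, value in txn.get_prefix(prefix):
        *parents, leaf = key[len(prefix):]
        node = doc
        for field in parents:
            node = node.setdefault(field, {})
        node[leaf] = value
    return doc


def insert_row(txn, table, primary_key, row):
    """Row layer: one key per column of a table row."""
    for column, value in row.items():
        txn.set(("row", table, primary_key, column), value)


store = ToyStore()


def place_order(txn):
    # Two data models updated in a single transaction.
    save_document(txn, "users", "42", {"name": "Ada", "address": {"city": "Boston"}})
    insert_row(txn, "orders", "1001", {"user_id": "42", "total": 30})


store.transact(place_order)


def failing_order(txn):
    insert_row(txn, "orders", "1002", {"user_id": "42", "total": 99})
    raise RuntimeError("simulated failure before commit")


try:
    store.transact(failing_order)
except RuntimeError:
    pass  # nothing from failing_order reached the store
```

In this sketch the document and the row either both become visible or neither does, which is the property the layers rely on. A real layer would also need encoding of keys to bytes, deletes, and conflict detection, all of which are left out here.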
Scherer: There’s an increasing problem people have operationally where they are adopting a bunch of different databases which are convenient for different parts of their application. We’ve literally seen people with a dozen, and it makes for an operational nightmare because you have to keep all of those things running to keep your site running, and they are not stateless, so they are not naturally fault tolerant.
Lavezzo: And on top of that, the C-level technology executives at large enterprises in this situation also have to maintain expertise internally on all of these technologies. So it’s an operational nightmare, and an HR problem. Whereas with FoundationDB, the thing that is appealing is that it gives the developers what they want, which is the ability to model their data in different ways but keep the architecture sane, getting back to one reliable system. That’s our biggest differentiator, I think.
Going forward, what do you think the biggest challenges are that you face? Will you be looking to up performance on ACID transactions?
Lavezzo: Our biggest challenge is to continue to build out the community around FoundationDB, and to build out this layer ecosystem. Our story gets stronger the more high quality layers that are available to the market.
Scherer: Right now we’re selling a database that has the raw capability to do almost anything, but that doesn’t mean it does anything super easily out of the box. Our goal over the next period of time, at least engineering wise, will be to give it more out of the box engineering capabilities, and you know, to work with the community to grow an ecosystem on top of it, because there are more things you could want out of the box than any one company could ever build. And so we need to develop a whole community and ecosystem and industry building stateless things on top of our core.
At the recent NoSQL Now 2013 conference in San Jose, California, Max Schireson, the CEO of MongoDB said, "There's lots of work to do before NoSQL as a sector can win." What is your opinion on this?
Lavezzo: We think that the primary driver is ACID transactions. Enterprise data architectures are built primarily around Oracle right now, and the thing that they have above their competitors is the ability to provide ACID transactions at a really high scale. They do it differently than us. They do it by taking advantage of machines with 1000 cores and terabytes of memory, whereas we do it by using a bunch of cheap computers versus one big expensive one. So, we think the lack of ACID transactions has held the NoSQL market back from mass enterprise penetration. And, he [Max Schireson] is not going to say that, because there is no way that MongoDB can add ACID transactions into their system.
What would you say the differences between the open source and enterprise versions of FoundationDB are?
Lavezzo: To be clear, neither version is open source. We have just one software package, which obviously comes in different builds for different environments, but there's just one FoundationDB package that people download. No sign ups, just one package that is totally unaware of your current license, so you can download it under the community license we created, which allows people to use unlimited amounts of FoundationDB for non-production use. If you're putting it into production you can run it on up to six servers for free without contacting us, getting support, or anything like that. That lets medium and small sized businesses run a meaningful sized cluster that gives them fault tolerance, scalability, all those good things. And then, if they succeed in becoming the next Pinterest or whatever and they need to go beyond six nodes, at that point they are supposed to get in touch with us and get support contracts and licenses.
Scherer: And we have customers who are not using more than six properties but chose to pay for the enterprise version, and the difference there is just support. It's that they know they can call us in the middle of the night if something is going wrong.
Lavezzo: But there are no additional features, to be clear.
How have you kept your prices low?
Lavezzo: We're going for higher adoption. We really think that someone is going to win this. SQL databases won in the eighties, and were the standard for twenty to thirty years. We think that there's going to be a database technology and a data storage technology that's going to become the standard for modern distributed systems. And, if it's going to do that, it has to be accessible to not just large financial institutions, or ridiculous companies like that; it needs to be accessible to everybody. And so, we think that there's tons of money to be made at a relatively low price point if we can succeed in really changing the market.
Scherer: Solving this problem is really, really hard, and we don't want us, or anyone else to do it again. So we really feel like our mission is to solve this. Every big web company has had to do something to solve the scaling problems, and I feel like way too many smart people have invested way too much time and energy 75 or 80 percent solving this problem again and again and again. Our mission is really to solve it 100 percent once and for all, so that all the people can do something more interesting and more important. And we really want it to be something everyone else can use.
How does the deploy anywhere model work?
Lavezzo: Well, you may have seen we have the image of the Death Star on our site. We're not quite on there yet, though I do want a Death Star. But really, we designed the software to be run on a wide range of commodity-level hardware and we don't require anything exotic.
Scherer: If we'd used atomic clocks or whatever, there would be people out there who need this problem solved badly enough that they would buy atomic clocks, but you're not going to get atomic clocks in your cloud computers or your laptop. So, we've chosen a distributed design that can use lots of little computers instead of needing one monster one, and we haven't required anything particularly exotic in the design of the system. And we've also gone to some effort to make it easy to deploy anywhere. One quick CloudFormation launch can get you something running. You just fill out a little tiny form and you have something running on Amazon systems with n computers all wired together. We can run on cheap computers, but we've also designed FoundationDB to take advantage of large multi-core processor computers, and the cool technology we've embraced.
What do you think the future holds for FoundationDB?
Lavezzo: Well, I think our hopes are that we become adopted more and more and grow a community and eventually serve as the foundation for modern distributed systems, or one of the few solutions out there that provides distributed transactions and multiple data models. That's in ten years' time. This year, we're looking to release our SQL layer soon, and some other really exciting layers shortly thereafter, and we think that's going to really help us to be able to plug directly into existing communities that are facing the pain of having systems that don't scale.
Once we have the SQL layer available, we can jump in and be like, "Hey Drupal guys, you're having problems; here, this fixes them!" or "WordPress, it doesn't scale; now it does!". There are so many ways we can gain traction without having to build our own communities from scratch, and we can kind of jump into communities that already exist, already have these pain points, and sort of solve them.