A look at NoSQL database Aerospike with CTO Brian Bulkowski
Aerospike claims to be “the fastest, most reliable database in the world.” We took a look under the hood with Brian Bulkowski, co-founder and CTO of the open-source NoSQL database.
JAXenter: For anyone who is still unfamiliar with the Aerospike database – what are its major characteristics?
Brian Bulkowski, Aerospike CTO: Aerospike was founded to push the limits of modern processors, storage technologies and distributed systems. Aerospike founders wanted to make scaling easy for all developers, not just the few who worked at companies with deep enough pockets to build their own.
As a result, they built the fastest, most reliable database in the world: the first database optimized for flash (patent pending), with a hybrid RAM/flash storage architecture; the first NoSQL database to combine transactions with hot analytics (with a patent for indexed map reduce); and the first in-memory NoSQL database with strong consistency (ACID compliant).
Databases are categorized in a number of different ways:
- SQL vs NoSQL: Aerospike is a NoSQL database, so you can change schema on the fly
- Operational vs Analytics: Aerospike is an operational database, a front edge database, a very fast key value store often used for caching, as a session or cookie or user profile store.
- Disk based vs In-Memory: Aerospike is an in-memory database that can run in pure RAM with rotational disks for persistence.
But Aerospike uniquely breaks out of all these traditional categories:
- It is a NoSQL database, the first with immediate consistency (ACID properties). Replication is synchronous and writes complete only after data is written to all replicas (link to VLDB paper).
- It is an operational database, the first to also run hot analytics, real-time queries and aggregations on operational data. We have the patent on indexed map reduce.
- It is an in-memory database, the first to be optimized for flash storage, and it can run in a hybrid mode, with indexes in RAM and data on flash SSDs.
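The synchronous replication rule mentioned above (a write completes only after it has been written to all replicas) can be illustrated with a minimal sketch. This is a toy in-process model, not Aerospike's actual protocol: the `Replica` class and the `replicated_write` function are hypothetical names invented for this example, and the rollback step is an assumption about how a failed write might be kept invisible.

```python
from dataclasses import dataclass, field

_MISSING = object()  # sentinel: the key did not exist before the write


@dataclass
class Replica:
    """Hypothetical in-process stand-in for one replica node."""
    name: str
    available: bool = True
    data: dict = field(default_factory=dict)

    def apply(self, key, value):
        if not self.available:
            raise ConnectionError(f"replica {self.name} is unreachable")
        self.data[key] = value


def replicated_write(replicas, key, value):
    """Return True only if every replica applied the write.

    If any replica fails, undo the write on the replicas that already
    applied it, so no replica keeps a value the caller was told had failed.
    """
    applied = []
    try:
        for replica in replicas:
            previous = replica.data.get(key, _MISSING)
            replica.apply(key, value)
            applied.append((replica, previous))
    except ConnectionError:
        for replica, previous in applied:
            if previous is _MISSING:
                replica.data.pop(key, None)
            else:
                replica.data[key] = previous
        return False
    return True


nodes = [Replica("a"), Replica("b")]
print(replicated_write(nodes, "user:1", "alice"))  # True
nodes[1].available = False
print(replicated_write(nodes, "user:1", "bob"))    # False
print(nodes[0].data["user:1"])                     # alice
```

The trade-off in this style of replication is that write latency is bounded by the slowest replica, in exchange for readers never seeing a value that exists on only some copies.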
Some major characteristics are:
- Aerospike is the fastest database on the planet with RAM-like speed, similar to Redis, often 10x faster than other databases. This high performance enables applications to process more data faster to make better decisions, better recommendations, better personalization, and better user engagement which all translate into higher profits.
- Aerospike gives you the ability to scale up and out with flash. This means that you can scale very simply, on just a handful of servers compared to other databases like Cassandra. In addition, Aerospike has a self-healing architecture, so fail-over, replication and data migration are automatic. Upgrades can happen live, and the system requires no downtime and no manual intervention except for changes to hardware.
- It also means that you can scale with the economics of flash – again, often with 10-15x lower costs.
Aerospike is making it possible for organizations to speed the development of a whole new category of real-time, data-driven applications and lucrative business models that were not previously possible. Today, both startups and enterprises require revolutionary database technology that delivers unlimited scale at blazing speeds, and Aerospike is delivering on that demand with unrivaled price performance ratios that exceed other options.
So you recently unveiled an update to the Aerospike database. What’s new?
Aerospike has unveiled new features and enhancements to its open-source database, solidifying the company’s lead in fueling the next generation of real-time, context-driven applications.
With the availability of powerful new clients, easier installation and deployment, storage and performance improvements, enterprise security enhancements, Hadoop integration and numerous community contributions including connectors for Spark, Aerospike continues to meet and exceed developers’ current and future needs.
Its unmatched speed, scale and simplicity make it possible for organizations of all sizes to innovate with new applications and rapidly grow new businesses that drive top-line improvements, while accruing massive bottom-line savings.
You made the claim that Aerospike is the fastest database there is…
Database benchmarks can be as confusing as the proverbial “lies, damned lies and statistics”. Having said that, the only way to compare performance is to validate it for yourself, and Aerospike customers have done just that. In addition to numerous testimonials, in the 2014 Magic Quadrant on Operational Databases, Gartner noted that “Aerospike’s reference customers supported its claims of high performance by awarding it the highest scores for performance of any vendor in this Magic Quadrant.” They also gave it the highest score for ease of doing business.
Aerospike is able to perform at these levels because it was built from the ground up for speed:
- Server code is written in C (not Java or Erlang) and precisely tuned to avoid context switching and memory copies. Execution is highly parallelized: multi-threaded, multi-core, multi-CPU and multi-SSD.
- Indexes are always stored in RAM. Pure RAM mode is backed by spinning disks. In hybrid mode, individual tables are stored in either RAM or flash.
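The hybrid layout described above (index in RAM, data on flash) can be sketched with a toy store. This is an illustration only and says nothing about Aerospike's real on-disk format: the `HybridStore` class is a hypothetical name, a Python dict stands in for the in-memory index, and an ordinary file stands in for the flash device.

```python
import os
import tempfile


class HybridStore:
    """Toy key-value store: index in a dict (RAM), values in an append-only file."""

    def __init__(self, path):
        self._index = {}                 # key -> (offset, length), kept in memory
        self._file = open(path, "w+b")   # stand-in for flash; truncates any old file

    def put(self, key, value: bytes):
        self._file.seek(0, os.SEEK_END)  # always append, never overwrite in place
        offset = self._file.tell()
        self._file.write(value)
        self._file.flush()
        self._index[key] = (offset, len(value))

    def get(self, key) -> bytes:
        offset, length = self._index[key]  # one in-memory lookup
        self._file.seek(offset)            # one positioned read from storage
        return self._file.read(length)

    def close(self):
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    store = HybridStore(os.path.join(tmp, "data.bin"))
    store.put("user:1", b"alice")
    store.put("user:1", b"alicia")  # newer copy is appended; the index is repointed
    latest = store.get("user:1")
    store.close()
print(latest)  # b'alicia'
```

Because the index never touches storage, a read costs one memory lookup plus one storage read. The sketch omits space reclamation for overwritten values and index rebuilding after a restart, both of which a real system would need.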
In the SQL world, the Transaction Processing Performance Council (TPC) publishes benchmarks comparing different databases. In NoSQL, the Yahoo! Cloud Serving Benchmark (YCSB) is emerging as the standard. The YCSB tooling is open source, so anyone can view the code, modify the code and replicate testing. Aerospike has created a YCSB plugin and documented hardware specifications and test methodology so that anyone can validate the numbers.
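To give a feel for what such a benchmark measures, here is a minimal sketch of a mixed read/write workload driver. It is not YCSB and does not use the Aerospike plugin: `run_workload` and its parameters are hypothetical, and a plain Python dict stands in for the database, so the absolute numbers it prints mean nothing beyond the machine it runs on.

```python
import random
import time


def run_workload(store, operations, read_fraction=0.5, key_space=1000, seed=42):
    """Run a seeded mix of reads and writes; report throughput and p99 latency."""
    if operations <= 0:
        raise ValueError("operations must be positive")
    rng = random.Random(seed)  # fixed seed so the key sequence is repeatable
    latencies = []
    reads = 0
    started = time.perf_counter()
    for _ in range(operations):
        key = f"user{rng.randrange(key_space)}"
        t0 = time.perf_counter()
        if rng.random() < read_fraction:
            store.get(key)
            reads += 1
        else:
            store[key] = "x" * 100  # fixed-size 100-byte payload
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - started
    latencies.sort()
    return {
        "operations": operations,
        "reads": reads,
        "ops_per_second": operations / elapsed,
        "p99_latency_seconds": latencies[int(0.99 * (len(latencies) - 1))],
    }


result = run_workload({}, operations=10_000)
print(result["ops_per_second"], result["p99_latency_seconds"])
```

Fixing the seed, the key space and the read/write mix is what makes a run repeatable by someone else, which is the point of publishing the methodology alongside the numbers.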
For key value use cases, numerous benchmarks show that Aerospike is faster. A recent Google post documents how Aerospike Hits 1 Million Writes Per Second with 6x fewer servers than Cassandra. A HighScalability post documents how Aerospike achieves 1 Million Reads per Second on a single server on AWS for $1.68 per hour.
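As a rough worked example, the single-server figure quoted above can be turned into a cost per billion reads. The calculation assumes the quoted rate is sustained for a full hour and that the hourly instance price is the only cost, neither of which is stated in the source.

```python
READS_PER_SECOND = 1_000_000   # figure quoted above
PRICE_PER_HOUR_USD = 1.68      # figure quoted above

# Assumption: the rate holds steady for the whole hour.
reads_per_hour = READS_PER_SECOND * 3600
cost_per_billion_reads = PRICE_PER_HOUR_USD / (reads_per_hour / 1e9)

print(reads_per_hour)                     # 3600000000
print(round(cost_per_billion_reads, 2))   # 0.47
```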
Could you tell us a bit about Aerospike’s decision to release the code to the open-source community? What motivated you to go open-source?
We founded Aerospike because we wanted to simplify scaling and empower all developers—not just the few who worked at companies with deep enough pockets to build their own technologies.
Now that Aerospike has been in production non-stop for close to four years in some of the world’s most data-intensive environments (like AppNexus, the largest independent ad-exchange) we are fulfilling our commitment to developers by open sourcing the technology and expanding our community.
Younger technologies are often favoured by younger companies. Is Aerospike also popular among tech startups?
Absolutely. Startups are by definition building new apps and to compete, they need to build better functionality using the latest technology that speeds time to market and scales with the bottom line in mind.
Most Aerospike customers are or were startups, although our first customer, AppNexus, is now a $1 billion company. In order to support startups, Aerospike has announced a startup special which gives free access to the Enterprise Edition of Aerospike with no limits on nodes, TPS or volume of data managed. To qualify, startups must have revenue of under $2 million and funding of under $20 million.
What NoSQL trends do you think will emerge in 2015?
Fast on the heels of Hortonworks, the first open source Hadoop company to go public, 2015 will see the first open source NoSQL company go public.
NoSQL will become the new normal. More developers will decide up front whether relational database options are good enough or whether they should start with NoSQL.
With growing developer sophistication, developers will select NoSQL databases based on use cases rather than simplistic distinctions like document store, key value store and column store.
As flash storage matures and more cloud vendors offer SSDs, more developers will look to take advantage of the price/performance of flash.