Java-based Presto announces its new foundation, bringing more stability for the Big Data SQL engine
Keep your eyes on this hat trick – Presto has just announced it is going solo! The new Presto Software Foundation shows that this big data SQL engine is ready for the main stage. Developers can take advantage of Presto’s high performance for interactive queries on data, wherever it lives. Magic!
Performing interactive queries on data doesn’t have to be like sawing a lady in half. Thanks to Presto, a distributed big data SQL engine, developers can interact with their data wherever it lives. And now, they can be assured that their SQL engine is community-run and community-driven, with the announcement of the Presto Software Foundation.
Presto started life at Facebook in 2012 as a data analysis tool for Apache Hadoop. In 2013, it went open source, giving users like Netflix, Airbnb, and Uber the ability to quickly query their data. Now, Presto has outgrown Facebook and is going solo with its own software foundation.
“From the beginning, we stressed the importance of code quality, architectural extensibility and open collaboration with the community,” said Martin Traverso, co-creator of Presto and co-founder of the Presto Software Foundation.
“With the rapid expansion of both the Presto user base and Presto developer community over the last several years, establishing a non-profit to institutionalize these values is the next logical step to ensure that this project stands the test of time.”
Presto is a Java-based query engine that is both highly parallel and scalable. It’s built from the ground up for efficient, low-latency analytics. Because the engine is distributed and connects to many data sources, developers can access data from multiple systems within a single query.
Over time, features like LDAP integration, Kerberos integration, spill to disk, a decimal data type, data encryption in transit (“secure internal communication”), and correlated subquery support have made Presto essential for developers working with big data. The latest improvement, a cost-based optimizer, has Presto compare alternative query plans by estimated cost before choosing one and executing it.
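To make the idea of a cost-based optimizer concrete, here is a toy sketch in Java. It is not Presto’s optimizer or its API: the table names, row counts, fixed join selectivity, and cost formula are all invented for illustration. It only shows the general technique of enumerating alternative join orders, estimating a cost for each, and keeping the cheapest.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JoinOrderSketch {
    // Made-up table sizes in rows; a real optimizer would read these from table statistics.
    static final Map<String, Long> ROWS = new LinkedHashMap<>();
    static {
        ROWS.put("orders", 1_000_000L);
        ROWS.put("customers", 10_000L);
        ROWS.put("regions", 10L);
    }

    // Assumption for this sketch: every join keeps 1 in 1000 of the row pairs it sees.
    static final long SELECTIVITY_DIVISOR = 1_000L;

    // Estimated cost of a left-deep plan: the total row count of all intermediate results.
    static long cost(List<String> order) {
        long rows = ROWS.get(order.get(0));
        long total = 0;
        for (int i = 1; i < order.size(); i++) {
            rows = rows * ROWS.get(order.get(i)) / SELECTIVITY_DIVISOR;
            total += rows;
        }
        return total;
    }

    // Collects every permutation of the remaining tables into out.
    static void permute(List<String> prefix, List<String> rest, List<List<String>> out) {
        if (rest.isEmpty()) {
            out.add(new ArrayList<>(prefix));
            return;
        }
        for (int i = 0; i < rest.size(); i++) {
            List<String> nextRest = new ArrayList<>(rest);
            String picked = nextRest.remove(i);
            prefix.add(picked);
            permute(prefix, nextRest, out);
            prefix.remove(prefix.size() - 1);
        }
    }

    // Enumerates all join orders and returns the one with the lowest estimated cost.
    static List<String> cheapestOrder() {
        List<List<String>> candidates = new ArrayList<>();
        permute(new ArrayList<>(), new ArrayList<>(ROWS.keySet()), candidates);
        List<String> best = candidates.get(0);
        for (List<String> candidate : candidates) {
            if (cost(candidate) < cost(best)) {
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> best = cheapestOrder();
        System.out.println(best + " estimated cost " + cost(best));
    }
}
```

With these invented numbers, the sketch prefers to join the two small tables first and bring in the large one last, which keeps the intermediate results small. A production optimizer works from real statistics and a far richer cost model, but the compare-then-choose step is the same in spirit.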
Although it was developed with Hadoop in mind, Presto works with and without it. It can natively query data from a number of sources, including Hadoop, S3, Cassandra, MySQL, and more. That way, developers don’t need to spend time laboriously copying data from one system to another, possibly introducing errors into the mix. Plus, it can also run on Kubernetes, if that’s the technology your organization goes for.
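As a rough sketch of what querying across systems looks like, the Java snippet below assembles a single SQL statement that joins a table in a Hive catalog with one in a MySQL catalog. Presto addresses tables as catalog.schema.table; the catalog names hive and mysql, along with every schema, table, and column name here, are hypothetical and depend on how a given cluster is configured. The snippet only builds and prints the SQL text. It does not connect to a server.

```java
class FederatedQuerySketch {
    // Builds a fully qualified table name in the catalog.schema.table form.
    static String qualified(String catalog, String schema, String table) {
        return catalog + "." + schema + "." + table;
    }

    // Builds one SQL statement that joins tables from two different catalogs.
    static String buildJoin() {
        String orders = qualified("hive", "web", "orders");        // hypothetical Hive table
        String customers = qualified("mysql", "crm", "customers"); // hypothetical MySQL table
        return "SELECT c.name, count(*) AS order_count\n"
             + "FROM " + orders + " o\n"
             + "JOIN " + customers + " c ON o.customer_id = c.id\n"
             + "GROUP BY c.name";
    }

    public static void main(String[] args) {
        System.out.println(buildJoin());
    }
}
```

The point of the example is that both tables appear in the same FROM clause, so no data has to be copied from one system into the other before it can be joined.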
Presto works well with IDEs, although the team specifically recommends IntelliJ IDEA, and the project comes with a sample configuration for out-of-the-box development. Presto also works well with AWS and is part of the Amazon EMR platform.
Trying out Presto
Want to give it a whirl? Head on over to GitHub for more information. Presto is a standard Maven project and has a few requirements, including:
- Mac OS X or Linux
- Java 8 Update 151 or higher (8u151+), 64-bit. Both Oracle JDK and OpenJDK are supported.
- Maven 3.3.9+ (for building)
- Python 2.4+ (for running with the launcher script)
If you’d like to join the Presto community or learn more about the Software Foundation, more information is available here.