Big data database Apache Rya becomes a Top Level Project
The Apache Software Foundation recently announced that Apache Rya has moved on up to become a Top Level project. What is Apache Rya? This scalable RDF big data management system is used by a number of organizations, including U.S. Department of Defense agencies for advanced tactical communications. Find out more, including how you can contribute and join the community.
The Apache Software Foundation, the world’s largest open source foundation, oversees more than 350 open source projects. While many of these projects are household names, such as Apache Maven, Apache Groovy, Apache Cassandra, and Apache CouchDB, it’s always worthwhile to browse through the lesser-known names. Today, we are looking at Apache Rya, an open source big data database.
The ASF recently announced that Apache Rya has graduated to a Top Level project. To mark its maturity and welcome it aboard, let’s take a quick look at Apache Rya and see what it’s all about.
What is Apache Rya? According to its repo description on GitHub it is “a scalable RDF Store that is built on top of a Columnar Index Store (such as Accumulo). It is implemented as an extension to RDF4J to provide easy query mechanisms (SPARQL, SERQL, etc) and Rdf data storage (RDF/XML, NTriples, etc). Rya stands for RDF y(and) Accumulo.”
The ASF description of Rya reads:
Rya (pronounced “ree-uh” /rēə/) is a cloud-based RDF triple store that supports SPARQL queries. Rya is a scalable RDF data management system built on top of Accumulo. Rya uses novel storage methods, indexing schemes, and query processing techniques that scale to billions of triples across multiple nodes. Rya provides fast and easy access to the data through SPARQL, a conventional query mechanism for RDF data.
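To make the terms in these descriptions concrete: an RDF triple is a (subject, predicate, object) statement, and a SPARQL query matches patterns over such triples, with variables standing in for unknown positions. The following is a minimal sketch of that idea in plain Python. It is not Rya's API or storage layout; the `ToyTripleStore` class, its `match` method, and the `ex:` names are hypothetical and exist only for illustration.

```python
# Toy illustration of the RDF triple model and pattern matching.
# NOTE: this is NOT Rya's API; every name here is hypothetical.
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class ToyTripleStore:
    """In-memory set of triples with a simple pattern-matching lookup."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Store one (subject, predicate, object) statement."""
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield matching triples; None plays the role of a SPARQL variable."""
        # Sorting keeps the output order deterministic for the example.
        for s, p, o in sorted(self._triples):
            if subject is not None and s != subject:
                continue
            if predicate is not None and p != predicate:
                continue
            if obj is not None and o != obj:
                continue
            yield (s, p, o)


store = ToyTripleStore()
store.add("ex:Rya", "ex:builtOn", "ex:Accumulo")
store.add("ex:Accumulo", "ex:uses", "ex:HDFS")
store.add("ex:Accumulo", "ex:uses", "ex:ZooKeeper")

# Roughly analogous to: SELECT ?o WHERE { ex:Accumulo ex:uses ?o }
dependencies = [
    o for _, _, o in store.match(subject="ex:Accumulo", predicate="ex:uses")
]
print(dependencies)  # ['ex:HDFS', 'ex:ZooKeeper']
```

A linear scan like this only works for a handful of statements. A system that aims to scale to billions of triples, as Rya does, depends on the storage and indexing schemes mentioned in the ASF description.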
Structurally, Apache Rya looks like an ASF nesting doll, since Accumulo is also an Apache project. Accumulo allows users to store and manage large data sets across a cluster; it uses Apache Hadoop’s HDFS for data storage and Apache ZooKeeper for consensus.
Apache Rya entered incubation back in September of 2015 and has come a long way since then, with several releases and new committers joining.
Read more about it in the following articles:
- SPARQL in the Cloud Using Rya. Roshan Punnoose, Adina Crainiceanu and David Rapp, Information Systems. Volume 48, March 2015, p. 181-195
- Rya: A Scalable RDF Triple Store for the Clouds. Roshan Punnoose, Adina Crainiceanu and David Rapp, 1st International Workshop on Cloud Intelligence, Cloud-I, 2012
Who’s using Rya?
According to ASF, some of the organizations using Rya include: Enlighten IT Consulting, Modus Operandi, Parsons Corporation, Semantic Arts, Semantic Web Company, Sierra Nevada Corporation, and U.S. Department of Defense agencies.
Rya is in use for semi-autonomous content production operations, autonomous small robots, drones, and advanced tactical communications through manned-unmanned teaming.
Top Level project
What does it mean to be a Top Level project? Top Level projects have graduated from the incubation stage.
According to the ASF, Top Level Projects have the following characteristics: “Projects with healthy communities and active development; and a supported listing by technologies and experimental listing of projects as well”.
Looking to join the community? See how you can get involved and help with volunteer efforts. Apache Rya is looking for users who can contribute code and javadocs, report bugs, submit patches, share use cases, give feedback, and make feature requests.
Download the latest version
You can download the open source artifact through Maven.