Becoming acquainted with Cassandra

Tutorial: Getting started with Apache Cassandra

Scott Hendrickson

Apache Cassandra is a database that offers high availability and massive scalability. In this tutorial, Scott Hendrickson, Cloud Architect Team Lead at DataStax, shows you how to get started using Apache Cassandra. Find out how to create a service for a video recommendation application as a starting point for learning more about how Apache Cassandra works.

Applications today create lots of data, and if you want to get value from that data then you have to capture it in the right way. If your application has to serve thousands or millions of customers – or if you expect a high volume of writes from your devices – then you have to be able to scale up easily. If you are looking for a database with massive scalability and high availability, you need to consider Apache Cassandra™.

Firstly, Apache Cassandra has a fully distributed architecture, which makes scaling up straightforward over time: you simply add more nodes. Secondly, Cassandra can replicate your data across data centres and cloud platforms, so the cluster can stay available during maintenance or unforeseen outages. This also makes it possible to run across multiple cloud services at the same time and avoid being locked into a specific cloud service. This keeps you in control.

Lastly, choosing Apache Cassandra means you’ll be in great company. Cassandra is currently in use at CERN, Comcast, eBay, GitHub, GoDaddy, Hulu, Instagram, Intuit, Netflix, Reddit, The Weather Channel, and many more companies running active global datasets.

Getting Started with Cassandra

Getting Apache Cassandra up and running involves creating a cluster of Cassandra instances, or nodes. You can then connect to your cluster using any of the drivers for Apache Cassandra™, which come in different languages such as Java, Python, C++, C#, Node.js, Ruby, and PHP.

Below we’ll go through the steps to create a simple Java application using version 3.7.1 of the DataStax Java Driver for Apache Cassandra™. There are API changes for newer versions of the Java driver (4.0+). Please make sure you use the appropriate version for this example.

For this tutorial, we’ll create a service for a video recommendation application that stores video data and reads it back. This app – called KillrVideo – should be a useful starting point for learning more about how Cassandra works and how you can apply this in your own applications.

It has a three-tiered architecture that is common for cloud-scale applications, with a web application, a services tier and a database tier. The full application uses a microservice approach with multiple stateless services.
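As a rough illustration of that layering, the sketch below separates a stateless services tier from the database tier behind an interface. The names VideoStore, InMemoryVideoStore and VideoCatalogService are hypothetical and are not taken from KillrVideo; the in-memory map only stands in for a Cassandra-backed store so that the example runs without a cluster.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class TierSketch {

    /** Database tier contract (hypothetical name, for illustration only). */
    interface VideoStore {
        void save(String name, String description);
        Optional<String> findDescription(String name);
    }

    /** In-memory stand-in for a Cassandra-backed implementation. */
    static class InMemoryVideoStore implements VideoStore {
        private final Map<String, String> rows = new ConcurrentHashMap<>();

        @Override
        public void save(String name, String description) {
            rows.put(name, description);
        }

        @Override
        public Optional<String> findDescription(String name) {
            return Optional.ofNullable(rows.get(name));
        }
    }

    /** Services tier: keeps no data of its own, every call goes to the store. */
    static class VideoCatalogService {
        private final VideoStore store;

        VideoCatalogService(VideoStore store) {
            this.store = store;
        }

        void addVideo(String name, String description) {
            store.save(name, description);
        }

        String describe(String name) {
            return store.findDescription(name)
                    .map(description -> name + ": " + description)
                    .orElse(name + ": (not found)");
        }
    }

    public static void main(String[] args) {
        VideoCatalogService service = new VideoCatalogService(new InMemoryVideoStore());
        service.addVideo("Avengers: Endgame", "No spoilers");
        System.out.println(service.describe("Avengers: Endgame"));
    }
}
```

Because the service holds only a reference to the store, any number of service instances could sit in front of the same database tier, which is the property a stateless service relies on.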

Prerequisites:

  • Set your deployment to use public IPs for your nodes.
  • Download the driver from GitHub and add it to your CLASSPATH, or add the following dependency to your Maven POM file:

<dependencies>
  <dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>3.7.1</version>
  </dependency>
</dependencies>

1. Create a Cluster object

Cluster cluster = Cluster.builder().addContactPoint("40.83.177.33").build();

  • The Cluster object is the starting point for connecting to a Cassandra cluster. It is created through the Cluster.Builder helper class, which the static Cluster.builder() method returns.
  • Replace the IP address shown in the addContactPoint() method with the public IP of the node in your deployment.
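Rather than hard-coding the address, one option is to read the contact points from configuration. The sketch below parses a comma-separated list into the array form that the 3.x builder accepts through addContactPoints(String...). The environment variable name CASSANDRA_CONTACT_POINTS and the 127.0.0.1 fallback are assumptions made for this example, not driver conventions.

```java
import java.util.ArrayList;
import java.util.List;

public class ContactPoints {

    /** Splits a comma-separated host list, trimming blanks and skipping empty entries. */
    static String[] parse(String raw) {
        List<String> hosts = new ArrayList<>();
        if (raw != null) {
            for (String part : raw.split(",")) {
                String host = part.trim();
                if (!host.isEmpty()) {
                    hosts.add(host);
                }
            }
        }
        if (hosts.isEmpty()) {
            // Assumed fallback for local development only.
            hosts.add("127.0.0.1");
        }
        return hosts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Hypothetical variable name; use whatever your deployment provides.
        String[] hosts = parse(System.getenv("CASSANDRA_CONTACT_POINTS"));
        System.out.println("Contact points: " + String.join(", ", hosts));
        // With the 3.x driver on the classpath, the array could then be passed on:
        // Cluster cluster = Cluster.builder().addContactPoints(hosts).build();
    }
}
```

Listing more than one contact point means the driver can still find the cluster if the first node it tries is unreachable.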

2. Create a Session object

Session session = cluster.connect();

  • This is when the driver makes connections to the cluster nodes.

3. Execute statements using the Session object

session.execute("CREATE KEYSPACE IF NOT EXISTS killrvideo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};");
session.execute("CREATE TABLE IF NOT EXISTS killrvideo.videos (name TEXT, description TEXT, PRIMARY KEY(name))");

session.execute("INSERT INTO killrvideo.videos (name, description) VALUES (?, ?);", "Avengers: Endgame", "No spoilers");

ResultSet rs = session.execute("SELECT * FROM killrvideo.videos WHERE name = ?;", "Avengers: Endgame");

for (Row row : rs) {
    System.out.println("Name:" + row.getString("name"));
    System.out.println("Description: " + row.getString("description"));
}
  • The execute() method is used to run a CQL statement.
  • The first two statements create our data model, which consists of a keyspace, killrvideo, and a table, videos. Normally you’ll want to create the data model outside of the application, but we do it here to minimize the steps for this example.
  • The third statement inserts a row into the newly created table.
  • The last statement runs a query, which returns a ResultSet object.
  • The ResultSet is an Iterable object and can be processed using a for-loop to iterate through each of the rows in the query results.
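The keyspace in this example uses SimpleStrategy with a replication factor of 1, which suits a single-node test. For a cluster that spans several data centres, Cassandra provides NetworkTopologyStrategy, which takes a replication factor per data centre. The helper below is a minimal sketch that only builds the CQL string; the data centre names dc_east and dc_west and the factor of 3 are placeholders, and the names you use must match the ones your own cluster reports.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class KeyspaceCql {

    /** Builds a CREATE KEYSPACE statement with one replication factor per data centre. */
    static String createKeyspace(String keyspace, Map<String, Integer> replicasPerDataCentre) {
        StringJoiner options = new StringJoiner(", ", "{", "}");
        options.add("'class': 'NetworkTopologyStrategy'");
        for (Map.Entry<String, Integer> dc : replicasPerDataCentre.entrySet()) {
            options.add("'" + dc.getKey() + "': " + dc.getValue());
        }
        return "CREATE KEYSPACE IF NOT EXISTS " + keyspace
                + " WITH replication = " + options + ";";
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the data centres in insertion order.
        Map<String, Integer> replicas = new LinkedHashMap<>();
        replicas.put("dc_east", 3); // placeholder data centre name
        replicas.put("dc_west", 3); // placeholder data centre name
        System.out.println(createKeyspace("killrvideo", replicas));
    }
}
```

The resulting string could be passed to session.execute() in place of the SimpleStrategy statement used in this tutorial.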

The full code

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class Application {

   public static void main(String[] args) {
       Cluster cluster = Cluster.builder().addContactPoint("40.83.177.33").build();
       Session session = cluster.connect();

       session.execute("CREATE KEYSPACE IF NOT EXISTS killrvideo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};");
       session.execute("CREATE TABLE IF NOT EXISTS killrvideo.videos (name TEXT, description TEXT, PRIMARY KEY(name))");

       session.execute("INSERT INTO killrvideo.videos (name, description) VALUES (?, ?);", "Avengers: Endgame", "No spoilers");

       ResultSet rs = session.execute("SELECT * FROM killrvideo.videos WHERE name = ?;", "Avengers: Endgame");

       for (Row row : rs) {
           System.out.println("Name:" + row.getString("name"));
           System.out.println("Description: " + row.getString("description"));
       }

       cluster.close();
   }
}


Run the application

If the run succeeds, the program prints the row it inserted:

Name:Avengers: Endgame
Description: No spoilers

What’s next?

These are the first steps in building an app with Apache Cassandra™.

For questions about architecture, procedures, and best practices, refer to the Apache Cassandra documentation.

You can also see how the KillrVideo app is designed in its architecture guide: https://killrvideo.github.io/docs/guides/architecture/

Author

Scott Hendrickson

Scott is Cloud Architect Team Lead at DataStax, responsible for leading the build and development of cloud services based on Apache Cassandra for the company. He works with companies on how to use cloud database services effectively to meet their needs. Prior to DataStax, he was responsible for running enterprise software engineering at a national retail chain.



2 Comments on "Tutorial: Getting started with Apache Cassandra"

pratap
Guest

Which API is used for developing these Cassandra applications, and where can it be downloaded?

scott
Guest

You can find the driver at https://downloads.datastax.com/#enterprise-drivers and download the DataStax server at https://downloads.datastax.com/#enterprise