Chapter 1 Using Cassandra with Hector

CHAPTER 1 USING CASSANDRA WITH HECTOR

Hector is a Java client used to access Cassandra from a Java or Java EE application. Hector provides several features, which include the following:

It’s suitable for large-scale production systems.

It offers support for object-oriented and object-relational mapping (ORM).

It offers enhanced performance using connection pooling.

It supports round-robin load balancing and client failover.

It supports fault tolerance using replication of data to multiple nodes.

It offers elasticity using automatic discovery of hosts.

It supports automatic retry of downed hosts.

It is designed for Cassandra’s data model.

It is scalable and highly available.

It is durable, with no single points of failure.
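To make the round-robin load balancing and failover items above concrete, here is a minimal sketch in plain Java. It is a toy model, not Hector's implementation: the class name RoundRobinSketch and its methods are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of round-robin host selection with failover.
// All names are hypothetical; this is not how Hector is implemented.
class RoundRobinSketch {
    private final List<String> hosts;
    private final Set<String> down = new HashSet<String>();
    private int next = 0;

    RoundRobinSketch(List<String> hosts) {
        this.hosts = new ArrayList<String>(hosts);
    }

    // Mark a host as unavailable so that the rotation skips it.
    void markDown(String host) {
        down.add(host);
    }

    // Mark a host as available again, as a retry of a downed host would.
    void markUp(String host) {
        down.remove(host);
    }

    // Return the next live host in rotation, or null if every host is down.
    String nextHost() {
        for (int i = 0; i < hosts.size(); i++) {
            String candidate = hosts.get(next);
            next = (next + 1) % hosts.size();
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> hosts = new ArrayList<String>();
        hosts.add("node1:9160");
        hosts.add("node2:9160");
        hosts.add("node3:9160");
        RoundRobinSketch pool = new RoundRobinSketch(hosts);
        pool.markDown("node2:9160");
        for (int i = 0; i < 4; i++) {
            System.out.println(pool.nextHost());
        }
    }
}
```

Requests rotate over the hosts that are still up, and a host rejoins the rotation as soon as it is marked up again.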

This chapter discusses using the Hector Java client to access Cassandra in the Eclipse IDE. First, it discusses the Cassandra storage model.

CASSANDRA STORAGE MODEL

Cassandra is a NoSQL, highly available, distributed database based on a row/column structure. NoSQL implies that Cassandra is not a relational database system. Examples of relational database systems are MySQL server, Oracle database, and DB2 database. Relational databases store data in a table structure in rows and columns. A relational database is queried with Structured Query Language (SQL), while a NoSQL database such as Cassandra may be accessed using several different kinds of clients such as Java client, PHP client, and Ruby client, to name a few.

The top-level namespace in Cassandra is a keyspace. A keyspace is the equivalent of a database instance in a SQL relational database. An installation of Cassandra may have several keyspaces. The top-level data structure for data storage is a column family, which is a set of key-value pairs. A column family definition consists of columns, with one of the columns being the primary key column and the other columns being the data columns. A column is the smallest unit of data stored in Cassandra. It is associated with a name, a value and a timestamp.

One of the columns in a column family is the primary key, or row key. A primary key is identified with PRIMARY KEY in a column family definition. Some Cassandra APIs require the primary key column to be called KEY, which is the default name for the primary key column. Other Cassandra APIs do not have such a requirement. When an identifier other than KEY is used for the primary key column, a key alias for the primary key is set automatically. The only requirements to define a new column family are a column family name and a primary key and its associated type. The storage model used by Cassandra is shown in Figure 1.1.

Figure 1.1
Cassandra storage model.

As of Cassandra Query Language (CQL) 3, which is similar to SQL, a column family is also called a table. A key-value pair in a table is also called a record. Column values that have the same primary key comprise a row, which makes a column family a container of rows, as shown in Figure 1.2. A key-value pair in a column family is the primary key and the row of data (value) associated with a primary key.

Figure 1.2
Column family as a container of rows.
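The hierarchy described above (a column family containing rows, each row containing named columns with a value and a timestamp) can be pictured as nested maps. The following sketch uses only the JDK. The class StorageModelSketch and its methods are hypothetical names, and the model ignores everything Cassandra does for distribution and persistence.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of one column family: row key -> sorted columns.
// Every name here is hypothetical; none of this is a Cassandra or Hector API.
class StorageModelSketch {

    // A column is the smallest unit of data: a name, a value, and a timestamp.
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // The column family is a container of rows, each identified by its row key.
    private final Map<String, TreeMap<String, Column>> rows =
            new TreeMap<String, TreeMap<String, Column>>();

    // Store a column in the row with the given key, creating the row if needed.
    void put(String rowKey, String name, String value, long timestamp) {
        TreeMap<String, Column> row = rows.get(rowKey);
        if (row == null) {
            row = new TreeMap<String, Column>();
            rows.put(rowKey, row);
        }
        row.put(name, new Column(name, value, timestamp));
    }

    // Return a column's value, or null if the row or the column does not exist.
    String get(String rowKey, String name) {
        TreeMap<String, Column> row = rows.get(rowKey);
        Column column = (row == null) ? null : row.get(name);
        return (column == null) ? null : column.value;
    }

    // Number of columns stored in a row (0 for an unknown row key).
    int columnCount(String rowKey) {
        TreeMap<String, Column> row = rows.get(rowKey);
        return (row == null) ? 0 : row.size();
    }

    public static void main(String[] args) {
        StorageModelSketch catalog = new StorageModelSketch();
        long now = System.currentTimeMillis();
        catalog.put("catalog1", "journal", "Oracle Magazine", now);
        catalog.put("catalog1", "publisher", "Oracle Publishing", now);
        System.out.println(catalog.get("catalog1", "journal"));
        System.out.println(catalog.columnCount("catalog1"));
    }
}
```

A keyspace would be one more map level on top, from column family name to an instance of this class.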

The primary key must be associated with a data type. Each column may optionally be associated with a data type, which is used during the serialization and deserialization of data. The data types supported for the row KEY values and the data column values are called the CQL data types. A data type may also be associated with a column name, not just with the column values. The data types supported by CQL are discussed in Table 1.1.

Table 1.1 CQL Data Types

OVERVIEW OF HECTOR JAVA CLIENT

This section discusses the different packages and classes in the Hector Java client API. The entry points of the Hector API are defined in the me.prettyprint.hector.api package, which is illustrated in Figure 1.3.

Figure 1.3
Entry points of the Hector API.

The main interfaces in the me.prettyprint.hector.api package are discussed in Table 1.2.

Table 1.2 Main Interfaces in the me.prettyprint.hector.api Package

The serializers used to convert between bytes and different data types are defined in the me.prettyprint.cassandra.serializers package, which is illustrated in Figure 1.4.

Figure 1.4
Serializers.

The main classes in the me.prettyprint.cassandra.serializers package are discussed in Table 1.3.

Table 1.3 Main Classes in the me.prettyprint.cassandra.serializers Package
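The idea behind a serializer, converting a typed value to bytes and back, can be sketched with the JDK alone. The Serializer interface below is a simplified, hypothetical stand-in written for this example. It is not the Hector interface, and the encodings chosen (UTF-8 for strings, 8 big-endian bytes for longs) are assumptions of the sketch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Minimal sketch of the serializer idea using only the JDK.
// The Serializer interface here is hypothetical, not Hector's own type.
class SerializerSketch {

    interface Serializer<T> {
        byte[] toBytes(T value);
        T fromBytes(byte[] bytes);
    }

    // Strings are encoded as UTF-8 bytes.
    static final Serializer<String> STRING = new Serializer<String>() {
        public byte[] toBytes(String value) {
            return value.getBytes(StandardCharsets.UTF_8);
        }
        public String fromBytes(byte[] bytes) {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    };

    // Longs are encoded as 8 bytes in big-endian order (ByteBuffer's default).
    static final Serializer<Long> LONG = new Serializer<Long>() {
        public byte[] toBytes(Long value) {
            return ByteBuffer.allocate(8).putLong(value).array();
        }
        public Long fromBytes(byte[] bytes) {
            return ByteBuffer.wrap(bytes).getLong();
        }
    };

    public static void main(String[] args) {
        byte[] raw = STRING.toBytes("Oracle Magazine");
        System.out.println(raw.length + " bytes -> " + STRING.fromBytes(raw));
        System.out.println(LONG.fromBytes(LONG.toBytes(2013L)));
    }
}
```

Passing a serializer for the key, the column name, and the column value tells the client how to interpret each of the three byte sequences.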

The service interfaces and classes are defined in the me.prettyprint.cassandra.service package, which is illustrated in Figure 1.5.

Figure 1.5
Service interfaces.

The main classes in the me.prettyprint.cassandra.service package are discussed in Table 1.4.

Table 1.4 Main Classes in the me.prettyprint.cassandra.service Package

The bean interfaces used to encapsulate columns, column slices, and rows are specified in the me.prettyprint.hector.api.beans package, which is illustrated in Figure 1.6.

Figure 1.6
Bean interfaces.

The main interfaces in the me.prettyprint.hector.api.beans package are discussed in Table 1.5.

Table 1.5 Main Interfaces in the me.prettyprint.hector.api.beans Package

The data definition language operations supported by Hector are specified in the me.prettyprint.hector.api.ddl package, which is illustrated in Figure 1.7. The package is used for adding and removing new keyspaces and column families, and for defining indices.

Figure 1.7
DDL classes and interfaces.

The main interfaces and classes in the me.prettyprint.hector.api.ddl package are discussed in Table 1.6. DDL operations are performed serially. Concurrent DDL operations are not supported.

Table 1.6 Main Interfaces in the me.prettyprint.hector.api.ddl Package

The exceptions that a Hector client application could throw are specified in the me.prettyprint.hector.api.exceptions package, which is illustrated in Figure 1.8.

Figure 1.8
Exceptions.

The main exception classes are discussed in Table 1.7.

Table 1.7 Main Classes in the me.prettyprint.hector.api.exceptions Package

The me.prettyprint.hector.api.factory package, which is illustrated in Figure 1.9, contains only the HFactory class, which is a convenience class with static methods to create keyspaces, column definitions, mutators, columns, and queries, to name a few.

Figure 1.9
Factory Class.

The me.prettyprint.hector.api.mutation package contains classes for mutations (insertions, deletions, and such), and is illustrated in Figure 1.10.

Figure 1.10
Mutation Classes.

The me.prettyprint.hector.api.mutation package contains only two classes, which are discussed in Table 1.8.

Table 1.8 Classes in the me.prettyprint.hector.api.mutation Package

The different types of queries supported by Hector are defined in the me.prettyprint.hector.api.query package interfaces, as illustrated in Figure 1.11.

Figure 1.11
Queries.

The main interfaces in the me.prettyprint.hector.api.query package are discussed in Table 1.9.

Table 1.9 Main Interfaces in the me.prettyprint.hector.api.query Package

Some of the fields, such as keyspace, column family name, key serializer, and column family serializer, are used in every Hector client operation and have to be passed in for every operation separately. The me.prettyprint.cassandra.service.template package provides class and interface types to create templates for Hector operations—templates that may be used repeatedly without having to pass in the fields for each operation separately. The me.prettyprint.cassandra.service.template package class and interface types are illustrated in Figure 1.12.

Figure 1.12
Templates.

The class and interfaces in the me.prettyprint.cassandra.service.template package are discussed in Table 1.10.

Table 1.10 Class and Interfaces in the me.prettyprint.cassandra.service.template Package
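The motivation for templates, binding the shared fields once instead of passing them with every call, can be illustrated without Hector. In this sketch the store is a plain map, and TemplateSketch, Template, and templateFor are hypothetical names used only here.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the template idea: capture the fields shared by every operation once.
// All names are hypothetical; the "store" is just an in-memory map.
class TemplateSketch {
    // Stand-in for a data store: "columnFamily/rowKey/column" -> value.
    private final Map<String, String> store = new HashMap<String, String>();

    // Without a template, every call repeats the column family name.
    void insert(String columnFamily, String rowKey, String column, String value) {
        store.put(columnFamily + "/" + rowKey + "/" + column, value);
    }

    String read(String columnFamily, String rowKey, String column) {
        return store.get(columnFamily + "/" + rowKey + "/" + column);
    }

    // A template remembers the column family so that callers pass only what varies.
    final class Template {
        private final String columnFamily;

        Template(String columnFamily) {
            this.columnFamily = columnFamily;
        }

        void insert(String rowKey, String column, String value) {
            TemplateSketch.this.insert(columnFamily, rowKey, column, value);
        }

        String read(String rowKey, String column) {
            return TemplateSketch.this.read(columnFamily, rowKey, column);
        }
    }

    Template templateFor(String columnFamily) {
        return new Template(columnFamily);
    }

    public static void main(String[] args) {
        TemplateSketch db = new TemplateSketch();
        Template catalog = db.templateFor("catalog");
        catalog.insert("catalog1", "journal", "Oracle Magazine");
        System.out.println(catalog.read("catalog1", "journal"));
    }
}
```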

In the next section, you will set the environment to access Cassandra from the Hector Java client.

SETTING THE ENVIRONMENT

To set the environment, you must download the following software:

Apache Cassandra apache-cassandra-2.0.4-bin.tar.gz or a later version from http://cassandra.apache.org/download/.

Hector Java client hector-core-1.1-4.jar or a later version from http://repo2.maven.org/maven2/org/hectorclient/hector-core/1.1-4/.

Eclipse IDE for Java EE developers from https://eclipse.org/downloads/packages/eclipse-ide-java-ee-developers/kepler.

Apache Commons Lang 2.6 from http://commons.apache.org/proper/commons-lang/download_lang.cgi.

Java SE 6 or later, preferably Java SE 7 or Java SE 8. Java SE 7 is used in this chapter.

Then follow these steps:

1. Install the Eclipse IDE.

2. Extract the Apache Cassandra TAR file to a directory (for example, C:\Cassandra\apache-cassandra-2.0.4).

3. Add the bin folder, C:\Cassandra\apache-cassandra-2.0.4\bin, to the PATH environment variable.

4. Start Apache Cassandra server with the following command: cassandra -f

The Cassandra server starts and begins listening for CQL clients on localhost:9042. Cassandra also listens for Thrift clients on localhost:9160, as shown in Figure 1.13.

Figure 1.13
Starting the Cassandra server.

Source: Microsoft Corporation.

CREATING A JAVA PROJECT

In this section, you will develop a Java project in Eclipse to use the Hector Java client with Cassandra. Follow these steps:

1. Choose File > New > Other in the Eclipse IDE.

2. In the New window, select the Java Project wizard as shown in Figure 1.14. Then click Next.

Figure 1.14
Selecting the Java Project wizard.

Source: Eclipse Foundation.

3. In the Create a Java Project screen, specify a project name (Hector) and a directory location for the Java project and click Next. (See Figure 1.15.)

Figure 1.15
Configuring a new Java project.

Source: Eclipse Foundation.

4. In the Java Settings dialog box, select the default settings and click Finish, as shown in Figure 1.16. A Java project is created and is added to the Package Explorer, as shown in Figure 1.17.

Figure 1.16
Configuring Java settings.

Source: Eclipse Foundation.

Figure 1.17
The new Java project.

Source: Eclipse Foundation.

5. Add a Java client class to access Cassandra using Hector. To do so, again choose File > New > Other. This time, however, choose Java > Class in the New window. Then click Next. (See Figure 1.18.)

Figure 1.18
Selecting the Java Class wizard.

Source: Eclipse Foundation.

6. In the New Java Class wizard, select a source folder, specify a package (hector), enter a class name (HectorClient), and click Finish, as shown in Figure 1.19. A Java class HectorClient is created, as shown in the Package Explorer in Figure 1.20.

Figure 1.19
Configuring a new Java class.

Source: Eclipse Foundation.

Figure 1.20
The new Java class.

Source: Eclipse Foundation.

7. To be able to access Cassandra from the Java application using Hector, you need to add some JAR files to the Java build path of the application. To begin, right-click the Hector project node in the Package Explorer and select Properties.

8. In the Properties window, select the Java Build Path node. Then select Libraries and click Add External JARs to add external JAR files. Add the JAR files listed in Table 1.11.

Table 1.11 JAR Files

9. The external JAR files required for accessing Cassandra from a Hector Java client application are shown in the Eclipse IDE Properties wizard. Click OK after adding the required JAR files, as shown in Figure 1.21.

Figure 1.21
Adding JAR files to the Java build path.

Source: Eclipse Foundation.

CREATING A CASSANDRA `Cluster` OBJECT

The me.prettyprint.hector.api.Cluster interface defines a cluster of Cassandra hosts. To be able to access a Cassandra cluster, you must first create a Cluster instance for a Cassandra cluster. The HFactory class provides several static methods to get or create a Cluster instance, as listed in Table 1.12.

Table 1.12 HFactory Class Methods to Create or Get a Cluster

In the HectorClient class, create a Cluster instance using the getOrCreateCluster(String clusterName, String hostIp) method as follows:

Cluster cluster = HFactory.getOrCreateCluster("hector-cluster","localhost:9160");

Alternatively, you may create a Cluster instance as follows:

String clusterName = "hector-cluster";
String host = "localhost:9160";
Cluster cluster = HFactory.getOrCreateCluster(clusterName,
       new CassandraHostConfigurator(host));

You’ll add a method createSchema() to create a column family definition in the next section. You are not expected to build the HectorClient class from code snippets. Instead, copy the listing at the end of the discussion.

CREATING A SCHEMA

A schema consists of a column family definition and a keyspace definition. The HFactory class provides several static methods to create a column family definition, as listed in Table 1.13.

Table 1.13 HFactory Class Methods to Create a Column Family Definition

The HFactory class also provides the methods discussed in Table 1.14 to create a keyspace definition.

Table 1.14 HFactory Class Methods to Create a Keyspace Definition

Add a method createSchema() to create a column family definition and a keyspace definition for the schema. Then create a column family definition for a column family named "catalog", a keyspace named HectorKeyspace, and a comparator named ComparatorType.BYTESTYPE:

ColumnFamilyDefinition cfDef = HFactory.createColumnFamilyDefinition
("HectorKeyspace", "catalog", ComparatorType.BYTESTYPE);

Using a replication factor of 1, create a KeyspaceDefinition instance from the preceding column family definition. The replication factor is the number of copies, or replicas, of each row of data stored across the nodes of the cluster. Specify the strategy class as org.apache.cassandra.locator.SimpleStrategy using the constant ThriftKsDef.DEF_STRATEGY_CLASS:

int replicationFactor = 1;
KeyspaceDefinition keyspace = HFactory.createKeyspaceDefinition("HectorKeyspace",
       ThriftKsDef.DEF_STRATEGY_CLASS, replicationFactor, Arrays.asList(cfDef));

Cassandra supports the strategy classes, which refer to the replica placement strategy class, discussed in Table 1.15.

Table 1.15 Strategy Classes
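How a replication factor and a simple placement strategy interact can be sketched as a walk around a token ring: the first replica goes to the node that owns the key's token, and the remaining replicas go to the nodes that follow it on the ring. The code below is a simplified model under stated assumptions (one token per node, tokens as plain long values, no racks or data centers). SimpleStrategySketch and replicasFor are hypothetical names, not Cassandra classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Simplified model of ring-based replica placement.
// Assumes one token per node; names are hypothetical, not Cassandra's code.
class SimpleStrategySketch {

    // ring maps each node's token to the node name, sorted by token.
    static List<String> replicasFor(long keyToken, TreeMap<Long, String> ring,
                                    int replicationFactor) {
        List<String> replicas = new ArrayList<String>();
        if (ring.isEmpty()) {
            return replicas;
        }
        // First replica: the node with the first token at or after the key's token,
        // wrapping around to the start of the ring if there is none.
        Long token = ring.ceilingKey(keyToken);
        if (token == null) {
            token = ring.firstKey();
        }
        // A cluster cannot hold more distinct replicas than it has nodes.
        int wanted = Math.min(replicationFactor, ring.size());
        while (replicas.size() < wanted) {
            replicas.add(ring.get(token));
            token = ring.higherKey(token);
            if (token == null) {
                token = ring.firstKey();
            }
        }
        return replicas;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ring = new TreeMap<Long, String>();
        ring.put(0L, "nodeA");
        ring.put(100L, "nodeB");
        ring.put(200L, "nodeC");
        System.out.println(replicasFor(150L, ring, 2));
    }
}
```

With a replication factor of 1, as used in this chapter, each row lives on exactly one node.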

Having created a keyspace definition, you need to add the keyspace definition to the Cluster instance. The Cluster interface provides the methods discussed in Table 1.16 to add a keyspace definition.

Table 1.16 Cluster Interface Methods

Add the keyspace definition to the Cluster instance. With the blockUntilComplete parameter set to true, the method blocks until schema agreement is received from the server:

cluster.addKeyspace(keyspace, true);

Adding a keyspace definition to a Cluster instance defines the keyspace on the server, but it does not give you a client-side Keyspace object through which to read and write data. In the next section, you will create that Keyspace object. Invoke the createSchema() method only if the keyspace is not already defined. The Cluster interface provides a method describeKeyspace(String) to find out whether a keyspace is already defined. If the method returns null, the keyspace is not defined.

KeyspaceDefinition keyspaceDef = cluster.describeKeyspace("HectorKeyspace");
if (keyspaceDef == null) {
        createSchema();
}

CREATING A KEYSPACE

Having added a keyspace definition, you need a Keyspace instance to operate on the keyspace. A keyspace is represented with the me.prettyprint.hector.api.Keyspace interface. The HFactory class provides static methods to create a Keyspace instance from a Cluster instance to which a keyspace definition has been added. Invoke the method createKeyspace(String keyspace, Cluster cluster) to create a Keyspace instance for the keyspace named HectorKeyspace:

private static void createKeyspace() {
keyspace = HFactory.createKeyspace("HectorKeyspace", cluster);
}

CREATING A TEMPLATE

Templates provide a reusable construct containing the fields common to all Hector client operations. Create an instance of ThriftColumnFamilyTemplate using a class constructor ThriftColumnFamilyTemplate(Keyspace keyspace, String columnFamily, Serializer<K> keySerializer, Serializer<N> topSerializer). Use the keyspace instance created in the preceding section and specify the column family name as "catalog".

ThriftColumnFamilyTemplate<String, String> template =
       new ThriftColumnFamilyTemplate<String, String>(keyspace, "catalog",
       StringSerializer.get(), StringSerializer.get());

Next, you will add table data to the column family "catalog" in the keyspace HectorKeyspace.

ADDING TABLE DATA

As discussed, the me.prettyprint.hector.api.mutation package provides the Mutator class to add data. First, you need to create an instance of Mutator using the static method createMutator(Keyspace keyspace, Serializer<K> keySerializer) in HFactory. Supply the keyspace instance previously created as well as a StringSerializer instance.

Mutator<String> mutator = HFactory.createMutator(keyspace,StringSerializer.get());

Column data may be added as a single column or a batch of columns. We will discuss each of these approaches in the next two sections.

ADDING A SINGLE COLUMN OF DATA IN A TABLE

First, you’ll learn how to add a single column of data. The Mutator class provides the method discussed in Table 1.17 to add a single column of data.

Table 1.17 Mutator Class Method

Add a column with the insert method using the row key "catalog3" and the column family name "catalog". Create the HColumn instance using the HFactory static method createStringColumn(String name, String value).

private static void addTableDataColumn() {
       Mutator<String> mutator = HFactory.createMutator(keyspace,
       StringSerializer.get());
       MutationResult result=mutator.insert("catalog3", "catalog",
       HFactory.createStringColumn("journal", "Oracle Magazine"));
       System.out.println(result);
}

Output the MutationResult returned by the insert method. The HFactory class also provides several overloaded createColumn methods that return an HColumn instance. To run the HectorClient class and invoke the addTableDataColumn() method, add an invocation of the method in the main method. To run the class, right-click the HectorClient Java file in Package Explorer and select Run As > Java Application, as shown in Figure 1.22.

Figure 1.22
Running the HectorClient.java application.

Source: Eclipse Foundation.

A single column is added, as shown by MutationResult. The output in Eclipse, shown in Figure 1.23, also has the column added, having been retrieved using a column query, which is discussed later in this chapter.

Figure 1.23
Single column added.

Source: Eclipse Foundation.

In the next section, you will add multiple columns.

ADDING MULTIPLE COLUMNS OF DATA IN A TABLE

The Mutator class provides the method discussed in Table 1.18 to add an HColumn instance and return the Mutator instance, which may be used again to add another HColumn instance. You can add a series of HColumn instances by invoking the Mutator instance sequentially.

Table 1.18 Mutator Class Method to Add a Series of Columns

Add a static method addTableData() to make multiple mutations using the same instance of Mutator. Add multiple columns to a Mutator instance using the addInsertion invocations in series.

Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
mutator.addInsertion("catalog1", "catalog", HFactory.createStringColumn("journal", "Oracle Magazine"))
       .addInsertion("catalog1", "catalog", HFactory.createStringColumn("publisher", "Oracle Publishing"))
       .addInsertion("catalog1", "catalog", HFactory.createStringColumn("edition", "November-December 2013"))
       .addInsertion("catalog1", "catalog", HFactory.createStringColumn("title", "Quintessential and Collaborative"))
       .addInsertion("catalog1", "catalog", HFactory.createStringColumn("author", "Tom Haunert"));

Instances of HColumn added using the same KEY constitute a row. The preceding example creates a row of data with KEY "catalog1" in the "catalog" column family. Another row with KEY "catalog2" could be added similarly.

mutator.addInsertion("catalog2", "catalog", HFactory.createStringColumn("journal", "Oracle Magazine"))
       .addInsertion("catalog2", "catalog", HFactory.createStringColumn("publisher", "Oracle Publishing"))
       .addInsertion("catalog2", "catalog", HFactory.createStringColumn("edition", "November-December 2013"))
       .addInsertion("catalog2", "catalog", HFactory.createStringColumn("title", "Engineering as a Service"))
       .addInsertion("catalog2", "catalog", HFactory.createStringColumn("author", "David A. Kelly"));

The mutations added to the Mutator instance are not sent to the Cassandra server yet. To send them, you invoke the execute() method. This runs the batch of mutations added to the Mutator instance.

mutator.execute();
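The queue-then-execute behavior described above can be modeled in plain Java. The sketch below is hypothetical: BatchMutatorSketch is not a Hector class, and the choice to clear the queue and return a count from execute() is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a batching mutator: insertions are queued locally and only
// applied when execute() is called. Names are hypothetical, not Hector's API.
class BatchMutatorSketch {
    // Data that has actually been applied: row key -> (column name -> value).
    private final Map<String, Map<String, String>> applied =
            new HashMap<String, Map<String, String>>();
    // Insertions queued but not yet applied, each as {rowKey, column, value}.
    private final List<String[]> pending = new ArrayList<String[]>();

    // Queue an insertion and return this object so that calls can be chained.
    BatchMutatorSketch addInsertion(String rowKey, String column, String value) {
        pending.add(new String[] {rowKey, column, value});
        return this;
    }

    // Apply every queued insertion, clear the queue, and return how many ran.
    int execute() {
        int count = pending.size();
        for (String[] mutation : pending) {
            Map<String, String> row = applied.get(mutation[0]);
            if (row == null) {
                row = new HashMap<String, String>();
                applied.put(mutation[0], row);
            }
            row.put(mutation[1], mutation[2]);
        }
        pending.clear();
        return count;
    }

    // Read back an applied value, or null if it has not been applied.
    String read(String rowKey, String column) {
        Map<String, String> row = applied.get(rowKey);
        return (row == null) ? null : row.get(column);
    }

    public static void main(String[] args) {
        BatchMutatorSketch mutator = new BatchMutatorSketch();
        mutator.addInsertion("catalog1", "journal", "Oracle Magazine")
               .addInsertion("catalog1", "publisher", "Oracle Publishing");
        System.out.println(mutator.read("catalog1", "journal")); // not applied yet
        System.out.println(mutator.execute());
        System.out.println(mutator.read("catalog1", "journal"));
    }
}
```

Returning the mutator from addInsertion is what makes the chained call style possible.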

Invoke the addTableData() method from the main method and run the HectorClient class to add data in a batch.

RETRIEVING TABLE DATA

In this section, you will retrieve the previously added table data. As discussed, the me.prettyprint.hector.api.query package provides several interfaces representing different types of queries. First, you will query a single column.

Querying Single Column

The ColumnQuery<K,N,V> interface represents a single standard column query. HFactory provides the methods discussed in Table 1.19 to query a single column.

Table 1.19 HFactory Methods to Query a Single Column

Create a ColumnQuery instance using the static method createStringColumnQuery(Keyspace keyspace):

ColumnQuery<String, String, String> columnQuery = HFactory.createStringColumnQuery(keyspace);

The ColumnQuery interface provides the methods discussed in Table 1.20 to set the fields of the query, each of which returns a ColumnQuery<K,N,V> instance.

Table 1.20 ColumnQuery Interface Methods

Set the column family name to "catalog", the primary key value to "catalog3", and the column name to "journal":

private static void retrieveTableDataColumnQuery() {
       columnQuery.setColumnFamily("catalog").setKey("catalog3").setName("journal");
}

The QueryResult<T> interface represents the return type from queries, with the type parameter T being the type of result. After setting the query attributes, invoke the execute() method to return a QueryResult<HColumn<String, String>> object.

QueryResult<HColumn<String, String>> result = columnQuery.execute();

Next, output the result value using the method get() in the QueryResult interface:

System.out.println(result.get());

Finally, invoke the retrieveTableDataColumnQuery() method from the main method to output the result of the query, as shown in Figure 1.24.

Figure 1.24
The result of the query.

Source: Eclipse Foundation.

Querying Multiple Columns

In this section, you will query multiple columns using an instance of ThriftColumnFamilyTemplate. This provides a reusable template with the common query attributes set to make repeated Hector queries. You created an instance of ThriftColumnFamilyTemplate in an earlier section. The ThriftColumnFamilyTemplate class provides several overloaded methods called queryColumns to query multiple columns in the same query, as discussed in Table 1.21.

Table 1.21 Overloaded queryColumns Methods to Query Multiple Columns

Each of the methods in Table 1.21 returns a ColumnFamilyResult instance. Add a retrieveTableData() method to query multiple columns. Using the template, query the columns in the row corresponding to the "catalog1" key.

ColumnFamilyResult<String, String> res = template.queryColumns("catalog1");

The ColumnFamilyResult interface provides several get methods to get the different types of results, as discussed in Table 1.22.

Table 1.22 ColumnFamilyResult Interface Methods

You can use the hasResults() method to find out whether a ColumnFamilyResult instance has a result. Output the String column values in the ColumnFamilyResult instance obtained from the preceding query.

if(res.hasResults()){
       String journal = res.getString("journal");
       String publisher = res.getString("publisher");
       String edition = res.getString("edition");
       String title = res.getString("title");
       String author = res.getString("author");
 
       System.out.println(journal);
       System.out.println(publisher);
       System.out.println(edition);
       System.out.println(title);
       System.out.println(author);
}

Similarly, query the columns corresponding with the row with the "catalog2" key and output the result. Invoke the retrieveTableData() method in the main method and run the HectorClient class to output the query result, as shown in Figure 1.25.

Figure 1.25
The query result for multiple columns.

Source: Eclipse Foundation.

Querying with a Slice Query

A slice query is a query of only a slice of columns—that is, columns that are either specified or in a certain range indicated. A set of columns is represented with the ColumnSlice<N,V> interface. A slice query is represented with the SliceQuery<K,N,V> interface.

The SliceQuery<K,N,V> interface provides the methods discussed in Table 1.23 to set the attributes of the query.

Table 1.23 SliceQuery Interface Methods

Add a retrieveTableDataSliceQuery() method to query using a slice query. The HFactory class provides the method discussed in Table 1.24 to create a SliceQuery instance.

Table 1.24 HFactory Class Method to Create a SliceQuery Instance

Using the Keyspace instance previously created, create a SliceQuery<String, String, String> instance using the createSliceQuery() method. Set the column family as "catalog" and set the row key as "catalog2". Use StringSerializer instances for the column name, key, and column value.

SliceQuery<String, String, String> query = HFactory.createSliceQuery(keyspace,
StringSerializer.get(),StringSerializer.get(), StringSerializer.get()).setKey
("catalog2").setColumnFamily("catalog");

The ColumnSliceIterator class is used to iterate over the columns in a SliceQuery instance and to retrieve the column values. The ColumnSliceIterator class provides the constructors discussed in Table 1.25.

Table 1.25 ColumnSliceIterator Class Constructors

Create a ColumnSliceIterator instance using a start for the column name of "\u0000", which is the smallest value of type char, and using a finish of "\uFFFF", the largest value of type char. Specify the SliceQuery instance and set the reversed parameter to false.

ColumnSliceIterator<String, String, String> iterator =
       new ColumnSliceIterator<String, String, String>(query, "\u0000", "\uFFFF", false);

Then iterate over the columns to get the column name and column value for each of the columns.

while (iterator.hasNext()) {
       HColumn<String, String> column = iterator.next();
       System.out.println(column.getName());
       System.out.println(column.getValue());
}
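A slice over a range of column names behaves much like a sub-map of a sorted map. The following JDK-only sketch shows why a start of "\u0000" and a finish of "\uFFFF" select every column whose name is an ordinary string. ColumnSliceSketch and slice are hypothetical names, and the sketch assumes column names sort as Java strings.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch of a column slice as a range over sorted column names.
// Names are hypothetical; column names are assumed to sort as Java strings.
class ColumnSliceSketch {

    // Return the columns whose names fall between start and finish, inclusive,
    // optionally in reversed order.
    static NavigableMap<String, String> slice(TreeMap<String, String> row,
                                              String start, String finish,
                                              boolean reversed) {
        NavigableMap<String, String> range = row.subMap(start, true, finish, true);
        return reversed ? range.descendingMap() : range;
    }

    public static void main(String[] args) {
        TreeMap<String, String> row = new TreeMap<String, String>();
        row.put("journal", "Oracle Magazine");
        row.put("publisher", "Oracle Publishing");
        row.put("edition", "November-December 2013");
        row.put("title", "Engineering as a Service");
        row.put("author", "David A. Kelly");

        // The full char range selects every column in the row.
        System.out.println(slice(row, "\u0000", "\uFFFF", false).keySet());
        // A narrower range selects only the names between "j" and "q".
        System.out.println(slice(row, "j", "q", false).keySet());
    }
}
```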

Invoke the retrieveTableDataSliceQuery() method from the main method to output the column names and column values, as shown in Eclipse in Figure 1.26, when the HectorClient application is run.

Figure 1.26
Query result for SliceQuery.

Source: Eclipse Foundation.

Querying with the `MultigetSliceQuery`

In the preceding section, you queried multiple columns from only a single row. In this section, you will query columns from multiple rows. The MultigetSliceQuery<K,N,V> interface is used for a query over multiple rows. The MultigetSliceQuery<K,N,V> interface provides the methods discussed in Table 1.26 to set and get query fields.

Table 1.26 MultigetSliceQuery Interface Methods

All the methods in Table 1.26 return a MultigetSliceQuery instance except the getColumnNames() method. First, however, you need to create an instance of MultigetSliceQuery. The HFactory class provides the method discussed in Table 1.27 to create an instance of MultigetSliceQuery.

Table 1.27 HFactory Class Method to Create a MultigetSliceQuery Instance

Add a retrieveTableDataMultigetSliceQuery() method to query using a multi-get query. Using the Keyspace instance created earlier and StringSerializer instances, create an instance of MultigetSliceQuery<String, String, String> using the HFactory method createMultigetSliceQuery.

MultigetSliceQuery<String, String, String> multigetSliceQuery =
HFactory.createMultigetSliceQuery(keyspace, StringSerializer.get(),
StringSerializer.get(), StringSerializer.get());

Next, set the column family as "catalog" and row keys as "catalog1", "catalog2", and "catalog3".

multigetSliceQuery.setColumnFamily("catalog");
multigetSliceQuery.setKeys("catalog1", "catalog2",
"catalog3");

Set the range of columns with the setRange method. Empty strings for start and finish imply that all the columns are to be queried. Set the number of columns to get to 5 and set the reversed boolean to false.

multigetSliceQuery.setRange("", "", false, 5);

Next, invoke the execute() method on the MultigetSliceQuery<String, String, String> instance to get the query result as a QueryResult<Rows<String, String, String>> instance.

QueryResult<Rows<String, String, String>> result = multigetSliceQuery.execute();

Get the result value using the get() method in the QueryResult interface. The type of the result is Rows<String, String, String>. Get each of the Row instances in Rows using the getByKey(K key) method. The Row<K,N,V> interface is a tuple consisting of a Key and a column slice.

System.out.println(result.get().getByKey("catalog1"));
System.out.println(result.get().getByKey("catalog2"));
System.out.println(result.get().getByKey("catalog3"));

Invoke the retrieveTableDataMultigetSliceQuery() method from the main method to output the result of the multigetSliceQuery instance, as shown in Figure 1.27.

Figure 1.27
Query result for the multigetSliceQuery instance.

Source: Eclipse Foundation.

In another run of the application, set the number of columns in the query to 3.

multigetSliceQuery.setRange("", "", false, 3);

As shown in Figure 1.28, only three of the columns are included in the query result.

Figure 1.28
Query result for multigetSliceQuery instance for three columns.

Source: Eclipse Foundation.
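The effect of the count and reversed arguments across several rows can be modeled with sorted maps. This sketch is a toy: MultigetSketch and multiget are hypothetical names, rows are in-memory maps, and returning an empty list for an unknown key is a choice made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeMap;

// Toy model of a multi-row query with a per-row column limit.
// Names are hypothetical; this does not call Cassandra or Hector.
class MultigetSketch {

    // For each requested key, return up to count column names in sorted order
    // (or in reverse sorted order when reversed is true).
    static Map<String, List<String>> multiget(Map<String, TreeMap<String, String>> rows,
                                              int count, boolean reversed,
                                              String... keys) {
        Map<String, List<String>> result = new LinkedHashMap<String, List<String>>();
        for (String key : keys) {
            List<String> names = new ArrayList<String>();
            TreeMap<String, String> row = rows.get(key);
            if (row != null) {
                NavigableSet<String> ordered =
                        reversed ? row.descendingKeySet() : row.navigableKeySet();
                for (String name : ordered) {
                    if (names.size() == count) {
                        break;
                    }
                    names.add(name);
                }
            }
            result.put(key, names);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, TreeMap<String, String>> rows =
                new LinkedHashMap<String, TreeMap<String, String>>();
        TreeMap<String, String> catalog1 = new TreeMap<String, String>();
        catalog1.put("journal", "Oracle Magazine");
        catalog1.put("publisher", "Oracle Publishing");
        catalog1.put("edition", "November-December 2013");
        catalog1.put("title", "Quintessential and Collaborative");
        catalog1.put("author", "Tom Haunert");
        rows.put("catalog1", catalog1);

        System.out.println(multiget(rows, 5, false, "catalog1"));
        System.out.println(multiget(rows, 3, false, "catalog1"));
    }
}
```

Lowering the count trims each row independently, so a row with fewer columns than the limit is returned whole.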

Querying with a Range Slices Query

The MultigetSliceQuery interface discussed in the preceding section explicitly sets the row keys for which columns are to be retrieved. Alternatively, you can use the RangeSlicesQuery<K,N,V> interface to set the row keys as a range instead of setting each key explicitly. For example, if row key values "catalog1", "catalog2", "catalog3", "catalog4", and "catalog5" are defined, you could set the range to start at "catalog1" and end at "catalog5" to include all the row key values in between. Note that this lexical reading of a key range holds only when the cluster uses an order-preserving partitioner; with a hash-based partitioner, rows are ordered by the token of the key rather than by the key itself. Some of the methods in the RangeSlicesQuery<K,N,V> interface are discussed in Table 1.28.

Table 1.28 RangeSlicesQuery Interface Methods

Add a retrieveTableDataRangeSlicesQuery() method to use the RangeSlicesQuery<K,N,V> interface. The HFactory class provides the method discussed in Table 1.29 to create a RangeSlicesQuery instance.

Table 1.29 HFactory Class Method to Create a RangeSlicesQuery Instance

Using StringSerializer instances, create a RangeSlicesQuery<String, String, String> instance using the HFactory method createRangeSlicesQuery.

RangeSlicesQuery<String, String, String> rangeSlicesQuery =HFactory.
createRangeSlicesQuery(keyspace, StringSerializer.get(),
StringSerializer.get(), StringSerializer.get());

Next, set the column family to "catalog" and set the range of keys to start at "catalog1" and end at "catalog3".

rangeSlicesQuery.setColumnFamily("catalog");
rangeSlicesQuery.setKeys("catalog1", "catalog3");

Set the range of columns to include all the columns as indicated by the empty strings for start and finish. Set the number of columns to get to 5.

rangeSlicesQuery.setRange("", "", false, 5);

Next, invoke the execute() method on the RangeSlicesQuery<String, String, String> instance to make the query. The result is returned as a QueryResult<OrderedRows<String, String, String>> instance.

QueryResult<OrderedRows<String, String, String>> result = rangeSlicesQuery.
execute();

Invoke the get() method on the QueryResult instance to get the OrderedRows result. Then invoke the getByKey method on the OrderedRows instance with each row key to get the corresponding Row.

System.out.println(result.get().getByKey("catalog1"));
System.out.println(result.get().getByKey("catalog2"));
System.out.println(result.get().getByKey("catalog3"));

Invoke the retrieveTableDataRangeSlicesQuery() method in the main method and run the HectorClient class to output the result. The result of the query as output in Eclipse is shown in Figure 1.29.

Figure 1.29
Query result for a RangeSlicesQuery instance.

Source: Eclipse Foundation.

UPDATING DATA

In this section, you will update row data added previously. The ColumnFamilyUpdater<K,N> class is used to update a row of data and provides the constructors discussed in Table 1.30.

Table 1.30 ColumnFamilyUpdater Class Constructors

Alternatively, a ColumnFamilyUpdater may be created using a ThriftColumnFamilyTemplate instance, which provides the methods discussed in Table 1.31 for creating a ColumnFamilyUpdater.

Table 1.31 ThriftColumnFamilyTemplate Methods to create a ColumnFamilyUpdater

Add an updateTableData() method to update a row of data. Create a ColumnFamilyUpdater<String, String> instance using the createUpdater(K key) method with the supplied key—for example, "catalog2".

ColumnFamilyUpdater<String, String> updater = template.createUpdater("catalog2");

The ColumnFamilyUpdater class provides several methods for setting an updated value for a column, some of which are listed in Table 1.32.

Table 1.32 ColumnFamilyUpdater Class Methods

Set the updated values for the columns in the row for key "catalog2".

updater.setString("journal", "Oracle-Magazine");
updater.setString("publisher", "Oracle-Publishing");
updater.setString("edition", "11/12 2013");
updater.setString("title", "Engineering as a Service");
updater.setString("author", "Kelly, David A.");

When the ColumnFamilyUpdater instance holds the updated values, invoke the update(ColumnFamilyUpdater<K, N> updater) method on the template to apply the update.

try {
    template.update(updater);
} catch (HectorException e) {
    // Report the failure instead of silently discarding it.
    e.printStackTrace();
}

Invoke the updateTableData() method from the main method and run the HectorClient application to update the row with key "catalog2". Then query row "catalog2" using the retrieveTableData() method to output the updated values, as shown in Figure 1.30.

Figure 1.30
Updated column values.

Source: Eclipse Foundation.
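As described in the storage model, each column carries a name, a value, and a timestamp, and an update is simply a new write to an existing column. The following stand-alone sketch models that idea; it does not use Hector. The class ColumnUpdateSketch and its methods are hypothetical, timestamps are passed in by hand, and the sketch assumes the common case in which the write with the newer timestamp replaces the older one (ties are ignored here). Columns that are not written again keep their previous values.

```java
import java.util.HashMap;
import java.util.Map;

public class ColumnUpdateSketch {

    // A column value together with the timestamp of the write that produced it.
    static final class Column {
        final String value;
        final long timestamp;

        Column(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Columns of a single row, keyed by column name.
    private final Map<String, Column> columns = new HashMap<>();

    // Writes a column; an existing column is replaced only by a newer write.
    void write(String name, String value, long timestamp) {
        Column current = columns.get(name);
        if (current == null || timestamp > current.timestamp) {
            columns.put(name, new Column(value, timestamp));
        }
    }

    // Returns the current value of a column, or null if it was never written.
    String read(String name) {
        Column column = columns.get(name);
        return column == null ? null : column.value;
    }

    public static void main(String[] args) {
        ColumnUpdateSketch row = new ColumnUpdateSketch();
        row.write("journal", "Oracle Magazine", 1L);
        row.write("title", "Engineering as a Service", 1L);
        // The update: a later write to the same column name.
        row.write("journal", "Oracle-Magazine", 2L);
        System.out.println(row.read("journal")); // Oracle-Magazine
        System.out.println(row.read("title"));   // Engineering as a Service
    }
}
```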

DELETING TABLE DATA

Next, you will delete data from the Cassandra database. As when adding columns to a row, you need to create a Mutator instance using a Keyspace instance and a StringSerializer.

Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());

As with adding data, you can delete data as a single column or delete multiple columns of data as a batch.

Deleting a Single Column

The Mutator interface provides the method discussed in Table 1.33 for deleting a column.

Table 1.33 Mutator Interface Method to Delete a Column

Add a deleteTableDataColumn() method to the HectorClient class. Then delete the "journal" column from the row with key "catalog3" in the "catalog" column family, using a StringSerializer for the column name.

mutator.delete("catalog3", "catalog", "journal", StringSerializer.get());

Invoke the deleteTableDataColumn() method in the main method and run the HectorClient application. The delete method returns a MutationResult instance. Invoke the retrieveTableDataMultigetSliceQuery() method after invoking the deleteTableDataColumn() method to output the modified row set. The row set output using the retrieveTableDataMultigetSliceQuery() method before a single column is deleted is shown in Figure 1.31.

Figure 1.31
Result of query before deleting a column.

Source: Eclipse Foundation.

Figure 1.32 shows the row set output using the retrieveTableDataMultigetSliceQuery() method after a single column is deleted. The journal column is not included in the "catalog3" row because the column has been deleted.

Figure 1.32
Result of query after deleting a column.

Source: Eclipse Foundation.

Deleting Multiple Columns

In this section, you will delete multiple columns from a row. The Mutator interface provides the overloaded addDeletion methods for deleting multiple columns from a row. Some of the overloaded addDeletion methods are listed in Table 1.34.

Table 1.34 Mutator Interface Methods for Deleting Multiple Columns

All the addDeletion methods return a Mutator instance, which can be used to invoke the addDeletion method again to link a batch of deletions. Add a deleteTableData() method to delete a batch of columns. Then create a Mutator instance from the HFactory class.

Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());

Invoke the addDeletion() method multiple times in sequence to add delete mutations for the "journal", "publisher", and "edition" columns from the "catalog2" row in the "catalog" column family. Adding delete mutations with the addDeletion() method does not delete the columns by itself. Invoke the execute() method to apply the deletions that have been added to the Mutator instance.

mutator.addDeletion("catalog2", "catalog", "journal", StringSerializer.get())
        .addDeletion("catalog2", "catalog", "publisher", StringSerializer.get())
        .addDeletion("catalog2", "catalog", "edition", StringSerializer.get())
        .execute();
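The two-step behavior described above (queue the deletions first, apply them on execute()) can be illustrated with a small stand-alone model. This is not Hector's implementation: the class BatchDeletionSketch and all of its methods are hypothetical, and an in-memory map stands in for the column family. It only shows why chained addDeletion calls change nothing until execute() is invoked.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchDeletionSketch {

    // Row key -> (column name -> value); stand-in for a column family.
    private final Map<String, Map<String, String>> rows = new HashMap<>();
    // Deletions that have been queued but not yet applied.
    private final List<String[]> pending = new ArrayList<>();

    void put(String key, String column, String value) {
        rows.computeIfAbsent(key, k -> new HashMap<>()).put(column, value);
    }

    // Queues a deletion and returns this object so that calls can be chained.
    BatchDeletionSketch addDeletion(String key, String column) {
        pending.add(new String[] {key, column});
        return this;
    }

    // Applies every queued deletion and returns how many columns were removed.
    int execute() {
        int removed = 0;
        for (String[] deletion : pending) {
            Map<String, String> row = rows.get(deletion[0]);
            if (row != null && row.remove(deletion[1]) != null) {
                removed++;
            }
        }
        pending.clear();
        return removed;
    }

    boolean hasColumn(String key, String column) {
        Map<String, String> row = rows.get(key);
        return row != null && row.containsKey(column);
    }

    public static void main(String[] args) {
        BatchDeletionSketch batch = new BatchDeletionSketch();
        batch.put("catalog2", "journal", "Oracle Magazine");
        batch.put("catalog2", "publisher", "Oracle Publishing");
        batch.put("catalog2", "title", "Engineering as a Service");

        batch.addDeletion("catalog2", "journal").addDeletion("catalog2", "publisher");
        System.out.println(batch.hasColumn("catalog2", "journal")); // true: only queued

        batch.execute();
        System.out.println(batch.hasColumn("catalog2", "journal")); // false: now applied
        System.out.println(batch.hasColumn("catalog2", "title"));   // true: never queued
    }
}
```

Returning the same object from addDeletion is what allows the calls to be chained into a single statement that ends with execute().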

Invoke the deleteTableData() method in the main method and run the HectorClient application to delete the columns added using the addDeletion() method. If the retrieveTableData() method is invoked before the batch deletions are applied, the query result shown in Figure 1.33 is output.

Figure 1.33
Result of query before batch deletions.

Source: Eclipse Foundation.

If the retrieveTableData() method is invoked after the batch deletions are applied, the query result shown in Figure 1.34 is output. The "journal", "publisher", and "edition" columns are shown as deleted.

Figure 1.34
Result of query after batch deletions.

Source: Eclipse Foundation.

THE HectorClient CLASS

The HectorClient class appears in Listing 1.1.

Listing 1.1 The HectorClient Class

package hector;
 
import java.util.Arrays;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.ColumnSliceIterator;
import me.prettyprint.cassandra.service.ThriftKsDef;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.beans.OrderedRows;
import me.prettyprint.hector.api.beans.Rows;
import me.prettyprint.hector.api.ddl.ColumnFamilyDefinition;
import me.prettyprint.hector.api.ddl.ComparatorType;
import me.prettyprint.hector.api.ddl.KeyspaceDefinition;
import me.prettyprint.hector.api.exceptions.HectorException;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.MutationResult;
import me.prettyprint.hector.api.mutation.Mutator;
import me.prettyprint.hector.api.query.ColumnQuery;
import me.prettyprint.hector.api.query.MultigetSliceQuery;
import me.prettyprint.hector.api.query.Query;
import me.prettyprint.hector.api.query.QueryResult;
import me.prettyprint.hector.api.query.RangeSlicesQuery;
import me.prettyprint.hector.api.query.SliceQuery;
import me.prettyprint.cassandra.service.template.ColumnFamilyResult;
import me.prettyprint.cassandra.service.template.ColumnFamilyTemplate;
import me.prettyprint.cassandra.service.template.ColumnFamilyUpdater;
import me.prettyprint.cassandra.service.template.ThriftColumnFamilyTemplate;
public class HectorClient {
 
       private static Cluster cluster;
       private static Keyspace keyspace;
       private static ColumnFamilyTemplate<String, String> template;
 
       public static void main(String[] args) {
 
              cluster = HFactory.getOrCreateCluster("hector_cluster",
                           "localhost:9160");
 
              KeyspaceDefinition keyspaceDef = cluster
                           .describeKeyspace("HectorKeyspace");
              if (keyspaceDef == null) {
                    createSchema();
              }
              createKeyspace();
              createTemplate();
               addTableData();
              // addTableDataColumn();
              // deleteTableDataColumn();
              // addTableDataColumn();
              // retrieveTableDataColumnQuery();
              // updateTableData();
              // deleteTableDataColumn();
              // retrieveTableDataColumnQuery();
              // deleteTableData();
              // retrieveTableData();
              // retrieveTableDataSliceQuery();
               retrieveTableDataMultigetSliceQuery();
       }
 
       private static void createSchema() {
              int replicationFactor = 1;
              ColumnFamilyDefinition cfDef = HFactory.createColumnFamilyDefinition(
                           "HectorKeyspace", "catalog", ComparatorType.BYTESTYPE);
              KeyspaceDefinition keyspace = HFactory.createKeyspaceDefinition(
                           "HectorKeyspace", ThriftKsDef.DEF_STRATEGY_CLASS,
                           replicationFactor, Arrays.asList(cfDef));
              cluster.addKeyspace(keyspace, true);
       }

       private static void createKeyspace() {
              keyspace = HFactory.createKeyspace("HectorKeyspace", cluster);
       }

       private static void createTemplate() {
              template = new ThriftColumnFamilyTemplate<String, String>(keyspace,
                           "catalog", StringSerializer.get(), StringSerializer.get());
       }

       private static void addTableData() {
              Mutator<String> mutator = HFactory.createMutator(keyspace,
                           StringSerializer.get());
              // Queue the five columns of row "catalog1".
              mutator.addInsertion("catalog1", "catalog",
                           HFactory.createStringColumn("journal", "Oracle Magazine"))
                           .addInsertion("catalog1", "catalog",
                                         HFactory.createStringColumn("publisher", "Oracle Publishing"))
                           .addInsertion("catalog1", "catalog",
                                         HFactory.createStringColumn("edition", "November-December 2013"))
                           .addInsertion("catalog1", "catalog",
                                         HFactory.createStringColumn("title", "Quintessential and Collaborative"))
                           .addInsertion("catalog1", "catalog",
                                         HFactory.createStringColumn("author", "Tom Haunert"));
              // Queue the five columns of row "catalog2".
              mutator.addInsertion("catalog2", "catalog",
                           HFactory.createStringColumn("journal", "Oracle Magazine"))
                           .addInsertion("catalog2", "catalog",
                                         HFactory.createStringColumn("publisher", "Oracle Publishing"))
                           .addInsertion("catalog2", "catalog",
                                         HFactory.createStringColumn("edition", "November-December 2013"))
                           .addInsertion("catalog2", "catalog",
                                         HFactory.createStringColumn("title", "Engineering as a Service"))
                           .addInsertion("catalog2", "catalog",
                                         HFactory.createStringColumn("author", "David A. Kelly"));
              // Send all queued insertions as one batch.
              mutator.execute();
       }

       private static void retrieveTableData() {
              try {
                     ColumnFamilyResult<String, String> res = template
                                  .queryColumns("catalog1");
                     if (res.hasResults()) {
                            String journal = res.getString("journal");
                            String publisher = res.getString("publisher");
                            String edition = res.getString("edition");
                            String title = res.getString("title");
                            String author = res.getString("author");

                            System.out.println(journal);
                            System.out.println(publisher);
                            System.out.println(edition);
                            System.out.println(title);
                            System.out.println(author);
                     }
                     res = template.queryColumns("catalog2");
                     if (res.hasResults()) {
                            // Declared again: the variables above are scoped to the first block.
                            String journal = res.getString("journal");
                            String publisher = res.getString("publisher");
                            String edition = res.getString("edition");
                            String title = res.getString("title");
                            String author = res.getString("author");

                            System.out.println(journal);
                            System.out.println(publisher);
                            System.out.println(edition);
                            System.out.println(title);
                            System.out.println(author);
                     }
              } catch (HectorException e) {
                     // Report the failure instead of silently discarding it.
                     e.printStackTrace();
              }
       }

       private static void retrieveTableDataColumnQuery() {
              ColumnQuery<String, String, String> columnQuery = HFactory
                            .createStringColumnQuery(keyspace);
              columnQuery.setColumnFamily("catalog").setKey("catalog3")
                            .setName("journal");
              // columnQuery.setColumnFamily("catalog").setKey("catalog1").setName("journal");
              QueryResult<HColumn<String, String>> result = columnQuery.execute();
              System.out.println(result.get());
       }

       private static void retrieveTableDataSliceQuery() {
              SliceQuery<String, String, String> query = HFactory
                            .createSliceQuery(keyspace, StringSerializer.get(),
                                           StringSerializer.get(), StringSerializer.get())
                            .setKey("catalog2").setColumnFamily("catalog");
              // Iterate over all columns, from the lowest to the highest character value.
              ColumnSliceIterator<String, String, String> iterator =
                            new ColumnSliceIterator<String, String, String>(
                                           query, "\u0000", "\uFFFF", false);

              while (iterator.hasNext()) {
                     HColumn<String, String> column = iterator.next();
                     System.out.println(column.getName());
                     System.out.println(column.getValue());
              }
       }
 
       private static void addTableDataColumn() {
              Mutator<String> mutator = HFactory.createMutator(keyspace,
                            StringSerializer.get());
              MutationResult result = mutator.insert("catalog3", "catalog",
                            HFactory.createStringColumn("journal", "Oracle Magazine"));
              System.out.println(result);
       }
 
       private static void updateTableData() {
              ColumnFamilyUpdater<String, String> updater = template
                            .createUpdater("catalog2");
              updater.setString("journal", "Oracle-Magazine");
              updater.setString("publisher", "Oracle-Publishing");
              updater.setString("edition", "11/12 2013");
              updater.setString("title", "Engineering as a Service");
              updater.setString("author", "Kelly, David A.");

              try {
                     template.update(updater);
              } catch (HectorException e) {
                     // Report the failure instead of silently discarding it.
                     e.printStackTrace();
              }
       }
 
       private static void deleteTableDataColumn() {
              Mutator<String> mutator = HFactory.createMutator(keyspace,
                            StringSerializer.get());
              mutator.delete("catalog3", "catalog", "journal",
StringSerializer.get());
       }
 
       private static void deleteTableData() {
              Mutator<String> mutator = HFactory.createMutator(keyspace,
                           StringSerializer.get());
              mutator.addDeletion("catalog2", "catalog", "journal",
                           StringSerializer.get())
                           .addDeletion("catalog2", "catalog", "publisher",
                                          StringSerializer.get())
                           .addDeletion("catalog2", "catalog", "edition",
                                          StringSerializer.get()).execute();
       }

       private static void retrieveTableDataMultigetSliceQuery() {
              MultigetSliceQuery<String, String, String> multigetSliceQuery =
                            HFactory.createMultigetSliceQuery(keyspace, StringSerializer.get(),
                                           StringSerializer.get(), StringSerializer.get());
              multigetSliceQuery.setColumnFamily("catalog");
              multigetSliceQuery.setKeys("catalog1", "catalog2", "catalog3");
              // multigetSliceQuery.setRange("", "", false, 3);
              // multigetSliceQuery.setRange("", "", false, 2);
              multigetSliceQuery.setRange("", "", false, 5);
              QueryResult<Rows<String, String, String>> result =
                            multigetSliceQuery.execute();
              System.out.println(result.get().getByKey("catalog1"));
              System.out.println(result.get().getByKey("catalog2"));
              System.out.println(result.get().getByKey("catalog3"));
       }

       private static void retrieveTableDataRangeSlicesQuery() {
              RangeSlicesQuery<String, String, String> rangeSlicesQuery =
                            HFactory.createRangeSlicesQuery(keyspace, StringSerializer.get(),
                                           StringSerializer.get(), StringSerializer.get());
              rangeSlicesQuery.setColumnFamily("catalog");
              rangeSlicesQuery.setKeys("catalog1", "catalog3");
              // Column range as set in the "Querying with a Range Slices Query" section.
              rangeSlicesQuery.setRange("", "", false, 5);
              // rangeSlicesQuery.setRange("", "", false, 3);
              QueryResult<OrderedRows<String, String, String>> result =
                            rangeSlicesQuery.execute();
              System.out.println(result.get().getByKey("catalog1"));
              System.out.println(result.get().getByKey("catalog2"));
              System.out.println(result.get().getByKey("catalog3"));
       }
 
}

SUMMARY

This chapter discussed using the Hector Java client to access the Apache Cassandra database and perform create, read, update, and delete (CRUD) operations on the database data. The Hector client supports adding and deleting column data as single columns or as a batch of columns. It supports retrieving column data as single columns or as a column slice. Row data may be queried one row at a time or as multiple rows in the same query. This chapter also discussed the interfaces and classes involved in the CRUD operations. The next chapter will discuss the Cassandra Query Language (CQL) for querying Cassandra. You will use the Hector Java client to run the CQL queries.
