Hector is a Java client used to access Cassandra from a Java or Java EE application. Hector provides several features, which include the following:
It’s suitable for large-scale production systems.
It offers support for object-oriented and object-relational mapping (ORM).
It offers enhanced performance using connection pooling.
It supports round-robin load balancing and client failover.
It supports fault tolerance using replication of data to multiple nodes.
It offers elasticity using automatic discovery of hosts.
It supports automatic retry of downed hosts.
It is designed for Cassandra’s data model.
It is scalable and highly available.
It is durable, with no single points of failure.
This chapter discusses using the Hector Java client to access Cassandra in the Eclipse IDE. First, it discusses the Cassandra storage model.
Cassandra is a NoSQL, highly available, distributed database based on a row/column structure. NoSQL implies that Cassandra is not a relational database system. Examples of relational database systems are MySQL, Oracle Database, and DB2. Relational databases store data in tables made up of rows and columns. A relational database is queried with Structured Query Language (SQL), while a NoSQL database such as Cassandra may be accessed using several different kinds of clients, such as Java, PHP, and Ruby clients, to name a few.
The top-level namespace in Cassandra is a keyspace. A keyspace is the equivalent of a database instance in a SQL relational database. An installation of Cassandra may have several keyspaces. The top-level data structure for data storage is a column family, which is a set of key-value pairs. A column family definition consists of columns, with one of the columns being the primary key column and the other columns being the data columns. A column is the smallest unit of data stored in Cassandra. It is associated with a name, a value and a timestamp.
One of the columns in a column family is the primary key, or row key. A primary key is identified with PRIMARY KEY in a column family definition. Some Cassandra APIs require the primary key column to be called KEY, which is the default name for the primary key column. Other Cassandra APIs do not have such a requirement. When an identifier other than KEY is used for the primary key column, a key alias for the primary key is set automatically. The only requirements to define a new column family are a column family name and a primary key and its associated type. The storage model used by Cassandra is shown in Figure 1.1.
As of Cassandra Query Language (CQL) 3, which is similar to SQL, a column family is also called a table. A key-value pair in a table is also called a record. Column values that have the same primary key comprise a row, which makes a column family a container of rows, as shown in Figure 1.2. A key-value pair in a column family is the primary key and the row of data (value) associated with a primary key.
The primary key must be associated with a data type. Each column may optionally be associated with a data type, which is used during the serialization and de-serialization of data. The data types supported for the row KEY values and the data column values are called the CQL data types. A data type may also be associated with a column name, not just the column values. The data types supported by CQL are discussed in Table 1.1.
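The storage model described above can be sketched with plain Java collections. This is only an illustrative model, not Cassandra's or Hector's implementation: the class names (StorageModelSketch, Column, ColumnFamily, Keyspace) are hypothetical, and the rule that a newer timestamp replaces an older column is stated here as an assumption of the sketch.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only: these classes are hypothetical and are not part of Cassandra or Hector.
class StorageModelSketch {

    // A column is the smallest unit of data: a name, a value, and a timestamp.
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // A column family maps each row key to a row; a row maps column names to columns.
    static final class ColumnFamily {
        final Map<String, Map<String, Column>> rows = new TreeMap<String, Map<String, Column>>();

        void put(String rowKey, Column column) {
            Map<String, Column> row = rows.get(rowKey);
            if (row == null) {
                row = new TreeMap<String, Column>();
                rows.put(rowKey, row);
            }
            Column existing = row.get(column.name);
            // Assumption of this sketch: the column with the newer timestamp wins.
            if (existing == null || column.timestamp >= existing.timestamp) {
                row.put(column.name, column);
            }
        }

        String get(String rowKey, String columnName) {
            Map<String, Column> row = rows.get(rowKey);
            if (row == null) {
                return null;
            }
            Column column = row.get(columnName);
            return column == null ? null : column.value;
        }
    }

    // A keyspace is a named container of column families.
    static final class Keyspace {
        final Map<String, ColumnFamily> columnFamilies = new TreeMap<String, ColumnFamily>();

        ColumnFamily columnFamily(String name) {
            ColumnFamily cf = columnFamilies.get(name);
            if (cf == null) {
                cf = new ColumnFamily();
                columnFamilies.put(name, cf);
            }
            return cf;
        }
    }

    public static void main(String[] args) {
        Keyspace keyspace = new Keyspace();
        ColumnFamily catalog = keyspace.columnFamily("catalog");
        catalog.put("catalog1", new Column("journal", "Oracle Magazine", 1L));
        catalog.put("catalog1", new Column("publisher", "Oracle Publishing", 1L));
        System.out.println(catalog.get("catalog1", "journal"));
    }
}
```

Columns that share the row key "catalog1" end up in the same inner map, which is the sense in which a column family is a container of rows.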
This section discusses the different packages and classes in the Hector Java client API. The entry points of the Hector API are defined in the me.prettyprint.hector.api
package, which is illustrated in Figure 1.3.
The main interfaces in the me.prettyprint.hector.api
package are discussed in Table 1.2.
The serializers used to convert between bytes and different data types are defined in the me.prettyprint.cassandra.serializers
package, which is illustrated in Figure 1.4.
The main classes in the me.prettyprint.cassandra.serializers
package are discussed in Table 1.3.
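To make the role of a serializer concrete, the following is a minimal sketch of converting between typed values and bytes using only the Java standard library. The ByteSerializer interface and the STRING and LONG constants are hypothetical names introduced for this illustration; they are not Hector's serializer classes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative sketch: a hypothetical serializer interface, not Hector's API.
class SerializerSketch {

    interface ByteSerializer<T> {
        ByteBuffer toByteBuffer(T value);

        T fromByteBuffer(ByteBuffer bytes);
    }

    // Converts between String values and UTF-8 encoded bytes.
    static final ByteSerializer<String> STRING = new ByteSerializer<String>() {
        @Override
        public ByteBuffer toByteBuffer(String value) {
            return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public String fromByteBuffer(ByteBuffer bytes) {
            // duplicate() keeps the caller's buffer position unchanged while decoding.
            return StandardCharsets.UTF_8.decode(bytes.duplicate()).toString();
        }
    };

    // Converts between Long values and their 8-byte big-endian form.
    static final ByteSerializer<Long> LONG = new ByteSerializer<Long>() {
        @Override
        public ByteBuffer toByteBuffer(Long value) {
            ByteBuffer buffer = ByteBuffer.allocate(8);
            buffer.putLong(value);
            buffer.flip(); // make the written bytes readable from position 0
            return buffer;
        }

        @Override
        public Long fromByteBuffer(ByteBuffer bytes) {
            return bytes.duplicate().getLong();
        }
    };

    public static void main(String[] args) {
        ByteBuffer key = STRING.toByteBuffer("catalog1");
        System.out.println(key.remaining() + " bytes");
        System.out.println(STRING.fromByteBuffer(key));
        System.out.println(LONG.fromByteBuffer(LONG.toByteBuffer(42L)));
    }
}
```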
The service interfaces and classes are defined in the me.prettyprint.cassandra.service
package, which is illustrated in Figure 1.5.
The main classes in the me.prettyprint.cassandra.service
package are discussed in Table 1.4.
The bean interfaces used to encapsulate columns, column slices, and rows are specified in the me.prettyprint.hector.api.beans
package, which is illustrated in Figure 1.6.
The main interfaces in the me.prettyprint.hector.api.beans
package are discussed in Table 1.5.
The data definition language operations supported by Hector are specified in the me.prettyprint.hector.api.ddl
package, which is illustrated in Figure 1.7. The package is used for adding and removing keyspaces and column families, and for defining indices.
The main interfaces and classes in the me.prettyprint.hector.api.ddl
package are discussed in Table 1.6. DDL operations are performed serially. Concurrent DDL operations are not supported.
The exceptions that a Hector client application could throw are specified in the me.prettyprint.hector.api.exceptions
package, which is illustrated in Figure 1.8.
The main exception classes are discussed in Table 1.7.
The me.prettyprint.hector.api.factory
package, which is illustrated in Figure 1.9, contains only the HFactory
class, which is a convenience class with static methods to create keyspaces, column definitions, mutators, columns, and queries, to name a few.
The me.prettyprint.hector.api.mutation
package contains classes for mutations (insertions, deletions, and such), and is illustrated in Figure 1.10.
The me.prettyprint.hector.api.mutation
package contains only two classes, which are discussed in Table 1.8.
The different types of queries supported by Hector are defined in the me.prettyprint.hector.api.query
package interfaces, as illustrated in Figure 1.11.
The main interfaces in the me.prettyprint.hector.api.query
package are discussed in Table 1.9.
Some of the fields, such as keyspace, column family name, key serializer, and column family serializer, are used in every Hector client operation and have to be passed in for every operation separately. The me.prettyprint.cassandra.service.template
package provides class and interface types to create templates for Hector operations—templates that may be used repeatedly without having to pass in the fields for each operation separately. The me.prettyprint.cassandra.service.template
package class and interface types are illustrated in Figure 1.12.
The classes and interfaces in the me.prettyprint.cassandra.service.template
package are discussed in Table 1.10.
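The idea behind a template, capturing the fields that never change so that each operation passes only what varies, can be sketched without Hector at all. The Store and Template types below are hypothetical stand-ins written for this illustration and do not reproduce Hector's template classes.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: these types are hypothetical and do not reproduce Hector's template classes.
class TemplateSketch {

    // Stand-in for a cluster: values addressed by keyspace, column family, row key, and column name.
    static final class Store {
        private final Map<String, String> values = new HashMap<String, String>();

        void write(String keyspace, String columnFamily, String rowKey, String column, String value) {
            values.put(path(keyspace, columnFamily, rowKey, column), value);
        }

        String read(String keyspace, String columnFamily, String rowKey, String column) {
            return values.get(path(keyspace, columnFamily, rowKey, column));
        }

        private static String path(String... parts) {
            StringBuilder sb = new StringBuilder();
            for (String part : parts) {
                sb.append('/').append(part);
            }
            return sb.toString();
        }
    }

    // The template is configured once with the fields common to every operation.
    static final class Template {
        private final Store store;
        private final String keyspace;
        private final String columnFamily;

        Template(Store store, String keyspace, String columnFamily) {
            this.store = store;
            this.keyspace = keyspace;
            this.columnFamily = columnFamily;
        }

        void write(String rowKey, String column, String value) {
            store.write(keyspace, columnFamily, rowKey, column, value);
        }

        String read(String rowKey, String column) {
            return store.read(keyspace, columnFamily, rowKey, column);
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        Template catalog = new Template(store, "HectorKeyspace", "catalog");
        // Each call names only the row key and column; the keyspace and column family are implied.
        catalog.write("catalog1", "journal", "Oracle Magazine");
        System.out.println(catalog.read("catalog1", "journal"));
    }
}
```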
In the next section, you will set the environment to access Cassandra from the Hector Java client.
To set the environment, you must download the following software:
Apache Cassandra apache-cassandra-2.0.4-bin.tar.gz or a later version from http://cassandra.apache.org/download/.
Hector Java client hector-core-1.1-4.jar or a later version from http://repo2.maven.org/maven2/org/hectorclient/hector-core/1.1-4/.
Eclipse IDE for Java EE developers from https://eclipse.org/downloads/packages/eclipse-ide-java-ee-developers/kepler.
Apache Commons Lang 2.6 from http://commons.apache.org/proper/commons-lang/download_lang.cgi.
Java SE 6 or later, preferably Java SE 7 or Java SE 8. Java SE 7 is used in this chapter.
Then follow these steps:
1. Install the Eclipse IDE.
2. Extract the Apache Cassandra TAR file to a directory (for example, C:\Cassandra\apache-cassandra-2.0.4).
3. Add the bin folder, C:\Cassandra\apache-cassandra-2.0.4\bin, to the PATH environment variable.
4. Start Apache Cassandra server with the following command: cassandra -f
The Cassandra server starts and begins listening for CQL clients on localhost:9042
. Cassandra also listens for Thrift clients on localhost:9160
, as shown in Figure 1.13.
In this section, you will develop a Java project in Eclipse to use the Hector Java client with Cassandra. Follow these steps:
1. Choose File > New > Other in the Eclipse IDE.
2. In the New window, select the Java Project wizard as shown in Figure 1.14. Then click Next.
3. In the Create a Java Project screen, specify a project name (Hector) and a directory location for the Java project and click Next. (See Figure 1.15.)
4. In the Java Settings dialog box, select the default settings and click Finish, as shown in Figure 1.16. A Java project is created and is added to the Package Explorer, as shown in Figure 1.17.
5. Add a Java client class to access Cassandra using Hector. To do so, again choose File > New > Other. This time, however, choose Java > Class in the New window. Then click Next. (See Figure 1.18.)
6. In the New Java Class wizard, select a source folder, specify a package (hector
), enter a class name (HectorClient
), and click Finish, as shown in Figure 1.19. A Java class HectorClient
is created, as shown in the Package Explorer in Figure 1.20.
7. To be able to access Cassandra from the Java application using Hector, you need to add some JAR files to the Java build path of the application. To begin, right-click the Hector project node in the Package Explorer and select Properties.
8. In the Properties window, select the Java Build Path node. Then select Libraries and click Add External JARs to add external JAR files. Add the JAR files listed in Table 1.11.
9. The external JAR files required for accessing Cassandra from a Hector Java client application are shown in the Eclipse IDE Properties wizard. Click OK after adding the required JAR files, as shown in Figure 1.21.
Cluster OBJECT
The me.prettyprint.hector.api.Cluster
interface defines a cluster of Cassandra hosts. To be able to access a Cassandra cluster, you must first create a Cluster
instance for a Cassandra cluster. The HFactory
class provides several static
methods to get or create a Cluster
instance, as listed in Table 1.12.
In the HectorClient
class, create a Cluster
instance using the getOrCreateCluster (String clusterName, String hostIp)
method as follows:
Cluster cluster = HFactory.getOrCreateCluster("hector-cluster","localhost:9160");
Alternatively, you may create a Cluster
instance as follows:
String clusterName = "hector-cluster";
String host = "localhost:9160";
Cluster cluster = HFactory.getOrCreateCluster(clusterName, new CassandraHostConfigurator(host));
You’ll add a method createSchema()
to create a column family definition in the next section. You are not expected to build the HectorClient
class from code snippets. Instead, copy the listing at the end of the discussion.
A schema consists of a column family definition and a keyspace definition. The HFactory
class provides several static
methods to create a column family definition, as listed in Table 1.13.
The HFactory
class also provides the methods discussed in Table 1.14 to create a keyspace definition.
Add a method createSchema()
to create a column family definition and a keyspace definition for the schema. Then create a column family definition for a column family named "catalog"
, a keyspace named HectorKeyspace
, and a comparator named ComparatorType.BYTESTYPE
:
ColumnFamilyDefinition cfDef = HFactory.createColumnFamilyDefinition("HectorKeyspace", "catalog", ComparatorType.BYTESTYPE);
Using a replication factor of 1, create a KeyspaceDefinition
instance from the preceding column family definition. The replication factor is the number of copies, or replicas, of each row of data stored across the nodes of the cluster. Specify the strategy class as org.apache.cassandra.locator.SimpleStrategy
using the constant
ThriftKsDef.DEF_STRATEGY_CLASS
:
int replicationFactor = 1;
KeyspaceDefinition keyspace = HFactory.createKeyspaceDefinition("HectorKeyspace", ThriftKsDef.DEF_STRATEGY_CLASS, replicationFactor, Arrays.asList(cfDef));
Cassandra supports the strategy classes, which refer to the replica placement strategy class, discussed in Table 1.15.
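As a rough illustration of what a replication factor means for replica placement, the sketch below models a ring of nodes and picks consecutive nodes for the replicas, which is how SimpleStrategy is commonly described. The class and method names are hypothetical, the index of the first replica is passed in directly rather than derived from a partitioner, and capping the number of replicas at the ring size is a simplification of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of ring-based replica placement; not Cassandra's implementation.
class ReplicaPlacementSketch {

    // Returns the nodes holding a row: the first replica plus the next nodes clockwise on the ring.
    static List<String> replicasFor(List<String> ring, int firstReplicaIndex, int replicationFactor) {
        // Sketch choice: never return more replicas than there are nodes.
        int copies = Math.min(replicationFactor, ring.size());
        List<String> replicas = new ArrayList<String>();
        for (int i = 0; i < copies; i++) {
            replicas.add(ring.get((firstReplicaIndex + i) % ring.size()));
        }
        return replicas;
    }

    public static void main(String[] args) {
        List<String> ring = Arrays.asList("node1", "node2", "node3", "node4");
        // A replication factor of 1 stores a single copy of the row.
        System.out.println(replicasFor(ring, 3, 1)); // prints [node4]
        // A replication factor of 2 wraps around the ring to the next node.
        System.out.println(replicasFor(ring, 3, 2)); // prints [node4, node1]
    }
}
```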
Having created a keyspace definition, you need to add the keyspace definition to the Cluster
instance. The Cluster
interface provides the methods discussed in Table 1.16 to add a keyspace definition.
Add the keyspace definition to the Cluster
instance. With the blockUntilComplete
set to true
, the method blocks until schema agreement is received from the server:
cluster.addKeyspace(keyspace, true);
Adding a keyspace definition to a Cluster instance does not create the client-side Keyspace object that Hector operations use; you will create that object in the next section. Invoke the createSchema() method only if the KeyspaceDefinition is not already defined. The Cluster
interface provides a method describeKeyspace(String)
to find out whether a KeyspaceDefinition
is already defined. If the method returns null
, the KeyspaceDefinition
is not defined.
KeyspaceDefinition keyspaceDef = cluster.describeKeyspace("HectorKeyspace");
if (keyspaceDef == null) {
    createSchema();
}
Having added a keyspace definition, you need to create a keyspace. A keyspace is represented with the me.prettyprint.hector.api.Keyspace
interface. The HFactory
class provides static
methods to create a keyspace from a Cluster
instance to which a keyspace definition has been added. Invoke the method createKeyspace(String keyspace, Cluster cluster)
to create a keyspace with the name HectorKeyspace
:
private static void createKeyspace() {
    keyspace = HFactory.createKeyspace("HectorKeyspace", cluster);
}
Templates provide a reusable construct containing the fields common to all Hector client operations. Create an instance of ThriftColumnFamilyTemplate
using a class constructor ThriftColumnFamilyTemplate(Keyspace keyspace, String columnFamily, Serializer<K> keySerializer, Serializer<N> topSerializer)
. Use the keyspace
instance created in the preceding section and specify the column family name as "catalog"
.
ThriftColumnFamilyTemplate<String, String> template = new ThriftColumnFamilyTemplate<String, String>(keyspace, "catalog", StringSerializer.get(), StringSerializer.get());
Next, you will add table data to the column family "catalog"
in the keyspace HectorKeyspace
.
As discussed, the me.prettyprint.hector.api.mutation
package provides the Mutator
interface to add data. First, you need to create an instance of Mutator
using the static
method createMutator(Keyspace keyspace, Serializer<K> keySerializer)
in HFactory
. Supply the keyspace
instance previously created as well as a StringSerializer
instance.
Mutator<String> mutator = HFactory.createMutator(keyspace,StringSerializer.get());
Column data may be added as a single column or a batch of columns. We will discuss each of these approaches in the next two sections.
First, you’ll learn how to add a single column of data. The Mutator
interface provides the method discussed in Table 1.17 to add a single column of data.
Add a column with the insert
method using the row key "catalog3"
and the column family name "catalog"
. Create the HColumn
instance using the HFactory static
method createStringColumn(String name,String value)
.
private static void addTableDataColumn() {
    Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
    MutationResult result = mutator.insert("catalog3", "catalog",
            HFactory.createStringColumn("journal", "Oracle Magazine"));
    System.out.println(result);
}
Output the MutationResult
returned by the insert
method. The HFactory
class also provides several overloaded createColumn
methods that return an HColumn
instance. To run the HectorClient
class and invoke the addTableDataColumn()
method, add an invocation of the method in the main
method. To run the class, right-click the HectorClient Java file in Package Explorer and select Run As > Java Application, as shown in Figure 1.22.
A single column is added, as shown by MutationResult
. The output in Eclipse, shown in Figure 1.23, also has the column added, having been retrieved using a column query, which is discussed later in this chapter.
In the next section, you will add multiple columns.
The Mutator
interface provides the method discussed in Table 1.18 to add an HColumn
instance and return the Mutator
instance, which may be used again to add another HColumn
instance. You can add a series of HColumn
instances by invoking the Mutator
instance sequentially.
Add a static method addTableData()
to make multiple mutations using the same instance of Mutator
. Add multiple columns to a Mutator
instance using the addInsertion
invocations in series.
Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
mutator.addInsertion("catalog1", "catalog", HFactory.createStringColumn("journal", "Oracle Magazine"))
        .addInsertion("catalog1", "catalog", HFactory.createStringColumn("publisher", "Oracle Publishing"))
        .addInsertion("catalog1", "catalog", HFactory.createStringColumn("edition", "November-December 2013"))
        .addInsertion("catalog1", "catalog", HFactory.createStringColumn("title", "Quintessential and Collaborative"))
        .addInsertion("catalog1", "catalog", HFactory.createStringColumn("author", "Tom Haunert"));
Instances of HColumn
added using the same KEY
constitute a row. The preceding example creates a row of data with KEY "catalog1"
in the "catalog"
column family. Another row with KEY "catalog2"
could be added similarly.
mutator.addInsertion("catalog2", "catalog", HFactory.createStringColumn("journal", "Oracle Magazine"))
        .addInsertion("catalog2", "catalog", HFactory.createStringColumn("publisher", "Oracle Publishing"))
        .addInsertion("catalog2", "catalog", HFactory.createStringColumn("edition", "November-December 2013"))
        .addInsertion("catalog2", "catalog", HFactory.createStringColumn("title", "Engineering as a Service"))
        .addInsertion("catalog2", "catalog", HFactory.createStringColumn("author", "David A. Kelly"));
The mutations added to the Mutator
instance are not sent to the Cassandra server yet. To send them, you invoke the execute()
method. This runs the batch of mutations added to the Mutator
instance.
mutator.execute();
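The queue-then-execute behavior described above can be modeled in a few lines of plain Java. This is a sketch of the pattern only: BatchMutator is a hypothetical class, the in-memory map stands in for the Cassandra server, and the method names merely mirror the ones used in this section.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative model of batched mutations; not Hector's Mutator implementation.
class BatchMutatorSketch {

    static final class BatchMutator {
        // Stand-in for the server-side column family: row key -> (column name -> value).
        private final Map<String, Map<String, String>> store;
        // Mutations queued on the client and not yet applied.
        private final List<String[]> pending = new ArrayList<String[]>();

        BatchMutator(Map<String, Map<String, String>> store) {
            this.store = store;
        }

        // Queues an insertion and returns this instance so that calls can be chained.
        BatchMutator addInsertion(String rowKey, String columnName, String value) {
            pending.add(new String[] {rowKey, columnName, value});
            return this;
        }

        // Applies every queued insertion to the store, clears the queue, and returns the count applied.
        int execute() {
            int applied = 0;
            for (String[] mutation : pending) {
                Map<String, String> row = store.get(mutation[0]);
                if (row == null) {
                    row = new TreeMap<String, String>();
                    store.put(mutation[0], row);
                }
                row.put(mutation[1], mutation[2]);
                applied++;
            }
            pending.clear();
            return applied;
        }
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> store = new TreeMap<String, Map<String, String>>();
        BatchMutator mutator = new BatchMutator(store);
        mutator.addInsertion("catalog1", "journal", "Oracle Magazine")
                .addInsertion("catalog1", "publisher", "Oracle Publishing");
        System.out.println(store.isEmpty()); // prints true: nothing is stored until execute() runs
        System.out.println(mutator.execute()); // prints 2
        System.out.println(store);
    }
}
```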
Invoke the addTableData()
method from the main
method and run the HectorClient
class to add data in a batch.
In this section, you will retrieve the previously added table data. As discussed, the me.prettyprint.hector.api.query
package provides several interfaces representing different types of queries. First, you will query a single column.
The ColumnQuery<K,N,V>
interface represents a single standard column query. HFactory
provides the methods discussed in Table 1.19 to query a single column.
Create a ColumnQuery
instance using the static
method createStringColumnQuery (Keyspace keyspace)
:
ColumnQuery<String, String, String> columnQuery = HFactory.createStringColumnQuery(keyspace);
The ColumnQuery
interface provides the methods discussed in Table 1.20 to set the fields of the query, each of which returns a ColumnQuery<K,N,V>
instance.
Set the column family name to "catalog"
, the primary key value to "catalog3"
, and the column name to "journal"
:
private static void retrieveTableDataColumnQuery() {
    columnQuery.setColumnFamily("catalog").setKey("catalog3").setName("journal");
}
The QueryResult<T>
interface represents the return type from queries, with the type parameter T
being the type of result. After setting the query attributes, invoke the execute()
method to return a QueryResult<HColumn<String, String>>
object.
QueryResult<HColumn<String, String>> result = columnQuery.execute();
Next, output the result value using the method get()
in the QueryResult
interface:
System.out.println(result.get());
Finally, invoke the retrieveTableDataColumnQuery()
method from the main
method to output the result of the query, as shown in Figure 1.24.
In this section, you will query multiple columns using an instance of ThriftColumnFamilyTemplate
. This provides a reusable template with the common query attributes set to make repeated Hector queries. You created an instance of ThriftColumnFamilyTemplate
in an earlier section. The ThriftColumnFamilyTemplate
class provides several overloaded methods called queryColumns
to query multiple columns in the same query, as discussed in Table 1.21.
Each of the methods in Table 1.21 returns a ColumnFamilyResult
instance. Add a retrieveTableData()
method to query multiple columns. Using the template, query the columns in the row corresponding to the "catalog1"
key.
ColumnFamilyResult<String, String> res = template.queryColumns("catalog1");
The ColumnFamilyResult
interface provides several get
methods to get the different types of results, as discussed in Table 1.22.
You can use the hasResults()
method to find out whether a ColumnFamilyResult
instance has a result. Output the String
column values in the ColumnFamilyResult
instance obtained from the preceding query.
if (res.hasResults()) {
    String journal = res.getString("journal");
    String publisher = res.getString("publisher");
    String edition = res.getString("edition");
    String title = res.getString("title");
    String author = res.getString("author");
    System.out.println(journal);
    System.out.println(publisher);
    System.out.println(edition);
    System.out.println(title);
    System.out.println(author);
}
Similarly, query the columns corresponding with the row with the "catalog2"
key and output the result. Invoke the retrieveTableData()
method in the main
method and run the HectorClient
class to output the query result, as shown in Figure 1.25.
A slice query retrieves only a slice of a row's columns: either columns that are named explicitly or columns that fall within a specified range. A set of columns is represented with the ColumnSlice<N,V>
interface. A slice query is represented with the SliceQuery<K,N,V>
interface.
The SliceQuery<K,N,V>
interface provides the methods discussed in Table 1.23 to set the attributes of the query.
Add a retrieveTableDataSliceQuery()
method to query using a slice query. The HFactory
class provides the method discussed in Table 1.24 to create a SliceQuery
instance.
Using the Keyspace
instance previously created, create a SliceQuery<String, String, String>
instance using the createSliceQuery()
method. Set the column family as "catalog"
and set the row key as "catalog2"
. Use StringSerializer
instances for the column name, key, and column value.
SliceQuery<String, String, String> query = HFactory.createSliceQuery(keyspace, StringSerializer.get(), StringSerializer.get(), StringSerializer.get())
        .setKey("catalog2")
        .setColumnFamily("catalog");
The ColumnSliceIterator
class is used to iterate over the columns in a SliceQuery
instance and to retrieve the column values. The ColumnSliceIterator
class provides the constructors discussed in Table 1.25.
Create a ColumnSliceIterator instance using a start for the column name of "\u0000", which is the smallest value of type char, and using a finish of "\uFFFF", the largest value of type char. Specify the SliceQuery instance and set the reversed parameter to false.
ColumnSliceIterator<String, String, String> iterator = new ColumnSliceIterator<String, String, String>(query, "\u0000", "\uFFFF", false);
Then iterate over the columns to get the column name and column value for each of the columns.
while (iterator.hasNext()) { HColumn<String, String> column = iterator.next(); System.out.println(column.getName()); System.out.println(column.getValue()); }
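To see why a start at the smallest char value and a finish at the largest char value covers every column name, consider the following sketch, which models a row as a sorted map and a slice as an inclusive sub-range of it. The ColumnSliceSketch class is hypothetical, and it assumes column names are compared as Java strings, which for the ASCII names used in this chapter gives the same order as a byte-wise comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative model of a column slice over one row; not Hector's ColumnSliceIterator.
class ColumnSliceSketch {

    // Returns the names of the columns whose names fall between start and finish, inclusive.
    static List<String> slice(NavigableMap<String, String> row, String start, String finish) {
        return new ArrayList<String>(row.subMap(start, true, finish, true).keySet());
    }

    // Sample row modeled on the "catalog2" data added earlier in this chapter.
    static TreeMap<String, String> sampleRow() {
        TreeMap<String, String> row = new TreeMap<String, String>();
        row.put("journal", "Oracle Magazine");
        row.put("publisher", "Oracle Publishing");
        row.put("edition", "November-December 2013");
        row.put("title", "Engineering as a Service");
        row.put("author", "David A. Kelly");
        return row;
    }

    public static void main(String[] args) {
        TreeMap<String, String> row = sampleRow();
        // The widest possible range includes every column name.
        System.out.println(slice(row, "\u0000", "\uFFFF")); // prints [author, edition, journal, publisher, title]
        // A narrower range returns only the names between "e" and "p".
        System.out.println(slice(row, "e", "p")); // prints [edition, journal]
    }
}
```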
Invoke the retrieveTableDataSliceQuery()
method from the main
method to output the column names and column values, as shown in Eclipse in Figure 1.26, when the HectorClient application is run.
MultigetSliceQuery
In the preceding section, you queried multiple columns from only a single row. In this section, you will query columns from multiple rows. The MultigetSliceQuery<K,N,V>
interface is used for a query over multiple rows. The MultigetSliceQuery<K,N,V>
interface provides the methods discussed in Table 1.26 to set and get query fields.
All the methods in Table 1.26 return a MultigetSliceQuery
instance except the getColumnNames()
method. First, however, you need to create an instance of MultigetSliceQuery
. The HFactory
class provides the method discussed in Table 1.27 to create an instance of MultigetSliceQuery
.
Add a retrieveTableDataMultigetSliceQuery()
method to query using a multiget query. Using the Keyspace
instance created earlier and StringSerializer
instances, create an instance of MultigetSliceQuery<String, String, String>
using the HFactory
method createMultigetSliceQuery
.
MultigetSliceQuery<String, String, String> multigetSliceQuery = HFactory.createMultigetSliceQuery(keyspace, StringSerializer.get(), StringSerializer.get(), StringSerializer.get());
Next, set the column family as "catalog"
and row keys as "catalog1"
, "catalog2"
, and "catalog3"
.
multigetSliceQuery.setColumnFamily("catalog");
multigetSliceQuery.setKeys("catalog1", "catalog2", "catalog3");
Set the range of columns with the setRange
method. Empty strings for start
and finish
imply that all the columns are to be queried. Set the number of columns to get to 5
and set the reversed boolean
to false
.
multigetSliceQuery.setRange("", "", false, 5);
Next, invoke the execute()
method on the MultigetSliceQuery<String, String, String>
instance to get the query result as a QueryResult<Rows<String, String, String>>
instance.
QueryResult<Rows<String, String, String>> result = multigetSliceQuery.execute();
Get the result value using the get()
method in the QueryResult
interface. The type of the result is Rows<String, String, String>
. Get each of the Row
instances in Rows
using the getByKey(K key)
method. The Row<K,N,V>
interface is a tuple consisting of a Key
and a column slice.
System.out.println(result.get().getByKey("catalog1"));
System.out.println(result.get().getByKey("catalog2"));
System.out.println(result.get().getByKey("catalog3"));
Invoke the retrieveTableDataMultigetSliceQuery()
method from the main
method to output the result of the multigetSliceQuery
instance, as shown in Figure 1.27.
In another run of the application, set the number of columns in the query to 3.
multigetSliceQuery.setRange("", "", false, 3);
As shown in Figure 1.28, only three of the columns are included in the query result.
Figure 1.28 Query result for the multigetSliceQuery instance for three columns. Source: Eclipse Foundation.
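The effect of the column count on a query over several rows can be sketched as follows. This is a plain-Java model, not Hector's MultigetSliceQuery: the class and method names are hypothetical, columns are assumed to be returned in name order, and returning an empty list for a key with no row is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative model of a multi-row query with a column count limit.
class MultigetCountSketch {

    // For each requested key, returns at most count column names in name order.
    static Map<String, List<String>> multiget(Map<String, TreeMap<String, String>> columnFamily,
                                              int count, String... keys) {
        Map<String, List<String>> result = new LinkedHashMap<String, List<String>>();
        for (String key : keys) {
            List<String> names = new ArrayList<String>();
            TreeMap<String, String> row = columnFamily.get(key);
            if (row != null) {
                for (String name : row.keySet()) {
                    if (names.size() == count) {
                        break;
                    }
                    names.add(name);
                }
            }
            result.put(key, names);
        }
        return result;
    }

    // Sample data modeled on the rows added earlier in this chapter.
    static Map<String, TreeMap<String, String>> sampleData() {
        Map<String, TreeMap<String, String>> cf = new TreeMap<String, TreeMap<String, String>>();
        TreeMap<String, String> catalog1 = new TreeMap<String, String>();
        catalog1.put("journal", "Oracle Magazine");
        catalog1.put("publisher", "Oracle Publishing");
        catalog1.put("edition", "November-December 2013");
        catalog1.put("title", "Quintessential and Collaborative");
        catalog1.put("author", "Tom Haunert");
        cf.put("catalog1", catalog1);
        TreeMap<String, String> catalog3 = new TreeMap<String, String>();
        catalog3.put("journal", "Oracle Magazine");
        cf.put("catalog3", catalog3);
        return cf;
    }

    public static void main(String[] args) {
        Map<String, TreeMap<String, String>> cf = sampleData();
        System.out.println(multiget(cf, 5, "catalog1", "catalog3"));
        // With a count of 3, only the first three column names of each row are returned.
        System.out.println(multiget(cf, 3, "catalog1", "catalog3")); // prints {catalog1=[author, edition, journal], catalog3=[journal]}
    }
}
```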
The MultigetSliceQuery
interface discussed in the preceding section sets the row keys for which columns are to be retrieved explicitly. Alternatively, you can use the RangeSlicesQuery<K,N,V>
interface to set the row keys as a range instead of setting each key explicitly. For example, if row key values "catalog1"
, "catalog2"
, "catalog3"
, "catalog4"
, and "catalog5"
are defined, you could set the range to start at "catalog1"
and end at "catalog5"
to include all the row key values in between. Some of the methods in the RangeSlicesQuery<K,N,V>
interface are discussed in Table 1.28.
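The difference between naming each key and naming a key range can be illustrated with a sorted set. The sketch below is not Hector code; the KeyRangeSketch class is hypothetical, and it assumes that row keys are kept in their natural string order. In a real cluster, the order in which a key range is traversed depends on the partitioner in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Illustrative model of selecting row keys by range rather than by explicit list.
class KeyRangeSketch {

    // Returns every row key from start to end, inclusive, in sorted order.
    static List<String> keysInRange(TreeSet<String> rowKeys, String start, String end) {
        return new ArrayList<String>(rowKeys.subSet(start, true, end, true));
    }

    public static void main(String[] args) {
        TreeSet<String> rowKeys = new TreeSet<String>(
                Arrays.asList("catalog1", "catalog2", "catalog3", "catalog4", "catalog5"));
        // Only the two end points are named; the keys in between are included automatically.
        System.out.println(keysInRange(rowKeys, "catalog1", "catalog5")); // prints [catalog1, catalog2, catalog3, catalog4, catalog5]
        System.out.println(keysInRange(rowKeys, "catalog1", "catalog3")); // prints [catalog1, catalog2, catalog3]
    }
}
```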
Add a retrieveTableDataRangeSlicesQuery()
method to use the RangeSlicesQuery<K,N,V>
interface. The HFactory
class provides the method discussed in Table 1.29 to create a RangeSlicesQuery
instance.
Using StringSerializer
instances, create a RangeSlicesQuery<String, String, String>
instance using the HFactory
method createRangeSlicesQuery
.
RangeSlicesQuery<String, String, String> rangeSlicesQuery = HFactory.createRangeSlicesQuery(keyspace, StringSerializer.get(), StringSerializer.get(), StringSerializer.get());
Next, set the column family to "catalog"
and set the range of keys to start at "catalog1"
and end at "catalog3"
.
rangeSlicesQuery.setColumnFamily("catalog");
rangeSlicesQuery.setKeys("catalog1", "catalog3");
Set the range of columns to include all the columns as indicated by the empty strings for start
and finish
. Set the number of columns to get to 5
.
rangeSlicesQuery.setRange("", "", false, 5);
Next, invoke the execute()
method on the RangeSlicesQuery<String, String, String>
instance to make the query. The result is returned as a QueryResult<OrderedRows< String, String, String>>
instance.
QueryResult<OrderedRows<String, String, String>> result = rangeSlicesQuery.execute();
Invoke the get()
method on the QueryResult
instance to get the result value. Then invoke the getByKey method on the returned OrderedRows instance to get each row retrieved.
System.out.println(result.get().getByKey("catalog1"));
System.out.println(result.get().getByKey("catalog2"));
System.out.println(result.get().getByKey("catalog3"));
Invoke the retrieveTableDataRangeSlicesQuery()
method in the main
method and run the HectorClient
class to output the result. The result of the query as output in Eclipse is shown in Figure 1.29.
In this section, you will update row data added previously. The ColumnFamilyUpdater<K,N>
class is used to update a row of data and provides the constructors discussed in Table 1.30.
Alternatively, a ColumnFamilyUpdater
may be created using a ThriftColumnFamilyTemplate
instance, which provides the methods discussed in Table 1.31 for creating a ColumnFamilyUpdater
.
Add an updateTableData()
method to update a row of data. Create a ColumnFamilyUpdater<String, String>
instance using the createUpdater(K key)
method with the supplied key—for example, "catalog2"
.
ColumnFamilyUpdater<String, String> updater = template.createUpdater("catalog2");
The ColumnFamilyUpdater
class provides several methods for setting an updated value for a column, some of which are listed in Table 1.32.
Set the updated values for the columns in the row for key "catalog2"
.
updater.setString("journal", "Oracle-Magazine");
updater.setString("publisher", "Oracle-Publishing");
updater.setString("edition", "11/12 2013");
updater.setString("title", "Engineering as a Service");
updater.setString("author", "Kelly, David A.");
When a ColumnFamilyUpdater
instance has been constructed with the updated values, you can invoke the update(ColumnFamilyUpdater<K, N> updater)
method to update.
try {
    template.update(updater);
} catch (HectorException e) {
    // Report the failure instead of silently ignoring it.
    e.printStackTrace();
}
Invoke the updateTableData()
method from the main
method and run the HectorClient application to update the row with key "catalog2"
. Then query row "catalog2"
using the retrieveTableData()
method to output the updated values, as shown in Figure 1.30.
Next, you will delete data from the Cassandra database. As when adding columns to a row, you need to create a Mutator
instance using a Keyspace
instance and a StringSerializer
.
Mutator<String> mutator = HFactory.createMutator(keyspace,StringSerializer.get());
As with adding data, you can delete data as a single column or delete multiple columns of data as a batch.
The Mutator
interface provides the method discussed in Table 1.33 for deleting a column.
Add a deleteTableDataColumn()
method to the HectorClient
class. Then delete the "journal"
column in the "catalog"
column family from the row with key "catalog3", using a StringSerializer for the column name.
mutator.delete("catalog3", "catalog", "journal", StringSerializer.get());
Invoke the deleteTableDataColumn()
method in the main
method and run the HectorClient application. The delete
method returns a MutationResult
instance. Invoke the retrieveTableDataMultigetSliceQuery()
method after invoking the deleteTableDataColumn()
method to output the modified row set. The row set output using the retrieveTableDataMultigetSliceQuery()
method before a single column is deleted is shown in Figure 1.31.
Figure 1.32 shows the row set output using the retrieveTableDataMultigetSliceQuery()
method after a single column is deleted. The journal
column is not included in the "catalog3"
row because the column has been deleted.
In this section, you will delete multiple columns from a row. The Mutator
interface provides the overloaded addDeletion
methods for deleting multiple columns from a row. Some of the overloaded addDeletion
methods are listed in Table 1.34.
All the addDeletion
methods return a Mutator
instance, which can be used to invoke the addDeletion
method again to link a batch of deletions. Add a deleteTableData()
method to delete a batch of columns. Then create a Mutator
instance from the HFactory
class.
Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
Invoke the addDeletion() method multiple times in sequence to add delete mutations for the "journal", "publisher", and "edition" columns from the "catalog2" row in the "catalog" column family. Adding delete mutations with the addDeletion() method does not delete the columns by itself. Invoke the execute() method to apply the delete mutations added to the Mutator instance.
mutator.addDeletion("catalog2", "catalog", "journal", StringSerializer.get())
    .addDeletion("catalog2", "catalog", "publisher", StringSerializer.get())
    .addDeletion("catalog2", "catalog", "edition", StringSerializer.get())
    .execute();
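The deferred behavior of a batch (addDeletion() only queues work, and execute() applies it) can be illustrated without a running cluster. The following is a minimal in-memory sketch, not Hector code: BatchDeleteSketch and its put(), hasColumn(), and simplified addDeletion() and execute() methods are hypothetical names that only mimic the fluent style of the Mutator interface.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a batching mutator; NOT part of Hector.
class BatchDeleteSketch {
    // row key -> (column name -> value), modelling a single column family
    private final Map<String, Map<String, String>> rows = new HashMap<>();
    // queued deletions, each stored as a {rowKey, columnName} pair
    private final List<String[]> pending = new ArrayList<>();

    // Stores a column and returns this object so calls can be chained.
    public BatchDeleteSketch put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
        return this;
    }

    // Queues a deletion only; nothing is removed until execute() is called.
    public BatchDeleteSketch addDeletion(String rowKey, String column) {
        pending.add(new String[] { rowKey, column });
        return this;
    }

    // Applies all queued deletions, clears the queue, and returns how many
    // columns were actually removed.
    public int execute() {
        int removed = 0;
        for (String[] deletion : pending) {
            Map<String, String> row = rows.get(deletion[0]);
            if (row != null && row.remove(deletion[1]) != null) {
                removed++;
            }
        }
        pending.clear();
        return removed;
    }

    public boolean hasColumn(String rowKey, String column) {
        Map<String, String> row = rows.get(rowKey);
        return row != null && row.containsKey(column);
    }

    public static void main(String[] args) {
        BatchDeleteSketch sketch = new BatchDeleteSketch()
                .put("catalog2", "journal", "Oracle Magazine")
                .put("catalog2", "publisher", "Oracle Publishing")
                .put("catalog2", "title", "Engineering as a Service");
        sketch.addDeletion("catalog2", "journal")
              .addDeletion("catalog2", "publisher");
        System.out.println("queued only: " + sketch.hasColumn("catalog2", "journal"));
        int removed = sketch.execute();
        System.out.println("removed: " + removed);
        System.out.println("after execute: " + sketch.hasColumn("catalog2", "journal"));
    }
}
```

Returning the same object from each addDeletion() call is what makes the chained form possible. Keeping the queue separate from the stored data is what makes the batch take effect only on execute().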
Invoke the deleteTableData() method in the main method and run the HectorClient application to delete the columns added using the addDeletion() method. If the retrieveTableData() method is invoked before the batch deletions are applied, the query result shown in Figure 1.33 is output.
If the retrieveTableData() method is invoked after the batch deletions are applied, the query result shown in Figure 1.34 is output. The "journal", "publisher", and "edition" columns are shown as deleted.
HectorClient CLASS
The HectorClient class appears in Listing 1.1.
Listing 1.1 The HectorClient Class
package hector;

import java.util.Arrays;

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.ColumnSliceIterator;
import me.prettyprint.cassandra.service.ThriftKsDef;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.beans.OrderedRows;
import me.prettyprint.hector.api.beans.Rows;
import me.prettyprint.hector.api.ddl.ColumnFamilyDefinition;
import me.prettyprint.hector.api.ddl.ComparatorType;
import me.prettyprint.hector.api.ddl.KeyspaceDefinition;
import me.prettyprint.hector.api.exceptions.HectorException;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.MutationResult;
import me.prettyprint.hector.api.mutation.Mutator;
import me.prettyprint.hector.api.query.ColumnQuery;
import me.prettyprint.hector.api.query.MultigetSliceQuery;
import me.prettyprint.hector.api.query.Query;
import me.prettyprint.hector.api.query.QueryResult;
import me.prettyprint.hector.api.query.RangeSlicesQuery;
import me.prettyprint.hector.api.query.SliceQuery;
import me.prettyprint.cassandra.service.template.ColumnFamilyResult;
import me.prettyprint.cassandra.service.template.ColumnFamilyTemplate;
import me.prettyprint.cassandra.service.template.ColumnFamilyUpdater;
import me.prettyprint.cassandra.service.template.ThriftColumnFamilyTemplate;

public class HectorClient {

    private static Cluster cluster;
    private static Keyspace keyspace;
    private static ColumnFamilyTemplate<String, String> template;

    public static void main(String[] args) {
        cluster = HFactory.getOrCreateCluster("hector_cluster", "localhost:9160");
        // Create the schema only if the keyspace does not exist yet.
        KeyspaceDefinition keyspaceDef = cluster.describeKeyspace("HectorKeyspace");
        if (keyspaceDef == null) {
            createSchema();
        }
        createKeyspace();
        createTemplate();
        addTableData();
        // Uncomment the calls below to run the other examples.
        // addTableDataColumn();
        // deleteTableDataColumn();
        // addTableDataColumn();
        // retrieveTableDataColumnQuery();
        // updateTableData();
        // deleteTableDataColumn();
        // retrieveTableDataColumnQuery();
        // deleteTableData();
        // retrieveTableData();
        // retrieveTableDataSliceQuery();
        retrieveTableDataMultigetSliceQuery();
    }

    private static void createSchema() {
        int replicationFactor = 1;
        ColumnFamilyDefinition cfDef = HFactory.createColumnFamilyDefinition(
                "HectorKeyspace", "catalog", ComparatorType.BYTESTYPE);
        KeyspaceDefinition keyspace = HFactory.createKeyspaceDefinition(
                "HectorKeyspace", ThriftKsDef.DEF_STRATEGY_CLASS,
                replicationFactor, Arrays.asList(cfDef));
        cluster.addKeyspace(keyspace, true);
    }

    private static void createKeyspace() {
        keyspace = HFactory.createKeyspace("HectorKeyspace", cluster);
    }

    private static void createTemplate() {
        template = new ThriftColumnFamilyTemplate<String, String>(
                keyspace, "catalog", StringSerializer.get(), StringSerializer.get());
    }

    private static void addTableData() {
        Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
        // Queue the insertions for both rows, then send them as one batch.
        mutator.addInsertion("catalog1", "catalog",
                        HFactory.createStringColumn("journal", "Oracle Magazine"))
                .addInsertion("catalog1", "catalog",
                        HFactory.createStringColumn("publisher", "Oracle Publishing"))
                .addInsertion("catalog1", "catalog",
                        HFactory.createStringColumn("edition", "November-December 2013"))
                .addInsertion("catalog1", "catalog",
                        HFactory.createStringColumn("title", "Quintessential and Collaborative"))
                .addInsertion("catalog1", "catalog",
                        HFactory.createStringColumn("author", "Tom Haunert"));
        mutator.addInsertion("catalog2", "catalog",
                        HFactory.createStringColumn("journal", "Oracle Magazine"))
                .addInsertion("catalog2", "catalog",
                        HFactory.createStringColumn("publisher", "Oracle Publishing"))
                .addInsertion("catalog2", "catalog",
                        HFactory.createStringColumn("edition", "November-December 2013"))
                .addInsertion("catalog2", "catalog",
                        HFactory.createStringColumn("title", "Engineering as a Service"))
                .addInsertion("catalog2", "catalog",
                        HFactory.createStringColumn("author", "David A. Kelly"));
        mutator.execute();
    }

    private static void retrieveTableData() {
        try {
            // Query each row by key and print its columns.
            for (String key : new String[] { "catalog1", "catalog2" }) {
                ColumnFamilyResult<String, String> res = template.queryColumns(key);
                if (res.hasResults()) {
                    System.out.println(res.getString("journal"));
                    System.out.println(res.getString("publisher"));
                    System.out.println(res.getString("edition"));
                    System.out.println(res.getString("title"));
                    System.out.println(res.getString("author"));
                }
            }
        } catch (HectorException e) {
            e.printStackTrace();
        }
    }

    private static void retrieveTableDataColumnQuery() {
        ColumnQuery<String, String, String> columnQuery =
                HFactory.createStringColumnQuery(keyspace);
        columnQuery.setColumnFamily("catalog").setKey("catalog3").setName("journal");
        // columnQuery.setColumnFamily("catalog").setKey("catalog1").setName("journal");
        QueryResult<HColumn<String, String>> result = columnQuery.execute();
        System.out.println(result.get());
    }

    private static void retrieveTableDataSliceQuery() {
        SliceQuery<String, String, String> query = HFactory
                .createSliceQuery(keyspace, StringSerializer.get(),
                        StringSerializer.get(), StringSerializer.get())
                .setKey("catalog2").setColumnFamily("catalog");
        // Iterate over all columns between the lowest and highest character.
        ColumnSliceIterator<String, String, String> iterator =
                new ColumnSliceIterator<String, String, String>(
                        query, "\u0000", "\uFFFF", false);
        while (iterator.hasNext()) {
            HColumn<String, String> column = iterator.next();
            System.out.println(column.getName());
            System.out.println(column.getValue());
        }
    }

    private static void addTableDataColumn() {
        Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
        // insert() applies a single-column mutation immediately.
        MutationResult result = mutator.insert("catalog3", "catalog",
                HFactory.createStringColumn("journal", "Oracle Magazine"));
        System.out.println(result);
    }

    private static void updateTableData() {
        ColumnFamilyUpdater<String, String> updater = template.createUpdater("catalog2");
        updater.setString("journal", "Oracle-Magazine");
        updater.setString("publisher", "Oracle-Publishing");
        updater.setString("edition", "11/12 2013");
        updater.setString("title", "Engineering as a Service");
        updater.setString("author", "Kelly, David A.");
        try {
            template.update(updater);
        } catch (HectorException e) {
            e.printStackTrace();
        }
    }

    private static void deleteTableDataColumn() {
        Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
        // delete() removes a single column immediately.
        mutator.delete("catalog3", "catalog", "journal", StringSerializer.get());
    }

    private static void deleteTableData() {
        Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
        // Queue three deletions and apply them as one batch with execute().
        mutator.addDeletion("catalog2", "catalog", "journal", StringSerializer.get())
                .addDeletion("catalog2", "catalog", "publisher", StringSerializer.get())
                .addDeletion("catalog2", "catalog", "edition", StringSerializer.get())
                .execute();
    }

    private static void retrieveTableDataMultigetSliceQuery() {
        MultigetSliceQuery<String, String, String> multigetSliceQuery =
                HFactory.createMultigetSliceQuery(keyspace, StringSerializer.get(),
                        StringSerializer.get(), StringSerializer.get());
        multigetSliceQuery.setColumnFamily("catalog");
        multigetSliceQuery.setKeys("catalog1", "catalog2", "catalog3");
        // multigetSliceQuery.setRange("", "", false, 3);
        // multigetSliceQuery.setRange("", "", false, 2);
        multigetSliceQuery.setRange("", "", false, 5);
        QueryResult<Rows<String, String, String>> result = multigetSliceQuery.execute();
        System.out.println(result.get().getByKey("catalog1"));
        System.out.println(result.get().getByKey("catalog2"));
        System.out.println(result.get().getByKey("catalog3"));
    }

    private static void retrieveTableDataRangeSlicesQuery() {
        RangeSlicesQuery<String, String, String> rangeSlicesQuery =
                HFactory.createRangeSlicesQuery(keyspace, StringSerializer.get(),
                        StringSerializer.get(), StringSerializer.get());
        rangeSlicesQuery.setColumnFamily("catalog");
        rangeSlicesQuery.setKeys("catalog1", "catalog3");
        // rangeSlicesQuery.setRange("", "", false, 5);
        // rangeSlicesQuery.setRange("", "", false, 3);
        QueryResult<OrderedRows<String, String, String>> result = rangeSlicesQuery.execute();
        System.out.println(result.get().getByKey("catalog1"));
        System.out.println(result.get().getByKey("catalog2"));
        System.out.println(result.get().getByKey("catalog3"));
    }
}
This chapter discussed using the Hector Java client to access the Apache Cassandra database and perform create, read, update, and delete (CRUD) operations on its data. The Hector client supports adding and deleting column data as single columns or as a batch of columns, and retrieving column data as single columns or as a column slice. Row data may be queried one row at a time or as multiple rows in the same query. This chapter also discussed the interfaces and classes involved in the CRUD operations. The next chapter discusses the Cassandra Query Language (CQL) for querying Cassandra. You will use the Hector Java client to run the CQL queries.