Chapter 12. Persisting data reactively

This chapter covers

  • Spring Data’s reactive repositories
  • Writing reactive repositories for Cassandra and MongoDB
  • Adapting non-reactive repositories for reactive use
  • Data modeling with Cassandra

As I think about non-blocking reactive code and blocking imperative code, I start to think about rush hour. Rush hour is strangely named. Everybody seems to be in a rush to get where they’re going, but usually they’re all sitting near-motionless in traffic. If it weren’t for everyone else on the road, I’d have no trouble getting to my destination.

Even though I’m eager to get somewhere (I’m non-blocking), that doesn’t mean that someone else on the road isn’t blocking me somehow. There may be other motorists ahead who have had a fender bender and are literally blocking the road for other commuters. So even though my efforts to get home are essentially non-blocking, I’m still blocked until the accident scene is cleared up.

In the previous chapter, you saw how to create reactive, non-blocking controllers with Spring WebFlux. This helps to improve scalability in the web layer. But those controllers are only truly non-blocking if other components that they work with are also non-blocking. If we write Spring WebFlux controllers that still depend on blocking repositories, our reactive controllers will be blocked waiting for them to produce data.

Therefore, it’s important that the entire flow of data, all the way from the controllers to the database, be reactive and non-blocking. In this chapter, you’ll see how to write reactive repositories using Spring Data that follow a similar programming model as those you created in chapter 3. We’ll start by taking a high-level survey of Spring Data’s reactive support.

12.1. Understanding Spring Data’s reactive story

Beginning with the Spring Data Kay release train, Spring Data offered its first support for working with reactive repositories. This includes support for a reactive programming model when persisting data with Cassandra, MongoDB, Couchbase, or Redis.

What’s in a name?

Although Spring Data projects are versioned at their own pace, they’re collectively published in a release train, where each version of the release train is named for a significant figure in computer science.

These names are alphabetical in nature and include names such as Babbage, Codd, Dijkstra, Evans, Fowler, Gosling, Hopper, and Ingalls. At the time this is being written, the most recent release train version is Spring Data Kay, named after Alan Kay, one of the designers of the Smalltalk programming language.

You may have noticed that I failed to mention relational databases or JPA. Unfortunately, there’s no support for reactive JPA. Although relational databases are certainly the most prolific databases in the industry, supporting a reactive programming model with Spring Data JPA would require that the databases and JDBC drivers involved also support non-blocking reactive models. It’s unfortunate that, at least for now, there’s no support for working with relational databases reactively. Hopefully, this situation will be resolved in the near future.

In the meantime, this chapter focuses on using Spring Data to develop repositories that deal in reactive types for those databases that do support a reactive model. Let’s see how Spring Data’s reactive model compares to its non-reactive model.

12.1.1. Spring Data reactive distilled

The essence of Spring Data’s reactive story can be summed up by saying that reactive repositories have methods that accept and return Mono and Flux instead of domain entities and collections. A repository method that fetches Ingredient objects by ingredient type from the backing database might be declared as follows in the repository interface:

Flux<Ingredient> findByType(Ingredient.Type type);

As you can see, this findByType() method returns a Flux&lt;Ingredient> instead of a List&lt;Ingredient> or an Iterable&lt;Ingredient> as its non-reactive analog would.
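
For comparison, the non-reactive analog of that same method might be declared like this:

Iterable<Ingredient> findByType(Ingredient.Type type);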

Likewise, when saving a Taco, the repository would have a saveAll() method with the following signature:

Flux<Taco> saveAll(Publisher<Taco> tacoPublisher);

In this case, the saveAll() method accepts a Publisher of type Taco (either a Mono<Taco> or a Flux<Taco>) and returns a Flux<Taco>. This is in contrast to a non-reactive repository, which would have a save() method that deals with the domain type directly, accepting a Taco object and returning the saved Taco object.
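
To make that concrete, here’s a brief sketch of two helper methods that hand each kind of publisher to saveAll(); the helpers themselves are mine, for illustration only:

Flux<Taco> saveOne(TacoRepository repo, Mono<Taco> tacoMono) {
  return repo.saveAll(tacoMono);  // a Mono is a Publisher of at most one Taco
}

Flux<Taco> saveMany(TacoRepository repo, Flux<Taco> tacoFlux) {
  return repo.saveAll(tacoFlux);  // a Flux is a Publisher as well
}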

Put simply, Spring Data’s reactive repositories share a near-identical programming model with Spring Data’s non-reactive repositories, like those in chapter 3. The only material difference is that reactive repositories have methods that take and return Flux and Mono instead of raw domain types and collections.

12.1.2. Converting between reactive and non-reactive types

Before we look any further at how to write reactive repositories with Spring Data, let’s take a moment to address the elephant in the room. You may have an existing relational database and it may not be practical to migrate your data to one of the four databases that Spring Data supports with its reactive programming model. Does that mean you can’t apply reactive programming in your application at all?

Although the full benefit of reactive programming comes when you have a reactive model from end to end, including at the database level, there’s still some benefit to be had by using reactive flows on top of a non-reactive database. Even though your chosen database doesn’t support non-blocking reactive queries, you can still fetch data in a blocking fashion and then translate it into a reactive type as soon as possible for the benefit of upstream components.

Suppose, for example, that you’re working with a relational database and using Spring Data JPA for persistence. Your OrderRepository may have a method with the following signature:

List<Order> findByUser(User user);

This method will return a non-reactive List&lt;Order> containing all of the Order entities for a given User. When findByUser() is called, it will block while the query is executed and the results are collected into a List. Because List isn’t a reactive type, you won’t be able to perform any of the operations afforded by Flux on it. Moreover, if the caller is a controller, it won’t be able to work with the results reactively to achieve improved scalability.

You can’t do anything about the blocking nature of invoking a method on a JPA repository. What you can do, however, is convert the non-reactive List into a Flux as soon as you receive it, so that you can deal with the results reactively from there on. To do so, you simply use Flux.fromIterable():

List<Order> orders = repo.findByUser(someUser);
Flux<Order> orderFlux = Flux.fromIterable(orders);

Likewise, if you were to fetch a single Order by its ID, you could immediately convert it to a Mono:

Order order = repo.findById(someId);
Mono<Order> orderMono = Mono.just(order);

By using Mono.just() and the fromIterable(), fromArray(), and fromStream() methods of Flux, you can isolate the non-reactive blocking code in your repositories and deal with reactive types elsewhere in your application.
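
For example, if your blocking repository offered methods returning an array or a Java Stream (hypothetical methods shown here purely for illustration), the conversions would look much the same:

Order[] orderArray = repo.findByUserAsArray(someUser);    // hypothetical method
Flux<Order> fluxFromArray = Flux.fromArray(orderArray);

Stream<Order> orderStream = repo.streamByUser(someUser);  // hypothetical method
Flux<Order> fluxFromStream = Flux.fromStream(orderStream);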

What about going the other way? What if you have a Mono or Flux given to you and you need to call save() on a non-reactive JPA repository? Fortunately, Mono and Flux both have operations to extract the data that they publish into domain types or an Iterable.

For example, suppose a WebFlux controller accepts a Mono<Taco>, and you need to save it using the save() method in a Spring Data JPA repository. No problem—just call the block() method on the Mono to extract the Taco object:

Taco taco = tacoMono.block();
tacoRepo.save(taco);

As its name implies, the block() method performs a blocking operation to extract the object from the Mono.

As for extracting data from a Flux, you’ll likely want to use toIterable(). Let’s say you’re given a Flux<Taco> and need to call saveAll() on a Spring Data JPA repository. The following snippet of code shows how to extract an Iterable<Taco> from a Flux<Taco> to do precisely that:

Iterable<Taco> tacos = tacoFlux.toIterable();
tacoRepo.saveAll(tacos);

As with Mono.block(), Flux.toIterable() blocks as it collects all the objects published by the Flux into an Iterable. Because of their blocking nature, Mono.block() and Flux.toIterable() should be used sparingly and with the clear understanding that using them breaks out of the reactive programming model.
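
One small mitigation worth knowing about: block() can optionally be given a timeout so that a misbehaving publisher can’t stall the thread forever. A minimal sketch, assuming a five-second limit is acceptable:

Taco taco = tacoMono.block(Duration.ofSeconds(5));  // fails if no value arrives within 5s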

Another more reactive approach that avoids a blocking extraction operation is to subscribe to the Mono or Flux and perform the desired operation on each element as it’s published. For example, to save all Taco objects published by a Flux<Taco> when the repository is non-reactive, you might do something like this:

tacoFlux.subscribe(taco -> {
  tacoRepo.save(taco);
});

Even though the call to the repository’s save() method is still a non-reactive blocking operation, using subscribe() is a more natural, reactive approach to consuming and processing the data published by a Flux or Mono.
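
Reactor’s subscribe() also offers overloads that accept error and completion callbacks, which is useful here because the blocking save() could fail. A sketch, assuming an SLF4J logger named log is available:

tacoFlux.subscribe(
    taco -> tacoRepo.save(taco),
    error -> log.error("Unable to save taco", error),
    () -> log.info("Finished saving tacos"));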

But that’s enough talk about how to work with non-reactive repositories. Let’s start using the real power of Spring Data’s reactive support to create reactive repositories for the Taco Cloud application.

12.1.3. Developing reactive repositories

As you saw in chapter 3, one of the most amazing features of Spring Data is the ability to declare repository interfaces and have Spring Data automatically implement them at runtime. In that chapter, we focused primarily on Spring Data JPA, but the same programming model is applicable for nonrelational databases, including Cassandra and MongoDB.

Built on top of their non-reactive repository support, Spring Data Cassandra and Spring Data MongoDB both support a reactive model. With these databases in the backend providing data persistence, Spring applications can truly offer end-to-end reactive flows that span from the web layer to the database. Let’s start by looking at how to persist data to Cassandra using reactive Spring Data repositories.

12.2. Working with reactive Cassandra repositories

Cassandra is a distributed, high-performance, always available, eventually consistent, partitioned-row-store, NoSQL database.

That’s a mouthful of adjectives to describe a database, but each one accurately speaks to the power of working with Cassandra. To put it in simpler terms, Cassandra deals in rows of data, which are written to tables, which are partitioned across one-to-many distributed nodes. No single node carries all the data, but any given row may be replicated across multiple nodes, thus eliminating any single point of failure.

Spring Data Cassandra provides automatic repository support for the Cassandra database that’s quite similar to—and yet quite different from—what’s offered by Spring Data JPA for relational databases. In addition, Spring Data Cassandra offers mapping annotations to map application domain types to the backing database structures.

Before we explore Cassandra any further, it’s important to understand that although Cassandra shares many similar concepts with relational databases like Oracle and SQL Server, Cassandra isn’t a relational database and is in many ways quite a different beast. I’ll try to explain the idiosyncrasies of Cassandra as they pertain to working with Spring Data. But I encourage you to read Cassandra’s own documentation (http://cassandra.apache.org/doc/latest/) for a thorough understanding of what makes Cassandra tick.

Let’s get started by enabling Spring Data Cassandra in the Taco Cloud project.

12.2.1. Enabling Spring Data Cassandra

To get started using Spring Data Cassandra’s reactive repository support, you’ll need to add the Spring Boot starter dependency for reactive Spring Data Cassandra. There are actually two separate Spring Data Cassandra starter dependencies to choose from.

If you aren’t planning to write reactive repositories for Cassandra, you can use the following dependency in your build:

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-cassandra</artifactId>
</dependency>

This dependency is also available from the Initializr by checking the Cassandra check box.

In this chapter, however, we focus on writing reactive repositories, so you’ll want to use the other starter dependency that enables reactive Cassandra repositories:

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>
    spring-boot-starter-data-cassandra-reactive
  </artifactId>
</dependency>

If you’re using the Spring Initializr to create your project, you can get this dependency in your build by checking the Reactive Cassandra check box.

It’s important to understand that this dependency is in lieu of the Spring Data JPA starter dependency. Instead of persisting Taco Cloud data to a relational database with JPA, you’ll be using Spring Data to persist data to a Cassandra database. Therefore, you’ll probably want to remove the Spring Data JPA starter dependency and any relational database dependencies (such as JDBC drivers or the H2 dependency) from the build.

The Spring Data Reactive Cassandra starter dependency brings a handful of dependencies to the project, among which are the Spring Data Cassandra library and Reactor. As a result of those libraries being in the runtime classpath, autoconfiguration for creating reactive Cassandra repositories is triggered. This means you’re able to begin writing reactive Cassandra repositories with little explicit configuration.

You’ll need to provide a small amount of configuration, though. At the very least, you’ll need to configure the name of a key space within which your repositories will operate. To do that, you’ll first need to create such a key space.

Note

In Cassandra, a keyspace is a grouping of tables in a Cassandra node. It’s roughly analogous to how tables, views, and constraints are grouped in a relational database.

Although it’s possible to configure Spring Data Cassandra to create the key space automatically, it’s typically much easier to manually create it yourself (or to use an existing key space). Using the Cassandra CQL (Cassandra Query Language) shell, you can create a key space for the Taco Cloud application with the following create key-space command:

cqlsh> create keyspace tacocloud
   ... with replication={'class':'SimpleStrategy', 'replication_factor':1}
   ... and durable_writes=true;

Put simply, this will create a key space named tacocloud with simple replication and durable writes. By setting the replication factor to 1, you ask Cassandra to keep one copy of each row. The replication strategy determines how replication is handled. The SimpleStrategy replication strategy is fine for single data center use (and for demo code), but you might consider the NetworkTopologyStrategy if you have your Cassandra cluster spread across multiple data centers. I refer you to the Cassandra documentation for more details of how replication strategies work and alternative ways of creating key spaces.

Now that you’ve created a key space, you need to configure the spring.data.cassandra.keyspace-name property to tell Spring Data Cassandra to use that key space:

spring:
  data:
    cassandra:
      keyspace-name: tacocloud
      schema-action: recreate-drop-unused

Here, you also set the spring.data.cassandra.schema-action to recreate-drop-unused. This setting is very useful for development purposes because it ensures that any tables and user-defined types will be dropped and recreated every time the application starts. The default value, none, takes no action against the schema and is useful in production settings where you’d rather not drop all tables whenever an application starts up.

These are the only properties you’ll need for working with a locally running Cassandra database. In addition to these two properties, however, you may wish to set others, depending on how you’ve configured your Cassandra cluster.

By default, Spring Data Cassandra assumes that Cassandra is running locally and listening on port 9042. If that’s not the case, as in a production setting, you may want to set the spring.data.cassandra.contact-points and spring.data.cassandra.port properties:

spring:
  data:
    cassandra:
      keyspace-name: tacocloud
      contact-points:
      - casshost-1.tacocloud.com
      - casshost-2.tacocloud.com
      - casshost-3.tacocloud.com
      port: 9043

Notice that the spring.data.cassandra.contact-points property is where you identify the hostname(s) of Cassandra. A contact point is the host where a Cassandra node is running. By default, it’s set to localhost, but you can set it to a list of hostnames. It will try each contact point until it’s able to connect to one. This is to ensure that there’s no single point of failure in the Cassandra cluster and that the application will be able to connect with the cluster through one of the given contact points.

You may also need to specify a username and password for your Cassandra cluster. This can be done by setting the spring.data.cassandra.username and spring.data.cassandra.password properties:

spring:
  data:
    cassandra:
      ...
      username: tacocloud
      password: s3cr3tP455w0rd

Now that Spring Data Cassandra is enabled and configured in your project, you’re almost ready to map your domain types to Cassandra tables and write repositories. But first, let’s step back and consider a few basic points of Cassandra data modeling.

12.2.2. Understanding Cassandra data modeling

As I mentioned, Cassandra is quite different from a relational database. Before you can start mapping your domain types to Cassandra tables, it’s important to understand a few of the ways that Cassandra data modeling is different from how you might model your data for persistence in a relational database.

These are a few of the most important things to understand about Cassandra data modeling:

  • Cassandra tables may have any number of columns, but not all rows will necessarily use all of those columns.
  • Cassandra databases are split across multiple partitions. Any row in a given table may be managed by one or more partitions, but it’s unlikely that all partitions will have all rows.
  • A Cassandra table has two kinds of keys: partition keys and clustering keys. Hash operations are performed on each row’s partition key to determine which partition(s) that row will be managed by. Clustering keys determine the order in which the rows are maintained within a partition (not necessarily the order that they may appear in the results of a query).
  • Cassandra is highly optimized for read operations. As such, it’s common and desirable for tables to be highly denormalized and for data to be duplicated across multiple tables. (For example, customer information may be kept in a customer table as well as duplicated in a table containing orders placed by customers.)

Suffice it to say that adapting the Taco Cloud domain types to work with Cassandra won’t be a matter of simply swapping out a few JPA annotations for Cassandra annotations. You’ll have to rethink how you model the data.

12.2.3. Mapping domain types for Cassandra persistence

In chapter 3, you marked up your domain types (Taco, Ingredient, Order, and so on) with annotations provided by the JPA specification. These annotations mapped your domain types as entities to be persisted to a relational database. Although those annotations won’t work for Cassandra persistence, Spring Data Cassandra provides its own set of mapping annotations for a similar purpose.

Let’s start with the Ingredient class, as it’s the simplest to map for Cassandra. The new Cassandra-ready Ingredient class looks like this:

package tacos;
import org.springframework.data.cassandra.core.mapping.PrimaryKey;
import org.springframework.data.cassandra.core.mapping.Table;
import lombok.AccessLevel;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.RequiredArgsConstructor;

@Data
@RequiredArgsConstructor
@NoArgsConstructor(access=AccessLevel.PRIVATE, force=true)
@Table("ingredients")
public class Ingredient {

  @PrimaryKey
  private final String id;
  private final String name;
  private final Type type;

  public static enum Type {
    WRAP, PROTEIN, VEGGIES, CHEESE, SAUCE
  }

}

The Ingredient class seems to contradict everything I said about just swapping out a few annotations. Rather than annotating the class with @Entity as you did for JPA persistence, it’s annotated with @Table to indicate that ingredients should be persisted to a table named ingredients. And rather than annotate the id property with @Id, this time it’s annotated with @PrimaryKey. So far, it seems that you’re only swapping out a few annotations.

But don’t let the Ingredient mapping fool you. The Ingredient class is one of your simplest domain types. Things get more interesting when you map the Taco class for Cassandra persistence.

Listing 12.1. Annotating the Taco class for Cassandra persistence
package tacos;
import java.util.Date;
import java.util.List;
import java.util.UUID;
import javax.validation.constraints.NotNull;
import javax.validation.constraints.Size;
import org.springframework.data.cassandra.core.cql.Ordering;
import org.springframework.data.cassandra.core.cql.PrimaryKeyType;
import org.springframework.data.cassandra.core.mapping.Column;
import org.springframework.data.cassandra.core.mapping.PrimaryKeyColumn;
import org.springframework.data.cassandra.core.mapping.Table;
import org.springframework.data.rest.core.annotation.RestResource;
import com.datastax.driver.core.utils.UUIDs;
import lombok.Data;

@Data
@RestResource(rel="tacos", path="tacos")
@Table("tacos")                                                   1
public class Taco {

  @PrimaryKeyColumn(type=PrimaryKeyType.PARTITIONED)              2
  private UUID id = UUIDs.timeBased();

  @NotNull
  @Size(min=5, message="Name must be at least 5 characters long")
  private String name;

  @PrimaryKeyColumn(type=PrimaryKeyType.CLUSTERED,                3
                    ordering=Ordering.DESCENDING)
  private Date createdAt = new Date();

  @Size(min=1, message="You must choose at least 1 ingredient")
  @Column("ingredients")                                          4
  private List<IngredientUDT> ingredients;

}

  • 1 Persists to tacos table
  • 2 Defines the partition key
  • 3 Defines the clustering key
  • 4 Maps list to ingredients column

As you can see, mapping the Taco class is a bit more involved. As with Ingredient, the @Table annotation is used to identify tacos as the name of the table that tacos should be written to. But that’s the only thing similar to Ingredient.

The id property is still your primary key, but it’s only one of two primary key columns. More specifically, the id property is annotated with @PrimaryKeyColumn with a type of PrimaryKeyType.PARTITIONED. This specifies that the id property serves as the partition key, used to determine which Cassandra partition(s) each row of taco data will be written to.

You’ll also notice that the id property is now a UUID instead of a Long. Although it’s not required, properties that hold a generated ID value are commonly of type UUID. Moreover, the UUID is initialized with a time-based UUID value for new Taco objects (though it may be overridden when reading an existing Taco from the database).

A little further down, you see the createdAt property that’s mapped as another primary key column. But in this case, the type attribute of @PrimaryKeyColumn is set to PrimaryKeyType.CLUSTERED, which designates the createdAt property as a clustering key. As mentioned earlier, clustering keys are used to determine the ordering of rows within a partition. More specifically, the ordering is set to descending order—therefore, within a given partition, newer rows appear first in the tacos table.

Finally, the ingredients property is now a List of IngredientUDT objects instead of a List of Ingredient objects. As you’ll recall, Cassandra tables are highly denormalized and may contain data that’s duplicated from other tables. Although the ingredient table will serve as the table of record for all available ingredients, the ingredients chosen for a taco will be duplicated in the ingredients column. Rather than simply reference one or more rows in the ingredients table, the ingredients property will contain full data for each chosen ingredient.

But why do you need to introduce a new IngredientUDT class? Why can’t you just reuse the Ingredient class? Put simply, columns that contain collections of data, such as the ingredients column, must be collections of native types (integers, strings, and so on) or must be collections of user-defined types.

In Cassandra, user-defined types enable you to declare table columns that are richer than simple native types. Often they’re used as a denormalized analog for relational foreign keys. In contrast to foreign keys, which only hold a reference to a row in another table, columns with user-defined types actually carry data that may be copied from a row in another table. In the case of the ingredients column in the tacos table, it will contain a collection of data structures that define the ingredients themselves.

You can’t use the Ingredient class as a user-defined type, because the @Table annotation has already mapped it as an entity for persistence in Cassandra. Therefore, you must create a new class to define how ingredients will be stored in the ingredients column of the tacos table. IngredientUDT (where “UDT” means user-defined type) is the class for the job:

package tacos;

import org.springframework.data.cassandra.core.mapping.UserDefinedType;

import lombok.AccessLevel;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.RequiredArgsConstructor;

@Data
@RequiredArgsConstructor
@NoArgsConstructor(access=AccessLevel.PRIVATE, force=true)
@UserDefinedType("ingredient")
public class IngredientUDT {

  private final String name;
  private final Ingredient.Type type;

}

Although IngredientUDT looks a lot like Ingredient, its mapping requirements are much simpler. It’s annotated with @UserDefinedType to identify it as a user-defined type in Cassandra. But otherwise, it’s a simple class with a few properties.

You’ll also note that the IngredientUDT class doesn’t include an id property. Although it could include a copy of the id property from the source Ingredient, that’s not necessary. In fact, the user-defined type may include any properties you wish—it doesn’t need to be a one-to-one mapping with any table definition.

I realize that it might be difficult to visualize how data in a user-defined type relates to data that’s persisted to a table. Figure 12.1 shows the data model for the entire Taco Cloud database, including user-defined types.

Figure 12.1. Instead of using foreign keys and joins, Cassandra tables are denormalized, with user-defined types containing data copied from related tables.

Specific to the user-defined type that you just created, notice how Taco has a list of IngredientUDT, which holds data copied from Ingredient objects. When a Taco is persisted, it’s the Taco object and the list of IngredientUDT that’s persisted to the tacos table. The list of IngredientUDT is persisted entirely within the ingredients column.
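
Spring Data won’t copy the ingredient data for you, though. When constructing a Taco, you’ll need to convert each chosen Ingredient into an IngredientUDT yourself. A small helper method will do; this one is a sketch of my own, not part of Spring Data’s API:

// hypothetical helper that copies Ingredient data into the user-defined type
public static IngredientUDT toIngredientUDT(Ingredient ingredient) {
  return new IngredientUDT(ingredient.getName(), ingredient.getType());
}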

Another way of looking at this that might help you understand how user-defined types are used is to query the database for rows from the tacos table. Using CQL and the cqlsh tool that comes with Cassandra, you see the following results:

cqlsh:tacocloud> select id, name, createdAt, ingredients from tacos;

 id       | name      | createdat | ingredients
----------+-----------+-----------+----------------------------------------
 827390...| Carnivore | 2018-04...| [{name: 'Flour Tortilla', type: 'WRAP'},
                                     {name: 'Carnitas', type: 'PROTEIN'},
                                     {name: 'Sour Cream', type: 'SAUCE'},
                                     {name: 'Salsa', type: 'SAUCE'},
                                     {name: 'Cheddar', type: 'CHEESE'}]

(1 rows)

As you can see, the id, name, and createdat columns contain simple values. In that regard, they aren’t much different than what you’d expect from a similar query against a relational database. But the ingredients column is a little different. Because it’s defined as containing a collection of the user-defined ingredient type (defined by IngredientUDT), its value appears as a JSON array filled with JSON objects.

You likely noticed other user-defined types in figure 12.1. You’ll certainly be creating some more as you continue mapping your domain to Cassandra tables, including some that will be used by the Order class. The next listing shows the Order class, modified for Cassandra persistence.

Listing 12.2. Mapping the Order class to a Cassandra tacoorders table
@Data
@Table("tacoorders")                                      1
public class Order implements Serializable {

  private static final long serialVersionUID = 1L;

  @PrimaryKey                                             2
  private UUID id = UUIDs.timeBased();

  private Date placedAt = new Date();

  @Column("user")                                         3
  private UserUDT user;

  // delivery and credit card properties omitted for brevity's sake

  @Column("tacos")                                        4
  private List<TacoUDT> tacos = new ArrayList<>();

  public void addDesign(TacoUDT design) {
    this.tacos.add(design);
  }

}

  • 1 Maps to tacoorders table
  • 2 Declares the primary key
  • 3 Maps to the user column
  • 4 Maps a list to the tacos column

Listing 12.2 purposefully omits many of the properties of Order that don’t lend themselves to a discussion of Cassandra data modeling. What’s left are a few properties and mappings, similar to how Taco was defined. @Table is used to map Order to the tacoorders table, much as @Table has been used before. In this case, you’re unconcerned with ordering, so the id property is simply annotated with @PrimaryKey, designating it as both a partition key and a clustering key with default ordering.

The tacos property is of some interest in that it’s a List<TacoUDT> instead of a list of Taco objects. The relationship between Order and Taco/TacoUDT here is similar to the relationship between Taco and Ingredient/IngredientUDT. That is, rather than joining data from several rows in a separate table through foreign keys, the Order table will contain all of the pertinent taco data, optimizing the table for quick reads.

Similarly, the user property references a UserUDT property to be persisted within the user column. Again, this is in contrast to the relational database strategy of joining in another table.

As for the TacoUDT class, it’s quite similar to the IngredientUDT class, although it does include a collection that references another user-defined type:

@Data
@UserDefinedType("taco")
public class TacoUDT {

  private final String name;
  private final List<IngredientUDT> ingredients;

}

The UserUDT class is only marginally more interesting in that it has three properties instead of two:

@UserDefinedType("user")
@Data
public class UserUDT {

  private final String username;
  private final String fullname;
  private final String phoneNumber;

}

Although it would have been nice to reuse the same domain classes you created in chapter 3, or at most to swap out some JPA annotations for Cassandra annotations, the nature of Cassandra persistence is such that it requires you to rethink how your data is modeled. But now that you’ve mapped your domain, you’re ready to write repositories.

12.2.4. Writing reactive Cassandra repositories

As you saw in chapter 3, writing a repository with Spring Data involves simply declaring an interface that extends one of Spring Data’s base repository interfaces and optionally declaring additional query methods for custom queries. As it turns out, writing reactive repositories isn’t much different. The primary difference is that you’ll extend a different base repository interface, and your methods will deal with reactive publishers such as Mono and Flux, instead of domain types and collections.

When it comes to writing reactive Cassandra repositories, you have the choice of two base interfaces: ReactiveCassandraRepository and ReactiveCrudRepository. Which one you choose largely depends on how the repository will be used. ReactiveCassandraRepository extends ReactiveCrudRepository to offer a few variations of an insert() method, which is optimized for when the object to be saved is new. Otherwise, ReactiveCassandraRepository offers the same operations as ReactiveCrudRepository. If you’ll be inserting a lot of data, you might choose ReactiveCassandraRepository. Otherwise, it’s better to stick with ReactiveCrudRepository, which is more portable across other database types.
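
If you do opt for ReactiveCassandraRepository, the insert() variations it adds look roughly like this (signatures paraphrased; see the interface’s Javadoc for the authoritative declarations):

<S extends T> Mono<S> insert(S entity);
<S extends T> Flux<S> insert(Iterable<S> entities);
<S extends T> Flux<S> insert(Publisher<S> entities);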

Do my Cassandra repositories have to be reactive?

Although this chapter is about writing reactive repositories with Spring Data, you may be interested to know that you can write non-reactive repositories for Cassandra as well. Rather than extend ReactiveCrudRepository or ReactiveCassandraRepository, your repository interfaces can extend the non-reactive CrudRepository or CassandraRepository interfaces. Then your repository methods can simply return Cassandra-annotated domain types and collections of those domain types instead of Flux and Mono.

If you decide to work with non-reactive repositories, you can also change the starter dependency to spring-boot-starter-data-cassandra instead of spring-boot-starter-data-cassandra-reactive, although it’s not strictly required that you do so.

Revisiting some of the repositories you’ve already written for the Taco Cloud application, the first thing you should do to make them reactive is to have them extend ReactiveCrudRepository or ReactiveCassandraRepository instead of CrudRepository. For example, consider IngredientRepository. Aside from initializing the database with ingredient data, you won’t be inserting many new ingredients. Therefore, IngredientRepository can extend ReactiveCrudRepository as shown here:

public interface IngredientRepository
         extends ReactiveCrudRepository<Ingredient, String> {
}

You never defined any custom query methods in IngredientRepository, so there’s not much else you need to do to make IngredientRepository a reactive repository. But because it now extends ReactiveCrudRepository, its methods will deal in terms of Flux and Mono. For example, the findAll() method now returns Flux&lt;Ingredient> instead of an Iterable&lt;Ingredient>. Consequently, you’ll need to be sure it’s used properly wherever it’s called. The allIngredients() method in IngredientController, for instance, will need to be rewritten to return a Flux&lt;Ingredient>:

@GetMapping
public Flux<Ingredient> allIngredients() {
  return repo.findAll();
}

The changes to TacoRepository are only subtly more complicated. Instead of extending PagingAndSortingRepository, it will need to extend ReactiveCassandraRepository. And instead of being parameterized for Taco objects with Long ID properties, it will need to work with Taco objects with UUID properties for their IDs:

public interface TacoRepository
         extends ReactiveCassandraRepository<Taco, UUID> {
}

Because this new TacoRepository will return Flux&lt;Taco> from its findAll() method, you no longer need to worry about it extending PagingAndSortingRepository or working with a page of results. Instead, the recentTacos() method of DesignTacoController will just need to call take() on the returned Flux to limit the number of Taco objects consumed. (In fact, you already made this change to DesignTacoController and its recentTacos() method in section 11.1.2.)
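
For reference, here’s a sketch of recentTacos() with take() applied, much as it appeared in section 11.1.2:

@GetMapping("/recent")
public Flux<Taco> recentTacos() {
  return tacoRepo.findAll().take(12);
}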

The changes required for OrderRepository are similarly straightforward. Rather than extend CrudRepository, it will now extend ReactiveCassandraRepository:

public interface OrderRepository
         extends ReactiveCassandraRepository<Order, UUID> {
}

Finally, let’s look at UserRepository. As you’ll recall, UserRepository has a custom query method, findByUsername(). This method adds a little twist to how you must define the repository for Cassandra persistence. Here’s what a Cassandra-ready UserRepository interface looks like:

public interface UserRepository
         extends ReactiveCassandraRepository<User, UUID> {

  @AllowFiltering
  Mono<User> findByUsername(String username);

}

Following suit with all of the other repository interfaces (except IngredientRepository), UserRepository extends ReactiveCassandraRepository. No surprises so far. But its findByUsername() method demands a little bit of extra attention.

First, because this is intended to be a reactive repository, a findByUsername() method that simply returns a User object won’t do. You redefine it to return a Mono<User>. Generally speaking, any custom query methods you write in a reactive repository should return either a Mono (if there will be no more than one value returned) or a Flux (if there could be many values returned).

Also, the nature of Cassandra is such that you can’t simply query a table with a where clause on an arbitrary column, like you might do in SQL against a relational database. Cassandra is optimized for reading, but filtering results with a where clause could potentially slow down an otherwise fast query. Even so, querying a table where the results are filtered by one or more columns is very useful. Therefore, the @AllowFiltering annotation makes it possible to filter the results, acting as an opt-in for those cases where it’s needed.

In the case of findByUsername(), you’d expect a CQL query that looks like this:

select * from users where username='some username';

Again, that isn’t allowed by Cassandra. But when the @AllowFiltering annotation is placed on findByUsername(), the resulting CQL query looks like this:

select * from users where username='some username' allow filtering;

The allow filtering clause at the end of the query alerts Cassandra that you’re aware of the potential impacts to the query’s performance and that you need it anyway. In that case, Cassandra will allow the where clause and filter the results accordingly.

There’s a lot of power in Cassandra, and when it’s teamed up with Spring Data and Reactor, you can wield that power in your Spring applications. But let’s shift our attention to another database for which reactive repository support is available: MongoDB.

12.3. Writing reactive MongoDB repositories

MongoDB is another well-known NoSQL database. Whereas Cassandra is a row-store database, MongoDB is considered a document database. More specifically, MongoDB stores documents in BSON (Binary JSON) format, which can be queried for and retrieved in a way that’s roughly similar to how you might query for data in any other database.

As with Cassandra, it’s important to understand that MongoDB isn’t a relational database. The way you manage your MongoDB server cluster, as well as how you model your data, requires a different mindset than when working with other kinds of databases.

That said, working with MongoDB and Spring Data isn’t dramatically different from how you might use Spring Data for working with JPA or Cassandra. You’ll annotate your domain classes with annotations that map the domain type to a document structure. And you’ll write repository interfaces that very much follow the same programming model as those you’ve seen for JPA and Cassandra. Before you can do any of that, though, you must enable Spring Data MongoDB in your project.

12.3.1. Enabling Spring Data MongoDB

To get started with Spring Data MongoDB, you’ll need to add the Spring Data MongoDB starter to the project build. Spring Data MongoDB has two separate starters to choose from.

If you’re working with non-reactive MongoDB, you’ll add the following dependency to the build:

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>
    spring-boot-starter-data-mongodb
  </artifactId>
</dependency>

This dependency is also available from the Spring Initializr by checking the MongoDB check box. But this chapter is all about writing reactive repositories, so you’ll choose the reactive Spring Data MongoDB starter dependency instead:

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>
    spring-boot-starter-data-mongodb-reactive
  </artifactId>
</dependency>

The reactive Spring Data MongoDB starter can also be added to the build by checking the Reactive MongoDB check box from the Initializr. By adding the starter to the build, autoconfiguration will be triggered to enable Spring Data support for automatically implementing repository interfaces, such as those you wrote for JPA in chapter 3 or for Cassandra earlier in this chapter.

By default, Spring Data MongoDB assumes that you have a MongoDB server running locally and listening on port 27017. But for convenience in development and testing, you can choose to work with an embedded Mongo database instead. To do that, add the Flapdoodle Embedded MongoDB dependency to your build:

<dependency>
  <groupId>de.flapdoodle.embed</groupId>
  <artifactId>de.flapdoodle.embed.mongo</artifactId>
</dependency>

The Flapdoodle embedded database affords you all of the same convenience of working with an in-memory Mongo database as you’d get with H2 when working with relational data. That is, you won’t need to have a separate database running, but all data will be wiped clean when you restart the application.

Embedded databases are fine for development and testing, but once you take your application to production, you’ll want to be sure you set a few properties to let Spring Data MongoDB know where and how your production Mongo database can be accessed:

spring:
  data:
    mongodb:
      host: mongodb.tacocloud.com
      port: 27018
      username: tacocloud
      password: s3cr3tp455w0rd
      database: tacoclouddb

Not all of these properties are required, but they’re available to help point Spring Data MongoDB in the right direction in the event that your Mongo database isn’t running locally. Breaking it down, here’s what each property configures:

  • spring.data.mongodb.host: The hostname where Mongo is running (default: localhost)
  • spring.data.mongodb.port: The port that the Mongo server is listening on (default: 27017)
  • spring.data.mongodb.username: The username to use to access a secured Mongo database
  • spring.data.mongodb.password: The password to use to access a secured Mongo database
  • spring.data.mongodb.database: The database name (default: test)

Now that you have Spring Data MongoDB enabled in your project, you need to annotate your domain objects for persistence as documents in MongoDB.

12.3.2. Mapping domain types to documents

Spring Data MongoDB offers a handful of annotations that are useful for mapping domain types to document structures to be persisted in MongoDB. Although Spring Data MongoDB provides a half dozen annotations for mapping, only three of them are useful for most common use cases:

  • @Id: Designates a property as the document ID (from Spring Data Commons)
  • @Document: Declares a domain type as a document to be persisted to MongoDB
  • @Field: Specifies the field name (and optionally the order) for storing a property in the persisted document

Of those three annotations, only the @Id and @Document annotations are strictly required. Unless you specify otherwise, properties that aren’t annotated with @Field will assume a field name equal to the property name.

Applying these annotations to the Ingredient class, you get the following:

package tacos;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;
import lombok.AccessLevel;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.RequiredArgsConstructor;

@Data
@RequiredArgsConstructor
@NoArgsConstructor(access=AccessLevel.PRIVATE, force=true)
@Document
public class Ingredient {

  @Id
  private final String id;
  private final String name;
  private final Type type;

  public static enum Type {
    WRAP, PROTEIN, VEGGIES, CHEESE, SAUCE
  }

}

As you can see, you place the @Document annotation at the class level to indicate that Ingredient is a document entity that can be written to and read from a Mongo database. By default, the collection name (the Mongo analog to a relational database table) is based on the class name, with the first letter lowercased. Because you haven’t specified otherwise, Ingredient objects will be persisted to a collection named ingredient. But you can change that by setting the collection attribute of @Document:

@Data
@RequiredArgsConstructor
@NoArgsConstructor(access=AccessLevel.PRIVATE, force=true)
@Document(collection="ingredients")
public class Ingredient {
...
}

You’ll also notice that the id property has been annotated with @Id. This designates the property as being the ID of the persisted document. You can use @Id on any property whose type is Serializable, including String and Long. In this case, you’re already using the String-defined id property as a natural identifier, so there’s no need to change it to any other type.

So far, so good. But you’ll recall from earlier in this chapter that Ingredient was the easy domain type to map for Cassandra. The other domain types, such as Taco, were a bit more challenging. Let’s look at how you can map the Taco class to see what surprises it might hold.

As with any domain-to-document mapping for MongoDB, you’ll certainly need to annotate Taco with @Document. And you’ll also need to designate an ID property with @Id. Doing so yields the following Taco class annotated for MongoDB persistence:

@Data
@RestResource(rel="tacos", path="tacos")
@Document
public class Taco {

  @Id
  private String id;

  @NotNull
  @Size(min=5, message="Name must be at least 5 characters long")
  private String name;

  private Date createdAt = new Date();

  @Size(min=1, message="You must choose at least 1 ingredient")
  private List<Ingredient> ingredients;

}

Believe it or not, that’s it! The challenges of dealing with two different primary key fields and referencing user-defined types were specific to Cassandra. For MongoDB, the Taco mapping is much simpler.

Even so, there are a few interesting things to point out in Taco. First, notice that the id property has been changed to be a String (as opposed to a Long in the JPA version or a UUID in the Cassandra version). As I said earlier, @Id can be applied to any Serializable type. But if you choose to use a String property as the ID, you get the benefit of Mongo automatically assigning a value to it when it’s saved (assuming that it’s null). By choosing String, you get a database-managed ID assignment and needn’t worry about setting that property manually.
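
A quick sketch illustrates the point, assuming a reactive TacoRepository like the one you’ll write in section 12.3.3. The id starts out null, and the Taco that Mongo returns upon saving carries the assigned value:

Taco taco = new Taco();              // id is null at this point
taco.setName("Veggie Supreme");
tacoRepo.save(taco)
        .subscribe(saved -> System.out.println("Assigned ID: " + saved.getId()));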

Also, take a look at the ingredients property. Notice that it’s a List<Ingredient>, just like it was in the JPA version from chapter 3. But unlike the JPA version, the list isn’t stored in a separate MongoDB collection. Much like its Cassandra counterpart, the list of ingredients is stored directly, denormalized, in the taco document. But unlike the Cassandra implementation, you don’t need to make up a user-defined type—MongoDB is happy to use any type here, whether it’s another @Document-annotated type or just a POJO.

It certainly is a relief to see that mapping Taco for document persistence is easy. Will that ease of mapping carry over to the Order domain class? Take a look at the following MongoDB-annotated Order class to see for yourself:

@Data
@Document
public class Order implements Serializable {

  private static final long serialVersionUID = 1L;

  @Id
  private String id;

  private Date placedAt = new Date();

  @Field("customer")
  private User user;

  // other properties omitted for brevity's sake

  private List<Taco> tacos = new ArrayList<>();

  public void addDesign(Taco design) {
    this.tacos.add(design);
  }

}

For brevity’s sake, I’ve snipped out the various delivery and credit card fields. But from what’s left, it’s clear that all you need is @Document and @Id, as with the other domain types. Even so, you annotate the user property with @Field to specify that it be stored as customer in the persisted document.

By now, it shouldn’t be surprising that mapping the User domain class for MongoDB persistence should be just as easy:

@Data
@NoArgsConstructor(access=AccessLevel.PRIVATE, force=true)
@RequiredArgsConstructor
@Document
public class User implements UserDetails {

  private static final long serialVersionUID = 1L;

  @Id
  private String id;

  private final String username;

  private final String password;
  private final String fullname;
  private final String street;
  private final String city;
  private final String state;
  private final String zip;
  private final String phoneNumber;

  // UserDetails methods omitted for brevity's sake

}

Although there are some more-advanced and unusual use cases that require additional mapping, you’ll find that for most cases, @Document and @Id, along with an occasional @Field, are sufficient for MongoDB mapping. They certainly do the job for the Taco Cloud domain types.

All that’s left is to write the repository interfaces.

12.3.3. Writing reactive MongoDB repository interfaces

Spring Data MongoDB offers automatic repository support similar to what’s provided by Spring Data JPA and Spring Data Cassandra. When it comes to writing reactive repositories for MongoDB, you have a choice between ReactiveCrudRepository and ReactiveMongoRepository. The key difference is that ReactiveMongoRepository provides a handful of special insert() methods that are optimized for persisting new documents, whereas ReactiveCrudRepository relies on save() methods for new and existing documents.

What about non-reactive MongoDB repositories?

The focus of this chapter is on writing reactive repositories with Spring Data. But if for some reason you wish to work with non-reactive repositories, you can do so by simply having your repository interfaces extend CrudRepository or MongoRepository instead of ReactiveCrudRepository or ReactiveMongoRepository. Then you can have the repository methods return Mongo-annotated domain types and collections of those domain types.

It’s not strictly required that you do so, but you can also choose to change the spring-boot-starter-data-mongodb-reactive dependency to spring-boot-starter-data-mongodb.

You’ll start by defining a repository for persisting Ingredient objects as documents. You won’t be creating ingredient documents frequently, or at all, after the database is initialized. Therefore, the optimizations offered by ReactiveMongoRepository won’t be as helpful. You can write IngredientRepository to extend ReactiveCrudRepository:

package tacos.data;
import org.springframework.data.repository.reactive.ReactiveCrudRepository;
import org.springframework.web.bind.annotation.CrossOrigin;
import tacos.Ingredient;

@CrossOrigin(origins="*")
public interface IngredientRepository
         extends ReactiveCrudRepository<Ingredient, String> {
}

Wait a minute! That looks identical to the IngredientRepository interface you wrote in section 12.2.4 for Cassandra! Indeed, it’s the same interface, with no changes. This highlights one of the benefits of extending ReactiveCrudRepository—it’s more portable across various database types and works equally well for MongoDB as for Cassandra.

Because it’s a reactive repository, its methods deal in terms of Flux and Mono rather than raw domain types and collections of those domain types. The findAll() method, for instance, will return Flux<Ingredient> instead of Iterable<Ingredient>. Likewise, findById() will return Mono<Ingredient> instead of Optional<Ingredient>. As a result, this reactive repository could be part of an end-to-end reactive flow.
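
For instance, a WebFlux handler method like those from chapter 11 can return the Mono directly. A minimal sketch of such a method in IngredientController:

@GetMapping("/{id}")
public Mono<Ingredient> ingredientById(@PathVariable("id") String id) {
  return repo.findById(id);
}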

Now let’s try defining a repository for persisting Taco objects as documents in MongoDB. Unlike ingredient documents, you’ll be creating taco documents rather frequently. Thus, the optimized insert() methods from ReactiveMongoRepository might prove valuable. Here’s your new MongoDB-ready TacoRepository interface:

package tacos.data;
import org.springframework.data.mongodb.repository.ReactiveMongoRepository;
import reactor.core.publisher.Flux;
import tacos.Taco;

public interface TacoRepository
         extends ReactiveMongoRepository<Taco, String> {

  Flux<Taco> findByOrderByCreatedAtDesc();

}

The only drawback of using ReactiveMongoRepository as opposed to ReactiveCrudRepository is that it’s very specific to MongoDB and not portable to other databases. In your projects, you’ll need to decide if that trade-off is worth it or not. If you don’t anticipate switching to a different database at some point, it’s safe enough to choose ReactiveMongoRepository and benefit from the insertion optimizations.

Notice that you introduce a new method in TacoRepository. This method is to support the use case of presenting a list of recently created tacos. In the JPA version of this repository, you achieved that by extending PagingAndSortingRepository. But PagingAndSortingRepository doesn’t make much sense (especially the paging part of it) in a reactive repository. In the Cassandra version, sorting was determined by the clustering key in the table definition, so you didn’t have anything special in the repository to support fetching recent taco creations.

But for MongoDB, you’d like to be able to fetch the most recently created tacos. Despite its odd name, the findByOrderByCreatedAtDesc() method follows the custom query method-naming convention. It says that you’re finding a Taco object by, well, by nothing. You don’t specify any properties that must match. Then you tell it to order the results by the createdAt property in descending order.

The reason to name it with an empty By clause is to avoid a misinterpretation of the method name, given that there’s another By in the method name. Had you named it findAllOrderByCreatedAtDesc(), the AllOrder portion of the name would’ve been ignored, and Spring Data would try to find tacos by matching against a createdAtDesc property. Because no such property exists, the application would fail to start, with an error.

Because findByOrderByCreatedAtDesc() returns a Flux<Taco>, you needn’t worry about paging. Instead, you can simply apply the take() operation to take only the first dozen Taco objects published in the Flux returned. For example, your controller that displays the recently created tacos could make a call to findByOrderByCreatedAtDesc() like this:

Flux<Taco> recents = repo.findByOrderByCreatedAtDesc()
                         .take(12);

The resulting Flux would only ever have, at most, 12 Taco items published.

Moving on to the OrderRepository interface, you can see that it’s straightforward:

package tacos.data;
import org.springframework.data.mongodb.repository.ReactiveMongoRepository;
import reactor.core.publisher.Flux;
import tacos.Order;

public interface OrderRepository
         extends ReactiveMongoRepository<Order, String> {

}

You’ll be frequently creating Order documents, so OrderRepository extends ReactiveMongoRepository to gain the optimizations afforded in its insert() methods. Otherwise, there’s nothing terribly special about this repository, compared to some of the other repositories you’ve defined thus far.

Finally, let’s take a look at the repository that will persist User objects as documents:

package tacos.data;
import org.springframework.data.mongodb.repository.ReactiveMongoRepository;
import reactor.core.publisher.Mono;
import tacos.User;

public interface UserRepository
         extends ReactiveMongoRepository<User, String> {

  Mono<User> findByUsername(String username);

}

By now, there should be nothing terribly surprising about this repository interface. Like the others, it extends ReactiveMongoRepository (although it could have also extended ReactiveCrudRepository). The only thing unique is the addition of a findByUsername() method, which you added in chapter 4 to support authentication against this repository. Here, it’s been tweaked to return a Mono<User> instead of a raw User object.
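
As a sketch of how that tweak pays off (one possibility, not a requirement of this chapter), Spring Security’s reactive support defines a ReactiveUserDetailsService whose single findByUsername() method returns a Mono&lt;UserDetails>. Because User implements UserDetails, a bean can delegate straight to the repository:

@Bean
public ReactiveUserDetailsService userDetailsService(UserRepository userRepo) {
  return username -> userRepo.findByUsername(username)
                             .cast(UserDetails.class);  // User implements UserDetails
}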

Summary

  • Spring Data supports reactive repositories for Cassandra, MongoDB, Couchbase, and Redis databases.
  • Spring Data’s reactive repositories follow the same programming model as non-reactive repositories, except that they deal in terms of reactive publishers such as Flux and Mono.
  • Non-reactive repositories (such as JPA repositories) can be adapted to work with Mono and Flux, but they ultimately still block while data is saved and fetched.
  • Working with nonrelational databases demands an understanding of how to model data appropriately for how the database ultimately stores the data.