They say that variety is the spice of life.
You probably have a favorite flavor of ice cream. It’s that one flavor that you choose the most often because it satisfies that creamy craving more than any other. But most people, despite having a favorite flavor, try different flavors from time to time to mix things up.
Databases are like ice cream. For decades, the relational database has been the favorite flavor for storing data. But these days, we have more options available than ever before. So-called “NoSQL” databases (https://aws.amazon.com/nosql/) offer different concepts and structures in which data can be stored. And although the choice may still be somewhat based on taste, some databases are better suited for persisting different kinds of data than others.
Fortunately, Spring Data has you covered for many of the NoSQL databases, including MongoDB, Cassandra, Couchbase, Neo4j, Redis, and many more. And fortunately, the programming model is nearly identical, regardless of which database you choose.
There’s not enough space in this chapter to cover all of the databases that Spring Data supports. But to give you a sample of Spring Data’s other “flavors,” we’ll look at two popular NoSQL databases, Cassandra and MongoDB, and see how to create repositories to persist data to them. Let’s start by looking at how to create Cassandra repositories with Spring Data.
Cassandra is a distributed, high-performance, always available, eventually consistent, partitioned-column-store, NoSQL database.
That’s a mouthful of adjectives to describe a database, but each one accurately speaks to the power of working with Cassandra. To put it in simpler terms, Cassandra deals in rows of data written to tables, which are partitioned across one or more distributed nodes. No single node carries all the data, but any given row may be replicated across multiple nodes, thus eliminating any single point of failure.
Spring Data Cassandra provides automatic repository support for the Cassandra database that’s quite similar to—and yet quite different from—what’s offered by Spring Data JPA for relational databases. In addition, Spring Data Cassandra offers annotations for mapping application domain types to the backing database structures.
Before we explore Cassandra any further, it’s important to understand that although Cassandra shares many concepts similar to relational databases like Oracle and SQL Server, Cassandra isn’t a relational database and is in many ways quite a different beast. I’ll explain the idiosyncrasies of Cassandra as they pertain to working with Spring Data. But I encourage you to read Cassandra’s own documentation (http://cassandra.apache.org/doc/latest/) for a thorough understanding of what makes it tick.
Let’s get started by enabling Spring Data Cassandra in the Taco Cloud project.
To get started using Spring Data Cassandra, you’ll need to add the Spring Boot starter dependency for nonreactive Spring Data Cassandra. There are actually two separate Spring Data Cassandra starter dependencies to choose from: one for reactive data persistence and one for standard, nonreactive persistence.
We’ll talk more about writing reactive repositories later in chapter 15. For now, though, we’ll use the nonreactive starter in our build as shown here:
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-cassandra</artifactId>
</dependency>
This dependency is also available from the Initializr by checking the Cassandra check box.
It’s important to understand that this dependency is in lieu of the Spring Data JPA starter or Spring Data JDBC dependencies we used in the previous chapter. Instead of persisting Taco Cloud data to a relational database with JPA or JDBC, you’ll be using Spring Data to persist data to a Cassandra database. Therefore, you’ll want to remove the Spring Data JPA or Spring Data JDBC starter dependencies and any relational database dependencies (such as JDBC drivers or the H2 dependency) from the build.
The Spring Data Cassandra starter dependency brings a handful of dependencies to the project, specifically, the Spring Data Cassandra library. As a result of Spring Data Cassandra being in the runtime classpath, autoconfiguration for creating Cassandra repositories is triggered. This means you’re able to begin writing Cassandra repositories with minimal explicit configuration.
Cassandra operates as a cluster of nodes that together act as a complete database system. If you don’t already have a Cassandra cluster to work with, you can start a single-node cluster for development purposes using Docker like this:
$ docker network create cassandra-net
$ docker run --name my-cassandra --network cassandra-net -p 9042:9042 -d cassandra:latest
This starts the single-node cluster and exposes the node’s port (9042) on the host machine so that your application can access it.
You’ll need to provide a small amount of configuration, though. At the very least, you’ll need to configure the name of a keyspace within which your repositories will operate. To do that, you’ll first need to create such a keyspace.
Note In Cassandra, a keyspace is a grouping of tables in a Cassandra node. It’s roughly analogous to a schema in a relational database, which groups tables, views, and constraints.
Although it’s possible to configure Spring Data Cassandra to create the keyspace automatically, it’s typically much easier to manually create it yourself (or to use an existing keyspace). Using the Cassandra CQL (Cassandra Query Language) shell, you can create a keyspace for the Taco Cloud application. You can start the CQL shell using Docker like this:

$ docker run -it --network cassandra-net --rm cassandra:latest cqlsh my-cassandra
Note If this command fails to start up the CQL shell with an error indicating “Unable to connect to any servers,” wait a minute or two and try again. You need to be sure that the Cassandra cluster is fully started before the CQL shell can connect to it.
When the shell is ready, use the create keyspace command like this:
cqlsh> create keyspace tacocloud
   ... with replication={'class':'SimpleStrategy', 'replication_factor':1}
   ... and durable_writes=true;
Put simply, this will create a keyspace named tacocloud with simple replication and durable writes. By setting the replication factor to 1, you ask Cassandra to keep one copy of each row. The replication strategy determines how replication is handled. The SimpleStrategy replication strategy is fine for single data center use (and for demo code), but you might consider the NetworkTopologyStrategy if you have your Cassandra cluster spread across multiple data centers. I refer you to the Cassandra documentation for more details of how replication strategies work and alternative ways of creating keyspaces.
Now that you’ve created a keyspace, you need to configure the spring.data.cassandra.keyspace-name property to tell Spring Data Cassandra to use that keyspace, as shown next:
spring:
  data:
    cassandra:
      keyspace-name: tacocloud
      schema-action: recreate
      local-datacenter: datacenter1
Here, you also set the spring.data.cassandra.schema-action property to recreate. This setting is very useful for development purposes because it ensures that any tables and user-defined types will be dropped and recreated every time the application starts. The default value, none, takes no action against the schema and is useful in production settings where you’d rather not drop all tables whenever an application starts up.
Finally, the spring.data.cassandra.local-datacenter property identifies the name of the local data center for purposes of setting Cassandra’s load-balancing policy. In a single-node setup, "datacenter1" is the value to use. For more information on Cassandra load-balancing policies and how to set the local data center, see the DataStax Cassandra driver’s reference documentation (http://mng.bz/XrQM).
These are the only properties you’ll need for working with a locally running Cassandra database. In addition to these two properties, however, you may wish to set others, depending on how you’ve configured your Cassandra cluster.
By default, Spring Data Cassandra assumes that Cassandra is running locally and listening on port 9042. If that’s not the case, as in a production setting, you may want to set the spring.data.cassandra.contact-points and spring.data.cassandra.port properties as follows:
spring:
  data:
    cassandra:
      keyspace-name: tacocloud
      local-datacenter: datacenter1
      contact-points:
      - casshost-1.tacocloud.com
      - casshost-2.tacocloud.com
      - casshost-3.tacocloud.com
      port: 9043
Notice that the spring.data.cassandra.contact-points property is where you identify the hostname(s) of Cassandra. A contact point is the host where a Cassandra node is running. By default, it’s set to localhost, but you can set it to a list of hostnames. Spring Data Cassandra will try each contact point until it’s able to connect to one. This ensures that there’s no single point of failure in the Cassandra cluster and that the application will be able to connect with the cluster through one of the given contact points.
You may also need to specify a username and password for your Cassandra cluster. This can be done by setting the spring.data.cassandra.username and spring.data.cassandra.password properties, as shown next:

spring:
  data:
    cassandra:
      keyspace-name: tacocloud
      local-datacenter: datacenter1
      username: tacocloud
      password: s3cr3tP455w0rd

Here the spring.data.cassandra.username and spring.data.cassandra.password properties specify “tacocloud” and “s3cr3tP455w0rd” as the credentials needed to access the Cassandra cluster.
Now that Spring Data Cassandra is enabled and configured in your project, you’re almost ready to map your domain types to Cassandra tables and write repositories. But first, let’s step back and consider a few basic points of Cassandra data modeling.
As I mentioned, Cassandra is quite different from a relational database. Before you can start mapping your domain types to Cassandra tables, it’s important to understand a few of the ways that Cassandra data modeling is different from how you might model your data for persistence in a relational database.
A few of the most important things to understand about Cassandra data modeling follow:
Cassandra tables may have any number of columns, but not all rows will necessarily use all of those columns.
Cassandra databases are split across multiple partitions. Any row in a given table may be managed by one or more partitions, but it’s unlikely that all partitions will have all rows.
A Cassandra table has two kinds of keys: partition keys and clustering keys. Hash operations are performed on each row’s partition key to determine which partition(s) that row will be managed by. Clustering keys determine the order in which the rows are maintained within a partition (not necessarily the order in which they may appear in the results of a query). Refer to Cassandra documentation (http://mng.bz/yJ6E) for a more detailed explanation of data modeling in Cassandra, including partitions, clusters, and their respective keys.
Cassandra is highly optimized for read operations. As such, it’s common and desirable for tables to be highly denormalized and for data to be duplicated across multiple tables. (For example, customer information may be kept in a customer table as well as duplicated in a table containing orders placed by customers.)
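To make the partition-key idea above concrete, here’s a toy sketch of how hashing a row’s key selects a partition. This is an illustration only: Cassandra actually assigns rows by computing a Murmur3 token on a token ring, and the class and method names here are invented.

```java
public class PartitionSketch {

    // Toy stand-in for Cassandra's partitioner: hash the partition key and
    // map it onto one of a fixed number of partitions. (Cassandra really
    // computes a Murmur3 token and assigns it to a node's token range.)
    public static int partitionFor(String partitionKey, int partitionCount) {
        // floorMod keeps the result non-negative even when hashCode() is negative
        return Math.floorMod(partitionKey.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        for (String id : new String[] {"taco-1", "taco-2", "taco-3"}) {
            System.out.println(id + " -> partition " + partitionFor(id, 4));
        }
        // The key property: the same partition key always maps to the same partition
        if (partitionFor("taco-1", 4) != partitionFor("taco-1", 4)) {
            throw new AssertionError("hashing must be deterministic");
        }
    }
}
```

The takeaway is that the partition key alone decides where a row lives, which is why queries that supply the partition key are fast, whereas queries that don’t must touch every partition.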
Suffice it to say that adapting the Taco Cloud domain types to work with Cassandra won’t be a matter of simply swapping out a few JPA annotations for Cassandra annotations. You’ll have to rethink how you model the data.
In chapter 3, you marked up your domain types (Taco, Ingredient, TacoOrder, and so on) with annotations provided by the JPA specification. These annotations mapped your domain types as entities to be persisted to a relational database. Although those annotations won’t work for Cassandra persistence, Spring Data Cassandra provides its own set of mapping annotations for a similar purpose.
Let’s start with the Ingredient class, because it’s the simplest to map for Cassandra. The new Cassandra-ready Ingredient class looks like this:
package tacos;

import org.springframework.data.cassandra.core.mapping.PrimaryKey;
import org.springframework.data.cassandra.core.mapping.Table;

import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.RequiredArgsConstructor;

@Data
@AllArgsConstructor
@NoArgsConstructor(access=AccessLevel.PRIVATE, force=true)
@Table("ingredients")
public class Ingredient {

  @PrimaryKey
  private String id;
  private String name;
  private Type type;

  public enum Type {
    WRAP, PROTEIN, VEGGIES, CHEESE, SAUCE
  }

}
The Ingredient class seems to contradict everything I said about just swapping out a few annotations. Rather than annotating the class with @Entity as you did for JPA persistence, it’s annotated with @Table to indicate that ingredients should be persisted to a table named ingredients. And rather than annotate the id property with @Id, this time it’s annotated with @PrimaryKey. So far, it seems that you’re only swapping out a few annotations.
But don’t let the Ingredient mapping fool you. The Ingredient class is one of your simplest domain types. Things get more interesting when you map the Taco class for Cassandra persistence, as shown in the next listing.
package tacos;

import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.UUID;

import javax.validation.constraints.NotNull;
import javax.validation.constraints.Size;

import org.springframework.data.cassandra.core.cql.Ordering;
import org.springframework.data.cassandra.core.cql.PrimaryKeyType;
import org.springframework.data.cassandra.core.mapping.Column;
import org.springframework.data.cassandra.core.mapping.PrimaryKeyColumn;
import org.springframework.data.cassandra.core.mapping.Table;

import com.datastax.oss.driver.api.core.uuid.Uuids;

import lombok.Data;

@Data
@Table("tacos")                                         ❶
public class Taco {

  @PrimaryKeyColumn(type=PrimaryKeyType.PARTITIONED)    ❷
  private UUID id = Uuids.timeBased();

  @NotNull
  @Size(min = 5, message = "Name must be at least 5 characters long")
  private String name;

  @PrimaryKeyColumn(type=PrimaryKeyType.CLUSTERED,      ❸
                    ordering=Ordering.DESCENDING)
  private Date createdAt = new Date();

  @Size(min=1, message="You must choose at least 1 ingredient")
  @Column("ingredients")                                ❹
  private List<IngredientUDT> ingredients = new ArrayList<>();

  public void addIngredient(Ingredient ingredient) {
    this.ingredients.add(TacoUDRUtils.toIngredientUDT(ingredient));
  }

}
❶ Persists to the "tacos" table
❷ Designates the id property as the partition key
❸ Designates the createdAt property as the clustering key, in descending order
❹ Maps the list to the "ingredients" column
As you can see, mapping the Taco class is a bit more involved. As with Ingredient, the @Table annotation is used to identify tacos as the name of the table that tacos should be written to. But that’s the only thing similar to Ingredient.
The id property is still your primary key, but it’s only one of two primary key columns. More specifically, the id property is annotated with @PrimaryKeyColumn with a type of PrimaryKeyType.PARTITIONED. This specifies that the id property serves as the partition key, used to determine to which Cassandra partition(s) each row of taco data will be written.
You’ll also notice that the id property is now a UUID instead of a Long. Although it’s not required, properties that hold a generated ID value are commonly of type UUID. Moreover, the UUID is initialized with a time-based UUID value for new Taco objects (but which may be overridden when reading an existing Taco from the database).
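To see what “time-based” means here, the following sketch builds an RFC 4122 version-1 UUID by hand using only java.util.UUID, mimicking in spirit what the DataStax driver’s Uuids.timeBased() does. The bit-packing helper is my own invention, not a driver API.

```java
import java.util.UUID;

public class TimeUuidSketch {

    // Packs a 60-bit timestamp into the version-1 UUID layout defined by
    // RFC 4122: time_low (32 bits), time_mid (16), version bits + time_hi (16).
    public static UUID timeBased(long timestamp) {
        long timeLow = timestamp & 0xFFFFFFFFL;
        long timeMid = (timestamp >>> 32) & 0xFFFFL;
        long timeHi  = (timestamp >>> 48) & 0x0FFFL;
        long msb = (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi;
        long lsb = 0x8000000000000000L | 42L;  // IETF variant bits + a fake node id
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        UUID id = timeBased(123_456_789L);
        System.out.println(id + " (version " + id.version() + ")");
        // Because the UUID is time-based, the creation time is recoverable from it:
        if (id.timestamp() != 123_456_789L) {
            throw new AssertionError("timestamp should round-trip");
        }
    }
}
```

The practical upshot is that a time-based ID carries its own creation instant, unlike a random (version-4) UUID, and later IDs sort after earlier ones.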
A little further down, you see the createdAt property that’s mapped as another primary key column. But in this case, the type attribute of @PrimaryKeyColumn is set to PrimaryKeyType.CLUSTERED, which designates the createdAt property as a clustering key. As mentioned earlier, clustering keys are used to determine the ordering of rows within a partition. More specifically, the ordering is set to descending order—therefore, within a given partition, newer rows appear first in the tacos table.
Finally, the ingredients property is now a List of IngredientUDT objects instead of a List of Ingredient objects. As you’ll recall, Cassandra tables are highly denormalized and may contain data that’s duplicated from other tables. Although the ingredients table will serve as the table of record for all available ingredients, the ingredients chosen for a taco will be duplicated in the ingredients column. Rather than simply reference one or more rows in the ingredients table, the ingredients property will contain full data for each chosen ingredient.
But why do you need to introduce a new IngredientUDT class? Why can’t you just reuse the Ingredient class? Put simply, columns that contain collections of data, such as the ingredients column, must be collections of native types (integers, strings, and so on) or user-defined types.
In Cassandra, user-defined types enable you to declare table columns that are richer than simple native types. Often they’re used as a denormalized analog for relational foreign keys. In contrast to foreign keys, which only hold a reference to a row in another table, columns with user-defined types actually carry data that may be copied from a row in another table. In the case of the ingredients column in the tacos table, it will contain a collection of data structures that define the ingredients themselves.
You can’t use the Ingredient class as a user-defined type, because the @Table annotation has already mapped it as an entity for persistence in Cassandra. Therefore, you must create a new class to define how ingredients will be stored in the ingredients column of the tacos table. IngredientUDT (where UDT means user-defined type) is the class for the job, as shown here:
package tacos;

import org.springframework.data.cassandra.core.mapping.UserDefinedType;

import lombok.AccessLevel;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.RequiredArgsConstructor;

@Data
@RequiredArgsConstructor
@NoArgsConstructor(access = AccessLevel.PRIVATE, force = true)
@UserDefinedType("ingredient")
public class IngredientUDT {

  private final String name;

  private final Ingredient.Type type;

}
Although IngredientUDT looks a lot like Ingredient, its mapping requirements are much simpler. It’s annotated with @UserDefinedType to identify it as a user-defined type in Cassandra. But otherwise, it’s a simple class with a few properties.
You’ll also note that the IngredientUDT class doesn’t include an id property. Although it could include a copy of the id property from the source Ingredient, that’s not necessary. In fact, the user-defined type may include any properties you wish—it doesn’t need to be a one-to-one mapping with any table definition.
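The Taco listing earlier calls a TacoUDRUtils.toIngredientUDT() helper that isn’t shown in this chapter. A minimal sketch of what such a converter might look like follows; the nested classes are simplified stand-ins for the real mapped types (no Lombok or Spring annotations), so only the copying logic is meaningful here.

```java
public class TacoUDRUtils {

    // Simplified stand-in for the @Table-mapped Ingredient entity
    public static class Ingredient {
        public enum Type { WRAP, PROTEIN, VEGGIES, CHEESE, SAUCE }
        public final String id;
        public final String name;
        public final Type type;
        public Ingredient(String id, String name, Type type) {
            this.id = id; this.name = name; this.type = type;
        }
    }

    // Simplified stand-in for the @UserDefinedType-mapped IngredientUDT
    public static class IngredientUDT {
        public final String name;
        public final Ingredient.Type type;
        public IngredientUDT(String name, Ingredient.Type type) {
            this.name = name; this.type = type;
        }
    }

    // Copies only the fields the UDT carries; the id is deliberately dropped,
    // since the UDT doesn't need a one-to-one mapping with the source table.
    public static IngredientUDT toIngredientUDT(Ingredient ingredient) {
        return new IngredientUDT(ingredient.name, ingredient.type);
    }

    public static void main(String[] args) {
        Ingredient flto = new Ingredient("FLTO", "Flour Tortilla", Ingredient.Type.WRAP);
        IngredientUDT udt = toIngredientUDT(flto);
        System.out.println(udt.name + " / " + udt.type);  // Flour Tortilla / WRAP
    }
}
```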
I realize that it might be difficult to visualize how data in a user-defined type relates to data that’s persisted to a table. Figure 4.1 shows the data model for the entire Taco Cloud database, including user-defined types.
Specific to the user-defined type that you just created, notice how Taco has a list of IngredientUDT objects, which holds data copied from Ingredient objects. When a Taco is persisted, it’s the Taco object and the list of IngredientUDT objects that’s persisted to the tacos table. The list of IngredientUDT objects is persisted entirely within the ingredients column.
Another way of looking at this that might help you understand how user-defined types are used is to query the database for rows from the tacos table. Using CQL and the cqlsh tool that comes with Cassandra, you see the following results:
cqlsh:tacocloud> select id, name, createdAt, ingredients from tacos;

 id       | name      | createdat | ingredients
----------+-----------+-----------+----------------------------------------
 827390...| Carnivore | 2018-04...| [{name: 'Flour Tortilla', type: 'WRAP'},
                                     {name: 'Carnitas', type: 'PROTEIN'},
                                     {name: 'Sour Cream', type: 'SAUCE'},
                                     {name: 'Salsa', type: 'SAUCE'},
                                     {name: 'Cheddar', type: 'CHEESE'}]

(1 rows)
As you can see, the id, name, and createdat columns contain simple values. In that regard, they aren’t much different than what you’d expect from a similar query against a relational database. But the ingredients column is a little different. Because it’s defined as containing a collection of the user-defined ingredient type (defined by IngredientUDT), its value appears as a JSON array filled with JSON objects.
You likely noticed other user-defined types in figure 4.1. You’ll certainly be creating some more as you continue mapping your domain to Cassandra tables, including some that will be used by the TacoOrder class. The next listing shows the TacoOrder class, modified for Cassandra persistence.
package tacos;

import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.UUID;

import javax.validation.constraints.Digits;
import javax.validation.constraints.NotBlank;
import javax.validation.constraints.Pattern;

import org.hibernate.validator.constraints.CreditCardNumber;
import org.springframework.data.cassandra.core.mapping.Column;
import org.springframework.data.cassandra.core.mapping.PrimaryKey;
import org.springframework.data.cassandra.core.mapping.Table;

import com.datastax.oss.driver.api.core.uuid.Uuids;

import lombok.Data;

@Data
@Table("orders")                                    ❶
public class TacoOrder implements Serializable {

  private static final long serialVersionUID = 1L;

  @PrimaryKey                                       ❷
  private UUID id = Uuids.timeBased();

  private Date placedAt = new Date();

  // delivery and credit card properties omitted for brevity's sake

  @Column("tacos")                                  ❸
  private List<TacoUDT> tacos = new ArrayList<>();

  public void addTaco(TacoUDT taco) {
    tacos.add(taco);
  }

}
❶ Maps to the orders table
❷ Designates the id property as the primary key
❸ Maps a list to the tacos column
Listing 4.2 purposefully omits many of the properties of TacoOrder that don’t lend themselves to a discussion of Cassandra data modeling. What’s left are a few properties and mappings, similar to how Taco was defined. @Table is used to map TacoOrder to the orders table, much as @Table has been used before. In this case, you’re unconcerned with ordering, so the id property is simply annotated with @PrimaryKey, designating it as both a partition key and a clustering key with default ordering.
The tacos property is of some interest in that it’s a List<TacoUDT> instead of a list of Taco objects. The relationship between TacoOrder and Taco/TacoUDT here is similar to the relationship between Taco and Ingredient/IngredientUDT. That is, rather than joining data from several rows in a separate table through foreign keys, the orders table will contain all of the pertinent taco data, optimizing the table for quick reads.
The TacoUDT class is quite similar to the IngredientUDT class, although it does include a collection that references another user-defined type, as follows:
package tacos;

import java.util.List;

import org.springframework.data.cassandra.core.mapping.UserDefinedType;

import lombok.Data;

@Data
@UserDefinedType("taco")
public class TacoUDT {

  private final String name;
  private final List<IngredientUDT> ingredients;

}
Although it would have been nice to reuse the same domain classes you created in chapter 3, or at most to swap out some JPA annotations for Cassandra annotations, the nature of Cassandra persistence is such that it requires you to rethink how your data is modeled. But now that you’ve mapped your domain, you’re ready to write repositories.
As you saw in chapter 3, writing a repository with Spring Data involves simply declaring an interface that extends one of Spring Data’s base repository interfaces and optionally declaring additional query methods for custom queries. As it turns out, writing Cassandra repositories isn’t much different.
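To picture what that programming model gives you, here’s a self-contained toy: a Crud-style interface plus an in-memory implementation standing in for the proxy that Spring Data generates at runtime. The interface and class names are invented for illustration; Spring Data’s generated implementation translates the analogous calls into CQL against Cassandra.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class RepoSketch {

    // A trimmed-down analog of Spring Data's CrudRepository contract
    interface SimpleCrud<T, ID> {
        T save(ID id, T entity);
        Optional<T> findById(ID id);
        long count();
    }

    // Stand-in for the implementation Spring Data generates behind the interface
    static class InMemoryCrud<T, ID> implements SimpleCrud<T, ID> {
        private final Map<ID, T> store = new HashMap<>();
        public T save(ID id, T entity) { store.put(id, entity); return entity; }
        public Optional<T> findById(ID id) { return Optional.ofNullable(store.get(id)); }
        public long count() { return store.size(); }
    }

    public static long demo() {
        SimpleCrud<String, String> ingredients = new InMemoryCrud<>();
        ingredients.save("FLTO", "Flour Tortilla");
        ingredients.save("CARN", "Carnitas");
        System.out.println(ingredients.findById("FLTO").orElse("missing"));  // Flour Tortilla
        return ingredients.count();
    }

    public static void main(String[] args) {
        if (demo() != 2) throw new AssertionError();
    }
}
```

The point of the design is that your application code only ever sees the interface; swapping the backing store (JPA, Cassandra, MongoDB) doesn’t change the calling code.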
In fact, there’s very little that you’ll need to change in the repositories we’ve already written to make them work for Cassandra persistence. For example, consider the following IngredientRepository we created in chapter 3:
package tacos.data;

import org.springframework.data.repository.CrudRepository;

import tacos.Ingredient;

public interface IngredientRepository
         extends CrudRepository<Ingredient, String> {

}
By extending CrudRepository as shown here, IngredientRepository is ready to persist Ingredient objects whose ID property (or, in the case of Cassandra, the primary key property) is a String. That’s perfect! No changes are needed for IngredientRepository.
The changes required for OrderRepository are only slightly more involved. Instead of a Long parameter, the ID parameter type specified when extending CrudRepository will be changed to UUID as follows:
package tacos.data;

import java.util.UUID;

import org.springframework.data.repository.CrudRepository;

import tacos.TacoOrder;

public interface OrderRepository
         extends CrudRepository<TacoOrder, UUID> {

}
There’s a lot of power in Cassandra, and when it’s teamed up with Spring Data, you can wield that power in your Spring applications. But let’s shift our attention to another database for which Spring Data repository support is available: MongoDB.
MongoDB is another well-known NoSQL database. Whereas Cassandra is a column-store database, MongoDB is considered a document database. More specifically, MongoDB stores documents in BSON (Binary JSON) format, which can be queried for and retrieved in a way that’s roughly similar to how you might query for data in any other database.
As with Cassandra, it’s important to understand that MongoDB isn’t a relational database. The way you manage your MongoDB server cluster, as well as how you model your data, requires a different mindset than when working with other kinds of databases.
That said, working with MongoDB and Spring Data isn’t dramatically different from how you might use Spring Data for working with JPA or Cassandra. You’ll annotate your domain classes with annotations that map the domain type to a document structure. And you’ll write repository interfaces that very much follow the same programming model as those you’ve seen for JPA and Cassandra. Before you can do any of that, though, you must enable Spring Data MongoDB in your project.
To get started with Spring Data MongoDB, you’ll need to add the Spring Data MongoDB starter to the project build. As with Spring Data Cassandra, Spring Data MongoDB has two separate starters to choose from: one reactive and one nonreactive. We’ll look at the reactive options for persistence in chapter 13. For now, add the following dependency to the build to work with the nonreactive MongoDB starter:
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-mongodb</artifactId>
</dependency>
This dependency is also available from the Spring Initializr by checking the MongoDB check box under NoSQL.
By adding the starter to the build, autoconfiguration will be triggered to enable Spring Data support for writing automatic repository interfaces, such as those you wrote for JPA in chapter 3 or for Cassandra earlier in this chapter.
By default, Spring Data MongoDB assumes that you have a MongoDB server running locally and listening on port 27017. If you have Docker installed on your machine, an easy way to get a MongoDB server running is with the following command line:

$ docker run -p 27017:27017 -d mongo:latest
But for convenience in testing or development, you can choose to work with an embedded Mongo database instead. To do that, add the following Flapdoodle embedded MongoDB dependency to your build:
<dependency>
  <groupId>de.flapdoodle.embed</groupId>
  <artifactId>de.flapdoodle.embed.mongo</artifactId>
  <!-- <scope>test</scope> -->
</dependency>
The Flapdoodle embedded database affords you all of the same convenience of working with an in-memory Mongo database as you’d get with H2 when working with relational data. That is, you won’t need to have a separate database running, but all data will be wiped clean when you restart the application.
Embedded databases are fine for development and testing, but once you take your application to production, you’ll want to be sure you set a few properties to let Spring Data MongoDB know where and how your production Mongo database can be accessed, as shown next:
spring:
  data:
    mongodb:
      host: mongodb.tacocloud.com
      port: 27017
      username: tacocloud
      password: s3cr3tp455w0rd
      database: tacoclouddb
Not all of these properties are required, but they’re available to help point Spring Data MongoDB in the right direction in the event that your Mongo database isn’t running locally. Breaking it down, here’s what each property configures:
spring.data.mongodb.host—The hostname where Mongo is running (default: localhost)
spring.data.mongodb.port—The port that the Mongo server is listening on (default: 27017)
spring.data.mongodb.username—The username for accessing a secured Mongo database
spring.data.mongodb.password—The password for accessing a secured Mongo database
spring.data.mongodb.database—The database name (default: test)
Now that you have Spring Data MongoDB enabled in your project, you need to annotate your domain objects for persistence as documents in MongoDB.
Spring Data MongoDB offers a handful of annotations that are useful for mapping domain types to document structures to be persisted in MongoDB. Although Spring Data MongoDB provides a half-dozen annotations for mapping, only the following four are useful for most common use cases:
@Id—Designates a property as the document ID (from Spring Data Commons)
@Document—Declares a domain type as a document to be persisted to MongoDB
@Field—Specifies the field name (and, optionally, the order) for storing a property in the persisted document
@Transient—Specifies that a property is not to be persisted
Of those four annotations, only the @Id and @Document annotations are strictly required. Unless you specify otherwise, properties that aren’t annotated with @Field or @Transient will assume a field name equal to the property name.
Applying these annotations to the Ingredient class, you get the following:
package tacos;

import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data
@Document
@AllArgsConstructor
@NoArgsConstructor(access=AccessLevel.PRIVATE, force=true)
public class Ingredient {

  @Id
  private String id;
  private String name;
  private Type type;

  public enum Type {
    WRAP, PROTEIN, VEGGIES, CHEESE, SAUCE
  }

}
As you can see, you place the @Document annotation at the class level to indicate that Ingredient is a document entity that can be written to and read from a Mongo database. By default, the collection name (the Mongo analog to a relational database table) is based on the class name, with the first letter lowercase. Because you haven’t specified otherwise, Ingredient objects will be persisted to a collection named ingredient. But you can change that by setting the collection attribute of @Document as follows:
@Data
@AllArgsConstructor
@NoArgsConstructor(access=AccessLevel.PRIVATE, force=true)
@Document(collection="ingredients")
public class Ingredient {
  ...
}
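The default-naming rule described above (the class’s simple name with its first letter lowercased) can be sketched in a few lines of plain Java. The helper name is invented for illustration, but the rule matches the behavior described in the text.

```java
public class CollectionNameSketch {

    // Default MongoDB collection name: the simple class name, decapitalized
    public static String defaultCollectionName(Class<?> domainType) {
        String simple = domainType.getSimpleName();
        return Character.toLowerCase(simple.charAt(0)) + simple.substring(1);
    }

    public static class Ingredient {}
    public static class TacoOrder {}

    public static void main(String[] args) {
        System.out.println(defaultCollectionName(Ingredient.class));  // ingredient
        System.out.println(defaultCollectionName(TacoOrder.class));   // tacoOrder
    }
}
```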
You’ll also notice that the id property has been annotated with @Id. This designates the property as being the ID of the persisted document. You can use @Id on any property whose type is Serializable, including String and Long. In this case, you’re already using the String-defined id property as a natural identifier, so there’s no need to change it to any other type.
So far, so good. But you’ll recall from earlier in this chapter that Ingredient was the easy domain type to map for Cassandra. The other domain types, such as Taco, were a bit more challenging. Let’s look at how you can map the Taco class to see what surprises it might hold.
MongoDB’s approach to document persistence lends itself very well to the domain-driven-design way of applying persistence at the aggregate root level. Documents in MongoDB tend to be defined as aggregate roots, with members of the aggregate as subdocuments.
What that means for Taco Cloud is that because Taco is only ever persisted as a member of the TacoOrder-rooted aggregate, the Taco class doesn’t need to be annotated as a @Document, nor does it need an @Id property. The Taco class can remain clean of any persistence annotations, as shown here:
package tacos;

import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import javax.validation.constraints.NotNull;
import javax.validation.constraints.Size;

import lombok.Data;

@Data
public class Taco {

  @NotNull
  @Size(min=5, message="Name must be at least 5 characters long")
  private String name;

  private Date createdAt = new Date();

  @Size(min=1, message="You must choose at least 1 ingredient")
  private List<Ingredient> ingredients = new ArrayList<>();

  public void addIngredient(Ingredient ingredient) {
    this.ingredients.add(ingredient);
  }

}
The TacoOrder class, however, being the root of the aggregate, will need to be annotated with @Document and have an @Id property, as follows:
package tacos;

import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import javax.validation.constraints.Digits;
import javax.validation.constraints.NotBlank;
import javax.validation.constraints.Pattern;

import org.hibernate.validator.constraints.CreditCardNumber;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

import lombok.Data;

@Data
@Document
public class TacoOrder implements Serializable {

  private static final long serialVersionUID = 1L;

  @Id
  private String id;

  private Date placedAt = new Date();

  // other properties omitted for brevity's sake

  private List<Taco> tacos = new ArrayList<>();

  public void addTaco(Taco taco) {
    tacos.add(taco);
  }

}
For brevity's sake, I've snipped out the various delivery and credit card fields. But from what's left, it's clear that all you need is @Document and @Id, as with the other domain types.
Notice, however, that the id property has been changed to be a String (as opposed to a Long in the JPA version or a UUID in the Cassandra version). As I said earlier, @Id can be applied to any Serializable type. But if you choose to use a String property as the ID, you get the benefit of Mongo automatically assigning a value to it when it's saved (assuming that it's null). By choosing String, you get database-managed ID assignment and needn't worry about setting that property manually.
Although there are some more-advanced and unusual use cases that require additional mapping, you'll find that for most cases, @Document and @Id, along with an occasional @Field or @Transient, are sufficient for MongoDB mapping. They certainly do the job for the Taco Cloud domain types.
All that’s left is to write the repository interfaces.
Spring Data MongoDB offers automatic repository support similar to what's provided by Spring Data JPA and Spring Data Cassandra.
You'll start by defining a repository for persisting Ingredient objects as documents. As before, you can write IngredientRepository to extend CrudRepository, as shown here:
package tacos.data;

import org.springframework.data.repository.CrudRepository;

import tacos.Ingredient;

public interface IngredientRepository
         extends CrudRepository<Ingredient, String> {
}
Wait a minute! That looks identical to the IngredientRepository interface you wrote in section 4.1 for Cassandra! Indeed, it's the same interface, with no changes. This highlights one of the benefits of extending CrudRepository: it's more portable across various database types and works equally well for MongoDB as for Cassandra.
Moving on to the OrderRepository interface, you can see in the following snippet that it's quite straightforward:
package tacos.data;

import org.springframework.data.repository.CrudRepository;

import tacos.TacoOrder;

public interface OrderRepository
         extends CrudRepository<TacoOrder, String> {
}
Just like IngredientRepository, OrderRepository extends CrudRepository to gain the standard CRUD operations, including its save() methods. Otherwise, there's nothing terribly special about this repository, compared to some of the other repositories you've defined thus far. Note, however, that the ID parameter when extending CrudRepository is now String instead of Long (as for JPA) or UUID (as for Cassandra). This reflects the change we made in TacoOrder to support automatic assignment of IDs.
In the end, working with Spring Data MongoDB isn't drastically different from the other Spring Data projects we've worked with. The domain types are annotated differently. But aside from the ID parameter specified when extending CrudRepository, the repository interfaces are nearly identical.
Spring Data supports repositories for a variety of NoSQL databases, including Cassandra, MongoDB, Neo4j, and Redis.
The programming model for creating repositories differs very little across different underlying databases.
Working with nonrelational databases demands an understanding of how to model data appropriately for how the database ultimately stores the data.