Handling data
In an environment with polyglot persistence, which can exist in a system of microservices, it is necessary to keep the data handling manageable. To explain how this goal can be achieved, this chapter provides a short description of the characteristics of microservices concerning data handling and then looks at how it can be done with Java based microservices.
The following topics are covered:
 – Data-specific characteristics of a microservice
 – Support in Java
5.1 Data-specific characteristics of a microservice
One way to identify the data that must be stored in the data store of your microservice is a top-down approach: Start at the business level to model your data. The following sections show how to identify this data, how to handle it, and how it can be shared with the data stores of other microservices. For a description of the top-down approach, see the following website:
5.1.1 Domain-driven design leads to entities
From the approach in domain-driven design, you get the following objects, among others:
Entity
“An object that is not defined by its attributes, but rather by a thread of continuity and its identity.”
Value Objects
“An object that contains attributes but has no conceptual identity. They should be treated as immutable.”
Aggregate
“A collection of objects that are bound together by a root entity, otherwise known as an aggregate root. The aggregate root guarantees the consistency of changes being made within the aggregate by forbidding external objects from holding references to its members.”
Repository
“Methods for retrieving domain objects should delegate to a specialized Repository object such that alternative storage implementations may be easily interchanged.”
These quotations are taken from the following source:
These objects should be mapped to your persistence storage (you should only aggregate entities when they have the same lifecycle). This domain model is the basis for both a logical data model and a physical data model.
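As an illustration, the following minimal sketch shows how these terms can look in Java code. The class and attribute names are assumptions for illustration only, not taken from the examples in this chapter: The Money value object is immutable and has no identity, and the Order entity is the aggregate root through which all changes to its members flow.

import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Value object: immutable, defined only by its attributes
final class Money {
    private final BigDecimal amount;
    private final String currency;

    Money(BigDecimal amount, String currency) {
        this.amount = amount;
        this.currency = currency;
    }
}

// Entity and aggregate root: defined by its identity (id), not its attributes
class Order {
    private Long id;
    private final List<Money> payments = new ArrayList<>();

    // External objects must not hold references to aggregate members,
    // so changes go through methods of the root
    void addPayment(Money payment) {
        payments.add(payment);
    }
}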
For more information about domain-driven design terms, see the following website:
5.1.2 Separate data store per microservice
Every microservice should have its own data store (Figure 5-1) so that it is decoupled from the other microservices and their data. A microservice must not directly access the data store of another microservice.
Reasons for this characteristic:
If two microservices share a data store, they are tightly coupled. Changing the structure of the data store (such as tables) for one microservice can cause problems for the other microservice. If microservices are coupled this way, then deployment of new versions must be coordinated, which must be avoided.
Every microservice should use the type of database that best fits its needs (polyglot persistence, see 5.1.3, “Polyglot persistence” on page 45 for more details). You do not need to make a trade-off between different microservices when choosing a database system. This configuration leads to a separate data store for each microservice.
From a performance perspective, it can be useful for every microservice to have its own data store because scaling becomes easier. The data store can be hosted on its own server.
Figure 5-1 Separate data store per microservice
The separation of data stores for a relational database can be achieved in one of the following ways:
Schema per microservice
Each service has its own schema in the database (Figure 5-2 on page 44). Other services can use the same database, but must use a different schema. This separation should be enforced with the database access control mechanism (using grants for connected database users), because developers under time pressure tend to take shortcuts and access the other schema directly.
Database per microservice
Each microservice can have its own database but share the database server with the other microservices (Figure 5-2 on page 44). Because each service connects with a different database user, there is a good separation of the databases.
Figure 5-2 Schema per microservice and database per microservice
Database server per microservice
This is the highest degree of separation. It can be useful, for example, when performance aspects need to be addressed (Figure 5-3).
Figure 5-3 Database server per microservice
5.1.3 Polyglot persistence
Every microservice should use its own data store, which means that it can also use a different data storage technology. The NoSQL movement has led to many new data storage technologies, which can be used alongside a traditional relational database. Based on the requirements that an application must implement, it can choose between different types of technologies to store its data. Having a range of data storage technologies in one application is known as polyglot persistence.
For some microservices, it is best to store their data in a relational database. Other services, with different types of data (such as unstructured, complex, or graph-oriented data), can store their data in one of the NoSQL databases.
Polyglot programming is a term that means that applications should be written in a programming language that best fits the challenge that an application must deal with.
A more in-depth description of these two terms can be found on the Martin Fowler website at:
5.1.4 Data sharing across microservices
In some situations, a client of a microservice application might want to request data that is owned by a different service. For example, the client might want to see all of his Payments together with their corresponding states. He would then have to query the Account service and afterward the Payment service. It is against the best practices for microservices to join the data by using the database. To handle this business case, you must implement an adapter service (Figure 5-4) that queries the Account service and the Payment service, and then returns the collected data to the client. The adapter service is also responsible for doing the necessary transformations on the data it has received.
Figure 5-4 Adapter microservice
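The following sketch shows one way to implement such an adapter as a JAX-RS endpoint that uses the JAX-RS 2.0 client API. The service URLs, paths, and JSON field names are assumptions for illustration only, and a provider that binds JSON responses to the javax.json types is assumed to be available in the runtime.

import javax.enterprise.context.RequestScoped;
import javax.json.Json;
import javax.json.JsonArray;
import javax.json.JsonObject;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.core.MediaType;

@RequestScoped
@Path("/customerPayments")
public class CustomerPaymentsAdapter {

    private final Client client = ClientBuilder.newClient();

    @GET
    @Path("/{customerId}")
    @Produces(MediaType.APPLICATION_JSON)
    public JsonObject getPayments(@PathParam("customerId") String customerId) {
        // Query the Account service for the account of the customer
        JsonObject account = client
            .target("http://account-service/accounts/" + customerId)
            .request(MediaType.APPLICATION_JSON)
            .get(JsonObject.class);

        // Query the Payment service for the payments of this account
        JsonArray payments = client
            .target("http://payment-service/payments")
            .queryParam("accountId", account.getString("id"))
            .request(MediaType.APPLICATION_JSON)
            .get(JsonArray.class);

        // Transform and combine the data for the client
        return Json.createObjectBuilder()
            .add("account", account)
            .add("payments", payments)
            .build();
    }
}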
Changing data can become more complicated. In a system of microservices, some business transactions can span multiple microservices. In a microservice application for a retail store, for example, there might be a service to place an Order and a service to do the Payment. Therefore, if a customer wants to buy something in your shop and pays for it, the business transaction spans the two microservices. Every microservice has its own data store so, with a business transaction spanning two or more microservices, two or more data stores are involved in this business transaction. This section describes how to implement these business transactions.
Event-Driven architecture
You need a method to ensure the consistency of the data involved in a business transaction that spans two or more microservices. One way would be a distributed transaction, but there are many reasons why this should not be done in a microservice application. The main reason is the tight coupling of the microservices that are involved in a distributed transaction. If two microservices are involved in a distributed transaction and one service fails or has a performance problem, the other service must wait for the timeout to roll back the transaction.
The best way to span a business transaction across microservices is to use an event-driven architecture. To change data, the first service updates its data and, in the same (inner) transaction, it publishes an event. The second microservice, which has subscribed to this event, receives this event and does the data change on its data. Using a publish/subscribe communication model, the two microservices are loosely coupled. The coupling only exists on the messages that they exchange. This technique enables a system of microservices to maintain data consistency across all microservices without using a distributed transaction.
If there are microservices in the microservice application that send many messages to each other, then they might be good candidates for merging into one service. But be careful because this configuration might break the domain-driven design aspect of your microservices. Adapter services doing complex updates, which span more than one service, can also be implemented by using events.
The programming model of an event-driven architecture is more complex, but Java can help to keep the complexity manageable.
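As a sketch of the producing side, the following EJB persists the Order and publishes the event inside the same container-managed transaction. The names are illustrative, the event here is a simple text message, and the JMS API used is described in 5.2.5, “Java Message Service API”.

import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.inject.Inject;
import javax.jms.JMSContext;
import javax.jms.Topic;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class OrderService {

    @PersistenceContext
    private EntityManager em;

    @Inject
    private JMSContext jmsContext;

    @Resource(lookup = "jms/OrderCreatedTopic")
    private Topic orderCreatedTopic;

    // Both operations take part in the same container-managed transaction
    public void createOrder(Order order) {
        em.persist(order);
        jmsContext.createProducer().send(orderCreatedTopic, "OrderCreated");
    }
}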
Eventual Consistency
In an event-driven architecture, sending messages to other microservices creates a problem called eventual consistency. It is most often a runtime issue resulting from the following situation: Microservice A changes data in its data store and sends, in the same inner transaction, a message to microservice B. After a short amount of time, microservice B receives the message and changes the data in its data store. In this normally short period, the data in the two data stores is not consistent. For example: Service A updates the Order data in its data store and sends a message to service B to do the Payment. Until the Payment is processed, there is an Order that has not been paid. Things get worse when the receiver is not able to process the message. In this case, the messaging system or the receiving microservice must implement strategies to cope with this problem.1
In a microservice application, every microservice has its own database. A business transaction that spans more than one microservice introduces eventual consistency because distributed transactions are discouraged as a solution to this problem. One way to cope with such business transactions is shown in Figure 5-5 on page 47. The Order microservice saves the Order in its data store and sends an event, for example OrderCreated, to the Payment microservice. As long as the Order microservice has not received the confirmation of the Payment from the Payment microservice, the Order is in a pending state.
The Payment service is subscribed to the OrderCreated event, so it processes this event and does the Payment in its data store. If the Payment succeeds, it publishes a PaymentApproved event to which the Order microservice is subscribed. After processing the PaymentApproved event, the state of the Order is changed from Pending to Approved. If the customer queries the state of his Order, he gets one of the following two responses: the Order is in a pending state, or the Order is approved.
Figure 5-5 Event messaging between microservices
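A minimal sketch of the state change on the Order side can look like the following code. The status attribute and the handler method are assumptions for illustration; how the event itself is consumed is shown in 5.2.5, “Java Message Service API”.

import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class OrderStatusUpdater {

    @PersistenceContext
    private EntityManager em;

    // Called when a PaymentApproved event for this order is consumed
    public void onPaymentApproved(Long orderId) {
        Order order = em.find(Order.class, orderId);
        order.setStatus("APPROVED");   // the Order was in "PENDING" state before
    }
}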
In situations where the data is not yet available, the service can respond to the client with something like the following message: “Sorry, please try again later”.
Data replication
The separation of the data stores and the requirement to get data from different data stores can lead to the idea that using the data replication mechanisms of the database system would solve the problem. Doing this, for example, with a database trigger, a timed stored procedure, or another process has the same disadvantage as sharing the database between microservices. Changing the structure of the data on one side of the replication pipe leads to problems with the replication process. The process must be adapted every time a new version of a service is deployed. This is also a form of tight coupling and must be avoided.
As stated, the event-based processing can decouple the two data stores. The services processing the events can do the relevant data transformation, if needed, and store the data in their own data stores.
5.1.5 Event Sourcing and Command Query Responsibility Segregation
In an event-driven architecture, consider Command Query Responsibility Segregation (CQRS) and Event Sourcing. These two architectural patterns can be combined to handle the events flowing through your microservice application.
CQRS splits the access to the data store into two separate parts: One part for the read operations and the other part for the write operations. Read operations do not change the state of a system. They only return the state. Write operations (commands) change the state of the system, but do not return values. Event Sourcing stores the sequence of events that occur as changes happen to your data. Figure 5-6 shows a sample of using CQRS.
Figure 5-6 Sample of using CQRS
As shown in Figure 5-6, the events are stored sequentially in the event store. The data in the query model is synchronized with the data from the event store. To support the event store or the query model, you can use specialized systems, for example, Elasticsearch to support the queries of your microservice. For more information about Elasticsearch, see:
This architecture can also be used to deal with events in your microservice application.
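To make the pattern concrete, the following minimal sketch shows an append-only event store based on JPA. The class names and the JSON payload column are assumptions for illustration: Commands only append events, and the query model is built by reading the events in sequence.

import java.util.List;
import javax.ejb.Stateless;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.PersistenceContext;

@Entity
public class StoredEvent {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long sequence;    // global order of the events
    private String type;      // for example "OrderCreated"
    private String payload;   // the event data, for example as JSON
    // getters and setters omitted
}

@Stateless
public class EventStore {

    @PersistenceContext
    private EntityManager em;

    // The write side only appends events, it never updates them
    public void append(StoredEvent event) {
        em.persist(event);
    }

    // The read side synchronizes the query model from the event sequence
    public List<StoredEvent> eventsSince(Long lastSequence) {
        return em.createQuery(
                "Select e from StoredEvent e where e.sequence > :seq " +
                "order by e.sequence", StoredEvent.class)
            .setParameter("seq", lastSequence)
            .getResultList();
    }
}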
5.1.6 Messaging systems
You can use messaging systems or a message-oriented middleware to support your event-driven architecture.
“Message-oriented middleware (MOM) is software or hardware infrastructure supporting sending and receiving messages between distributed systems. MOM allows application modules to be distributed over heterogeneous platforms and reduces the complexity of developing applications that span multiple operating systems and network protocols. The middleware creates a distributed communications layer that insulates the application developer from the details of the various operating systems and network interfaces. APIs that extend across diverse platforms and networks are typically provided by MOM. MOM provides software elements that reside in all communicating components of a client/server architecture and typically support asynchronous calls between the client and server applications. MOM reduces the involvement of application developers with the complexity of the master-slave nature of the client/server mechanism.”
To communicate with message-oriented middleware, you can use different protocols. The following are the most common protocols:
Advanced Message Queuing Protocol (AMQP)
AMQP “mandates the behavior of the messaging provider and client to the extent that implementations from different vendors are interoperable, in the same way as SMTP, HTTP, FTP, etc. have created interoperable systems.”
MQ Telemetry Transport (MQTT)
MQTT is a “publish-subscribe-based “lightweight” messaging protocol for use on top of the TCP/IP protocol. It is designed for connections with remote locations where a “small code footprint” is required or the network bandwidth is limited.” It is mostly used in Internet of Things (IoT) environments.
In the Java world, there is an API to communicate with message-oriented middleware: the Java Message Service (JMS), which is part of the Java EE specification. The current version is JMS 2.0.
Because JMS has existed for a long time (since 2001), many JMS message brokers are available that can be used as MOM systems. But there are also messaging systems that implement AMQP:
 – RabbitMQ
 – Apache Qpid
 – Red Hat Enterprise MRG
All of these systems provide Java APIs so they can be used in your Java based microservice.
5.1.7 Distributed transactions
Most, if not all, of the messaging systems support transactions. It is also possible to use distributed transactions when sending messages to the message system and changing data in a transactional data store.
Distributed transactions and two-phase commit can be used between a microservice and its backing store, but should not be used between microservices. Given the independent nature of microservices, there must not be an affinity between specific service instances, which is required for two-phase commit transactions.
For interactions spanning services, compensation or reconciliation logic must be added to ensure consistency is maintained.
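As an illustration, the following hedged sketch shows compensation logic in the Order service; the event and status names are assumptions. If the Payment service reports a failure, the Order service reverses its own local change instead of relying on a distributed rollback.

import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class OrderCompensation {

    @PersistenceContext
    private EntityManager em;

    // Called when a PaymentFailed event is consumed
    public void onPaymentFailed(Long orderId) {
        Order order = em.find(Order.class, orderId);
        order.setStatus("CANCELLED");  // compensate the local change
        // Optionally publish an OrderCancelled event for other services
    }
}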
5.2 Support in Java
In a polyglot persistence environment, the programming language that you use to implement your microservices must deal with different persistence technologies. It must be able to support each of the different ways to persist your data. Java as a programming language has many APIs and frameworks that support the developer in dealing with different persistence technologies.
5.2.1 Java Persistence API
“The Java Persistence API (JPA) is a Java specification for accessing, persisting, and managing data between Java objects / classes and a relational database. JPA was defined as part of the EJB 3.0 specification as a replacement for the EJB 2 CMP Entity Beans specification. JPA is now considered the standard industry approach for Object to Relational Mapping (ORM) in the Java Industry.
JPA itself is just a specification, not a product; it cannot perform persistence or anything else by itself. JPA is just a set of interfaces, and requires an implementation. There are open-source and commercial JPA implementations to choose from and any Java EE 5 application server should provide support for its use. JPA also requires a database to persist to.”
Java Platform, Enterprise Edition 7 (Java EE 7) includes Java Persistence 2.1 (JSR 338).
The main focus for inventing JPA was to get an object-relational mapper to persist Java objects in a relational database. The implementation behind the API, the persistence provider, can be implemented by different open source projects or vendors. The other benefit that you get from using JPA is that your persistence logic is more portable.
JPA defines its own query language, the Java Persistence Query Language (JPQL), that can be used to generate queries for different database vendors. Java classes are mapped to tables in the database, and relationships between the classes can correspond to the relationships between the tables. Operations can be cascaded along these relationships, so an operation on one class can result in operations on the data of the other class. JPA 2.0 introduced the Criteria API, which helps to get a correct query at both run time and compile time. Every query that you implement in your application can get a name and can be addressed by this name. This makes it easier to know, after a few weeks of programming, what the query does.
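As an illustration, the following sketch expresses an Order-by-ID lookup (the same lookup that Example 5-6 later shows with JPQL) by using the javax.persistence.criteria classes; em and orderId are assumed to be defined as in the later examples.

CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<Order> cq = cb.createQuery(Order.class);
Root<Order> root = cq.from(Order.class);
cq.select(root).where(cb.equal(root.get("id"), orderId));
Order order = em.createQuery(cq).getSingleResult();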
The JPA persistence providers implement the database access. The following are the most common:
Hibernate
EclipseLink
Apache OpenJPA
The default JPA provider of Liberty for Java EE 7 is EclipseLink.
JPA from a bird's eye view
The following short explanations and code snippets show some of the features of JPA. The best way to start with JPA is to create the entity classes that hold the data (Example 5-1).
Example 5-1 JPA Class to hold entity data
@Entity
public class Order {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String description;

    @Temporal(TemporalType.DATE)
    private Date orderDate;

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    public Date getOrderDate() {
        return orderDate;
    }

    public void setOrderDate(Date orderDate) {
        this.orderDate = orderDate;
    }
}
Every entity class needs an @Entity annotation to be managed by the persistence provider. The entity class is mapped by name to the corresponding table in the database (convention over configuration). Different mappings can also be applied. The attributes of the class are mapped by name to the columns of the underlying table. The automatic mapping of the attributes can also be overwritten (@Column). Every entity class must have an identity (see the domain-driven design discussion in “Mapping domain elements into services” on page 13). The identity column (or columns), annotated with @Id, is necessary for the persistence provider to map the values of the object to the data row in the table. The value of the identity column can be generated in different ways by the database or the persistence provider. Some attributes of an entity class must be transformed in a special way to be stored in the database. For example, a database column of type DATE must be mapped in the entity class with the annotation @Temporal.
The main JPA interface used to query the database is the EntityManager. It has methods to create, read, update, and delete data from the database. There are different ways to get a reference to the EntityManager, depending on the environment that the application is running in. In an unmanaged environment (no servlet, EJB, or CDI container), you must use a factory method of the class EntityManagerFactory, as shown in Example 5-2.
Example 5-2 How to get an EntityManager in a Java SE environment
EntityManagerFactory entityManagerFactory =
    Persistence.createEntityManagerFactory("OrderDB");
EntityManager entityManager =
    entityManagerFactory.createEntityManager();
The String OrderDB is the name of the persistence unit that the EntityManager is created for. Persistence units are used to logically group entity classes and related properties to configure the persistence provider (configuration in the persistence.xml file).
In a managed environment, things are easier. The EntityManager can be injected from the container as shown in Example 5-3.
Example 5-3 How to get an EntityManager In a Java EE environment
@PersistenceContext
EntityManager em;
If the persistence context is injected without a unitName, which is the name of a persistence unit configured in the configuration file (persistence.xml), a default value is used. If there is only one persistence unit, JPA uses it as the default.
The following sections show how to implement some simple create, retrieve, update, and delete operations with the methods from the EntityManager as shown in Example 5-4.
Example 5-4 JPA create operation
@PersistenceContext
EntityManager em;
...
public Order createOrder(Order order) {
    em.persist(order);
    return order;
}
The persist method of the EntityManager does two things. First, the EntityManager manages the object, which means that it holds the object in its persistence context. The persistence context can be seen as a cache holding the objects that are related to the rows in the database. The relation is maintained by using the database transactions. Second, the object is persisted: It is stored in the database. If the Order entity class has an ID attribute whose value is generated by the database, then the value of the attribute is set by the EntityManager after the insertion into the database. That is the reason why the return value of the createOrder method is the object itself (Example 5-5).
Example 5-5 JPA read operation by using the find method
@PersistenceContext
EntityManager em;
...
public Order readOrder(Long orderID) {
    Order order = em.find(Order.class, orderID);
    return order;
}
The EntityManager method find searches the table for a row whose primary key is given as a parameter (orderID). The result is returned as an object of the entity class Order (Example 5-6).
Example 5-6 JPA read operation by using JPQL
@PersistenceContext
EntityManager em;
...
public Order readOrder(Long orderID) {
    TypedQuery<Order> query =
        em.createQuery("Select o from Order o " +
                       "where o.id = :id", Order.class);
    query.setParameter("id", orderID);
    Order order = query.getSingleResult();
    return order;
}
Example 5-6 on page 52 shows the function of the find method implemented with JPQL and a parameter. From the JPQL string Select o from Order o where o.id = :id, a TypedQuery is generated. With a TypedQuery, you can omit the Java cast after retrieving the result object (see the parameter Order.class). The parameter in the JPQL is addressed by a name (id), which makes it more readable for the developer. The method getSingleResult makes sure that exactly one row is found in the database. If more than one row matches the query, a NonUniqueResultException is thrown (if no row matches, a NoResultException is thrown); both are RuntimeExceptions.
The merge method does the update in the database (Example 5-7). The parameter order is a detached object, which means that it is not in the persistence context. After the update in the database, an attached order object (it is now part of the persistence context) is returned by the EntityManager.
Example 5-7 JPA update operation
public Order updateOrder(Order order, String newDesc) {
    order.setDescription(newDesc);
    return em.merge(order);
}
To delete a row in the database, you need an attached object (Example 5-8). To attach the object to the persistence context, you can do a find. If the object is already attached, the find method is not needed. The remove method, called with the attached object as a parameter, deletes the object from the database.
Example 5-8 JPA delete operation
public void removeOrder(Long orderId) {
    Order order = em.find(Order.class, orderId);
    em.remove(order);
}
As mentioned before, some configuration must be used to tell the persistence provider where to find the database and how to work with it. This is done in a configuration file called persistence.xml, which must be in the class path of your service. Depending on your environment (with or without a Java EE container), the configuration must be done in one of two ways (Example 5-9).
Example 5-9 Persistence.xml in a Java SE environment
<persistence>
    <persistence-unit name="OrderDB"
                      transaction-type="RESOURCE_LOCAL">
        <class>com.service.Order</class>
        <properties>
            <!-- Properties to configure the persistence provider -->
            <property name="javax.persistence.jdbc.url"
                      value="<jdbc-url-of-database>" />
            <property name="javax.persistence.jdbc.user"
                      value="user1" />
            <property name="javax.persistence.jdbc.password"
                      value="password1" />
            <property name="javax.persistence.jdbc.driver"
                      value="<package>.<DriverClass>" />
        </properties>
    </persistence-unit>
</persistence>
To configure the persistence provider, the first thing that must be done is to define the persistence unit (OrderDB). One attribute of the persistence unit is the transaction-type. Two values can be set: RESOURCE_LOCAL and JTA. The first option makes the developer responsible for doing the transaction handling in the code. If there is no transaction manager available in your environment, use this option. The second option is JTA, which stands for Java Transaction API, a specification of the Java Community Process (JCP). This option tells the persistence provider to delegate transaction handling to a transaction manager that exists in your runtime environment.
Between the XML tags <class></class>, you list the entity classes to be used in this persistence unit.
In the properties section of the file, you can set values that configure the way the persistence provider works with the database. Property names beginning with javax.persistence.jdbc are defined by the JPA standard. Example 5-9 on page 53 shows how to set the database URL (used for database connections), the user, and the password. The javax.persistence.jdbc.driver property tells the persistence provider which JDBC driver class to use.
The main difference in the configuration file for the Java EE environment is shown in Example 5-10.
Example 5-10 Persistence.xml in a Java EE environment
<persistence>
    <persistence-unit name="OrderDB">
        <jta-data-source>jdbc/OrderDB</jta-data-source>
        <class>com.widgets.Order</class>
        ...
    </persistence-unit>
</persistence>
The default transaction handling in JPA is to use JTA, so you do not need to set it in the configuration file. The jta-data-source element points to the JNDI name of the data source that is configured in the application server of your Java EE environment.
To do the transaction management in non-Java EE environments, the EntityManager provides some methods that can be used, as shown in Example 5-11.
Example 5-11 Transaction management in non-Java EE environments
EntityManagerFactory emf =
    Persistence.createEntityManagerFactory("OrderDB");
EntityManager em = emf.createEntityManager();
EntityTransaction tx = em.getTransaction();
tx.begin();
try {
    em.persist(yourEntity);
    em.merge(anotherEntity);
    tx.commit();
} finally {
    if (tx.isActive()) {
        tx.rollback();
    }
}
Example 5-11 on page 54 shows the transaction handling in a non-Java EE environment. Avoid doing transaction management on your own in a Java EE environment. There are better ways of doing this, as described in 5.2.2, “Enterprise JavaBeans” on page 57.
To separate the data store for your microservice, you can set the default schema of your relational database by using a configuration file, as shown in Example 5-12.
Example 5-12 Setting default schema in my-orm.xml
<?xml version="1.0" encoding="UTF-8"?>
<entity-mappings xmlns="http://java.sun.com/xml/ns/persistence/orm"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="http://java.sun.com/xml/ns/persistence/orm orm_2_0.xsd"
                 version="2.0">
    <persistence-unit-metadata>
        <persistence-unit-defaults>
            <schema>ORDER</schema>
        </persistence-unit-defaults>
    </persistence-unit-metadata>
</entity-mappings>
This file must be referenced in the JPA configuration file persistence.xml (Example 5-13).
Example 5-13 Snippet showing reference in persistence.xml to a mapping file
<persistence-unit name="OrderDB">
    <mapping-file>my-orm.xml</mapping-file>
This configuration applies the schema name given in the my-orm.xml mapping file, in this case ORDER, to all your JPA classes. It ensures that you can only use tables in this schema.
JPA in combination with NoSQL databases
EclipseLink is one of the JPA providers that has started to support NoSQL databases (from version 2.4 onward). Currently, MongoDB and Oracle NoSQL are supported. Other NoSQL databases are expected to be supported in future releases.
MongoDB is a NoSQL document-oriented database. It stores JSON-like documents with the ability to have dynamic schemas. MongoDB uses a specialized version of the JSON format called BSON:
The EclipseLink Solution Guide for EclipseLink Release 2.5 shows an example of how to access MongoDB:
Before deciding to use JPA (like with EclipseLink) as the provider for MongoDB, consider the following points:
SQL is a specified language that has gone through a number of revisions. The database vendors have implemented this standard, but they have also added features that have not been standardized. JPA has good support for SQL, but does not support all of the features that are exposed by MongoDB. If some of these features are needed in your microservice, then the benefits that you get from using JPA are fewer.
JPA has numerous features that do not make sense in a document-oriented database, but the EntityManager has methods for these features. So you must define which methods to use in your service.
If you are familiar with JPA and only require simple functions for storing your data in a NoSQL database, then it is acceptable to start doing this with your JPA provider. If the data access gets more complicated, it is better to use the Java drivers from your NoSQL database. JPA does not really fit with NoSQL databases, but it can be a good starting point for your implementation. For a sample of how to use the native Java driver of MongoDB, see:
To get more out of your JPA provider, it can be useful to use Spring Data JPA, which adds an extra layer on top of the JPA provider:
Suitability of JPA for data handling in microservices
The following list shows some arguments why JPA is useful for data handling in your microservices:
Defining your microservices from the perspective of domain-driven design leads to a situation where most microservices need only simple queries to persist their entities (simple create, retrieve, update, and delete operations). The EntityManager of JPA has the methods that you need for this: persist, find, merge, and remove. Not much programming is needed to call these methods.
In some cases, the queries get more complex. These queries can be done with the query language defined in JPA: JPQL. The need for complex queries should be an exceptional case. JSON documents with a hierarchy of entity data should each be stored separately. This leads to simple IDs and simple queries.
JPA is standardized and has a great community to support you in developing your microservice. All Java EE servers must support JPA.
JPA can also be used to implement microservices that do not run in a Java EE container.
Generating entity classes from the database (reverse engineering) can reduce the number of lines of code you must implement by yourself.
For polyglot persistence, JPA has support for relational data stores and, with EclipseLink, for document-oriented data stores.
JPA, as an abstraction of the relational database, allows you to exchange your relational data store for another one if you need to. This portability protects your microservice from vendor lock-in.
To implement the strategy that every microservice has its own schema in a relational database, you can set the default schema for your service in the JPA configuration file persistence.xml.
5.2.2 Enterprise JavaBeans
Enterprise JavaBeans 3.2 (EJB), specified in JSR 345, is part of the Java EE specification. EJBs are more than normal Java classes for these reasons:
They have a lifecycle.
They are managed by an EJB container (runtime environment for the EJB).
They have a lot more features that can be helpful.
EJBs are server-side software components. Since EJB 3.0, it is no longer necessary to use a deployment descriptor. All declarations of an EJB can be done with annotations in the EJB class itself. The handling of an EJB inside an EJB container is as lightweight as that of a CDI managed bean. The thread handling of the EJBs is done by the EJB container (similar to thread handling in a servlet container). An additional feature of EJBs is that they can be used in combination with Java EE security.
EJBs can be divided into the following types:
Stateless
Stateful
Singleton
Message-driven beans (MDB)
Stateless EJBs cannot hold any state, but stateful EJBs can. Because of the characteristics of a microservice, stateful EJBs should not be used in a microservice. A singleton bean exists only one time in a Java EE server. MDBs are used to process asynchronous messages in combination with a JMS provider.
EJBs can implement several business views, which must be annotated accordingly:
Local interface (@Local)
Methods in this interface can only be called by clients in the same Java virtual machine (JVM).
Remote interface (@Remote)
Methods that are listed in this interface can be called by clients from outside the JVM.
No interface (@LocalBean)
Nearly the same as the local interface, except that all public methods of the EJB class are exposed to the client.
In a lightweight architecture, which a microservice should have, it is useful to implement the EJBs as no interface EJBs.
One of the main benefits that EJBs provide is automatic transaction handling. Every time a business method is called, the transaction manager of the EJB container is invoked (exception: an EJB that has explicitly switched off support for transactions). So it is easy to use EJBs in combination with a transactional data store. Integrating EJBs with JPA is also easy.
The code snippet in Example 5-14 shows an example of how to combine the EJBs with the JPA framework.
Example 5-14 Stateless (no-interface) EJB with PersistenceContext
@Stateless
@LocalBean
public class OrderEJB {

    @PersistenceContext
    private EntityManager entityManager;

    public void addOrder(Order order) {
        entityManager.persist(order);
    }

    public void deleteOrder(Order order) {
        entityManager.remove(order);
    }
    . . .
}
The EntityManager gets injected as described in 5.2.1, “Java Persistence API” on page 50.
An EJB can have one of the following transaction attributes to operate with a transactional data store (must be implemented by the EJB container):
REQUIRED (default)
MANDATORY
NEVER
NOT_SUPPORTED
REQUIRES_NEW
SUPPORTS
For more information about these attributes, see the following website:
These so-called container-managed transactions (CMTs) can be used on every business method of an EJB. Bean-managed transactions (BMTs), which can also be used in an EJB, should be avoided. The TransactionAttribute annotation can be set at the class level so that every business method of that class has this transaction attribute (Example 5-15). If nothing is set, then all methods have the default transaction attribute (REQUIRED). A transaction attribute at the method level overwrites the class attribute.
Example 5-15 Set transaction attribute explicit in an EJB
@TransactionAttribute(REQUIRED)
@Stateless
@LocalBean
public class OrderEJB {
    ...
    @TransactionAttribute(REQUIRES_NEW)
    public void methodA() {...}

    @TransactionAttribute(REQUIRED)
    public void methodB() {...}
}
REST Endpoint not implemented as EJB
Some of the database changes or verifications are done at the point where the transaction manager has decided to commit. Check constraints in the database, for example, are sometimes verified as one of the last steps before the transaction is committed. If such a verification fails, the outcome is a RuntimeException thrown by the JPA provider, because JPA uses runtime exceptions to report errors. If you use EJBs for transaction management, the place where the RuntimeException occurs is the stub code of the EJB in which the EJB container does the transaction management. The stub code is generated by the EJB container. Therefore, you cannot react to this RuntimeException, and the exception propagates up to the place where it is eventually caught.
If you implement your REST endpoint as an EJB, as some people prefer, then the exception must be caught in your REST provider. REST providers do have exception mappers to convert exceptions to HTTP error codes. However, these exception mappers do not interfere when the RuntimeException is thrown during the database commit. Therefore, the RuntimeException is received by the REST client, which should be avoided.
The best way to handle these problems is to implement the REST endpoint as a CDI managed, request-scoped bean. In this CDI bean, you can use injection the same way as inside an EJB. So it is easy to inject your EJB into the CDI managed bean (Example 5-16).
Example 5-16 REST endpoint implemented as CDI managed bean with EJB injected
@RequestScoped
@Path("/Order")
public class OrderREST {

    @EJB
    private OrderEJB orderEJB;
    ...
}
It is also possible to integrate EJBs with Spring (see the Spring documentation about EJB integration) and, if preferred, the transaction management can be done by using Spring transaction management. However, in a Java EE world, it is better to delegate the transaction management to the server.
For more information about Spring, see the following website:
5.2.3 Bean Validation
Bean Validation is also part of the Java EE 7 specification (Bean Validation 1.1, JSR 349). The intention of Bean Validation is to easily define and enforce validation constraints on your bean data. In a Java EE environment, bean validation is done automatically by the different Java EE containers. The developer only needs to set the constraints, in the form of annotations, on attributes, methods, or classes. The validation is done automatically on the invocation of these elements (if configured). The validation can also be done explicitly in your source code.
For more information about Bean Validation, see the following website:
Examples of built-in constraints in the javax.validation.constraints package are shown in Example 5-17.
Example 5-17 Default built-in constraints in Bean Validation
@NotNull
private String username;      // username must not be null

@Pattern(regexp="\\(\\d{3}\\)\\d{3}-\\d{4}")
private String phoneNumber;   // phoneNumber must match the regular expression

@Size(min=2, max=40)
String briefMessage;          // briefMessage between 2 and 40 characters
Constraints can also be combined as shown in Example 5-18.
Example 5-18 Combination of constraints
@NotNull
@Size(min=1, max=16)
private String firstname;
It is also possible to extend the constraints (custom constraints). Example 5-19 shows how to do the validation on your own.
Example 5-19 Validating programmatically
Order order = new Order( null, "This is a description", null );
ValidatorFactory factory =
    Validation.buildDefaultValidatorFactory();
Validator validator = factory.getValidator();
Set<ConstraintViolation<Order>> constraintViolations = validator.validate(order);
assertEquals( 2, constraintViolations.size() );
// The set has no guaranteed order, so collect the messages before asserting
Set<String> messages = new HashSet<>();
for (ConstraintViolation<Order> violation : constraintViolations) {
    messages.add(violation.getMessage());
}
assertTrue( messages.contains("Id may not be null") );
assertTrue( messages.contains("Order date may not be null") );
JPA can be configured to do the bean validation automatically. The JPA specification mandates that a persistence provider must validate so-called managed classes (see Example 5-20). Managed classes in the sense of JPA are, for example, entity classes. All other classes used in JPA programming, such as embeddable classes and mapped superclasses, must also be validated. This validation is done in the lifecycle events that these managed classes are involved in.
Example 5-20 Persistence.xml with bean validation turned on
<persistence>
    <persistence-unit name="OrderDB">
        <provider>
            org.eclipse.persistence.jpa.PersistenceProvider
        </provider>
        <class> . . . </class>
        <properties>
            <property name="javax.persistence.validation.mode"
                      value="AUTO" />
        </properties>
    </persistence-unit>
</persistence>
All the other frameworks from the Java EE stack (for example, JAX-RS and CDI) can be used to validate the beans automatically, as can EJB, as shown in Example 5-21.
Example 5-21 Bean validation in an EJB
@Stateless
@LocalBean
public class OrderEJB {

    public String setDescription(@Size(max=80) String newDescription) {
        . . .
    }
}
Architectural layering aspects of JPA and Bean Validation
Depending on the layers that are implemented in a microservice, there are some aspects to address.
In a service with just a few layers, it is easy to use the JPA entity classes as Data Transfer Objects (DTOs) as well. After the JPA object is detached from its persistence context, it can be used as a plain old Java object (POJO). This POJO can be used as a DTO to transfer the data to the REST endpoint. Doing it this way has some disadvantages: The Bean Validation annotations get mixed with the JPA annotations, which can lead to Java classes with numerous annotations, and your REST endpoints become more closely tied to your database.
If the microservice is a little larger or has to deal with a more complex data model, it can be better to use a separate layer to access the database. This extra layer implements all the data access methods based on the JPA classes. The classes in this layer are data access objects (DAOs). You can use the DAO classes to generate the DTO classes for the REST endpoint and focus, on the one side, on the data model (DAO) and, on the other side, on the client (DTO). The disadvantage of this pattern is that you must convert your JPA classes handled in the DAO layer to DTOs and vice versa. To avoid writing a lot of boilerplate code for this, you can use a framework to help with the conversion, as shown in the sketch after this list. The following frameworks are available to convert Java objects:
ModelMapper
MapStruct
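With ModelMapper, for example, the conversion can be as short as the following sketch, assuming an OrderDTO class (a hypothetical name) whose property names match those of the Order entity:

import org.modelmapper.ModelMapper;

ModelMapper modelMapper = new ModelMapper();
OrderDTO orderDTO = modelMapper.map(order, OrderDTO.class);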
To extend the possibilities that you get from Bean Validation, it can be useful to use Spring. For more information, see “Validation, Data Binding, and Type Conversion” at:
5.2.4 Contexts and Dependency Injection
If your microservice does not store its data in a transactional data store, consider using CDI managed beans instead of EJBs. CDI 1.1 is specified in JSR 346 and is part of the Java EE 7 specification. CDI managed beans can be a good alternative in these situations.
For more information about CDI managed beans, see the following website:
In comparison to EJBs, CDI has no Java EE security of its own and does not get the persistence context injected. This must be done by the developers themselves or by using additional frameworks. Apache DeltaSpike, for example, has many modules that you can use to extend the functions of CDI. For more information about Apache DeltaSpike, see:
Additional frameworks can be used to extend the functions of CDI managed beans. EJBs have a thread pool that can be managed in an application server. CDI has nothing that corresponds to this function yet. Being able to configure a thread pool can be helpful in an environment with high loads.
To implement a microservice that does not run in a Java EE application server, CDI and additional modules provide many functions that are useful in these environments.
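A minimal sketch of such a CDI managed bean follows. The repository and entity classes are placeholders for whatever data access mechanism the service uses, for example, the native Java driver of a NoSQL database:

import java.util.List;
import javax.enterprise.context.RequestScoped;
import javax.inject.Inject;

@RequestScoped
public class CatalogService {

    @Inject
    private CatalogRepository repository;  // hypothetical data access bean

    public List<Product> findProducts(String keyword) {
        return repository.search(keyword);
    }
}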
5.2.5 Java Message Service API
To implement an event-driven architecture in the Java world, the JMS API provides support. It is specified in Java Message Service 2.0 (JSR 343). JMS is used to communicate with a so-called messaging provider, which is implemented by the message-oriented middleware (MOM).
When JMS was updated from version 1.1 to version 2.0, which is part of the Java EE 7 specification, much rework was done to make the API easier to use. JMS 2.0 is compatible with earlier versions, so you can use your existing code or use the new simplified API for your new microservices. The old API will not be deprecated in the next version.
According to the smart endpoints and dumb pipes methodology, your Java based microservice must just send its JSON message to an endpoint hosted by the JMS provider. The sender of a message is called the producer, and the receiver of the message is the consumer. These endpoints can be of these types:
Queue
Messages in a queue are consumed by only one consumer. The messages might be consumed in a different order than the order in which they were put into the queue. Queues are used for point-to-point semantics.
Topic
These messages can be consumed by more than one consumer. This is the implementation of publish/subscribe semantics.
In a REST-based microservice, where JSON-based requests are sent by the client, it is a good idea to also use the JSON format for your messaging system. Other messaging applications use XML. If you have only one format in your system of microservices, the implementation is easier.
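For example, an event can be serialized with the standard javax.json API before it is sent as the text of a JMS message, as in Example 5-22. The event field names here are illustrative:

import javax.json.Json;
import javax.json.JsonObject;

JsonObject event = Json.createObjectBuilder()
    .add("eventType", "OrderCreated")
    .add("orderId", 4711)
    .build();
String message = event.toString();  // pass this to the JMS producer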
Example 5-22 Producer sending messages to a JMS queue with EJB
@Stateless
@LocalBean
public class OrderEJB {

    @Inject
    @JMSConnectionFactory("jms/myConnectionFactory")
    JMSContext jmsContext;

    @Resource(mappedName = "jms/PaymentQueue")
    Queue queue;

    public void sendMessage(String message) {
        jmsContext.createProducer().send(queue, message);
    }
}
A JMSContext and a Queue are needed to send a message (Example 5-22 on page 62). These objects are injected if the message sender is running in a Java EE application server. Example 5-22 on page 62 uses an EJB, so these resources are injected. The configuration for the injected objects must be done in the application server. If an error occurs, a JMSRuntimeException is thrown.
The scope of an injected JMSContext in a JTA transaction is the transaction scope. So if your message producer is an EJB, your message is delivered in the context of a transaction, which avoids losing messages.
To consume the messages from a queue, it is easiest to use a message-driven EJB (MDB), as shown in Example 5-23. Using an MDB also has the advantage that consuming messages is done in a transaction, so no message is lost.
Example 5-23 Consumer - process messages from a JMS queue
@MessageDriven(
    name = "PaymentMDB",
    activationConfig = {
        @ActivationConfigProperty(
            propertyName = "messagingType",
            propertyValue = "javax.jms.MessageListener"),
        @ActivationConfigProperty(
            propertyName = "destinationType",
            propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(
            propertyName = "destination",
            propertyValue = "PaymentQueue"),
        @ActivationConfigProperty(
            propertyName = "useJNDI",
            propertyValue = "true")
    }
)
public class PaymentMDB implements MessageListener {

    @TransactionAttribute(
        value = TransactionAttributeType.REQUIRED)
    public void onMessage(Message message) {
        if (message instanceof TextMessage) {
            TextMessage textMessage = (TextMessage) message;
            try {
                String text = textMessage.getText();
                . . .
            } catch (JMSException e) {
                // Force a rollback and a redelivery of the message
                throw new RuntimeException(e);
            }
        }
    }
}
You must use the @MessageDriven annotation to declare the EJB type as a message-driven EJB. Within this annotation, you can set the name of the MDB and some activation configuration. The properties of the activation configuration tie the MDB to the JMS messaging system that handles the queues or topics. In an application server environment, where your MDB is hosted, it is easy to configure these elements. The MDB itself implements the MessageListener interface, which has only one method: onMessage. Every time the JMS provider has a message to process, it calls this method. The method is annotated with a transaction attribute to show that it is called in a transaction. The default transaction attribute for an MDB is TransactionAttributeType.REQUIRED. Inside the method, the message object must be converted, and the message can be extracted as a String. Other message types are also possible.
It is good practice to use the MDB only for the message handling. Keep the MDB as a technical class. The rest of your business code should be implemented in a Java POJO that is called by the MDB. This configuration makes the business code easier to test in your JUnit tests.
As stated before, every MDB runs in a transaction, so no message gets lost. If an error occurs during the processing of the message and the EJB container that handles the MDB receives a runtime exception, then the message is redelivered to the MDB (error loop). The retry count, which can be configured in your JMS provider, specifies how often this occurs. After the maximum number of retries, the message is normally put into an error queue. Messages in an error queue must be handled separately.
If a business transaction spans more than one microservice, use an event-driven architecture (see section 5.1.4, “Data sharing across microservices” on page 45). This means that the microservice sending the event must perform the following tasks:
Change data in its data store
Send the message to the second microservice
The receiving microservice must perform these tasks:
Receive the message from the queue
Change data in its data store
To be consistent, these two tasks must be done in one transaction if the data store is transactional. This requirement is true for both the producer and the consumer side of the messaging system. In these cases, you must use a distributed transaction. The transaction partners are the data store and the JMS provider (not the two data stores of the two microservices).
To track the messages that are produced and consumed, it is useful to use a correlation ID. A correlation ID specified by the producer of a message correlates it to the message consumed by the consumer. This correlation ID can also be used in the logging of the microservice to track the complete communication path of your microservices call (see 9.2, “Logging” on page 111).
Java provides a class to generate a unique ID: java.util.UUID. This class can be used to generate a correlation ID. Example 5-24 shows how to set a correlation ID.
Example 5-24 Setting correlation ID in a JMS message
// JMSContext injected as before
JMSProducer producer = jmsContext.createProducer();
producer.setJMSCorrelationID(UUID.randomUUID().toString());
producer.send(queue, message);
Example 5-25 shows retrieving the correlation ID.
Example 5-25 Retrieving the correlation ID from a JMS message
// message received as before
String correlationId = message.getJMSCorrelationID();
For more information about UUID, see the following website:
If you are using a non-JMS provider as the message-oriented middleware, JMS might not be the correct way to send messages to this system. RabbitMQ (which is an AMQP broker) can be used with JMS because Pivotal has implemented a JMS adapter for RabbitMQ. For more information about RabbitMQ, see the following website:
Apache Qpid also implements a JMS client for the AMQP protocol. These are just a few examples to show that it can be useful to use JMS to communicate with non-JMS providers. But, depending on your requirements, it can be better to use the native Java driver of the messaging system. For more information about Apache Qpid, see the following website:
Spring's support for JMS message providers is another alternative for handling JMS messages. For more information about Spring, see the following website:
5.2.6 Java and other messaging protocols
Depending on the features that you need from your message-oriented middleware, you might need a non-JMS message provider system. MQTT is a messaging protocol that best fits the needs of the Internet of Things (IoT), and AMQP was developed to be portable across vendors. JMS cannot be used to communicate with these systems. Using the clients that they provide allows you to use all of the special features that they offer. For more information about MQTT, see the following website:
Apache Kafka is a non-JMS provider, but it provides a Java driver. There is an enhancement request to implement an adapter that lets clients talk JMS with Apache Kafka, but this issue is still open. So it is better to use the Java driver that Kafka provides. For more information about Kafka, see the following website:
RabbitMQ is a message broker system that implements the AMQP protocol. It provides a Java client. For more information about RabbitMQ, see the following website:
Spring has a library to talk to AMQP-based messaging systems. Spring also provides support for the MQTT protocol. For more information about Spring, see the following website:
As you can see, there is support in Java for non-JMS messaging systems.
 

1 “Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions”, Gregor Hohpe and Bobby Woolf, Addison-Wesley Professional