CHAPTER 2

Hibernate OGM and MongoDB

By now, you should have some idea of the general scope and architecture of Hibernate OGM. In Chapter 1, I discussed how Hibernate OGM works with generic NoSQL stores, and I spoke about its general focus and how you represent, persist, and query data. In addition, you learned how to obtain a Hibernate OGM distribution, and you’ve installed a MongoDB NoSQL store and performed a simple command-line test to verify that the MongoDB server responds correctly.

In this chapter, I’ll define more clearly the relationship between Hibernate OGM and MongoDB. Instead of generic possibilities, I’ll focus on how Hibernate OGM works with the MongoDB store, and you’ll see how much of MongoDB can be “swallowed” by Hibernate OGM and some MongoDB drawbacks that force Hibernate OGM to work overtime to manage them.

Configuring MongoDB-Hibernate OGM Properties

Hibernate OGM becomes aware of MongoDB when you provide a bundle of configuration properties. If you’ve worked before with Hibernate ORM, you’re already familiar with these kinds of properties. In particular, there are three ways of setting these properties, as you’ll see in the next chapters:

  • declarative, through the hibernate.cfg.xml configuration file
  • programmatically, through Hibernate native APIs
  • declarative, through the persistence.xml configuration file in JPA context

Note   Remember, we’re using Hibernate OGM 4.0.0.Beta2 with Hibernate OGM for MongoDB 4.0.0.Beta1 and the Java driver for MongoDB 2.8.0.

Let’s take a look at the properties that enable Hibernate OGM to work with MongoDB.

hibernate.ogm.datastore.provider

As you know from Chapter 1, Hibernate OGM currently supports several NoSQL stores, including MongoDB. This property value is how you let Hibernate OGM know which NoSQL store you want to use. For MongoDB, the value of this property must be set to mongodb.

hibernate.ogm.mongodb.host

Next, Hibernate OGM needs to locate the MongoDB server instance. First, it must locate the host, which is identified by the hostname or IP address of the machine that hosts the MongoDB instance. By default, the value of this property is 127.0.0.1, which is equivalent to localhost. The host can be set through the MongoDB driver as well:

Mongo mongo = new Mongo("127.0.0.1");
// or, equivalently
Mongo mongo = new Mongo(new ServerAddress("127.0.0.1"));

hibernate.ogm.mongodb.port

And what is a hostname without a port? By default, the MongoDB instance runs on port number 27017, but you can use any other MongoDB port as long as you specify it as the value of this property. If you are using the MongoDB driver directly, the port is typically set like this:

Mongo mongo = new Mongo("127.0.0.1", 27017);
// or, equivalently
Mongo mongo = new Mongo(new ServerAddress("127.0.0.1", 27017));

hibernate.ogm.mongodb.database

Now Hibernate OGM can locate MongoDB through its host and port. You also have to specify the database to connect to. If you indicate a database name that doesn’t exist, a new database with that name will be automatically created (there’s no default value for this property). You can also connect using the MongoDB driver, like this:

DB db = mongo.getDB("database_name");
// or, addressing the database when the Mongo instance is created
Mongo db = new Mongo(new DBAddress("127.0.0.1", 27017, "database_name"));

hibernate.ogm.mongodb.username
hibernate.ogm.mongodb.password

These two properties represent authentication credentials. They have no default values and usually appear together to authenticate a user against the MongoDB server (though if you set the password without setting the username, Hibernate OGM will ignore the hibernate.ogm.mongodb.password property). You can also use the MongoDB driver to set authentication credentials, like so:

boolean auth = db.authenticate("username", "password".toCharArray());

hibernate.ogm.mongodb.safe

Note that this property is a little tricky. MongoDB isn’t adept at transactions; it doesn’t do rollback and can’t guarantee that the inserted data is, in fact, in the database, since the driver doesn’t wait for the write operation to be applied before returning. Behind the great speed advantage, which results from the driver performing a write-behind to the MongoDB server, lurks a dangerous trap that can lose data.

The MongoDB team knew of this drawback, so it developed a feature called Write Concerns to tell MongoDB how important a piece of data is. The initial setting is the default write (WriteConcern.NORMAL).

MongoDB defines several levels of data importance, but Hibernate OGM lets you switch between the default write and write safe write concerns.

With write safe, the driver doesn’t return immediately; it waits for the write operation to succeed before returning. Obviously, this can have serious consequences for performance. You can set this value using the hibernate.ogm.mongodb.safe property. By default, the value of this property is true, which means write safe is active, but you can set it to false if loss of writes is not a major concern for your case.

Here’s how to use the MongoDB driver directly to set write safe:

DB db = mongo.getDB("database_name");
DBCollection dbCollection = db.getCollection("collection_name");
dbCollection.setWriteConcern(WriteConcern.SAFE);
dbCollection.insert(piece_of_data);
// or, in short
dbCollection.insert(piece_of_data, WriteConcern.SAFE);

Note   Currently, Hibernate OGM only lets you enable the write safe MongoDB write concern (WriteConcern.SAFE). Strategies like Write FSYNC_SAFE (WriteConcern.FSYNC_SAFE), Write JOURNAL_SAFE (WriteConcern.JOURNAL_SAFE), and Write Majority (WriteConcern.MAJORITY) are thus controllable only through the MongoDB driver.

hibernate.ogm.mongodb.connection_timeout

MongoDB supports a few timeout options for different kinds of time-consuming operations. Currently, Hibernate OGM exposes through this property the MongoDB option connectTimeout (see com.mongodb.MongoOptions). This is expressed in milliseconds and represents the timeout used by the driver when the connection to the MongoDB instance is initiated. By default, Hibernate OGM sets it to 5000 milliseconds to override the driver default of 0 (which means no timeout). Using the MongoDB driver directly, you can set this option as follows:

mongo.getMongoOptions().connectTimeout = n_milliseconds;

hibernate.ogm.mongodb.associations.store

This property defines the way Hibernate OGM stores information relating to associations. The accepted values are: IN_ENTITY, COLLECTION, and GLOBAL_COLLECTION. I’ll discuss these three strategies a little later in this chapter.

hibernate.ogm.datastore.grid_dialect

This is an optional property that’s usually ignored because the datastore provider chooses the best grid dialect automatically. But if you want to override the recommended value, you have to specify the fully qualified class name of the GridDialect implementation. For MongoDB, the correct value is org.hibernate.ogm.dialect.mongodb.MongoDBDialect.

This is the set of properties that Hibernate OGM uses for configuring a connection to the MongoDB server. At this point, you have access to the essential settings for creating decent communications with the MongoDB server. In future OGM releases, we can hope to be able to access many more settings for the MongoDB driver.
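To see the properties side by side, here is a minimal sketch that collects them in a plain java.util.Properties object. The class name OgmMongoSettings and the database name tennis_db are hypothetical, the values are the defaults described above where a default exists, and the step of handing the result to Hibernate (covered in the next chapters) is deliberately left out.

```java
import java.util.Properties;

public class OgmMongoSettings {

    // Collects the MongoDB-related settings discussed in this section.
    // "tennis_db" is a made-up database name used only for illustration.
    public static Properties build() {
        Properties props = new Properties();
        props.setProperty("hibernate.ogm.datastore.provider", "mongodb");
        props.setProperty("hibernate.ogm.mongodb.host", "127.0.0.1");
        props.setProperty("hibernate.ogm.mongodb.port", "27017");
        props.setProperty("hibernate.ogm.mongodb.database", "tennis_db");
        props.setProperty("hibernate.ogm.mongodb.safe", "true");
        props.setProperty("hibernate.ogm.mongodb.connection_timeout", "5000");
        props.setProperty("hibernate.ogm.mongodb.associations.store", "IN_ENTITY");
        return props;
    }

    public static void main(String[] args) {
        // Print every setting so the whole bundle can be checked at a glance.
        build().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The username and password properties are omitted here because they have no default values; add them only when the MongoDB server requires authentication.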

Data Storing Representation

As you know, the relational data model is useless in terms of MongoDB, which is a document-based database system; all records (data) in MongoDB are documents. But, even so, MongoDB has to keep a conceptual correspondence between relational terms and its own notions. Therefore, instead of tables, MongoDB uses collections and instead of records, it uses documents (collections contain documents). MongoDB documents are BSON (Binary JSON—binary-encoded serialization of JSON-like documents) objects and have the following structure:

{
   field1: value1,
   field2: value2,
   field3: value3,
   ...
   fieldN: valueN
}

Storing Entities

OK, but we are still storing and retrieving Java entities, right? Yes, the answer is definitely yes! If Hibernate ORM provides complete support for transforming Java entities into relational tables, Hibernate OGM provides complete support for transforming Java entities into MongoDB collections. Each entity represents a MongoDB collection; each entity instance represents a MongoDB document; and each entity property will be translated into a document field (see Figure 2-1).

Figure 2-1. Storing a Java object in a MongoDB document

The Hibernate OGM team worked hard to store data as naturally as possible for MongoDB so that third-party applications can exploit this data without Hibernate OGM assistance. For example, let’s suppose we have a POJO class like the one in Listing 2-1. (I’m sure you’ve stored tons of Java objects like this into relational databases, so I’m providing no details about this simple class.)

Listing 2-1.  A POJO Class

import java.util.Date;

public class Players {
    
    private int id;
    private String name;
    private String surname;
    private int age;
    private Date birth;

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getSurname() {
        return surname;
    }

    public void setSurname(String surname) {
        this.surname = surname;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public Date getBirth() {
        return birth;
    }

    public void setBirth(Date birth) {
        this.birth = birth;
    }
}

Now, suppose an instance of this POJO is stored into the MongoDB players collection using Hibernate OGM, like this:

{
        "_id": 1,
        "age": 26,
        "birth": ISODate("1986-06-03T15:43:37.763Z"),
        "name": "Nadal",
        "surname": "Rafael"
}

This is exactly what you obtain if you manually store via the MongoDB shell with the following command:

> db.players.insert(
    {
        _id: 1,
        age: 26,
        birth: new ISODate("1986-06-03T15:43:37.763Z"),
        name: "Nadal",
        surname: "Rafael"
    }
)

Practically, there’s no difference in the result. You can’t tell if the document was generated by Hibernate OGM or inserted through the MongoDB shell. That’s great! Moreover, Hibernate OGM knows how to transform this result back into an instance of the POJO. That’s even greater! And you won’t feel any programmatic discomfort, since Hibernate OGM doesn’t require you to write any underlying MongoDB code. That’s the greatest!
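To make the entity-to-document correspondence concrete, the following sketch copies the properties of a player into a map with the same field names as the document above. It is only an illustration of the mapping, not the code Hibernate OGM runs internally; the class PlayerDocumentSketch and its toDocument method are hypothetical names.

```java
import java.util.Date;
import java.util.Map;
import java.util.TreeMap;

public class PlayerDocumentSketch {

    // Hypothetical helper: one map entry per entity property.
    // The identifier is stored under the reserved _id field, not under "id".
    public static Map<String, Object> toDocument(int id, String name, String surname,
                                                 int age, Date birth) {
        Map<String, Object> document = new TreeMap<>();
        document.put("_id", id);
        document.put("name", name);
        document.put("surname", surname);
        document.put("age", age);
        document.put("birth", birth);
        return document;
    }

    public static void main(String[] args) {
        // Same sample values as the document shown above; the date is arbitrary.
        Map<String, Object> document = toDocument(1, "Nadal", "Rafael", 26, new Date());
        System.out.println(document);
    }
}
```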

Storing Primary Keys

A MongoDB document or collection has a very flexible structure. It supports simple objects, the embedding of objects and arrays within other objects and arrays, different kinds of documents in the same collection, and more. It also contains a document field especially reserved for storing primary keys. This field is named _id, and its value can be any information as long as it’s unique. If you don’t set _id to anything, MongoDB sets the value automatically to a generated ObjectId.

Hibernate OGM recognizes these specifications when storing identifiers into a MongoDB database; it lets you use identifiers of any Java type, even composite identifiers, and it always stores them into the reserved _id field.

Figure 2-2 shows some identifiers of different Java types and how they look in MongoDB.

Figure 2-2. Correspondence between Java-style primary keys and MongoDB identifiers

Storing Associations

Probably the most powerful feature of relational databases relies on associations. Any database of any meaningful capability takes advantage of associations: one-to-one, one-to-many, many-to-one, and many-to-many. In the relational model, associations require storing additional information, known as navigation information for associations.

For example, in a bidirectional many-to-many association, the relational model usually uses three tables, two tables for data and an additional table, known as a junction table. The junction table holds a composite key that consists of the two foreign key fields that refer to the primary keys of both data tables (see Figure 2-3). Note that the same pair of foreign keys can only occur once.

Figure 2-3. A bidirectional many-to-many association, shown in a relational model representation

In a MongoDB many-to-many association, you store the junction table as a document. Hibernate OGM provides three solutions to accomplish this: IN_ENTITY, COLLECTION, and GLOBAL_COLLECTION. To better understand these strategies, let’s improvise a simple scenario: two relational tables (Players and Tournaments) populated respectively with three players and two tournaments, and linked by a many-to-many association, as shown in Figure 2-4. The first and second players, P1 and P2, participate in both tournaments, T1 and T2, and the third player, P3, participates only in the second tournament, T2. Or, from the other side of the association, the first tournament, T1, includes the first and second players, P1 and P2, and the second tournament, T2, includes the first, second, and third players, P1, P2, and P3.

Figure 2-4. A bidirectional many-to-many association in a relational model representation—test case

Now, let’s look at the Hibernate OGM strategies for storing associations, using this test case. We want to observe how the junction table is stored in MongoDB based on the selected strategy. We’ll begin with the default strategy, IN_ENTITY, and continue with GLOBAL_COLLECTION, and finally COLLECTION.

In JPA terms, the main ways to represent this relational model are: the Players entity defines a primary key field named idPlayers and is the owner of the association; the Tournaments entity defines a primary key named idTournaments and is the non-owner side of the association—it contains the mappedBy element. Moreover, the Players entity defines a Java collection of Tournaments, named tournaments, and the Tournaments entity defines a Java collection of Players, named players.

IN_ENTITY

The default strategy for storing navigation information for associations is named IN_ENTITY. In this case, Hibernate OGM stores the primary key of the other side of the association (the foreign key) into:

  • a field if the mapping concerns a single object.
  • an embedded collection if the mapping concerns a collection.

Running the relational scenario for MongoDB using the IN_ENTITY strategy reveals the results shown in Figure 2-5 and Figure 2-6.

Figure 2-5. Hibernate OGM-IN_ENTITY strategy result (Players collection)

Figure 2-6. Hibernate OGM-IN_ENTITY strategy result (Tournaments collection)

Figure 2-5 shows the MongoDB Players collection corresponding to the Players relational table; as you can see, each collection’s document contains part of the association as an embedded collection. (The Players collection contains the part of the junction table that references the Tournaments collection.)

Note   The simplest way to explore a MongoDB collection from the shell is to call the find method, which returns all documents from the specified collection. In addition, calling the pretty method results in the output being nicely formatted. When a collection contains more documents than fit in a shell window, you need to type the it command, which supports document pagination.

The Players collection shows three main documents with the _id set as 1, 2, and 3, and each document encapsulates the corresponding foreign keys in a field named like the Java collection declared by the owner side (tournaments). Each document in the embedded collection contains a foreign key value stored in a field whose name is composed of the Java collection name declared by the owner side (tournaments) concatenated with an underscore and the non-owner side primary key field name (idTournaments).

The Tournaments collection, which corresponds to the Tournaments relational table, is like a reflection of the Players collection: the Players primary keys become Tournaments foreign keys (the Tournaments collection contains the part of the junction table that references the Players collection). Figure 2-6 shows the contents of the Tournaments collection.

The Tournaments collection includes two main documents with the _id set as 1 and 2. Each one encapsulates the corresponding foreign keys in a field named like the Java collection declared by the non-owner side (players). Each document of the embedded collection contains a foreign key value stored in a field whose name is composed of the Java collection name declared by the non-owner side (players) concatenated with an underscore and the owner side primary key field name (idPlayers).

In the unidirectional case, only the collection representing the owner side will contain navigation information for the association.

You can use this strategy of storing navigation information for associations by setting the hibernate.ogm.mongodb.associations.store configuration property to the value IN_ENTITY. Actually, this is the default value of this property.
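The field-naming rules of the IN_ENTITY strategy can be sketched in a few lines of Java. The class InEntitySketch below is hypothetical and only rebuilds, as nested maps, the shape of an owner-side (Players) document as described above; Hibernate OGM produces the real documents for you.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class InEntitySketch {

    // Builds the navigation part of a Players document: the embedded collection is
    // named after the Java collection (tournaments), and each embedded document
    // holds one foreign key in the field tournaments_idTournaments.
    public static Map<String, Object> playerDocument(int playerId, int... tournamentIds) {
        List<Map<String, Object>> embedded = new ArrayList<>();
        for (int tournamentId : tournamentIds) {
            Map<String, Object> foreignKey = new LinkedHashMap<>();
            foreignKey.put("tournaments_idTournaments", tournamentId);
            embedded.add(foreignKey);
        }
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("_id", playerId);
        document.put("tournaments", embedded);
        return document;
    }

    public static void main(String[] args) {
        // From the test case: P1 plays in T1 and T2, P3 plays only in T2.
        System.out.println(playerDocument(1, 1, 2));
        System.out.println(playerDocument(3, 2));
    }
}
```

The entity's own properties are left out of the sketch to keep the focus on the navigation information.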

GLOBAL_COLLECTION

When you don’t want to store the navigation information for associations into an entity’s collections, you can choose the GLOBAL_COLLECTION strategy (or COLLECTION, as you’ll see in the next section). In this case, Hibernate OGM creates an extra collection named Associations, especially designed to store all navigation information. The documents of this collection have a particular structure composed of two parts. The first part contains a composite identifier, _id, made up of two fields whose values represent the primary key of the association owner and the name of the association table; the second part contains a field, named rows, which stores foreign keys in an embedded collection. For bidirectional associations, another document is created where the ids are reversed.

Running our relational scenario for MongoDB and the GLOBAL_COLLECTION strategy reveals the results shown in Figure 2-7 and Figure 2-8.

Figure 2-7. Hibernate OGM-GLOBAL_COLLECTION strategy result (Players and Tournaments collections)

Figure 2-8. Hibernate OGM-GLOBAL_COLLECTION strategy result (Associations collection)

In Figure 2-7, you can see that the Players and Tournaments collections contain only pure information, no navigation information.

The extra, unique collection that contains the navigation association is named Associations and is listed in Figure 2-8.

This is a bidirectional association. The owner side (Players) is mapped on the left side of Figure 2-8 and the non-owner side (Tournaments) is mapped on the right side of Figure 2-8. In a unidirectional association, only the owner side exists.

Now, focus on the nested document under the first _id field (Figure 2-8, left side). The first field name, players_idPlayers, is composed from the corresponding Java collection name defined in the non-owner side (players), or, for unidirectional associations, the collection name representing the owner side (Players), concatenated with an underscore and the name of the field representing the primary key of the owner side (idPlayers). The second field name is table; its value is composed of the collection name representing the owner side concatenated with an underscore and the collection name representing the non-owner side (Players_Tournaments). The rows nested collection contains one document per foreign key. Each foreign key is stored in a field whose name is composed of the corresponding Java collection name defined in the owner side (tournaments) concatenated with an underscore and the primary key field name of the non-owner side (idTournaments). As a consequence of bidirectionality, things get reversed, as shown on the right side of Figure 2-8.

You can use this strategy for storing navigation information for associations by setting the hibernate.ogm.mongodb.associations.store configuration property to the value GLOBAL_COLLECTION.
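As a rough illustration of the document structure just described, the sketch below assembles an owner-side document of the Associations collection from nested maps. The class GlobalCollectionSketch is a hypothetical name, and the code only mirrors the shape explained in the text; it is not how Hibernate OGM builds the document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GlobalCollectionSketch {

    // First part: composite _id with the owner primary key and the association table name.
    // Second part: the rows field, with one embedded document per foreign key.
    public static Map<String, Object> ownerSideDocument(int playerId, int... tournamentIds) {
        Map<String, Object> id = new LinkedHashMap<>();
        id.put("players_idPlayers", playerId);
        id.put("table", "Players_Tournaments");

        List<Map<String, Object>> rows = new ArrayList<>();
        for (int tournamentId : tournamentIds) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("tournaments_idTournaments", tournamentId);
            rows.add(row);
        }

        Map<String, Object> document = new LinkedHashMap<>();
        document.put("_id", id);
        document.put("rows", rows);
        return document;
    }

    public static void main(String[] args) {
        // Owner-side document for P1, who plays in T1 and T2.
        System.out.println(ownerSideDocument(1, 1, 2));
    }
}
```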

COLLECTION

If GLOBAL_COLLECTION stores all the navigation information in one global collection, the COLLECTION strategy is less global and creates one MongoDB collection per association. For example, in our scenario, there will be one extra collection named associations_Players_Tournaments. In this strategy, each collection is prefixed with the word associations followed by the name of the association table. Using this convention makes it easy to differentiate the associations collections from the other collections.

The documents of this collection have a particular structure composed of two parts. The first part contains the primary key of the association owner and the second part contains a field, named rows, which stores all foreign keys in an embedded collection. For each foreign key there’s a document in the embedded collection. For bidirectional cases, another document is created where the ids are reversed.

If you’re familiar with the relational model, this strategy should seem closer to your experience. In Figure 2-9, you can see the partial content of the associations_Players_Tournaments collection: the navigation information for the owner side (Players).

Figure 2-9. Hibernate OGM-COLLECTION strategy result (associations_Players_Tournaments collection)

You can easily see that the collection structure is the same as in the GLOBAL_COLLECTION case. The only difference is that the _id field no longer contains the association table name in a field named table, which is logical since the association table name is a part of the collection name (associations_Players_Tournaments).

You can use this strategy of storing navigation information for associations by setting the hibernate.ogm.mongodb.associations.store configuration property to the value COLLECTION.
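The two conventions that set the COLLECTION strategy apart, the collection name and the _id without the table field, can be sketched as follows. CollectionStrategySketch and its methods are hypothetical names used only to illustrate the structure described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CollectionStrategySketch {

    // One MongoDB collection per association: the prefix "associations",
    // an underscore, then the name of the association table.
    public static String collectionName(String associationTable) {
        return "associations_" + associationTable;
    }

    // Owner-side document: the _id holds only the owner primary key,
    // because the table name is already part of the collection name.
    public static Map<String, Object> ownerSideDocument(int playerId, int... tournamentIds) {
        Map<String, Object> id = new LinkedHashMap<>();
        id.put("players_idPlayers", playerId);

        List<Map<String, Object>> rows = new ArrayList<>();
        for (int tournamentId : tournamentIds) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("tournaments_idTournaments", tournamentId);
            rows.add(row);
        }

        Map<String, Object> document = new LinkedHashMap<>();
        document.put("_id", id);
        document.put("rows", rows);
        return document;
    }

    public static void main(String[] args) {
        System.out.println(collectionName("Players_Tournaments"));
        System.out.println(ownerSideDocument(2, 1, 2));
    }
}
```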

Note   Based on this example, you can easily intuit how the associations are represented in one-to-one, one-to-many, and many-to-one cases. Keep in mind that collections and field names can be altered by JPA annotations, like @Column, @Table, @JoinTable and so on. The example I presented doesn’t use such annotations.

From the JPA perspective, when a bidirectional association doesn’t define the owning side (using the mappedBy element), Hibernate OGM considers each side to be an individual association. In other words, you’ll obtain two associations instead of one in such cases. For example, the COLLECTION strategy will produce two collections for storing two associations.

Now, it’s up to you to decide which strategy better meets your needs.

Managing Transactions

Before switching from a relational model system to a NoSQL platform like MongoDB, it’s important to understand the differences between them, and the advantages and drawbacks of each in the context of your application needs. Knowing only that MongoDB doesn’t support SQL, while relational models don’t support collections and documents, can lead to serious problems in application implementation. This is actually the fundamental difference between the two, but there are many others, including the amount of space consumed and the time necessary to perform statements, caching, indexing, and, probably the most painful, managing transactions.

Many pioneer projects with MongoDB fail miserably when the developers realize that data transactional integrity is a must, because MongoDB doesn’t support transactions. MongoDB follows this directive: “write operations are atomic on the level of a single document: no single write operation can atomically affect more than one document or more than one collection.” It also provides the two-phase commit mechanism for simulating transactions over multiple documents. You’ll find more details at www.docs.mongodb.org/manual/tutorial/perform-two-phase-commits/. But both mechanisms omit the most powerful feature of transactional systems—the rollback operation.

Thus, if you need transactions, using MongoDB can be a delicate or even inappropriate choice. MongoDB is not an alternative to SQL as a “fashion” choice and should be used only if it satisfies your application needs better than an RDBMS. You should choose MongoDB when your database model doesn’t imply transactions or when you can shape your database model not to need transactions.

Hibernate OGM can’t provide the rollback facility, but it does diminish the transactions issue by queuing all changes before applying them during flush. For this reason, Hibernate OGM recommends using transaction demarcations to trigger the flush operation on commit.
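The idea of holding changes back until flush can be pictured with a small, self-contained sketch. This is a conceptual illustration only, not Hibernate OGM code: the class QueuedChangesSketch and its methods are hypothetical, and a plain list stands in for the datastore.

```java
import java.util.ArrayList;
import java.util.List;

public class QueuedChangesSketch {

    private final List<Runnable> pending = new ArrayList<>();

    // Record a change without touching the datastore yet.
    public void enqueue(Runnable change) {
        pending.add(change);
    }

    // Drop the queued changes; nothing was written, so nothing has to be undone.
    public void discard() {
        pending.clear();
    }

    // Apply all queued changes in order and return how many were applied.
    public int flush() {
        int applied = 0;
        for (Runnable change : pending) {
            change.run();
            applied++;
        }
        pending.clear();
        return applied;
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>(); // stand-in for the datastore
        QueuedChangesSketch unitOfWork = new QueuedChangesSketch();
        unitOfWork.enqueue(() -> store.add("player P1"));
        unitOfWork.enqueue(() -> store.add("tournament T1"));
        System.out.println("before flush: " + store.size());
        unitOfWork.flush(); // corresponds to the flush triggered on commit
        System.out.println("after flush: " + store.size());
    }
}
```

Queuing narrows the window in which a failure leaves partial data behind, but it is not a rollback: if a change fails in the middle of flush, the changes already applied stay in the store.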

Managing Queries

Hibernate OGM provides three solutions for executing queries against a MongoDB database:

  • Partial JP-QL support
  • Hibernate Search
  • Native MongoDB queries

Each of these will be discussed and demonstrated in Chapter 6.

Summary

Though this is a short chapter, it contains plenty of information. I presented the rules that govern the relationship between Hibernate OGM and MongoDB. You saw how to configure MongoDB from Hibernate OGM and how data can be persisted in MongoDB according to the OGM implementation. In addition, I described the MongoDB view of transactions and finished with a quick enumeration of the query mechanism supported by Hibernate OGM.
