© Joseph B. Ottinger, Jeff Linwood and Dave Minter 2016

Joseph B. Ottinger, Jeff Linwood and Dave Minter, Beginning Hibernate, 10.1007/978-1-4842-2319-2_5

5. An Overview of Mapping

Joseph B. Ottinger (1), Jeff Linwood (2) and Dave Minter (3)

(1) Youngsville, North Carolina, USA

(2) Austin, Texas, USA

(3) London, UK

The purpose of Hibernate is to allow you to treat your database as if it stores Java objects. However, in practice, relational databases do not store objects – they store data in tables and columns. Unfortunately, there is no simple way to correlate the data stored in a relational database with the data represented by Java objects.¹

The difference between an object-oriented association and a relational one is fundamental. Consider a simple class to represent a user, and another to represent an email address, as shown in Figure 5-1.

Figure 5-1. An object-oriented association

User objects contain fields referring to Email objects. The association has a direction; given a User object, you can determine its associated Email object. For example, consider Listing 5-1.

Listing 5-1. Acquiring the Email Object from the User Object
User user = ...
Email email = user.email;

The reverse, however, is not true. The natural way to represent this relationship in the database, as illustrated in Figure 5-2, is superficially similar.

Figure 5-2. A relational association

Despite that similarity, the direction of the association is effectively reversed. Given an Email row, you can immediately determine which user row it belongs to in the database; this relationship is mandated by a foreign key constraint. It is possible to reverse the relationship in the database world through suitable use of SQL – another difference.

Given the differences between the two worlds, it is necessary to manually intervene to determine how your Java classes should be represented in database tables.
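The difference in direction can be sketched in plain Java. This is an illustration only, not Hibernate code: the class name DirectionDemo, the helper ownerOf, and the sample address are assumptions made up for the sketch. Going from a user to its email is a field access, whereas going from an email back to its user requires a search, because the Email object holds no reference to its owner.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: plain Java objects, no Hibernate involved.
class DirectionDemo {
    static class Email {
        final String address;

        Email(String address) {
            this.address = address;
        }
    }

    static class User {
        final String name;
        final Email email; // the association lives on the User side

        User(String name, Email email) {
            this.name = name;
            this.email = email;
        }
    }

    // Reverse navigation: Email has no field pointing back at User,
    // so every known user must be inspected.
    static User ownerOf(Email email, List<User> allUsers) {
        for (User candidate : allUsers) {
            if (candidate.email == email) {
                return candidate;
            }
        }
        return null; // no user refers to this email
    }

    public static void main(String[] args) {
        Email email = new Email("dcminter@example.com"); // made-up address
        User user = new User("dcminter", email);
        List<User> users = Arrays.asList(user);

        System.out.println(user.email.address);         // forward: one field access
        System.out.println(ownerOf(email, users).name); // reverse: a search
    }
}
```

In the relational picture the situation is mirrored: the Email row carries the foreign key, so the lookup that is cheap in Java is the one that needs a query in the database.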

Why Mapping Cannot Easily Be Automated

It is not immediately obvious why you cannot create simple rules for storing your Java objects in the database so that they can be easily retrieved. The most obvious rule would be that a Java class must correlate to a single table. For example, instances of the User class defined in Listing 5-2 could surely be represented by a simple table like the one for a user, shown in Figure 5-1.

Listing 5-2. A Simple User Class with a Password Field
public class User {
   String name;
   String password;
}

And indeed it could, but some questions present themselves:

  • How many rows should you end up with if you save a user twice?

  • Are you allowed to save a user without a name?

  • Are you allowed to save a user without a password?

When you start to think about classes that refer to other classes, there are additional questions to consider. Have a look at the Customer and Email classes defined in Listing 5-3.

Listing 5-3. Customer and Email Classes
public class Customer {
   int customerId;
   int customerReference;
   String name;
   Email email;
}


public class Email {
   String address;
}

Based on this, the following questions arise:

  • Is a unique customer identified by its customer ID, or its customer reference?

  • Can an email address be used by more than one customer?

  • Can a customer have more than one email ID?

  • Should the relationship be represented in the Customer table?

  • Should the relationship be represented in the Email table?

  • Should the relationship be represented in some third (link) table?

Depending upon the answers to these questions, your database tables could vary considerably. You could take a stab at a reasonable design, such as that given in Figure 5-3, based upon your intuition about likely scenarios in the real world.

Figure 5-3. Tables in which the customer is identified by customerId. Here, email address entities can be used only by a single customer, and the relationship is maintained by the Email table

The field names and foreign keys are important; without them, there’s no way to form any useful decision about these entities (see Listing 5-4). It would be a nearly impossible task to design an automated tool that could picture this structure wisely or appropriately.

Listing 5-4. A Class Identical in Structure to Listing 5-3, but with All Contextual Information Removed
public class Foo {
   int x;
   int y;
   String s;
   Bar bar;
}


public class Bar {
   String a;
}

Primary Keys

Most “relational” databases that provide SQL access are prepared to accept tables that have no predefined primary key. Hibernate is not so tolerant; even if your table has been created without a primary key, Hibernate will require you to specify one. This often seems perverse to users who are familiar with SQL and databases but not familiar with ORM tools. As such, let’s examine in more depth the problems that arise when there’s no primary key.

To begin, without a primary key it is impossible to uniquely identify a row in a table. For example, consider Table 5-1.

Table 5-1. A Table in Which the Rows Cannot Be Uniquely Identified

User       Age
dminter    35
dminter    40
dminter    55
dminter    40
jlinwood   57

This table clearly contains information about users and their respective ages. However, there are four users with the same name (Dave Minter, Denise Minter, Daniel Minter, and Dashiel Minter). There is probably a way of distinguishing them somewhere else in the system – perhaps by email address or user number. But if, for example, you want to know the age of Dashiel Minter with user ID 32, there is no way to obtain it from Table 5-1.

While Hibernate will not let you omit the primary key, it will permit you to form the primary key from a collection of columns. For example, Table 5-2 could be keyed by Usernumber and User.

Table 5-2. A Table in Which the Rows Can Be Uniquely Identified

User       Usernumber   Age
dminter    1            35
dminter    2            40
dminter    3            55
dminter    32           42
jlinwood   1            57

Neither User nor Usernumber contains unique entries, but in combination they uniquely identify the age of a particular user, and so they are acceptable to Hibernate as a primary key.
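A minimal sketch of the same idea in plain Java follows. It models Table 5-2 as an in-memory map whose key is the pair (User, Usernumber); the class names CompoundKeyDemo and UserKey are hypothetical and exist only for this illustration. Neither part of the key is unique, but the pair is, so a lookup always yields exactly one age.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustrative sketch only: models Table 5-2 in memory, no Hibernate involved.
class CompoundKeyDemo {
    // Compound key: neither field is unique alone, but the pair is.
    static final class UserKey {
        final String user;
        final int userNumber;

        UserKey(String user, int userNumber) {
            this.user = user;
            this.userNumber = userNumber;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof UserKey)) {
                return false;
            }
            UserKey that = (UserKey) other;
            return user.equals(that.user) && userNumber == that.userNumber;
        }

        @Override
        public int hashCode() {
            return Objects.hash(user, userNumber);
        }
    }

    // Builds the rows of Table 5-2, keyed by (User, Usernumber).
    static Map<UserKey, Integer> table() {
        Map<UserKey, Integer> ages = new HashMap<>();
        ages.put(new UserKey("dminter", 1), 35);
        ages.put(new UserKey("dminter", 2), 40);
        ages.put(new UserKey("dminter", 3), 55);
        ages.put(new UserKey("dminter", 32), 42);
        ages.put(new UserKey("jlinwood", 1), 57);
        return ages;
    }

    public static void main(String[] args) {
        // The pair identifies exactly one row, so the lookup is unambiguous.
        System.out.println(table().get(new UserKey("dminter", 32))); // 42
    }
}
```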

Why does Hibernate need to uniquely identify entries when SQL doesn’t? Because Hibernate is representing Java objects, which are always uniquely identifiable. The classic mistake made by new Java developers is to compare strings using the == operator instead of the equals() method. The mistake is possible precisely because Java lets you distinguish between two references to distinct String objects that represent the same text and two references to the same String object.² SQL has no such obligation, and there are arguably cases in which it is desirable to give up the ability to make the distinction.
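The distinction between identity and equivalence can be seen in a few lines of plain Java. The class and method names here (StringIdentityDemo, sameObject, sameText) are invented for the sketch.

```java
// Illustrative sketch: identity (==) versus equivalence (equals) for String.
class StringIdentityDemo {
    // True when both references point at the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    // True when both strings contain the same text.
    static boolean sameText(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String first = new String("dcminter");
        String second = new String("dcminter"); // same text, distinct object

        System.out.println(sameObject(first, second)); // false
        System.out.println(sameText(first, second));   // true
        System.out.println(sameObject(first, first));  // true
    }
}
```

Every Java object carries this kind of identity for free; a table row does not, and the primary key is what supplies it.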

For example, if Hibernate could not uniquely identify an object with a primary key, then the following code could have several possible outcomes in the underlying table.

// getCustomerFromHibernate and saveCustomerToHibernate are placeholder helpers
Customer customer = getCustomerFromHibernate("dcminter");
customer.setAge(10);
saveCustomerToHibernate(customer);

Let’s say the table originally contained the data shown in Table 5-3.

Table 5-3. Updating an Ambiguous Table

User       Age
dcminter   30
dcminter   42

Which of the following should be contained in the resulting table?

  • A single row for the user dcminter, with the age set to 10

  • Two rows for the user, with both ages set to 10

  • Two rows for the user, with one age set to 10 and the other to 42

  • Two rows for the user, with one age set to 10 and the other to 30

  • Three rows for the user, with one age set to 10 and the others to 30 and 42

In short, the Hibernate developers made a decision to enforce the use of primary keys when creating mappings so that this problem does not arise. Hibernate does provide facilities that will allow you to work around this if it is absolutely necessary (you can create views or stored procedures to “fake” the appropriate key, or you can use conventional JDBC to access the table data), but when using Hibernate, it is always more desirable to work with tables that have correctly specified primary keys, if at all possible.
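The ambiguity can be made concrete with a small in-memory stand-in for Table 5-3. This is a sketch under stated assumptions, not a description of what Hibernate does: the class AmbiguousUpdateDemo, its Row type, and the id column are hypothetical. The keyless update shown here picks just one of the possible outcomes listed above (both rows set to 10); the point is that nothing in the data says which outcome is the right one.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: an in-memory stand-in for Table 5-3.
class AmbiguousUpdateDemo {
    static class Row {
        final int id; // hypothetical primary key, ignored by the keyless update
        final String user;
        int age;

        Row(int id, String user, int age) {
            this.id = id;
            this.user = user;
            this.age = age;
        }
    }

    static List<Row> table() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(1, "dcminter", 30));
        rows.add(new Row(2, "dcminter", 42));
        return rows;
    }

    // No key: the user name is the only handle on a row, so every match changes.
    static int updateByName(List<Row> rows, String user, int newAge) {
        int changed = 0;
        for (Row row : rows) {
            if (row.user.equals(user)) {
                row.age = newAge;
                changed++;
            }
        }
        return changed;
    }

    // With a primary key, exactly the intended row is updated.
    static int updateById(List<Row> rows, int id, int newAge) {
        int changed = 0;
        for (Row row : rows) {
            if (row.id == id) {
                row.age = newAge;
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        System.out.println(updateByName(table(), "dcminter", 10)); // 2
        System.out.println(updateById(table(), 2, 10));            // 1
    }
}
```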

Lazy Loading

When you load classes into memory from the database, you don’t necessarily want all the information to actually be loaded. To take an extreme example, loading a list of emails should not cause the full body text and attachments of every email to be loaded into memory. First, they might demand more memory than is actually available. Second, even if they fit, it would probably take a long time for all of this information to be obtained. (Remember, data normally goes from a database process to an application over the network, even if the database and application are on the same physical machine.)

If you were to tackle this problem in SQL, you would probably select a subset of the appropriate fields for the query to obtain the list, or limit the range of the data. Here’s an example of selecting a subset of data (from and to are reserved words in SQL, and date is a type name, so all three are written as quoted identifiers here):

SELECT "from", "to", "date", subject FROM email WHERE username = 'dcminter';

Hibernate will allow you to fashion queries that are rather similar to this, but it also offers a more flexible approach, known as lazy loading. Certain relationships can be marked as being “lazy,” and they will not be loaded from the database until they are actually required.

The default in Hibernate is that classes (including collections like Set and Map) should be lazily loaded. For example, when an instance of the User class given in the next listing is loaded from the database, the only fields initialized will be userId and username.³

import java.util.Set;

public class User {
   int userId;
   String username;
   EmailAddress emailAddress;
   Set<Role> roles;
}

With this definition, the appropriate objects for emailAddress and roles will be loaded from the database if and when they are accessed, provided the session is still active.

This is the default behavior only; mappings can be used to specify which classes and fields should behave in this way.
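The idea behind lazy loading can be sketched with a small holder that defers work until first access. This is a simplified model, not how Hibernate implements it: the Lazy class, the loadEmailAddress stand-in, and the queriesRun counter are all assumptions invented for the illustration, and the sketch does not model the requirement that the session still be active.

```java
import java.util.function.Supplier;

// Illustrative sketch only: this is not how Hibernate implements lazy loading.
class LazyDemo {
    // Defers a (hypothetical) database read until the value is first requested.
    static class Lazy<T> {
        private final Supplier<T> loader;
        private T value;
        private boolean loaded;

        Lazy(Supplier<T> loader) {
            this.loader = loader;
        }

        T get() {
            if (!loaded) {
                value = loader.get(); // the "query" runs here, once
                loaded = true;
            }
            return value;
        }

        boolean isLoaded() {
            return loaded;
        }
    }

    static int queriesRun = 0;

    // Stand-in for a SELECT against the email table.
    static String loadEmailAddress() {
        queriesRun++;
        return "dcminter@example.com"; // made-up address
    }

    public static void main(String[] args) {
        Lazy<String> emailAddress = new Lazy<>(LazyDemo::loadEmailAddress);
        System.out.println(emailAddress.isLoaded()); // false: nothing fetched yet
        System.out.println(emailAddress.get());      // first access triggers the load
        System.out.println(queriesRun);              // 1
    }
}
```

A second call to get() returns the cached value without running the loader again, which is the behavior that makes listing many users cheap when their email addresses are never touched.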

Associations

When we looked at why the mapping process could not be automated, we discussed the following example classes:

public class Customer {
   int customerId;
   int customerReference;
   String name;
   Email email;
}


public class Email {
   String address;
}

We also gave the following five questions that it raised:

  • Is a unique customer identified by its customer ID, or its customer reference?

  • Can a given email address be used by more than one customer?

  • Should the relationship be represented in the Customer table?

  • Should the relationship be represented in the Email table?

  • Should the relationship be represented in some third (link) table?

The first question can be answered simply; it depends on what column you specify as the primary key. The remaining four questions are related, and their answers depend on the object relationships. Furthermore, if your Customer class represents the relationship with Email using a Collection class or an array, it would be possible for a customer to have multiple email addresses.

import java.util.Set;

public class Customer {
   int customerId;
   int customerReference;
   String name;
   Set<Email> email;
}

So, you should add another question: Can a customer have more than one email address? The set could contain a single entry, so you can’t automatically infer that this is the case.

The key questions from the previous options are as follows:

  • Q1: Can an email address belong to more than one customer?

  • Q2: Can a customer have more than one email address?

The answers to these questions can be formed into a truth table, as shown in Table 5-4.

Table 5-4. Deciding the Cardinality of an Entity Relationship

Q1 Answer   Q2 Answer   Relationship Between Customer and Email
No          No          One-to-one
Yes         No          Many-to-one
No          Yes         One-to-many
Yes         Yes         Many-to-many

These are the four ways in which the cardinality⁴ of the relationship between the objects can be expressed. Each relationship can then be represented within the mapping table(s) in various ways.
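Table 5-4 can be written as a small function of the two answers. The class and method names (CardinalityDemo, cardinality) are hypothetical; the mapping from answers to relationship names is taken directly from the table.

```java
// Illustrative sketch of Table 5-4 as a function.
class CardinalityDemo {
    // q1: can an email address belong to more than one customer?
    // q2: can a customer have more than one email address?
    static String cardinality(boolean q1, boolean q2) {
        if (q1 && q2) {
            return "Many-to-many";
        }
        if (q1) {
            return "Many-to-one";
        }
        if (q2) {
            return "One-to-many";
        }
        return "One-to-one";
    }

    public static void main(String[] args) {
        System.out.println(cardinality(false, false)); // One-to-one
        System.out.println(cardinality(true, false));  // Many-to-one
        System.out.println(cardinality(false, true));  // One-to-many
        System.out.println(cardinality(true, true));   // Many-to-many
    }
}
```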

The One-to-One Association

A one-to-one association between classes can be represented in a variety of ways. At its simplest, the properties of both classes are maintained in the same table. For example, a one-to-one association between a User and an Email class might be represented as a single table, as in Table 5-5.

Table 5-5. A Combined User/Email Table

ID   Username    Email
1    dcminter    [email protected]
2    jlinwood    [email protected]
3    tjkitchen   [email protected]

The single database entity representing this combination of a User and an Email class is shown in Figure 5-4.

Figure 5-4. A single entity representing a one-to-one relationship

Alternatively, the entities can be maintained in distinct tables with identical primary keys, or with a key maintained from one of the entities into the other, as in Tables 5-6 and 5-7.

Table 5-6. The User Table

ID   Username
1    dcminter
2    jlinwood
3    tjkitchen

Table 5-7. The Email Table

ID   Email
1    [email protected]
2    [email protected]
3    [email protected]

It is possible to create a mandatory foreign key relationship from one of the entities to the other, but this should not be applied in both directions because a circular dependency would then be created. It is also possible to omit the foreign key relationships entirely (as shown in Figure 5-5) and rely on Hibernate to manage the key selection and assignment.

Figure 5-5. Entities related by primary keys

If it is not appropriate for the tables to share primary keys, then a foreign key relationship between the two tables can be maintained, with a “unique” constraint applied to the foreign key column. For example, reusing the User table from Table 5-6, the Email table can be suitably populated, as shown in Table 5-8.

Table 5-8. An Email Table with a Foreign Key to the User Table

ID   Email               UserID (Unique)
34   [email protected]   1
35   [email protected]   2
36   [email protected]   3

This has the advantage that the association can easily be changed from one-to-one to many-to-one by removing the unique constraint. Figure 5-6 shows this type of relationship.

Figure 5-6. Entities related by a foreign key relationship
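The effect of the unique constraint on the foreign key column can be sketched with an in-memory stand-in for Table 5-8. The class names (UniqueForeignKeyDemo, EmailTable) and the insert method are assumptions made for this illustration; a real database would enforce the constraint itself and report a violation rather than return false.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: an in-memory stand-in for Table 5-8.
class UniqueForeignKeyDemo {
    static class EmailTable {
        private final boolean uniqueUserId;
        // Maps the email row's ID (primary key) to its UserID (foreign key).
        private final Map<Integer, Integer> userIdByEmailId = new HashMap<>();

        EmailTable(boolean uniqueUserId) {
            this.uniqueUserId = uniqueUserId;
        }

        // Inserts an email row; returns false if a constraint would be violated.
        boolean insert(int emailId, int userId) {
            if (userIdByEmailId.containsKey(emailId)) {
                return false; // primary key already in use
            }
            if (uniqueUserId && userIdByEmailId.containsValue(userId)) {
                return false; // unique constraint: this user already has an email
            }
            userIdByEmailId.put(emailId, userId);
            return true;
        }
    }

    public static void main(String[] args) {
        EmailTable oneToOne = new EmailTable(true);
        System.out.println(oneToOne.insert(34, 1)); // true
        System.out.println(oneToOne.insert(35, 1)); // false: user 1 is taken

        EmailTable manyToOne = new EmailTable(false);
        System.out.println(manyToOne.insert(34, 1)); // true
        System.out.println(manyToOne.insert(35, 1)); // true: constraint removed
    }
}
```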

The One-to-Many and Many-to-One Association

A one-to-many association (or from the perspective of the other class, a many-to-one association) can most simply be represented by the use of a foreign key, with no additional constraints.

The relationship can also be maintained by the use of a link table. The link table holds a foreign key into each of the associated tables, and these two foreign keys together form the compound primary key of the link table. An example of this is shown in Tables 5-9, 5-10, and 5-11.

Table 5-9. A Simple User Table

ID   Username
1    dcminter
2    jlinwood

Table 5-10. A Simple Email Table

ID   Email
1    [email protected]
2    [email protected]
3    [email protected]
4    [email protected]

Table 5-11. A Link Table Joining User and Email in a One-to-Many Relationship

UserID   EmailID
1        1
1        2
2        3
2        4

Additional columns can be added to the link table to maintain information on the ordering of the entities in the association.

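A unique constraint must be applied to enforce the “one” side of the relationship in the UserEmailLink table of Figure 5-7. With the data in Table 5-11, that means the EmailID column must be unique, so that each email is linked to at most one user (a unique constraint on UserID would instead forbid a user from having several addresses). Otherwise, the link table can represent the set of all possible relationships between User and Email entities, which is a many-to-many set association.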

Figure 5-7. A relationship represented by a link table (duplicates are not permitted because of the use of a compound primary key)
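A link table with a compound primary key can be sketched as a set of (UserID, EmailID) pairs. The names LinkTableDemo and LinkTable are hypothetical. In this sketch the optional unique constraint sits on the EmailID column, because with the data of Table 5-11 it is the email that may belong to only one user; that placement is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: an in-memory stand-in for the link table.
class LinkTableDemo {
    static class LinkTable {
        private final boolean uniqueEmailId;
        // Each row is the pair (userId, emailId); the pair is the compound primary key.
        private final Set<List<Integer>> rows = new HashSet<>();
        private final Set<Integer> linkedEmailIds = new HashSet<>();

        LinkTable(boolean uniqueEmailId) {
            this.uniqueEmailId = uniqueEmailId;
        }

        // Returns false when the primary key or the unique constraint rejects the row.
        boolean insert(int userId, int emailId) {
            if (uniqueEmailId && linkedEmailIds.contains(emailId)) {
                return false; // this email is already linked to a user
            }
            boolean added = rows.add(Arrays.asList(userId, emailId));
            if (added) {
                linkedEmailIds.add(emailId);
            }
            return added; // false when the same pair already exists
        }
    }

    public static void main(String[] args) {
        LinkTable oneToMany = new LinkTable(true);
        System.out.println(oneToMany.insert(1, 1)); // true
        System.out.println(oneToMany.insert(1, 2)); // true: one user, two emails
        System.out.println(oneToMany.insert(2, 1)); // false: email 1 already linked

        LinkTable manyToMany = new LinkTable(false);
        System.out.println(manyToMany.insert(1, 1)); // true
        System.out.println(manyToMany.insert(2, 1)); // true: email shared by two users
        System.out.println(manyToMany.insert(2, 1)); // false: duplicate compound key
    }
}
```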

The Many-to-Many Association

As noted at the end of the previous section, if a unique constraint is not applied to the “one” end of the relationship when using a link table, it becomes a limited sort of many-to-many relationship. All of the possible combinations of User and Email can be represented, but it is not possible for the same user to have the same email address entity associated twice, because that would require the compound primary key to be duplicated.

If instead of using the foreign keys together as a compound primary key, we give the link table its own primary key (usually a surrogate key), the association between the two entities can be transformed into a full many-to-many relationship, as shown in Table 5-12.

Table 5-12. A Many-to-Many User/Email Link Table

ID   UserID   EmailID
1    1        1
2    1        2
3    1        3
4    1        4
5    2        1
6    2        2

Table 5-12 might describe a situation in which the user dcminter receives all email sent to any of the four addresses, whereas jlinwood receives only email sent to his own accounts.

When the link table has its own independent primary key, as with the association shown in Figure 5-8, thought should be given to the possibility that a new class needs to be created to represent the contents of the link table as an entity in its own right.

Figure 5-8. A many-to-many relationship represented by a link table (duplicates are permitted because of the use of a surrogate key)
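The effect of the surrogate key can be sketched in the same in-memory style. The names SurrogateKeyLinkDemo and LinkTable are invented for the illustration. Because each row is identified by its own generated ID, the same (UserID, EmailID) pair can be stored more than once.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: a link table keyed by a surrogate ID.
class SurrogateKeyLinkDemo {
    static class LinkTable {
        private int nextId = 1;
        // Surrogate ID -> {userId, emailId}
        private final Map<Integer, int[]> rows = new LinkedHashMap<>();

        // Every insert succeeds and receives a fresh surrogate ID.
        int insert(int userId, int emailId) {
            int id = nextId++;
            rows.put(id, new int[] {userId, emailId});
            return id;
        }

        // Counts how many rows link the given user to the given email.
        int count(int userId, int emailId) {
            int matches = 0;
            for (int[] row : rows.values()) {
                if (row[0] == userId && row[1] == emailId) {
                    matches++;
                }
            }
            return matches;
        }
    }

    public static void main(String[] args) {
        LinkTable links = new LinkTable();
        System.out.println(links.insert(1, 1)); // 1
        System.out.println(links.insert(1, 1)); // 2: the duplicate pair is accepted
        System.out.println(links.count(1, 1));  // 2
    }
}
```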

Applying Mappings to Associations

The mappings are applied to express the various ways of forming associations in the underlying tables; there is no absolutely correct way to represent them.⁵

In addition to the basic choice of approach to take, the mappings are used to specify the minutiae of the tables’ representations. While Hibernate tends to use sensible default values when possible, it is often desirable to override these. For example, the foreign key names generated automatically by Hibernate will be effectively random, whereas an informed developer can apply a name (e.g., FK_USER_EMAIL_LINK) to aid in the debugging of constraint violations at run time.

Other Supported Features

While Hibernate can determine a lot of sensible default values for the mappings, most of these can be overridden by one or both of the annotation- and XML-based⁶ approaches. Some apply directly to mapping; others, such as the foreign key names, are really only pertinent when the mapping is used to create the database schema. Lastly, some mappings can also provide a place to configure some features that are perhaps not “mappings” in the purest sense.

The final sections of this chapter discuss the features that Hibernate supports in addition to those already mentioned.

Specification of (Database) Column Types and Sizes

Java provides the primitive types and allows user declaration of interfaces and classes to extend these. Relational databases generally provide a small subset of “standard” types and then offer additional proprietary types.

Restricting yourself to the standard types will still cause problems, as there are only approximate correspondences between these and the Java primitive types.

A typical example of a problematic type is java.lang.String (treated by Hibernate as if it were a primitive type since it is used so frequently), which by default will be mapped to a fixed-size character data database type. Typically, the database would perform poorly if a character field of unlimited size was chosen, but lengthy String fields will be truncated as they are persisted into the database. In most databases, you would choose to represent a lengthy String field as a TEXT, CLOB, or long VARCHAR type (assuming the database supports the specific type). This is one of the reasons why Hibernate can’t do all of the mapping for you and why you still need to understand some database fundamentals when you create an application that uses ORM.

By specifying mapping details, the developer can make appropriate trade-offs among storage space, performance, and fidelity to the original Java representation.
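The trade-off can be illustrated with a tiny model of a fixed-size character column. The names ColumnSizeDemo, fits, and storeTruncating, and the column length used, are assumptions for the sketch. It models the truncation described above; how a particular database actually reacts to an oversized value may differ, and some reject the write instead.

```java
// Illustrative sketch only: models a fixed-size character column in memory.
class ColumnSizeDemo {
    // True when the value fits a column declared with the given length.
    static boolean fits(String value, int columnLength) {
        return value.length() <= columnLength;
    }

    // Models truncation on write: only the first columnLength
    // characters survive.
    static String storeTruncating(String value, int columnLength) {
        return fits(value, columnLength) ? value : value.substring(0, columnLength);
    }

    public static void main(String[] args) {
        String body = "A message body that is longer than the column allows";
        int columnLength = 16; // hypothetical size for the sketch

        System.out.println(fits(body, columnLength));
        System.out.println(storeTruncating(body, columnLength));
    }
}
```

A check like fits() is the kind of question the mapping answers up front: if values are expected to exceed the column size, a larger or unbounded type should be specified instead.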

The Mapping of Inheritance Relationships to the Database

There is no SQL standard for representing inheritance relationships for the data in tables; and while some database implementations provide a proprietary syntax for this, not all do. Hibernate offers several configurable ways in which to represent inheritance relationships, and the mapping permits users to select a suitable approach for their model.

Primary Key

As stated earlier in this chapter (in the section entitled “Primary Keys,” of all things), Hibernate demands that a primary key be used to identify entities. The choice of a surrogate key, a key chosen from the business data, and/or a compound primary key can be made via configuration.

When a surrogate key is used, Hibernate also permits the key-generation technique to be selected from a range of techniques that vary in portability and efficiency. (This was shown in Chapter 4, in “Identifiers.”)

The Use of SQL Formula–Based Properties

It is sometimes desirable that a property of an entity be maintained not as data directly stored in the database but, rather, as a function performed on that data – for example, a subtotal field should not be managed directly by the Java logic, but instead be maintained as an aggregate function of some other property.
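The idea of a derived property can be sketched in plain Java. This is only an analogy: with a formula-based property the calculation would be expressed in SQL and evaluated by the database, whereas here it is an ordinary method. The names DerivedPropertyDemo, Order, LineItem, and subtotalCents are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: a value derived from other data rather than stored.
class DerivedPropertyDemo {
    static class LineItem {
        final int quantity;
        final long unitPriceCents;

        LineItem(int quantity, long unitPriceCents) {
            this.quantity = quantity;
            this.unitPriceCents = unitPriceCents;
        }
    }

    static class Order {
        final List<LineItem> items;

        Order(List<LineItem> items) {
            this.items = items;
        }

        // No stored subtotal field: the value is an aggregate of the line items,
        // so it cannot drift out of step with them.
        long subtotalCents() {
            long total = 0;
            for (LineItem item : items) {
                total += item.quantity * item.unitPriceCents;
            }
            return total;
        }
    }

    public static void main(String[] args) {
        Order order = new Order(Arrays.asList(new LineItem(2, 500), new LineItem(1, 250)));
        System.out.println(order.subtotalCents()); // 1250
    }
}
```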

Mandatory and Unique Constraints

As well as the implicit constraints of a primary or foreign key relationship, you can specify that a field must not be duplicated – for example, a username field should often be unique.

Fields can also be made mandatory – for example, requiring a message entity to have both a subject and message text.

The generated database schema will contain corresponding NOT NULL and UNIQUE constraints so that it is very, very difficult to corrupt the table with invalid data (rather, the application logic will throw an exception if any attempt to do so is made).

Note that primary keys are implicitly both mandatory and unique.
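The behavior of mandatory and unique constraints can be sketched with an in-memory table that rejects invalid rows by throwing an exception. The names ConstraintDemo, UserTable, and rejects, and the choice of exception types, are assumptions for the illustration; in a real application the database enforces the constraints and the persistence layer reports the violation.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only: NOT NULL and UNIQUE modeled in memory.
class ConstraintDemo {
    static class UserTable {
        private final Set<String> usernames = new HashSet<>();

        // username is mandatory (NOT NULL) and must not repeat (UNIQUE).
        void insert(String username) {
            if (username == null) {
                throw new IllegalArgumentException("username must not be null");
            }
            if (!usernames.add(username)) {
                throw new IllegalStateException("duplicate username: " + username);
            }
        }

        int size() {
            return usernames.size();
        }
    }

    // Returns true when the table refuses the row.
    static boolean rejects(UserTable table, String username) {
        try {
            table.insert(username);
            return false;
        } catch (RuntimeException violation) {
            return true;
        }
    }

    public static void main(String[] args) {
        UserTable users = new UserTable();
        System.out.println(rejects(users, "dcminter")); // false: first insert is fine
        System.out.println(rejects(users, "dcminter")); // true: violates UNIQUE
        System.out.println(rejects(users, null));       // true: violates NOT NULL
    }
}
```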

Summary

This chapter has given you an overview of the reason why mappings are needed, and what features they support beyond these absolute requirements. It has discussed the various types of associations and the circumstances under which you would choose to use them.

The next chapter looks at how mappings are specified.

Footnotes

1 If there were, books like this one probably wouldn’t exist.

2 When comparing objects for equivalence, use equals(). It’s like comparing two doorknobs: are the two like each other, or are they the same doorknob? The equals() method checks to see if they’re alike. The == operator checks to see if they’re the same.

3 There are conditions for this. In most of the examples we’ve seen, where the majority of columns are accessed directly via their attribute references, lazy loading is very much the norm regardless of type. When in doubt, specify and test.

4 Cardinality refers to numbering, so cardinality in relationships indicates how many of each participant is being referred to by either side of the relationship.

5 This is why otherwise fantastic tools like Hibernate haven’t replaced good database analysts.

6 We haven’t actually spent much time in XML configuration for a very good reason: most people don’t use it in the real world unless they absolutely have to. It’s also excessively verbose, especially compared to the annotations.
