What Types of Database Models Can Be Created Using the UML?

Traditionally, there are three types of database design models:

  1. Conceptual

  2. Logical

  3. Physical

All these can be modeled and designed using the UML. Not only can you create the traditional database models using the UML, but you can also create additional ones that once might have only been created on white boards or in the minds of the people responsible for the database. These models include how the database will be deployed, how applications will interact with the database, and even what hardware will be used and how the database server software will be deployed on that hardware.

As we continue with this chapter, we will focus on how to use the UML to create database designs and what elements can be used for them. Because this is just one chapter and not really a complete treatment of the subject, some details will be left out. For full details, see our book UML for Database Design.

Conceptual Models

A conceptual model is a very high-level model of a database. It typically models high-level domain entities and their basic relationships with other major domain entities. Its main purpose is to define the scope of the database while gaining an understanding of the data that needs to be captured. This model should be technology-independent.

Business Analysis Model

The conceptual model is first determined in the business analysis model. If your team is using UML for analysis and design, they most likely already have created business analysis models that can be leveraged and altered to define the conceptual data model. Even if business analysis models haven't already been created by others earlier in the process, they can be built now to define the conceptual data model.

A business entity is shown on a diagram as a circle with a line underneath and a line slashed through it (see Figure 6-7). The line at the bottom means it is an entity, and the slash shows that it is a business model. An advantage of using UML over traditional data modeling notations is that you can have different models that represent the different levels (i.e., business, analysis, implementation) of the model, rather than just different views of the same model.

Figure 6-7. Business entity.


The business entity itself in UML is a class with the stereotype of <<business entity>>. The icon that we described in the previous paragraph is associated with a business entity and is tied directly to that stereotype. If you are using a UML tool, it will in most cases have the business entity stereotype available for your use. If not, you can just create it and use it from then on. This is another value of the UML. Visually, the business entity looks different, whether it is by showing the textual version of the class with its stereotype or by seeing the icon that is associated with a particular stereotype, as shown in Figure 6-8.

Figure 6-8. Business entity stereotype.


The business model is used as the conceptual model because the business entities represent the business view of the data model. Not to be confused with a business process model, which demonstrates how the business works using workflows and business rules, the business analysis model provides a view of the data that needs to be captured at the highest level.

Defining a Conceptual Model

To define a conceptual model, begin by figuring out the data that you want to capture. You must look at it from the highest level without trying to clarify any of the entities' properties but their names, often expressed in a natural language. While defining the entities, you will also begin to design relationships between these entities using the UML associations. Figure 6-9 provides an example of a conceptual data model that describes a part of a customer order system.

Figure 6-9. Conceptual model.


By using the UML to define the conceptual model, both the UML and the conceptual models created can be used by more than just one subgroup involved in development. The data modelers certainly use it as their conceptual data model, but the other teams can take advantage of it as well. For the analysts, it defines the business objects in the organization whose problem they are trying to solve with the software that is being built. It also helps to work with customers (either internal or external) to show them what you are building and to ensure that you capture all the information they deem necessary. Because it is at the conceptual level, they don't get caught up or even lost in the technical details. They get a high-level understanding of what is being designed. The software architects can also take advantage of this conceptual model to jump-start the application architecture they need to define. As the application needs to create, read, update, and delete (CRUD) information within the database, it is natural that the application and the database are conceptualized using the same information and process flow. As we described in the earlier real-world example “Taking Off with a Common Understanding,” ensuring that the overall development team is starting with a common definition will go a long way toward success.

Watch Out—Trying to Please Everyone

Although we emphasize the value of keeping the entire team involved in creating the conceptual model, you also need to ensure that you're not trying to create something that is all things to all parties involved. The conceptual data model has a purpose, and you don't want to drift from that purpose by trying to follow a process that satisfies everyone. The intention from the beginning should be to qualify the requirements. True, the elements that are being defined as database entities should also be the same entities, at least conceptually, that you need to capture for the application. That being said, though, you don't want to create a model that ends up having no meaning because you were trying to have meanings for everyone.

The point is to make sure that you scope the work. It is great to work together in defining a common conceptual model as the airline did, but they were also smart about it and defined a vision for what is expected from the conceptual model as well as rules regarding its intent. When you know your boundaries, it is much easier to ensure that your conceptual model doesn't become just a dumping ground for all business-level entities that might have nothing to do with the project or database.


Logical Models

Logical modeling is used to design the application or, in this case, the database, at a technology-independent level. The logical data model generally uses non-technical names for entities, much like the business entities in the conceptual model, and does not at this point concern itself with how the database will be implemented. The logical model is used to communicate the designs at a level that the implementers of the database can use to transform that model into a high-performing database specific to the platform they use, no matter what target database management system (DBMS) is used.

Database design tools often provide capabilities to transform the conceptual model into the logical model, maintaining mappings of the entities as they evolve through time to ensure that the requirement's intent wasn't changed in implementation. Using UML or any modeling language that contains a common meta-model for objects, you can provide linkage or traceability from one entity to another, even though it might morph into something else over time.

Deep Dive—Moving from Conceptual to Logical

Morphing conceptual model entities into logical ones might be as straightforward as using basically the same entity with the same name and properties, or it could become a number of different entities all mapped back to that same business entity in the conceptual model. For example, you might have a business entity called “customer,” and that is fine for the conceptual model. But when you start to define the logical model for database consistency, you might determine that you need to capture data for several different types of customers with different data. Because of this, you might decide that you will have three different logical entities: retail customer, wholesale customer, and online customer. The three are mapped back to that one conceptual model element called customer, but there is also great value in knowing that mapping. When reviewing the conceptual and logical models, you can ensure consistency where needed and work with your client (as well as the other teams involved) to be sure they are in agreement with the different types of customers.


A logical model will also start to fill in some of the pieces that were not important to the conceptual model. One of the major additions to the logical model when compared to the conceptual one is attributes. Attributes define the properties of an entity in greater detail. Keeping with the customer example, attributes of a customer could include name, address, phone number, and more. Attributes also have a type associated with them for further description. A type describes an attribute—for example, the attribute name is described as a “text” type, meaning that data captured for a name must be in text format. Databases support specific datatypes, which we will talk about in the “Physical Modelings” section of this chapter. For now, in the logical model, we will stay with more generic types.

Class Diagrams

You might have already figured it out, but just in case you haven't, class diagrams are used for logical models and logical database designs. The class diagram encompasses all the constructs needed to build a logical database design. Classes for attributes with the stereotype of <<entity>>, along with attributes within that class, make up the logical database design construct of entity. The stereotype of entity can be viewed either in text form or using an icon, as you can see in Figure 6-10.

Figure 6-10. Entity stereotypes.


Although the icon tells you more quickly that the class is an entity, we don't recommend you use the icon, especially within a tool, unless you are showing the entity without attributes. As you saw in Figure 6-10, the addition of attributes to the entity on a diagram with multiple entities can make it difficult to read. Using the textual version of the stereotype in these scenarios provides the definition of an entity without cluttering up the diagram. Quite often, you will have multiple diagrams that show the same entities in different ways. You might have a diagram that is set up just to review entities and their names and relations. In that case, simply displaying the name is sufficient, while other diagrams that use that exact same entity could require further detail.

Three basic types of relationships are used in a logical data model. Traditionally, when designing a database, you will have the following types of relationships:

  • Non-identifying— A relationship in which the child and parent entities can exist on their own without each other.

  • Identifying— A relationship in which the child entity cannot exist without its parent.

  • Inheritance— A relationship in which the child entity, or quite often multiple child entities, inherit by its relationship everything that is included in the parent entity.

Table 6-1 shows how these traditional database design relationships are represented in the UML.

Table 6-1. Mapping of Traditional Database Design Relationships to Those in UML
Traditional Database DesignUML
Non-identifyingAssociation
IdentifyingComposite aggregation
InheritanceGeneralization

The non-identifying and identifying relationships use both the UML relationship type noted in Table 6-1 and a stereotype that denotes the type of relationship. Because each relationship type is graphically distinct, you can use your judgment as to whether the display of the stereotype is needed on the diagram. Figure 6-11 provides UML notation for each of the relationships described.

Figure 6-11. Logical relationships.


When defining the logical data model, you will also define business rules that can be enforced in the database. One such business rule is known as cardinality (a.k.a. multiplicity). As explained earlier in this chapter, multiplicity more deeply defines the relationship between two entities to essentially show the number of instances involved in a given relationship. For example, one customer can place one or more orders, but an order can only be placed by one customer.

Watch Out—Counting the Customers

The UML uses the word “multiplicity,” whereas database designers traditionally call the same thing “cardinality.” Don't get caught up in the words; just make sure you capture the multiplicity or cardinality correctly. Whether you follow the UML by the book or you stay with what you and the rest of your team are familiar with does not matter. However, keep it consistent across the team so everybody is on the same page.


Chapter 3, “Requirements Modeling,” goes into more depth describing the different types of multiplicity and how they are used, so refer back to that chapter for additional details.

As the class is used to define an entity, you will use several classes to describe the overall logical data model. Because the UML provides the ability to create multiple diagrams within a single model, you can use each of these diagrams to demonstrate different uses for each entity. For example, you can have the same entity present on multiple diagrams, but you can use them to show different things. One diagram might be used to describe the customer relationship aspects of the database, while another might be used to describe marketing aspects. The customer entity can occur on both diagrams, and it still has the same meaning, but the diagrams describe different areas of the database. Figure 6-12 shows a portion of the marketing diagram. The customer relationship diagram contains the use of all the different logical modeling elements we have talked about so far in this chapter.

Figure 6-12. Customer relationship logical database diagram.


Primary keys are used to uniquely define a logical entity. The primary key is an attribute that is unique for any particular row entered within a database. For example, you can have many orders, but generally, there is only one order number per order, and each order has a different number. A primary key in the UML contains a stereotype of <<primary key>>. This stereotype both provides properties specific to a primary key and also gives a visual cue to the fact that it is a primary key. To save room on a particular diagram, we have seen different tools use different variations on the stereotype. Some spell it out entirely, and others shorten it by just using the initials as <<PK>> for the stereotype. Figure 6-13 shows the customer relationship model again with primary keys added.

Figure 6-13. Primary keys in the logical model.


The logical model leads to the physical model, which we will cover in the next section. Although it goes against best practices, teams often create a logical model one time and never go back to it. They then move from the logical to the physical and only work in the physical model thereafter. The value of the logical model is to ensure that what you are building is the right thing for the business, and it serves as a communication vehicle for the different constituencies to share, including programmers, analysts, etc. By only working in the physical model after a first logical model is created, you run the risk of implementing a database specifically for physical reasons, and you miss out on opportunities to ensure the business needs are synchronized with what is being implemented.

Physical Modelings

Physical database design is tied directly to the implementation of the database. The data model that is created for the physical database takes into account the specifics of the DBMS and is optimized for such software and hardware.

Logical Versus Physical

Where the logical model has entities, the physical model has tables, which are the physical implementation of entities. Just like entities and tables, attributes, which are logical, become columns in the physical model.

Most databases don't support inheritance as it is described in the logical model. Therefore, decisions need to be made when moving to the implementation. To rid yourself of inheritance in the physical model, you have a few common choices:

  • One-to-one mapping of each entity in the inheritance to a table. Here, you will create additional columns in the tables and have identifying relationships between the child tables and the parent super-type table.

  • Roll-up, in which all the subtypes (child tables) are rolled up into the parent table, making just one table. The single table will have columns that differentiate themselves to show the different subtypes. For example, instead of having both fulltime and part-time employees, the employee table might have a column called employee type.

  • Roll-down, in which the parent table no longer exists in the transformation, but all or most of the columns in the parent table are created in each of the child tables. Regarding the previous example, in this case, there would be a fulltime and part-time table, but there would no longer be an employee table. The columns that would have been just in the employee table, like name, address, and so forth, would now be included in both the fulltime and part-time tables.

Data types are another difference between the logical and physical database models. Where in the logical you have generic types, in the physical, types are specific to the database vendor. Most vendors support a standard set of types, but to add competitive differentiation, each vendor provides additional types that add value to their platforms.

The last difference that we will cover is foreign keys. Foreign keys describe the relationships between tables. They are the migration of the primary key from the parent table into the foreign key of the child table. The foreign key can become part of either the primary key in the child table or just a regular column, depending on the relationship between the two tables. An identifying relationship means that the foreign key will also be a primary key, while a non-identifying relationship means it is just a foreign key. There are many other differences between the two types of models, but these are the primary ones that we will cover in this book.

Physical Data Models

Just like in the logical model, physical data models are described using classes and class diagrams. Stereotypes in the physical model include <<table>> to show that the class is a table and either <<foreign key>> or <<FK>> to denote a foreign key.

One of the values of using the UML to define database designs is the additional constructs that are available as compared to other database modeling notations. Having the extra compartment where operations are modeled in a class provides the ability to capture information in the physical model that cannot be visualized in traditional data modeling notations. Indexes and constraints are physical implementations that traditionally are not modeled anywhere but are just hidden in the meta-data of a physical model. The UML provides the ability to model indexes and constraints and enables the team to visualize these elements so that you are doing more with the models and allowing everyone who needs to understand how the database is being implemented to do so visually. Referential integrity constraints that provide information on primary and foreign keys are used, as are indexes and check constraints. The stereotypes used for these constraints are <<PK>>, <<FK>>, <<Index>>, and <<Check>>. Also, triggers are defined as operations on a table using the stereotype of <<Trigger>>. These are the main stereotypes used in the physical model that differ from the logical. Figure 6-14 shows the different stereotypes and icons used in the physical data model.

Figure 6-14. Physical data model.


Just like the transformations used for inheritance, you will choose how to transform other entities into tables. You might want to have one-to-one mappings, but you might decide for performance improvement, security, or other reasons to alter your transformations to other than a one-to-one mapping. The concepts of splitting a single entity into more than one is common for reasons of searching and performance as well as doing the opposite, that is, combining multiple tables into one.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset