Another excellent practice for ensuring that the data context is clearly understood is to use standard data naming conventions. Standard names are most important in helping the business users and IT team communicate about the system effectively. Standards are important because they form data documentation for the system.
Data naming conventions require good business definitions as their basis. Because XYZ has already completed the planning phase, there are just three simple rules that XYZ's database designer followed in order to create clear and understandable data element names for Valencia.
There are a number of different naming conventions. It is not important which one your company chooses as long as there is one and it is followed rigorously throughout the IT department. It is really amazing how much easier it is to support and enhance a database or application in which the data element names give an indication of what information the elements hold.
There are three components of all data element names:
Prime Word(s): The basic fact that is being named
Modifier(s): Descriptors that indicate more specifically which fact this is
Class Word: Standard code describing the type (class) of data being named
Some experts recommend various tweaks to this basic convention, but these are the basics.
Class words are used to identify the type of data to be stored in each data element. Usually, there is a class word for any data types commonly used by your organization's systems. Classes also may indicate the purpose for which data is used. In Table D-16, “name” and “text” are both non-numeric fields, but “name” is a text field with a special purpose.
As shown, class words should be abbreviated in a standard way. Again, there is no universal standard to follow; just be sure that your company has thought through the types of data that are important to the business. Don't create too many class words, or you'll just be complicating life.
Strictly speaking, most DBMS systems don't have limits to the length of names they allow. Still, you'll reduce the amount of typing required by keeping names as short as possible (without sacrificing clarity, of course). One way to do this is to abbreviate each word in the definition.
Some people feel that it is clearer for business users to use real words, but I have found that as long as you follow a standard, everyone quickly learns that CUST is the shorthand for customer (not custard). Of course, if some use CUST and some use CUSTMR and still others use CUS, the situation becomes quickly unmanageable.
Pick a single standard list of abbreviations, and use them consistently. At HP, we started with a list created by the American Chemical Society. Their standard abbreviation list, known as CAS, can be reviewed at http://www.cas.org/ONLINE/standards.html, although it looks even more technically aimed than I remember. Start with a list of your own, and keep adding new abbreviations as you need them. HP used several abbreviation rules to create an addition to the list:
New abbreviations should be as short as possible and never longer than five characters.
Plural nouns use the same abbreviation as the singular form.
Duplicate abbreviations are okay because the full name usually gives clear context.
If anyone is working on a standard abbreviations list (or better yet, knows of a good external standard that already exists), please get in touch with me. I'd love to collaborate.
Data naming conventions require meaningful business definitions as their basis. Because we've seen the business definitions that XYZ developed for the Installed Base attributes, we will use them as examples. XYZ's database designer used these three rules to fill in the blanks and create a name for each attribute.
The prime word is generally the subject of the definition.
The adjectives that describe the subject are the modifiers.
The class word is the type or purpose of the data fact.
In Table D-17, you can see how the Valencia database designer followed these three rules and developed standard data names for the Installed Base entity.
Attribute | Definition | Modifier(s) (descriptor) | Prime Word (fact) | Class Word (type) | Attribute Name (Data Element Name) |
---|---|---|---|---|---|
product model | Name that identifies the type of hardware or software product installed | installed | product | identifier | installed product identifier (instl_prod_id) |
serial number | Unique number that identifies the specific physical hardware or software product | installed product | serial number | number | installed product serial number (number) (instl_prod_ser_nr) |
purchase date | Date of purchase | installed product | purchase | date | installed product purchase date (instl_prod_purch_ dt) |
where purchased | Sales channel where purchase | installed product purchase | sales channel | code | installed product purchase sales channel code (instl_prod_purch_sls_chan_cd) |
used for | Purpose for which product is used | installed product | use | code | installed product use code (instl_prod_use_cd) |
where used | Primary location where product is used | installed product | location | code | installed product location code (instl_prod_loc_cd) |
As you have seen, even the names of the physical database fields are completely determined by the business definition that was created by the business experts. The names are recognizable, meaningful, and sharable/reusable.