Characteristics of a Simple Network
A simple network database supports one-to-many relationships between entities. There is no restriction on multiple parentage, however. This means that the employees/departments/projects database we have been using as an example could be designed as in Figure A-6. In this example, the project acts as a composite entity between department and employee. In addition, there is a direct relationship between department and employee for faster access.
Given the restrictions of the hierarchical data model, the simple network was a logical evolutionary step. It removed the most egregious limitation of the hierarchical data model: no multiple parentage. It also further divorced the logical design from physical storage, although as you will see shortly, simple network schemas still allowed logical database designers to specify some physical storage characteristics.
Simple network databases implement data relationships either by embedding pointers directly in the data or through the use of indexes. Regardless of which strategy is used, access to the data is restricted to the predefined links created by the pointers unless a fast access path has been defined for a particular type of entity. In this sense, a simple network is navigational, just like a hierarchical database.
There are two types of fast access paths available to the designer of a simple network. The first, hashing, affects the strategy used to place entity occurrences in a data file. When an entity occurrence is hashed into a data file, the DBMS uses a key (the value of one or more attributes) to compute a physical file locator (usually known as the database key). To retrieve the occurrence, the DBMS recomputes the hash value. Occurrences of related entities are then clustered around their parent entity in the data file. The purpose of this is twofold: It provides fast access to parent entities and puts child entities on the same disk page as their parents for faster retrieval. In the example we are using, a database designer might choose to hash department occurrences and cluster projects around their departments.
Note: An entity occurrence can either be clustered or hashed; it can't be both because the two alternatives determine physical placement in a data file.
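The placement rules just described can be illustrated with a minimal Python sketch. It is not taken from any actual simple network DBMS: the page count, the CRC-based hash function, the `DataFile` class, and the sample names are all assumptions made for illustration, and page overflow is ignored.

```python
import zlib
from collections import defaultdict

NUM_PAGES = 8  # hypothetical number of pages in the data file


def database_key(key: str) -> int:
    """Compute a page number (standing in for a database key) from a key value."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PAGES


class DataFile:
    """Hypothetical data file: a mapping from page number to the records on it."""

    def __init__(self) -> None:
        self.pages = defaultdict(list)

    def store_hashed(self, entity: str, key: str) -> int:
        """Place a parent occurrence on the page its own key hashes to."""
        page = database_key(key)
        self.pages[page].append((entity, key))
        return page

    def store_clustered(self, entity: str, key: str, parent_key: str) -> int:
        """Place a child occurrence on its parent's page, not its own hash page."""
        page = database_key(parent_key)
        self.pages[page].append((entity, key))
        return page

    def fetch_hashed(self, entity: str, key: str):
        """Retrieve by recomputing the hash; no relationship traversal is needed."""
        page = database_key(key)
        for record in self.pages[page]:
            if record == (entity, key):
                return page, record
        return None


# Hash a department, then cluster one of its projects around it.
data_file = DataFile()
dept_page = data_file.store_hashed("department", "Accounting")
proj_page = data_file.store_clustered("project", "Payroll Upgrade", "Accounting")
```

Because the project's page is derived from its parent's key, the project cannot also be placed by hashing its own key, which is the either/or choice the note above describes.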
The second type of fast access path is an index, which provides fast, direct access to entity occurrences containing secondary keys. If occurrences are not hashed and have no indexes, then the only way to retrieve them is by traversing down relationships with parent entity occurrences.
To enable traversals of the data relationships, a simple network DBMS must keep track of where it is in the database. For every program running against the database, the DBMS maintains a set of currency indicators, each of which is a system variable containing the database key of the most recently accessed entity occurrence of a specific type. For example, there are currency indicators for each type of entity, for the program as a whole, and so on. Application programs can then use the contents of the currency indicators to perform data accesses relative to the program's previous location in the database.
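As a rough illustration of how currency indicators support relative access, the following Python sketch keeps one indicator per entity type plus one for the program as a whole. The `RunUnit` class, the database key values, the link tables, and the function names are all hypothetical; a real DBMS would maintain this state internally.

```python
class RunUnit:
    """Hypothetical per-program state: one currency indicator per entity type."""

    def __init__(self) -> None:
        self.current_of_program = None  # database key of the last occurrence accessed
        self.current_of_entity = {}     # entity type -> database key last accessed

    def note_access(self, entity_type: str, db_key: int) -> None:
        self.current_of_program = db_key
        self.current_of_entity[entity_type] = db_key


# Stored occurrences, addressed by made-up database keys.
RECORDS = {
    100: ("department", "Accounting"),
    201: ("project", "Budget Review"),
    202: ("project", "Payroll Upgrade"),
}
# Predefined links: a department's first project, and each project's successor.
FIRST_PROJECT = {100: 201}
NEXT_PROJECT = {201: 202, 202: None}


def find_department(run: RunUnit, db_key: int):
    run.note_access("department", db_key)
    return RECORDS[db_key]


def find_first_project(run: RunUnit):
    """Access relative to the current department."""
    proj_key = FIRST_PROJECT[run.current_of_entity["department"]]
    run.note_access("project", proj_key)
    return RECORDS[proj_key]


def find_next_project(run: RunUnit):
    """Access relative to the current project; None at the end of the chain."""
    proj_key = NEXT_PROJECT[run.current_of_entity["project"]]
    if proj_key is None:
        return None  # currency indicators are left unchanged
    run.note_access("project", proj_key)
    return RECORDS[proj_key]


run = RunUnit()
find_department(run, 100)
first = find_first_project(run)
second = find_next_project(run)
```

Each call moves only the indicators it touches, so the program can step through projects while the current department stays where it was.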
Originally, simple network DBMSs did not support query languages. However, as the relational data model became more popular, many vendors added relational-style query languages to their products. If a simple network database is designed like a relational database, then it can be queried much like a relational database. However, the simple network is still underneath, and the database is therefore still subject to the access limitations placed on a simple network.
Simple network databases are not easy to maintain. In particular, changes to the logical design of the database can be extremely disruptive. First, the database must be brought offline; no processing can be done against it until the changes have been made. Once the database is down, then the following process occurs:
1. Back up all data or save the data in text files.
2. Delete the current schema and data files.
3. Compile the new database schema, which typically is contained in a text file, written in a data definition language (DDL).
4. Reallocate space for the data files.
5. Reload the data files.
In later simple network DBMSs, this process was largely automated by utility software. However, because most simple network DBMSs were mainframe-based and involved large amounts of data, changes to the logical database could still take significant amounts of time. There are still simple network databases in use today as legacy systems. However, it would be highly unusual for an organization to decide to create a new database based on this data model.
CODASYL
In 1959, government and industry professionals organized into the Conference on Data Systems Languages (CODASYL). Their goal was to develop a business programming language, the eventual result of which was COBOL. As they were working, the committee realized that they had another output besides a programming language: the specifications for a simple network database. CODASYL spun off the Database Task Group (DBTG), which in 1969 released its set of specifications.
The CODASYL specifications were submitted to the American National Standards Institute (ANSI). ANSI made a few modifications to the standard to further separate the logical design of the database from its physical storage layout. The result was two sets of very similar, but not identical, specifications.
Note: It is important to understand that CODASYL is a standard rather than a product. Many products were developed to adhere to the CODASYL standards. In addition, there have been DBMSs that employ the simple network data model but not the CODASYL standards.
A CODASYL DBMS views a simple network as a collection of two-level hierarchies known as sets. The database in Figure A-6 requires two sets: one for department → employee and department → project and the second for employee → project. The entity at the “one” end of the relationship is known as the owner of the set; the entities at the “many” end of relationships are members of the set. There can be only one owner entity but many member entities in any set. The same entity can be an owner of one set and a member of another, allowing the database designer to build a network of many levels.
As mentioned in the previous section, access is either directly to an entity occurrence using a fast access path (hashing or an index) or in traversal order. In the case of a CODASYL database, the members of a set have an order that is specified by the database designer.
If an entity is not given a fast access path, then the only way to retrieve occurrences is through the owners of some set. In addition, there is no way to retrieve all occurrences of an entity unless all of those occurrences are members of the same set, with the same owner.
Each set provides a conceptual linked list, beginning with the owner occurrence, continuing through all member occurrences, and linking back to the owner. Like the occurrences of a hierarchy in a hierarchical database, the occurrences of a set are distinct and unrelated, as in Figure A-7.
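The ring structure of a set occurrence can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the `Occurrence` class and helper functions are hypothetical, each occurrence carries a single `next` pointer (so it can take part in only one set here, whereas a real implementation would need a pointer per set), and new members are simply appended, which is just one of the orders a designer might specify.

```python
class Occurrence:
    """Hypothetical record with one 'next' pointer for a single set."""

    def __init__(self, entity: str, name: str) -> None:
        self.entity = entity
        self.name = name
        self.next = self  # an owner with no members forms a ring of one


def insert_member(owner: Occurrence, member: Occurrence) -> None:
    """Append a member at the end of the owner's ring."""
    last = owner
    while last.next is not owner:
        last = last.next
    last.next = member
    member.next = owner  # the last member links back to the owner


def members(owner: Occurrence):
    """Walk the ring from the owner until it links back to the owner."""
    node = owner.next
    while node is not owner:
        yield node
        node = node.next


# Two occurrences of the same set type; each ring is closed on its own owner.
accounting = Occurrence("department", "Accounting")
marketing = Occurrence("department", "Marketing")
insert_member(accounting, Occurrence("project", "Budget Review"))
insert_member(accounting, Occurrence("project", "Payroll Upgrade"))
insert_member(marketing, Occurrence("project", "Spring Campaign"))

accounting_projects = [m.name for m in members(accounting)]
marketing_projects = [m.name for m in members(marketing)]
```

Starting from one owner, the traversal never reaches the other owner's members, which is what makes the set occurrences distinct and unrelated.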
Note: Early CODASYL DBMSs actually implemented sets as linked lists. The result was complex pointer manipulation in the data files, especially for entities that were members of multiple sets. Later products represented sets using indexes, with database keys acting as pointers to the storage locations of owner and member records.
The independence of set occurrences presents a major problem for entities that aren't members of any set, such as the department occurrences in Figure A-7. To handle this limitation, CODASYL databases support a special type of set, often called a system set, that has only one owner occurrence: the database system itself. All occurrences of an entity that is a member of that set are connected to the single owner occurrence. Employees and projects would probably be included in a system set also to provide the ability to access all employees and all projects. The declaration of system sets is left up to the database designer.
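A system set can be pictured as an ordinary set whose only owner occurrence is the system. The Python sketch below is a simplified model, not CODASYL syntax: the `SetType` class, the `SYSTEM` marker, the set names, and the sample data are all assumptions made for illustration.

```python
SYSTEM = "SYSTEM"  # hypothetical stand-in for the single owner occurrence


class SetType:
    """Hypothetical set type: each owner key maps to an ordered member list."""

    def __init__(self, name: str, owner_entity: str, member_entity: str) -> None:
        self.name = name
        self.owner_entity = owner_entity
        self.member_entity = member_entity
        self.occurrences = {}  # owner key -> list of member keys, in set order

    def connect(self, owner_key: str, member_key: str) -> None:
        self.occurrences.setdefault(owner_key, []).append(member_key)

    def members_of(self, owner_key: str) -> list:
        return list(self.occurrences.get(owner_key, []))


# A system set gives departments, which have no other owner, an entry point.
all_departments = SetType("ALL-DEPARTMENTS", SYSTEM, "department")
dept_employees = SetType("DEPT-EMPLOYEES", "department", "employee")

for dept in ("Accounting", "Marketing"):
    all_departments.connect(SYSTEM, dept)
dept_employees.connect("Accounting", "Jones")
dept_employees.connect("Marketing", "Smith")


def all_employees() -> list:
    """Without a system set for employees, reach them through every department."""
    found = []
    for dept in all_departments.members_of(SYSTEM):
        found.extend(dept_employees.members_of(dept))
    return found
```

In this model, declaring a second system set for employees would replace the nested traversal in `all_employees` with a single walk of one set occurrence, which is the convenience the paragraph above has in mind.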
Any DBMS that was written to adhere to either set of CODASYL standards is generally known as a CODASYL DBMS. CODASYL DBMSs make up the largest group of simple network products that were marketed.
Arguably, the most successful CODASYL DBMS was IDMS, originally developed by Cullinet. IDMS was a mainframe product that was popular well into the 1980s. As relational DBMSs began to dominate the market, IDMS was given a relational-like query language and marketed as IDMS/R. Ultimately, Cullinet was sold to Computer Associates, which marketed and supported the product under the name CA-IDMS.
Note: Although virtually every PC DBMS on the market today claims to be relational, many are not. Some, such as FileMaker Pro, are actually simple networks. These are client–server products, robust enough for small business use. They allow multiple parentage with one-to-many relationships and represent those relationships with preestablished links between files. These are simple networks. As you become familiar with the relational data model, you will understand why such products aren't relational. It doesn't mean that they aren't good products but simply that they don't meet the minimum requirements for a relational DBMS.