Chapter 39. Databases

Information in this chapter:

• Databases in everyday life
• What is a database?
• Database files as evidence
• Database recovery
• Data as evidence
The sole purpose of database systems is to allow for fast and accurate storage and retrieval of records. Data is the lifeblood of businesses, from the smallest home business to global mega-corporations. This chapter explores how data is stored, how it is retrieved, and how it can be a factor in electronic evidence. We will also look at the challenges involved in getting data as evidence, and at metadata (data about the data) and its value as evidence.
Keywords
Database Metadata, Relational Databases, Database Types, Database Recovery

Introduction

Databases have been around as long as man has had a need to store and retrieve information. Having evolved from simple lists of accounts and transactions to ledger books and finally to electronic form, databases are in many cases the number one data asset of a company or person. Databases are the underpinning for the financial, medical, securities, and commerce sectors of the global economy. The ability to rapidly store and retrieve data as information is one of the greatest advances in computers in the modern age.
While businesses understand the value of databases, criminals also understand the value of the data stored in those databases. There is information about you in multiple databases in both the public sector and private sector, and probably many more databases than you might think possible. In this chapter we look at what databases are, a little about how they work, and finally, how they can contain evidence that can be a factor in criminal and civil cases.

39.1. Databases in everyday life

Nearly every transaction or interaction with almost anything electronic in today’s world results in a record being created. Even if the record does not contain any personal information about the individual, it will still be collected for analysis purposes.
That does not mean that anonymous data is without evidentiary value. In many cases, however, data that contains personal information about people is of more immediate evidentiary value.
For instance, a hacker who breaks into a major online retailer is not going to be very interested in its analytical data, that is, its sales amounts or how many of which widget it sells. The hacker will be going after the customer database to try to get credit card numbers and other personal information to profit from, either by using the data directly or by selling that data to someone else. Data theft is a lucrative business, and breaches of networks can result in some expensive litigation. While the biggest data compromises are typically via some form of online data breach where millions of records are stolen, the ways in which data can be compromised are as varied as the devices and methods used to store data; lost or stolen laptops, hard drives, and USB sticks also account for lost data.

39.2. What is a database?

In its simplest form a database is a list of information that a person or entity would want to maintain. For instance, your checkbook register is a database of sorts. When you write a check, you make an entry in your check register to record the payee and the amount and date of the check. This would be the equivalent of adding a record in a database.
If you wanted to know where your money was going, you could then go and look in your check register and see that entry. That would be a very basic record retrieval, or query. In database terms, a query is nothing more than a question you ask to get information out of the database.
Finally, since you are very meticulous about your checking account, each time you make a new entry in your checkbook register, you add or subtract the amount of the transaction from your balance, giving you a new current balance. This is the equivalent in database terms of a summary report.
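The three checkbook operations just described (adding a record, running a query, and producing a summary) can be sketched in a few lines of Python. This is only an illustration: the payees, dates, and amounts are invented, and the function names are hypothetical rather than part of any database product.

```python
from datetime import date
from decimal import Decimal

# Hypothetical check register; every entry below is invented for illustration.
register = []

def add_entry(entry_date, payee, amount):
    """Adding a record: one new row in the register."""
    register.append({"date": entry_date, "payee": payee, "amount": Decimal(amount)})

def query_by_payee(payee):
    """A query: ask the register a question and get the matching records back."""
    return [row for row in register if row["payee"] == payee]

def running_balance(opening_balance):
    """A summary report: apply every transaction to the opening balance."""
    balance = Decimal(opening_balance)
    for row in register:
        balance += row["amount"]  # deposits are positive, checks are negative
    return balance

add_entry(date(2011, 3, 1), "Paycheck deposit", "1500.00")
add_entry(date(2011, 3, 2), "City Florist", "-45.50")
add_entry(date(2011, 3, 5), "Power Company", "-120.25")

florist_checks = query_by_payee("City Florist")
current_balance = running_balance("200.00")
```

Amounts are held as Decimal values rather than floating-point numbers so that the balance arithmetic stays exact, which is a common choice for money but not the only one.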
The sole purpose of database systems is to allow for fast and accurate storage and retrieval of records. With the introduction of computers, databases came into their own, allowing for massive storage of records, fast retrieval, and customized user interfaces for handling data input and output.

39.2.1. What is a database management system?

Our typical interaction with databases is through an end-user interface, such as a website or an application, that provides a front end for adding, updating, and deleting records in the database itself. Behind that front end sits the database management system, the software that stores and manages the data; it also provides tools for an application developer to write code to manipulate the data in the database. Database management systems include small desktop or workgroup systems such as Microsoft Access, Alpha Five, and FileMaker, to name a few, where the software provides an easy-to-use program for handling all the functions of database management, including creating data entry forms, reports, and the table and record structure. The example checkbook database in this chapter was created using Microsoft Access 2010.
On a larger scale, enterprise-level Relational Database Management Systems (RDBMS) are not fully integrated into a single application the way a desktop database management system is. These include systems such as Oracle, Progress, Microsoft SQL Server, Sybase, MySQL, and the like. These systems are designed to handle extremely large data sets efficiently. To create an application using one of these systems, the developer will typically use a third-party application development program to create the part of the application the end user sees and uses.

39.2.2. Modern databases

This short discussion on modern databases is only for the purpose of giving a basic understanding of how modern database management systems look and work. Database management and design fills dozens or perhaps hundreds of books.
There are several types of database structures; however, computer-based database systems today primarily fall in the category of relational databases. A relational database system structure functions just as the name sounds. It is based on relationships between data.
Back to the checkbook register example. If you wanted to turn your checkbook register into a simple relational database, you would want to think about how it relates to you, the account holder, and how you relate to the transactions. Figure 39.1 is a simple relational diagram of a checkbook database. Each of the blocks is a table. Within each of the blocks are fields.
Figure 39.1
A relational diagram showing how people are related to accounts and accounts to transactions
In Fig. 39.2 you see a screenshot of the checking account database with some data entered showing that Bob Smith has a checking account with Wells Fargo. Bob’s Wells Fargo checking account has several transactions that look just like what he would have written into his check register.
Figure 39.2
A simple database for a checking account without any frills
Here you have three tables:
• People with Checking Accounts
• Checking Accounts
• Account Transactions
Now that you are looking at the tables with data present, it is easy to see that these look a lot like spreadsheets with rows and columns where each row is a record and each column is a field. That is the basic structure of all databases: table, record, field.
What makes this a relational database is that instead of three spreadsheets that are all independent of each other, these spreadsheets are all related. That means getting data back out of the database for reporting is simple. If you want to know how much Bob spent on flowers, you can ask the database a question like, “Give me all checks where Bob spent money on flowers on the Wells Fargo account.” The result of this question is shown in Fig. 39.3.
Figure 39.3
Report showing the amount spent on flowers extracted from the database
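The same three-table idea can be sketched with Python's built-in sqlite3 module instead of Microsoft Access. The table names, column names, and sample rows here are assumptions made for illustration and are not taken from the figures; the point is only that the question about Bob's flower purchases becomes a query that follows the relationships from people to accounts to transactions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (person_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE accounts (
    account_id INTEGER PRIMARY KEY,
    person_id  INTEGER REFERENCES people(person_id),    -- who owns the account
    bank       TEXT
);
CREATE TABLE transactions (
    transaction_id INTEGER PRIMARY KEY,
    account_id     INTEGER REFERENCES accounts(account_id),  -- which account was used
    payee TEXT, category TEXT, amount REAL
);
-- Invented sample data.
INSERT INTO people VALUES (1, 'Bob Smith');
INSERT INTO accounts VALUES (10, 1, 'Wells Fargo');
INSERT INTO transactions VALUES
    (100, 10, 'City Florist',  'Flowers',   45.50),
    (101, 10, 'Power Company', 'Utilities', 120.25),
    (102, 10, 'Garden Shop',   'Flowers',   30.00);
""")

# "Give me all checks where Bob spent money on flowers on the Wells Fargo account."
flower_checks = conn.execute("""
    SELECT t.payee, t.amount
    FROM transactions AS t
    JOIN accounts AS a ON a.account_id = t.account_id
    JOIN people   AS p ON p.person_id  = a.person_id
    WHERE p.name = ? AND a.bank = ? AND t.category = ?
    ORDER BY t.transaction_id
""", ("Bob Smith", "Wells Fargo", "Flowers")).fetchall()

total_spent = sum(amount for _, amount in flower_checks)
```

Each JOIN clause is one of the lines in the relational diagram: without the shared person_id and account_id fields, the three tables would be nothing more than unrelated spreadsheets.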
As you can see, relational databases are simple in concept. However, as they grow in complexity, they are far from simple. An enterprise-level database can contain thousands of tables and fields, and billions of records.

39.2.3. Database formats

Previously we looked at database structure. The database structure is the way the database handles data. Database formats are the actual file format of the database, the way it handles storage and retrieval of data internally.
There is a long list of database formats; some are very old in origin and are still in use today by various database management systems.
Each of the database formats has its own unique file structure, from some that split database tables into separate physical files and indexes, to those that maintain everything in a single large file.

39.3. Database files as evidence

Databases themselves pose some interesting problems from an evidence standpoint. If the data in the database can only be accessed by a proprietary system, then getting to the underlying data can be difficult.
There are still a great many database programs around that were designed for small-scale vertical market customers using older DOS (Disk Operating System) tools that also use older database formats such as Symantec’s Q&A, Novell’s Btrieve, or Ashton-Tate’s dBASE II and III formats. These types of database formats can pose particular problems for examiners who are not familiar with the underlying format of the database and how it is organized and managed by an application.
One reason for this is that these particular database formats tend to store data in separate files that are only connected by the application code of the program that is using them for storage. Most modern database formats encapsulate the data structure and the relationships between tables internally in a single file; these older formats do not, so figuring out how to get data back out of them can be a challenge.
One solution is to make sure you have access to the application that the user has on their computer for managing the database. Some examples of these types of applications are programs written for specialty contractors like landscapers, grading companies, jewelry stores, and pawn shops. These applications tend to be closely tied to the database in such a way that they cannot be separated and easily analyzed without the actual application, if at all.
In our simple example, even if all of the tables were in separate files, a database expert could probably figure out how to reconnect them and extract the data in a manner that is usable. In real situations, databases tend to be much more complex and have many more tables that would have to be reconnected to extract data.
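As a rough sketch of what reconnecting separate table files involves, the Python below joins two stand-in files on a shared account number. The file layouts, field names, and values are invented for illustration; a real legacy format would first have to be decoded into rows like these, which is often the harder part.

```python
import csv
import io

# Stand-ins for two separate table files; layouts and values are invented.
accounts_file = io.StringIO(
    "acct_no,holder,bank\n"
    "10,Bob Smith,Wells Fargo\n"
)
transactions_file = io.StringIO(
    "check_no,acct_no,payee,amount\n"
    "1001,10,City Florist,45.50\n"
    "1002,10,Power Company,120.25\n"
    "1003,99,Unknown Payee,15.00\n"
)

# Index the parent table by the field the application used to link the files.
accounts = {row["acct_no"]: row for row in csv.DictReader(accounts_file)}

joined, orphans = [], []
for txn in csv.DictReader(transactions_file):
    account = accounts.get(txn["acct_no"])
    if account is None:
        orphans.append(txn)  # no matching account record was found
    else:
        joined.append({**txn, "holder": account["holder"], "bank": account["bank"]})
```

Records that land in orphans are worth noting in their own right, since they may point to a wrong guess about the linking field or to parent records that are missing.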
It is rare that you would have the luxury of taking away someone’s business computer to perform an analysis using their licensed software.
There are several reasons you may not be able to get the underlying structural information for a database application.
• The database structure documentation is lost or never existed in the first place.
• The documentation that can be found is several development generations out of date.
• The database application is proprietary and the developer won’t share the information.
• The company that developed the application is out of business.
In any case, if all you have is the database file or files, even if they are encapsulated in a single file like a SQL Server database or a Microsoft Access Database, the process of figuring out how the data is all related can be a daunting task.
The best solution in every case where the data is the evidence is to get it using the same application that the users have for generating reports and extracting data. Otherwise you are faced with what may be a long and expensive road for manually extracting and organizing the data into something you can use in a case. Even if you can use database design tools to reverse-engineer the database structure, the result will be incomplete to some degree. In order for a reverse-engineering tool to map the structure of a database, the database must be well designed, or the tool will fail when it cannot find the correct internal keys or connections in the database.
While using reverse-engineering tools may reduce the cost of figuring out the database structure, it will not be an easy process if the database application is very complex.
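To illustrate why a reverse-engineering tool depends on a well-designed database, the sketch below uses Python's sqlite3 module to map the relationships that a small SQLite database declares about itself. The schema is invented, and map_relationships is a hypothetical helper, not a feature of any commercial tool. One relationship is declared as a foreign key and is found; the other exists only in the mind of the application developer and is missed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, bank TEXT);
-- Declared relationship: the database itself records this link.
CREATE TABLE transactions (
    transaction_id INTEGER PRIMARY KEY,
    account_id INTEGER REFERENCES accounts(account_id),
    amount REAL
);
-- Undeclared relationship: acct is meant to point at accounts,
-- but only the application code knows that.
CREATE TABLE notes (note_id INTEGER PRIMARY KEY, acct INTEGER, body TEXT);
""")

def map_relationships(connection):
    """Return {table: [(column, referenced_table, referenced_column), ...]}."""
    tables = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    relationships = {}
    for table in tables:
        # Each row is (id, seq, table, from, to, on_update, on_delete, match).
        keys = connection.execute(f"PRAGMA foreign_key_list({table})").fetchall()
        relationships[table] = [(fk[3], fk[2], fk[4]) for fk in keys]
    return relationships

schema_map = map_relationships(conn)
```

The notes table comes back with no relationships at all, so an examiner relying on this map alone would not learn that its acct column ties notes to accounts.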

39.4. Database recovery

Depending on the database, there are various ways to get to evidence that may not be accessible via the user interface.
• Transaction Logs: These are logs that the database keeps in order to perform a rollback. A rollback can occur if a database detects an error condition and needs to roll back to a previous state to prevent data loss or corruption.
• Backups: Database backups can be a valuable source of historical evidence, especially if there is suspicion that someone has deleted records of interest from a database. By restoring the backups, it may be possible to locate the deleted data.
• Corrupted or Damaged Databases: If the only evidence available is a damaged or corrupted database file, all is not always lost. In many cases some or all of the data can be recovered. Database recovery becomes more likely as the type of database moves from the small desktop database to the large enterprise database. It stands to reason that enterprise databases tend to have better recovery tools because the cost of data corruption is so high compared to a small department database written in one of the desktop applications like Microsoft Access or FileMaker.
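The idea of locating deleted records by restoring a backup can be sketched as a comparison between the restored copy and the live database. The example uses two in-memory SQLite databases and invented data as stand-ins; in practice the backup would be restored from tape or disk with whatever tools the database system provides.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.executescript("""
CREATE TABLE transactions (transaction_id INTEGER PRIMARY KEY, payee TEXT, amount REAL);
INSERT INTO transactions VALUES
    (100, 'City Florist',  45.50),
    (101, 'Power Company', 120.25),
    (102, 'Garden Shop',   30.00);
""")

# Stand-in for a backup taken before the deletion occurred.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Later, a record of interest is deleted from the live database.
live.execute("DELETE FROM transactions WHERE transaction_id = 101")
live.commit()

def rows_by_id(connection):
    """Load every transaction into a dictionary keyed by its primary key."""
    return {row[0]: row for row in connection.execute(
        "SELECT transaction_id, payee, amount FROM transactions")}

backup_rows = rows_by_id(backup)
live_rows = rows_by_id(live)

# Records present in the backup but absent from the live database.
deleted = [backup_rows[key] for key in sorted(backup_rows.keys() - live_rows.keys())]
```

A comparison like this only shows that a record disappeared sometime between the backup and the present; establishing who removed it, and when, depends on whatever logs or audit data the system keeps.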

39.5. Data as evidence

In the majority of cases, the data is the evidence you are looking for: a patient record, a set of financial transactions, information about a person contained in a database, annual sales and profit information, or a purchase made by someone. The list goes on.
This is the surface evidence. In other words, this is the evidence that may be the most important, but it is also the easiest to get if the only thing you need to do is to request that the data be produced and printed out as a report or screenshot.
The underlying data, or metadata, is also important. Metadata from a database standpoint is the information about the data records that may be of significant interest.
For instance, many database applications record information about changes to records in the database, but this information is not shown to the user as it is only saved for audit purposes and troubleshooting.
Metadata includes information about a record or transaction such as the date and time stamp when the record was changed and the name of the user who changed the record. Depending on the sophistication of the database application and regulatory and other requirements, these audit transactions may show historical record change information as well.
However, bear in mind that even if the audit trail contains this metadata for more than just the last change, it will probably not contain the contents of the changes, because storing the extra content creates a lot of overhead in the database.
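One way an application might record this kind of metadata is with a database trigger that writes an audit row on every change. The sketch below is an assumption about how such logging could be built, using SQLite and invented table, column, and user names; real applications vary widely. Note that the audit table records who changed a record and when, but not what the old or new contents were.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (
    patient_id  INTEGER PRIMARY KEY,
    diagnosis   TEXT,
    modified_by TEXT            -- set by the application on every save
);
CREATE TABLE audit_log (
    audit_id   INTEGER PRIMARY KEY,
    table_name TEXT,
    record_id  INTEGER,
    changed_by TEXT,
    changed_at TEXT             -- UTC date and time stamp
);
-- Hypothetical audit trigger: logs who and when, not the changed content.
CREATE TRIGGER patients_audit AFTER UPDATE ON patients
BEGIN
    INSERT INTO audit_log (table_name, record_id, changed_by, changed_at)
    VALUES ('patients', NEW.patient_id, NEW.modified_by, datetime('now'));
END;
INSERT INTO patients VALUES (1, 'initial entry', 'intake_clerk');
""")

update_sql = "UPDATE patients SET diagnosis = ?, modified_by = ? WHERE patient_id = ?"
conn.execute(update_sql, ("revised entry", "dr_jones", 1))
conn.execute(update_sql, ("revised again", "records_admin", 1))
conn.commit()

audit_trail = conn.execute(
    "SELECT record_id, changed_by, changed_at FROM audit_log ORDER BY audit_id"
).fetchall()
```

After the two updates, the audit trail shows two changes and the users who made them, while the patients table holds only the latest diagnosis. The earlier wording is gone unless it survives in a backup or a transaction log.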
Asking what metadata is saved as part of the database operation is very important in order to understand what, if any, transaction logs or audit logs can be obtained from the database administrator. In some cases, the database administrator will not be aware of exactly what the database application logs, as some of the logging is placed into the application to assist the programmers with debugging and troubleshooting.
Contact the person or company who programmed the database to find out what, if anything, they programmed into the application for their own maintenance and troubleshooting.

Summary

In this chapter we learned about databases and database management systems. We examined a relational database and how the data is structured. We also looked at the methods and issues involved in getting evidence from databases and about the data itself in the form of database metadata. We also looked at the challenges presented by database applications and the need to use the same application the user has to get data out of a database system whenever possible.