Every discussion about database security should include at least a few words about preparing for catastrophic failures. Disk drives develop bad sectors, making it impossible to read or write data; natural disasters can fill the server room with water. Earthquakes and fires happen. When such failures occur, there is only one thing you can do: revert to a backup copy. In this section we’ll look at making backups and how they fit into a disaster recovery scheme.
Backup
We don't usually think of backups as a part of a security strategy, but in some circumstances they can be the most effective way to recover from security problems. If a server becomes so compromised by malware that it is impossible to remove (perhaps because virus protection software hasn't been developed for the specific malware), then a reasonable solution is to reformat any affected hard disks and restore the database from the most recent backup. Backups are also your only defense against physical damage to data storage, such as a failed hard disk.
A backup copy is a usable fallback only if it isn't too old and you are certain that it is clean (in other words, not infected by malware).
How often should you make backup copies? The answer depends on the volatility of your data. In other words, how much do your data change? If the database is primarily for retrieval, with very little data modification, then a complete backup once a week and incremental backups every day may be sufficient. However, an active transaction database in which data are constantly being entered may need complete daily backups. It comes down to weighing how much data you can afford to lose against the time and effort needed to make frequent backups.
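The weekly-complete, daily-incremental schedule described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the choice of Sunday for the complete backup is an assumption, not a recommendation.

```python
from datetime import date


def backup_type(day: date, full_backup_weekday: int = 6) -> str:
    """Return "full" on the chosen weekday (0 = Monday ... 6 = Sunday),
    and "incremental" on every other day."""
    return "full" if day.weekday() == full_backup_weekday else "incremental"


def backups_needed_for_restore(day: date, full_backup_weekday: int = 6) -> int:
    """Count the backups a restore to `day` must read: the most recent
    complete backup plus one incremental for each day since then."""
    days_since_full = (day.weekday() - full_backup_weekday) % 7
    return 1 + days_since_full


# Assumed example dates: 7 January 2024 was a Sunday, 10 January a Wednesday.
print(backup_type(date(2024, 1, 7)))
print(backup_type(date(2024, 1, 10)))
print(backups_needed_for_restore(date(2024, 1, 10)))
```

The second function shows the trade-off in the schedule: the longer the gap between complete backups, the more incremental backups a restore has to apply.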
Assuming that you have decided on a backup interval, how many backups should you keep? If you back up daily, is it enough to keep a week's worth? Do you need less or more than that? In this case, it depends a great deal on the risk of malware. You want to keep enough backups that you can go far enough back in time to obtain a clean copy of the database (one without the malware). However, it is also true that malware may affect the server operating system without harming the database, in which case the most recent backup will be clean. Then you can fall back on the “three generations of backups” strategy, where you keep a rotation of “child,” “father,” and “grandfather” backup copies.
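The simplest reading of the three-generation rotation can be sketched in Python as follows. The class name and the backup names are hypothetical, and the sketch assumes that each new backup becomes the child while the old grandfather is retired (overwritten or discarded).

```python
from collections import deque
from typing import Dict, Optional

GENERATIONS = ("child", "father", "grandfather")


class ThreeGenerationRotation:
    """Keep only the three most recent backups, newest first."""

    def __init__(self) -> None:
        # A deque with maxlen=3 drops the oldest entry automatically
        # when a fourth backup is pushed onto the front.
        self._backups = deque(maxlen=3)

    def add(self, backup_name: str) -> Optional[str]:
        """Record a new backup and return the name of the backup that
        was retired, or None if fewer than three existed."""
        retired = self._backups[-1] if len(self._backups) == 3 else None
        self._backups.appendleft(backup_name)
        return retired

    def labeled(self) -> Dict[str, str]:
        """Map each generation label to the backup that currently holds it."""
        return dict(zip(GENERATIONS, self._backups))


rotation = ThreeGenerationRotation()
for name in ["monday", "tuesday", "wednesday", "thursday"]:
    rotation.add(name)
print(rotation.labeled())
```

After the fourth backup, Thursday's copy is the child, Wednesday's the father, and Tuesday's the grandfather; Monday's copy has been retired.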
It used to be easy to choose backup media. Hard disk storage was expensive; tape storage was slow and provided only sequential retrieval, but it was cheap. We backed up to tape almost exclusively until recently, when hard disk storage became large enough and cheap enough to be seen as a viable backup device. Some mainframe installations continue to use tape cartridges, but even large databases are quickly being migrated to disk backup media. Small systems, which once could back up to optical drives, now use disks as backup media almost exclusively.
The issue of backup has a psychological as well as a technical component: How can you be certain that backups are being made as scheduled? At first, this may not seem to be something to worry about, but consider the following scenario (which is a true story).
In the mid-1980s, a database application was installed for an outpatient psychiatric clinic that was affiliated with a major hospital in a major northeastern city. The application, which primarily handled patient scheduling, needed to manage more than 25,000 patient visits a year, divided among about 85 clinicians. The database itself was placed on a server in a secured room.
The last patient appointment was scheduled for 5 PM, which was when most staff left for the day. However, the receptionist stayed until 6 PM to close up after the last patient left. Her job during that last hour included making a daily backup of the database.
About a month after the application went into day-to-day use, the database developer who installed the system received a frantic call from the office manager. There were 22 unexplained files on the receptionist's computer. The office manager was afraid that something was terribly wrong.
Something was indeed wrong, but not what anyone would have imagined. The database developer discovered that the unidentified files were temporary files left by the database application. Each time the application was launched from the server, it downloaded the structure of the database and its application from the server and kept them locally until the client software was shut down. The presence of the temporary files meant that the receptionist wasn't quitting the application but only turning off her computer. If she wasn't quitting the database application properly, was she making backup copies?
As you can guess, she wasn't. The only backup that existed was the one the database developer made the day the application was installed and the data were migrated into the database. When asked why the backups weren't made, the receptionist admitted that it was just too much trouble. The solution was a warning from the office manager and additional training in backup procedures. The office manager also monitored the backups more closely.
The moral of this story is that just having a backup strategy in place isn't enough. You need to make certain that the backups are actually being made.
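One way to confirm that backups are actually being made, without relying on anyone's word, is to check the age of the newest file in the backup location. The following Python sketch assumes that every backup is written as a file into a single directory and that a file's modification time reflects when the backup was made. The function names and the 24-hour threshold are illustrative only.

```python
import time
from pathlib import Path
from typing import Optional


def newest_backup_age_hours(backup_dir: str,
                            now: Optional[float] = None) -> Optional[float]:
    """Return the age in hours of the most recently modified file in
    backup_dir, or None if the directory contains no files."""
    files = [p for p in Path(backup_dir).iterdir() if p.is_file()]
    if not files:
        return None
    newest_mtime = max(p.stat().st_mtime for p in files)
    current = time.time() if now is None else now
    return (current - newest_mtime) / 3600


def backup_is_overdue(backup_dir: str,
                      max_age_hours: float = 24.0,
                      now: Optional[float] = None) -> bool:
    """Report True when no backup exists or the newest one is too old."""
    age = newest_backup_age_hours(backup_dir, now)
    return age is None or age > max_age_hours
```

A check like this could run on a schedule and alert the person responsible for the database. In the clinic story, it would have reported an overdue backup on the second day rather than a month later.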
Disaster Recovery
The term disaster recovery refers to the activities that must take place to bring the database back into use after it has been damaged in some way. In large organizations, database disaster recovery will likely be part of a broader organizational disaster recovery plan. However, in a small organization, it may be up to the database administrator to coordinate recovery.
A disaster recovery plan usually includes specifications for the following:
▪ Where backup copies will be kept so they will remain undamaged even if the server room is damaged
▪ How new hardware will be obtained
▪ How the database and its applications are to be restored from the backups
▪ Procedures for handling data until the restored database is available for use
▪ Lists of those affected by the database failure and procedures for notifying them when the failure occurs and when the database is available again
The location of backup copies is vitally important. If they are kept in the server room, they risk being destroyed by a flood, fire, or earthquake along with the server itself. At least one backup copy should therefore be stored off-site.
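Where the off-site location is reachable as a directory (for example, a mounted volume at another site, which is an assumption of this sketch and not something every organization will have), the copy can be made and verified in one step. The function names below are hypothetical. The checksum comparison only confirms that the copy matches the original; it says nothing about whether the original is free of malware.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path) -> str:
    """Compute the SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_off_site(backup_file: str, off_site_dir: str) -> Path:
    """Copy a backup file into off_site_dir and verify the copy.

    Raises OSError if the copied file does not match the original."""
    destination = Path(off_site_dir) / Path(backup_file).name
    shutil.copy2(backup_file, destination)  # copies contents and timestamps
    if sha256_of(backup_file) != sha256_of(destination):
        raise OSError(f"Off-site copy of {backup_file} failed verification")
    return destination
```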
Large organizations often contract with commercial data storage facilities to handle their backups. The storage facility maintains temperature-controlled rooms to extend the life of backup media. It may also operate trucks that deliver and pick up the backup media so that the off-site backup remains relatively current.
For small organizations, it's not unheard of for an IT staff member to take backups home for safekeeping. This probably isn't the best strategy, given that the backup may not be accessible when needed. Some use bank safe deposit boxes for off-site storage, although those generally are accessible only during normal business hours. If a small organization needs 24/7 access to off-site backup copies, the expense of using a commercial facility may be justified.
When a disaster occurs, an organization needs to be up and running as quickly as possible. If the server room and its machines are damaged, there must be some other way to obtain hardware. Small organizations can simply purchase new equipment, but they must have a plan for where the new hardware will be located and how network connections will be obtained.
Large organizations often contract with hot sites, businesses that provide hardware that can run the organization's software. Hot sites store backup copies and guarantee that they can have the organization's applications up and running in a specified amount of time.
Once a disaster recovery plan has been written, it must be tested. Organizations need to run a variety of disaster recovery drills, simulating a range of system failures. It is not unusual to discover that what appears to be a good plan on paper doesn't work in practice. The plan must then be revised to ensure that the organization can recover its information systems should a disaster occur.