CHAPTER 21

Automating Jobs

In almost any type of database environment—from development, to testing, to production—DBAs rely heavily on automating tasks. Typical jobs that DBAs automate include the following:

  • Shutdown and startup of databases and listeners
  • Backups
  • Validating the integrity of backups
  • Checking for errors
  • Removing old trace or log files
  • Checking for errant processes
  • Checking for abnormal conditions

Automating routine tasks allows DBAs to be much more effective and productive. Automated environments are inherently smoother running and more efficient than manually administered systems. DBA jobs that run automatically from scripts consistently execute the same set of commands each time and therefore are less prone to human error and mistakes. Two scheduling utilities are described in this chapter:

  • Oracle Scheduler
  • Linux/Unix cron utility

This chapter begins by detailing the basic aspects of the Oracle Scheduler utility. This scheduler is available if you have an Oracle database installed. Oracle Scheduler can be used to schedule jobs in a wide variety of configurations.

Also covered in this chapter is how to use the Linux/Unix cron scheduling tool. In Linux/Unix environments, DBAs often use the cron scheduling utility to run jobs automatically. The cron utility is ubiquitous and easy to implement and use. If you’re an Oracle DBA, you must be familiar with cron, because sooner or later, you’ll find yourself in an environment that relies heavily on this tool to automate database jobs.

The last several sections in this chapter show you how to implement many real-world DBA jobs, such as automatically starting/stopping the database, monitoring, and OS file maintenance. You should be able to extend these scripts to meet the automation requirements of your environment.

image Note   Enterprise Manager Grid/Cloud Control can also be used to schedule and manage automated jobs. If you work in a shop that uses Enterprise Manager, then it’s appropriate to use this tool for automating your environment.

Automating Jobs with Oracle Scheduler

Oracle Scheduler is a tool that provides a way of automating the scheduling of jobs. Oracle Scheduler is implemented via the DBMS_SCHEDULER internal PL/SQL package. Oracle Scheduler offers a sophisticated set of features for scheduling jobs. The following sections of this chapter cover the basics of using Oracle Scheduler to automate jobs with simple requirements.

image Tip   There are currently more than 70 procedures and functions available within the DBMS_SCHEDULER package. For complete details, see the Oracle Database PL/SQL Packages and Types Reference Guide, which is available for download from the Technology Network area of the Oracle Web site (http://otn.oracle.com).

Creating and Scheduling a Job

The example in this section shows how to use DBMS_SCHEDULER to run an OS shell script on a daily basis. First, a shell script is created that contains an RMAN backup command. For this example, the shell script is named rmanback.bsh and is located in the /orahome/oracle/bin directory. The shell script also assumes that there is an /orahome/oracle/bin/log directory available. Here is the shell script:

#!/bin/bash
# source oracle OS variables; see Chapter 2 for an example of oraset script
. /etc/oraset o12c
rman target / <<EOF
spool log to '/orahome/oracle/bin/log/rmanback.log'
backup database;
spool log off;
EOF
exit 0

Next, the CREATE_JOB procedure of the DBMS_SCHEDULER package is used to create a daily job. Connect as SYS, and execute the following command:

BEGIN
DBMS_SCHEDULER.CREATE_JOB(
job_name => 'RMAN_BACKUP',
job_type => 'EXECUTABLE',
job_action => '/orahome/oracle/bin/rmanback.bsh',
repeat_interval => 'FREQ=DAILY;BYHOUR=9;BYMINUTE=35',
start_date => to_date('17-01-2013','dd-mm-yyyy'),
job_class => '"DEFAULT_JOB_CLASS"',
auto_drop => FALSE,
comments => 'RMAN backup job',
enabled => TRUE);
END;
/

In the prior code the JOB_TYPE parameter can be one of the following types: STORED_PROCEDURE, PLSQL_BLOCK, EXTERNAL_SCRIPT, SQL_SCRIPT, or EXECUTABLE.

The REPEAT_INTERVAL parameter is set to FREQ=DAILY;BYHOUR=9;BYMINUTE=35. This instructs the job to run daily, at 9:35 am. The REPEAT_INTERVAL parameter of the CREATE_JOB is capable of implementing sophisticated calendaring frequencies. For instance, it supports a variety of yearly, monthly, weekly, daily, hourly, by the minute, and by the second schedules. The Oracle Database PL/SQL Packages and Types Reference Guide contains several pages of syntax details for just the REPEAT_INTERVAL parameter.
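To give a flavor of the calendaring syntax, here are a few illustrative REPEAT_INTERVAL settings (sketches only; consult the reference guide for the full grammar):

```sql
-- every weekday at 23:00
repeat_interval => 'FREQ=WEEKLY; BYDAY=MON,TUE,WED,THU,FRI; BYHOUR=23'
-- the first day of each month, at 2:30 am
repeat_interval => 'FREQ=MONTHLY; BYMONTHDAY=1; BYHOUR=2; BYMINUTE=30'
-- the last day of each month
repeat_interval => 'FREQ=MONTHLY; BYMONTHDAY=-1'
-- every 30 minutes
repeat_interval => 'FREQ=MINUTELY; INTERVAL=30'
```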

The JOB_CLASS parameter specifies which job class to assign the job to. Typically, you would create a job class and assign a job to that class, whereby the job would inherit the attributes of that particular class. For example, you may want all jobs in a particular class to have the same logging level or to purge log files in the same manner. There’s a default job class that can be used if you haven’t created any job classes. The previous example uses the default job class.

The AUTO_DROP parameter is set to FALSE in this example. This instructs the Oracle Scheduler not to drop the job automatically after it runs (the default is TRUE).

Viewing Job Details

To view details about how a job is configured, query the DBA_SCHEDULER_JOBS view. This query selects information for the RMAN_BACKUP job:

SELECT job_name
 ,last_start_date
 ,last_run_duration
 ,next_run_date
 ,repeat_interval
FROM dba_scheduler_jobs
WHERE job_name='RMAN_BACKUP';

Each time a job runs, a record of the job execution is logged in the data dictionary. To check the status of a job execution, query the DBA_SCHEDULER_JOB_LOG view. There should be one entry for every time a job has run:

SELECT job_name
,log_date
,operation
,status
FROM dba_scheduler_job_log
WHERE job_name='RMAN_BACKUP';

Modifying Job Logging History

By default, Oracle Scheduler keeps 30 days’ worth of log history. You can modify the default retention period via the SET_SCHEDULER_ATTRIBUTE procedure. For example, this command changes the default number of days to 15:

SQL> exec dbms_scheduler.set_scheduler_attribute('log_history',15);

To remove the contents of the log history completely, use the PURGE_LOG procedure:

SQL> exec dbms_scheduler.purge_log();

Modifying a Job

You can modify various attributes of a job via the SET_ATTRIBUTE procedure. This example modifies the RMAN_BACKUP job to run weekly, on Mondays:

BEGIN
  dbms_scheduler.set_attribute(
    name=>'RMAN_BACKUP'
   ,attribute=>'repeat_interval'
   ,value=>'freq=weekly; byday=mon');
END;
/

You can verify the change by selecting the REPEAT_INTERVAL column from the DBA_SCHEDULER_JOBS view. Here is what the REPEAT_INTERVAL column now shows for the RMAN_BACKUP job:

21-JAN-13 12.00.00.200000 AM -07:00 freq=weekly; byday=mon

From the prior output you can see that the job will run on the next Monday, and because no BYHOUR and BYMINUTE options were specified (when modifying the job), the job is scheduled to run at the default time of 12:00 am.

Stopping a Job

If you have a job that has been running for an abnormally long period of time, you may want to abort it. Use the STOP_JOB procedure to stop a currently running job. This example stops the RMAN_BACKUP job while it is running:

SQL> exec dbms_scheduler.stop_job(job_name=>'RMAN_BACKUP');

The STATUS column of DBA_SCHEDULER_JOB_LOG will show STOPPED for jobs stopped using the STOP_JOB procedure.

Disabling a Job

You may want to temporarily disable a job because it’s not running correctly. You need to ensure that the job does not run while you’re troubleshooting the issue. Use the DISABLE procedure to disable a job:


SQL> exec dbms_scheduler.disable('RMAN_BACKUP');

If the job is currently running, consider stopping the job first or using the FORCE option of the DISABLE procedure:

SQL> exec dbms_scheduler.disable(name=>'RMAN_BACKUP',force=>true);

Enabling a Job

You can enable a previously disabled job via the ENABLE procedure of the DBMS_SCHEDULER package. This example reenables the RMAN_BACKUP job:

SQL> exec dbms_scheduler.enable(name=>'RMAN_BACKUP');

image Tip   You can check to see if a job has been disabled or enabled by selecting the ENABLED column from the DBA_SCHEDULER_JOBS view.

Copying a Job

If you have a current job that you want to clone, you can use the COPY_JOB procedure to accomplish this. The procedure takes two arguments: the old job name and the new job name. Here is an example of copying a job, where RMAN_BACKUP is a previously created job, and RMAN_NEW_BACK is the new job that will be created:

begin
  dbms_scheduler.copy_job('RMAN_BACKUP','RMAN_NEW_BACK');
end;
/

The copied job will be created but not enabled. You must enable the job first (see the previous section for an example) before it will run.

Running a Job Manually

You can manually run a job outside its regular schedule. You might want to do this to test the job to ensure that it’s working correctly. Use the RUN_JOB procedure to initiate a job manually. This example manually runs the previously created RMAN_BACKUP job:

BEGIN
  DBMS_SCHEDULER.RUN_JOB(
  JOB_NAME => 'RMAN_BACKUP',
  USE_CURRENT_SESSION => FALSE);
END;
/

The USE_CURRENT_SESSION parameter determines whether the job runs within your current session or in the background. A value of FALSE instructs the scheduler to run the job asynchronously in the background, as a regular scheduled job run.

Deleting a Job

If you no longer require a job, you should delete it from the scheduler. Use the DROP_JOB procedure to permanently remove a job. This example removes the RMAN_BACKUP job:

BEGIN
  dbms_scheduler.drop_job(job_name=>'RMAN_BACKUP');
END;
/

The code will drop the job and remove any information regarding the dropped job from the DBA_SCHEDULER_JOBS view.

Oracle Scheduler vs. cron

DBAs often debate whether they should use Oracle Scheduler or the Linux/Unix cron utility for scheduling and automating tasks. These are some of the benefits that Oracle Scheduler has over cron:

  • Can make the execution of a job dependent on the completion of another job
  • Robust resource balancing and flexible scheduling features
  • Can run jobs based on a database event
  • The DBMS_SCHEDULER PL/SQL package syntax works the same, regardless of the OS
  • Can run status reports, using the data dictionary
  • If working in a clustered environment, no need to worry about synchronizing multiple cron tables for each node in the cluster
  • Can be maintained and monitored via Enterprise Manager

The Oracle Scheduler is implemented via the DBMS_SCHEDULER PL/SQL package. As discussed previously, it’s fairly easy to create and maintain jobs, using this utility. Yet, despite Oracle Scheduler’s benefits, many DBAs prefer to use a scheduling utility such as cron. These are some of the advantages of cron:

  • Easy to use; simple, tried and true; only takes seconds to create or modify jobs
  • Almost universally available on all Linux/Unix boxes; for the most part, runs nearly identically, regardless of the Linux/Unix platform (yes, there are minor differences)
  • Database agnostic; operates independently of the database and works the same, regardless of the database vendor or version
  • Works whether or not the database is available

The prior lists aren’t comprehensive but should give you a flavor of the uses of each scheduling tool. I prefer to use cron, but if you require a more sophisticated scheduler, then consider using Oracle Scheduler. The following sections in this chapter provide information on how to implement and schedule automated jobs via cron.

image Note   If you’re in a Windows environment, use the Task Scheduler utility to run batch jobs automatically. You can access the Task Scheduler by going to the Control Panel, and then to Administrative Tools.

Automating Jobs via cron

The cron program is a job-scheduling utility that is ubiquitous in Linux/Unix environments. This tool derives its name from chrónos (the Greek word for “time”). The cron (the geek word for “scheduler”) tool allows you to schedule scripts or commands to run at a specified time and repeat at a designated frequency.

How cron Works

When your Linux server boots up, a cron background process is automatically started to manage all cron jobs in the system. The cron background process is also known as the cron daemon. This process is started upon system startup by the /etc/init.d/crond script. You can check to see whether the cron daemon process is running with the ps command:

$ ps -ef | grep crond | grep -v grep
root      3081     1  0  2012 ?        00:00:18 crond

On Linux boxes, you can also check to see whether the cron daemon is running, using the service command:

$ /sbin/service crond status
crond (pid  3081) is running...

The root user uses several files and directories when executing system cron jobs. The /etc/crontab file contains commands for running system cron jobs. Here is a typical listing of the contents of the /etc/crontab file:

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/
# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly

This /etc/crontab file uses the run-parts utility to run scripts located in the following directories: /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly. If there is a system utility that needs to run other than on an hourly, daily, weekly, or monthly basis, then it can be placed in the /etc/cron.d directory.

Each user can create a crontab (also known as a cron table) file. This file contains the list of programs that you want to run at a specific time and interval. This file is usually located in the /var/spool/cron directory. For every user who creates a cron table, there will be a file in the /var/spool/cron directory named after the user. As root, you can list the files in that directory, as shown:

# ls /var/spool/cron
oracle root

The cron background process is mostly idle. It wakes up once every minute and checks /etc/crontab, /etc/cron.d, and the user cron table files and determines whether there are any jobs that need to be executed.

Table 21-1 summarizes the purpose of the various files and directories used by cron. Knowledge of these files and directories will help you troubleshoot any issues as well as get a better understanding of cron.

Table 21-1. Descriptions of Files and Directories Used by the cron Utility

File Purpose
/etc/init.d/crond Starts the cron daemon upon system boot
/var/log/cron System messages related to the cron process; useful for troubleshooting problems
/var/spool/cron/<username> User crontab files are stored in the /var/spool/cron directory.
/etc/cron.allow Specifies users who can create a cron table
/etc/cron.deny Specifies users who are not allowed to create a cron table
/etc/crontab The system cron table that has commands to run scripts located in the following directories: /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly
/etc/cron.d Directory that contains cron tables for jobs that need to run on a schedule other than hourly, daily, weekly, or monthly
/etc/cron.hourly Directory that contains system scripts to run on an hourly basis
/etc/cron.daily Directory that contains system scripts to run on a daily basis
/etc/cron.weekly Directory that contains system scripts to run on a weekly basis
/etc/cron.monthly Directory that contains system scripts to run on a monthly basis

Enabling Access to cron

Sometimes when SAs set up a new box, they don’t (by default) enable the use of cron for all users on the system. To verify whether you have access to cron, invoke the utility as follows:

$ crontab -e

If you receive the following error message, then you do not have access:

You (oracle) are not allowed to use this program (crontab)

To enable cron access as the root user, add oracle to the /etc/cron.allow file with the echo command:

# echo oracle >> /etc/cron.allow

Once the oracle entry is added to the /etc/cron.allow file, you can use the crontab utility to schedule a job.

image Note   You can also use an editing utility (such as vi) to add an entry to the cron.allow file.

The root user can always schedule jobs with the crontab utility. Other users must be listed in the /etc/cron.allow file. If the /etc/cron.allow file does not exist, then the OS user must not appear in the /etc/cron.deny file. If neither the /etc/cron.allow nor the /etc/cron.deny file exists, then only the root user can access the crontab utility. (These rules may vary slightly, depending on the version of Linux/Unix you’re using.)

On some Unix OSs (such as Solaris) the cron.allow and cron.deny files are located in the /etc/cron.d directory. These files usually can only be modified by the root user.

image Tip   On some Linux/Unix platforms newer and more flexible variants of cron are available, such as the anacron utility. Use the man anacron command to view details on the implementation of this utility on your system.

Understanding cron Table Entries

Your cron table is a list of numbers and commands that the cron background process (cron daemon) will run at a specified time and schedule. The crontab utility expects entries to follow a well-defined format. It’s a good idea to add a comment line at the beginning of your crontab file that documents the required format, like so:

# min(0-59) hr(0-23) dayMonth(1-31) monthYear(1-12) dayWeek(0/7-6) commandOrScript

In the previous example the number sign (#) in the cron file represents the start of a comment. Any text entered after # is ignored by cron.

Each entry in the crontab is a single line composed of six fields. The first five fields specify the execution time and frequency. Entries in these fields can be separated by commas or hyphens. A comma indicates multiple values for an entry, whereas a hyphen indicates a range of values. An entry can also be an asterisk (*), which indicates that all possible values are in effect. Here is an example to help clarify. The following entry sends an e-mail saying, “Wake up,” every half hour, from 8 am to 4:30 pm, Monday through Friday:

0,30 8-16 * * 1-5 echo "wake up" | mailx -s "wake up" [email protected]

On some Linux systems, you can skip a value within a range by following the entry with /<integer>. For instance, if you wanted to run a job every other minute, use 0-59/2 in the minute column. You can also use a slash (/) with an asterisk to skip values. For instance, to run a job every fourth minute, you would use */4 in the minute column.
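As a quick sanity check outside of cron, you can expand what a step value matches by using seq with the same increment; here, the minute field */4:

```shell
# minutes matched by the cron field '*/4' (every fourth minute, 0-59)
seq 0 4 59 | tr '\n' ' '
# prints: 0 4 8 12 16 20 24 28 32 36 40 44 48 52 56
```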

The sixth field in the crontab can be one or more Linux commands or a shell script. Or, put another way, the sixth column can be any combination of commands or a script that you can run on one line from the Linux command line.

The cron utility has a few quirks that need further explanation. The fifth column is the day of the week. Sunday is designated by either a 0 or a 7; Monday, by a 1; Tuesday, by a 2; and so on, to Saturday, which is indicated with a 6.

The hour numbers in the second column are in military time format, ranging from 0 to 23. The fourth column (month of the year) and fifth column (day of the week) can be represented with numeric values or by three-letter abbreviations. For example, the following entry in the crontab uses three-letter abbreviations for months and days:

0,30 8-16 * Jan-Dec Mon-Fri echo "wake up" | mailx -s "get up" [email protected]

There also appear to be overlapping columns, such as the third column (day of the month) and the fifth column (day of the week). These columns allow you to create flexible schedules for jobs that need to run on schedules such as the 1st and 15th day of the month or every Tuesday. Put an asterisk in the column that you’re not using. If you need to run a job on the 1st and 15th and every Tuesday, then fill in both columns.
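For instance, the following hypothetical entries (the script path is made up) show the two columns used separately and together; note that when both columns are restricted, cron runs the job when either one matches:

```
# 1st and 15th of the month, at 2:00 am (day-of-week column unused)
0 2 1,15 * * /home/oracle/bin/maint.bsh
# every Tuesday at 2:00 am (day-of-month column unused)
0 2 * * 2 /home/oracle/bin/maint.bsh
# 1st, 15th, and every Tuesday at 2:00 am (both columns filled in)
0 2 1,15 * 2 /home/oracle/bin/maint.bsh
```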

If you’re running a shell script from cron that contains a call to an Oracle utility such as sqlplus or rman, ensure that you instantiate within the script any required OS variables, such as ORACLE_SID and ORACLE_HOME. If you don’t source these variables, you’ll see errors such as the following when your shell script runs from cron:

sqlplus: command not found

When cron runs a script (in a user’s crontab), it doesn’t run the user’s startup or login files (such as .bashrc). Therefore, any script (being run from cron) needs to explicitly set any required variables. You can directly set the variables within the script or call another script that exports these variables (such as Oracle’s oraenv script).
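One quick way to see the effect is to launch a shell with an emptied environment via env -i, which approximates the sparse environment cron provides (a rough illustration, not an exact replica of cron’s environment):

```shell
# env -i starts a child shell with no inherited variables, much as cron
# runs jobs without your login environment; o12c is a hypothetical SID
export ORACLE_SID=o12c
env -i /bin/sh -c 'echo "ORACLE_SID is [$ORACLE_SID]"'
# prints: ORACLE_SID is []
```

This is why a script run from cron must set ORACLE_SID and ORACLE_HOME itself.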

image Tip   Don’t schedule the jobs that you enter in cron to run all at the same time. Rather, spread them out so as not to bog down cron or the system at any particular point in time.

Scheduling a Job to Run Automatically

To schedule a job, you must add a line in your cron table, specifying the time you want the job to execute. There are two methods for adding entries in your cron table:

  • Editing the cron table file directly
  • Loading the cron table from a file

These two techniques are described in the following sections.

Editing the cron Table Directly

You can edit your cron table directly with the -e (editor) option of the crontab command:

$ crontab -e

When issuing the previous command, you will be presented with a file to edit. This file is known as your cron table (crontab). To schedule a script named backup.bsh to run daily, at 11:05 pm, enter the following line into your cron table:

5 23 * * * /home/oracle/bin/backup.bsh

Here, the 5 specifies that the job will run at 5 minutes after the top of the hour. The 23 is military time, specifying that the job should run in the 2300 hour window (in this example, 5 minutes after the hour). The three stars (* * *) signify that the job should run every day of the month, every month of the year, and every day of the week.

Exit the cron table file. If your default editor is vi, then type wq to exit. When you exit crontab, your cron table is saved for you. To view your cron entries, use the -l (list) option of the crontab command:

$ crontab -l

To remove your cron table completely, use the -r (remove) option:

$ crontab -r

Before running the previous command, you should save your cron table in a text file, as shown:

$ crontab -l > saved.cron

In this way, you can refer to the saved file, in the event that you didn’t mean to delete your cron table.

image Tip   I once worked with a DBA who thought crontab -r meant “read the cron table.” Don’t ever make that mistake.

Setting Default Editor

The default editor invoked to modify the cron table is dependent on the value of your VISUAL OS variable. In my current environment the VISUAL variable is set to vi:

$ echo $VISUAL
vi

If the VISUAL OS variable isn’t set, then the value of EDITOR is used to define the default editor. Make sure that either VISUAL or EDITOR is set to your editor of choice. If neither VISUAL nor EDITOR is set, your system will default to the ed editor. In this scenario, you’ll be presented with the following prompt:

26
<blank prompt>

Press the Q key to exit from ed. You can have the VISUAL or EDITOR variable automatically set for you when you log in to the system. You can also manually set the editor with the export command. The following example sets the default editor to vi:

$ export EDITOR=vi

Consider putting the prior line of code in a startup file (such as .bashrc) so that your editor is set consistently.

Loading the cron Table from a File

The other way to modify your cron table is to load it directly with a file name, using the following syntax:

$ crontab <filename>

Here, the crontab utility will load the contents of the specified file into your cron table. The recommended steps to modify your cron table with this method are as follows:

  1. Before modifying your cron table, first populate a file with the cron table’s current contents; for example,
    $ crontab -l > mycron.txt
  2. Next, make a copy of the previously created file (mycron.txt). This allows you to revert back to the original file, in the event that you introduce errors and can’t readily figure out what’s incorrect. This also provides you with an audit trail of changes to your cron table:
    $ cp mycron.txt mycron.jul29.txt
  3. Now, edit the mycron.txt file with your favorite text editor:
    $ vi mycron.txt

    For example, to schedule a script named backup.bsh to run daily, at 11:05 pm, add the following lines:

    #---------------------------------------------------------------------------------
    # File backup, dk: 20-jan-14, inserted.
    5 23 * * * /home/oracle/bin/backup.bsh
    #---------------------------------------------------------------------------------
  4. When you are finished making edits, load the contents of mycron.txt into the cron table, as shown:
    $ crontab mycron.txt

If your file doesn’t conform to the cron syntax, you’ll receive an error such as the following:

"mycron.txt":6: bad day-of-week
errors in crontab file, can't install.

In this situation, either correct the syntax error, or reload the original copy of the cron table.

EXAMPLE OF A TYPICAL CRON TABLE

Listed here is a sample entry from a cron table on a database server:

#---------------------------------------------------------------------------------

# min(0-59) hr(0-23) dayMonth(1-31) monthYear(1-12) dayWeek(0/7-6) commandOrScript

#---------------------------------------------------------------------------------

# RMAN backups, dk: 01-may-12, updated.

1 16 * * * /u01/oracle/bin/rmanback.bsh INV >/u01/oracle/bin/log/bck.log 2>&1

#-----------------------------------------------------------------

# Tablespace check, sp: 17-dec-12, created.

5 * * * * /u01/oracle/bin/tbsp_chk.bsh INV 10 1>/u01/oracle/bin/log/tbsp.log 2>&1

#-----------------------------------------------------------------

Take note of a few aspects of this entry. I always place a line at the top of every cron table (on every database server) that briefly explains the meanings of the date scheduling features:

# min(0-59) hr(0-23) dayMonth(1-31) monthYear(1-12) dayWeek(0/7-6) commandOrScript

I also separate each entry with a comment line. This makes the entry much more readable:

#-----------------------------------------------------------------

Additionally, I include a brief note (with my initials), describing the cron job and when the edit was made:

# RMAN backups, dk: 01-jan-13, updated.

If you manage dozens of database servers (each with its own cron table), with multiple DBAs, you’ll need some mechanism (and it doesn’t have to be sophisticated) for tracking who made changes and when.

Redirecting cron Output

Whenever you run a Linux shell command, by default the standard output (of the command) will be displayed on your screen. Also, if any error messages are generated, they will, by default, be displayed on your screen. You can use either > or 1> (they are synonymous) to redirect any standard output to an OS file. Additionally, you can use 2> to redirect any error messages to a file. The notation 2>&1 instructs the shell to send any error messages to the same location as standard output.
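These operators work with any command; here is a small sketch (the command and log file name are arbitrary):

```shell
# direct standard output and standard error to the same log file;
# /no/such/dir is intentionally missing, so ls generates an error
ls /tmp /no/such/dir >/tmp/redir.log 2>&1
# the error text landed in the log rather than on the screen
grep "No such file" /tmp/redir.log
```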

When you create a cron job, you can use these redirection features to send the output of a shell script to a log file. For example, in the following cron table entry, any standard output and error messages generated by the backup.bsh shell script are captured in a file named bck.log:

11 12 * * * /home/oracle/bin/backup.bsh 1>/home/oracle/bin/log/bck.log 2>&1

If you don’t redirect the cron job output, then any output will be e-mailed to the user who owns the cron table. You can override this behavior by specifying the MAILTO variable directly within the cron table. With the next few lines of code, the cron output will go to the root user:

MAILTO=root
11 12 * * * /home/oracle/bin/backup.bsh

If you don’t want the output to go anywhere, then redirect it to the proverbial bit bucket. The following entry sends the standard output and standard error to the /dev/null device:

11 12 * * * /home/oracle/bin/backup.bsh 1>/dev/null 2>&1

Troubleshooting cron

If you have a cron job that isn’t running correctly, follow these steps to troubleshoot the issue:

  1. Copy your cron entry, paste it into the OS command line, and manually run the command. Frequently, a slight typo in a directory or file name can be the source of the problem. Manually running the command will highlight errors such as this.
  2. If the script runs Oracle utilities, make sure you source (set) the required OS variables within the script, such as ORACLE_HOME and ORACLE_SID. Oftentimes, these variables are set by startup scripts (such as HOME/.bashrc) when you log in. Because cron doesn’t run a user’s startup scripts, any required variables must be set explicitly within the script.
  3. Ensure that the first line of any shell script invoked from cron specifies the name of the program that will be used to interpret the commands within the script. For instance, #!/bin/bash should be the first entry in a Bash shell script. Because cron doesn’t run a user’s startup scripts (such as HOME/.bashrc), you can’t assume that your OS user’s default shell will be used to run a command or script invoked from cron.
  4. Make certain that the cron background process is running. Issue the following command from the OS to verify:
    $ ps -ef | grep cron

    If the cron daemon (background process) is running, you should see something similar to this:

    root      2969     1  0 Mar23 ?        00:00:00 crond
  5. Check your e-mail on the server. The cron utility will usually send an e-mail to the OS account when there are issues with a misbehaving cron job.
  6. Inspect the contents of the /var/log/cron file for any errors. Sometimes, this file has relevant information regarding a cron job that has failed to run.

Examples of Automated DBA Jobs

In today’s often chaotic business environment, it’s almost mandatory to automate jobs. If you don’t automate, you may forget to do a task; or, if performing a job manually, you may introduce error into the procedure. If you don’t automate, you could find yourself replaced by a more efficient or cheaper set of DBAs.

Usually, I automate jobs such that the script only sends an e-mail in the event of a failure. Generating an e-mail upon success often leads to a full mailbox. Some DBAs like to see success messages. I generally don’t.

DBAs automate a wide variety of tasks and jobs. Almost any type of environment requires that you create some sort of OS script that encapsulates a combination of OS commands, SQL statements, and PL/SQL blocks.

The following scripts in this chapter are a sample of the wide variety of different types of tasks that DBAs automate. This set of scripts is, by no means, complete. Many of these scripts may not be needed in your environment. The point is to give you a good sampling of the types of jobs automated and the techniques used to accomplish a given task.

image Note   Chapter 3 contains basic examples of some core scripts that DBAs require. This section provides examples of tasks and scripts that DBAs commonly automate.

Starting and Stopping the Database and Listener

In many environments it’s desirable to have the Oracle database and listener automatically shut down and start up when the server reboots. If you have that requirement, then follow the next several steps to automate your database and listener shutdown and startup.

image Note   The example in this section is Linux centric; if you’re working with a different Unix platform, refer to the Oracle Database Administrator’s Guide, which can be freely downloaded from the Technology Network area of the Oracle web site (http://otn.oracle.com), for details for your environment.

  • 1.  Edit the /etc/oratab file, and place a Y at the end of the entry for the databases you want to restart automatically when the system reboots. You may need root privileges to edit the file:
    # vi /etc/oratab
  • 2.  Place within the file a line similar to this, for your environment:
    o12c:/u01/app/oracle/product/12.1.0.1/db_1:Y
  • 3.  In the previous line, o12c is the database name, and /u01/app/oracle/product/12.1.0.1/db_1 specifies the ORACLE_HOME directory. The Y at the end of the string signifies that the database can be started and stopped by the ORACLE_HOME/bin/dbstart and ORACLE_HOME/bin/dbshut scripts. You can replace the Y with an N if you do not want the database automatically stopped and restarted.

image Note   With Unix systems (such as Solaris) the oratab file is usually located in the /var/opt/oracle directory.

  • 4.  As root, navigate to the /etc/init.d directory, and create a file named dbora:
    # cd /etc/init.d
    # vi dbora
  • 5.  Place the following lines in the dbora file. Make sure you change the values of the variables ORA_HOME and ORA_OWNER to match your environment. This is a bare-bones script of what you minimally would need to stop and start a database and listener:
    #!/bin/bash
    # chkconfig: 35 99 10
    # description: Starts and stops Oracle processes
    ORA_HOME=/u01/app/oracle/product/12.1.0.1/db_1
    ORA_OWNER=oracle
    case "$1" in
      'start')
        su - $ORA_OWNER -c "$ORA_HOME/bin/lsnrctl start"
        su - $ORA_OWNER -c $ORA_HOME/bin/dbstart
       ;;
      'stop')
        su - $ORA_OWNER -c "$ORA_HOME/bin/lsnrctl stop"
        su - $ORA_OWNER -c $ORA_HOME/bin/dbshut
       ;;
    esac

These lines look like comments in the dbora file but are actually mandatory:

# chkconfig: 35 99 10
# description: Starts and stops Oracle processes

The prior lines describe the service characteristics of the script. The 35 means that the service will be started in runlevels 3 and 5. The 99 indicates that the service will be started near the end of the init processing. The 10 signifies that the service will be stopped near the beginning of the init processing. A description is also required providing textual information about the service.

image Note   A Linux runlevel is a logical container for specifying which services will run when the system is started.

  • 6.  Change the group of the dbora file to match the group assigned to the OS owner of the Oracle software (usually oinstall or dba):
    # chgrp dba dbora
  • 7.  Change the permissions on the dbora file to 750:
    # chmod 750 dbora
  • 8.  Run the following chkconfig command:
    # /sbin/chkconfig --add dbora

Here, the chkconfig command registers the service script. This also creates the appropriate symbolic links to files beneath the /etc/rc.d directory. Use the --list option to display whether a service is on or off for each runlevel:

# chkconfig --list | grep dbora
dbora           0:off   1:off   2:off   3:on    4:off   5:on    6:off

This output indicates that the dbora service is on for runlevels 3 and 5. If you need to delete a service, use the --del option of chkconfig.

image Tip   If you want to stop and start automatically (on system reboots), you can use other processes, such as the Intelligent Agent, Management Server, or HTTP Server. See MOS note 222813.1 for details.

Automating the shutdown and startup of your Oracle database will vary, depending on whether you’re using tools such as cluster software or ASM. The prior code in this section demonstrates the typical steps for implementing the shutdown and startup of your database in scenarios in which you don’t have other software that manages this task.

To test whether the dbora script is working, as root, run the following command to stop your database and listener:

# /etc/init.d/dbora stop

To test the startup of your database and listener, as root, issue the following command:

# /etc/init.d/dbora start

As of this writing, you may need to make a slight modification to the Oracle-supplied ORACLE_HOME/bin/dbstart and ORACLE_HOME/bin/dbshut scripts. If you inspect these scripts with an OS editor (such as vi), you’ll notice the following line:

ORACLE_HOME_LISTNER=$1

I would recommend that you change it to this:

ORACLE_HOME_LISTNER=${1:-$ORACLE_HOME}

This line instructs the scripts to accept a parameter if one is passed in. If a parameter is not passed in, then ORACLE_HOME_LISTNER is set to the value contained in the variable $ORACLE_HOME. This preserves the functionality of dbstart and dbshut and also makes these scripts work when called from dbora.
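
If you’re not familiar with this parameter expansion syntax, the following bare-bones snippet shows the behavior (the ORACLE_HOME value here is only illustrative):

```shell
#!/bin/bash
# Illustrates the ${1:-default} expansion used in the dbstart/dbshut fix;
# the ORACLE_HOME value is an example, not a required path.
ORACLE_HOME=/u01/app/oracle/product/12.1.0.1/db_1
# With no parameter passed in, ORACLE_HOME_LISTNER falls back to $ORACLE_HOME
ORACLE_HOME_LISTNER=${1:-$ORACLE_HOME}
echo $ORACLE_HOME_LISTNER
```

Called with a parameter, the variable is set to that parameter; called with none (as dbora does), it defaults to $ORACLE_HOME.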

If you have the opportunity to reboot your system, I recommend that you do so to ensure that the database stops and restarts correctly. When you invoke dbstart and dbshut, these utilities create log files in the ORACLE_HOME directory, named startup.log and shutdown.log. You can inspect the contents of these to verify that the shutdown and startup are working as expected.

image Tip   Starting with Oracle Database 11g, consider using the Oracle Restart utility to automate the startup/shutdown of your database and listener. This utility is especially useful in RAC and ASM environments.

LINUX SYSTEM V INIT RUNLEVELS

A Linux service is an application that typically runs in the background (conceptually similar to a Windows service). A runlevel is used to configure which services run on a box. Usually, there are seven runlevels (0–6). The chkconfig command manages which services you want running in which runlevel(s).

When Linux starts up, the /sbin/init program reads the /etc/inittab file to determine the runlevel in which it should run. Following is a snippet from /etc/inittab that shows the runlevels used by Red Hat (these are similar to runlevels in other Linux distributions):

#   0 - halt (Do NOT set initdefault to this)

#   1 - Single user mode

#   2 - Multiuser, without NFS (The same as 3, if you do not have networking)

#   3 - Full multiuser mode

#   4 - unused

#   5 - X11

#   6 - reboot (Do NOT set initdefault to this)

To set the default runlevel, specify N in the id:<N>:initdefault line in the /etc/inittab file. This example sets the default runlevel to 5:

id:5:initdefault:

A runlevel of 1 is used by SAs when performing maintenance and repairs. A runlevel of 5 will start the Linux server with a graphical login screen at the console, plus networking capabilities. However, if you have a problem running the display manager at the console—owing to a video driver issue, for example—then you can start in runlevel 3 instead, which is command line based but still has networking services.

Most SAs who are security conscious operate their servers in runlevel 3. With the wide acceptance of VNC, SAs oftentimes do not see the benefit of running in runlevel 5. If an SA wants to take advantage of graphical utilities, the SA will just use VNC (or a similar tool). Do not attempt to set initdefault to either 0 or 6, because your Linux server will never start.

To determine the current runlevel, you can run who -r or runlevel, as follows:

# runlevel

N 5

# who -r

run-level 5  Jun 17 00:29                   last=S

A given runlevel governs which scripts Linux will run when starting. These scripts are located in the directory /etc/rc.d/rc<N>.d, where <N> corresponds to the runlevel. For runlevel 5 the scripts are in the /etc/rc.d/rc5.d directory. For example, when Linux starts up in runlevel 5, one of the scripts it will run is /etc/rc.d/rc5.d/S55sshd, which is actually a soft link to /etc/rc.d/init.d/sshd.

Checking for Archive Redo Destination Fullness

Sometimes DBAs and SAs don’t adequately plan and implement a location for storing archive redo log files on disk. In these scenarios it’s convenient to have a script that checks for space in the primary location and sends out warnings before the archive redo destination becomes full. Additionally, you may want the script to switch the archive redo log destination automatically to an alternate location that has adequate disk space.

I’ve only used scripts such as this in chaotic environments that have issues with the archive redo log destination’s filling up at unpredictable frequencies. If the archive redo log destination fills up, the database will hang. In some environments this is highly unacceptable. You could argue that you should never let yourself get into this type of situation. Therefore, if you’re brought in to maintain an unpredictable environment, and you’re the one getting the phone calls at 2:00 am, you may want to consider implementing a script such as the one provided in this section.

Before using the following script, change the variables within the script to match your environment. For instance, SWITCH_DIR should point to an alternate location on disk to which you can safely switch the archive redo log destination, should the primary destination become full. The script sends warning e-mails when the free space falls below the amount specified by the THRESH_GET_WORRIED variable. If the archive redo log space falls below the value specified in the THRESH_SPACE_CRIT variable, then the destination is automatically switched to the directory contained in the SWITCH_DIR variable.

#!/bin/bash
PRG=$(basename $0)
DB=$1
USAGE="Usage: ${PRG} <sid>"
if [ -z "$DB" ]; then
  echo "${USAGE}"
  exit 1
fi
# source OS variables
. /var/opt/oracle/oraset ${DB}
# Set an alternative location, make sure it exists and has space.
SWITCH_DIR=/oradump01/${DB}/archivelog
# Set thresholds for getting concerned and switching.
THRESH_GET_WORRIED=2000000 # 2Gig from df -k
THRESH_SPACE_CRIT=1000000  # 1Gig from df -k
MAILX="/bin/mailx"
MAIL_LIST="[email protected] "
BOX=$(uname -a | awk '{print$2}')
#
loc=$(sqlplus -s <<EOF
CONNECT / AS sysdba
SET HEAD OFF FEEDBACK OFF
SELECT SUBSTR(destination,1,INSTR(destination,'/',1,2)-1)
FROM v\$archive_dest WHERE dest_name='LOG_ARCHIVE_DEST_1';
EOF
)
# The output of df depends on your version of Linux/Unix,
# you may need to tweak the next line based on that output.
free_space=$(df -k | grep ${loc} | awk '{print $4}')
echo box = ${BOX}, sid = ${DB}, Arch Log Mnt Pnt = ${loc}
echo "free_space        = ${free_space} K"
echo "THRESH_GET_WORRIED= ${THRESH_GET_WORRIED} K"
echo "THRESH_SPACE_CRIT = ${THRESH_SPACE_CRIT} K"
#
if [ $free_space -le $THRESH_GET_WORRIED ]; then
$MAILX -s "Arch Redo Space Low ${DB} on $BOX" $MAIL_LIST <<EOF
Archive log dest space low, box: $BOX, sid: ${DB}, free space: $free_space
EOF
fi
#
if [ $free_space -le $THRESH_SPACE_CRIT ]; then
sqlplus -s << EOF
CONNECT / AS sysdba
ALTER SYSTEM SET log_archive_dest_1='location=${SWITCH_DIR}';
ALTER SYSTEM SWITCH LOGFILE;
EOF
$MAILX -s "Archive Switch ${DB} on $BOX" $MAIL_LIST <<EOF
Archive log dest, box: $BOX, sid: ${DB} has switched.
Then ALTER SYSTEM SET LOG_ARCHIVE_DEST_1='location=<Normal Location>';
EOF
else
  echo no need to switch, ${free_space} KB free on ${loc}
fi
#
exit 0

The prior script assumes that the LOG_ARCHIVE_DEST_1 initialization parameter has been set to your archive redo location. If you’re using an FRA for the location of your archive redo log files, you can derive the archive location from the V$ARCHIVED_LOG view; for example,

select
  substr(name,1,instr(name,'/',1,2)-1)
from v$archived_log
where first_time =
 (select max(first_time) from v$archived_log);

Typically, I’ll run a script to check the archive redo log destination once an hour. Here is a typical cron entry (this entry should actually be a single line of code but has been placed on two lines in order to fit on the page):

38 * * * * /u01/oracle/bin/arch_check.bsh DWREP
  1>/u01/oracle/bin/log/arch_check.log 2>&1

Truncating Large Log Files

Sometimes log files can grow very large and cause issues by filling up critical mount points. The listener.log will record information about incoming connections to the database. With active systems this file can quickly grow to several gigabytes. For many of my environments, the information in the listener.log file does not need to be retained. If there are Oracle Net connectivity issues, then the file can be inspected to help troubleshoot issues.

The listener.log file is actively written to, so you shouldn’t simply delete it. If you remove the file, the listener process won’t re-create it; you would have to stop and restart the listener before it begins writing to a new listener.log file. You can, however, null out (truncate) the listener.log file. In Linux/Unix environments this is done via the following technique:

$ cat /dev/null >listener.log

The previous command replaces the contents of the listener.log file with the contents of /dev/null (a default file on Linux/Unix systems that contains nothing). The result of this command is that the listener.log file is truncated, and the listener can continue to actively write to it.
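
The following sketch demonstrates the technique against a throwaway file (the path here is illustrative, not the real listener.log location):

```shell
#!/bin/bash
# Create a demo file with some content, then null it out
LOGFILE=/tmp/demo_listener.log
echo "incoming connection recorded" > $LOGFILE
# Replace the file contents with /dev/null (i.e., nothing)
cat /dev/null > $LOGFILE
# The file still exists, but is now zero bytes
ls -l $LOGFILE
```

Because the file is truncated in place rather than removed, a process holding the file open can continue writing to it.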

Listed next is a shell script that truncates the default listener.log file. This script is dependent on setting the OS variable ORACLE_BASE. If you don’t set that variable in your environment, you’ll have to hard-code the directory path within this script:

#!/bin/bash
#
if [ $# -ne 1 ]; then
  echo "Usage: $0 SID"
  exit 1
fi
# See Chapter 2 for details on setting OS variables
# Source oracle OS variables with oraset script
. /etc/oraset $1
#
MAILX='/bin/mailx'
MAIL_LIST='[email protected]'
BOX=$(uname -a | awk '{print $2}' | cut -f 1 -d'.')
#
if [ -f $ORACLE_BASE/diag/tnslsnr/$BOX/listener/trace/listener.log ]; then
  cat /dev/null > $ORACLE_BASE/diag/tnslsnr/$BOX/listener/trace/listener.log
fi
if [ $? -ne 0 ]; then
  echo "trunc list. problem" | $MAILX -s "trunc list. problem $1" $MAIL_LIST
else
  echo "no problem..."
fi
exit 0

The following cron entry runs the prior script on a monthly basis (this entry should all be on one line but has been placed on two lines in order to fit on the page):

30 6 1 * * /orahome/oracle/bin/trunc_log.bsh DWREP
  1>/orahome/oracle/bin/log/trunc_log.log 2>&1

Checking for Locked Production Accounts

Usually I have a database profile in place that specifies that a database account become locked after a designated number of failed login attempts. For example, I’ll set the DEFAULT profile FAILED_LOGIN_ATTEMPTS to 5. Sometimes, however, a rogue user or developer will attempt to guess the production account password, and after five attempts, the production account becomes locked. When this happens, I need to know about it as soon as possible so that I can investigate the issue and then unlock the account.
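
For reference, a limit such as the one just described is set via the profile, and a locked account is reenabled with ALTER USER (the account name here is only an example):

```sql
-- Lock an account after five consecutive failed login attempts
ALTER PROFILE default LIMIT failed_login_attempts 5;
-- After investigating, unlock the affected production account
ALTER USER starprod ACCOUNT UNLOCK;
```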

The following shell script checks the LOCK_DATE value in DBA_USERS for a list of production database accounts:

#!/bin/bash
if [ $# -ne 1 ]; then
  echo "Usage: $0 SID"
  exit 1
fi
# source oracle OS variables
. /etc/oraset $1
#
crit_var=$(sqlplus -s <<EOF
/ as sysdba
SET HEAD OFF FEED OFF
SELECT count(*)
FROM dba_users
WHERE lock_date IS NOT NULL
AND username in ('CIAP','REPV','CIAL','STARPROD');
EOF
)
#
if [ $crit_var -ne 0 ]; then
  echo $crit_var
  echo "locked acct. issue with $1" | mailx -s "locked acct. issue" [email protected]
else
  echo $crit_var
  echo "no locked accounts"
fi
exit 0

This shell script is called from a scheduling tool, such as cron. For example, this cron entry instructs the job to run every 10 minutes (this entry should actually be a single line of code but has been placed on two lines in order to fit on the page):

0,10,20,30,40,50 * * * * /home/oracle/bin/lock.bsh DWREP
   1>/home/oracle/bin/log/lock.log 2>&1

In this way, I am notified when one of the production database accounts becomes locked.

Checking for Files Older Than a Certain Age

For some jobs that I run from shell scripts (such as backups), I’ll first check to see if another backup job is already running. The check involves looking for a lock file. If the lock file exists, then the shell script exits; if the lock file doesn’t exist, then one is created. At the end of the job, the lock file is removed.
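
Here’s a minimal sketch of that lock file technique (the lock file name and job body are placeholders for your environment):

```shell
#!/bin/bash
LOCKFILE=/tmp/nightly_backup.lock
# If a prior run is still active (or died without cleaning up), exit
if [ -f $LOCKFILE ]; then
  echo "lock file exists; prior job still running or aborted"
  exit 1
fi
touch $LOCKFILE
# ... the real work (e.g., backup commands) goes here ...
rm -f $LOCKFILE
```

If the job aborts between touch and rm, the stale lock file remains on disk, which is exactly what the check script in this section detects.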

Sometimes, there’s a problem with the job, and it aborts abnormally before the lock file can be removed. In these situations, I want to know if a lock file that is older than 1 day exists the a server. The existence of an old lock file indicates that there has been an issue and that I need to investigate. The following shell script checks for any lock files older than 1 day in the tmp directory.

#!/bin/bash
BOX=$(uname -a | awk '{print $2}')
# find all lock files gt 1 day old.
# Find all lock files in /tmp, if found, find any older than one day
ls /tmp/*.lock 2>/dev/null &&
filevar=$(find /tmp/*.lock -type f -mtime +1 | wc -l) || filevar=0
if [ $filevar -gt 0 ]; then
  echo "$BOX, lockfile issue: $filevar" |
  mailx -s "$BOX lockfile problem" [email protected]
else
  echo "Lock file ok: $filevar"
fi
exit 0

I usually check for the existence of a lock file on a daily basis. Here is a typical cron entry for running the prior script, named lock_chk.bsh (this entry should all be on one line but has been placed on two lines in order to fit on the page):

33 5 * * * /orahome/oracle/bin/lock_chk.bsh
   1>/orahome/oracle/bin/log/lock_chk.log 2>&1

Checking for Too Many Processes

On some database servers, you may have many background SQL*Plus jobs. These batch jobs may perform tasks such as copying data from remote databases and large daily update jobs. In these environments it’s useful to know if, at any given time, there are an abnormal number of shell scripts or SQL*Plus processes running on the database server. An abnormal number of jobs could be an indication that something is broken or hung.

The next shell script has two checks in it: one to determine the number of shell scripts named with the extension .bsh and one to determine the number of processes that contain the string sqlplus:

#!/bin/bash
#
if [ $# -ne 0 ]; then
  echo "Usage: $0"
  exit 1
fi
#
crit_var=$(ps -ef | grep -v grep | grep bsh | wc -l)
if [ $crit_var -lt 20 ]; then
  echo $crit_var
  echo "processes running normal"
else
  echo "too many processes"
  echo $crit_var | mailx -s "too many bsh procs: $1" [email protected]
fi
#
crit_var=$(ps -ef | grep -v grep | grep sqlplus | wc -l)
if [ $crit_var -lt 30 ]; then
  echo $crit_var
  echo "processes running normal"
else
  echo "too many processes"
  echo $crit_var | mailx -s "too many sqlplus procs: $1" [email protected]
fi
#
exit 0

The prior shell script, named proc_count.bsh, is run once an hour from a cron job (this entry should actually be a single line of code but is placed on two lines in order to fit on the page):

33 * * * * /home/oracle/bin/proc_count.bsh
  1>/home/oracle/bin/log/proc_count.log 2>&1

Verifying the Integrity of RMAN Backups

As part of your backup-and-recovery strategy, you should periodically validate the integrity of the backup files. RMAN provides a RESTORE...VALIDATE command that checks for physical corruption within the backup files. The following script starts RMAN and spools a log file. The log file is subsequently searched for the keyword error. If there are any errors in the log file, an e-mail is sent:

#!/bin/bash
#
if [ $# -ne 1 ]; then
  echo "Usage: $0 SID"
  exit 1
fi
# source oracle OS variables
. /etc/oraset $1
#
date
BOX=$(uname -a | awk '{print$2}')
rman nocatalog <<EOF
connect target /
spool log to $HOME/bin/log/rman_val.log
set echo on;
restore database validate;
EOF
grep -i error $HOME/bin/log/rman_val.log
if [ $? -eq 0 ]; then
  echo "RMAN verify issue $BOX, $1" |
  mailx -s "RMAN verify issue $BOX, $1" [email protected]
else
  echo "no problem..."
fi
#
date
exit 0

The RESTORE...VALIDATE doesn’t actually restore any files; it only validates that the files required to restore the database are available and checks for physical corruption.

If you need to check for logical corruption as well, specify the CHECK LOGICAL clause. For example, to check for logical corruption, the prior shell script would have this line in it:

restore database validate check logical;

For large databases the validation process can take a great deal of time (because the script checks each block in the backup file for corruption). If you only want to verify that the backup files exist, specify the VALIDATE HEADER clause, like so:

restore database validate header;

This command only checks for valid information in the header of each file that would be required for a restore and recovery.

Summary

Automating routine database jobs is a key strategy of the successful DBA. Automated jobs ensure that tasks are repeatable and verifiable and that you’re quickly notified when there are any problems. For example, your job as a DBA greatly depends on successfully running backups and ensuring that the database is highly available. This chapter includes several scripts and examples that detail how to run routine jobs at defined frequencies.

Oracle provides the Oracle Scheduler utility (implemented via the DBMS_SCHEDULER PL/SQL package) for scheduling jobs. This tool can be used to automate any type of database task. You can also initiate jobs based on system events or on the success/failure of other scheduled jobs. I prefer to use cron for scheduling database jobs. However, you might have sophisticated scheduling requirements that dictate the use of a tool such as the Oracle Scheduler.
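
As a point of comparison with the cron entries shown throughout this chapter, here’s a sketch of scheduling an external script via DBMS_SCHEDULER (the job name, script path, and schedule are illustrative, mirroring the 33 5 * * * cron entry used earlier):

```sql
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'LOCK_CHK_JOB',
    job_type        => 'EXECUTABLE',
    job_action      => '/orahome/oracle/bin/lock_chk.bsh',
    repeat_interval => 'FREQ=DAILY; BYHOUR=5; BYMINUTE=33',
    enabled         => TRUE);
END;
/
```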

If you’re a DBA who works in a Linux/Unix shop, you should familiarize yourself with the cron utility. This scheduler is simple to use and is almost universally available. Even if you don’t use cron for your current assignment, you’re sure to encounter its use in future work environments.

At this point in the book, you’ve learned how to implement and perform many tasks required of a DBA. Even if you manage just one database, no doubt you’ve been embroiled in a vast number of troubleshooting activities. The next chapter focuses on diagnosing and resolving many of the issues that a DBA encounters.
