Chapter 11. Workload Management

This chapter explores workload management options available with Oracle Real Application Cluster (RAC). There is a global trend in the industry to consolidate applications and workloads, and this is where Real Application Cluster shines. For a consolidation project to be a success, suitable mechanisms need to exist to allow these applications to coexist together. Administrators need to be able to define importance levels for applications served by Oracle, and less important applications must not starve out the ones with higher importance levels.

Database client sessions should not be disrupted in case of node failures; instead, they must be relocated to a surviving cluster node. Also, database sessions should be evenly distributed to all cluster nodes to prevent one node from being over-utilized, while another set of nodes remains idle. Sessions in a connection pool would ideally be able to move to less busy nodes at runtime, as well.

Last but not least, the database should provide accounting functionality based on application usage metrics. The infrastructure provider (whether internal or external) can use these metrics to charge the consumers of their services.

All the preceding concepts are available with Oracle Real Application Cluster databases. The central workload management concept—and the first we will discuss in this chapter—is database services. But workload management comprises a number of additional technologies as well, including the following concepts that will be covered in depth throughout this chapter:

  • Load Balancing

  • High Availability Framework

  • Fast Application Notification (FAN)

  • Fast Connection Failover

  • Transparent Application Failover

  • Resource Manager

Let's start by taking a look at the available database services.

Introducing Services

Services as we know them today have a long history. They were introduced to the Oracle user community back in 1999, with Oracle 8i Release 2. In its "Getting to know Oracle 8i, Release 2" manual, Oracle introduced services to the network stack. They did this alongside another feature, called dynamic registration; specifically, the process monitor background process (PMON) registered with the listener of a database. While the connection method using the SID is still available even in 11g Release 2, users are encouraged to use the SERVICE_NAME parameter in the CONNECT_DATA section of the local naming file.

Although services were introduced long ago, from a single-instance user's point-of-view, it wasn't entirely clear why one should prefer the service name to the ORACLE_SID. Indeed, using services seemed to introduce more problems! Most of the networking issues in an Oracle environment are caused by incorrectly specified service names on the database and/or the local naming configuration. The most common of these errors is possibly the infamous ORA-12514 error: "TNS: listener does not currently know of service requested in connect descriptor."

So what exactly are these services? In a nutshell, they are logical entities you can define in Oracle RAC databases that enable you to group database workloads and route connection requests to the optimal instances that are assigned to offer the service. In single-instance Oracle, services can be used to provide additional levels of granularity for statistics aggregation and tracing. However, the full potential of services is realized in RAC environments by controlling which instances in the cluster perform a specific workload at different times for a given priority.

In many consolidation projects, individual services are created per application consolidated in the cluster. Other than that, there is hardly a limit to the creativity of the users when it comes to defining services. For instance, you can create services on a departmental basis or to reflect a certain type of workload such as batch processing or online reporting. Users connect to the database service, not the database itself, and they can be directed to the best instance providing the service transparently.

As we will see later in this chapter, services form the basis of workload management in RAC. Effective capacity management is made possible by assigning services to specific instances in the cluster and further controlling resource consumption with the Resource Manager. In the event of planned or even unplanned outages, services can be relocated to different instances in the cluster to provide continued availability to the application. Technologies such as Transparent Application Failover and Fast Connection Failover can make failover seamless to the users in a correctly configured system.

There is more than one way to create services, and we will discuss the different options available in their own section later in the chapter.

Creating an Administrator Managed Database vs. Policy-Managed Database

Administrators have two options when defining services to RAC databases, depending on how a database is managed. For administrator-managed databases, preferred and optionally available instances are defined for a service. Think of preferred instances as "active" instances or the instance you want to provide the service under normal conditions. If a preferred instance is available, the service will be started on it. If a preferred instance fails, a service can relocate to an available instance, if so defined. Available instance are instances on "standby." Once the failed node comes back online, the service will not automatically move back to the preferred instance.

Policy-managed databases behave differently. Remember that administrators have less control over policy-managed database because they are mostly automated, and services are no exception to this general rule. Instead of manually defining preferred and available instances, you define whether a service should run on all or only one instance in the server pool. A service that is started on all nodes in a server pool is referred to as a "uniform" service. The opposite, a service started on one arbitrary instance only, is called a singleton service. When additional instances join the server pool, Oracle automatically extends uniform services to the newly joined instances without user intervention. Although we said in the RAC Concepts chapter that the transition to using a SCAN is not immediately needed, you definitely have to use a SCAN address when connecting to a policy-managed database. This is because the Oracle client can't know how at connect time on how many nodes a policy-managed database is running.

A startup policy can be assigned to services, regardless of database type. This feature is new with release 11.2. Services can start automatically when the database starts. Similarly, services can be configured to start depending on the database role in a Data Guard environment. Before this, a database trigger had to be deployed, firing after the startup of the database and activating services based on the detected database role. The database role is available in the V$DATABASE view, and it can even be queried while the database is mounted and not open.

Services play an important role in other parts of the database, as well. We'll explore how you can use services with the database scheduler, shared server, and parallel processing. Later in the chapter, we will explain how the Resource Manager can be used to control resource usage on a per-service basis.

Managing Services with the Database Scheduler

The Oracle database scheduler (also referred to as the scheduler, replaces functionality previously provided by the DBMS_JOB package. In a nutshell, the scheduler allows the execution of code, both inside and outside the database, in either an ad-hoc fashion or at defined intervals. One of the design goals of the scheduler was to provide (better) support for the execution of jobs in a clustered environment. The discussion of all the scheduler's features is beyond the scope of this chapter, but we will cover the necessary basics for you to understand how the scheduler and RAC services cooperate.

The first thing you will notice when dealing with the scheduler is that it is a lot more flexible than DBMS_JOB. Unfortunately, this flexibility comes at the cost of higher complexity. The scheduler uses a number of concepts to facilitate the creation of jobs. For example, metadata for jobs can be defined in a number of entities:

  • Job classes

  • Programs

  • Schedules

  • Windows

All of this sounds quite abstract, and it is, in a way. It helps to keep in mind that all the metadata serves only one purpose: the creation of a job. It's important to keep this in the back of your mind for the following discussion. The more metadata that is available about a job, the less coding that is required to instantiate it.

The procedure to create a scheduler job is heavily overloaded, and the parameters to be supplied depend on how much detail you have saved as metadata. So, instead of entering the same information repeatedly, in a heavily customized environment, you only need to refer to the job class, the program, and a scheduler to create the job.

Similar jobs can be grouped into job classes. Job classes are very important in the context of RAC and workload management in general. These entities can take attributes such as a resource consumer group and a database service, as well as some other attributes. The association with a service allows database administrators and developers to limit the execution of a job to a distinct subset of nodes in the cluster.

Tip

Stopping a service will automatically will stop the jobs that rely on it from executing!

From a RAC administrator's perspective, this is one of the most sought-after improvements over DBMS_JOB: you have more control over which service a job executes. Add in the integration with a resource consumer group, and you have a very flexible tool for automating the execution of work.

A scheduler program saves meta-information about what is to be executed. The scheduler can execute anonymous PL/SQL blocks, stored procedures, or an executable stored outside the database, such as a shell script or C program. Stored procedures can of course take arguments; up to 255 arguments are supported.

A schedule defines the frequency of execution and the interval at which a job is to be executed. A very flexible syntax allows the creation of individual schedules, and it leaves little to be desired. Jobs can be executed only once, at a specific time. Or, they can be executed multiple times, as defined by the schedule.

Last but not least, scheduler windows work hand-in-hand with the resource manager. Windows are represented by a start timestamp and have a fixed duration. The important feature in the context of workload management is the integration with resource manager. A window can have a resource plan associated with it that becomes active as soon as the window opens.

These few paragraphs obviously can't provide a complete overview of the scheduler, but it should provide you with enough insight into the scheduler and its related objects to get you started. From the workload-management perspective, the most important concepts to remember are the scheduler's ability to use resource consumer groups and scheduler windows to change resource plans, as well as its ability to limit the execution of jobs to a service in job classes.

To bring theory to life, let's go through the definition of a scheduler job using a job class to limit its execution to the service batchserv. To keep the example simple, we will populate the parameters for the job creation manually, except for the job class. The following job purges data from a logging table older than two days by using the stored procedure, PURGELOGPROC:

BEGIN
  dbms_scheduler.create_job_class(
    logging_level => DBMS_SCHEDULER.LOGGING_RUNS,
    service => 'batchserv',
    comments => 'Limits execution to batchserv service',
    job_class_name => 'BATCHSERVCLASS'),
END;
/

BEGIN
  dbms_scheduler.create_job(
   job_name => 'PURGELOGJOB',
   job_type => 'STORED_PROCEDURE',
   job_action => 'PURGELOGPROC',
   repeat_interval => 'FREQ=DAILY',
   start_date => to_timestamp('11.05.2010 19:00', 'dd.mm.yyyy hh24:mi'),
   job_class => 'BATCHSERVCLASS',
   comments => 'purge logs older than two days',
   auto_drop => FALSE,
   enabled => TRUE);
END;
/

The newly created database job will use BATCHSERVCLASS job class; it will limit its execution to the nodes utilizing the BATCHSERV service.

Using Services with Shared Server

Shared server is discussed less and less frequently in Oracle related publications. The majority of Oracle installations possibly use dedicated server connections and connection pools based on them. The days when hardware was not powerful enough to support a large number of users may be considered ancient history, which partly explains the absence of shared server-related subjects. An application deployed in a RAC environment does not usually have to deal with these problems; we can now scale to a larger user base by adding another node to the cluster. However, we still think it is useful to understand shared servers and services, as well as how they work.

A small number of systems that need to support thousands of connections can benefit from using a shared server configuration, especially in conjunction with connection pooling. But what exactly is a shared server configuration? And how does it relate to services? We will have a look at shared servers first, and then examine the link of shared servers to services.

In a shared server configuration, database sessions do not create a dedicated connection between the database client and the database. Instead, a so-called dispatcher process is initially contacted by the database client. The dispatcher in turn places the client request into a queue, from where an idle shared server from a pool of shared servers picks it up and processes it. The Oracle server process is called shared because there is no 1:1 mapping between it and a client. Once the request is completed, the result of the processing is placed into the dispatcher's response queue; from there, it is transmitted to the client.

To set up and manage connection with shared servers, a number of initialization parameters must be changed. You can do using the database configuration assistant dbca or Enterprise Manager if the standard SQL*Plus interface is not sufficient. The most important initialization parameters are shared_servers and dispatchers. With the latter parameter, we come full circle and create the link with services.

The dispatcher process in the shared server setup is the first component contacted by an Oracle client. By default, dispatchers accept all client connections; however, it is possible to create dispatchers to serve a specific service only. Consider this example for a configuration that uses two dispatchers for the service wikisrv and five dispatchers for all other connection requests:

*.dispatchers=(protocol=TCP)(dispatchers=5)
*.dispatchers=(protocol=TCP)(dispatchers=2)(service=wikisrv)

Prior to the introduction of Database Resident Connection Pooling (DRCP), the use of shared servers could solve problems created by legacy applications.

Note

We will discuss the DRCP feature later in this chapter's "Database Resident Connection Pool" section.

Some CGI scripts created a dedicated connection each time a dynamic web page was requested; this caused unusually high overhead to the database CPUs. The dedicated server process creation does not happen in a shared server configuration; instead, the number of shared server processes and dispatchers is configured by the database administrator in the initialization files. This reduces CPU overhead and overall resource consumption on the database server.

The use of shared server and dedicated server processes is not mutually exclusive. The two coexist quite happily in an Oracle database. In fact, many administration tasks require the use of dedicated server connections, even if you set up shared server for user connections.

Note

Services play an important role for parallel execution as well, for which a whole chapter has been dedicated. Please refer to chapter 14 for more information.

Managing Services

Managing services is an important aspect of any RAC project. As you will see in the "Balancing the Workload" section, defining services to run on the most appropriate subset of cluster nodes is essential in achieving an even workload distribution across nodes. Services are deeply integrated with Grid Infrastructure and Clusterware. Instead of setting the database initialization parameter service_names manually so it defines an instance-to-service mapping, you should use administrative tools such as the ones to be described below instead. In fact, bypassing the administration tools for service maintenance is strongly discouraged!

Next, we will explain how services are created and modified. There are three different tools you can use to create services:

  • The server control srvctl command line tool

  • Enterprise Manager

  • The DBMS_SERVICE package

The Database creation assistant dbca used to be available to manage database services. However, this functionality was removed in Oracle 11.1. The remaining tools available to create, modify, and delete services are the ones mentioned previously in this chapter. We will discuss each of them independently. For learning the definitions of load balancing properties, we kindly refer the reader to the "Balancing the Workload" section later in this chapter; that section includes an extensive discussion of workload management.

Managing Services with SRVCTL

Using srvctl is probably the easiest way to define and manage services in a RAC environment. The syntax has changed a little in Oracle 11.2, allowing administrators to define workload management-related properties on the command line, rather than having to rely on the DBMS_SERVICE package. You will also find that some options refer to administrator-managed databases, while others are applicable to policy-managed databases only. The srvctl utility provides you a number of actions that you can execute against a service, including the following:

  • Create a service

  • Start and stop a service

  • Modify a service

  • Check service's define-time properties

  • Check service's runtime properties

The syntax for the creation of services depends on whether the database type is administrator-managed or policy managed. For administrator-managed databases, the following options are available:

srvctl add service -d dbUniqueName -s serviceName
   -r preferredList [-a availableList] [-P {BASIC | NONE | PRECONNECT}]
   [-l [PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY | SNAPSHOT_STANDBY]
   [-y {AUTOMATIC | MANUAL}] [-q {TRUE | FALSE}] [-j {SHORT | LONG}]
   [-B {NONE | SERVICE_TIME | THROUGHPUT}] [-e {NONE | SESSION | SELECT}]
   [-m {NONE | BASIC}] [-x {TRUE | FALSE}]
   [-z failover_retries] [-w failover_delay]

The list of options is slightly different for policy-managed databases:

srvctl add service -d dbUniqueName -s serviceName
   -g serverPoolName [-c {UNIFORM | SINGLETON}]
   [-l [PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY | SNAPSHOT_STANDBY]
   [-y {AUTOMATIC | MANUAL}] [-q {TRUE | FALSE}] [-j {SHORT | LONG}]
   [-B {NONE | SERVICE_TIME | THROUGHPUT}] [-e {NONE | SESSION | SELECT}]
   [-m {NONE | BASIC}] [-P {BASIC | NONE }] [-x {TRUE | FALSE}]
   [-z failover_retries] [-w failover_delay]

Users of Oracle 11.1 will notice that a number of new options have been introduced into the latest version of the tool. The most important of these new features: You can now set connect-time load balancing parameters and enable the load balancing framework without having to call DBMS_SERVICE.MODIFY_SERVICE() to do so. Many of the arguments to srvctl are identical for other operations regarding services, so they will be listed only once. Table 11-1 shows the most common command-line options to srvctl.

Table 11.1. Command-Line Options for Adding Services

Option

Meaning

Comment

-d dbUniqueName

Specifies the unique name for this database, as defined during the addition of the database to Grid Infrastructure. If you are unsure about the name, you can query the OCR using srvctl config database for a list of defined databases

 

-s serviceName

Specifies the name of the new service to be added.

 

-r preferredList and -a availableList

Provides a comma-separated list of available and preferred nodes for a service.

Only for administrator-managed databases

-g serverPoolName

Specifies the server pool the service should be assigned to.

Only for policy-managed databases

-c UNIFORM or SINGLETON

Defines the service to run either on all nodes in the server pool (uniform) or only one (singleton).

Only for policy-managed databases

-l PRIMARY, PHYSICAL_STANDBY, SNAPSHOT_STANDBY, LOGICAL_STANDBY

Defines the service to start with the specified database role.

 

-P NONE, BASIC, PRECONNECT

Allows you to define the Transparent Application Failover policy.

Only for administrator-managed databases

-y AUTOMATIC or MANUAL

Defines whether the service starts with the database or not.

 

-q TRUE or FALSE

Enables notifications for clients sent through Advanced Queuing rather than FAN.

ODP.NET clients need this to receive load balancing information; some OCI clients need this, as well

- j LONG or SHORT

New in 11.2, the connection load balancing goal can be set via the command line rather DBMS_SERVICE

 

-B NONE, SERVICE_TIME or THROUGHPUT

Disables or enables the load balancing advisory.

 

-e NONE, SESSION, SELECT

Defines the type of failover operation.

 

-m NONE, BASIC

Specifies the failover method.

 

-z and -w

Overrides the default number of failover retries (-z) and the failover delay (-w).

 

You should keep the separation between administrator-managed databases and policy-managed databases in mind. Only administrator-managed databases can have preferred and available instances!

Assume you want to add a service named reporting to your four-node administrator-managed database that normally uses the third node, but can alternatively run on the first node. You could use the following syntax to do so:

[oracle@london2 ˜]> srvctl add service -d PROD -s REPORTING -r PROD3 -a PROD1 
>   -P BASIC -e SESSION

Note that Transparent Application Failover should also be enabled for session failover with the basic method of reestablishing a session only when needed. The new service must be started before it can be used by clients. The syntax for starting services looks like this:

srvctl start service -d dbName [-s service [-n nodeName | -i instanceName ]]

The database name is the only mandatory argument here. If service names are omitted, then all services for the database will be started. The remaining parameters allow the administrator to start one or more services on a specific database instance. To start the preceding created service on its available node, you would use the following command:

[oracle@london1 ˜]> srvctl start service -d PROD -s REPORTING -i PROD1

Conversely, you can stop services using the following command:

srvctl stop service -d dbName [-s service  [-n nodeName | -i instanceName] ] [-f]

For example, you can issue the following command to stop the reporting service on instance PROD3:

[oracle@london1˜ ]> srvctl stop service -d PROD -s REPORTING -i PROD3

Service definitions can be modified after a service is created. You can also shuffle services around once they are created. Instances can also be upgraded from available to preferred. Of course, you can also modify the workload balancing attributes; we will cover how to do that in the "Configuring Server Side Load Balancing" section later in the chapter.

Moving a service from one set of nodes to another is also supported. You can do so using the following syntax:

srvctl modify service -d dbName -s service -i oldInstance -t newInstance [-f]

This syntax moves a service from the old instance to a new instance, optionally forcing session disconnects. For example, we could move the aforementioned reporting service from the preferred instance to the available instances using this syntax:

[oracle@london1˜]> srvctl modify service -d PROD -s REPORTING -i PROD3 -t PROD1

It is also possible to upgrade instances from available to preferred instances:

srvctl modify service -d dbName -s service -i availableInst -r [-f]

To upgrade instance PROD1 to a preferred instance for the reporting service, use the following command:

[oracle@london1˜]> srvctl modify service -d PROD -s REPORTING -i PROD1 -r

It is also possible to perform the mapping between preferred and available instances in one command:

srvctl modify service -d dbName -s serviceName -n -i preferredNodes
     [-a availableNodes] [-f]

Checking the status of services is important from time to time. This helps you identify which services are running on their available, rather than their preferred instances. Passing the relocate argument to srvctl allows the administrator to move the service back to where it should be. To compare the configuration against its current status, Oracle supplies the srvctl config service and srvctl status service calls. When invoked with the config option, Oracle displays configuration information about services, as in the following example:

[oracle@london1 ˜]$ srvctl config service -d PROD -s reporting
Service name: reporting
Service is enabled
Server pool: prod_reporting
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SESSION
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Preferred instances: PROD1
Available instances: PROD3

You can see from the output that the service is defined to run on instance PROD1 as preferred instance, but with PROD3 as a backup. The service is set up neither for distributed processing (XA) nor for connection load balancing or AQ notifications.

The final option to cover in this section is the relocate command. As we said in the introduction, services will not automatically fail back from available instances to their preferred instances. However, you can use the srvctl relocate command to force a service to fail back to its preferred instance:

srvctl relocate service -d dbName -s serviceName
    {-c fromNode-n toNode |
    -i sourceInstance -t destinationInstance }
    [-f]

This is one of the few administrator-managed commands applicable to policy-managed databases, as well, but only for singleton services. The DBA would usually use the -i and -t arguments with administrator-managed databases. The use of the -c and -n flags is most appropriate for policy-managed databases.

Continuing with the example, let's say that we learned our service was running on the available node. Let's also say that we want to move our service back to the preferred instance. We can make that move using this snippet:

[oracle@london3 ˜]$ srvctl status service -d PROD -s reporting
Service reporting is running on instance(s) PROD1
[oracle@london3 ˜]$ srvctl relocate service -d PROD 
> -s reporting -i PROD1 -t PROD3
[oracle@london3 ˜]$ srvctl status service -d PROD -s reporting
Service reporting is running on instance(s) PROD3

Managing Services with Enterprise Manager

Enterprise Manager DBConsole and Enterprise Manager Grid Control both offer the ability to modify services. Enterprise Manager 11g Release 1 is the latest release, and it includes almost complete support for Oracle 11g Release 2 databases. Enterprise Manager Grid Control 10.2.0.5 requires additional patches to support Oracle 11.2 as a managed target, and even then, it cannot use new 11.2 features. This section shows screen captures from an 11.1 Enterprise Manager Grid Control setup that monitors, among other targets, a four-node RAC system.

As with any well-designed graphical user interface, administration of Oracle is simplified, as is the creation and management of services. To access the service management interface, selecting the cluster database target, and then click on Availability tab. You should be presented with the page shown Figure 11-1.

The Availability tab in Enterprise Manager Grid Control for a cluster database

Figure 11.1. The Availability tab in Enterprise Manager Grid Control for a cluster database

Next, click the Cluster Managed Database Services link. After specifying the login credentials for the database and host on the next page, you are redirected to the main service management console within Enterprise Manager (see Figure 11-2). If you installed Grid Infrastructure with a user other than the RDBMS binaries, then please enter those RDBMS account credentials. Otherwise, EM will spin forever trying to get the information. The options presented will differ, depending on the management type of your database. Figure 11-2 shows the resulting "Cluster Managed Database Services" page.

Services in an administrator-managed database

Figure 11.2. Services in an administrator-managed database

Next, click the Create Service button to create a new service. You'll be taken to the page in Figure 11-3, which presents you with the same options as with the command line interface; however, in this case, you don't have to remember the syntax.

Creating a new service for an administrator-managed database

Figure 11.3. Creating a new service for an administrator-managed database

In addition to the options offered by srvctl, you can define custom alerts in cases where a service uses more CPU cycles than a specified threshold. Internally, this calls DBMS_SERVER_ALERT.SET_THRESHOLD(). Grid Control conveniently provides the option to update the server's tnsnames.ora file; it also starts the service after its creation.

Once you successfully add the service to the cluster database, it will appear in the Overview page. If you opted to start the service after its creation, it should be up on the preferred nodes. Another nice feature relates to maintaining the local tnsnames.ora file. This file will include the necessary entry to connect to the service. If this is not required, then the TNS settings can be obtained by clicking the Show all TNS strings button. The page that comes up next displays all TNS strings for the services defined, along with your TAF policies and load balancing settings (see Figure 11-4).

Displaying all TNS settings for the services defined in database PROD

Figure 11.4. Displaying all TNS settings for the services defined in database PROD

Modifying the service settings is as simple as clicking the service name in the Overview page. You are then redirected to the same page you edited when creating the service. The Manage option in the service Overview page lets you start, stop, and relocate services.

Managing Services with DBMS_SERVICE

The DBMS_SERVICE package allows you to create and modify services in Oracle RAC and single-instance databases. In Oracle 11.2, most of this service's functionality is marked as deprecated in favor of the server control utility, srvctl. The following procedures in DBMS_SERVICE should no longer be used in RAC 11.2 and Oracle Restart environments:

  • CREATE_SERVICE()

  • MODIFY_SERVICE()

  • START_SERVICE()

  • STOP_SERVICE()

The documentation states the fact that the service definition and its modifications don't propagate into the metadata held in the OCR. Subsequent calls to srvctl would therefore overwrite settings made by calls to the deprecated subprograms. This is not only true for Oracle 11g, but also for 10g. Therefore, we do not recommend maintaining services with this package in 11.2. DBMS_SERVICE will update the data dictionary only, but Clusterware will not know about the service and cannot monitor it.

In Oracle 11.1 and earlier, calls to DBMS_SERVICE are still relevant, although this is mainly for the aforementioned reasons of modifying service attributes.

One of the more useful remaining functions in DBMS_SERVICE is DISCONNECT_SESSION, which disconnects all sessions from a service.

Balancing the Workload

Services are essential for load-balancing in a RAC environment. Making the optimal use of resources requires that you avoid situations where the majority of connection requests end up on one node, while the other nodes remain mostly idle. In Oracle 10g Release 2 up to, but not including 10.2.0.4, a number of bugs caused load balancing to fail in some cases. As far as we can tell, these problems have been resolved, and load balancing should work as expected in releases 10.2.0.4 and later.

Two types of load balancing exist with the Oracle database software stack: client-side and server-side load balancing. With client-side load balancing prior to Oracle 11.2, you choose a random listener out of the ADDRESS_LIST parameter in the local naming file to connect to. When using the Single Client Access Name with multiple IP addresses instead, connection requests will be spread across all SCAN listeners in a round-robin way. This is also the reason why you should have specified more than one IP address for the SCAN address during the installation. Server-side load balancing uses information provided by the load advisory framework. The SCAN listeners (or database listeners in Oracle 11.1 and earlier) direct connection requests to the most suitable instance providing the database service. The load balancing advisory keeps track of the load on individual instances; thus, it allows the listener to select (among other things) the least loaded node for a connection.

Note

In a RAC environment, you should use both types of load balancing.

We will discuss the implementation of client-side and server-side load balancing, as well as the load advisory framework in the "Configure Client Side Load Balancing" and "Configure Server Side Load Balancing" sections later in this chapter.

Configuring Client-Side Load Balancing

Client-side load balancing, or connection-load balancing, is defined in the local naming file by setting the LOAD_BALANCE parameter to True or On, as in this example;

ebs.example.com =
  (DESCRIPTION=
    (ADDRESS_LIST=
      (LOAD_BALANCE = on)(FAILOVER = ON)
      (ADDRESS = (PROTOCOL = tcp)(HOST = london1-vip.example.com)(PORT = 1521))
      (ADDRESS = (PROTOCOL = tcp)(HOST = london2-vip.example.com)(PORT = 1521))
      (ADDRESS = (PROTOCOL = tcp)(HOST = london3-vip.example.com)(PORT = 1521))
      (ADDRESS = (PROTOCOL = tcp)(HOST = london4-vip.example.com)(PORT = 1521))
      (CONNECT_DATA =
        (SERVICE_NAME = ebs.example.com))))

In RAC 11.2, the LOAD_BALANCE setting has lost a little bit of its relevance with the introduction of the Single Client Access Name (SCAN). Remember that the scan resolves to a maximum of three IP addresses provided by the DNS server in a round-robin fashion. Prior to the 11.2 release, it was good practice to ensure that LOAD_BALANCE was enabled. It was also important to have all nodes' virtual IP addresses listed in the tnsnames.ora file An example for the new syntax using the SCAN is shown below.

ebs. example.com =
  (DESCRIPTION=
    (ADDRESS = (PROTOCOL = tcp)(HOST = cluster1.example.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = ebs.example.com)
    )
  )

As you can see the 11.2 entry looks a lot less cluttered than the previous example, thanks to the SCAN "cluster1.example.com". There is no need to rush now and update all tnsnames.ora files in your environment when migrating from an older version to Oracle 11.2. In the transition phase, while upgrading from an older version of RAC to Oracle 11.2, it is not required to use SCAN straight away. The updating of the various tnsnames files on database clients can wait until the proper change control has been implemented. However, you must use SCAN if you intend to use policy-managed databases.

The FAILOVER directive is another important parameter. Set at the same level as the LOAD_BALANCE directive, the FAILOVER keyword instructs the Oracle client to try connecting to the next entry in the tnsnames ADDRESS_LIST section. For example, assume that node1 in the preceding example has failed and is currently rebooting. Its virtual IP address will have migrated to another node in the cluster, such as node4. Should a client request a connection through node1-vip, it will get a notification that the node failed, and it will immediately try another server thanks to the FAILOVER keyword.

The FAILOVER keyword has lost a bit of its relevance in RAC 11.2 because there is only one address specified in the local naming file. The DNS round-robin resolution will provide another IP address resolving to the SCAN should the initial connection request fail. The client will automatically try the next supplied IP address. However, it is possible to define multiple addresses pointing to standby databases in addition to the primary database. This means you might use a single tnsnames.ora file for the application, as in this example:

ebs.example.com =
  (DESCRIPTION=
    (FAILOVER=ON)
    (CONNECT_TIMEOUT=5)(RETRY_COUNT=3)
      (ADDRESS =
        (PROTOCOL = tcp)(HOST = prodcluster-scan.example.com)(PORT = 1521))
      (ADDRESS =
        (PROTOCOL = tcp)(HOST = drcluster-scan.example.com)  (PORT = 1521))
      (CONNECT_DATA=
        (SERVICE_NAME= ebs.example.com)
      )
  )

The preceding example illustrates yet again the usefulness of the SCAN option. You undoubtedly spotted the CONNECT_TIMEOUT and RETRY_COUNT parameters in the naming definition. These are new 11g Release 2 parameters, and they can be used in client connection strings, rather than a sqlnet.ora file where they are globally applicable.

Note the connect time load balancing is set to work on the Net*8 layer only, and it has nothing to do with session failover; FAN/FCF and TAF are responsible for handling these.

Note

You might be surprised that LOAD_BALANCE and FAILOVER are enabled by default when an ADDRESS_LIST section is used in your tnsnames.ora file. If you don't want these to be enabled, you must specifically set LOAD_BALANCE to and FAILOVER to FALSE.

Configuring Server-Side Load Balancing

SCAN listeners—or database listeners in Oracle releases prior to 11.2—route connection requests to the most suitable instance providing the service requested. To be able to do so, they need to regularly receive information from the database instances about their load. They also need to be instructed about what is actually most suitable for the application. Correct configuration of services is important to make maximum use of the load balancing feature.

The Oracle database has to register with the listener first. It does so using two important initialization parameters: local_listener and remote_listener. The local listener should be set to the node's virtual IP address, even when using the default port of 1521. A number of problems have been reported in versions up to and including RAC 11.1.0.7, where connection requests were re-routed to the hosts public IP instead of the virtual hostname. This rerouting can cause unwanted connection failures if the public hostname doesn't resolve on the client. Setting the local listener parameter is mandatory when using a different than the default port. The local_listener parameter can be hard-coded into the initialization parameter. Or, it can reference an entry from in the tnsnames.ora file. We have seen many sites leave the definition of local_listener and remote_listener to the local naming, as in the following example:

LISTENER_PROD1 =
  (DESCRIPTION=
    (ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=london1-vip.example.com)(PORT=1521))))

The way you set the remote listener changed in RAC 11.2. Instead of registering with the local listener, the database registers with SCAN listeners. The syntax instructing the database to communicate with the SCAN listener has also changed. It now uses the EZConnect keyword. To enable it, you need to update names.directory_path to include the ezconnect keyword in the sqlnet.ora file in _HOME/network/admin/:

names.directory_path=(ezconnect,tnsnames,ldap)

Instead of storing the remote_listener in the tnsnames.ora file, you usually hard-code it in spfile; you accomplish this using the is scanname:port syntax, as in this example:

SQL> alter system set remote_listener='cluster1-scan.grid1.example.com:1521'
 2*  scope=both sid='*';

The preceding statement will fail if the SCAN IP address cannot be resolved by all nodes in the cluster! Oracle explicitly discourages the use of a TNS alias for setting the remote listener setting.

If your RAC cluster operates with Clusterware pre-11.2, then you don't need to worry about this detail. Instead, all you need to do is add an entry in the tnsnames.ora file as in the following example:

LISTENERS_PROD =
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=TCP)(HOST=london1-vip)(PORT=1521))
      (ADDRESS=(PROTOCOL=TCP)(HOST=london2-vip)(PORT=1521))
      (ADDRESS=(PROTOCOL=TCP)(HOST=london3-vip)(PORT=1521))
      (ADDRESS=(PROTOCOL=TCP)(HOST=london4-vip)(PORT=1521))
    )
  )

The next step is to use the alter system command to set the remote_listener parameter to the TNS alias:

SQL> alter system set remote_listener='listeners_prod' scope=both sid='*';

Regardless of which version of Oracle you use, the listener.log files provide periodic service updates from all database instances.

Note

A listener not providing service updates is a common cause for skewed load in RAC 10g. You should try to reload the listener a couple of times to see if that fixes the problem. The 10.2.0.4 patchset allegedly fixes these listener problems.

Cross-registration of listeners is the first step towards achieving a working server-side load balancing. In the next step, it is necessary to provide additional information about the expected connection duration for the service. The property to be used is called CLB_GOAL, and its setting is exposed in the DBA_SERVICES view. CLB_GOAL stands for connection load balancing goal, and it can be defined in Enterprise Manager by using srvctl or the DBMS_SERVICE package. Fortunately, Oracle simplified the setting of CLB_GOAL with Grid Infrastructure, as shown in the following example, which demonstrates how to use the -j switch:

$> srvctl modify service -d dbName -s serviceName -j {LONG|SHORT}

Prior to Oracle 11.2, you had to use DBMS_SERVICE or Enterprise Manager to modify CLB_GOAL, as shown in the following example:

BEGIN
  DBMS_SERVICE.MODIFY_SERVICE (
    service_name => 'SomeServiceName',
    clb_goal     => DBMS_SERVICE.CLB_GOAL_LONG -- or CLB_GOAL_SHORT
  );
end;
/

CLB_GOAL can take two parameters. These parameters tell Oracle about the expected duration of the session, which can be either short or long. The short parameter should be used for connections lasting only a few seconds to minutes in duration. In combination with the GOAL parameter to be discussed next, the short setting for CLB_GOAL is recommended for use with connection pools that include support for Fast Application Notification events.

Setting CLB_GOAL to long is suggested for dedicated connections that remain connected for longer periods of time. Oracle Application Express or Forms sessions are examples for these, as are some Excel plug-ins for database browsing. Setting only the connection load balancing goal will not activate the enhancements provided by the load advisory framework. However, setting this will instruct the listener to route connection requests based on metrics. Possible metrics available are load per node (as defined by the operating system's run queue) or the number of current connections. The load balancing framework provides additional benefits for load balancing, and we will discuss that framework next.

Exploring the Load Advisory Framework

The load advisory framework serves as the basis for runtime connection load balancing in RAC environments. It uses Fast Application Notification events to inform clients about changes to the infrastructure, such as node up or down events or changes to load on an instance. ODP.NET and other non FAN-aware clients can alternatively subscribe to messages sent through Advanced Queuing. To enable it, the GOAL parameter must be defined for a service. This parameter further describes the type of workload expected to connect to a service, and it can take three values:

  • None

  • Service time

  • Throughput

Setting the goal to NONE disables the load balancing advisory for the particular service. Optimization for service time instructs Oracle to route requests to instances providing the best response time. The load balancing framework uses elapsed time and available bandwidth when considering which node to connect to. When optimizing for THROUGHPUT, Oracle will try to hand the connection off to the node providing the best overall throughput. Metrics such as the rate at which work is completed and available bandwidth are used. The recommended use for the SERVICE TIME goal is for web sessions where usage patterns are unknown. Batch processing workloads are candidates for a load balancing goal of THROUGHPUT. Enabling the load balancing advisory for a service is achieved by a call to srvctl in Oracle 11.2 or DBMS_SERVICE in earlier releases, as in this example:

[oracle@london1˜]> srvctl modify service -d QA -s batchserv -B throughput

In Oracle 11.1 and earlier, you need to call DBMS_SERVICE:

BEGIN
  DBMS_SERVICE.MODIFY_SERVICE(
    service_name => 'batchserv',
    goal => DBMS_SERVICE.GOAL_THROUGHPUT
  );
END;
/

Note

The preceding examples assume that a connect time load balancing goal has already been set.

The load balancing advisory is compatible with many key Oracle components, such as the listener, JDBC, and ODDP.NET connection pools. Applications not supported out-of-the-box can still use of it through a published API or OCI callback functions. With the load balancing advisory enabled for a service, FAN events are generated. These events enable clients to gracefully respond to changes in the RAC environment they are running on.

Oracle encourages the use of connection pools for connections to the database. From the point-of-view of the database, this means that a set number of long running transactions is created during application initialization. The application then borrows connections from the pool for short periods of time, performs work, and returns the connection back to the pool, where it can be reused. This removes the overhead of having to create a dedicated server process each time a dynamic web component connects to the database.

Using Transparent Application Failover

Transparent application failover (TAF) has been discussed in several other places in this book in detail. In this section, we will provide an overview that explains how to use it. Please refer back to Chapter 3 for a more complete discussion of the concepts behind TAF. As a reminder, TAF can be configured in two different ways:

  • In a client side tnsnames.ora file

  • As a property of a service

The second approach is the preferred one; it simplifies modifying a TAF policy and lets you manage it centrally instead of having to deploy a configuration file to all your servers.

Unfortunately the Oracle network administration tools network manager and network configuration assistant still aren't TAF-aware. Therefore, users who want to deploy this failover technique on the client side must manually edit the tnsnames.ora file. Alternatively, Enterprise Manager can be instructed to maintain the server's tnsnames.ora file, which can then be copied to clients.

Note

Remember that TAF requires OCI libraries; thus, it will not work with the JDBC thin client!

Configuring Transparent Application Failover

To enable TAF, the alias for a TNS CONNECT_DATA section needs to be amended with a FAILOVER_MODE parameter, so it will support TAF when you define it in through local naming. A sample tnsnames.ora entry for select type failover using the basic connection method might look like this:

reporting =
 (DESCRIPTION=
   (ADDRESS=(PROTOCOL=tcp)(HOST=cluster1.example.com)(PORT=1521))
   (CONNECT_DATA=
     (SERVICE_NAME=REPORTING)
     (FAILOVER_MODE= (TYPE=select)(METHOD=basic))
   )
 )

In the preceding example, a RAC system is referenced through its single-client access name, and client-side load balancing and failover are implicitly enabled. The CONNECT_DATA section in the preceding example specifies that we would like to connect to a service called reporting. For this example, the reporting service can be assumed to run on all cluster nodes. As you can see the TAF-relevant information is inserted in the FAILOVER_MODE section.

The preconnect scenario is more complex to set up. Remember that you need two entries: the default and the backup entry. Consider the following setup:

tafpreconnect =
 (DESCRIPTION =
  (ADDRESS = (PROTOCOL = TCP)(HOST = cluster1.example.com)(PORT = 1521))
  (LOAD_BALANCE = YES)
  (CONNECT_DATA =
   (SERVER = DEDICATED)
   (SERVICE_NAME = tafpreconnect)
   (FAILOVER_MODE =
   (BACKUP = tafpreconnect_PRECONNECT)
(TYPE = SELECT)(METHOD = PRECONNECT)(RETRIES = 180)(DELAY = 5)
   )
  )
 )

tafpreconnect_PRECONNECT =
 (DESCRIPTION =
  (ADDRESS = (PROTOCOL = TCP)(HOST = cluster1.example.com)(PORT = 1521))
  (LOAD_BALANCE = YES)
  (CONNECT_DATA =
   (SERVER = DEDICATED)
   (SERVICE_NAME = tafpreconnect_PRECONNECT)
   (FAILOVER_MODE =
   (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5)
   )
  )
 )

You will notice that additional sessions are established when you connect using this TNS alias:

SQL> select inst_id,sid,failover_type,failover_method, failed_over
  2  from gv$session
  3* where username = 'APRESS'
14:22:10 MARTIN@tafpreconnect > /

   INST_ID        SID FAILOVER_TYPE FAILOVER_M FAI
---------- ---------- ------------- ---------- ---
         1         26 NONE          NONE       NO
         1               156 SELECT          PRECONNECT NO
         2               140 NONE          NONE       NO

Note

When defining the TAF properties with the service directly, the failover_type and failover_method aren't populated in GV$SESSION.

Additional parameters are available to fine-tune the failover operation. In addition to the FAILOVER_METHOD and FAILOVER_TYPE parameters, administrators can set the RETRIES and DELAY parameters. The RETRIES parameter specifies the maximum number of attempts to failover and defaults to five if the DELAY parameter is specified. The DELAY parameter specifies the delay between connection attempts and defaults to one second if the RETRIES parameter is set. However, both the RETRIES and DELAY parameters are ignored when an application registers a TAF callback function, as in the next example we will discuss.

Using TAF with JDBC Applications

Once Transparent Application Failover-aware connection strings or services have been completed, it is possible to create applications that use this feature. There is no additional setup work required for session failover; you can use this feature straight away. For select type failover scenarios, a little more work is required. The next example demonstrates how to use TAF with JDBC applications. Oracle provides the interface OracleOCIFailover in the oracle.jdbc namespace; this enables you to leverage TAF in your applications.

Unfortunately, this interface is not well documented. The most important part of the interface defines a callback function and a number of constants that indicate the state of the failover, such as failover begin, end, abort, and so on. The exact definition is available in the Javadoc API documentation for JDBC.

You need to implement the interface and put any application logic into the CallbackFn() function. The class implementing the callback needs to be registered with the instantiated OracleDataSource object. The callback function will be invoked whenever a TAF event occurs, and it allows the developer to take action accordingly.

Remember that DML operations are not covered by TAF, nor the is session state preserved. However, the developer could intercept the failover and roll his transaction back when the FO_BEGIN event is raised. After receiving the FO_END event, the transaction can be restarted.

The following example demonstrates how to connect to an Oracle database with the OCI JDBC driver and register a TAF callback. The program then opens a cursor against the ALL_OBJECTS view and iterates over it. To extend the execution of the cursor, we added a 1000 millisecond delay before the next record of the result set is retrieved. The Oracle JDK installed with Oracle 11.2 (JDK 1.5) has been used to compile the code. Make sure your classpath argument or environment variable reference at least ojdbc5.jar. You also need to set your LD_LIBRARY_PATH to $ORACLE_HOME/lib. This is needed for the OCI driver. In Windows, you need to set the PATH environment variable instead:

// TNSNames.ora entry for this application:
reporting =
 (DESCRIPTION=
  (ADDRESS=(PROTOCOL=tcp)(HOST=cluster1.example.com)(PORT=1521))
  (CONNECT_DATA=
    (SERVICE_NAME=reporting)
    (FAILOVER_MODE=(TYPE=select)(METHOD=basic)))
  )

// File: TAF.java
import java.sql.*;
import oracle.jdbc.*;
import oracle.jdbc.pool.OracleDataSource;

public class TAF {
  private OracleDataSource ods;
  private static Connection conn;

  public TAF()
  throws Exception
  {
    ods = new OracleDataSource();
    ods.setURL("jdbc:oracle:oci:@reporting");
    ods.setUser("user");
    ods.setPassword("password");
  }
public void connect() throws Exception {

    conn = ods.getConnection();
    ((OracleConnection) conn).registerTAFCallback(new TAFCallback(),
        new Object());
  }

  public void query() throws Exception  {
    Statement stmt = null;
    try {
      stmt = conn.createStatement();
      ssResultSet rs = stmt.executeQuery(
        "select sys_context('userenv','instance_name') " +
          " from dual");
      rs.next();
      System.out.println("We are connected to instance " + rs.getString(1));
      rs.close();
      // now do some more work
      rs = stmt.executeQuery(
       "select object_id,object_name from all_objects");
      while (rs.next()) {
        System.out.println(rs.getInt(1) + "-" + rs.getString(2));
        // sleep a little bit to give us more time to simulate
        // instance failure
        Thread.sleep(1000);
      }
    } catch (SQLException e) {
      int sqlcode = e.getErrorCode();
      System.out.println("Error executing query " + e.toString());
    }
  }
  public static void disconnect()  {
    try {
      System.out.println("disconnecting");
      conn.close();
    } catch (Exception e) {
      System.out.println(e);
    }
  }
  public static void main (String[] args)
  {
    System.out.println("starting..." );

    try {
      TAF t = new TAF();
t.connect();
      t.query();
      t.disconnect();
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

// File: TAFCallback.java
import java.sql.*;
import oracle.jdbc.*;
import oracle.jdbc.pool.OracleDataSource;

public class TAFCallback implements OracleOCIFailover  {

  public int callbackFn (Connection conn, Object ctxt, int type, int event) {

    String failover_type = null;
    switch (type) {
      case FO_SESSION: failover_type = "SESSION"; break;
      case FO_SELECT:  failover_type = "SELECT";  break;
      default:         failover_type = "NONE";
    }

    switch (event) {

      case FO_BEGIN:
           System.out.println(ctxt + ": "+ failover_type +
             " failing over...");
            break;
       case FO_END:
            System.out.println(ctxt + ": failover ended");
            boolean doReconnect = true;
            break;
       case FO_ABORT:
            System.out.println(ctxt + ": failover aborted.");
            break;
       case FO_REAUTH:
            System.out.println(ctxt + ": failover.");
            break;
       case FO_ERROR:
            System.out.println(ctxt +
              ": failover error received. Sleeping...");
            // Sleep for a while-will be invoked again
            try {
              Thread.sleep(100);
            } catch (InterruptedException e) {
               System.out.println("Thread.sleep failed with: "
                 + e.toString());
            }
            return FO_RETRY;
default:
            System.out.println(ctxt + ": bad failover event.");
            break;
     }
    return 0;
  }
}

To see Transparent Application Failover in action, you need to simulate an instance failure. After starting the code execution, open a new terminal and issue a shutdown abort against the instance executing the code:

[oracle@london2˜ ]> srvctl stop instance -d PROD -i PROD1 -o abort

You should now see the application reacting according to your instruction and fail over the second node providing the service:

[oracle@london1 taf]$ java TAF
starting...
We are connected to instance RAC1
20-ICOL$
...
40-I_OBJ5
java.lang.Object@1ac88440: SELECT failing over...
java.lang.Object@1ac88440: failover ended
26-I_PROXY_ROLE_DATA$_1
56-I_CDEF4
...

You can verify the connection from a different instance:

SQL> select sid,serial#,inst_id,failover_type,failover_method,
 2   failed_over,service_name
 3*  from gv$session where service_name = 'REPORTING';

       SID    SERIAL#    INST_ID FAILOVER_TYPE FAILOVER_M FAI SERVICE_NAME
---------- ---------- ---------- ------------- ---------- --- ------------
       152         23          1 SELECT        BASIC      NO  reporting

Now when you abort instance 1, the session fails over and resumes the select:

[oracle@london1 taf]$ srvctl stop instance -d PROD -i PROD1 -o abort

SQL> select sid,serial#,inst_id,failover_type,failover_method,failed_over,
 2   service_name
 3*  from gv$session where service_name = 'REPORTING';

       SID    SERIAL#    INST_ID FAILOVER_TYPE FAILOVER_M FAI SERVICE_NAME
---------- ---------- ---------- ------------- ---------- --- ------------
        41         91          2 SELECT        BASIC      YES reporting

Transparent Application Failover feels a bit dated today, but it remains the easiest way to take advantage of a RAC deployment.

If you are considering using TAF, you should remember that it is not well suited to all types of applications. In fact, it only applies in very limited circumstances. While TAF works well for read-only applications, it presents issues for applications that modify the database. In the event of a failure, any uncommitted transactions will be rolled back or will need to be restarted by the application. Therefore, the application must be capable of detecting the failure and, if necessary, reapplying DML statements up to the point of failure. If this capability is a requirement, then the application must also be capable of recording all statements issued in a transaction, along with the values of any bind variables.

Also, note that while TAF can reauthorize a session, it does not restore the session to its previous state. Therefore, following an instance failure, you will need to restore any session variables, PL/SQL package variables, and instances of user-defined types. TAF will not restore these values automatically, so you will need to extend your applications to restore them manually.

For the reasons just discussed, you may find that the areas within your applications that can take advantage of TAF are fairly limited. In Oracle 10.1 and later, there has been some de-emphasizing of the capabilities of TAF in favor of FAN, especially for JDBC and .Net applications that use Fast Connection Failover (FCF). These are discussed in more detail in the next section.

Implementing Fast Connection Failover

Fast Connection Failover is based on events published by the Fast Application Notification framework that is part of every Real Application Cluster database installation. From a developer's point of view, it offers many advantages over Transparent Application Failover. First, no programmatic changes are needed for the application in most cases. It appears as if the age of fat database clients distributed to each user PC is slowly coming to an end. In an effort to reduce maintenance costs and increase efficiency, many businesses are consolidating their end-user PCs into terminal-server farms. The high maintenance PC might eventually be replaced by a thin client, and application deployment is going to be managed centrally. Many applications follow this trend and are developed with a web frontend that requires very few resources on the client side. Typically, a standard web browser is all that's needed to access the application.

Most of these types of applications use connection pools from which individual connections are borrowed for short periods of time. Fast Connection Failover, along with Fast Application Notification, provides the Oracle developer with the tools needed to successfully write multi-tiered applications. We are going to explore the use of FAN and FCF in this section.

Implementing Fast Application Notification

We introduced the Fast Application Notification Framework in Chapter 3. The framework forms the basis for many RAC specific features. It is mainly a notification mechanism that informs interested clients about state changes. These changes include node up and down events, service up and down events, and so on. FAN also serves as the basis of the runtime load balancing feature by communicating load advisory events.

Note

The great advantage of FAN is that applications do not have to poll the database for information; instead, they receive events immediately after they occur.

You must use services if you want to take advantage of FAN. Many key Oracle components are already using FAN events under the covers. There are three different ways that you can use FAN events:

  • Use an Oracle client with integrated support: No changes to the application are necessary when using JDBC, OCI, or ODP.Net clients. In most cases, these are connection pooled applications. To use the load balancing events, an Oracle database 10.2 or later is required. (You will see an example that relies on a Tomcat-based JDBC-connection pooled application later in the chapter.)

  • Use the FAN APIs: Applications can use FAN events by leveraging the APIs provided. Unfortunately, the documentation regarding this feature for Java is incomplete and incorrect in many places, making it impossible to create an example. A documentation bug has been filed with Oracle to address this.

  • Use FAN callouts on the database server: FAN callouts are small server-side executables in $GRID_HOME/racg/usrco/. Oracle executes them immediately after an event occurs. Their main use is to raise alarms for on-call DBAs or to automatically raise a ticket in a fault-tracking system. For services with a manual policy, a callout script could be employed to relocate or restart services when the preferred instance comes back on line.

Oracle RAC and Oracle Restart use the Oracle Notification Service (ONS) processes to send out the FAN events. The Oracle Notification Service (ONS) is used by Oracle Clusterware to propagate FAN messages both within the RAC cluster and to clients and application-tier systems. ONS uses a simple publish-and-subscribe method to generate and deliver event messages to both local and remote consumers.

ONS is automatically installed as a node application on each node in the cluster. In Oracle 10.2 and later, it is configured as part of the Oracle Clusterware installation process. ONS daemons run locally, sending and receiving messages from ONS daemons on other nodes in the cluster. In Oracle 11.2, ONS got company in the form of eONS, or enhanced ONS. In the base release, it was mainly used to handle user callouts. However, development efforts were underway to integrate eONS into EVMd directly. Unlike ONS, eONS is a java process that uses UDP multicast to communicate with all other cluster members.

Clients can subscribe to ONS events in either of two ways: by using a local ONS process or by specifying a remote ONS instance. Oracle Application Server typically had its own local ONS process; however, ONS is not normally present in an all-java middle tier. ONS and eONS are also available in Oracle Restart allowing developers to code for Data Guard broker events.

FAN events can broadly be categorized into three different event types:

  • Node events

  • Service events

  • Runtime load balancing ("service metrics") events

Node events inform you about cluster membership states and nodes joining and leaving operations. Service events inform you about state changes in services. Finally, service metrics are used by the load balancing advisory to broadcast the load of instances to the client.

It is possible to dequeue the runtime load balancing events from their queue using a simple piece of code, as in the following excerpt:

CREATE procedure display_fan_events (
  po_service out varchar2,
  po_information out varchar2)
is
  v_dequeue_opts        DBMS_AQ.dequeue_options_t;
  v_message_properties  DBMS_AQ.message_properties_t;
  v_message_handle      RAW(16);
  v_event_msg           SYS.SYS$RLBTYP;
BEGIN
    v_dequeue_opts.consumer_name := '&V_GRANTEE';
    v_dequeue_opts.visibility    := DBMS_AQ.IMMEDIATE;
    v_dequeue_opts.dequeue_mode  := DBMS_AQ.REMOVEBROWSE;
    DBMS_AQ.dequeue(queue_name          => 'sys.SYS$SERVICE_METRICS',
                    dequeue_options     => v_dequeue_opts,
                    message_properties  => v_message_properties,
                    payload             => v_event_msg,
                    msgid               => v_message_handle);
    po_service := v_event_msg.srv;
    po_information := v_event_msg.payload;
    COMMIT;
END;
/

It's easy to create a sample application that queries that data and dumps it to the standard output:

#!/usr/bin/perl

use strict;
use warnings;

use DBI;
use DBD::Oracle;
use Getopt::Long;

# define a cleanup routine for CTRL-C
$SIG{'INT'} = 'cleanupandexit';

# get the service name from the command line
my $service;
GetOptions ("service=s" => $service);
die "usage: dequeue.sh -s <servicename>" if (!defined ($service));

# about to start...
print "Dequeue AQ events for Service $service
";
my $dbh = DBI->connect ("dbi:Oracle:$service", "user", "password")
  or die ("Cannot connect to service $service: DBI:errstr!");

# prepare a cursor to loop over all entries in the queue
my $csr = $dbh->prepare(q{
  BEGIN DISPLAY_FAN_EVENTS(:po_service, :po_information); END;
});
# out variables to be bound to the cursor
my $po_service;
my $po_information;

$csr->bind_param_inout(":po_service", $po_service, 4000);
$csr->bind_param_inout(":po_information", $po_information, 4000);

eval {
  # just dequeue all the time, note that the procedure discards entries
  # from the queue after reading them!
  while (1) {
    $csr->execute();
    print "$po_information
";
    sleep 1;   # give the eyes a break
  }
};
if ($@) {
  print "Error executing Cursor: $@";
  cleanupandexit();
}

# we are done - disconnect
$dbh->disconnect();

# called when user presses CTRL-C to clean up or an error occurs
# while exectuing the cursor
sub cleanupandexit {
  print "exiting...
";
  $csr->finish();
  $dbh->disconnect();
  exit 0;
}

This sample output demonstrates the RLB feature very nicely. First, let's generate some load by continuously calculating random numbers:

[oracle@london2 ˜]$ for i in 1 2 3 4 ; do
  sqlplus user/secretpwd@reporting @dothis.sql &
done

[oracle@london2 ˜]$ cat dothis.sql
declare
 n number;
begin
  WHILE (TRUE)
  LOOP
    -- something silly burning CPU...
    n:= dbms_random.random();
  END LOOP;
end;
/

Oracle chose to distribute three of these four sessions to node 2, and one to node 1:

INST_ID SERVICE_NAME
---------- -------------------
         1 reporting
         2 reporting
         2 reporting
         2 reporting

Subsequently, instance two was hit harder:

[oracle@ london1 ˜]$ top
top - 16:58:06 up 6 days,  2:28,  7 users,  load average: 1.44, 0.74, 0.44
Tasks: 223 total,   7 running, 216 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.1%us,  0.7%sy,  0.0%ni, 89.5%id,  8.5%wa,  0.0%hi,  0.0%si,  0.2%st
Mem:   4194304k total,  4119808k used,    74496k free,   104988k buffers
Swap:  1048568k total,   198792k used,   849776k free,   955824k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4885 oracle    25   0  718m  29m  26m R 99.1  0.7   2:41.98 oracle
...

[oracle@ london2 ˜]$  top
top - 16:58:11 up 6 days,  2:28,  2 users,  load average: 3.49, 4.37, 2.29
Tasks: 213 total,   6 running, 207 sleeping,   0 stopped,   0 zombie
Cpu(s): 98.0%us,  0.3%sy,  0.0%ni,  1.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.2%st
Mem:   4194304k total,  4031568k used,   162736k free,   102356k buffers
Swap:  1048568k total,   344756k used,   703812k free,   822636k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2304 oracle    18   0  718m  29m  26m R 92.3  0.7   2:30.08 oracle
 2306 oracle    25   0  717m  25m  23m R 87.6  0.6   1:28.14 oracle
...

The output of the dequeue application reflects this difference in load; it has been slightly reformatted for readability:

[oracle@london1 aq]$ ./dequeue.sh  -s reporting
Dequeue AQ events for Service reporting
service: reporting - VERSION=1.0 database=admindb service=reporting {
  {instance=admindb2 percent=13 flag=GOOD aff=TRUE}
  {instance=admindb1 percent=87 flag=GOOD aff=TRUE} }
    timestamp=2010-05-10 16:57:07
service: reporting - VERSION=1.0 database=admindb service=reporting {
  {instance=admindb2 percent=9 flag=GOOD aff=TRUE}
  {instance=admindb1 percent=91 flag=GOOD aff=FALSE} }
    timestamp=2010-05-10 16:57:37
service: reporting - VERSION=1.0 database=admindb service=reporting {
  {instance=admindb2 percent=7 flag=GOOD aff=TRUE}
  {instance=admindb1 percent=93 flag=GOOD aff=FALSE} }
timestamp=2010-05-10 16:58:07
exiting...

The relevant payload of the event is recorded in the instance, percent, and flag fields. Information pertaining to the instance is grouped. In the preceding example, you can see that instance 2 can take a lot less additional load ("percent") than instance 1, which still has resources to spare. The flag attribute states that, despite the load on the service, the load balancing goal (service time, in this example) is not violated.

Implementing a Fast Connection Failover Example

We chose an example based on JDBC connection pooling with the Apache Tomcat 6 servlet/JSP container. This example uses the new Universal Connection Pool (UCP) instead of the Implicit Connection Cache. The Implicit Connection Cache feature has been deprecated as of Oracle 11.2; new applications should no longer use this feature. This approach requires a little more work to set up; however, this is also a more work-relevant example because the majority of UCP demos are of little practical value.

The example uses JDK 1.6.0.18, Apache Tomcat 6.0.24, and Apache Ant 1.8, along with an Oracle 11.2 client for the remote ONS configuration. All software packages were downloaded from their respective websites and deployed into /opt/ on a machine separate from the cluster nodes. The oracle client software is installed to /u01/app/oracle/product/11.2.0/client_1 to conform to the Oracle Flexible Architecture.

Configuring Tomcat

You can download Apache Tomcat 6 from the project's website or a mirror near you, and then un-tar/gzip the file to /opt/apache-tomcat. This installation directory is henceforth referenced as $CATALINA_HOME. Before starting the web server in $CATALINA_HOME/bin/startup.sh, you should set some environment variables in your shell (please change them where necessary to reflect your environment):

export JAVA_HOME=/opt/jdk1.6.0_18
export CATALINA_HOME=/opt/apache-tomcat-6.0.24
export ANT_HOME=/opt/apache-ant-1.8.0
export ORACLE_HOME=/u01/app/oracle/product/11.2.0/client_1

export PATH=$CATALINA_HOME/bin:$JAVA_HOME/bin:$ANT_HOME/bin:$PATH

Next, you should copy the following required JAR files for the Universal Connection Pool, JDBC, and Oracle Notification Support from your $ORACLE_HOME into $CATALINA_HOME/lib:

  • ojdbc6.jar

  • orai18n.jar

  • ucp.jar

  • ons.jar

The Tomcat startup code will automatically make the JAR files available to all deployed applications. Finally, you need to add support for FAN events in Tomcat. To do so, edit $CATALINA_HOME/bin/startup.sh and modify the line reading CATALINA_OPTS to include the following pointer to the Oracle client:

CATALINA_OPTS="--Doracle.ons.oraclehome=/u01/app/oracle/product/11.2.0/client_1  $CATALINA_OPTS"

The Oracle client's ons.config file contains the configuration for the remote Oracle Notification Service and its ports. Edit the file in $ORACLE_HOME/opmn/conf and ensure that you are making use of the correct ONS ports. In a default installation, the following should work for you after you change the host names in the line beginning with "nodes":

localport=6100
remoteport=6200
nodes=london1:6200,london2:6200,london3:6200

You can use onsctl debug in the Grid Infrastructure home on any of the Real Application Cluster nodes to find out which ports are in use by ONS. Therefore, the line beginning with "nodes" needs to list all the cluster nodes, as well as their remote ONS ports.

Preparing the Code

With all the preparation work completed, getting a data source from the pool is relatively simple. We need to look up the context where the data source is stored. By convention, this is in the jdbc sub namespace. We use a POJO (plain old java object) class to perform the lookup and connection management. A Java Server Page uses this class to retrieve a connection and display information about the server pool and various other statistics available in the UCP namespace. Here is the code behind the java class:

package com.apress.racbook.wlm.ucp;

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import oracle.ucp.jdbc.PoolDataSourceFactory;
import oracle.ucp.jdbc.PoolDataSource;

import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

import java.io.Serializable;
import java.sql.ResultSet;
import java.util.Properties;

public class UCPDemo {

  private    Context         envContext, initContext;
  private    PoolDataSource  pds = null;

  // look up a JNDI data source in the "comp/env" context root
  public UCPDemo(String jndiName)
throws NamingException, SQLException
  {
    initContext = new InitialContext();
    envContext = (Context) initContext.lookup("java:/comp/env");

    pds = (PoolDataSource) envContext.lookup(jndiName);
  }

  // return a pooled connection from the pooled data source
  private Connection getConnection()
    throws SQLException
  {
    return pds.getConnection();
  }

  // get the instance the session was borrowed from. Retuns the
        //SQL Exception
  // if raised in the String as well.
  public String getInstance()
    throws SQLException
  {
    String instName = "";

    try {
      Connection conn = this.getConnection();
      Statement stmt = conn.createStatement();
      ResultSet rst = stmt.executeQuery(
        "select sys_context('userenv','instance_name') from dual");

      rst.next();
      instName = rst.getString(1);

      // return the connection to the pool
      stmt.close();
      conn.close();
      conn = null;
    } catch (SQLException sqlexception) {
      instName = sqlexception.toString();
    }

    return instName;
  }

  // public getter method for the private variable
  public PoolDataSource getPDS()  {
    return this.pds;
  }
}


// File: generic_connection.jsp
<%@ page language="java" contentType="text/html; charset=UTF-8"
pageEncoding="UTF-8"%>
<%@ page import="com.apress.racbook.wlm.ucp.*" %>
<%@ page import="oracle.ucp.jdbc.oracle.*" %>

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
  "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  <title>Generic Connection Page</title>
</head>
<body>

<table cellspacing="2" cellpadding="2" border="0">
<tr>
  <td><img src="images/proOracleRac11gOnLinux.gif" /></td>
  <td>

    <h1>Pro Oracle RAC 11g on Linux</h1>

    <h3>UCP Connection Pool Demo</h3>
  </td>
</tr>
<tr>
  <td colspan="2">
    <p>
      This demo uses a resource defined in $CATALINA_HOME/conf/context.xml to
      connect to a UCP connection pool using an Oracle 11.2 RAC database.
    </p>
  </td>
</tr>
</table>

<%
String jndiName = request.getParameter("jndiName");
// do processing if the user has entered a JNDI name previously
if (jndiName != null)  {

%>
<h2>Connecting to JNDI Name <%= jndiName %></h2>
<%

UCPDemo ucp;
ucp = new UCPDemo(jndiName);

%>

<p>You are connected to instance <%= ucp.getInstance() %></p>

<h3>Statistics</h3>

<table width="100%" cellspacing="2" cellpadding="2">
<tr>
  <th colspan="4">Pooled Data Source Connection</th>
<tr>
  <td> getAbandonedConnectionTimeout()   </td>
  <td> <%= ucp.getPDS().getAbandonedConnectionTimeout() %>  </td>
  <td> getAvailableConnectionsCount()    </td>
  <td> <%= ucp.getPDS().getAvailableConnectionsCount()  %>  </td>
</tr>
<tr>
...
<hr/>

<h3>Specify Connection Information</h3>

<form action="generic_connection.jsp">

<table>
<tr>
  <td>Connect to jndi Name: </td>
  <td><input type="text" name="jndiName" value="<%= jndiName %>"/></td>
  <td colspan="2"><input type="submit" /></td>
</tr>
</table>

</form>

</body>
</html>

Note

You can download the complete source code for this example from the Apress website at www.apress.com. To deploy this code to your environment, simply drop the workloadMgmt-1.0.war file into $CATALINA_HOME/webapps.

The JSP page prompts the user for a JNDI name to connect to and then uses the UCPDemo class to initialize the connection pool. The user is rewarded with a wealth of statistics about the pool. The JNDI data source is defined in the web application's context.xml file. Beginning with Tomcat 5, JDBC data sources no longer need to be defined in server.xml or context.xml files that reside in $CATALINA_HOME/conf. Previously changes to the data source required a restart of the Tomcat instance. The context.xml relevant to this application is deployed as part of the META-INF directory. A sample data source configuration for a RAC data source can be defined as follows. This example has been reformatted for readability; when working with the context, it seemed to be required that all the information be one line:

<?xml version='1.0' encoding='utf-8'?>
<Context>
  <Resource name="jdbc/reporting"
    auth="Container"
factory="oracle.ucp.jdbc.PoolDataSourceImpl"
    type="oracle.ucp.jdbc.PoolDataSource"
    description="FCF RAC database"
    connectionFactoryClassName="oracle.jdbc.pool.OracleDataSource"
    connectionWaitTimeout="30"
    minPoolSize="10"
    maxPoolSize="50"
    inactiveConnectionTimeout="60"
    timeoutCheckInterval="30"
    fastConnectionFailoverEnabled="true"
    onsConfiguration="nodes=node1:6200,node2:6200,node3:6200,node4:6200"
    user="databaseUser" password="databasePassword"
    url="jdbc:oracle:thin:@//cluster1.example.com:1521/reporting"
    connectionPoolName="FCF Connection Pool for service REPORTING"
    validateConnectionOnBorrow="true"
    sqlForValidateConnection="select 1 from DUAL"
    maxConnectionReuseTime="30"
  />
</Context>

Many of the attributes of the Resource tag reference setter functions in the PoolDataSource interface in the oracle.ucp.jdbc namespace. The most important attributes are listed in Table 11-2.

Table 11.2. Important Attributes for Creating Connection Pools

Attribute

Meaning

Comment

Name

The resource name

Use the resource name to look up the JDNI name in your application. In the sample code, this needs to be entered in the HTML form.

Type

The type of data source to be created by the servlet container

Use the suggested PoolDataSource here for Universal Connection Pool

minPoolSize

The minimum pool size

Allocated this when the connection pool is created.

maxPoolSize

The maximum pool size

Prevents the connection pool from growing beyond that specified number of sessions

fastConnectionFailoverEnabled

Controls the Fast Connection Failover

Enables Fast Connection Failover; obviously; this is the most important parameter for this demonstration!

onsConfiguration

Enables mapping between cluster nodes and their ONS ports

Required for Fast Connection Failover. Otherwise, the connection pool cannot receive FAN events.

user,password,URL

The database connection

 

validateConnectionOnBorrow

Ensures the connection borrowed from the pool is valid

Relieves the developer of the responsibility of checking whether a connection is valid when borrowing it from the pool. This can create overhead if set to True.

When you point your web browser to the application (such as http://tomcathost:8080/workloadMgmt/generic_connection.jsp) and enter a valid JNDI name, the connection pool will be created and allocate sessions as specified per the minimum pool size. When refreshing the page, you should see the statistics change according to usage. You can use tools such as the Apache benchmark program (/usr/sbin/ab2) to stress the application a little and use the DISPLAY_FAN_EVENTS() function to find out how Oracle reacts to changes in load. You should see service up and down events when stopping and starting services, and you should also see sessions migrating from the instance where the service failed to other instances that provide the failed service.

Establishing Connection Affinity

When using connection pools, users grab a session from the pool, perform some work, and hand it back to the pool. It would be nice if the allocation of sessions could have some affinity to the instance a previous user allocated his session from. The rationale for this idea: Oracle's runtime load balancing previously selected an instance suitable for creating the initial session. If so configured, the initial and subsequent sessions will have an affinity to this instance, based on the assumption that the instance has spare resources. Sessions can incur performance penalties when relocated to a different instance after a load advisory event. The affinity hint tries to keep a session on the instance, regardless of the events received by the RAC instance.

The affinity hint does not guarantee that node affinity is going to be respected. If Oracle cannot satisfy an affinity request, it will break the affinity and reassign sessions from the connection pool to different instances. After this reshuffling of sessions, new affinities will be established.

Oracle supports connection affinity with the Universal Connection Pool and Oracle RAC from 11g Release 1 and later. Two types of affinity exist to choose from:

  • Web session affinity

  • Transaction-based affinity (also known as XA-affinity)

Web session affinity is often used for short lived sessions that do not incur a massive penalty when redirected to a different instance. Transaction-based affinity is used for the opposite—long-lived transactions that would incur performance penalties when relocated to a different instance if the cluster should use this type.

Connection affinity is based on Fast Connection Failover, run time connection load balancing, and a registered affinity callback. The first two points have already been discussed. The third and fourth requirements need some more explanation. The callback class has to implement the interface oracle.ucp.ConnectionAffinityCallback. During session initialization (such as in the constructor of class UCPDemo in the preceding example), the developer registers the instantiated callback with a PoolDataSource object. The following example demonstrates how to set an affinity. There are quite a few lines of code in this example, but the important bit is in the affinityPolicy variable and the setAffinityPolicy() and getAffinityPolicy() functions. Although they have to be implemented, the context functions serve no specific purpose in the example:

package com.apress.racbook.wlm.ucp;

import oracle.ucp.ConnectionAffinityCallback;
import oracle.ucp.ConnectionAffinityCallback.AffinityPolicy;

public class ConnectionAffinity
  implements ConnectionAffinityCallback
{

  ConnectionAffinityCallback.AffinityPolicy affinityPolicy =
    ConnectionAffinityCallback.AffinityPolicy.WEBSESSION_BASED_AFFINITY;
    // use TRANSACTION_BASED_AFFINITY alternatively
  Object appAffinityContext = null;

  public AffinityPolicy getAffinityPolicy() {
    return this.affinityPolicy;
  }

  public void setAffinityPolicy(AffinityPolicy policy) {
    this.affinityPolicy = policy;
  }

  public Object getConnectionAffinityContext() {
    return appAffinityContext;
  }

  public boolean setConnectionAffinityContext(Object affinityContext)  {
    this.appAffinityContext = affinityContext;
    return true;
  }
}

// the modified constructor of UCPDemo now is defined as follows:
public class UCPDemo {

  private   Context         envContext, initContext;
  private   PoolDataSource  pds = null;

  public UCPDemo(String jndiName)
    throws NamingException, SQLException
  {
    initContext = new InitialContext();
    envContext = (Context) initContext.lookup("java:comp/env");

    pds = (PoolDataSource)envContext.lookup(jndiName);

    // set a little bit of connectivity
pds.registerConnectionAffinityCallback (new ConnectionAffinity());
  }
...

When executing the sample application, you should now see that you are redirected to the same instance each time you refresh the page.

Reading FAN events using the FAN API

Oracle provides an API to register callbacks with FAN for applications that cannot use either the Universal Connection Pool or the Implicit Connection Cache. It also allows developers to create their own connection pools. However, only a subset of events is available through this API.

Speaking in pseudo-code, the application needs to connect (subscribe) to a service to receive FAN events. Event handler functions defined in oracle.simplefan.FanEventListener can be implemented by the developer to react to the following event classes:

  • LoadAdvisoryEvent

  • NodeDownEvent

  • ServiceDownEvent

Depending on the event type, additional information is available. To enable this functionality in an application, you should place the simplefan.jar and ons.jar files into your class path. You should also configure the Oracle client's ons.config file as described previously to specify the remote ONS ports. Interestingly, the classes of the oracle.simplefan namespace are not part of the main JDBC javadoc documentation. Instead, these were described in the Oracle Database RAC FAN Events Java API Reference document. At the time of writing, the documentation regarding this feature was both incorrect in many places and incomplete. This is a new 11.2 feature, so we expect it to mature in future patchsets.

Using the Resource Manager

Database services and the Resource Manager form an ideal pair for workload management in RAC, especially in consolidated databases. Where services are used to logically subdivide the cluster into groups of similar workload, the Resource Manager allows the administrator to govern resource usage based on the service. When users connect through a dedicated service, a mapping within Oracle assigns these users a resource consumer group. Part of a resource plan, a resource consumer group has certain rights to consume resources on the database. The Resource Manager helps administrators prevent situations where less important workloads starve out the important ones, potentially endangering service-level agreements (SLAs).

The Resource Manager serves as a very strong argument in favor of consolidating multiple single-instance and RAC-databases into a single, more powerful cluster. Up until Oracle 11.1, the Resource Manager worked horizontally only—within a database. Beginning with Oracle 11.2, the so-called instance caging extends the Resource Manager's functionality vertically to multiple database instances on the same hardware.

The use of the Resource Manager is preferred over operating system management of application classes to govern their resource consumption. The operating system scheduler does not know about the different requirements of Oracle foreground and background processes, nor are operating systems able to provide the same granularity of control.

The Resource Manager relies upon the following concepts:

A resource consumer group: This is a group of users based on similar requirements. The Resource Manager allocates resources based on consumer groups. You can define your own custom mapping of sessions to resource consumer groups based on the service users connect to.

A resource plan directive: Plan directives associate how resources are allocated within a consumer group. Plenty of options exist to define which resources in what quantity are available to a consumer group. The the most important of these resources is probably the CPU quantity.

A resource plan: The resource plan is the top-level object in the Resource Manager. It brings resource consumer groups and their plan directives together. For the Resource Manager to work, a plan needs to be active. Resource plans can be assigned to scheduler windows (as discussed previously in this chapter), making it possible to have different plans for different times in the week.

Oracle comes with three pre-defined consumer groups: SYS_GROUP, DEFAULT_CONSUMER_GROUP, and LOW_GROUP. You will find that all user sessions other than SYS and SYSTEM in a default Oracle installation use the default consumer group. You can see that by checking the RESOURCE_CONSUMER_GROUP column in the GV$SESSION view.

The Resource Manager is very flexible, and it allows the creation of hierarchical plans and other advanced features. We'll work through a simplified example that demonstrates how to create a resource plan that supports two database services.

The high level steps to create the resource plan for our example are as follows:

  1. Create a sandbox to verify the plan before activating it; this is called a pending area.

  2. Add the consumer groups.

  3. Create a resource plan.

  4. Add plan directives to the plan.

  5. Define mapping between consumer groups and services and grant privileges to users/roles to switch consumer groups.

  6. Validate the sandbox.

  7. Submit the sandbox after a successful validation.

  8. (Optionally) Enable the plan.

Here is the PL/SQL code to implement the aforementioned steps:

-- Create the pending area
SQL> exec dbms_resource_manager.create_pending_area();

-- Add two consumer groups-reporting and batch (group "BATCH" already
-- exists in 11.1 and 11.2)

SQL> begin dbms_resource_manager.create_consumer_group(
  consumer_group => 'REPORTING_GROUP',
  Comment => 'group connecting through service reporting'),
end;
/
-- this is not needed in 11g
SQL> begin dbms_resource_manager.create_consumer_group(
  consumer_group => 'BATCH_GROUP',
  comment => 'group connecting through service batch'),
end;
/


-- Create theresource plan. We use the RATIO method to define the ratio of
-- CPU usage between resource consumer groups.
-- The ratio is 1:2:3:5:8 for low group, other group, reporting group, batch
-- group and sys group.

SQL> exec dbms_resource_manager.create_plan( plan => 'DAYTIME_PLAN', -
     mgmt_mth => 'RATIO', comment => 'Pro Oracle Database 11g RAC sample plan')

-- Create a plan directive for each consumer group. This is a very basic
-- example only using CPU allocation.

SQL> exec dbms_resource_manager.create_plan_directive( plan=>'DAYTIME_PLAN', -
  mgmt_p1 => 1, group_or_subplan => 'LOW_GROUP', comment => 'low group')

SQL> exec dbms_resource_manager.create_plan_directive( plan=>'DAYTIME_PLAN', -
  mgmt_p1 => 2, group_or_subplan => 'OTHER_GROUPS', -
  comment => 'others group')

SQL> exec dbms_resource_manager.create_plan_directive( plan=>'DAYTIME_PLAN', -
  mgmt_p1 => 3, group_or_subplan => 'REPORTING_GROUP', -
  comment => 'reporting group')

SQL> exec dbms_resource_manager.create_plan_directive( plan=>'DAYTIME_PLAN', -
  mgmt_p1 => 5, group_or_subplan => 'BATCH_GROUP', comment => 'batch group')

SQL> exec dbms_resource_manager.create_plan_directive( plan=>'DAYTIME_PLAN', -
  mgmt_p1 => 5, group_or_subplan => 'SYS_GROUP', comment => 'sys group')


-- Define mapping for service reporting to reporting group, same for batch
SQL> begin
  dbms_resource_manager.set_consumer_group_mapping(
   attribute => DBMS_RESOURCE_MANAGER.SERVICE_NAME,
   value => 'REPORTING',
   consumer_group => 'REPORTING_GROUP'),
end;
/

SQL> exec dbms_resource_manager.set_consumer_group_mapping( -
  attribute => DBMS_RESOURCE_MANAGER.SERVICE_NAME, -
  value => 'BATCHSERV', -
  consumer_group => 'BATCH_GROUP')
-- Validate the pending area
SQL> exec dbms_resource_manager.validate_pending_area

-- Submit the pending area

SQL> exec dbms_resource_manager.submit_pending_area


-- Allow users REPORTING_USER and BATCH_USER to switch to their
-- consumer group. Must be done after the initial pending area has
-- been submitted.
BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA;
  DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP (
     GRANTEE_NAME   => 'REPORTING_RW',
     CONSUMER_GROUP => 'REPORTING_GROUP',
     GRANT_OPTION   =>  FALSE);
  DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP (
     GRANTEE_NAME   => 'BATCH_RW',
     CONSUMER_GROUP => 'BATCH_GROUP',
     GRANT_OPTION   =>  FALSE);
  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA;
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA;
END;
/

Your work is not yet done once you define the plan defined; you must also enable it. You do this either by assigning it to a window or by executing the following command:

SQL> alter system set resource_manager_plan = 'DAYTIME_PLAN' scope=both sid='*';

Next time a user connects to the database via services REPORTING or BATCH the automatic mapping to a consumer group should take place.

Caging an Instance

The Oracle Resource Manager has always been very good at managing resource consumption in the same database. It knows about the requirements of Oracle database foreground and background processes, and it can send individual processes to sleep more effectively than an operating system's scheduler. While the vertical management of resource usage was part of the Resource Manager very early on, it could not manage horizontally (i.e., across database instances).

Some large SMP machines, as well as RAC databases forming part of consolidation projects, have multiple databases running on the same host. Like any process in a runnable state, Oracle competes for available scheduling time. Resource Manager 11.2 has now finally been extended to prevent one of the databases from starving the others out for resources. In other words, the Resource Manager can now cage instance resource usage horizontally, while at the same time controlling resource usage within the database instance. This is a huge step forward, especially in cases where users cannot consolidate data in a single database for security or compliance reasons, yet still want to use one consolidated cluster.

Note

Instance caging allows the administrator to partition the available CPU power between databases; however, instance caging does not prevent administrators from over-provisioning CPU power to databases.

Enabling instance caging, the marketing name for advanced capabilities of the Resource Manager, is simple. It requires an active resource plan on all the databases in the cluster. All that needs to be done at this point is to set the instance cpu_count initialization parameter to the desired value. For instance, on a RAC cluster with two databases (such as the production wiki and RMAN catalog), the available eight CPU cores could be allocated like this: six for the wiki and two for the RMAN catalog. The cpu_count parameter is a dynamic parameter that can be set at runtime, so corrections are possible without interrupting a service.

Database Resident Connection Pool

Applications not written either in Java or .Net often cannot use connection pools. In the past, this could lead to severe performance problems in cases where high throughput was required. Dedicated server processes simply could not be created quickly enough, and CPU overhead on servers could be significant. Shared server processes could somewhat help in the short term until the application was ported to a middle-tier solution capable of connection pooling.

In Oracle 11.1, this problem was addressed with database resident connection pools (DRCP). Instead of solely relying on a middle tier to provide connection pools, applications written in PHP or even classic CGI can use a connection pool provided by and inside the Oracle database. The use of a database resident connection pool is not limited to scenarios where no middle-tier connection pool is available; it can also be used to complement such a setup to scale into thousands of users.

A so-called pooled server combines a hybrid foreground process with a database session. Using a DRCP can result in a significant reduction of resource usage, mainly in terms of memory and CPU resources. Thus, it allows a large user base that would otherwise be impossible to support with the given hardware. The readily available pooled servers also reduce the cost of creating new client connections.

Connection handling is similar to a middle-tier connection pool. A connection broker will pick an available pooled server from the available servers to handle a client request. If there is no free pooled server, the server pool will be grown by a predefined value. if the pool reaches its maximum size, client requests will be placed into a queue. As soon as a client session has been released, the used pooled server is returned to the pool to serve additional client requests. Unlike with a shared server, the memory used by the pooled servers will be allocated from the PGA, rather than the shared pool. This is a tribute to the close relationship between pooled and dedicated server processes.

Administrators and developers can use the database resident connection pool either through OCI commands or a PL/SQL API in the form of DBMS_CONNECTION_POOL. Unfortunately, the PL/SQL package does not allow the creation of additional connection pools. The default database resident connection pool is the only pool available; it is named SYS_DEFAULT_CONNECTION_POOL.

To demonstrate the usefulness of DRCP, we set up a simple example. The performance improvements of using DRCP over dedicated sessions is quite dramatic, as you will see. The demonstration uses the typical single-threaded Apache 2.2.3/PHP 5.3.2/OCI 1.4.1 combination that is often found in open source-based development projects.

The two PHP scripts (named drcp.php and dedicatedSever.php) in this example are identical, except for the connection string used. In the DRCP case, the connection pool is tied to a policy-managed database's uniform service with connect-time load balancing goal of short and a service time goal. The TNS aliases are defined as follows:

drcp =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = cluster1.example.com)(PORT = 1521))
    (LOAD_BALANCE = YES)
    (CONNECT_DATA = (SERVER = POOLED)(SERVICE_NAME = drcp)))

nondrcp =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = cluster1.example.com)(PORT = 1521))
    (LOAD_BALANCE = YES)
    (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = nondrcp)))

Again, the full code suite can be downloaded from the Apress website (www.apress.com); the following snippet shows the relevant, very basic PHP code:

<?php
if ( $conn=OCILogon("user", "password", "nondrcp")) {
  echo "Successfully connected to Oracle.
";

  // get the instance information
  $stid = oci_parse($conn, "select sys_context('userenv','instance_name'), " .
     "sys_context('userenv','service_name') inst from dual");
  $r = oci_execute($stid);

  while (($row = oci_fetch_array($stid, OCI_BOTH))) {
    echo "<p>You are connected to instance ";
    echo $row[0] . " using service " . $row[1];
    echo "</p>";
    }

    oci_free_statement($stid);

    OCILogoff($conn);
  } else {
    $err = OCIError();
    echo "Error in connecting to the Oracle." . $err["message"];
  }
?>

The preceding example simply connects to Oracle (TNS alias, nondrcp) and queries the userenv context to learn which instance it runs on and the service name the session connected to. It then closes the session. We've kept this example deliberately simple to re-create the most common problem with non-pooled dedicated server connections.

The database resident connection pool uses the attributes described in Table 11-3.

Table 11.3. Important DRCP Properties for the Example Used

Property

Meaning

Comment

Pool Name

The database resident connection pool used

Only the default is supported.

Minimum pool size

The initial pool size

Do not choose this value; it's too low.

Maximum pool size

The maximum number of pooled sessions possible

 

Pool increment

Increment the current pool size by n sessions until maximum pool size is reached

 

Inactivity Timeout

Idle timeout before the session is returned to the pool

 

Once the pool is defined per the preceding settings, it needs to be started:

SQL> exec dbms_connection_pool.start_pool;

We used the Apache benchmarking tool, ab, to test the effect a database resident connection pool has on the throughput and memory usage. The parameters used required 10000 requests with a concurrency level of 20. The benchmarking software was executed on the web server to eliminate potential network impact. The results of the test were quite telling. Here are the dedicated server results:

Document Path:          /dedicatedServer.php
Document Length:        699 bytes

Time taken for tests:   52.169972 seconds
Complete requests:      10000
Failed requests:        9252
   (Connect: 0, Length: 9252, Exceptions: 0)
Total transferred:      11324772 bytes
HTML transferred:       9404772 bytes
Requests per second:    191.68 [#/sec] (mean)
Time per request:       104.340 [ms] (mean)
Time per request:       5.217 [ms] (mean, across all concurrent requests)
Transfer rate:          211.98 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.6      0      27
Processing:     3  103 489.2     30    8221
Waiting:        3  103 489.2     29    8221
Total:          3  103 489.2     30    8221

A large percentage of requests could not be served, and the total execution time was 52.167 seconds. By comparison, the DRCP example beats these statistics by miles—and ab did not report any failures during the execution. The whole 10000 requests were served by Oracle in 7.1 seconds, with an average of 14.213 ms per request. Compare that number to 104340 ms for the dedicated server. What follows are the results from the DRCP example:

Document Path:          /drcp.php
Document Length:        695 bytes
Concurrency Level:      20
Time taken for tests:   7.106642 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      8870000 bytes
HTML transferred:       6950000 bytes
Requests per second:    1407.13 [#/sec] (mean)
Time per request:       14.213 [ms] (mean)
Time per request:       0.711 [ms] (mean, across all concurrent requests)
Transfer rate:          1218.86 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       7
Processing:     2   13 114.2      6    4530
Waiting:        2   13 114.2      6    4530
Total:          2   13 114.2      6    4530

Unfortunately, the use of any Advanced Security Option feature is not possible with DRCP. There are some other, less severe limitations, as well.

Summary

We covered a lot of ground in this lengthy chapter, and yet it in some ways this is only touching the tip of the iceberg. Oracle workload management and RAC were designed to work together, and you should invest a lot of effort in laying out the optimal configuration for services in consolidated databases. Database services are the key to successful workload management, and these can be configured by a number of tools. While the database configuration assistant was available in Oracle 10g, the local database console or Enterprise Manager Grid Control have taken over the reins. Most of the functions provided by DBMS_SERVICE have also been deprecated. This essentially leaves the server control utility srvctl and Enterprise Manager as the most useful configuration tools.

The aim of the workload balancing techniques described in this chapter is to spread the workload evenly over all instances of a cluster. For example, connect-time load balancing either uses round robin DNS resolution or multiple listener entries in the local naming file to spread the work across multiple (SCAN) listeners evenly. In a different vein, server-side load balancing tries to ensure that the listener initially contacted hands the connection request off to the least loaded node providing the requested service. Finally, runtime load balancing can be used to reassign sessions during their existence in a connection pool to less-loaded instances.

Oracle RAC uses the Fast Application Notification framework to broadcast information about load and node status. FAN-integrated clients can use the Fast Connection Failover technology to leverage this information and reassign database sessions in a connection pool when an instance fails. This is usually transparent to the end user. For some applications, Transparent Application Failover can be implemented providing session failover capability when instances fail. Open cursors can be salvaged when a session fails over, but any ongoing transaction will be rolled back. Neither can session state cannot be failed over; this combination makes TAF a less desirable option than Fast Connection Failover.

The previously used Implicit Connection Cache has been deprecated in favor of the new Universal Connection Pool, which offers many advanced features, including instance affinity. Oracle will try to keep sessions on the same instance if relocating them to a different instance would prove costly in terms of resources.

We also looked at the Resource Manager and how it can help assigning sessions to resource groups, governing their allowed resource usage. The Resource Manager has been extended in Oracle 11.2; now it not only manages resources within a database, but also within all the databases inside the same cluster node. Using the toolset provided by Oracle properly should enable you to consolidate projects on Oracle RAC; also, the extensive monitoring for services provides the business and the infrastructure teams with vital information about the use of their hardware.

Database Resident Connection pools are a great new addition to the Oracle software stack for single-threaded applications that do not use connection pooling. The Oracle database once more proves to be the one-stop solution for the developer's needs.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset