CHAPTER 7


Resource Management

If resources were unlimited, there would be no need to manage them. We see this in all aspects of our daily lives. If your car were the only car on the road, traffic signals would not be necessary. If you were the only customer at the bank, there would be no need for the winding ropes that form orderly lines. But, as we all know, this is rarely the case, which is the same for database servers. When the load on the system is light, there is very little need for resource management. Processes complete in a fairly consistent period of time. But when the system gets busy and resources become scarce, we can find ourselves with an angry mob on our hands.

Since version 8i of the database, the Oracle Database Resource Manager (DBRM) has been available with features to manage resources inside the database. Without DBRM, all database connections are treated with equal priority, and they are serviced in a sort of round-robin fashion by the operating system scheduler. When the system is under heavy load, all sessions are impacted equally: Low-priority applications receive just as high a priority as business-critical applications. It is not uncommon to see a few poorly written ad-hoc queries degrade the performance of mission-critical applications using the same database. If you have been a DBA long enough, especially in data warehouse environments, you are probably familiar with the Unix renice command, a privileged command that allows you to influence the CPU priority of a process at the operating-system level. A number of years ago, we worked in a DBA group supporting a particularly heavily loaded data warehouse. The renice command was used frequently to throttle back CPU priority for database sessions that were dominating the system. There were a couple of obvious problems with this approach. First of all, the renice command is available only to the root user, and system administrators were reluctant to grant DBAs access to it. The second problem was that automating it to manage CPU resources was difficult at best.

Oracle’s Database Resource Manager is a much more elegant solution to the problem. It allows DBAs to address resource allocation within the domain of the database itself. It is a well-organized framework that is automated by design. It ensures that critical system resources such as CPU will be available to your important applications whenever they are needed, even when the system is under a heavy workload. This is done by creating resource allocation schemes that define priorities based on the needs of the business.

Another case for resource management is consolidation. It was inevitable that a platform with the performance, capacity, and scalability of Exadata would be viewed by many as an ideal consolidation platform. But consolidating databases is a challenge, mainly because of the difficulty of managing resources across databases. We have worked with a large number of clients who have used Exadata to consolidate multiple database servers onto the Exadata platform. One such client consolidated 29 databases from 17 database servers onto two Exadata full racks. Needless to say, without Oracle’s resource management capabilities, it would be extremely difficult, if not impossible, to balance system resources among so many database environments. The database resource manager alone, however, provides no way to prioritize I/O across databases. Starting with Exadata V2, Oracle introduced the I/O Resource Manager (IORM) inside the storage servers, which enables the DBA to prioritize I/O. So whether you are consolidating multiple databases onto your Exadata platform or handling resource-intensive applications within a single database, effective resource management will play an important role in your success.

Beyond introducing the Exadata and non-Exadata specific options available for resource management, our goals in presenting this material are twofold. First, we want to provide enough detail to demystify resource management without overwhelming the reader. Second, we intend to demonstrate how to build a fully functional resource management model. These goals present a unique challenge. Provide too little information, and the reader will only be able to set up very simple configurations. Give too much detail, and we risk convoluting the topic and losing the audience. The most difficult part of writing this chapter has been striking a balance between the two. For that reason, we created simple, share-based, single-level resource plans. In fact, simple, single-level resource plans will solve the vast majority of the resource management problems we see in the real world. Moreover, multi-level resource plans can be difficult to design and test. In this chapter, we demonstrate both multi-level and single-level plans because it is important to understand how they work. However, if you are considering using the Oracle Resource Manager or other resource management options, the best approach is to keep it simple and add features only as they are needed.

Consolidation

The primary driver of consolidation is cost savings. Every new hardware generation is typically more powerful than the previous one, and this is also true for Exadata. IT shops can take advantage of this trend by consolidating silos of database environments onto a standardized, powerful platform and ultimately achieve greater efficiencies by improving total resource utilization—in effect, lowering both capital and operational expenditures. In the following sections, we will review the types of database consolidation.

Types of Database Consolidation

Aside from resource requirements, many factors need to be considered when consolidating multiple databases. These factors include, but are not limited to, namespaces, isolation, maintenance, upgrade, backup and recovery, cloning, and service level agreements. Each approach has its own pros and cons, as you will soon see.

Server

This is probably the easiest route: Multiple databases are placed onto a single database server, or onto multiple servers in the case of an Oracle RAC cluster. Each application is isolated in a dedicated database, which can be easily maintained and upgraded. But as more and more databases are moved to the server or servers, the resources that must be dedicated to each of them impose a practical limit on the consolidation density that can be achieved. The same is true, from a resource standpoint, for any virtualization-based consolidation solution, which typically has more overhead than consolidating databases on a server because every virtual machine also needs resources for its own operating system. With this kind of consolidation, especially with development environments (low connection count and low usage), we usually see the memory capacity being reached first.

Schema

This method puts application schemas coming from multiple databases into a single instance or Oracle RAC database. A lot more planning and due diligence has to be done for this kind of consolidation than for the server approach. The DBAs and the application team have to check whether there are any conflicting schema names or hard-coded schema names in the SQL and packages. If there are conflicts, they have to be resolved with application and database-level changes. The consolidation is easy only if all the application schemas fit nicely next to each other, meaning there are no object name collisions in the shared data dictionary. The result of this consolidation is a single database, with a single set of background processes, to administer for a number of applications. This means that schema-level consolidation, where possible, reaches a much higher density than server-level consolidation.
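
Before consolidating, it helps to inventory the application schemas in each source database so naming conflicts can be spotted early. A minimal sketch, assuming Oracle 12c (DBA_USERS gained the ORACLE_MAINTAINED column in that release; on 11g you would exclude the Oracle-supplied accounts by hand):

-- run in each source database to list the application schemas
select username, default_tablespace
from dba_users
where oracle_maintained = 'N'
order by username;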

Multitenancy

Oracle Multitenant is an option introduced with Oracle 12c, with an architecture change that introduces the concept of pluggable databases (PDBs). It enables an Oracle database (single-instance or RAC) to act as a container (CDB) that has a single set of background processes, shared memory (SGA/PGA), undo, common temp, control files, and redo logs per instance, and that can contain multiple isolated databases, known as PDBs. PDBs have their own sets of data files. PDBs combine the full isolation of a dedicated database with the efficient resource usage and density of schema consolidation. Although most of our customers at the time of writing are still on version 11gR2, the multitenant architecture available with version 12c could be beneficial in a lot of cases. With the multitenant option, there is only one database to administer. At the same time, the PDBs provide isolation, overcoming the problem of naming conflicts between databases, which means no application changes are needed. Each PDB can be maintained without impacting the other PDBs residing in the same CDB, and PDBs can easily be unplugged and plugged into another CDB. There is a lot more flexibility, features, and benefits with the multitenant architecture that will be discussed later in this chapter. To learn more about Oracle Multitenant, see Part VI of the Database Concepts guide at http://docs.oracle.com/database/121/CNCPT/part_consol.htm#CHDGDBHJ.
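
As a simple illustration of that unplug/plug flexibility, moving a PDB between CDBs takes only a few statements. This is a sketch; the PDB name pdb1 and the manifest path are hypothetical:

-- on the source CDB, connected to the root:
alter pluggable database pdb1 close immediate;
alter pluggable database pdb1 unplug into '/tmp/pdb1.xml';
drop pluggable database pdb1 keep datafiles;

-- on the target CDB, connected to the root:
create pluggable database pdb1 using '/tmp/pdb1.xml' nocopy tempfile reuse;
alter pluggable database pdb1 open;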

Instance Caging

Instance caging provides a way to limit CPU resource usage at the instance level. It is dependent on the database resource manager to execute the throttling and, as such, needs a resource manager plan to be active. Please note that managing CPU resources for consumer groups inside the database with the database resource manager is a different feature.

There are two specific problems instance caging can address:

  • Manage CPU resource usage of multiple database instances on the same host, where instance caging prevents one instance from monopolizing CPU resources at the expense of the other instances. Here the focus of instance caging is on guaranteeing database performance.
  • Manage CPU resource usage of one or more database instances on the same host, where instance caging prevents one or more database instances from oversubscribing the CPU resources. Here the focus of instance caging is on preventing long run queues for the CPUs.

Instance caging essentially works by counting the active foreground and non-critical background processes and comparing this number with the value of the CPU_COUNT parameter. This is done by the database resource manager, which is the reason it must be active. If the number of active processes is CPU_COUNT or higher, the database resource manager throttles execution of the next database process that wants to execute. The throttling is visible as the wait event resmgr: cpu quantum.
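
A quick way to see whether sessions are currently being throttled is to look for that wait event. A minimal sketch:

-- count sessions that are currently throttled by the resource manager
select count(*)
from v$session
where state = 'WAITING'
  and event = 'resmgr: cpu quantum';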

A less obvious benefit from instance caging is that it operates on foreground and non-critical background processes only. This means that critical background processes, like the log writer and its slaves, LMS processes, and so on, are not throttled by instance caging. Consequently, excessive activity in an instance with instance caging configured, preventing the host from oversubscription of CPU, will not lead to CPU starvation of critical background processes. Without instance caging, oversubscription of CPU resources will lead to long CPU run queues in which all database processes must wait—foreground processes and critical background processes alike. Having critical background processes waiting in a run queue can have drastic consequences: When the log writer is queued, most database processing will block, and when the LMS processes get queued, the cluster stability can be harmed, potentially leading to node evictions in extreme cases.

There is a downside to using instance caging, too. When instance caging is active, the instance will not use more CPU resources than configured with the CPU_COUNT parameter, regardless of availability. This means that if the CPU_COUNT parameter is set too low, the host appears healthy and not busy, while database processing is impacted because many processes are being throttled.

Configuring Instance Caging

Configuring instance caging is very simple: Set a resource manager plan and set CPU_COUNT to the desired number of active processes. To be careful, first have a look at the current values of the parameters that are needed to activate instance caging in the spfile:

SYS@v12102 AS SYSDBA> show spparameter cpu_count

SID      NAME                          TYPE        VALUE
-------- ----------------------------- ----------- ----------------------------
*        cpu_count                     integer
SYS@v12102 AS SYSDBA> show spparameter resource_manager_plan

SID      NAME                          TYPE        VALUE
-------- ----------------------------- ----------- ----------------------------
*        resource_manager_plan         string

By looking at the output of show spparameter, you can check the spfile for values of parameters. Checking the current setting of these parameters via show parameter will give you the value the Oracle engine filled out. In most cases, you will encounter the above situation—both CPU_COUNT and RESOURCE_MANAGER_PLAN are unset.

In order to activate instance caging, set CPU_COUNT to a value and set a resource manager plan. Both parameters can be changed at runtime:

SYS@v12102 AS SYSDBA> alter system set cpu_count=2 scope=both sid='*';

System altered.

SYS@v12102 AS SYSDBA> alter system set resource_manager_plan=default_plan scope=both sid='*';

System altered.

If you are uncertain which resource manager plan to set, the DEFAULT_PLAN plan is usually a good choice. The DEFAULT_PLAN is a very simple plan that strongly favors the SYS_GROUP and gives a very low priority to the ORA$AUTOTASK group (used for statistics collection, space advisor, sql tuning advisor, and non-ondemand in-memory population).

SYS@v12102 AS SYSDBA> select group_or_subplan, mgmt_p1, mgmt_p2, mgmt_p3, mgmt_p4
  2  from dba_rsrc_plan_directives where plan = 'DEFAULT_PLAN';

GROUP_OR_SUBPLAN    MGMT_P1    MGMT_P2    MGMT_P3    MGMT_P4
---------------- ---------- ---------- ---------- ----------
SYS_GROUP                90          0          0          0
OTHER_GROUPS              9          0          0          0
ORA$AUTOTASK              1          0          0          0

Please note that, by default, during the maintenance windows (6 a.m.–2 a.m. on the weekends, 10 p.m.–2 a.m. during weekdays), the resource manager plan is changed to the DEFAULT_MAINTENANCE_PLAN for the duration of the window. In a lot of cases, this is not an issue. The DEFAULT_MAINTENANCE_PLAN is a multi-level plan in which the SYS_GROUP gets 75% at level 1 and the OTHER_GROUPS group (the group in which all foreground connections run by default) gets 70% at the next level, which effectively means OTHER_GROUPS gets slightly lower priority in favor of the automatic tasks. A second important note is that, with or without instance caging or a resource manager plan set, the default maintenance windows are in place and the automatic tasks are run.

However, if you run time-sensitive and/or important batch processing during that time, you might want to change or disable the Oracle-provided automatic tasks so they do not run at the same time as the batch processing. For more information on changing the automatic tasks, see the Database Administrator’s Guide, Managing Automated Database Maintenance Tasks at http://docs.oracle.com.
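
To see when the maintenance windows open and which resource plan they activate, you can query the scheduler dictionary. A minimal sketch using the documented DBA_SCHEDULER_WINDOWS view:

select window_name, resource_plan, repeat_interval, duration, enabled
from dba_scheduler_windows
order by window_name;

-- an individual window can be disabled if needed, for example:
-- exec DBMS_SCHEDULER.DISABLE('SYS.SATURDAY_WINDOW');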

Setting CPU_COUNT

The instance caging feature depends on setting CPU_COUNT to the maximum number of concurrently active foreground and non-critical background processes you want to allow. As noted earlier, CPU_COUNT can be set and changed online. A word of warning, per Oracle’s advice: Do not change the value too often and do not make huge changes in value. Another important warning is not to set CPU_COUNT below 2, as this could prevent foreground processes from running in certain cases.

Instance Caging Usage and Results

Earlier, in “Configuring Instance Caging,” we enabled instance caging by setting CPU_COUNT to 2 and activating the DEFAULT_PLAN resource manager plan. To show what that means, we use a small script called burn_cpu.sql, which looks like this:

declare
  run boolean := true;
begin
  -- loop (effectively forever), executing a trivial statement to burn CPU
  while run loop
    execute immediate 'select sqrt( 9999999999 ) from dual';
  end loop;
end;
/

If we start two normal user (which means non-SYS) sessions and let them run for a while, we can see the resource consumption and resource manager throttling using the V$RSRCMGRMETRIC view. This view is special in the sense that it shows the statistics of the past minute: After the view’s contents have been refreshed, they remain unchanged for one minute and are then replaced with the latest minute’s measurements:

SYS@v12102 AS SYSDBA> select consumer_group_name,
  2  cpu_consumed_time/(intsize_csec*10) "CPU_USED",
  3  cpu_wait_time/(intsize_csec*10) "CPU_WAIT"
  4  from v$rsrcmgrmetric;

CONSUMER_GROUP_NAME              CPU_USED   CPU_WAIT
------------------------------ ---------- ----------
ORA$AUTOTASK                   .000166528          0
OTHER_GROUPS                   1.91517069 .000066611
SYS_GROUP                      .000099917          0
_ORACLE_BACKGROUND_GROUP_               0          0

What we see is that the “normal user” sessions execute in the OTHER_GROUPS consumer group and use nearly 2 CPUs. A tiny bit of throttling is going on, which is visible in the CPU_WAIT column. The throttling happened because there was a small amount of CPU usage for the ORA$AUTOTASK and SYS_GROUP groups.

What is interesting is to see what happens if we add another session executing burn_cpu.sql:

SYS@v12102 AS SYSDBA> select consumer_group_name,
  2  cpu_consumed_time/(intsize_csec*10) "CPU_USED",
  3  cpu_wait_time/(intsize_csec*10) "CPU_WAIT"
  4  from v$rsrcmgrmetric;

CONSUMER_GROUP_NAME              CPU_USED   CPU_WAIT
------------------------------ ---------- ----------
ORA$AUTOTASK                   .000133289          0
OTHER_GROUPS                   1.84096968 1.04265245
SYS_GROUP                      .000166611          0
_ORACLE_BACKGROUND_GROUP_               0          0

What we see is that CPU_USED stayed a little below 2, and adding a session resulted in an increase in the CPU_WAIT metric. At this point, it is important to point out that with instance caging, sessions are throttled on CPU in a round-robin fashion rather than specific processes being suspended for long periods of time. This is clearly visible with the Linux “top” utility, which, in this case, will show a CPU percentage of around 66% for each of the processes running the burn_cpu.sql script (66 x 3 = 198%, so almost 200%, which corresponds to the value of CPU_COUNT).
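
Per-session throttling can also be inspected from inside the database. A sketch using the V$RSRC_SESSION_INFO and V$RSRC_CONSUMER_GROUP views:

select s.sid, g.name consumer_group, s.state,
       s.consumed_cpu_time, s.cpu_wait_time
from v$rsrc_session_info s, v$rsrc_consumer_group g
where s.current_consumer_group_id = g.id
order by s.cpu_wait_time desc;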

Instance Caging and Multitenancy

When using instance caging with pluggable databases, it can only be applied at the CDB level. This sets a limit on the total number of active processes of the CDB and all of its PDBs. Further CPU management must be done using resource manager plans.

Over-Provisioning

Over-provisioning in the context of instance caging refers to the practice of allocating more CPU resources to the databases (the sum of CPU_COUNT of all the instances on the server) than are actually available in the server. This is useful when your server hosts multiple databases with complementary peak workload schedules. For example, suppose one database needs a lot of CPU serving OLTP-like transactions during the day but is mostly idle during the night, while another database runs data-warehouse-like workloads during the night, requiring a lot of CPU, but is mostly idle during the day. The safest, but not the best-performing, option is to cage both at 50% of the available CPUs. Instead, you might set both to, for example, 75% of the available CPUs. This introduces the risk that both databases still have to compete for CPU if they both become busy at the same time, despite the predicted difference in peak times (75 + 75 = 150%, an oversubscription of 50%), but it has the advantage of using CPUs that otherwise would not have been available. In other words, over-provisioning, like most things in life, is a trade-off.
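
As a concrete sketch (the CPU counts are made up): On a compute node with 32 CPU threads, each of the two instances could be caged at 24 instead of 16, allowing either one to use 75% of the node while the other is quiet:

-- in the OLTP instance:
alter system set cpu_count=24 scope=both sid='*';

-- in the data warehouse instance:
alter system set cpu_count=24 scope=both sid='*';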

Binding Instances to Specific CPUs Using Cgroups

The Linux “cgroups” or “control groups” feature, which provides a way to limit, account for, and isolate resource usage, was introduced with kernel version 2.6.24 and is available starting with Oracle Linux 5 with the Unbreakable Enterprise Kernel (UEK) version 2, and with Oracle Linux 6 and higher. With Oracle Database 12c and with version 11.2.0.4, the database integrates with the cgroups feature via the PROCESSOR_GROUP_NAME parameter.

The cgroups feature gives you the opportunity to isolate part of the resources, specifically CPUs or specific NUMA nodes in the case of the Oracle database, by assigning them to specific cgroups. One use case is to create different consolidation densities for different groups of databases by binding these groups of databases to specific CPUs. Please note that the two-socket Exadata compute nodes, which are the “dash two” servers, such as X5-2, X4-2, and so on, are NUMA systems but have NUMA turned off (numa=off) on the kernel load line in grub (/etc/grub.conf). This means cgroups can only be used to isolate to specific CPUs. The eight-socket Exadata compute nodes, which, as you might have guessed, are the “dash eight” servers, have NUMA turned on, which means isolation based on both NUMA nodes and CPUs can be done.
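
You can verify how a compute node was booted by checking the kernel command line; /proc/cmdline reflects the parameters passed by grub:

# check whether NUMA was disabled on the kernel command line
grep -o "numa=off" /proc/cmdline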

The easiest way to use cgroups with the Oracle database is to use the script that is available in My Oracle Support note: “Using PROCESSOR_GROUP_NAME to bind a database instance to CPUs or NUMA nodes on Linux” (Doc ID 1585184.1).

Installation and Configuration of Cgroups

Once the script has been downloaded, the most logical place to put it is the home directory of the root user. Any action done by the script needs to be done as root. The first thing to do is to look at the system:

# ./setup_processor_group.sh -show
This is a NUMA system.

NUMA configuration:
-------------------
Node 0
  CPUs: 0-15
  Memory Total: 75488296 kB Free: 13535460 kB

This system does not have any processor group.

The script was run on an Exadata V2 machine, which is the reason 75GB of memory is shown. It says it is a NUMA system and then displays one node. The reason only one node (node 0) is shown is that the Exadata dual-socket compute nodes have NUMA turned off at the kernel load line (see the “kernel” line or lines in /etc/grub.conf, which list the “numa=off” setting).

In order to use cgroups via the setup_processor_group.sh script, the system needs to be “prepared” using the script. This needs to be done only once. This is what the preparation looks like:

# ./setup_processor_group.sh -prepare

SUCCESS: -prepare complete
Next step: create a new proc group using option -create

Once this has been run, verify the preparation by running setup_processor_group.sh with the –check switch:

# ./setup_processor_group.sh -check
setup_processor_group.sh -prepare has been run on this system

The preparation modifies /etc/rc.local, and it adds the script /etc/ora_autopg. The script /etc/ora_autopg mounts the cgroup pseudo filesystem at /mnt/cgroup and runs the setup_processor_group.sh script, which has been copied to /etc, together with the cgroup settings in the file /etc/auto_orapg.conf.

Now that the host is prepared, the next step is to learn what the CPU topology of the node looks like. The following script, cpu_topo.sh, shows it:

#!/bin/bash
# cpu_topo.sh: show the socket/core/thread topology based on /proc/cpuinfo
CPUINFO=${1:-/proc/cpuinfo}
echo "== summary =="
awk -F: '/^physical id/ && !ID[$2] { P++; ID[$2]=1 }
         /^cpu cores/ { CORES=$2; T++ }
         END { print "sockets: "P; print "tot cores: "CORES*P; print "tot threads: "T }' $CPUINFO
echo "== cpu map =="
echo " T -  S -  C"
C=0
egrep "^processor|^physical id|^core id" $CPUINFO | awk '{ print $(NF) }' | while read V; do
        # print thread, socket, and core id; start a new line after every third value
        [ $C -lt 2 ] && printf "%2d - " $V || printf "%2d\n" $V
        [ $C -lt 2 ] && let ++C || C=0
done

This is how the output looks on my test system:

# ./cpu_topo.sh
== summary ==
sockets: 2
tot cores: 8
tot threads: 16
== cpu map ==
 T -  S -  C
 0 -  0 -  0
 1 -  0 -  1
 2 -  0 -  2
 3 -  0 -  3
 4 -  1 -  0
 5 -  1 -  1
 6 -  1 -  2
 7 -  1 -  3
 8 -  0 -  0
 9 -  0 -  1
10 -  0 -  2
11 -  0 -  3
12 -  1 -  0
13 -  1 -  1
14 -  1 -  2
15 -  1 -  3

This shows that my system has two sockets (S column, sockets 0 and 1) and that every socket has four cores (C column, numbers 0 to 3). The thread number (T column) is the CPU number as seen by the operating system. This means that the operating-system-visible CPUs on this system are actually (hyper-)threads. If you look carefully, you will see that threads 0 and 8 are connected to the same core (both have S and C 0). In order to fully isolate workloads, it is very important to separate different cgroups based on cores. The separation is important because a CPU thread should be seen as an execution context only; it requires a core to execute. This means that if one CPU thread is using the core it is connected to, the other thread has to wait for the core to become available. When both threads want to execute, only one of them truly does—the other one waits. Because this waiting is an in-CPU event, it is not visible to the operating system and shows up as time running on CPU.

Let’s create a cgroup called “singlecore” and bind it to processors 0 and 8, so it is bound to a single core:

# ./setup_processor_group.sh -create -name singlecore -cpus 0,8 -u:g oracle:dba
SUCCESS: -create complete for processor group singlecore

To start an Oracle instance in this processor group,
 set the database initialization parameter:
PROCESSOR_GROUP_NAME=singlecore

Please note the -u:g switch. It is meant to set the user and the group of the Oracle executable as configured on your system. To see how it is configured, list the user and group of the oracle executable in $ORACLE_HOME/bin.

As the output of -create mentions, the next and last step is to set the database parameter PROCESSOR_GROUP_NAME in order to make the instance obey the cgroup settings it is configured for. If an instance is started with PROCESSOR_GROUP_NAME set for a control group, the alert.log file will show something like this:

Instance has been started in processor group singlecore (NUMA Nodes: 0 CPUs: 0,8)
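
Setting the parameter itself might look like this. This is a sketch; note that PROCESSOR_GROUP_NAME is a static parameter, so it has to be set in the spfile and requires an instance restart:

alter system set processor_group_name='singlecore' scope=spfile sid='*';
-- restart the instance for the setting to take effect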

When using cgroups and the PROCESSOR_GROUP_NAME database parameter with the multitenant option, the PROCESSOR_GROUP_NAME parameter must be set at the CDB level. Pluggable databases will inherit the cgroup properties and obey them.

Oracle 12c THREADED_EXECUTION

Oracle 12c introduced a new initialization parameter, THREADED_EXECUTION. This parameter, when set to TRUE, makes the Oracle executable run certain database processes as threads inside a process. By combining multiple database processes as threads inside an operating system process, the number of processes at the operating system level is reduced. Please note that when THREADED_EXECUTION is enabled, some background processes run as threads; however, user processes created through the listener will still be processes unless the listener parameter DEDICATED_THROUGH_BROKER_<listener name> is set to ON. Some of the critical background processes, like PMON, DBWn, VKTM, and PSP, will still use a dedicated process, regardless of THREADED_EXECUTION being set to TRUE.
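
To see which processes run as threads and which remain dedicated processes, the EXECUTION_TYPE column of V$PROCESS (introduced in 12c) can be queried. A minimal sketch:

select execution_type, count(*)
from v$process
group by execution_type;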

We have not thoroughly tested the threaded execution model. The tests we have done show both the default process model and the threaded model being in the same league with regards to performance. Both models can come out on top with the tests we performed.

There definitely is a conceptual difference between processes, which are stand-alone and have their own address space, and threads, which share the address space of the process that created them. However, modern operating systems like Linux try very hard to reduce the amount of work they need to do whenever possible. One of these optimizations is that on process creation, the address space of the new process is almost entirely shared with its parent, via pointers. Only when the new process starts writing to memory pages are new pages allocated, in a copy-on-write fashion. As such, there is no significant additional overhead in allocating memory for a process when compared with threads.

As with a lot of technical features, it is best to choose what the majority of users choose in order to be on the most stable and most tested execution path in the Oracle kernel code. Our limited tests have shown no significant gain in performance. These are the reasons that the current advice is to use the regular process model and leave THREADED_EXECUTION at its default value, FALSE.

Managing PGA Memory

Oracle database version 12c introduced the parameter PGA_AGGREGATE_LIMIT. The purpose of this parameter seems obvious: limiting the overall Process Global Area (PGA) memory usage of the database instance. Of course, Oracle’s automatic PGA feature needs to be used before PGA_AGGREGATE_LIMIT can be used, which means the PGA_AGGREGATE_TARGET parameter needs to be set to a non-zero value. The function of the PGA_AGGREGATE_TARGET parameter is to set the desired amount of memory for the instance’s total PGA usage. Please note the descriptions in this section are about dedicated database processes, not Shared Server/Multi-Threaded Server processes.

It is important to understand that by setting the PGA_AGGREGATE_TARGET parameter, there is no guarantee the instance will not take more memory for PGA allocations, hence the name PGA_AGGREGATE_TARGET. In essence, PGA_AGGREGATE_TARGET limits and divides work-area allocations of active processes in order to try to make overall PGA memory usage lower than the size set with PGA_AGGREGATE_TARGET. Work-area allocations are sort, hash, and bitmap memory needed during execution. Of course, there is no limit to how much a process can sort or to the size of the hash table when doing a hash join. When the amount of memory needed exceeds the work-area size, the contents are moved to the session’s temporary tablespace and processing continues.

So, if PGA_AGGREGATE_TARGET works by sizing the process sort, hash, and bitmap areas, then what can cause the instance to use more memory than is set with PGA_AGGREGATE_TARGET? The following points are the ones we encountered; there could be more reasons:

  • The PGA_AGGREGATE_TARGET parameter cannot just be set to a value and be respected by all sessions in the instance. Every session that is started needs a minimum memory footprint to be able to run. This means that setting the PGA target too small with respect to the number of used sessions means the instance will allocate more than is set.
  • PL/SQL variables like arrays and collections can be used and extended without any restriction by any individual session. Any of these sessions can use up to 4GB with Oracle 11.2 and 32GB with Oracle 12.1 (this is specific to Linux x86-64, which commonly is used on the Exadata platform), provided the operating system limits allow this and there is enough virtual memory to accommodate this. Please note virtual memory—the physical memory can be exhausted way before the virtual memory is, which means the operating system needs to satisfy the memory allocations by swapping current memory contents to the swap device.
  • Every cursor in a session that uses sort, hash, or bitmap memory has a minimal memory footprint, too. If a lot of cursors are allocated in the sessions that use any of these memory areas, there is a certain amount of memory that will be allocated, regardless of PGA_AGGREGATE_TARGET.

Now let’s take a step back. In most cases, especially when the application allocates sessions via a connection pool, the number of connections is static and few or no PL/SQL variables are used. This means the PGA of an instance will take the memory needed and remain at roughly that amount for its entire duration. As a result, the actual PGA memory usage can be way lower than what is set with PGA_AGGREGATE_TARGET, around the set value, or higher, depending entirely on the application’s type of requests, the number of cursors, and the number of connections in the database. It can be way more because, as we just described, database processes will take what is needed, regardless of what is set. It is a good practice to periodically measure the actual PGA usage in V$PGASTAT by looking at the “maximum PGA allocated” statistic after a period of normal usage to see if there is a strong difference between the high-water mark of what is actually used and what is set with PGA_AGGREGATE_TARGET.
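
A sketch of such a periodic check; the statistic names come from the documented V$PGASTAT view, and the values are in bytes:

select name, round(value/1024/1024) "MB"
from v$pgastat
where name in ('aggregate PGA target parameter',
               'total PGA allocated',
               'maximum PGA allocated');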

However, sometimes the PGA usage is not that static. This can lead to problems if the memory on your system is carefully divided between the operating system, Clusterware, ASM, and database instance(s) because excessive PGA allocation means more memory is requested than is accounted for. Memory used cannot be held in main memory, and the operating system needs to resort to swapping memory contents to disk. Whenever this happens, a strong drop in performance is noticeable.

Situations like the ones described above are what the PGA_AGGREGATE_LIMIT parameter was created for. It will try to limit PGA memory allocations by terminating the current execution of a session that is allocating untunable PGA once too much memory is found to be allocated to the PGA in the instance. Such a termination results in an “ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT” in the session. Please note that background processes and SYS connections are exempted from the limit. Parallel query processes are not considered background processes for this feature, so they can also be terminated, which is done for the entire set of parallel query processes used for a given execution.

However, testing shows that in version 12.1.0.2, PGA_AGGREGATE_LIMIT is far from a hard limit. During our research, we found that when PGA allocation exceeds PGA_AGGREGATE_LIMIT, the memory management code samples sessions that are candidates for termination 4 to 20 times before it truly terminates one or more executions to comply with PGA_AGGREGATE_LIMIT. This has the advantage that short-lived peaks of high memory usage will not immediately result in termination of execution, but it has the disadvantage that actual memory usage can go way beyond PGA_AGGREGATE_LIMIT, depending on how fast allocations are done.

The following code example allocates memory to an associative array. Because we know it will hit an ORA-4036, the exception handler prints the error code, the starting and last-measured PGA sizes, and the values set for PGA_AGGREGATE_TARGET and PGA_AGGREGATE_LIMIT, in order to investigate process memory allocations:

pga_filler.sql:

declare
  type vc2_ar is table of varchar2(32767) index by pls_integer;
  vc vc2_ar;
  v varchar2(32767);
  target number;
  start_pga number;
  current_pga number;
  agg_target number;
  agg_limit number;
  counter number := 0;
begin
  select value into agg_target from v$parameter where name = 'pga_aggregate_target';
  select value into agg_limit from v$parameter where name = 'pga_aggregate_limit';
  select value into start_pga from v$mystat m, v$statname n where m.statistic#=n.statistic# and name = 'session pga memory';
  for idx in 1 .. 30000000 loop
    v := rpad('x',32767,'x');
    vc(idx) := v;
    counter := counter + 1;
    if counter = 300 then
      select value into current_pga from v$mystat m, v$statname n where
        m.statistic#=n.statistic# and name = 'session pga memory';
      counter := 0;
    end if;
  end loop;
exception when others then
  dbms_output.put_line('error message :'||SQLERRM);
  dbms_output.put_line('start pga     :'||start_pga);
  dbms_output.put_line('last pga      :'||current_pga);
  dbms_output.put_line('pga agg target:'||agg_target);
  dbms_output.put_line('pga agg limit :'||agg_limit);
end;
/

Please do not test this with the SYS user because SYS sessions are not throttled. If the SYS user is used, the above code will either complete or give an ORA-4030 (out of process memory when trying to allocate %s bytes (%s,%s)) message when 4G (Oracle 11.2) or 32G (Oracle 12.1) of PGA memory has been allocated, or when an operating system limit has been hit.

TS@v12102 > @pga_filler
error message :ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT
start pga     :3338760
last pga      :807924232
pga agg target:524288000
pga agg limit :629145600

PL/SQL procedure successfully completed.

This is the output of executing the pga_filler.sql script. The execution shows we had PGA_AGGREGATE_TARGET set to 524288000 (500M) and PGA_AGGREGATE_LIMIT set to 629145600 (600M). When the PL/SQL procedure began executing, the PGA had a size of 3338760 (a little more than 3M). The PL/SQL block encountered an ORA-4036, at which time the PGA had grown to 807924232 (770M). This clearly shows that the limit set by PGA_AGGREGATE_LIMIT is a starting point for considering termination rather than a hard limit.

This means that if you have an explicit need to limit PGA memory allocations, you should set PGA_AGGREGATE_LIMIT lower than the actual limit you need. Most importantly, the setting should be tested very thoroughly, especially for situations where the PGA limit actually is reached, when it is supposed to prevent your server from starting to swap and entering a performance death spiral.

For versions prior to Oracle 12c, Oracle introduced an (undocumented) event in a whitepaper on consolidation on Exadata to limit PGA allocations per process. This is not exactly the same as PGA_AGGREGATE_LIMIT, which works at the instance level. This event is event 10261, and it controls the amount of PGA a single process is allowed to allocate, including SYS connections. An ORA-600 is signaled if the amount of allocated memory exceeds the amount set as the level for this event up to Oracle version 11.2.0.3, and an ORA-10260 is signaled with Oracle version 11.2.0.4. Please note this event does not impose a hard limit either; testing showed a process could take up to approximately 100MB more than set. This is the way this event is set:

alter system set events = '10261 trace name context forever, level <PGA LIMIT IN BYTES>';

Because this is an undocumented event, this should only be a last-resort solution. If you truly need to use this way of PGA memory limiting, you should ask Oracle support for blessing and guidance specific to your case.

Database Resource Manager

Database Resource Manager (DBRM) has been around since Oracle 8i (1999) and is basically geared toward managing CPU resources at the database tier. Exadata V2 introduced a new feature called I/O Resource Manager (IORM), which, as you might expect, is geared toward managing and prioritizing I/O at the storage cell layer. When databases on Exadata request I/O from the storage cells, they send additional information along with the request that identifies the database making the request as well as the consumer group making the request. The software on the storage cells (Cellserv or cellsrv) knows about the consumer group priorities inside the database (DBRM) and/or at the storage cell (IORM), and it manages how I/O is scheduled. DBRM and IORM are tightly knit together, so it is important to have a solid understanding of DBRM before IORM is going to make any sense to you. Now, the Database Resource Manager is a lengthy topic and could easily justify a book all by itself. Thus, we will focus on the basic constructs that we will need to develop an effective IORM Resource Plan. If you already have experience with DBRM, you may be able to skip over this topic.

With the (Oracle 12c) multitenancy option, the database resource manager has gained an additional layer. As described above, the traditional database resource manager manages resources within the database; however, with the multitenancy option, you essentially get “databases within a database.” In order to manage resources between (pluggable) databases in the root/container database, there needs to be resource management at the root database level, which is called a CDB (container database) resource plan, and additionally resource management inside the pluggable databases, which is almost exactly the same as in a traditional non-container database, as described above.

Before we begin, let’s review the terminology that will be used in this topic. Table 7-1 describes the various components of database resource manager. We will discuss these in more detail as we go along.

Table 7-1. Resource manager component descriptions

  • Resource consumer group (also referred to as consumer group): Resource Manager allocates resources to consumer groups, rather than user sessions. A consumer group is a set of database sessions that may be grouped together based on their priority and/or resource requirements.
  • Resource plan directive (also referred to as plan directive or directive): Resource allocations are not assigned directly to consumer groups. They are defined in a resource plan directive, and a consumer group is then assigned to the plan directive so that resource allocations may be enforced. In the case of a CDB plan, the resource allocations are assigned to a pluggable database.
  • Resource plan (also referred to simply as plan or the plan): Plan directives are grouped together to create a resource plan, thus defining the overarching allocation of resources to all sessions within the database. In the case of the multitenancy option, there can be CDB plans, which manage resources between pluggable databases, and regular plans, which act inside the pluggable database.

As shown in Table 7-2, depending on whether you use DBRM in non-CDBs or CDBs, DBRM consists of:

Table 7-2. Resource manager components based on tenancy

  • Non-multitenant: Plan → Directive → Consumer group
  • Multitenant, CDB level: Plan → Directive → Pluggable database
  • Multitenant, PDB level: Plan → Directive → Consumer group

Creating a CDB Resource Plan

Support for multitenancy at the storage cell level was introduced with cell version 12.1.1.1.0. If you have created a container database and want to do resource management, you must create a CDB resource plan first, which is what we are going to do now. Please note that a CDB resource plan is required for doing resource management inside a pluggable database. If you created a regular/non-multitenant database or you are looking for a PDB resource plan, you might want to skip this part and go to the section “Creating a (Pluggable) Database Resource Plan.”

The next example creates a CDB resource plan, cdb_plan, for an imaginary container database with three pluggable databases: database1, database2, and database3:

begin
 DBMS_RESOURCE_MANAGER.CLEAR_PENDING_AREA;
 DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA;
 DBMS_RESOURCE_MANAGER.CREATE_CDB_PLAN(plan=>'cdb_plan');
 DBMS_RESOURCE_MANAGER.CREATE_CDB_PLAN_DIRECTIVE(
  plan=>'cdb_plan',
  pluggable_database=>'database1',
  shares=>4,
  utilization_limit=>null,
  parallel_server_limit=>null);
 DBMS_RESOURCE_MANAGER.CREATE_CDB_PLAN_DIRECTIVE(
  plan=>'cdb_plan',
  pluggable_database=>'database2',
  shares=>1,
  utilization_limit=>10,
  parallel_server_limit=>0);
DBMS_RESOURCE_MANAGER.CREATE_CDB_PLAN_DIRECTIVE(
  plan=>'cdb_plan',
  pluggable_database=>'database3',
  shares=>1,
  utilization_limit=>null,
  parallel_server_limit=>10);
end;
/

PL/SQL procedure successfully completed.

This plan requires some explanation. As you will see with (pluggable) database resource plans later on, CDB plans tend to be much simpler than database resource plans. A CDB plan simply is a plan in the root of the CDB with directives for PDBs. For every pluggable database, you can assign a number of shares, which sets the priority for that pluggable database. In this case, there are 4+1+1=6 shares; database1 having four shares means it gets 4/6*100=66% of the CPU resources if all the pluggable databases have to contend for them. The parameter utilization_limit specifies a percentage of the total available CPU capacity that a pluggable database is allowed to use. utilization_limit is not cumulative across the PDBs—you can give any PDB 100 (percent), which is the same as null: no limit. You can use parallel_server_limit to set a percentage of the number of parallel execution servers as set by the PARALLEL_SERVERS_TARGET parameter.

You might wonder what happens if a DBA creates a pluggable database that is not specified in the plan: Is it free to use any resource until it is added to the CDB plan? The answer is no: If a CDB plan is enabled, a pluggable database that is not specified in it is subject to the default directive settings. The default directive settings are: 1 share, 100(%) utilization_limit, and 100(%) parallel_server_limit. You can change the default directive settings with the DBMS_RESOURCE_MANAGER.UPDATE_CDB_DEFAULT_DIRECTIVE() procedure.
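
For example, unlisted PDBs could be made less competitive by lowering the default directive. This is a sketch; the values are arbitrary, the new_* parameter names follow the documented procedure, and the block is run from the CDB root like the plan creation above:

begin
 DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA;
 DBMS_RESOURCE_MANAGER.UPDATE_CDB_DEFAULT_DIRECTIVE(
  plan                      => 'cdb_plan',
  new_shares                => 1,
  new_utilization_limit     => 50,
  new_parallel_server_limit => 50);
 DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA;
 DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA;
end;
/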

The plan cdb_plan now is created. The next step is to validate the plan and, if that completes without warnings, submit the plan:

begin
 DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
end;
/

PL/SQL procedure successfully completed.

begin
 DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
end;
/

PL/SQL procedure successfully completed.

Now that the CDB plan is submitted, it is added to the data dictionary, and it can be enabled by setting the RESOURCE_MANAGER_PLAN parameter. A CDB plan can only be enabled in the root database of a container database. This is how a resource manager plan is enabled:

SQL> alter system set resource_manager_plan=cdb_plan sid='*';

System altered.

If this plan causes problems with the databases running in the container, another alter system set resource_manager_plan can be issued to revert to the old resource manager plan or, if there was no old resource manager plan, to '' (two single quotes, an empty string) to reset the resource manager plan.

Another caveat is the automatic maintenance jobs in the Oracle database. The automatic maintenance jobs, which are enabled by default after installation, set the DEFAULT_MAINTENANCE_PLAN resource manager plan during the maintenance windows. If you rely on your own resource manager plan for meeting service levels, you can change or disable the maintenance windows or force the setting of your own resource manager plan, which is done by prefixing FORCE: to your resource manager plan when setting it with alter system.
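
Forcing the plan so a maintenance window cannot replace it could look like this (a sketch using the cdb_plan created earlier):

alter system set resource_manager_plan='FORCE:cdb_plan' sid='*';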

If you want more information on CDB plans, the current version of Oracle 12 contains the following DBA views, accessible from the CDB root:

  • DBA_CDB_RSRC_PLANS
  • DBA_CDB_RSRC_PLAN_DIRECTIVES
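
For example, the directives of the plan created earlier can be inspected from the CDB root like this (a minimal sketch):

select plan, pluggable_database, shares,
       utilization_limit, parallel_server_limit
from dba_cdb_rsrc_plan_directives
where plan = 'CDB_PLAN';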

Creating a (Pluggable) Database Resource Plan

Whether you have a pluggable database, have set a CDB resource manager plan, and now want to manage the resources inside an individual pluggable database, or you chose not to use the multitenancy option and just want to manage resources in the database, this section will help you create a database resource management plan. When using the multitenancy option, please note that in order to set a resource manager plan in a pluggable database, a CDB resource manager plan needs to be active already.

The following creates a database resource plan, DAYTIME_PLAN, which has consumer groups for online processing (OLTP), batch/long-running jobs (BATCH), and maintenance (MAINTENANCE). You need to be connected as an administrative user in a PDB, not the CDB root, for these commands to have effect:

begin

 DBMS_RESOURCE_MANAGER.CLEAR_PENDING_AREA;
 DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA;

 DBMS_RESOURCE_MANAGER.CREATE_PLAN(
  plan    =>'DAYTIME_PLAN',
  mgmt_mth=>'RATIO');

 DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
  consumer_group   => 'MAINTENANCE');
 DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
  consumer_group   => 'OLTP');
 DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
  consumer_group   => 'BATCH');

 DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
  plan            =>'DAYTIME_PLAN',
  mgmt_p1         =>20,
  group_or_subplan=>'SYS_GROUP');
 DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
  plan            =>'DAYTIME_PLAN',
  mgmt_p1         =>10,
  group_or_subplan=>'MAINTENANCE');
 DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
  plan            =>'DAYTIME_PLAN',
  mgmt_p1         =>5,
  group_or_subplan=>'OLTP');
 DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
  plan            =>'DAYTIME_PLAN',
  mgmt_p1         =>3,
  group_or_subplan=>'BATCH');
 DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
  plan            =>'DAYTIME_PLAN',
  mgmt_p1         =>1,
  group_or_subplan=>'OTHER_GROUPS');

 DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
  attribute      => DBMS_RESOURCE_MANAGER.ORACLE_USER,
  value          => 'APP_ADMIN',
  consumer_group => 'MAINTENANCE');
 DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
  attribute      => DBMS_RESOURCE_MANAGER.SERVICE_NAME,
  value          => 'OLTP',
  consumer_group => 'OLTP');
 DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
  attribute      => DBMS_RESOURCE_MANAGER.SERVICE_NAME,
  value          => 'BATCH',
  consumer_group => 'BATCH');

end;
/

PL/SQL procedure successfully completed.

The resource manager plan is now created in the pending area. This is a simple plan, aimed at giving you an idea of how to do it. If you choose to use the database resource manager in your database or a pluggable database, this is a good starting point. Let’s go through the plan in the order described earlier: plan, directives, and consumer groups.

The above plan is called DAYTIME_PLAN. In the plan properties, MGMT_MTH (resource allocation method) has been changed from EMPHASIS (the default, which lets you use percentages at different levels) to RATIO, which supports single-level plans only and makes each directive’s MGMT_P1 the number of shares that directive is supposed to get. By using ratio/share-based plans, you do not have to use the emphasis/percentage-based plans, which can be hard to understand and troubleshoot. With PDB resource plans, you cannot use multi-level plans anyway.

After the creation of the plan, there are a few statements to define the consumer groups. In order to map groups of connections to the resource manager consumer group, the group first needs to exist, which is what the DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP procedure does.

There are five directives in the DAYTIME_PLAN:

  • The first directive is for the SYS_GROUP, aimed at prioritizing the SYS and SYSTEM users with the most resources.
  • The next directive is for the MAINTENANCE group, meant to be used when doing application maintenance, which should get priority over the normal usage.
  • The next directive is for the OLTP group, which is the first directive for normal usage.
  • Next is the directive for the BATCH group, which should have lower priority than the OLTP group.
  • The last directive is for OTHER_GROUPS, which gives any other users the lowest priority.

If you look at the priorities, you see how the ratio/shares method is easy to read and understand:

Consumer group     Shares (MGMT_P1)   Share of CPU when all groups are active
SYS_GROUP          20                 20/39 = 51.3%
MAINTENANCE        10                 10/39 = 25.6%
OLTP                5                  5/39 = 12.8%
BATCH               3                  3/39 =  7.7%
OTHER_GROUPS        1                  1/39 =  2.6%

A few additional remarks: If you need to add directives, there is no need to change anything in the other directives when using the RATIO method. You can simply add the new directive with the appropriate ratio/number of shares set in MGMT_P1. Of course, this will change the calculation; adding shares will lower the percentages of the existing groups. A very important point is that the distribution of CPU as indicated by the ratio is done based on the consumer groups of the currently ACTIVE sessions. This means that if there are only active sessions in the OLTP and BATCH groups, the CPU resources will be divided based on their mutual share: 5/(5+3)*100=62.5% for the OLTP group and 3/(5+3)*100=37.5% for the BATCH group. The ratio sets the resource division; it does not keep consumer groups from using CPU resources.

The last part consists of mapping rules. The mapping rules in this plan are very simple, but they show the two most common ways database connections are grouped in the situations we have encountered. The first rule maps a specific database user to a group. The next two mapping rules map database connections to two groups based on the service they used to connect to the database. Services can be created using the srvctl add service ... command-line utility, which adds a “service” type cluster resource. Services can also be used to point specific use of a database to one or more instances, and they can be defined to fail over if the current instance is down.
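
Creating the OLTP and BATCH services used in the mapping rules might look like this (a sketch; the database name prod and the instance names are hypothetical):

$ srvctl add service -db prod -service OLTP  -preferred prod1,prod2
$ srvctl add service -db prod -service BATCH -preferred prod2
$ srvctl start service -db prod -service OLTP
$ srvctl start service -db prod -service BATCH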

When a non-SYS session enters the database, it belongs to the OTHER_GROUPS consumer group by default. The resource manager plan automates switching sessions to a specific consumer group and manages resources. However, sessions are not allowed to switch to any consumer group—they need to be explicitly granted to a consumer group before the resource manager plan can switch them to it. This is done with the DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP procedure:

begin
 DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP(
  grantee_name   => 'APP_ADMIN',
  consumer_group => 'MAINTENANCE',
  grant_option   => false
 );
 DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP(
  grantee_name   => 'APP_USER',
  consumer_group => 'OLTP',
  grant_option   => false
 );
 DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP(
  grantee_name   => 'APP_BATCH',
  consumer_group => 'BATCH',
  grant_option   => false
 );
end;
/

In our example, we have an APP_ADMIN user for doing maintenance to the application schema who is granted the consumer group MAINTENANCE. The application user is APP_USER and is granted the OLTP consumer group. In the same fashion, the APP_BATCH user is granted the BATCH consumer group.

It is good practice to leave the OTHER_GROUPS consumer group (the default consumer group) out of the prioritization. By prioritizing specific consumer groups and granting access to those consumer groups to database users based on specific rules, any session that connects to the database outside of the rules explicitly set will automatically be left in the OTHER_GROUPS consumer group and not be prioritized.

Now that we have an understanding of the resource plan, let’s validate and submit it to the data dictionary:

begin
 DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
end;
/

PL/SQL procedure successfully completed.

begin
 DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
end;
/

PL/SQL procedure successfully completed.

You could set the parameter RESOURCE_MANAGER_PLAN (either in a normal database or in a PDB) to enforce the resource manager plan, as shown in the example below. However, in a lot of cases, the resource needs and priorities of a database differ at different times, such as online clients having priority during the day, and batches and reports having priority during the evening and night. The next section shows how to let the database switch resource manager plans automatically using scheduler windows.
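
Enabling the plan directly is a single statement (a minimal sketch):

alter system set resource_manager_plan='DAYTIME_PLAN' scope=both;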

Using the Scheduler to Change the Resource Plan

Nearly every database has time frames in which it is used in a totally different way. For a lot of databases, it is common to have online users requesting information in an OLTP-like fashion during the daytime and batch jobs moving data in and out during the evening or night. During the daytime, the online users should have priority, and other usage like reporting and batch processing should have lower priority. During the evening, however, the batches and reporting should have the highest priority. It is common for batch jobs to feed other databases and applications with the daytime transactions.

If we take the DAYTIME_PLAN, we can create a NIGHTTIME_PLAN simply by changing the name of the plan, changing the MGMT_P1 share of the OLTP directive to 1, and changing the share of the BATCH directive to 5. This way, the SYS_GROUP and MAINTENANCE directives still have priority over normal usage, but any OLTP request will now have significantly lower priority than the BATCH requests in this plan.

In order to change the active resource manager plan, scheduler windows can be used. The next example will make the database automatically change the resource plan based on time during the week:

begin
 DBMS_SCHEDULER.CREATE_WINDOW(
  window_name     => 'WEEKDAY_WINDOW',
  resource_plan   => 'DAYTIME_PLAN',
  start_date      => systimestamp at time zone 'EUROPE/AMSTERDAM',
  duration        => numtodsinterval(660, 'minute'),
  repeat_interval => 'FREQ=WEEKLY;BYDAY=MON,TUE,WED,THU,FRI;BYHOUR=7;BYMINUTE=0;BYSECOND=0',
  end_date        => null);
 DBMS_SCHEDULER.ENABLE('"SYS"."WEEKDAY_WINDOW"'),
 DBMS_SCHEDULER.CREATE_WINDOW(
  window_name     => 'WEEKNIGHT_WINDOW',
  resource_plan   => 'NIGHTTIME_PLAN',
  start_date      => systimestamp at time zone 'EUROPE/AMSTERDAM',
  duration        => numtodsinterval(780, 'minute'),
  repeat_interval => 'FREQ=WEEKLY;BYDAY=MON,TUE,WED,THU,FRI;BYHOUR=18;BYMINUTE=0;BYSECOND=0',
  end_date        => null);
 DBMS_SCHEDULER.ENABLE('"SYS"."WEEKNIGHT_WINDOW"'),
end;
/

PL/SQL procedure successfully completed.

The procedure creates two windows, WEEKDAY_WINDOW and WEEKNIGHT_WINDOW. The resource_plan parameter binds a resource plan to a window. The weekday window starts at 7 a.m. and is open for 660 minutes (11 hours); the weeknight window starts at 18:00 (6 p.m.) and is open for 780 minutes (13 hours).

The Wait Event: resmgr: cpu quantum

Throttling of a process's CPU consumption by the resource manager is visible as the wait event resmgr: cpu quantum. The throttling only happens when the database resource manager is enabled. A resource manager plan in either a regular database or a pluggable database can be active for any of the following reasons:

  • Instance caging
  • A user-defined resource management plan
  • An automatic maintenance jobs window that has become active

Image Tip  Without a resource manager plan, there is no way for the database to throttle CPU consumption, and thus for the wait event resmgr: cpu quantum to show up.

The database resource manager keeps track of database processes that are running and willing to run. As long as all processes can get a full time slice on the CPU, there is no need to throttle (outside of resource manager consumer groups bound to a directive with UTILIZATION_LIMIT, which we will discuss later). However, once the number of running and willing-to-run processes exceeds the number of CPUs visible (or the number of CPUs the instance is allowed to use in the case of instance caging), the resource manager examines the consumer groups of the running and willing-to-run processes and throttles the execution of the processes according to the distribution of resources configured in the directives. The throttling is visible in the wait interface as "resmgr: cpu quantum." In most cases, processes in a consumer group are not throttled for extended periods of time, but rather in a round-robin fashion among the active processes belonging to the throttled consumer group(s), for a time ("quantum") of 100ms. This means that throttled processes do not stop in most cases, but execute more slowly, depending on the amount of throttling.
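
To get an idea of which sessions are being throttled and by how much, a simple query against the wait interface helps. The following is a minimal sketch using V$SESSION_EVENT joined to V$SESSION (TIME_WAITED is in centiseconds):

SELECT s.sid, s.resource_consumer_group, se.total_waits, se.time_waited
  FROM v$session_event se, v$session s
 WHERE s.sid = se.sid
   AND se.event = 'resmgr: cpu quantum'
 ORDER BY se.time_waited DESC;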

Where to Go from Here

As mentioned previously, the database resource manager is a topic that could easily fill a book on its own. As such, the starting point for doing database resource management has been described in the last few sections. In order to learn more on this subject and to search for specific properties, see the documentation at http://docs.oracle.com/database/121/ADMIN/dbrm.htm. You can also refer to Chapter 11, where you can read about the other resource manager events.

For completeness' sake, the following sections cover some specific database resource manager topics that have proven helpful when we applied the resource manager.

Resource Mapping Priorities

You might have wondered what happens when a database user is eligible for multiple consumer groups when mapping them with the DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING procedure. For example, when consumer groups are mapped like this:

DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
  attribute      => DBMS_RESOURCE_MANAGER.ORACLE_USER,
  value          => 'APP_ADMIN',
  consumer_group => 'MAINTENANCE');

and

DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
  attribute      => DBMS_RESOURCE_MANAGER.SERVICE_NAME,
  value          => 'OLTP',
  consumer_group => 'OLTP');

What happens when the APP_ADMIN database user logs on using the OLTP service? In that case, both mappings apply! The answer is found in the DBA_RSRC_MAPPING_PRIORITY view, which shows the priority in which the attributes are evaluated:

SQL> select * from dba_rsrc_mapping_priority;

ATTRIBUTE                   PRIORITY STATUS
--------------------------- ---------------
EXPLICIT                                  1
SERVICE_MODULE_ACTION                     2
SERVICE_MODULE                            3
MODULE_NAME_ACTION                        4
MODULE_NAME                               5
SERVICE_NAME                              6
ORACLE_USER                               7
CLIENT_PROGRAM                            8
CLIENT_OS_USER                            9
CLIENT_MACHINE                           10
CLIENT_ID                                11

11 rows selected.

If the priority is not in line with your needs, you can change the priorities with the DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING_PRI procedure.
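
The following is a minimal sketch of such a change, assuming the signature in which every parameter corresponds to one of the attributes shown in DBA_RSRC_MAPPING_PRIORITY and every attribute must receive a unique priority. In this sketch, ORACLE_USER is moved above SERVICE_NAME:

begin
 DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
 DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING_PRI(
  explicit              => 1,
  service_module_action => 2,
  service_module        => 3,
  module_name_action    => 4,
  module_name           => 5,
  oracle_user           => 6,   -- moved above service_name
  service_name          => 7,
  client_program        => 8,
  client_os_user        => 9,
  client_machine        => 10,
  client_id             => 11);
 DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
end;
/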

Resource Limiting

The database resource plan in this chapter sets up its directives with a priority, with the intention of giving each consumer group a share of the CPU resources based on that priority. Independently of the priority, the resource manager gives you the option to limit the usage of both CPU and parallel query servers.

The parameter in the directive for this is utilization_limit:

DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
  plan              =>'DAYTIME_PLAN',
  mgmt_p1           =>3,
  utilization_limit =>10,
  group_or_subplan  =>'BATCH');

This is one of the directives from the DAYTIME_PLAN shown earlier, modified to include the utilization_limit directive parameter. The directive prioritizes exactly as it did without the utilization_limit parameter, but the resource manager will now limit the total CPU usage of the processes in the consumer group using the BATCH directive to 10%. In addition to limiting CPU usage, it will also limit parallel query server usage to the same percentage of the value of the database parameter PARALLEL_SERVERS_TARGET. In this case, you might want to disable parallel query server usage altogether (in order not to conflict with daytime usage), in which case you could use the directive parameter parallel_server_limit and set it to zero. Setting the parallel_server_limit directive parameter overrides the value of the utilization_limit directive parameter for parallel query server usage.
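
As a sketch, this is the same BATCH directive with parallel query server usage disabled; the other directives of the plan are unchanged:

DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
  plan                  =>'DAYTIME_PLAN',
  mgmt_p1               =>3,
  utilization_limit     =>10,
  parallel_server_limit =>0,
  group_or_subplan      =>'BATCH');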

Another important thing to realize when using the utilization_limit directive parameter on Exadata is that the I/O resource manager in the cell servers will pick up the utilization_limit directive parameter and apply the same limitation percentage, but now for limiting I/O requests from processes in the consumer group.

Other Limiting Parameters

Another resource manager option, implemented as directive parameters, is to set thresholds for certain conditions. Once a threshold is reached, the session is switched to the consumer group set in the switch_group directive parameter. The thresholds are set with a number of "switch" parameters (see Table 7-3).

Table 7-3. Resource manager directives switching parameters

Parameter              Description
---------------------  -------------------------------------------------------------
switch_io_megabytes    Specifies the amount of I/O (in MB) that a session can issue.
switch_io_reqs         Specifies the number of I/O requests that a session can issue.
switch_io_logical      Specifies the number of logical I/Os that a session can issue.
switch_elapsed_time    Specifies the elapsed time in the session.
switch_time            Specifies the amount of CPU time used in the session.

These parameters provide additional flexibility through the switch_for_call directive parameter: when it is set to TRUE, the thresholds are evaluated per top-level call instead of for the session as a whole, and the session is returned to its original consumer group when the call completes. Please note that for switching consumer groups, the database user must have been granted access to these groups with the DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP procedure.
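
As an illustration, the following sketch demotes OLTP calls that consume more than 60 CPU seconds to the BATCH consumer group for the remainder of the call (the MGMT_P1 value of 5 is an assumption in the style of the earlier examples):

DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
  plan             =>'DAYTIME_PLAN',
  group_or_subplan =>'OLTP',
  mgmt_p1          =>5,
  switch_group     =>'BATCH',
  switch_time      =>60,
  switch_for_call  =>TRUE);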

Another set of directive parameters sets limits for specific types of usage (a short example follows the table):

Table 7-4. Resource manager directives limiting parameters

Parameter                 Description
------------------------  ----------------------------------------------------------------
undo_pool                 Limits the size (in kilobytes) of undo for a consumer group.
max_idle_time             Maximum session idle time, in seconds.
max_idle_blocker_time     Maximum time a session can be idle while blocking another session.
max_est_exec_time         Maximum execution time (in CPU seconds) for a session. If the
                          optimizer estimates an operation will take longer, the operation
                          is not started and an ORA-7455 error is issued.
active_sess_pool_p1       Specifies the maximum number of sessions that can concurrently
                          have an active call.
queueing_p1               Specifies the time after which a call in the inactive session
                          queue will time out. Default is NULL, which means unlimited.
parallel_degree_limit_p1  Specifies a limit on the degree of parallelism for any operation.
                          Default is NULL, which means unlimited. A value of 0 means all
                          operations will be serial.
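
The following sketch shows how a few of these parameters could be combined in the BATCH directive of the DAYTIME_PLAN; in an existing plan, you would typically apply such changes with the UPDATE_PLAN_DIRECTIVE procedure instead of re-creating the directive:

DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
  plan                =>'DAYTIME_PLAN',
  group_or_subplan    =>'BATCH',
  mgmt_p1             =>3,
  active_sess_pool_p1 =>4,     -- at most 4 concurrent active calls
  queueing_p1         =>300,   -- queued calls time out after 300 seconds
  max_idle_time       =>600);  -- sessions idle for more than 10 minutes are killed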

Consumer Group Mappings Using ORACLE_FUNCTION

The resource manager provides an option to map a small number of specific, resource-intensive operations to consumer groups, using the ORACLE_FUNCTION mapping attribute:

Table 7-5. Resource manager mapping by function

ORACLE_FUNCTION  Description
---------------  -----------------------------
BACKUP           Backup operations using RMAN.
COPY             Image copies using RMAN.
DATALOAD         Loading data using Data Pump.

When the database has strict performance objectives, it can be beneficial to map the above functions to a consumer group in order to manage their resource consumption. Here is how that is done:

BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING
    (DBMS_RESOURCE_MANAGER.ORACLE_FUNCTION, 'BACKUP', 'MAINTENANCE');
  DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING
    (DBMS_RESOURCE_MANAGER.ORACLE_FUNCTION, 'COPY', 'MAINTENANCE');
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/

Monitoring the Resource Manager

After you have implemented a resource manager plan, you will want to verify that it works as intended. There are dynamic performance views that show the behavior of the resource manager. One such view is V$RSRC_CONSUMER_GROUP:

Table 7-6. V$RSRC_CONSUMER_GROUP view relevant fields

Column             Description
-----------------  -------------------------------------------------------------------
NAME               The resource consumer group name.
ACTIVE_SESSIONS    The number of active sessions in the consumer group.
EXECUTION_WAITERS  The number of active sessions waiting for a time slice in which they
                   can use the CPU.
REQUESTS           The cumulative number of requests made by sessions in the consumer
                   group.
CPU_WAIT_TIME      The cumulative amount of time that Resource Manager made sessions in
                   the consumer group wait for CPU. This wait time does not include I/O
                   waits, delays from queue or latch contention, or the like.
                   CPU_WAIT_TIME is the sum of the elapsed time allocated to the
                   resmgr: cpu quantum wait event for the consumer group.
CPU_WAITS          The cumulative number of times sessions were made to wait because of
                   resource management.
CONSUMED_CPU_TIME  The total amount of CPU time accumulated (in milliseconds) by
                   sessions in the consumer group.
YIELDS             The cumulative number of times sessions in the consumer group had to
                   yield the CPU to other sessions because of resource management.

The following listing is a report you may use to display the metrics collected in the V$RSRC_CONSUMER_GROUP view. These metrics are a valuable tool for determining the effect our resource allocations had on the consumer groups during the test.

col name                        format a12            heading "Name"
col active_sessions             format 999            heading "Active|Sessions"
col execution_waiters           format 999            heading "Execution|Waiters"
col requests                    format 9,999,999      heading "Requests"
col cpu_wait_time               format 999,999,999    heading "CPU Wait|Time"
col cpu_waits                   format 99,999,999     heading "CPU|Waits"
col consumed_cpu_time           format 99,999,999     heading "Consumed|CPU Time"
col yields                      format 9,999,999      heading "Yields"

SELECT DECODE(name, '_ORACLE_BACKGROUND_GROUP_', 'BACKGROUND', name) name,
       active_sessions, execution_waiters, requests,
       cpu_wait_time, cpu_waits, consumed_cpu_time, yields
  FROM v$rsrc_consumer_group
ORDER BY cpu_wait_time;

               Active Execution               CPU Wait         CPU      Consumed
Name         Sessions   Waiters Requests          Time       Waits      CPU Time     Yields
------------ -------- --------- -------- ------------- ----------- ------------- ----------
BACKGROUND         34         0       76             0           0             0          0
APPS               30        13       30    87,157,739  11,498,286    47,963,809    365,611
REPORTS            30        27       31   145,566,524   2,476,651    10,733,274     78,950
MAINTENANCE        30        29       30   155,018,913   1,281,279     5,763,764     41,368
OTHER_GROUPS       34        29      131   155,437,715   1,259,766     5,576,621     40,168

If you are using the multitenant option, this view shows the resource management statistics per container (PDB). In order to relate the statistics to a pluggable database, add the CON_ID column to the query.
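
For example, a minimal variation of the report above that includes the container ID could look like this:

SELECT con_id,
       DECODE(name, '_ORACLE_BACKGROUND_GROUP_', 'BACKGROUND', name) name,
       active_sessions, cpu_wait_time, cpu_waits, consumed_cpu_time, yields
  FROM v$rsrc_consumer_group
 ORDER BY con_id, cpu_wait_time;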

Image Tip  The V$RSRCMGRMETRIC and V$RSRCMGRMETRIC_HISTORY views are also very useful for monitoring the effects that your DBRM resource allocations have on sessions in the database.

Resource Manager Views

Oracle supplies a number of views that report configuration, history, and metrics for Resource Manager. Let's take a look at a few of the views that are useful for reviewing and monitoring resources in your DBRM configuration:

V$RSRC_PLAN: This view displays the configuration of the current active resource plan.

V$RSRC_PLAN_HISTORY: This view shows historical information about resource plans, including when they were activated and deactivated and whether they were activated by the database scheduler or scheduler windows. The history includes the latest 15 plans.

V$RSRC_CONSUMER_GROUP: This view shows information about the current active consumer groups, including performance metrics.

V$RSRC_CONS_GROUP_HISTORY: This view shows the historical information from the V$RSRC_CONSUMER_GROUP view; join its SEQUENCE# column to the same column in V$RSRC_PLAN_HISTORY to get the plan name and times (see the example query following this list).

V$RSRC_SESSION_INFO: This view shows performance statistics for sessions and how they were affected by the Resource Manager.

V$SESSION: This is not really a Resource Manager view; however, the RESOURCE_CONSUMER_GROUP field is useful for determining what resource group a session is assigned to.

V$RSRCMGRMETRIC: This view contains resource usage per consumer group for the past minute and is refreshed every minute.

V$RSRCMGRMETRIC_HISTORY: This view contains one hour of history of the contents of V$RSRCMGRMETRIC.

DBA_CDB_RSRC_PLANS / CDB_CDB_RSRC_PLANS: CDB resource plans.

DBA_CDB_RSRC_PLAN_DIRECTIVES / CDB_CDB_RSRC_PLAN_DIRECTIVES: CDB resource plan directives.

CDB_RSRC_CATEGORIES: Database consumer group categories per container.

DBA_RSRC_CATEGORIES: Database consumer group categories.

DBA_RSRC_PLANS: Lists all database resource plans in the data dictionary, together with plan settings.

DBA_HIST_RSRC_PLANS: Historical information on resource manager plans (AWR; licensed view).

DBA_RSRC_CONSUMER_GROUPS: Lists all consumer groups created in the dictionary.

DBA_HIST_RSRC_CONSUMER_GROUPS: Historical information on resource manager consumer groups (AWR; licensed view).

DBA_RSRC_CONSUMER_GROUP_PRIVS: Lists all the users and the consumer groups to which they have been granted access. As has been emphasized before, a user must be granted access to any group other than OTHER_GROUPS before the Resource Manager can switch the user to that group.

DBA_RSRC_PLAN_DIRECTIVES: Lists all resource manager directives stored in the data dictionary.

DBA_RSRC_GROUP_MAPPINGS: Lists session to resource group mapping rules defined in the data dictionary.

DBA_RSRC_MAPPING_PRIORITY: Lists the priority of session attributes, which is used by the Resource Manager to determine in which order to evaluate mapping rules.

DBA_USERS: This is not a resource manager view, but the field INITIAL_RSRC_CONSUMER_GROUP shows the initial resource manager group a database user is assigned to.

DBA_RSRC_IO_CALIBRATE: This view is related to the DBMS_RESOURCE_MANAGER.CALIBRATE_IO procedure and shows the outcome of the calibration.

DBA_RSRC_MANAGER_SYSTEM_PRIVS: This view shows all users and roles granted the ADMINISTER_RESOURCE_MANAGER system privilege. This privilege must be granted using the DBMS_RESOURCE_MANAGER_PRIVS package.
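
As an example of using these views together, the following sketch joins the consumer group history to the plan history on SEQUENCE# to report throttling per consumer group for each historical plan activation (column names as documented for these views):

SELECT ph.name plan_name, ph.start_time, ph.end_time,
       ch.name consumer_group, ch.cpu_waits, ch.cpu_wait_time, ch.consumed_cpu_time
  FROM v$rsrc_plan_history ph, v$rsrc_cons_group_history ch
 WHERE ch.sequence# = ph.sequence#
 ORDER BY ph.start_time, ch.cpu_wait_time DESC;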

I/O Resource Manager

Earlier in this chapter, we discussed Oracle’s Database Resource Manager, which manages CPU resources within a database through consumer groups and plan directives. Sessions are assigned to resource groups, and plan directives manage the allocation of resources by assigning values such as a CPU percentage or a share to resource management attributes like MGMT_P1. The database resource manager, however, is limited to managing resources within the database. The database resource manager actually does manage I/O resources, but in a somewhat indirect manner by limiting CPU and parallelism available to user sessions (through prioritization and limiting as set in the directives to which consumer groups are mapped). It is extremely important to realize that any session needs to be allowed to run by the database resource manager in the first place, and then needs to be able to get a timeslice to run on the CPU (on the operating system level) before it can request I/Os. In other words, CPU is the ultimate resource needed to do I/O.

Before Exadata came along, Oracle had no presence at the storage tier, and limiting CPU and parallelism to database sessions was the only way to (indirectly) manage I/O. Exadata lifts I/O Resource Management above the database tier and manages I/O at the storage cell in a very direct way. Databases installed on Exadata send I/O requests to cellsrv on the storage cells using a proprietary protocol known as Intelligent Database protocol (iDB). Using iDB, the database packs additional attributes in every I/O call to the storage cells. This additional information is used in a number of ways. For example, IORM uses the type of file (redo, undo, datafile, control file, and so on) for which the I/O was requested to determine whether caching the blocks in Flash Cache would be beneficial or not. Five attributes known to us are embedded in the I/O request: performance profile name (if set), database, container in the case of multitenancy, the consumer group, and the consumer group’s category. These five bits of additional information are invaluable to Oracle’s intelligent storage. Knowing which database or container is making the request allows IORM to prioritize I/O requests by database.

Categories extend the concept of consumer groups on Exadata platforms. Categories are assigned to consumer groups within the database using the Database Resource Manager. Common categories, defined in multiple databases, can then be allocated a shared I/O priority. For example, you may have several databases that map user sessions to an INTERACTIVE category. I/O requests coming from the INTERACTIVE category may now be prioritized over other categories such as REPORTS, BATCH, or MAINTENANCE.

Resource management profiles are a new, easy way to group databases under a common I/O resource management profile. Resource management profiles were introduced with Exadata version 12.1.2.1.0 and database version 12.1.0.2. They differ from categories in the sense that profiles group entire databases, while categories group sessions in different databases that have the same category set in a directive.

IORM Methods

IORM provides four distinct methods for I/O resource management: Interdatabase, Category, Intradatabase, and Resource Profile. These methods may be used individually or in combination.

Interdatabase Resource Plans

IORM determines the priority of an I/O request based on the name of the database initiating the request and its priority set in the IORM plan. Interdatabase IORM is useful when Exadata is hosting multiple databases and you need to manage I/O priorities between the databases. This is a plan created on each storage server.

Category Resource Plans

IORM determines the priority of an I/O request among multiple databases by the category that initiated the request and the priority for the category set in the plan. Managing I/O by category is useful when you want to manage I/O priorities by workload type. For example, you can create categories like APPS, BATCH, REPORTS, and MAINTENANCE in each of your databases and then set an I/O allocation for these categories according to their importance to your business. If the APPS category is allocated 70%, then sessions assigned to the APPS category in all databases share this allocation. This is a plan created on each storage server, working together with the categories that the database resource manager assigns to consumer groups in multiple databases.

Intradatabase Resource Plans

Unlike Interdatabase and Category IORM, Intradatabase IORM is configured at the database tier using DBRM. DBRM has been enhanced to work in partnership with IORM to provide fine-grained I/O resource management using resource groups defined in the database. This is done by allocating the I/O share and priority in the storage server to consumer groups using the same mechanism used to allocate CPU, the MGMT_Pn attribute, and the utilization_limit parameter. For example, the SALES database may be allocated 50% using an Interdatabase IORM plan. That 50% may be further distributed to the APPS, REPORTS, BATCH, and OTHER_GROUPS consumer groups within the database. This ensures that I/O resources are available for critical applications, and it prevents misbehaving or I/O-intensive processes from stealing I/O from higher-priority sessions inside the database. The management of pluggable databases also falls into the intradatabase IORM plan’s responsibility.

Resource Management Profiles

Resource management profiles, fully named I/O resource management interdatabase plans, require setting a parameter in the database or container database (DB_PERFORMANCE_PROFILE) to a profile name defined on the storage servers. The big difference with Interdatabase plans is that instead of setting a directive for every database name in the dbplan, the names set in the dbplan now reflect profile names, to which databases are mapped by setting the DB_PERFORMANCE_PROFILE parameter. This is aimed at environments where the number of databases is moderate to high (10 or more) to simplify IORM by having a few profiles to which all the databases are mapped. In order for the storage server to understand it is a performance profile and not a database name, “type=profile” is added to the definition. Profiles cannot be used together with category plans.

How IORM Works

IORM manages I/O at the storage cell by organizing incoming I/O requests into queues according to the database name, category or profile, or consumer group that initiated the request. It then services these queues according to the priority defined for them in the resource plan. IORM only actively manages I/O requests when needed. This means that when a flash device or cell disk is not fully utilized, an I/O request is dispatched to the device immediately by cellsrv. However, when a flash device or cell disk is fully utilized, IORM will queue the I/O request in an I/O queue (per device), and schedule the I/O requests in the queues according to the priority defined in the resource manager. Please note that with storage server versions prior to 12.1.2.1.0, flash was excluded from resource management.

For example, suppose there are two databases using a cell server, DB1 and DB2, and the I/O resource plan on the cells is defined to give DB1 75% and DB2 25%. When there is excess capacity available, the I/O requests are serviced in a first-in-first-out (FIFO) manner. But when the devices on the cells begin to saturate, I/O requests get queued. Once queued, IORM enforces the resource plan, which means the I/O requests are reordered as defined in the resource plan. Background processes are prioritized, too, based on internal resource plans with different priorities for different types of I/O. Needless to say, I/O requests of critical background processes like the log writer are given higher priority than those of foreground (client) sessions.

Please note that I/O resource management for both flash and hard disks is a feature of Exadata release 12.1.2.1.0 or higher; earlier versions of IORM only managed hard disks. Also, resource management for flash is different from resource management for hard disks: small flash I/Os (smaller than 128K) always take priority over large I/Os, regardless of any IORM plan, and resource management plans are applied only to the large I/Os.

IORM Architecture

For every physical storage device in use by the storage server, an IORM queue is kept, regardless of the existence of any resource manager plan. The IORM queues are first organized by category or profile, depending on which type is used; this is the reason profiles and categories cannot work together. After being organized by category, the databases are organized by interdatabase plan and, lastly, by intradatabase plan. This is the IORM evaluation order:

  1. Category/profile
  2. Interdatabase plan
  3. Pluggable database
  4. Intradatabase plan

In order to see this for yourself, you can dump IORM information from the cell daemon. This dump is not documented and should not be performed on a live system; the information here is provided for completeness and educational purposes. The IORM information is managed and kept per cell; the cell daemons do not work together like a cluster, but work standalone.

First, log on to a cell, start cellcli, and execute: alter cell events = 'immediate cellsrv.cellsrv_statedump(2,0)':

CellCLI> alter cell events = 'immediate cellsrv.cellsrv_statedump(2,0)'
Dump sequence #11 has been written to /opt/oracle/cell/log/diag/asm/cell/enkcel04/trace/svtrc_31370_114.trc
Cell enkcel04 successfully altered

It is very important to note the name of the trace file. The dump is very large and contains a lot of information for the purpose of bug resolving and diagnosis for Oracle development. Here is a snippet from the file, showing IORM statistics for disk /dev/sdc:

IORM stats for disk=/dev/sdc
Heap stats: Inuse=62KB Total=207KB
--------- IORM Workload State & Characterization ---------
IORM: Solo Workload
Solo workload (no db or cg): 0 transitions
IORM boost =429.000000 (cnt = 3395, amt = 1458813)
#Bypassedios=0 #IOs skipped for fencing=1932 #IOs cancelled=0
#served=14073 bitmap=0x0 #queued=0 adtime=0ms asmrdtime=3ms #cumulserved=56404684 #pending=0 #lpending=0
busystate=1 #max_conc_io=5 write_cache_hit_rate=98% iocost=0.66ms
#free_lrg_wrts=11 #free_sml_wrts=85728 write_cost_adjust=0%
        catidx=0 bitmap=0x0 OTHER
          SIO:#served=3249 #queued=0 Util=2% aqtime=2us ahdtime=445us afdtime=0us
                dbidx=0 bitmap=0x0 DBM01
                  SIO:#served=4 #queued=0 Util=0% aqtime=2us ahdtime=123us afdtime=0us lastreqtime=Wed Jan
                  7 19:46:46 1970 97 msec

                        cgidx=0 bitmap=0x0 cgname=ORA$AUTOTASK limit=90
                        cgidx=1 bitmap=0x0 cgname=OTHER_GROUPS
                        cgidx=2 bitmap=0x0 cgname=SYS_GROUP
                        cgidx=3 bitmap=0x0 cgname=_ORACLE_BACKGROUND_GROUP_
                          SIO:#served=4 #queued=0 Util=0% aqtime=2us ahdtime=123us afdtime=0us
                                 #concios=4, #fragios=0 #starvedios=0 #maxcapwaits=0
                        cgidx=4 bitmap=0x0 cgname=_ORACLE_MEDPRIBG_GROUP_
                        cgidx=5 bitmap=0x0 cgname=_ORACLE_LOWPRIBG_GROUP_
                        cgidx=6 bitmap=0x0 cgname=_ORACLE_LOWPRIFG_GROUP_

(This dump was created on a cell server version 12.1.2.1.0; this information might not be available in earlier versions.)

This dump shows that the statistics are kept per device. As indicated, this is the third physical disk, /dev/sdc. The first highlighted piece is "catidx=0"; this shows that categories are evaluated first. If no categories or profiles have been defined, there are two categories, OTHER and _ASM_. The second highlighted piece is "dbidx=0." This is the list of databases that the resource manager has seen for this device up until now, and the second level at which IORM can operate. The third highlighted piece is "cgidx=0," which shows the list of consumer groups in the database. Consumer groups 3, 4, 5, and 6 are Oracle-internal consumer groups, which were mentioned earlier in the chapter. The _ORACLE_BACKGROUND_GROUP_ group is the internal consumer group for critical I/Os of background processes. Without an intradatabase resource manager plan configured in the database, all user/foreground processes are mapped to the OTHER_GROUPS consumer group.

IORM Objective

After installation, the cell server is already running with IORM active. Storage server versions prior to 11.2.3.2 ran without IORM and needed an IORM plan to be created before IORM could be enabled. Starting from Exadata version 11.2.3.2, IORM is always enabled and runs with the IORM objective 'basic.' Following is a list of the objectives that can be set:

  • low_latency: This setting provides optimization for applications that are extremely sensitive to I/O latency. It provides the lowest possible I/O latencies by significantly limiting disk utilization. In other words, throughput-hungry applications will be significantly (negatively) impacted by this optimization objective.
  • high_throughput: This setting provides the best possible throughput for DW transactions by attempting to fully utilize the I/O capacity of the storage cells. It is the opposite of low_latency and, as such, it will significantly (negatively) impact disk I/O latency.
  • balanced: This setting attempts to strike a balance between low latency and high throughput. This is done by limiting disk utilization for large I/O operations to a lesser degree than the low_latency objective described above. Use this objective when workloads are mixed and you have no applications that require extremely low latency.
  • auto: This setting allows IORM to determine the best optimization objective for your workload. Cellsrv continuously monitors the large/small I/O requests and applies the optimization method on a best-fit basis. If 75% or more of the I/O operations from a consumer group are small I/O (less than 128K), it is considered to be a latency-oriented consumer group and is managed accordingly.
  • basic: This setting is the default after installation. With this objective, the storage server does limited optimization for low latency, but does not throttle, which the other objectives do.

The objective can be set using the cellcli command-line tool:

CellCLI> alter iormplan objective='auto';
IORMPLAN successfully altered

In most cases, it is beneficial to set the objective to auto; IORM then tries to optimize based on the usage patterns of the databases.
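
To verify which objective a cell is currently configured with, you can list just that attribute:

CellCLI> list iormplan attributes objective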

The objective is applied only to hard disk I/Os. For I/Os done to flash prior to storage server version 12.1.2.1.0, there was no prioritization. Starting from storage server version 12.1.2.1.0, the objective for flash I/Os is to favor small I/Os (128KB or less), regardless of the objective set.

Configuring Interdatabase IORM

An interdatabase I/O resource management plan is configured using an IORM plan. This plan determines the database I/O priorities for the storage cell. IORM plans are created using the CellCLI command ALTER IORMPLAN. There can be only one IORM plan per storage cell, regardless of how many database instances (clustered or single instance) use it for storage. Creating an interdatabase IORM plan is fairly simple. The first step is to determine what your allocation policy should be for each database. You will use these allocation policies to define the directives for your IORM plan. The LEVEL attribute specifies the priority a database should be given relative to other databases in the plan. The ALLOCATION attribute determines the percentage of I/O a database will be given out of the total I/O available on its level. There always needs to be a directive named other for any database that is not explicitly listed in the plan. The following example demonstrates how to create an IORM plan:

CellCLI> alter iormplan dbplan=((name=database1, level=1, allocation=60), -
> (name=database2, level=2, allocation=80), -
> (name=other, level=3, allocation=100))
IORMPLAN successfully altered

The CellCLI command list iormplan detail displays our new IORM plan. Notice that the catPlan attribute is empty. This is a placeholder for the Category IORM plan we will be looking at later in this chapter.

CellCLI> list iormplan detail
         name:                       enkcel04_IORMPLAN
         catPlan:
         dbPlan:                     name=database1,level=1,allocation=60
                                     name=database2,level=2,allocation=80
                                     name=other,level=3,allocation=100
         objective:                  auto
         status:                     active

The aggregate allocation for all databases on a level may not exceed 100%. If the sum of allocations on any level exceeds 100%, CellCLI will throw an error. For example, the following listing shows the error CellCLI produces when over 100% is allocated on level 1:

CellCLI> alter iormplan dbplan=((name=database1, level=1, allocation=75), -
> (name=database2, level=1, allocation=80), -
> (name=other, level=3, allocation=100))

CELL-00006: The IORMPLAN command contains an invalid allocation total at level 1.

Because IORM is always enabled, setting the above (correct) plan means the IORM database plan is enforced. If the plan does not work as intended, the plan can be disabled by specifying an empty database plan:

CellCLI> alter iormplan dbplan=''
IORMPLAN successfully altered

When the Data Guard option is used, the role of the database in the Data Guard setup (primary or standby) can be included in the IORM plan to specify different resource limitations for both roles:

CellCLI> alter iormplan dbplan=((name=database1, level=1, allocation=75, role=primary), -
> (name=database1, level=1, allocation=25, role=standby), -
> (name=database2, level=2, allocation=80), -
> (name=other, level=3, allocation=100))
IORMPLAN successfully altered

Because the I/O resource manager plan needs to be changed whenever a database is added that needs resource management beyond the other directive, it is a good idea to take a look at I/O resource manager profiles, described a little later in this chapter. IORM profiles need cell server 12.1.2.1.0 or above and database version 12.1.0.2 with Exadata bundle patch 4 or higher. They have the advantage that you can create a few resource profiles (High, Medium, Low, for example) and assign databases to these profiles by setting the parameter DB_PERFORMANCE_PROFILE to the profile name. Especially if new databases are added and old ones removed regularly, such as in a database-as-a-service environment, this can save a lot of work.

Configuring Interdatabase IORM: Shares

Starting from cell server version 11.2.3.1, IORM plans can be created based on shares, which express a database's relative weight, instead of percentages. Percentage-based plans have a limit of 32 databases, while share-based IORM plans can support up to 1023 databases. Share-based plans cannot have multiple levels like percentage-based plans can, and are therefore a bit more limited. However, share-based plans are easier to read and understand than percentage-based plans. Here is how a share-based plan is configured:

CellCLI> alter iormplan dbplan=((name=database1, share=10), -
> (name=database2, share=4))
IORMPLAN successfully altered

Setting this plan means that when database1 and database2 are the only databases using the cell server, database1 gets 71% (10/(10+4)*100) and database2 gets 29% of the I/O resources in case of queueing. By default, databases that are not listed in the share plan get a share of 1. If you want to change that default, you can do so by adding a directive for name=default.
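
For example, the following sketch gives every database not explicitly named in the plan a weight of 2 instead of the default of 1:

CellCLI> alter iormplan dbplan=((name=database1, share=10), -
> (name=database2, share=4), -
> (name=default, share=2))
IORMPLAN successfully altered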

Limiting Excess I/O Utilization

Ordinarily, when excess I/O resources are available (allocated but unused by other consumer groups), IORM allows a consumer group to use more than its allocation. For example, if the database1 database is allocated 50% at level 1, it may consume I/O resources above that limit if other databases have not fully utilized their allocation. You may choose to override this behavior by setting an absolute limit on the I/O resources allocated to specific databases. This provides more predictable I/O performance for multi-database server environments. The LIMIT IORM attribute is used to set a cap on the I/O resources a database may use even when excess I/O capacity is available. The following listing shows an IORM plan that caps the database1 database at 80% of the cell’s IO capacity.

CellCLI> alter iormplan dbPlan=((name=database1, level=1, allocation=50, limit=80), -
> (name=other, level=2, allocation=100))
IORMPLAN successfully altered

Please note that maximum I/O limits may also be defined at the consumer group level by using the UTILIZATION_LIMIT attribute in your DBRM resource plans.

Configuring Interdatabase IORM: Flash Attributes

When creating a plan, either using the allocation attribute to set percentages or the shares attribute to set relative weight, the storage server provides a number of plan attributes to manage how flash is used. The following attributes can be used to manage the use of the Flash Log feature and the Flash Cache feature:

CellCLI> alter iormplan dbplan=((name=database1,share=10,flashlog=on,flashcache=on), -
> (name=database2,share=4,flashlog=on,flashcache=off), -
> (name=default,share=1,flashlog=off,flashcache=off))
IORMPLAN successfully altered

The usage of the Flash Log and Flash Cache attributes is very straightforward: the options are simply on or off. In the above example, any database not explicitly named in the plan will have the Flash Log and Flash Cache features disabled, because the default is redefined in the plan line with name=default. The default setting (when not explicitly defined in an IORM plan) of the Flash Log and Flash Cache attributes is on.

Starting from Exadata version 12.1.2.0, IORM provides a few more flash management attributes: flashcachemin and flashcachelimit. The flashcachemin attribute guarantees a minimal size for database objects in the Flash Cache; flashcachelimit sets a maximum to the amount of space a database can use in the Flash Cache:

CellCLI> alter iormplan dbplan=((name=database1,share=10,flashcachemin=1G), -
> (name=database2,share=4,flashcachemin=500M,flashcachelimit=1G), -
> (name=default,share=1,flashcachelimit=500M))
IORMPLAN successfully altered

In this example, database1 has a guaranteed minimum of 1G of Flash Cache space and no limit; database1 could be an important database that needs flash for performance. The other named database, database2, is also guaranteed an amount of space in the Flash Cache (500M), but is limited to a maximum of 1G. Any other database is limited to a maximum of 500M of Flash Cache space.

Category IORM

The I/O Resource Manager (IORM) extends the concept of resource groups with an attribute known as a category. While resource groups allow DBRM to manage resources within a database, categories provide I/O resource management among multiple databases. For example, suppose our two databases (database1 and database2) have similar workloads. They both host OLTP applications that do short, time-sensitive transactions. During business hours, these transactions must take priority. These databases also do a fair amount of batch processing, such as running reports and maintenance jobs. The batch processing takes a lower priority during business hours. These two workloads can be managed and prioritized using IORM categories. The categories APPS_CATEGORY and BATCH_CATEGORY can be defined in both databases for high-priority applications and long-running, lower-priority activities, respectively. If APPS_CATEGORY is allocated 70% on level 1, then no matter how heavily loaded the storage grid is, sessions assigned to this category, for both databases, will be guaranteed a minimum of 70% of all I/O.

Configuring Category IORM

Setting up Category IORM is fairly straightforward. Once you have created your DBRM consumer groups, you need to create categories in the database and assign them to your consumer groups. The final step is to create an IORM plan in the storage cells to establish I/O allocation and priority for each category. You can define as many as eight levels in your Category IORM Plan.

In order to show the use of IORM category plans, we will create two new categories, OLTP_CATEGORY and BATCH_CATEGORY, and assign them to the OLTP, BATCH, and MAINTENANCE consumer groups. For the sake of brevity, the creation of the consumer groups is not shown. This example uses the consumer groups created in the section Creating a (Pluggable) Database Resource Plan. The following listing creates our new categories and assigns them to the resource groups. Remember that you will need to run these commands on all databases participating in the IORM category plan. For illustration purposes, we will keep the number of categories to two. The BATCH and MAINTENANCE resource groups will be assigned to the category BATCH_CATEGORY:

BEGIN
  dbms_resource_manager.clear_pending_area();
  dbms_resource_manager.create_pending_area();

  -- Create Categories --
  dbms_resource_manager.create_category(
     category => 'OLTP_CATEGORY',
     comment  => 'Category for Interactive Applications');
  dbms_resource_manager.create_category(
     category => 'BATCH_CATEGORY',
     comment  => 'Batch and Maintenance Jobs');

  -- Assign Consumer Groups to Categories --
  dbms_resource_manager.update_consumer_group(
     consumer_group => 'OLTP',
     new_category   => 'OLTP_CATEGORY');
  dbms_resource_manager.update_consumer_group(
     consumer_group => 'BATCH',
     new_category   => 'BATCH_CATEGORY');
  dbms_resource_manager.update_consumer_group(
     consumer_group => 'MAINTENANCE',
     new_category   => 'BATCH_CATEGORY');

  dbms_resource_manager.submit_pending_area();
END;
/

PL/SQL procedure successfully completed.

To check your resource-group-to-category mappings, query the DBA_RSRC_CONSUMER_GROUPS view as follows. Notice that the OTHER_GROUPS consumer group was assigned to the OTHER category. That mapping is created automatically by Oracle and cannot be altered:

SQL> SELECT consumer_group, category
               FROM DBA_RSRC_CONSUMER_GROUPS
              WHERE consumer_group
                 in ('OLTP','BATCH','MAINTENANCE','OTHER_GROUPS')
              ORDER BY category;

CONSUMER_GROUP                 CATEGORY
------------------------------ ------------------------------
OLTP                           OLTP_CATEGORY
BATCH                          BATCH_CATEGORY
MAINTENANCE                    BATCH_CATEGORY
OTHER_GROUPS                   OTHER

Now we can create a new IORM category plan on the storage cells and set I/O limits on these categories. Before we do, though, we will drop the Interdatabase IORM plan we created in the previous example. Remember that each storage cell maintains its own IORM plan, so you will need to run these commands on every cell in your storage grid.

CellCLI> alter iormplan dbplan= ''

IORMPLAN successfully altered

Now we are ready to create our IORM category plan. The following command creates a plan in which OLTP_CATEGORY and BATCH_CATEGORY are allocated 70% and 30%, respectively, of the total cell I/O at level 1. The default category, OTHER, is allocated 100% on level 2.

CellCLI> alter iormplan catplan=((name=OLTP_CATEGORY, level=1, allocation=70), -
> (name=BATCH_CATEGORY, level=1, allocation=30), -
> (name=OTHER, level=2, allocation=100))
IORMPLAN successfully altered

Again, we will use the CellCLI command list iormplan detail to confirm that our IORM category plan is configured the way we want it:

CellCLI> list iormplan detail
         name:                       enkcel04_IORMPLAN
         catPlan:                    name=OLTP_CATEGORY,level=1,allocation=70
                                     name=BATCH_CATEGORY,level=1,allocation=30
                                     name=OTHER,level=2,allocation=100
         dbPlan:
         objective:                  auto
         status:                     active

Because we dropped the interdatabase plan from the previous exercise, the dbPlan field is empty. Later, in the IORM metrics section of the chapter, we will discuss how the effects of IORM can be monitored at the storage cells.

I/O Resource Manager and Pluggable Databases

With Oracle Database 12c, a new architecture was implemented in the database. This architecture has been mentioned in a number of places already and is commonly referred to as the Oracle Multitenant option. Multitenancy has been implemented almost everywhere in the database; there are only a few functional gaps between a container database and a non-container database. IORM thankfully supports pluggable databases. Just as with the intradatabase IORM plan, the settings you define at the CDB level by means of the CDB resource manager plan are sent to the cells as soon as the plan is activated on the RDBMS layer. Whatever shares are made available to a PDB apply equally to CPU and I/O. As with all resource management functions in the database, I/O resource management on pluggable databases will only be visible if there is an I/O constraint.

I/O Resource Manager Profiles

Exadata Storage server version 12.1.2.1.0 introduced a new mechanism to easily group databases in performance categories called I/O resource management profiles. The idea is to create a named performance profile on the storage servers and let the database bind to such a profile using a parameter that has been set to the name of the performance profile. The database part of this feature is a parameter named DB_PERFORMANCE_PROFILE, which is introduced with Oracle database version 12.1.0.2 Exadata bundle patch 4.

Performance profiles and category IORM plans cannot be used at the same time.

When using performance profiles, up to eight profiles can be created. Performance profiles are designed for handling large numbers of databases and therefore use the share-based prioritization method only. You might recall that interdatabase resource plans using the allocation (percentage)-based method can handle only up to 32 databases; by using the share-based prioritization method, performance profiles can handle up to 1023 databases. The database parameter DB_PERFORMANCE_PROFILE is not dynamic, which means it requires a database restart in order to set it. When using DB_PERFORMANCE_PROFILE on a container database (multitenant option), it can only be set at the CDB/root level, not at the pluggable-database level; all pluggable databases inherit the DB_PERFORMANCE_PROFILE setting from the root.

This is an example on how a performance profile plan can be set:

CellCLI> alter iormplan dbplan=((name=gold, share=12, limit=100, type=profile), -
> (name=silver, share=5, limit=65, type=profile), -
> (name=bronze, share=3, limit=45, type=profile), -
> (name=default, share=1, limit=10))
IORMPLAN successfully altered

Performance profiles and interdatabase plans are two different resource management options in the storage server. This means they can be used at the same time, although we discourage this for the sake of keeping resource management as simple as possible. Performance profile directives are recognizable by type=profile; interdatabase plan directives either have no type or have type=database added to the dbplan directives.

After the performance profiles have been set at all the storage servers, the DB_PERFORMANCE_PROFILE parameter can be set at the databases (requiring a restart) to use the profile on the storage layer. Because performance profiles have been implemented at the category (first) level, interdatabase plans can be used to divide resources inside the profile and intradatabase plans for managing resources inside the database.
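
On the database side, that could look like the following sketch, using the gold profile from the example above. Because the parameter is not dynamic, it is set in the spfile and takes effect after the instance(s) have been restarted:

SQL> alter system set db_performance_profile='gold' scope=spfile sid='*';

System altered.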

Image Note  In most cases, a single-level I/O resource plan is sufficient. As they do with DBRM, multi-level IORM resource plans increase the complexity of measuring the effectiveness of your allocation scheme.

When using multi-level allocation schemes, it’s important to understand that I/O resources allocated but unused by a database, category, or consumer group on level 1 are immediately passed to the next level. For example, if you have databases A and B allocated 70%/30% on level 1, and database C is allocated 100% at level 2, then if database A uses only 50% of its allocation, the remaining 20% is passed to database C. Database B cannot capitalize on I/O resources allocated but unused by database A because A and B are on the same level. This is a subtle but important distinction of multi-level plans. If you are not careful, you can find yourself unintentionally giving excess I/O resources to less important databases at lower levels rather than making those resources available to your higher-priority databases on level 1.

The share-based resource plans are single level by design, and they are easy to create and configure. In most cases, using a share-based plan is the best way to use IORM. Unless you have a specific need that can only be solved by (multi-level) allocation (percentage)-based plans, these should be used.

Resource Management Directives Matrix

With all the features available, some of which cannot be used together, the following matrix gives an overview of which directives can and cannot be combined. This might help when designing an I/O resource management strategy.

[Figure: Resource management directives matrix; the numbered notes below refer to markers in the matrix.]

  1. LIMIT can be used by SHARES or LEVEL and ALLOCATION
  2. Should have both primary and standby directives set
  3. Only if using shares
  4. Only if using level and allocation
  5. UTILIZATION_LIMIT and PARALLEL_SERVER_LIMIT directives
  6. DEFAULT shares setting for new PDBs
  7. The easiest way is to go with SHARES
  8. Specified on MGMT_P1
  9. OTHER_GROUPS is required
  10. Category plan cannot be used when performance profiles are used (vice versa)
  11. Applies to DBRM and PDB
  12. DB_PERFORMANCE_PROFILE set on either non-CDB or CDB (PDBs inherit from CDB$ROOT)

IORM Monitoring and Metrics

I/O performance metrics are collected and maintained for IORM in the storage cell. These metrics may be used to determine the effects your IORM plan has on the databases, categories or profiles, and consumer groups for which you defined resource directives in your environment. For example, you can see how much I/O a particular database is using compared to other databases. By observing the actual I/O distribution for your IORM consumer groups, you can determine whether adjustments need to be made to provide adequate I/O resources for applications using your databases. In this section, we will look at how these metrics are organized and tap into the valuable information stored there. More information about Exadata monitoring, including information taken from the storage servers, is found in Chapters 11 and 12.

There are two different sources for metrics available via cellcli: METRICCURRENT, which contains the metrics measured over the last minute, and METRICHISTORY, which contains the expired METRICCURRENT metrics, retained for a certain number of days (seven by default). The retention setting can be seen by querying the metricHistoryDays attribute of the cell:

CellCLI> list cell attributes name, metrichistorydays
         enkcel04         7

If there is a need to change the time the metric history is kept, this can be done using an alter cell command:

CellCLI> alter cell metricHistoryDays='14'
Cell enkcel04 successfully altered

CellCLI> list cell attributes name, metricHistoryDays
         enkcel04        14

In most cases, seven days is enough history for the typical use of the cell metrics.

Understanding IORM Metrics

When questions about I/O throughput arise, with or without IORM plans created and set, one of the first things to check on the cell is the actual activity. The following cellcli command shows the current I/O bandwidth per database, grouped by the storage medium used:

CellCLI> list metriccurrent where objecttype='IORM_DATABASE' and name like 'DB_(FC_|FD_|)IO_BY_SEC'
         DB_FC_IO_BY_SEC         ASM                      0 MB/sec
         DB_FC_IO_BY_SEC         TEST                     0 MB/sec
         DB_FC_IO_BY_SEC         COLVIN                   0 MB/sec
         DB_FC_IO_BY_SEC         MBACH                    0 MB/sec
         DB_FC_IO_BY_SEC         _OTHER_DATABASE_         0 MB/sec
         DB_FD_IO_BY_SEC         ASM                      0 MB/sec
         DB_FD_IO_BY_SEC         TEST                     0 MB/sec
         DB_FD_IO_BY_SEC         COLVIN                   0 MB/sec
         DB_FD_IO_BY_SEC         MBACH                    0 MB/sec
         DB_FD_IO_BY_SEC         _OTHER_DATABASE_         0 MB/sec
         DB_IO_BY_SEC            ASM                      0 MB/sec
         DB_IO_BY_SEC            TEST                     0 MB/sec
         DB_IO_BY_SEC            COLVIN                   0 MB/sec
         DB_IO_BY_SEC            MBACH                    0 MB/sec
         DB_IO_BY_SEC            _OTHER_DATABASE_         0 MB/sec

The list command shows the bandwidth used by the databases that are visible to the cell, by Flash Cache (DB_FC_IO_BY_SEC), flash disk (DB_FD_IO_BY_SEC), and hard disk (DB_IO_BY_SEC). There are a few things worth pointing out. The database ASM is a special database indicator for ASM-initiated and related tasks, like rebalancing. Also, _OTHER_DATABASE_ is a special database name for I/O that is not directly issued on behalf of a database on the compute layer. An example of _OTHER_DATABASE_ I/O is the destaging of Write-back Flash Cache data to disk.

IORM Metrics: metric_iorm.pl

In order to make fetching and interpreting cell I/O performance and I/O resource manager statistics on the command line easier, Oracle provides a Perl script called metric_iorm.pl in My Oracle Support note 1337265.1 (Tool for Gathering I/O Resource Manager Metrics: metric_iorm.pl). This tool is very valuable for diagnosing both absolute disk and flash performance and the effects of I/O resource management. There is not much more to say about the script here, because the My Oracle Support note fully covers installation (done by placing the script in the home directory of the user on the cell servers, typically root or celladmin) and usage (essentially how to get historical data out).

This script fetches the statistics on disk and flash I/O performance together with IORM-related statistics and presents them in a readable way. You can read more about these statistics in the section IORM-Related Metrics Overview. Chapter 11 provides further examples and use cases for the script.

Workload Management

IORM metrics provide insight into how cellsrv is allocating I/O resources among the consumers in your storage grid. Cellsrv keeps track of I/O requests broadly categorized as "small requests" and "large requests." By comparing the large (LG) and small (SM) I/O requests in the IORM metrics, you can determine whether your databases lean more toward a DW workload (high throughput) or an OLTP workload (low latency). By comparing the IORM_MODE with the actual workload on your storage cells, you can determine whether the current IORM objective is appropriate. For example, if you find that the majority of I/O operations in the storage cells are larger than 128K and thus large (LG), you could consider setting the objective for high throughput. Likewise, if you find the majority of I/O operations in the storage cells are smaller than 128K, you could consider setting the objective for low latency. However, be aware of the implications of setting the objective to a value other than basic or auto: the high_throughput objective can increase latency times, and low_latency can decrease throughput. We find that the majority of our clients use basic or auto.
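
A quick way to get a feel for the small versus large I/O mix per database is to compare the cumulative request metrics described in the following sections, for example:

CellCLI> list metriccurrent where objecttype='IORM_DATABASE' and name like 'DB_IO_RQ_(SM|LG)'

A database whose DB_IO_RQ_SM count dominates leans toward an OLTP-style workload; a dominant DB_IO_RQ_LG count points to a DW-style workload.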

The cell server keeps track of the IORM objective that has been set, which, as a quick reminder, can be basic, auto, low_latency, balanced, or high_throughput. Internally, however, the cell server works with three optimization modes for doing I/O:

  • Mode 1:    low_latency
  • Mode 2:    balanced
  • Mode 3:    high_throughput

To see which mode the cell server is currently in, use the following command:

CellCLI> list metriccurrent iorm_mode
         IORM_MODE         enkcel04         2

When changes in I/O latency or bandwidth are influencing database processing time and the IORM objective is set to auto, the IORM_MODE metric can be examined for changes over time using the METRICHISTORY values:

CellCLI> list metrichistory where name = 'IORM_MODE' attributes name, metricvalue, collectiontime
         IORM_MODE         2         2015-05-06T07:59:16-05:00
         IORM_MODE         2         2015-05-06T08:00:16-05:00
         IORM_MODE         2         2015-05-06T08:01:16-05:00
         IORM_MODE         2         2015-05-06T08:02:16-05:00
         IORM_MODE         2         2015-05-06T08:03:16-05:00
         IORM_MODE         2         2015-05-06T08:04:16-05:00
         IORM_MODE         2         2015-05-06T08:05:16-05:00
..snip..

IORM-Related Metrics Overview

The metrics we are interested in for IORM monitoring have an objectType of IORM_DATABASE, IORM_CATEGORY, IORM_PLUGGABLE_DATABASE, or IORM_CONSUMER_GROUP. These metrics are further organized by the name attribute in the METRICCURRENT object. The name of the metric is a concatenation of abbreviations that indicate the type of I/O consumer group, the type of storage device, and a descriptive name. The elements of the name attribute appear as follows:

{consumer_type}_{device_type}_{metric}

Here, consumer_type represents the type of I/O consumer and is one of the following:

  • DB = Interdatabase IORM Plan
  • CT = Category IORM Plan
  • CG = Intradatabase IORM Plan
  • PDB = Pluggable database IORM Plan

And device_type is the type of storage that serviced the I/O request and is one of the following:

  • FC = Flash Cache
  • FD = Flash-based grid disk
  • '' = If neither of the above, then the metric represents I/O to physical disks

The last part of the attribute, {metric}, is the descriptive name of the metric. The metric name may be further qualified by SM or LG, indicating that it represents small I/O requests or large I/O requests. For example:

  • CG_FC_IO_RQ_LG: The total number of large I/O requests serviced from Flash Cache (FC), for DBRM consumer groups.
  • CG_FD_IO_RQ_LG: The total number of large I/O requests serviced from flash-based grid disks (FD), for DBRM consumer groups.
  • CG_IO_RQ_LG: The total number of large I/O requests serviced from physical disks (grid disks), for DBRM consumer groups.
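
To see one of these metrics on a cell, you can simply list it by name; this is just a sketch, and the consumer group names that show up in metricObjectName will be the ones defined in your own DBRM plans:

CellCLI> list metriccurrent CG_IO_RQ_LG attributes name, metricObjectName, metricValue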

Below are the metrics that interdatabase, category, intradatabase, and pluggable database resource plans have in common. The list is not exhaustive; there are many more metrics.

  • {CG, CT, DB, PDB}_IO_BY_SEC: Megabytes per second scheduled for this I/O consumer.
  • {CG, CT, DB, PDB}_IO_LOAD: The average I/O load for this I/O consumer. I/O load specifies the length of the disk queue. It is similar to the iostat avgqu-sz value, but it is weighted by disk type: on hard disks, a large I/O has three times the weight of a small I/O; on flash disks, large and small I/Os have the same weight.
  • {CG, CT, DB, PDB}_IO_RQ_{SM, LG}: The cumulative number of small or large I/O requests issued by this I/O consumer.
  • {CG, CT, DB, PDB}_IO_RQ_{SM, LG}_SEC: The number of small or large I/O requests per second issued by this I/O consumer.
  • {CG, CT, DB, PDB}_IO_WT_{SM, LG}: The cumulative time (in milliseconds) that I/O requests from this I/O consumer have spent waiting to be scheduled by IORM.
  • {CG, CT, DB, PDB}_IO_WT_{SM, LG}_RQ: Derived from {CG, CT, DB, PDB}_IO_WT_{SM, LG} above; the average time (in milliseconds) that I/O requests from this consumer spent waiting to be scheduled by IORM over the past minute. A large value indicates that the I/O workload of this consumer is exceeding its allocation. For lower-priority consumers, this may be the desired effect; for high-priority consumers, it may indicate that more I/O should be allocated to the consumer to meet the objectives of your organization.
  • {CG, CT, DB, PDB}_IO_UTIL_{SM, LG}: The percentage of total I/O resources consumed by this I/O consumer.
  • {CG, CT, DB, PDB}_IO_TM_{SM, LG}: The cumulative latency of reading small or large blocks for this I/O consumer.
  • {CG, CT, DB, PDB}_IO_TM_{SM, LG}_RQ: The average latency of reading small or large blocks for this I/O consumer.

All cumulative metrics above are reset to 0 whenever cellsrv is restarted, the IORM plan is enabled, or the IORM plan changes for that I/O consumer group. For example, if the IORM category plan is changed, the following cumulative metrics will be reset:

CT_IO_RQ_SM
CT_IO_RQ_LG
CT_IO_WT_SM
CT_IO_WT_LG
CT_IO_TM_SM
CT_IO_TM_LG

These IORM metrics are further categorized using the metricObjectName attribute. Interdatabase resource plan metrics are stored in detail records, where metricObjectName is set to the corresponding database name. In a similar fashion, metrics for IORM category plans are identified with a metricObjectName matching the category name. IORM consumer groups are identified by a concatenation of the database name and the name of the DBRM consumer group. And, finally, IORM information related to PDBs uses the <database name>.<PDB name> syntax in the metricObjectName field.
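
For example, the following command (a sketch; the output depends on the container databases and plans active on your system) lists one pluggable database metric, where each metricObjectName appears in the <database name>.<PDB name> form described above:

CellCLI> list metriccurrent PDB_IO_UTIL_LG attributes name, metricObjectName, metricValue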

Background Processes

As we discussed earlier, database background processes are automatically assigned to built-in (internal) IORM consumer groups according to priority. The following table shows these special IORM consumer groups, along with a description of what they are used for.

Table 7-7. Resource manager internal consumer groups

Consumer Group                   Description
_ASM_                            I/O related to Oracle ASM volume management
_ORACLE_BG_CATEGORY_             High-priority I/O requests issued from the database background processes
_ORACLE_MEDPRIBG_CATEGORY_       Medium-priority I/O requests issued from the database background processes
_ORACLE_LOWPRIBG_CATEGORY_       Low-priority I/O requests issued from the database background processes

It is important to realize that the priority of highly sensitive I/O requests, such as log writer writes, is enforced via an (internal) consumer group, which may be too late in the chain of events. Recall the evaluation order of IORM:

  1. Category/profile
  2. Interdatabase plan
  3. Pluggable database
  4. Intradatabase plan

Despite its name, the high-priority background consumer group is evaluated only in the fourth step. If resources have already been handed out in earlier steps, for example by an interdatabase plan, the database can be left with little allocation, so a group named “high priority” can still end up with very little priority overall when I/O pressure arises. What the name does guarantee is that this group gets the highest priority for its I/Os relative to I/Os issued from the same database under other consumer groups.

This changes when category plans are enabled. With category plans in place, internal categories corresponding to the internal consumer groups above are created, and background and ASM requests are prioritized directly at the category level. Within these internal categories, prioritization then follows the interdatabase plan priorities.

Summary

One of the biggest challenges for DBAs is effectively managing system resources to meet business objectives, especially when databases are consolidated. Over the years, Oracle has developed a rich set of features that make resource management a reality. Unfortunately, these features are rarely implemented due to their complexity. But make no mistake, as servers become more powerful and efficient, database consolidation is going to become increasingly common. This is especially true with the Exadata platform. Understanding how to use database and I/O resource management is going to become an increasingly important tool for ensuring your databases meet the demands of your business.

The best advice we can offer is to keep things simple. Attempting to make use of every bell and whistle in the Oracle Resource Manager can lead to confusion and undesirable results. If you do not have a specific need for multi-level resource plans, stick to the single-level approach, preferably the share-based method, which is single-level by design. Category plans are another rarely needed (and rarely used) feature. The performance profiles feature, new in Exadata 12.1.2.1.0 and database 12.1.0.2 BP4, looks promising as an easy approach to resource management. The majority of the situations you will face can be resolved by implementing a simple, single-level interdatabase resource plan. This means that the best advice on resource management we can give is to start small, keep it simple, and add features as you need them.
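
As a closing, minimal sketch of such a plan, the following commands set a share-based interdatabase plan and the auto objective on a cell; the database names PROD and REPORT and the share values are purely illustrative, and the plan has to be set on every storage cell (typically via dcli):

CellCLI> alter iormplan dbplan=((name=PROD, share=8), (name=REPORT, share=2))
CellCLI> alter iormplan objective = auto
CellCLI> list iormplan detail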
