Chapter 5. Virtualization

Virtualization is a term used liberally within computing. In its broadest sense, it describes any technology solution where a level of abstraction separates the consumers of resources from the compute resources themselves. Virtualization is a term frequently used in grid computing, and RAC is correctly identified as a virtualization technology in its own right. Within this context, RAC is the abstracting technology that enables a number of separate physical servers to appear to Oracle database resource consumers as a single database.

RAC One Node is the logical extension of applying virtualization terminology to RAC. In essence, RAC One Node is an installation of RAC in which only a single instance is active for servicing workloads at any particular time. The additional nodes provide failover functionality and the ability to upgrade and patch the initial instance. However, for installation and configuration, RAC One Node is an administration-managed RAC database installed on one node only and initialized with the raconeinit command. Therefore, the technical implementation of RAC One Node for virtualization is the same as covered for RAC in general throughout this book. RAC One Node is also supported within a standard virtualized environment, as we will detail in this chapter. For this reason, we are going to narrow our usage of the term virtualization to apply to platform or server virtualization and the dedicated Oracle virtualization software called Oracle VM. Oracle VM is one of the most exciting and evolving technologies for Oracle DBAs supporting clustered environments. Therefore, in this chapter we explore the full capabilities of Oracle VM. We do not exclude features, such as high availability, that can be used for functionality similar to RAC; we also cover where Oracle VM complements RAC environments.

Virtualization Definition and Benefits

In platform virtualization, the technology solution enables the hardware resources of one physical server or platform to be distributed between multiple operating system environments.

Figure 5-1 illustrates the definition of virtualization as we use the term in this chapter. The figure shows three independent operating systems deployed on a single server. From the perspective of the user, each operating system acts as if it is a complete, separate, physical machine, with the virtualization software, called the Virtual Machine Monitor (VMM) or hypervisor, managing the assignment of the physical resources between the guest operating systems.


Figure 5.1. Three virtual machines running on one physical server

With virtualization thus defined, the potential benefits of this technology complementing RAC in an Oracle grid computing environment should start to become clear. With RAC abstracting resources across physical servers and virtualization abstracting within them, we have added another dimension to the Oracle database grid. Virtualization delivers the opportunity to design a grid solution based on a much finer granularity of the hardware resources discussed in Chapter 4 than in non-virtualized RAC environments.

Deploying RAC in a virtualized environment brings a number of potential advantages:

  • Improved efficiency and manageability

  • Reduced hardware and software cost

  • Virtual RAC on a single server

However, you must also consider some additional factors when thinking about implementing virtualization:

  • VMM performance overhead

  • Running multiple levels of clustering software

  • Guest OS timing and scheduling

  • Live migration of RAC instances

  • Dynamic changes of CPU and memory resources

  • Monitoring performance statistics

Improved efficiency and manageability is the key benefit touted for virtualization solutions. Decoupling the operating system from the physical hardware significantly improves the ability to respond to changes in the levels of demand for resources by Oracle database services. For example, a common response to increased demand for resources has been to add an additional node to a cluster. However, adding that node and migrating services to it cannot be completed immediately. In a virtualized environment, on the other hand, hardware resources such as processor and memory can be added to and removed from the virtualized operating system environment and Oracle instance dynamically, enabling you to respond to demand without an interruption in service.

Virtualized environments also improve efficiency and manageability by significantly reducing operating system installation and configuration times, thereby reducing the time it takes to deploy new database installations. Advanced virtualization features such as live migration also enable the transfer of running operating system images between servers across the network. The flexibility to move operating systems between servers has considerable benefits. For example, you might move a RAC instance to a more powerful server to meet a temporary increase in demand, conduct hardware maintenance on existing servers without interrupting the service, or even perform rapid upgrades to new server platforms.

Virtualization can also reduce the overall cost of the solution for hardware, software, and data center requirements such as power, cooling, and physical space. By enabling resources to respond dynamically to requirements, a smaller pool of servers running at a high level of utilization can service the same demands as a larger, unvirtualized pool of servers dedicated to servicing distinct business needs. Deploying a smaller pool of servers reduces hardware and data center costs. It can also reduce software costs where licenses for fewer processors can be applied to virtualized images on servers partitioned to present fewer resources, enabling you to manage software costs more closely against the levels of utilization.

A significant benefit that virtualization brings to RAC is the ability to build a virtual RAC cluster on a single server. For example, you might have one physical server host two Linux guest operating systems where RAC is installed within the two guests, and the cluster shared storage is the local disk within the server. This approach significantly lowers the hardware requirements for building a RAC environment, which makes it ideal for learning, training, testing, and development of RAC. However, it is important to note that running more than one instance of a RAC cluster on the same virtualized physical server in a production environment is neither recommended nor supported by Oracle.

Advantages that appear compelling must also be balanced against some potential disadvantages.

First, the virtualization software inevitably requires resources for its own operations, and those resources are subsequently not available to the virtualized operating systems and, hence, to the installed Oracle database. Consequently, one cost of virtualization is a reduction in the level of performance available from a particular physical server compared to the same server installed natively with an operating system and the Oracle software.

Second, as with the introduction of any level of abstraction, virtualization brings more complexity to the solution, such as requiring additional testing and certification above and beyond Oracle RAC on Linux within a native operating system. For example, as detailed later in this chapter, virtualization offers an additional approach to clustering with the ability to redistribute running guest operating systems between separate physical servers. This functionality implements an independent approach to clusterware; therefore, with RAC running in a virtualized environment, it is important to manage multiple levels of clusterware at both the virtualization and the guest/RAC layers. At the guest layer, it is also important to consider the impact of operating system timing and scheduling on the RAC clusterware to prevent node evictions from occurring due to timing differences between the guest and the VMM software. Closely related is the impact upon clusterware of the ability to live migrate guest operating systems between nodes in the cluster, as well as the ability of the RAC clusterware to correctly detect and respond to this occurrence. Within the guest itself, virtualization also enables resources such as CPU and memory to be increased or decreased within a running operating system. Therefore, RAC must also be able to respond to these changes without impacting the level of service offered by the cluster. Oracle is also highly instrumented, with detailed performance information available to both tune and troubleshoot the Oracle environment. However, in a virtualized environment, there is another layer of software to consider underneath the operating system, and the performance of the Oracle database in question may be impacted by the resource demands of distinct operating systems hosted on the same physical server. With such a dynamic environment for moving loads within the grid, it may become more difficult to gather the consistent information required for Oracle performance diagnosis.
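
For example, you can verify the timekeeping source a paravirtualized Linux guest is using by checking the active kernel clock source. The following is a minimal sketch; the guest hostname london1 is hypothetical, and the exact path and value may vary with the kernel version running in your guest:

[root@london1 ˜]# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
xen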

Virtualization is a technology evolving at a rapid pace. The potential advantages and disadvantages are not fixed, and the balance is shifting over time as the disadvantages are resolved. The degree to which the balance lies for a particular RAC deployment at a certain time will determine whether virtualization is applicable to each individual case at the time you consider it.

Oracle VM

In introducing virtualization, we have so far discussed the technology in generic terms. To begin deploying it, we need to narrow the scope to the virtualization software supported in a RAC environment. This software is Oracle VM.

Introduced in 2007, Oracle VM is the only virtualization software on which RAC is supported and certified; therefore if you're looking to virtualize a RAC environment, then we recommend using the officially supported choice.

Oracle VM consists of two distinct software components: Oracle VM Server, an open source server virtualization product; and Oracle VM Manager, an Oracle closed source software product that is installed on a separate server and used for managing multiple Oracle VM Server installations. Both products are free to download from http://edelivery.oracle.com.

Similar to how Oracle Enterprise Linux is based on Red Hat Enterprise Linux, Oracle VM is based on the Xen hypervisor, a well-established open source virtualization solution on whose advisory board Oracle sits. Despite the fact that the underlying technology of Oracle VM is the same as that which underlies Xen, there are differences in the hardware architectures and guest operating systems supported with Oracle VM; namely, Oracle VM narrows the scope from the wider Xen support to the x86 and x86-64 architectures only and to the Linux and Windows guest operating systems only. It is also likely that the support requirements for RAC on Linux in an Oracle VM environment will be more narrowly defined than in the equivalent native environment, and we therefore recommend that you check the latest supported configuration information at the design stage before proceeding with installation.

The Oracle VM Manager is a Java application that runs in a standalone distribution of Oracle Containers for J2EE (OC4J). This distribution presents a browser-based management console and stores configuration information in an Oracle database repository. Oracle VM Manager is not an essential component for deploying Oracle VM Server; however, we do recommend using it to take advantage of the increased flexibility in managing high availability features across a pool of managed servers. Although it is possible to install the Oracle VM Manager within a guest operating system on a VM Server in a production environment, we recommend maintaining a separate physical server or server cluster protected by Oracle Clusterware to ensure that all Oracle VM Servers are manageable.

Oracle VM Server Architecture

Before reviewing the design considerations for implementing RAC in a virtualized environment, it is worthwhile gaining some background about the Oracle VM Server architecture. Doing so will prove particularly beneficial in understanding the performance and management characteristics in comparison to RAC installed directly in a native server environment.

The ideal starting point for an insight into how an x86 based system can be virtualized is the protection model of the processor architecture. This model is similar to other processor architectures, and it consists of four rings of increasing privilege levels, from Ring 0 with the most to Ring 3 with the least (see Figure 5-2).


Figure 5.2. Processor protection and the rings of increasing privilege

In a standard Linux installation, the operating system kernel executes instructions in Ring 0, kernel or supervisor mode, and the user applications run in Ring 3, user mode; Rings 1 and 2 are unused. Programs that run in Ring 3 are prevented by the processor from executing privileged instructions. An attempt to execute instructions that violate the privilege level results in the processor raising a general protection fault interrupt, which is handled by the operating system and usually results in the termination of the running process. An example of such a fault is a segmentation fault resulting from an invalid memory access.

For the Oracle VM Server to run guest operating systems, each guest kernel, which is designed to run in supervisor mode, needs to run underneath software with a higher level of control. Hence the role of the Xen hypervisor, which literally supervises the supervisor mode of the guest operating systems.

When the Oracle VM Server is booted, the Xen hypervisor named xen.gz or xen-64bit.gz in Oracle VM Server is loaded into system memory and run at the protection level of Ring 0 of the processor.

It is important to note that the Xen hypervisor itself is not a Linux operating system, unlike other open-source virtualization projects such as KVM that give the Linux kernel a hypervisor role. Instead, the Xen hypervisor that runs directly on the hardware is a lightweight operating-system environment based on the Nemesis microkernel. After booting, the hypervisor creates a privileged control domain termed Domain 0, or Dom0, and loads a modified Linux operating system into Domain 0's memory. Next, this Linux operating system boots and runs at the Ring 1 privilege level of the processor, with user applications running at Ring 3. In an x86-64 environment, both the guest kernel and applications run in Ring 3 with separate virtual memory page tables.

This chapter's introduction on the privilege levels of the processor architecture raises an immediate question: How can the guest operating system execute at a more restricted privilege level than the level that it was designed for? The answer: It does this through a virtualization technique called paravirtualization.

Paravirtualization

In paravirtualization, the guest operating system is modified so that attempts to execute privileged instructions instead initiate a synchronous trap in the hypervisor to carry out the privileged instruction. The reply returns asynchronously through the event channels in the hypervisor. These modified system calls, or hypercalls, essentially enable the hypervisor to mediate between the guest operating system and the underlying hardware. The hypervisor itself does not interact directly with the hardware. For example, the device drivers remain in the guest operating systems, and the hypervisor instead enables the communication between the guest and hardware to take place.

The Linux operating system in Dom0 is paravirtualized. However, it has the special privilege of interacting with the hypervisor to control resources and to create and manage further guest operating systems through a control interface. The Oracle VM Agent also runs in Dom0 to serve as the medium through which the Oracle VM Manager controls the Oracle VM Server. Similarly, when you log into the Oracle VM Server, you log into the Oracle Enterprise Linux operating system running in Dom0 and not into the Xen hypervisor itself. Dom0 is also by default a driver domain, although the creation of further privileged driver domains is permitted. The hypervisor presents a view of all of the physical hardware to the driver domain, which loads the native Linux device drivers to interact with the platform hardware, such as disk and network I/O. Dom0 also loads the back-end device drivers that enable the sharing of the physical devices between the additional guest operating systems that are created.

When guest operating system environments are to be created, unprivileged domains termed DomUs are created to contain them. When a DomU is created, the hypervisor presents a view of the platform according to the configuration information defined by the administrator and loads the paravirtualized operating system. It is important to note that, in a paravirtualized guest operating system, the applications that run in user space, such as the Oracle software, do not need to be paravirtualized. Instead, this software can be installed and run unmodified.

CPU resources are presented to guests as VCPUs (virtual CPUs). These VCPUs do not have to correspond on a one-to-one basis to the physical cores (or logical cores for hyper-threaded CPUs). While Oracle VM Server will permit the creation of more VCPUs than physical cores, we do not recommend doing so because it may negatively impact performance. How efficiently the workload is allocated between guests is the responsibility of the CPU scheduler. The default scheduler in Oracle VM Server is the credit-based scheduler, and it can be managed with the xm sched-credit command, as shown later in this chapter. The credit scheduler implements a work-conserving queuing scheme relative to the physical CPUs. This means that it does not allow any physical CPU resources to be idle when there is a workload to be executed on any of the VCPUs. By default, in a multi-processor environment, CPU cycles will be evenly distributed across all of the available processor resources, based on the credit scheme, with each domain receiving an equal share. However, each domain has a weight and a cap that can be modified to skew the load balancing across the system between domains. For example, Dom0 is required to service I/O requests from all domains, so it may be beneficial to give Dom0 a higher weighting. Alternatively, it is also possible to set processor affinity for VCPUs by pinning a VCPU to a physical core with the xm vcpu-pin command. Thus, dedicating a physical core to Dom0 can improve I/O throughput. Pinning of VCPUs is also a requirement for Oracle server partitioning, where Oracle VM is used to limit the number of processor cores to fewer than the number physically installed on the system for Oracle database licensing purposes.
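
As a preview of the syntax shown later in this chapter, the following minimal sketch illustrates how the credit scheduler weight for Dom0 might be raised and how a VCPU might be pinned to a physical core; the hostname and the weight value are hypothetical, and you should confirm the options with xm help on your Oracle VM Server release:

[root@londonvs1 ˜]# xm sched-credit -d Domain-0 -w 512
[root@londonvs1 ˜]# xm vcpu-pin Domain-0 0 0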

In addition to CPU resources, memory management in a virtualized Oracle database environment is vital to performance. In Oracle VM Server, all memory access is ultimately managed at the hypervisor layer. This means that, when considering SGA management, it is useful to understand some of the concepts of how this virtualized memory is managed.

In Chapter 4, we considered both physical and virtual memory in a Linux environment. The first distinction to be made between this and memory management in a virtualized environment is between physical and pseudo-physical memory. Physical memory is managed by the Xen hypervisor in 4KB pages on x86 and x86-64. However, this operates at a level beneath the guest operating system in some versions of Oracle VM, so it is not possible to take advantage of support for huge pages (see Chapter 7 for more information). For this reason, memory performance for large Oracle SGA sizes will be lower in a virtualized environment than would be the case for Linux installed on the native platform.

Oracle's requirement for addressing large amounts of memory makes memory allocation an important consideration. The Xen hypervisor keeps a global page table for all of the physical memory installed on the server, readable by all guest domains. This page table stores the mapping from the physical memory to the pseudo-physical memory view of the guest domains. The guest operating system requires its memory to be contiguous, even though the underlying pages maintained by the hypervisor page table will be non-contiguous. Therefore, a pseudo-physical page table managed by each guest maps the guest domain memory to what it regards as physical memory, which is in fact pseudo-physical memory. A shadow page table is then required to translate the pseudo-physical addresses into real physical memory addresses for each guest process.

The hypervisor provides the domain with the physical memory addresses during domain creation, against which the domain can build the pseudo-physical address mapping. Xen itself occupies the top 64MB of each address space. As discussed in Chapter 4, the guest operating system already maintains its own mapping of virtual memory to physical memory, and the TLB in the MMU of the processor caches a small number of these mappings to improve performance. Xen is mapped into each address space to prevent the TLB from being flushed as a result of the context switch when the hypervisor is entered.

With this implementation, guest operating systems can read from memory directly. However, writing to memory requires a hypercall to synchronize the guest operating system page table and the shadow page table. These updates can be batched into a single hypercall to minimize the performance impact of entering the hypervisor for every page fault. However, there can still be a performance overhead in Oracle environments, especially in the case of intensive logical I/O and large SGA sizes.

Memory allocation is not static; it can be increased or decreased dynamically for a domain, within a maximum memory limit set at domain-creation time, by a balloon memory management driver.
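
For example, the memory allocated to a running domain can be adjusted from Dom0 with the xm tools. This is a minimal sketch with a hypothetical domain name and size; the new value must remain below the maximum configured for the domain:

[root@londonvs1 ˜]# xm mem-set guest1 2048
[root@londonvs1 ˜]# xm list guest1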

In the context of Oracle, memory pages and shared memory structures such as the Oracle SGA cannot be shared between DomUs. However, from a technical standpoint, the Xen hypervisor permits access to the same memory between domains using the grant table mechanism. This presents interesting possibilities for the evolution of future RAC technology beyond 11g in the area of communication between instances in an Oracle VM virtualized environment.

In the case of I/O, such as disk and network communication, shared memory between domains is already used. However, it is used only for communication between Dom0 and the DomUs. This is because none of the devices on the PCI bus are presented to the DomUs; they are presented to Dom0 or to privileged driver domains only. Instead, the DomU loads paravirtualized front-end device drivers to interoperate with the back-end device drivers in the driver domain. Communication between front-end and back-end device drivers takes place through the shared memory rings in the hypervisor layer. The equivalent of a hardware interrupt from the hypervisor to the guest takes place through event channels. Figure 5-3 illustrates the paravirtualized Oracle VM Server environment with both Dom0 and a single DomU guest operating system.


Figure 5.3. A Paravirtualized Oracle VM environment

If an I/O device is not already in use at the time a DomU domain is created, then it can be presented to that DomU instead; the DomU can then load the native device drivers and work with the device as if in a standard Linux environment. However, the device can only be presented to one domain at a time, and doing so will prevent further guests from using it.

Full Virtualization

Paravirtualization delivers an efficient method by which to virtualize an environment. However, a clear disadvantage to paravirtualization is that the guest operating system must be modified to be virtualization-aware. In terms of Linux, it is now necessary to work with a different version of the operating system compared to the native environment. However, the modifications are minimal. Operating systems that are not open source cannot be paravirtualized, which means they cannot run under the Xen hypervisor in this way.

Full virtualization, or hardware-assisted virtualization, is a method by which unmodified, non-virtualization-aware operating systems can be run in an Oracle VM Server environment. With a Linux guest operating system, this means being able to run exactly the same version of the Linux kernel that would be run on the native platform. Doing so requires a processor with virtualization technology features. Intel calls this technology Intel VT; for Oracle VM Server on the x86 and x86-64 platform, the relevant implementation is called VT-x. The technology on AMD processors is called AMD-V; although it is implemented in a different way, it is designed to achieve an equivalent result.

Most modern processors include hardware-assist virtualization features. If you know the processor number, then you can determine the processor features from the manufacturer's information. For example, for Intel processors, this information is detailed at http://www.intel.com/products/processor_number/eng/index.htm. If you don't have the processor details, you can run the following command at the Linux command line to determine whether hardware virtualization is supported by your processor:

egrep '(vmx|svm)' /proc/cpuinfo

For example, running this command might return the following:

dev1:˜ # egrep '(vmx|svm)' /proc/cpuinfo
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm
constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm

If the command returns processor information, then hardware virtualization is available. However, it may be necessary to enable virtualization at the BIOS level of the platform and to confirm its activation in the Oracle VM Server. In Oracle VM Server, multiple paravirtualized and fully virtualized guests can run side-by-side on a hardware virtualized platform, as can 32-bit and 64-bit guests on a 64-bit hardware platform.
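
One way to confirm activation within Oracle VM Server is to check the capabilities reported by the hypervisor; hvm entries in the output indicate that hardware virtual machine support is enabled. The hostname below is hypothetical, and the output shown is illustrative:

[root@londonvs1 ˜]# xm info | grep xen_caps
xen_caps   : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64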

Once enabled, virtualization technology enables the guest operating system to run unmodified at its intended privilege level by defining two new operations: VMX root operation and VMX non-root operation. The hypervisor runs in VMX root operation, whereas the guest is presented with an environment where it runs in Ring 0, but is restricted in its privilege by running in VMX non-root operation. Additional processor features also exist to improve performance for hardware-assisted virtualized environments. For example, some processors support virtual-processor identifiers, or VPIDs. VPIDs enable the virtual processor to be identified when running in VMX non-root mode, which can be beneficial for TLB tagging. As discussed previously in this chapter, a number of techniques are used to increase performance by limiting the number of entries into the hypervisor layer, one reason being to prevent a TLB flush from occurring. VPIDs can reduce this latency by tagging or associating an address translation entry with the processor ID of a particular guest. This means that the valid address translations of each guest can be identified, so that when a hypervisor entry does occur, all TLB entries do not have to be flushed.

Whereas VPIDs can reduce the impact of entering the hypervisor, another feature—called Extended Page Tables (EPTs) in Intel terminology and Nested Page Tables (NPTs) by AMD—can improve memory performance by reducing the need to enter the hypervisor layer for page faults. This technology enables the direct translation of the pseudo-physical addresses into real physical memory addresses. Therefore, page faults can be managed directly by the guest operating system, without entering the hypervisor layer. This improves performance and reduces the memory required to maintain shadow page tables for each guest process. EPTs are supported from Xen version 3.4, which means they are also supported with Oracle VM version 2.2.

A fully virtualized guest interacts with devices in a different way from a paravirtualized guest. As we have seen, the paravirtualized guest loads front-end device drivers to communicate with the back-end device drivers managed by the driver domain. In a fully virtualized environment, the device drivers in the DomU instead interact with emulated devices managed by Dom0. The emulated device layer is created and managed by Dom0 with open-source software adapted from the QEMU project, which you can find at http://wiki.qemu.org/Index.html. Although full virtualization provides good performance on processor-intensive operations, device emulation means that I/O operations require a comparatively higher level of CPU resources and result in a comparatively lower level of I/O performance than the same load on an equivalent paravirtualized guest. I/O performance is particularly vulnerable when overall CPU utilization across the system is high.

In fully virtualized environments, it is necessary to employ additional virtualization hardware-assist technology in the platform, termed the IOMMU. It is a platform feature, not a CPU feature, so its availability on a particular server is determined by the chipset, as opposed to the processor itself. The IOMMU enables the system to support interrupts and DMA (the transfer of data from a device to memory, such as an Oracle physical I/O) across multiple protection domains. Consequently, the address translation enables a DMA operation to transfer data directly into the guest operating system memory, as opposed to relying on emulation-based I/O. The IOMMU implementation on Intel platforms is called Virtualization Technology for Directed I/O (VT-d); a similar implementation on AMD systems is termed AMD I/O Virtualization Technology.

There are clearly advantages and disadvantages for both paravirtualization and full virtualization techniques. It is also possible to configure hybrid implementations with, for example, fully virtualized guests deploying paravirtualized device drivers. We anticipate a merging of the technologies over time, to a point where Oracle VM virtualization will not distinguish between the two techniques, but instead will employ features from both to produce the optimal virtualized environment.

Oracle VM Design

In this chapter's introduction, we discussed the concepts of RAC as a virtualization technology. For the design of an Oracle VM configuration, it is also necessary to note that virtualization with Oracle VM is a clustering technology in its own right. Therefore, grid computing is terminology that can be applied equally to Oracle VM. To design an Oracle VM infrastructure to support RAC, it is necessary to focus on instances where clustering techniques can bring the greatest benefit to support the entire virtualized Oracle database configuration as a whole. From a design perspective, it is therefore necessary to determine in what areas the technologies are complementary.

If you are familiar with RAC design, it is also useful to view the top-level Oracle VM configuration as a cluster configuration. The Oracle VM Servers serve as the nodes in the cluster, managed by the Oracle VM Manager software installed on a dedicated server and acting through the Oracle VM Agent running in Dom0 on each server. The entire cluster is termed the Server Pool, and the role of a particular server in the cluster is determined by its VM Agent. The VM Agent role cannot be set on an individual VM Server; all roles must be configured from the VM Manager.

There are three roles under which a VM Server can be configured through its Agent: as a Server Pool Master, as a Utility Server, and as a Virtual Machine Server. These roles are not mutually exclusive. For example, one server can provide all three functions in a deployment where the highest levels of availability are not a requirement.

Applying a concept familiar from a number of clustering implementations, a Server Pool has one Server Pool Master. Multiple Server Pool Masters are not permitted; however, the Server Pool Master can serve other roles. The Agent on the Server Pool Master has a holistic view across all servers in the pool; all Agents within the pool can communicate with the Master Agent, and the Master Agent communicates with the Oracle VM Manager. The Master Agent manages clustering activities throughout the Server Pool, such as load balancing and high availability. Despite the fact that there is only one Server Pool Master, the Server Pool has been designed so that there is no single point of failure across the entire cluster. Consequently, if the Management Server fails, then the Server Pool Master continues to operate within the configuration already implemented. Similarly, if the Server Pool Master or its Master Agent fails, then the Server Pool continues to operate as configured.

The Utility Server handles I/O intensive file copy and move operations throughout the Server Pool, such as creation, cloning, relocation, and removal of virtual machines. There can be more than one Utility Server in a Server Pool; and in this case, tasks are assigned with a priority order determined by the server with the most CPU resources available. For a high availability configuration, we recommend deploying a dedicated Utility Server. However, where a dedicated Utility Server is not available or desired, we recommend the Server Pool Master also serve as the Utility Server.

The Virtual Machine Servers in the Server Pool provide the functionality detailed in the section on VM Server Architecture covered earlier in this chapter. The Virtual Machine Server Agent conveys configuration and management information to the Oracle VM Server software from the Oracle VM Management Server. It also relays performance and availability information back to it. If the Virtual Machine Server's Agent is stopped or fails, then the Oracle VM Server and its guest operating systems and hosted software continue to operate uninterrupted. The Oracle VM Server can continue to be managed directly with the standard Xen hypervisor commands. However, it cannot be controlled by the Oracle VM Management Server. If the Virtual Machine Server Agent has failed as a result of the failure of the entire Oracle VM Server, however, then this is a detectable event and automatic high availability features can take place to restart the failed guest operating systems on alternative Virtual Machine Servers in the Server Pool.

In some ways, beginning the design process of using an Oracle VM Server Pool as a cluster is similar to getting started with RAC. In both cases, the infrastructure and technologies to share data between the nodes in the cluster act as a vital foundation upon which the reliability of the entire configuration depends. The lowest level of the Oracle VM Server Pool is the storage layer. Fortunately, all of the storage features taken into consideration in Chapter 4 for RAC storage are applicable here. The Oracle VM storage can also be considered to be at an equal level in the hierarchy to the RAC storage. Therefore, it is an entirely acceptable approach to utilize exactly the same SAN or NAS infrastructure for the Oracle VM storage layer that is deployed for the RAC database. Alternatively, as discussed in Chapter 4, a storage pool approach can be deployed for increased flexibility and choice in the storage type and protocols for each particular layer. Whichever approach is taken, it is important to focus on the central role that the storage plays in the cluster and to ensure that no single point of failure exists within the storage layer itself. In a storage pool approach for RAC and the Oracle database (see Chapter 9 for details), the ASM software layer enables a high availability configuration between multiple storage servers by configuring Failure Groups. Within Oracle VM, built on the Linux software layer, the functionally equivalent software for volume management is LVM (the Logical Volume Manager). However, LVM does not support mirroring between separate physical storage arrays, which leaves the cluster vulnerable to the failure of a single storage server. It is therefore important to ensure that resilience is implemented at the storage layer below the Oracle VM software with dedicated storage management. It is also important that fully redundant networks are deployed, with a dedicated network (or dedicated bandwidth) to the storage layer.

Built on a fully redundant storage configuration, the storage layer for Oracle VM, as with RAC, must be made available to all servers in the cluster at the same time. However, unlike the active/active cluster approach implemented by RAC, Oracle VM is based on an active/passive approach. The cluster will access the same storage and the same file systems at the same time; however, the same files are not shared simultaneously. The key concept here is the ability to rapidly bring up or migrate guest operating systems throughout the nodes in the cluster at will. To facilitate this approach, it is necessary to implement further clustering technologies to ensure that a guest operating system is brought up and run on one, and only one, host at any one time. One of these enabling technologies, the Oracle Cluster File System (OCFS2), is already compatible and supported with RAC environments. Because OCFS2 has been incorporated into the Linux kernel, it is the appropriate clustering software layer to apply on top of the storage infrastructure.

Note

Although the focus is on OCFS2 for Oracle VM design, it is also possible to build the shared storage for an Oracle VM Server Pool on a suitable highly available NFS-based infrastructure.

In an Oracle VM environment, OCFS2 implements the necessary clustering capabilities to protect the integrity of the guest operating system environments through its own cluster service, called O2CB. This service implements a number of key components, including a node manager, a heartbeat service, and a distributed lock manager. One OCFS2 file system is mounted between all of the Oracle VM Servers in a Server Pool, and the guest operating system installations reside on that file system. Once configured, the O2CB service validates and monitors the health of the nodes in the cluster. O2CB does this by maintaining a quorum between the nodes in the cluster; the presence of an individual node in the cluster is confirmed by a disk-based heartbeat, whereby each node writes a timestamp to a shared system heartbeat file and reads the data of the other nodes in the cluster to ensure that their timestamps are incrementing within a configurable threshold. O2CB also maintains a network connection and sends TCP keepalive packets between nodes to verify the status of the connection. OCFS2 should therefore be configured on a private network interconnect and not across the public network. It is good design practice to configure this private network interconnect on the same network as the Oracle RAC private interconnect.
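
Once configured, the state of the O2CB cluster service can be checked from Dom0 on any Oracle VM Server with the standard OCFS2 tools. The following is a minimal sketch; the hostname is hypothetical, and the output shown is illustrative of a healthy cluster:

[root@londonvs1 ˜]# service o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Checking O2CB heartbeat: Active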

An algorithm determines the nodes in the cluster that are members of the quorum based on their heartbeat and network status. If a node no longer has a quorum, it will implement I/O fencing and exclude itself from the cluster by rebooting. This prevents the node from being able to access the same files as the systems in quorum in an unmanaged way. During normal operations, access to the individual files on the OCFS2 file system is synchronized by the Distributed Lock Manager (DLM) to prevent unmanaged, simultaneous access resulting in file corruption.

Having OCFS2 protect the guest operating system environments on the Oracle VM Servers, while enabling these environments to be brought up on any server in the configured Server Pool, allows a number of high availability features to be implemented. First, automatic restarts can be configured to implement a traditional active/passive clustering approach at the guest operating system level. This means that if an individual guest fails, or an entire server and multiple guests fail, they will be restarted on another server in the pool. Second, load balancing can be implemented by enabling a guest operating system environment to be started on the node with the lowest utilization. Finally, live migration enables guest operating system environments to be transferred between Oracle VM Servers in the Server Pool while the guest operating system remains running, available, and providing a service.
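
To illustrate the last of these features, a running guest can be live migrated from the command line with the xm tools, as in this minimal sketch with hypothetical guest and server names; within a managed Server Pool this operation is normally initiated from Oracle VM Manager and, as noted later in this chapter, it should not be performed while a RAC instance is running in the guest:

[root@londonvs1 ˜]# xm migrate --live guest1 londonvs2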

The ability to implement clustered virtualized environments adds another dimension to highly available Oracle configurations. Oracle VM includes all the technologies to support the restarting of a single instance of an Oracle database, with minimal downtime in case of server failure. Multiple instances can also be run on separate Oracle VM Servers in the Server Pool with, for example, two Oracle VM Servers providing both active and passive or backup instances to its partner in the cluster. In this scenario, failure of one of the servers means that both instances for the databases in question will be running on the surviving node.

The potential is also increased with a RAC configuration overlaying the Oracle VM Server environment. RAC is able to complement Oracle VM Server by providing Oracle database services across guest operating system environments simultaneously. Oracle VM therefore enables the granularity of the Oracle Service to be smaller than the size of an individual node, while RAC enables the granularity to be larger than the size of an individual node. Oracle VM also simultaneously breaks the link between any physical individual server and a particular Oracle RAC instance.

It is important to note that, in a virtualized RAC environment, the Oracle Clusterware software is wholly contained within the guest operating system, which itself is running under the Oracle VM Server. The VM Server configuration should therefore always be viewed as providing the foundation on which the Clusterware depends. Consequently, for both performance and resilience, two instances of the same cluster should not be run on the same physical server except in a non-mission-critical environment. Also, Oracle VM live migration should not take place while an Oracle RAC instance is running. This helps prevent the Oracle Clusterware from detecting the migration as a node failure and the node in question from being ejected from the cluster. Figure 5-4 illustrates the complete Oracle VM design, including all of its infrastructure components.


Figure 5.4. Oracle VM Design

Oracle VM Server Installation

Before installing Oracle VM Server, you should check the hardware requirements against the release notes of the Oracle VM Server software. In general, you should allow for a more powerful processor and memory configuration than for the same server running in a native Linux environment. In particular, multi-core processors enable multiple cores to be assigned to the guest operating system environments; therefore, eight cores or more are recommended for a flexible configuration. Additionally, memory is a crucial resource, so at least 16GB of memory, or 2GB per core, will permit a configurable virtualization environment.

If you decide that you wish to have hardware virtualization support, then it is necessary to enable this functionality within the processor options at the BIOS level of the server. Enabling hardware virtualization support requires a hard reset of the server. It is possible to enable and disable hardware virtualization support after Oracle VM Server has been installed; and even when support is enabled, the hardware-assist functionality will not be used if a paravirtualized guest is installed.

You can download the Oracle VM Server package from the Oracle electronic delivery site at http://edelivery.oracle.com. Once you download the package, unzip it to produce the Oracle VM Server CD image file such as OracleVM-Server-2.2.0.iso. This file can then be burned to a CD-ROM to make a bootable installation CD-ROM. Similarly, the installation files can be copied to a USB flash memory drive that is made bootable with software such as syslinux. You can boot from the Oracle VM Server CD-ROM to display the installation screen (see Figure 5-5).


Figure 5.5. The Oracle VM Installation Welcome screen

The installation is based upon a standard Linux text-based installation procedure. Similar screens are displayed for a Linux install when the argument text is given at the boot prompt. The installation proceeds through the standard media test, language, and keyboard selection screens. If Oracle VM Server has previously been installed, the software presents the option to upgrade the system. At the Partitioning Type screen, choose to Remove all partitions on selected drives and create the default layout. Next, accept the Warning dialog on the removal of all partitions, and then choose to Review and modify partitioning layout. By default, Oracle VM Server will partition the server disk with /swap, /boot, /, and /OVS partitions, with all unallocated storage assigned to /OVS (see Figure 5-6).


Figure 5.6. Default partitioning

With a standalone configuration, you may remain with this partitioning scheme. However, with a clustered approach, /OVS will be assigned to shared storage. Therefore, the initial local /OVS partition may be unmounted or reallocated. For this reason, we recommend reversing the storage allocations of the root partition (/) and /OVS; that is, all unallocated storage should be allocated to the root partition to create a more flexible configuration once high availability has been enabled. The /OVS partition should always be formatted with the ocfs2 file system type, as shown in Figure 5-7.


Figure 5.7. Customized partitioning

Similarly, there is the option to manually configure LVM during the partitioning stage. If hardware-level RAID is not available, then LVM may be considered for the local partitions. However, LVM should not be used for the /OVS partition. In a shared storage configuration, LVM does not support mirroring between separate physical storage arrays. Therefore, this must be implemented by the underlying storage hardware or software. At the Boot Loader Configuration screen, you install the boot loader at the Master Boot Record (MBR) location. The screen shown in Figure 5-8 determines the network configuration, with the Oracle VM Management Interface screen enabling the selection of the network interface to use for management. This screen permits the configuration of only a single interface. Therefore, you should select the interface on the public network, usually eth0. The private network interface will be configured after installation is complete. Configure the public interface with your IP address and netmask, as illustrated in Figure 5-8.


Figure 5.8. Network Configuration

On the following screen, Miscellaneous Network Settings, configure your gateway and DNS settings. Next, on the Hostname Configuration screen, enter your hostname manually, as shown in Figure 5-9.


Figure 5.9. Hostname configuration

For the Time Zone Selection screen, enter the time zone of your server location and ensure that the System clock uses UTC option remains selected. The following screens require the entry of passwords to be used for the Oracle VM Agent and the root user, respectively. You will need the Oracle VM Agent password when registering the Oracle VM Server into a Server Pool with Oracle VM Manager. Once the passwords have been entered, press OK on the Installation to begin screen to begin installing Oracle VM Server. Unlike a typical Linux install, there is no option available to customize the packages to be installed, and the installation proceeds with the default package environment, as shown in Figure 5-10.


Figure 5.10. Package installation

On completion of the package install, the Complete screen is shown, prompting you to remove the installation media and to reboot the server. Press Reboot and, on the first boot only, it is necessary to accept the Oracle VM Server license agreement. Installation is now complete. You can repeat the installation procedure for the additional Oracle VM Servers in the server pool. On booting, the Oracle VM Server displays the console login prompt (see Figure 5-11).


Figure 5.11. The Console Login prompt

As discussed previously, there is no graphical environment installed by default in Dom0, where this prompt is given, and no graphical packages should be installed or configured to run in this domain. You can use the xm tools to manage the Oracle VM Server from the command prompt; these command-line tools are detailed later in the chapter. If you want a supported graphical environment to manage the Oracle VM Server, it is necessary to install and configure Oracle VM Manager.
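
For example, immediately after installation, listing the running domains from this console shows only Dom0 itself. The hostname and the memory, VCPU, and time values shown here are illustrative:

[root@londonvs1 ˜]# xm list
Name                        ID   Mem VCPUs      State   Time(s)
Domain-0                     0   564     8     r-----    253.9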

Oracle VM Manager Installation

As discussed previously in this chapter, Oracle VM Manager must be installed on a separate system from any Oracle VM Server where a supported version of Oracle Enterprise Linux has previously been installed. Oracle VM Manager is available from the Oracle edelivery site at http://edelivery.oracle.com; the unzipped package produces a CD image file, such as OracleVM-Manager-2.2.0.iso. This can be burned to a CD-ROM; alternatively, because the installation is not installing an operating system, the image can be mounted and run directly at the command line, as follows:

[root@londonmgr1 ˜]# mount -o loop OracleVM-Manager-2.2.0.iso /mnt
[root@londonmgr1 ˜]# cd /mnt
[root@londonmgr1 mnt]# ls
EULA  LICENSE  readme.txt  runInstaller.sh  scripts  source  TRANS.TBL
[root@londonmgr1 mnt]# sh ./runInstaller.sh
Welcome to Oracle VM Manager 2.2
Please enter the choice: [1|2|3]
1. Install Oracle VM Manager
2. Uninstall Oracle VM Manager
3. Upgrade Oracle VM Manager

Select the first choice, and the installation proceeds by installing a number of RPM packages. The first stage is the installation of a repository database. The option is given to install Oracle Express Edition on the Oracle VM Manager Server or to install the repository into an existing database. We recommend using the Oracle Express Edition installation or creating a new database to preserve the repository in a dedicated Oracle VM Management environment. The installer prompts for responses such as the port that will be used, the database listener, passwords for database accounts, and whether to start the database on boot. The default database schema is named OVS.

The installation continues with RPM installs of the ovs-manager and oc4j packages. Next, it prompts for the oc4jadmin password, the keystore password for the Web Service, and whether to use HTTP or HTTPS for Oracle VM Manager. The Oracle VM Manager application is installed into the OC4J container, and the installation prompts for the password for the default administration account named admin. It is particularly important to record this username and password combination because it is used as the main login account to the Oracle VM Manager application. The installation continues by configuring the SMTP mail server to be used by the Oracle VM Manager. It is not essential for Oracle VM Manager that the SMTP server is successfully configured; however, some functionality does rely on this feature, in particular the password reminder for the Oracle VM Manager users. If this feature is not configured, no reminder can be sent. Therefore, it is beneficial to know that there is a password reset script available on the My Oracle Support site. This script updates the password field with an encrypted password in the OVS_USER table of the OVS schema for the corresponding account name. For this reason, SMTP configuration is not absolutely essential to enabling Oracle VM Manager functionality, assuming e-mail alerts and reminders are not required. After SMTP configuration, the Oracle VM Manager installation is complete and reports the chosen configuration, as in this example:

Installation of Oracle VM Manager completed successfully.

To access the Oracle VM Manager 2.2 home page go to:
  http://172.17.1.81:8888/OVS

To access the Oracle VM Manager web services WSDL page go to:
  http://172.17.1.81:8888/OVSWS/LifecycleService.wsdl
  http://172.17.1.81:8888/OVSWS/ResourceService.wsdl
  http://172.17.1.81:8888/OVSWS/PluginService.wsdl
  http://172.17.1.81:8888/OVSWS/ServerPoolService.wsdl
  http://172.17.1.81:8888/OVSWS/VirtualMachineService.wsdl
  http://172.17.1.81:8888/OVSWS/AdminService.wsdl

To access the Oracle VM Manager help page go to:
  http://172.17.1.81:8888/help/help

The Oracle VM Manager application can be accessed by logging in through a web browser, as shown in Figure 5-12.


Figure 5.12. The Oracle VM Manager Login screen

The status of the Oracle VM Manager application can be reviewed by checking the status of the OC4J service:

[root@londonmgr1 ˜]# service oc4j status
OC4J is running.

Stopping the OC4J service requires entering the oc4jadmin password submitted during the installation:

[root@londonmgr1 ˜]# service oc4j stop
Stopping OC4J ...
Please enter the password of oc4jadmin:

Done.

Restarting the OC4J service also restarts the Oracle VM Manager application:

[root@londonmgr1 ˜]# service oc4j start
Starting OC4J ... Done.

The status of the underlying repository database, if installed in the default Express Edition database, can be checked with the command service oracle-xe status. This command reports the status of the database listener. Similar to the OC4J service, restarting the oracle-xe service restarts the repository database. It is also possible to log into the repository by setting the ORACLE_SID value to XE and the ORACLE_HOME value to /usr/lib/oracle/xe/app/oracle/product/10.2.0/server, with the default repository owner named OVS and the password given during the installation process.
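
For example, a connection to the repository schema can be made as follows. This is a minimal sketch; the ORACLE_HOME path corresponds to the default Express Edition installation, and SQL*Plus prompts for the OVS password supplied during installation:

[root@londonmgr1 ˜]# export ORACLE_SID=XE
[root@londonmgr1 ˜]# export ORACLE_HOME=/usr/lib/oracle/xe/app/oracle/product/10.2.0/server
[root@londonmgr1 ˜]# $ORACLE_HOME/bin/sqlplus OVS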

Oracle supports optional additional configuration to protect the Oracle VM Manager installation with Oracle Clusterware. If the server that supports the Oracle VM Manager application fails, this configuration will fail over the application and database to an additional server. However, this configuration is not mandatory because, if required, the Oracle VM Servers can be controlled directly with the xm command line commands (you will learn more about this later in this chapter).

Oracle VM CLI Installation and Configuration

In addition to the graphical environment provided by Oracle VM Manager, there is also a command line interface (CLI) available to interact with the Oracle VM Manager. The Oracle VM Manager installation and configuration must have been completed before you can use the CLI. However, the CLI lets you manage the Oracle VM environment without requiring a graphical interface. It also lets you build scripts to accomplish more complex management tasks. The CLI may be installed on any Oracle Enterprise Linux server that can communicate across the network with the VM Manager.

To configure the CLI, it is necessary to download and install both the CLI and Python Web Services RPM packages. Customers of the Unbreakable Linux Network can acquire the packages, or the packages can be installed directly from the public Yum repository or downloaded and installed manually. For example, the ovmcli-2.2-9.el5.noarch.rpm package is available here:

http://public-yum.oracle.com/repo/EnterpriseLinux/EL5/oracle_addons/x86_64/

Similarly, the python-ZSI-2.1-a1.el5.noarch.rpm package is available here:

http://public-yum.oracle.com/repo/EnterpriseLinux/EL5/addons/x86_64/

These packages can be installed with the rpm command:

[root@london5 ˜]# rpm -ivh \
> python-ZSI-2.1-a1.el5.noarch.rpm ovmcli-2.2-9.el5.noarch.rpm
Preparing...                ########################################### [100%]
   1:python-ZSI             ########################################### [ 50%]
   2:ovmcli                 ########################################### [100%]

After installation, you complete the configuration process with the ovm config command, which, at a minimum, specifies the Oracle VM Manager host and port previously configured in this section:

[root@london5 ˜]# ovm config
This is a wizard for configuring the Oracle VM Manager Command Line Interface.
CTRL-C to exit.

Oracle VM Manager hostname: londonmgr1
Oracle VM Manager port number: 8888
Deploy path (leave blank for default):
Location of vncviewer (leave blank to skip):
Enable HTTPS support? (Y/n): n

Configuration complete.
Please rerun the Oracle VM Manager Command Line Interface.

The CLI can now be used to complete Oracle VM Management tasks, without requiring a graphical environment.
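
For example, assuming the admin account created during the Oracle VM Manager installation, commands of the following general form list the managed servers and virtual machines. The group and command names shown here are based on the 2.2 CLI and should be verified against the help output of the version you have installed:

ovm -u admin -p <password> svr ls
ovm -u admin -p <password> vm ls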

Configuring Oracle VM

After installing the Oracle VM Server and Oracle VM Manager software, it is necessary to configure the environment for RAC. In particular, this means configuring high availability virtualization features across the cluster so they complement the high availability features in RAC itself. This configuration focuses on customizing the private interconnect network for optimal performance with RAC guests, configuring the server pool, and finally, configuring the Oracle Cluster File System software and enabling high availability.

Network Configuration

As detailed previously in this chapter, DomU network devices communicate through the network configured in Dom0. The configuration used is determined by the settings in /etc/xen/xend-config.sxp, which call configuration scripts in /etc/xen/scripts. Scripts are available to configure bridged, routed, or NATed networks. Bridging is the most common form of networking implemented in Xen, as well as the default configuration for Oracle VM Server. In the default configuration, the operating system that boots in Dom0 configures the network with the familiar device names, such as eth0 and eth1. Xend then calls the wrapper script /etc/xen/scripts/network-bridges, which subsequently calls /etc/xen/scripts/network-bridge for each bridge to be configured.
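
For reference, the default entry in /etc/xen/xend-config.sxp that selects this wrapper script is shown here; it is the same line that is commented out later in this section when the bridges are configured manually:

(network-script network-bridges)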

After installation, only the first device, eth0, and its corresponding bridge will have been set up. The other bridges will be made available only after the underlying device has been manually configured. Through Oracle VM 2.1.5, the following occurs after the bridges are available: the standard network devices are brought down, and their configuration is copied to virtual devices, such as veth0 and veth1. The physical devices are then renamed; for example, eth0 becomes peth0. Similarly, the virtual devices are renamed, such as from veth0 to eth0. This process provides the interface names used by Dom0, as opposed to those configured for the guests. The network bridges are created with the physical interfaces connecting the bridge externally, and the virtual network interfaces (vifs) are created on the bridge. For example, the Dom0 virtual interfaces, such as veth0 (now renamed to eth0), are connected to their corresponding vif on the bridge by the script /etc/xen/scripts/vif-bridge. Vifs are named according to their domain and device order, and they correspond to the veth devices given to the paravirtualized network devices in the DomU. Therefore, vif0.0 connects to eth0 in Dom0, while vif1.1 would connect to eth1 in the first DomU, with all connections being made through the bridge. By default, eight pairs of veth and vif connections are created, and each physical network device has a bridge, a peth device, a veth device renamed to eth, and a vif in Dom0. Up to Oracle VM Server 2.1.5, veth and vif devices communicate through network loopback devices, and the number available can be increased by passing the netloop.nloopbacks argument to the kernel at boot time. For hardware virtualized guests, veth devices don't connect to the vifs; instead, a tap device (such as tap0) is created by the script /etc/xen/qemu-ifup, which provides an emulated network device.

The actions required for configuring the network in a DomU guest are discussed later in this chapter, in the section that explains how to install and configure a guest. This configuration determines the bridge to which a particular guest interface connects; a new vif is created on the bridge, and the vif-bridge script connects the guest interface to it. When configured, the network interfaces in the guest are identified on the external network by MAC addresses assigned to Xen in the range 00:16:3E:xx:xx:xx. Note that the guests are not identified with the MAC addresses used by the physical devices. Figure 5-13 illustrates a simplified paravirtualized network configuration, showing the network connections used with only two guests, each configured with a public and a private network. In this illustration, the guests do not necessarily need to be members of the same cluster; while that would be a functional configuration, it is not recommended or supported for performance reasons. Instead, the guests are members of different clusters, as would be typical of a RAC development environment.

A paravirtualized network configuration

Figure 5.13. A paravirtualized network configuration

Although the bridge configuration provided by the /etc/xen/scripts/network-bridge script produces an operational environment in a clustered configuration, a practical alternative is to enable the bridge configuration at the Dom0 operating system level, as opposed to using the default script. Doing so results in a more reliable configuration, and it is compatible with advanced configurations such as network bonding, which otherwise have had issues reported with the default configuration. In Oracle VM 2.2 and later, the default network script has been modified to produce the same behavior described here. As described previously, in the default Xen networking the default network device in Dom0 (such as eth0) was renamed to peth0, and a new eth0 device renamed from veth0 was created to communicate on the bridge xenbr0 through vif0.0. In Oracle VM 2.2, the IP address is bound to the bridge itself, and the physical devices retain their initial names when connecting to the bridge, meaning that the interface vif0.0 is no longer created.

After installation, only the first network device, eth0, will have been configured. For an environment to support RAC, it is also necessary to configure an additional interface. In the Oracle VM 2.2 or later example that follows, the output of the command ifconfig -a shows that, although bridges have been configured for the interfaces eth0 and eth1, only the first bridge, xenbr0, has been assigned an IP address:

[root@londonvs1 network-scripts]# ifconfig -a
...
xenbr0    Link encap:Ethernet  HWaddr 00:04:23:DC:29:50
          inet addr:172.17.1.91  Bcast:0.0.0.0  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:51582 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1314 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2983414 (2.8 MiB)  TX bytes:181128 (176.8 KiB)

xenbr1    Link encap:Ethernet  HWaddr 00:04:23:DC:29:51
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:85035 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:4479820 (4.2 MiB)  TX bytes:0 (0.0 b)

To configure the additional bridge and interface for the device eth1, update the configuration in /etc/sysconfig/network-scripts/ifcfg-eth1 as you would for a regular Linux network configuration:

[root@londonvs1 network-scripts]# more ifcfg-eth1
# Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper)
DEVICE=eth1
BOOTPROTO=static
HWADDR=00:04:23:DC:29:51
BROADCAST=192.168.1.255
IPADDR=192.168.1.91
NETMASK=255.255.255.0
NETWORK=192.168.1.0
ONBOOT=yes

The network can be restarted using the network-bridges script, as in this example:

[root@londonvs1 scripts]# ./network-bridges stop
Nothing to flush.
[root@londonvs1 scripts]# ./network-bridges start
net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
Nothing to flush.
Waiting for eth0 to negotiate link.....
net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
Nothing to flush.
Waiting for eth1 to negotiate link....

In this example, xenbr1 has now been assigned the IP address configured for eth1. For releases up to and including Oracle VM 2.2, you might wish to configure devices manually, without using the network-bridges and network-bridge scripts. You can do so using the technique described next to attain a configuration that will also support network bonding. With this approach, it is necessary to first reconfigure the interface eth0.

Begin by editing the file /etc/xen/xend-config.sxp and commenting out the section where the network-bridge script is called, as in this example:

#
# (network-script network-bridges)
#

Configure the bridges in the /etc/sysconfig/network-scripts directory. For example, bridge xenbr0 requires a file /etc/sysconfig/network-scripts/ifcfg-xenbr0 such as the following, where you specify the IP address that would usually be assigned to eth0 and set the type to BRIDGE:

# Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper)
DEVICE=xenbr0
ONBOOT=yes
BOOTPROTO=static
IPADDR=172.17.1.91
NETMASK=255.255.0.0
NOALIASROUTING=yes
TYPE=BRIDGE
DELAY=0

Configure eth0 in /etc/sysconfig/network-scripts/ifcfg-eth0 as follows, specifying the bridge but no IP address. For high availability, a bonded interface (see Chapter 6 for more details) can be attached to the bridge instead of eth0, as shown here:

# Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper)
DEVICE=eth0
ONBOOT=yes
BRIDGE=xenbr0

The steps for eth1 are exactly the same, except that you specify the private network IP address and netmask, and the bridge name is xenbr1 (a sketch of the corresponding files follows the grub.conf listing below). Because the required vif devices can be created dynamically, the default eight pairs are not required, so use a kernel argument in the file /boot/grub/grub.conf to set the number of loopback devices to zero:

title Oracle VM Server-ovs (xen-64-3.1.4 2.6.18-8.1.15.1.16.el5ovs)
root (hd0,0)
kernel /xen-64bit.gz dom0_mem=834M
module /vmlinuz-2.6.18-8.1.15.1.16.el5xen ro root=LABEL=/1 netloop.nloopbacks=0
module /initrd-2.6.18-8.1.15.1.16.el5xen.img
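
For illustration, the corresponding private network files referenced above might look like the following. The IP address and netmask are assumptions consistent with the example environment used in this chapter:

# /etc/sysconfig/network-scripts/ifcfg-xenbr1
DEVICE=xenbr1
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.1.91
NETMASK=255.255.0.0
NOALIASROUTING=yes
TYPE=BRIDGE
DELAY=0

# /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
ONBOOT=yes
BRIDGE=xenbr1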

Next, reboot the server for the network changes to take effect. You can see the result in the following listing, which shows a simplified and scalable network bridge configuration:

[root@londonvs1 network-scripts]# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:04:23:DC:29:50
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1118 errors:0 dropped:0 overruns:0 frame:0
          TX packets:465 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:94478 (92.2 KiB)  TX bytes:63814 (62.3 KiB)

eth1      Link encap:Ethernet  HWaddr 00:04:23:DC:29:51
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:685 errors:0 dropped:0 overruns:0 frame:0
          TX packets:56 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:62638 (61.1 KiB)  TX bytes:5208 (5.0 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:138 errors:0 dropped:0 overruns:0 frame:0
          TX packets:138 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:23275 (22.7 KiB)  TX bytes:23275 (22.7 KiB)

xenbr0    Link encap:Ethernet  HWaddr 00:04:23:DC:29:50
          inet addr:172.17.1.91  Bcast:172.17.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:716 errors:0 dropped:0 overruns:0 frame:0
          TX packets:490 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:58278 (56.9 KiB)  TX bytes:66104 (64.5 KiB)

xenbr1    Link encap:Ethernet  HWaddr 00:04:23:DC:29:51
          inet addr:192.168.1.91  Bcast:192.168.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:117 errors:0 dropped:0 overruns:0 frame:0
          TX packets:68 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:11582 (11.3 KiB)  TX bytes:5712 (5.5 KiB)

When a guest domain is started, vif devices are created for its network interfaces. For a guest with two network interfaces, they appear in the network listing, as in this example:

vif1.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:419 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1134 errors:0 dropped:90 overruns:0 carrier:0
          collisions:0 txqueuelen:32
          RX bytes:60149 (58.7 KiB)  TX bytes:109292 (106.7 KiB)

vif1.1    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:113 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8265 errors:0 dropped:1817 overruns:0 carrier:0
          collisions:0 txqueuelen:32
          RX bytes:23325 (22.7 KiB)  TX bytes:853477 (833.4 KiB)

Similarly, the brctl show command displays the bridge configuration and the interfaces attached to each particular bridge:

[root@londonvs2 ˜]# brctl show
bridge name     bridge id               STP enabled     interfaces
xenbr0          8000.000423dc1e78       no              vif1.0
                                                        eth0
xenbr1          8000.000423dc1e79       no              vif1.1
                                                        eth1

In addition to configuring the required bridges and interfaces, you should also ensure that the names and IP addresses used in your cluster are resolvable by all hosts, either by using DNS or by updating the /etc/hosts file. When configuring name resolution, it is important to ensure that the hostname does not resolve to the loopback address of 127.0.0.1, which will be the status of the default configuration. For example, the following details in /etc/hosts will result in errors during the subsequent cluster configuration:

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               londonvs1 localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6

Instead, the first line should resemble the following on all of the nodes in the cluster:

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
172.17.1.81             londonmgr1
172.17.1.91             londonvs1
172.17.1.92             londonvs2
192.168.1.91            londonvs1-priv
192.168.1.92            londonvs2-priv
192.168.1.220           dss

In this example, londonmgr1 is the Oracle VM Manager host, londonvs1 and londonvs2 are the names of the virtual servers, and londonvs1-priv and londonvs2-priv are the private interconnect interfaces on these hosts. Finally, dss is the name of the iSCSI server to be used for shared storage for a high availability configuration.
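
A quick way to verify the resolution on each node is to check the address returned for the node's own hostname, which should be the public address rather than 127.0.0.1. The output shown here is illustrative for the example configuration above:

[root@londonvs1 ˜]# getent hosts londonvs1
172.17.1.91     londonvs1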

Server Pool Configuration

Before proceeding with the server pool configuration, you need to ensure that all of the Oracle VM Servers for the cluster to be included in the same pool are installed and that the network is configured. You should also ensure that the Oracle VM agents are operational on the Oracle VM Servers by checking their status, as explained in the "Managing Domains" section later in this chapter.

The Oracle VM high availability feature and RAC are mutually exclusive. Therefore, if you choose to run RAC in a virtualized production environment, you should not use Oracle VM high availability, and vice versa. However, you may use high availability in a development or test environment or for its alternative clustering features. Depending on the version of Oracle VM, enabling high availability and configuring the Server Pool take place in different orders. Prior to Oracle VM 2.2, the Server Pool should be created first, as detailed here, and high availability should be configured later. For Oracle VM 2.2, you should omit this section and proceed with configuring high availability before you create the Server Pool.

The first time you log in to the Oracle VM Manager, the Server Pool Wizard is displayed and guides you through the creation of the Server Pool Master Server, as shown in Figure 5-14.

The Server Pool Configuration wizard

Figure 5.14. The Server Pool Configuration wizard

On the Server Information page, enter the details of the Oracle VM Server you want to act as the Server Pool Master. If communication is established, the Test Connection button should display the message "Server agent is active." Click Next and enter your chosen Server Pool Name; click Next again; and, finally, confirm your choice. The Server Pool is created at this point, as shown in Figure 5-15.

Server Pool creation

Figure 5.15. Server Pool creation

Once the Server Pool is created, it is possible to add additional servers to it. From the Servers tab, click Add Server, provide the Server Pool Name, and click Next. Now provide the server information and ensure that the Virtual Machine Server checkbox is selected. Optionally, you can use Test Connection to verify the status of the agent on the Oracle VM Server. Next, click Add, select the server to be added under the "Servers to be Added to Server Pool" heading, and press Next. Finally, click Confirm, and the server is added to the Server Pool, as shown in Figure 5-16.

Servers added to the Server Pool

Figure 5.16. Servers added to the Server Pool

Repeat the preceding process to add all of the additional servers to be included in the Server Pool. Once all of the nodes in the cluster are configured in the Server Pool, it is then necessary to enable high availability for the Server Pool, assuming you wish to do so.

Enabling High Availability

It is important to reiterate that, in a production environment, high availability should not be enabled with RAC due to the conflict of high availability features in both Oracle VM and RAC. However, high availability remains a valid option for a test-and-development environment, and we recommend that you learn how high availability is configured, so you can know where virtualization provides equivalent clustering capability. As with RAC, you can ensure the configuration is successful by understanding when actions are required on a single host or on all of the nodes in the cluster. In contrast to RAC, however, Oracle VM includes the concept of a Server Pool Master. Therefore, any actions run on a single host should be executed on the Server Pool Master.

Configuring Shared Storage

After a default installation, the /OVS partition is mounted as an OCFS2 file system. Although OCFS2 is a cluster file system, it is installed on a local device, as shown in the following example:

[root@londonvs1 utils]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             452G  971M  427G   1% /
/dev/sda1              99M   45M   49M  49% /boot
tmpfs                 277M     0  277M   0% /dev/shm
/dev/sda2             4.0G  271M  3.8G   7% /var/ovs/mount/A4AC9E8AE2214FC3ABF07904534A8777

As the next example from Oracle VM 2.2 shows, /OVS is not mounted directly; instead, it is a symbolic link to a Universally Unique Identifier (UUID) mount point in the directory /var/ovs/mount:

[root@londonvs1 ˜]# ls -l /OVS
lrwxrwxrwx 1 root root 47 Dec 14 12:14 /OVS -> /var/ovs/mount/A4AC9E8AE2214FC3ABF07904534A8777

Prior to Oracle VM 2.2, the /OVS partition was listed in the file /etc/fstab and mounted directly.
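
For illustration, an entry of the following general form would have been present in /etc/fstab; the device name is a placeholder and will differ on your system:

/dev/sda3               /OVS                    ocfs2   defaults        0 0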

To configure high availability, it is necessary for this partition to be moved to an OCFS2- or NFS-based file system that is shared between the nodes in the cluster. However, OCFS2 should not itself be configured on an NFS file system; if using OCFS2, it can be configured only on a suitably highly available SAN or NAS storage option.

In this example, the storage is network based, and the disks are presented with the iSCSI protocol. By default, the iSCSI initiator software is installed on the Oracle VM Server. Specifically, the iSCSI storage is presented to all the nodes in the cluster from the dedicated storage server with the hostname dss and the IP address 192.168.1.220. The NAS storage should either be on a separate network from the one used by both the public network and the private interconnect, or it should be in a unified fabric environment to ensure that sufficient bandwidth is dedicated to the storage. This helps ensure that high utilization does not interfere with the clustering heartbeats of the OCFS2 or RAC software. To configure the iSCSI disks, start the iSCSI daemon service on the Oracle VM Server Pool Master Server:

[root@londonvs1 ˜]# service iscsid start
Turning off network shutdown. Starting iSCSI daemon:
                                                           [  OK  ]

This snippet discovers the disks on the storage server:

[root@londonvs1 ˜]# iscsiadm -m discovery -t st -p dss
192.168.1.220:3260,1 iqn.2008-02:dss.target0

Next, start the iSCSI service:

[root@londonvs1 ˜]# service iscsi start
iscsid (pid 2430 2429) is running...
Setting up iSCSI targets: Logging in to [iface: default, target: iqn.2008-02:dss.target0, portal: 192.168.1.220,3260]
Login to [iface: default, target: iqn.2008-02:dss.target0, portal: 192.168.1.220,3260]:
successful
                                                           [  OK  ]

The disk discovery information can be viewed in /proc/scsi/scsi or by using the dmesg command, as follows:

scsi3 : iSCSI Initiator over TCP/IP
  Vendor: iSCSI     Model: DISK              Rev: 0
  Type:   Direct-Access                      ANSI SCSI revision: 04
sdf : very big device. try to use READ CAPACITY(16).
SCSI device sdf: 5837094912 512-byte hdwr sectors (2988593 MB)

You can partition the disk to be used for the /OVS partition with the commands fdisk or parted, as discussed in Chapter 4. This next example shows that one partition has been created on device /dev/sdf and is now available to configure for high availability:

[root@londonvs1 ˜]# parted /dev/sdf
GNU Parted 1.8.1
Using /dev/sdf
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print

Model: iSCSI DISK (scsi)
Disk /dev/sdf: 2989GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 1      17.4kB  2989GB  2989GB               primary

(parted)
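
For reference, the single partition shown above could have been created from the same parted session with commands similar to the following; the exact syntax accepted varies between parted versions, so treat this as a sketch:

(parted) mklabel gpt
(parted) mkpart primary 0% 100%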

Next, repeat the steps for disk discovery and configuration on all of the nodes in the cluster. However, do not repeat the partitioning step, which should be completed on the Server Pool Master only. The disk device is shared, so the partition information should now be visible on the rest of the nodes.

Cluster Configuration

By default, OCFS2 is installed automatically with Oracle VM Server, and it is configured to operate in an environment local to that server. The default configuration lends itself to being readily adapted to share the /OVS directory between multiple servers. In addition to the OCFS2 support integrated into the Linux kernel, the ocfs2-tools RPM package contains the command-line tools for management. There is also a GUI front end for these tools in an RPM package called ocfs2console; however, following the best practice guideline of running Dom0 with the least amount of overhead possible, no graphical environment is available, so the GUI tools may not be installed. It is therefore necessary to become familiar with the command-line tools for configuring OCFS2. In addition to OCFS2, the default installation also includes the o2cb service for cluster management. No additional software installations are required to extend the OCFS2 configuration into a clustered environment.
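
To confirm that the command-line management tools are present in Dom0, a simple package query such as the following can be used; the version reported will depend on your Oracle VM release:

rpm -q ocfs2-tools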

Before you can begin extending the configuration, all nodes in the cluster must ultimately share the same disk partition as the cluster root at /OVS, as in this example based on iSCSI. The default configuration has already installed an OCFS2 file system at /OVS, so it is necessary to move the existing mount point information for the /OVS directory to the shared storage. To begin creating the shared /OVS disk partition, it is necessary to format the shared storage device as an OCFS2 file system.

Before formatting the disk, you must have successfully completed the stages in Configuring Shared Storage, as described previously in this chapter. Specifically, you must have created the logical partition on a disk or shared storage system, and then provisioned it in such a way that every node in the cluster has read/write access to the shared disk. Once the disk is formatted, it can be mounted by any number of nodes. The format operation should be performed on one node only; ideally, this node should be the one designated as the Server Pool Master. The o2cb service must be running to format a partition that has previously been formatted with an OCFS2 file system. However, if the o2cb service is not available and the file system is not in use by another server in the cluster, this check can be overridden with the -F or --force option. Volumes are formatted using the mkfs.ocfs2 command-line tool. For example, the following command creates an OCFS2 file system on the device /dev/sdf1 with the OVS label:

[root@londonvs1 utils]# mkfs.ocfs2 -L "OVS" /dev/sdf1
mkfs.ocfs2 1.4.3
Cluster stack: classic o2cb
Overwriting existing ocfs2 partition.
mkfs.ocfs2: Unable to access cluster service while initializing the cluster
[root@londonvs1 utils]# dd if=/dev/zero of=/dev/sdf1 bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 1.66662 seconds, 62.9 MB/s
[root@londonvs1 utils]# mkfs.ocfs2 -L "OVS" /dev/sdf1
mkfs.ocfs2 1.4.3
Cluster stack: classic o2cb
Filesystem label=OVS
Block size=4096 (bits=12)
Cluster size=4096 (bits=12)
Volume size=2988592558080 (729636855 clusters) (729636855 blocks)
22621 cluster groups (tail covers 6135 clusters, rest cover 32256 clusters)
Journal size=268435456
Initial number of node slots: 16
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Writing backup superblock: 6 block(s)
Formatting Journals: done
Formatting slot map: done
Writing lost+found: done
mkfs.ocfs2 successful

After formatting the shared device as an OCFS2 file system, it is necessary to configure the cluster to mount the new OCFS2 file system and configure this file system as the cluster root on all hosts. From Oracle VM 2.2, this is done with the /opt/ovs-agent-2.3/utils/repos.py command. Prior to Oracle VM 2.2, this is done with the /usr/lib/ovs/ovs-cluster-configure command.

From Oracle VM 2.2, you will already have a local cluster root defined. This should be removed for all of the nodes in the cluster with the repos.py -d command, as shown:

[root@londonvs1 utils]# ./repos.py -l
[ * ] a4ac9e8a-e221-4fc3-abf0-7904534a8777 => /dev/sda2
[root@londonvs1 utils]# ./repos.py -d a4ac9e8a-e221-4fc3-abf0-7904534a8777
*** Cluster teared down.

At this point, the UUID will be different on each node in the cluster because the repositories are defined on local storage:

[root@londonvs2 utils]# ./repos.py -d 5980d101-93ed-4044-baf8-aaddef5a9f3e
*** Cluster teared down.

At this stage, no /OVS partition should be mounted on any node, as shown in this example:

[root@londonvs1 utils]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             452G  971M  427G   1% /
/dev/sda1              99M   45M   49M  49% /boot
tmpfs                 277M     0  277M   0% /dev/shm

On the Server Pool Master only, configure the newly formatted shared OCFS2 partition as the cluster root, as shown here:

[root@londonvs1 utils]# ./repos.py -n /dev/sdf1
[ NEW ] 5f267cf1-3c3c-429c-b16f-12d1e4517f1a => /dev/sdf1
[root@londonvs1 utils]# ./repos.py -r 5f267cf1-3c3c-429c-b16f-12d1e4517f1a
[ R ] 5f267cf1-3c3c-429c-b16f-12d1e4517f1a => /dev/sdf1

Subsequently, follow the procedure detailed previously to create a Server Pool, as shown in Figure 5-17.

Highly available Server Pool creation

Figure 5.17. Highly available Server Pool creation

In this case, you need to ensure that the High Availability Mode checkbox is selected and click the Create button, as shown in Figure 5-18.

High Availability configuration

Figure 5.18. High Availability configuration

The Server Pool is created as a High Availability cluster with a single node, as shown in Figure 5-19.

The created High Availability Server Pool

Figure 5.19. The created High Availability Server Pool

On the Server Pool Master itself, the shared storage has been mounted and initialized as the cluster root under the /OVS symbolic link, as shown in this example:

[root@londonvs1 utils]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             452G  971M  427G   1% /
/dev/sda1              99M   45M   49M  49% /boot
tmpfs                 277M     0  277M   0% /dev/shm
/dev/sdf1             2.8T  4.1G  2.8T   1% /var/ovs/mount/5F267CF13C3C429CB16F12D1E4517F1A

To add nodes to the cluster under Oracle VM Manager, click the Servers tab and then the Add Server button. Provide the server details, including the name of the Server Pool, and then click OK, as shown in Figure 5-20.

Adding a server to the Server Pool

Figure 5.20. Adding a server to the Server Pool

The server is added to the Server Pool, and the cluster root partition is automatically configured and mounted as the shared storage on the additional node:

[root@londonvs2 utils]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             452G  971M  427G   1% /
/dev/sda1              99M   45M   49M  49% /boot
tmpfs                 277M     0  277M   0% /dev/shm
/dev/sdf1             2.8T  4.1G  2.8T   1% /var/ovs/mount/5F267CF13C3C429CB16F12D1E4517F1A

The Server Pool is now shown as having high availability enabled with two servers (see Figure 5-21).

A Highly Available Server Pool with two servers

Figure 5.21. A Highly Available Server Pool with two servers

For versions of Oracle VM prior to 2.2, configuring high availability is a more manual process. However, even if you're running Oracle VM 2.2, we recommend reviewing the details in this section to gain familiarity with the cluster software, even though the actual enabling commands are different.

Prior to Oracle VM 2.2, it is first necessary to unmount the existing local /OVS partition and run /usr/lib/ovs/ovs-cluster-configure on the Server Pool Master. The first operation of this command adds the following line to the file /etc/sysconfig/iptables on all of the nodes in the cluster. It also restarts the iptables service, as follows:

-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 7777 -j ACCEPT

This action opens the firewall for cluster communication on the default port number of 7777. Details of the OCFS2 cluster are stored in the /etc/ocfs2/cluster.conf configuration file, which has a default cluster name of ocfs2. The configuration file contains the details of all of the nodes in the Server Pool, and this file is the same on each node. If making manual changes, new nodes can be added to the cluster dynamically. However, any other change, such as changing an existing node name or IP address, requires a restart of the entire cluster to update information that has been cached on each node. The /etc/ocfs2/cluster.conf file contains the following two sections:

  • The cluster section includes:

    • node_count: Specifies the maximum number of nodes in the cluster.

    • name: Defaults to ocfs2.

  • The node section includes:

    • ip_port : Indicates the IP port to be used by OCFS to communicate with other nodes in the cluster. This value must be identical on every node in the cluster; the default value is 7777.

    • ip_address: Specifies the IP address to be used by OCFS. The network must be configured in advance of the installation, so it can communicate with all nodes in this cluster through this IP address.

    • number: Specifies the node number, which is assigned sequentially from 0 to 254.

    • name: Indicates the server hostname.

    • cluster: Specifies the name of the cluster.

There can be a maximum of 255 nodes in the cluster. However, the initial configuration will include all of the nodes in the Server Pool, as well as their public IP addresses. It is good practice to modify this file to use the private IP addresses as shown, and then to restart the OCFS2 and O2CB services as explained in the following section. When modifying the IP addresses, however, it is important that the hostname should remain as the public name of the host:

[root@londonvs2 ovs]# cat /etc/ocfs2/cluster.conf
node:
        ip_port = 7777
        ip_address = 192.168.1.91
        number = 0
        name = londonvs1
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.1.92
        number = 1
        name = londonvs2
        cluster = ocfs2

cluster:
        node_count = 2
        name = ocfs2

During cluster configuration for high availability, the o2cb service is also started on each node in the cluster. This stack includes components such as the node manager, the heartbeat service, and the distributed lock manager that is crucial to cluster functionality and stability, so we recommend that you also become familiar with its parameters and operation. By default, the o2cb service is configured to start on boot up for a cluster name of ocfs2. The default configuration can be changed with the command service o2cb configure. Also, there are four parameters that can be changed to alter o2cb cluster timeout operations, depending on the cluster environment and the cluster name. Another option loads the o2cb modules at boot time. The four parameters for the o2cb service are as follows:

  • O2CB_HEARTBEAT_THRESHOLD: The heartbeat threshold defines the number of heartbeats that a node can miss when updating its disk-based timestamp before it is excluded from the cluster. The value is based on iterations of two seconds each; therefore, the default value of 31 sets a timeout of 60 seconds.

  • O2CB_IDLE_TIMEOUT_MS: The idle timeout, given in milliseconds, determines the maximum latency for a response on the network interconnect between the cluster nodes. The default value is 30000.

  • O2CB_KEEPALIVE_DELAY_MS: The keepalive delay sets the time in milliseconds after which a TCP keepalive packet is sent over the network when no other activity is taking place. Keepalive packets and their responses ensure that the network link is maintained; they are short and utilize minimal bandwidth. The default setting for this parameter is 2000 milliseconds.

  • O2CB_RECONNECT_DELAY_MS: The reconnect delay has a default value of 2000 milliseconds, and it specifies the interval between attempted network connections.

The parameters can be modified by specifying new values with the configure option. While it is possible to change these values while a cluster is operational, you should not do so: the parameters must be the same on all nodes in the cluster, so if the O2CB_IDLE_TIMEOUT_MS value is changed dynamically, then when the first node in the cluster is restarted, the timeouts will be incompatible, and the node will not be able to rejoin the cluster. Therefore, configuration changes should take place on all nodes at the same time, and the o2cb modules should be reloaded after the changes are made:

[root@londonvs1 ˜]# service o2cb configure
Configuring the O2CB driver.

This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot.  The current values will be shown in brackets ('[]').  Hitting
<ENTER> without typing an answer will keep that current value.  Ctrl-C
will abort.

Load O2CB driver on boot (y/n) [y]:
Cluster stack backing O2CB [o2cb]:
Cluster to start on boot (Enter "none" to clear) [ocfs2]:
Specify heartbeat dead threshold (>=7) [31]:
Specify network idle timeout in ms (>=5000) [30000]:
Specify network keepalive delay in ms (>=1000) [2000]:
Specify network reconnect delay in ms (>=2000) [2000]:
Writing O2CB configuration: OK
Starting O2CB cluster ocfs2: OK

The non-default configured values for o2cb are stored in the file /etc/sysconfig/o2cb after they are changed, and the current operational values can be shown with the command service o2cb status. This command also displays the current status of the cluster:

[root@londonvs1 ˜]# service o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 31
  Network idle timeout: 30000
  Network keepalive delay: 2000
  Network reconnect delay: 2000
Checking O2CB heartbeat: Active

You can bring the cluster service online and take it offline again. To bring the cluster online, use the command service o2cb online [cluster_name], as in this example:

[root@londonvs1 ˜]# service o2cb online ocfs2
Loading filesystem "configfs": OK
Mounting configfs filesystem at /sys/kernel/config: OK
Loading filesystem "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting O2CB cluster ocfs2: OK

The clustered file system can then be mounted. Using the /OVS symbolic link ensures that the correctly configured cluster root is mounted:

[root@londonvs1 ˜]# mount /dev/sdf1 /OVS

To take the same ocfs2 cluster offline again, use the following commands, ensuring that the clustered file system itself is unmounted before the service is taken offline:

[root@londonvs1 ˜]# umount /OVS
[root@londonvs1 ˜]# service o2cb offline ocfs2
Stopping O2CB cluster ocfs2: OK
Unloading module "ocfs2": OK

You can also stop both the ocfs2 and o2cb services individually, although the o2cb cluster service must be running, with the file system unmounted, to perform maintenance operations. As with any cluster environment, it is essential to maintain the health of the shared file system; if corrupted, the shared file system will render the entire cluster inoperable. If corruption is detected, the OCFS2 file system may operate in read-only mode, resulting in the failure of Oracle VM Manager commands. You can perform a number of tuning operations on existing partitions using the tunefs.ocfs2 command-line utility, such as increasing the number of node slots, changing the volume label, and increasing the size of the journal file (a brief sketch follows the fsck.ocfs2 example below). The fsck.ocfs2 tool can be used to check the health of an OCFS2 file system on an individual partition. To force a check on a file system that you suspect has errors, use the -f option, and add the -y option to automatically fix the errors without prompting, as in this example:

[root@londonvs1 ˜]# fsck.ocfs2 -fy /dev/sdf1
fsck.ocfs2 1.4.3
Checking OCFS2 filesystem in /dev/sdf1:
  label:              OVS
  uuid:               cf 03 24 d4 73 78 46 b8 96 c3 45 1f 75 f5 e2 38
  number of blocks:   729636855
  bytes per block:    4096
  number of clusters: 729636855
  bytes per cluster:  4096
  max slots:          16

/dev/sdf1 was run with -f, check forced.
Pass 0a: Checking cluster allocation chains
Pass 0b: Checking inode allocation chains
Pass 0c: Checking extent block allocation chains
Pass 1: Checking inodes and blocks.
[INODE_SPARSE_SIZE] Inode 6 has a size of 4096 but has 2 blocks of
actual data. Correct the file size? y
Pass 2: Checking directory entries.
[DIRENT_LENGTH] Directory inode 6 corrupted in logical block 1 physical block
208 offset 0. Attempt to repair this block's directory entries? y
Pass 3: Checking directory connectivity.
Pass 4a: checking for orphaned inodes
Pass 4b: Checking inodes link counts.
All passes succeeded.
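
As mentioned earlier, tunefs.ocfs2 can adjust an existing volume. The following is a sketch of typical operations against the device used in this chapter; the values shown are illustrative, and such changes should only be made with the file system unmounted or the cluster quiesced:

tunefs.ocfs2 -N 32 /dev/sdf1          # increase the number of node slots
tunefs.ocfs2 -L OVS /dev/sdf1         # set or change the volume label
tunefs.ocfs2 -J size=512M /dev/sdf1   # increase the journal size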

Use mounted.ocfs2 to check the nodes currently mounting a specific device. Options include -d, which performs a quick detect useful in identifying the UUID for a particular device; and -f, which performs a full detect to display the mounted filesystems:

[root@londonvs1 ˜]# mounted.ocfs2 -d
Device                FS     UUID                                  Label
/dev/sda2             ocfs2  2f77c155-017a-4830-8abb-8bc767ef7e1f
/dev/sdf1             ocfs2  cf0324d4-7378-46b8-96c3-451f75f5e238  OVS
[root@londonvs1 ˜]# mounted.ocfs2 -f
Device                FS     Nodes
/dev/sda2             ocfs2  Not mounted
/dev/sdf1             ocfs2  londonvs1, londonvs2

For versions of Oracle VM prior to 2.2, after configuring OCFS2 it is necessary to configure the /etc/ovs/repositories file, within which the cluster mount points are maintained. This can be achieved by using the /usr/lib/ovs/ovs-makerepo command, where the arguments are the device name of the partition, C for cluster, and a comment. This command must be run on each node in the cluster. The first run initializes the shared repository, while subsequent runs report that the shared repository is already initialized. Running this command also updates the local repository list:

[root@londonvs1 ovs]# ./ovs-makerepo /dev/sdf1 C "PRORAC Cluster"
Initializing NEW repository /dev/sdf1
SUCCESS: Mounted /OVS
Updating local repository list.
ovs-makerepo complete

After configuration, the device and OVS UUID are associated in the /etc/ovs/repositories file:

[root@londonvs1 ovs]# more /etc/ovs/repositories
# This configuration file was generated by ovs-makerepo
# DO NOT EDIT
@8EE876B5C1954A1E8FBEFA32E5700D20 /dev/sdf1

Also, the corresponding configuration is stored on the device that is now mounted at /OVS:

[root@londonvs1 ovs]# more /OVS/.ovsrepo
OVS_REPO_UUID=8EE876B5C1954A1E8FBEFA32E5700D20
OVS_REPO_SHARED=1
OVS_REPO_DESCRIPTION=PRORAC Cluster
OVS_REPO_VERSION=1

After configuring the /etc/ovs/repositories file, the OCFS2 mount point information should no longer be maintained in the /etc/fstab file. The command /usr/lib/ovs/ovs-cluster-check can be used to complete the cluster configuration. On the Server Pool master, use the arguments --master and --alter-fstab to complete the cluster configuration:

[root@londonvs1 ovs]# ./ovs-cluster-check --master --alter-fstab
Backing up original /etc/fstab to /tmp/fstab.fQOZF10773
Removing /OVS mounts in /etc/fstab
O2CB cluster ocfs2 already online
Cluster setup complete.

On the other nodes in the Server Pool, use only the argument --alter-fstab to complete the cluster configuration:

[root@londonvs2 ovs]# ./ovs-cluster-check --alter-fstab
O2CB cluster ocfs2 already online
Cluster setup complete.

To enable High Availability across the Server Pool, log into Oracle VM Manager and click the Server Pools tab. When the Server Pool is created, the High Availability Status field shows a status of Disabled. Select the Server Pool and click Edit to show the Edit Server Pool page. On this page, click the Check button that corresponds to the High Availability Infrastructure field. The response should be "High Availability Infrastructure works well," as shown in Figure 5-22.

An Oracle VM 2.1.2 High Availability configuration

Figure 5.22. An Oracle VM 2.1.2 High Availability configuration

If the check reports an error, you need to ensure that all the steps detailed in this section have been correctly implemented on all of the nodes in the cluster. In particular, the check ensures that all the nodes in the cluster share the same /OVS partition. This means the storage can be synchronized and the cluster scripts have been correctly run on all of the nodes. The errors reported will aid in diagnosing where the configuration is incorrect. When the check is successful, select the Enable High Availability checkbox and click Apply, as shown in Figure 5-23.

High Availability Enabled

Figure 5.23. High Availability Enabled

Once the Server Pool has updated, press OK. The top-level Server Pools page now shows a High Availability Status of Enabled, and high availability features are active across the cluster, as shown in Figure 5-24.

A Highly Available Server Pool

Figure 5.24. A Highly Available Server Pool

Installing and Configuring Guests

With the underlying Oracle VM configuration established in a high availability configuration, the next step is to create and configure DomU guest domains and to install a guest operating system environment. There are a number of ways to configure a guest, either at the command line, for example with the virt-install command, or with Oracle VM Manager using standard installation media. For Linux environments, however, we recommend standardizing on the use of Oracle VM templates for guest installations. Oracle VM templates provide pre-configured system images of both Oracle Enterprise Linux and Oracle software environments. These environments can be imported, and guests can be created from a template without needing to undergo the operating system installation procedure. The pre-configured Oracle VM templates are available from the Oracle E-Delivery website. On the Media Pack Search page, you can select Oracle VM templates from the Select a Product Pack menu. For a guest to support a RAC environment, downloading the template requires only that you select the operating system, such as Oracle VM Templates for Oracle Enterprise Linux 5 Media Pack for x86_64 (64 bit).

Additionally, with Oracle Enterprise Linux JeOS, Oracle provides a freely downloadable operating system environment that enables you to build your own Oracle VM Templates for importing into Oracle VM. Therefore, adopting the use of templates enables you to maintain consistency across the installations of all of your guest environments.

Importing a Template

Whether you have created your own template or downloaded an Oracle template, the first stage in installing a guest based on a template is to import it using Oracle VM Manager. At the time of writing, there is no preconfigured Oracle RAC template, so the example will focus on the standard Linux template. On the /OVS partition shared between the Oracle VM Servers, there are a number of directories, as shown here:

[root@londonvs1 OVS]# ls
iso_pool  lost+found  publish_pool  running_pool  seed_pool  sharedDisk

When working with Oracle VM templates, you will focus on the running_pool and seed_pool; the iso_pool is for the storage of CD images. Copy the Oracle VM template file to the /OVS/seed_pool directory. If you're using the Oracle-provided templates, then also use the unzip command followed by tar with the arguments zxvf to extract the template into a top-level directory named after the template (a sketch follows the listing below). This directory will contain three files, as shown:

[root@londonvs1 OVM_EL5U3_X86_64_PVM_4GB]# ls
README  System.img  vm.cfg
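
The copy and extraction steps themselves might look like the following; the zip file name is a hypothetical placeholder for the file downloaded from E-Delivery, and the tar file name is assumed to match the template directory shown above:

cd /OVS/seed_pool
unzip V12345-01.zip
tar zxvf OVM_EL5U3_X86_64_PVM_4GB.tgz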

Log into Oracle VM Manager and click the Resources tab to show the Virtual Machine Templates page. Click Import to show the Source page, choose to select a template from the Server Pool, and press Next. On the General Information page, enter the details of the template to use. The templates you have extracted into the seed_pool directory are shown under the Virtual Machine Template Name dropdown menu. Now press Next. On the Confirm Information page, press Confirm to import the template. When successfully imported, the template is displayed with the status of Pending, as shown in Figure 5-25.

The imported template

Figure 5.25. The imported template

Once the template is imported, it is necessary to approve it before it can be used to create a guest. In the Virtual Machine Templates section, click Approve; the View Virtual Machine Template page is displayed with the status of Pending. Click Approve again, and the template displays a status of Active, meaning it is ready to be used to create a guest installation, as shown in Figure 5-26.

The active template

Figure 5.26. The active template

Creating a Guest from a Template

To begin creating a guest from an imported and approved Oracle VM template, log into Oracle VM Manager, click the Virtual Machines tab, and press the Create Virtual Machine button. On the Creation Method page, choose the Create virtual machine based on a virtual machine template option, and then press Next. Now select the Server Pool and the Preferred Server options. The concept of Preferred Servers is similar to that which is applied to preferred servers for RAC service management; however, in this case it specifies the preferred Oracle VM Server that will run the created guest domain if that server is available (see Figure 5-27).

The preferred server

Figure 5.27. The preferred server

On the Source page, select the name of the template that was imported. Now press Next to show the Virtual Machine Information page, and then enter the details for the guest to be configured. The Virtual Machine Name will be applied to the guest domain, and it may be the same as the name of the guest operating system that runs in the domain. The console password is the password used to access the VNC interface to the guest, as discussed in the next section; it is not the root user password for the guest, which is set by default to ovsroot. Under the Network Interface Card section, the first virtual interface, VIF0, is configured by default. Select Add Row, and then select the bridge on the private network. In this case, xenbr1 is chosen for the additional interface, VIF1, as shown in Figure 5-28.

The network configuration

Figure 5.28. The network configuration

Press Next, and then press Next again on the Confirm Interface Screen. This creates the guest from the template. The System image file is copied from the seed_pool to the running_pool directory, as shown in Figure 5-29.

Guest Creation

Figure 5.29. Guest Creation

When the creation process has completed, the guest shows a status of Powered Off. The guest can be started by pressing the Power On button.

Accessing a Guest

When the guest is running, access to the console is provided through the VNC service that is hosted by the operating system in Dom0. Once the guest network interfaces have been configured, the guest can be accessed across the network using standard tools, such as ssh. However, the console also enables accessing the guest before the network has been enabled. To find the VNC connectivity information on the Virtual Machines page, note the Oracle VM Server that the guest is running on. In the Details section, click Show and note the VNC port. In this example, the port is the VNC default of 5900 (see Figure 5-30).

The VNC configuration

Figure 5.30. The VNC configuration

Based on this connectivity information, the guest console can be accessed using any VNC client, such as vncviewer, which is available in most Linux environments. Run vncviewer with the Oracle VM Server and VNC port as arguments, as shown here:

[root@londonmgr1 ˜]# vncviewer londonvs2:5900

VNC Viewer Free Edition 4.1.2 for X - built May 12 2006 17:42:13
Copyright (C) 2002-2005 RealVNC Ltd.
See http://www.realvnc.com for information on VNC.

At the graphical prompt, provide the console password entered when the guest was created. If this password has been forgotten, it can be accessed from the information stored in the Xenstore or the vm.cfg file, as detailed later in this chapter. Oracle VM Manager also provides a VNC plugin to the host web browser. Clicking the Console button under the Virtual Machines page can also display the console. In this case however, the console is embedded in the browser, as opposed to the standalone access provided by vncviewer (see Figure 5-31).

Guest Console

Figure 5.31. Guest Console

On first access to the guest console, you are prompted to provide the IP and hostname configuration information for the guest operating system. By default, this will configure the public eth0 interface only. Therefore, it is necessary to prepare the guest for RAC before you can log in and configure the additional private network interface.

Configuring a Guest for RAC

Once the guest template is installed, its configuration information is stored under a directory named after the guest under /OVS/running_pool. Under this directory, the file System.img is presented as a disk device to the guest when it is running. Consequently, this file should not be accessed or modified from the Oracle VM Server environment. The file vm.cfg contains the virtual machine configuration information, and it is where modifications can be made to change the configuration before starting the guest:

[root@londonvs1 12_london1]# ls -ltR
.:
total 6353708
-rw-rw-rw- 1 root root        475 Dec 15 16:07 vm.cfg
-rw-rw-rw- 1 root root        268 Dec 14 17:02 vm.cfg.orig
-rw-rw-rw- 1 root root        215 Dec 14 17:02 README
-rw-rw-rw- 1 root root 6506195968 Dec 14 17:02 System.img

A default vm.cfg file for a paravirtualized guest environment looks similar to the following:

[root@londonvs1 12_london1]# more vm.cfg
bootloader = '/usr/bin/pygrub'
disk = ['file:/var/ovs/mount/5F267CF13C3C429CB16F12D1E4517F1A/running_pool/12_london1/System.img,xvda,w']
memory = '1024'
name = '12_london1'
on_crash = 'restart'
on_reboot = 'restart'
uuid = '6b0723e6-b2f0-4b29-a37c-1e9115798548'
vcpus = 1
vfb = ['type=vnc,vncunused=1,vnclisten=0.0.0.0,vncpasswd=oracle']
vif = ['bridge=xenbr0,mac=00:16:3E:12:29:C8,type=netfront',
'bridge=xenbr1,mac=00:16:3E:53:95:A5,type=netfront',
]
vif_other_config = []

In this example, the network interfaces were configured during the guest installation, and no further changes to the vm.cfg file are required. In terms of the disk devices, it can be seen that the System.img file is presented to the guest as the first block device, xvda. Using Oracle VM Manager, additional image files can be created to add disk devices to the host. However, for RAC it is necessary to pass shared physical disk devices to the guest operating systems. Although it is possible to configure an iSCSI device or NFS file system directly in the guest, only block devices configured from within Dom0 are supported by Oracle in a production RAC environment. If redundancy is a requirement at the disk device level, multipathing should be implemented within Dom0 (see Chapter 4 for more information). Also, udev can be used at the Dom0 level to configure device persistence, or ASMLIB can be used within the guest (again, see Chapter 4 for more information).

The physical devices must already be configured and presented to the Dom0 operating system, just as they would be in a native Linux operating system environment. They can then be passed to the guests with a configuration such as the one shown here:

disk = ['file:/var/ovs/mount/5F267CF13C3C429CB16F12D1E4517F1A/running_pool/12_london1/System.img,xvda,w',
        'phy:/dev/sdc,xvdc,w',
        'phy:/dev/sdd,xvdd,w',
        'phy:/dev/sde,xvde,w',
       ]

Disk devices can be partitioned in one of two ways. First, they can be partitioned within Dom0, and each partition can be passed as a device to the guest. Second, the full device can be passed to the guest and partitioned there. It is also important to note that, in some earlier versions of Oracle VM, the total number of xvd devices is limited to 16; in later versions, the limit increases to 256.

When the guest is restarted, the physical devices on which Oracle RAC can be installed and shared between the guests are available to the guest environments. These guest environments can use the physical devices exactly as would be the case for a shared disk device:

[root@london1 ˜]# cat /proc/partitions
major minor  #blocks  name

 202     0    6353707 xvda
 202     1      32098 xvda1
 202     2    4225095 xvda2
 202     3    2088450 xvda3
 202    32  106778880 xvdc
 202    33  106775991 xvdc1
 202    48    1048576 xvdd
 202    49    1048376 xvdd1
 202    64  313994880 xvde
 202    65  313990393 xvde1

An additional configuration step to prepare the guest for a RAC environment is to assign an IP address to the private network interface. This can be done exactly as you would in a standard Linux operating system environment; that is, by editing the corresponding configuration file, such as /etc/sysconfig/network-scripts/ifcfg-eth1. By default, the network interfaces are configured for DHCP, and they will be automatically assigned an address if DHCP is configured in your environment. It is important to note that the MAC address for this interface is the Xen-assigned MAC address, and the one used must be the MAC address shown for VIF1 during the guest creation. The same address is also shown in the vm.cfg file on the bridge xenbr1:

[root@london1 network-scripts]# cat ifcfg-eth1
# Xen Virtual Ethernet
DEVICE=eth1
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.1.1
HWADDR=00:16:3e:53:95:a5

It is important to note that, in the standard Oracle Linux template, the iptables service is enabled by default in the guest environment. Therefore, it should also be stopped there and prevented from starting on reboot, as shown below.
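
Run as the root user within the guest, the following commands stop the service immediately and disable it across reboots:

[root@london1 ˜]# service iptables stop
[root@london1 ˜]# chkconfig iptables off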

Additionally, in a Xen configuration the time in the guest is synchronized to Dom0 by default; therefore, the guest should be set to manage its own time by running the command echo 1 > /proc/sys/xen/independent_wallclock as root and by setting the parameter xen.independent_wallclock=1 in /etc/sysctl.conf. If you are using a template-based guest, this parameter may already be set in the guest environment. If you are using NTP as described in Chapter 4, the guests should synchronize with the same time source.
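
For example, run as root in the guest, the following commands apply the setting immediately and persist it across reboots (appending to /etc/sysctl.conf is shown for brevity; the file can equally be edited directly):

[root@london1 ˜]# echo 1 > /proc/sys/xen/independent_wallclock
[root@london1 ˜]# echo "xen.independent_wallclock=1" >> /etc/sysctl.conf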

Once all of the guest domains have been created to support the planned number of nodes for the RAC installation, you can proceed to install and configure RAC in the guest operating system environments exactly as you would in a native Linux environment (see Chapter 6 for more information).

Managing Domains

One of the key features of operating in a virtualized environment is the ability to dynamically assign resources to guest operating systems according to demand. In this section, we will look at some of the tools available for managing Oracle VM, paying particular attention to managing resources. We will begin by looking at the role the Oracle VM Agent plays in communication between the VM Server and VM Manager, before focusing on domain management with Oracle VM Manager and the xm command-line tools.

Oracle VM Agent

Earlier in this chapter, we examined the architecture of Oracle VM and the role of the Oracle VM Agent and its control interface. In doing so, we covered how this software provides an important mediation service between the Oracle VM Manager and the Oracle VM Servers in the Server Pool. Therefore, it is important to ensure that the VM Agent is correctly configured and running on each of the VM Servers; this enables you to manage the VM Servers from a central Oracle VM Manager location. The VM Agent is installed automatically at the same time as the VM Server, and it runs as a service in the Linux operating system of the management domain. By default, it is located in /opt/ovs-agent-2.3. The VM Agent is configured to start when the Dom0 operating system is running at runlevel 3, and the command service ovs-agent status can be used to confirm its status. Using this command provides the following output, which confirms normal operation:

[root@londonvs1 ˜]# service ovs-agent status
ok! process OVSRemasterServer exists.
ok! process OVSLogServer exists.
ok! process OVSMonitorServer exists.
ok! process OVSPolicyServer exists.
ok! process OVSAgentServer exists.
ok! OVSAgentServer is alive.

Each of these Agent daemons runs as a Python script from the installation location, and additional utility scripts are available under this location to interact with the agent at the command line. Various configuration options, such as setting the agent password or defining the IP addresses that are permitted to communicate with the agent, are available through the command service ovs-agent configure. The Oracle VM Agent interacts with the Xen Hypervisor through the Xend daemon running in Dom0. The Xend daemon is the Xen controller daemon, and it is responsible for management functionality such as creating and configuring domains and Live Migration. The Xend daemon is also written primarily in Python, which is why the VM Agent runs as Python scripts. The status of the Xend daemon can be verified with the command service xend status:

[root@london1 xenstored]# service xend status
xend daemon running (pid 3621)

The configuration of the Xend daemon can be modified in the file xend-config.sxp in the /etc/xen directory. This file provides modifiable configuration options for settings such as the memory and CPUs of the management domain. The Xend daemon creates an additional daemon, xenstored, and makes commands available to read and write the configuration in the Xenstore. The Xenstore is the central repository for all configuration information on an Oracle VM Server, and DomU information can be dynamically configured by changing the information contained within it; for example, to enable device discovery. The information is held in a lightweight database called a Trivial Database (tdb) in the form of key-value pairs, located in the file /var/lib/xenstored/tdb:

[root@londonvs1 xenstored]# file /var/lib/xenstored/tdb
tdb: TDB database version 6, little-endian hash size 7919 bytes

The entire database can be listed with the xenstore-ls command:

[root@londonvs1 ˜]# xenstore-ls
tool = ""
 xenstored = ""
local = ""
 domain = ""
  0 = ""
   vm = "/vm/00000000-0000-0000-0000-000000000000"
device = ""
   control = ""
    platform-feature-multiprocessor-suspend = "1"
   error = ""
   memory = ""
    target = "566272"
   guest = ""
   hvmpv = ""
   cpu = ""
    1 = ""
...

The hierarchical paths can be listed with the xenstore-list command. For example, you can list the uuids of the virtual machines, as in this snippet:

[root@londonvs1 ˜]# xenstore-list /vm
00000000-0000-0000-0000-000000000000
6b0723e6-b2f0-4b29-a37c-1e9115798548

The xenstore-read and xenstore-write commands can be used to read and change the individual values. The command shown here details the name of the Virtual Machine that corresponds to the uuid given:

[root@londonvs1 ˜]# xenstore-read 
> /vm/6b0723e6-b2f0-4b29-a37c-1e9115798548/name
12_london1

When troubleshooting configuration changes, the Xenstore can be used to determine whether information from the VM Manager and the Oracle VM Agent has been relayed to the underlying Xen software to be applied to the domains themselves. In Oracle VM, the persistent storage is configured under the /OVS/running_pool directory, and there are no entries in the default Xen location of /var/lib/xend/domains. That said, the configuration can be imported with the xm new command. You'll learn more about this in the "Managing Domains" section later in this chapter.

Oracle VM Manager

You have already seen, from the guest operating system installation, how, when logged into Oracle VM Manager under the Virtual Machines page, you can Power On and Power Off a guest and access its console either through the Console button or with vncviewer. By clicking the Configure button, it is also possible to modify a number of configuration parameters, such as the number of CPU cores assigned to the virtual machine and the amount of memory. These values correspond to the parameters memory and vcpus in the vm.cfg file, as in this example:

memory = '2048'
vcpus = 2

These parameters can be changed in the vm.cfg file before starting a guest. However, before modifying these parameters dynamically in a production environment, it is important to check whether doing so is a supported action. In an Oracle Database on Linux environment, both parameters are dynamic, which means that both CPU and memory can be added to and subtracted from an instance while it is running.

The number of cores in use by the Oracle instance is shown by the parameter cpu_count:

SQL> show parameter cpu_count

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------------------
cpu_count                            integer     2

In Oracle VM Manager, if the number of cores allocated to the guest is modified and the configuration is saved, then this is reflected automatically in the cpu_count parameter in the Oracle instance running in the guest. This can immediately provide a performance increase for CPU intensive workloads, and no further configuration is needed by the DBA. However, careful consideration should be given to modifying the CPU count when parallel workloads are in operation.
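
One reason for this care is that parallel execution defaults are derived from cpu_count; for example, the default degree of parallelism is calculated from cpu_count multiplied by parallel_threads_per_cpu, so changing the core allocation also changes the degree of parallelism used by default. The latter parameter can be checked as follows (the value shown is the usual default):

SQL> show parameter parallel_threads_per_cpu

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------------------
parallel_threads_per_cpu             integer     2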

Within the guest, you should not allocate more virtual CPUs than you have physical cores within the processors on your system. The exception to this rule occurs where Hyper-Threading is available (as detailed in Chapter 4); in that case, the limit applies to the number of available threads, as opposed to cores. However, the performance benefits of assigning the additional virtual CPUs are not as well defined as they are in a native implementation. Where multiple guests are configured, the total number of assigned virtual CPUs should be no more than twice the available cores or threads.

From within Oracle VM Manager, the amount of memory allocated to a guest can also be modified. However, the impact upon the guest depends more on the Oracle configuration than it does for CPU. In Oracle 11g, Automatic Memory Management is enabled by setting the parameters memory_target and memory_max_target, and the instance memory is then allocated from files in the /dev/shm shared memory filesystem. To use this feature, the /dev/shm filesystem must be pre-configured to be large enough to support the maximum possible value of memory_max_target. Doing so requires all memory to be assigned to the VM in advance, thereby negating any advantage of assigning memory dynamically. For this reason, we recommend not using Oracle 11g Automatic Memory Management in a virtualized environment. Instead, you should use Automatic Shared Memory Management, which was introduced with 10g. With Automatic Shared Memory Management, the SGA memory configuration is controlled by the parameters sga_max_size and sga_target. This memory cannot be assigned to the PGA, as it can with Automatic Memory Management. However, and this is crucial for virtualized environments, sga_max_size can be set to the upper limit to which the SGA can grow without that memory needing to be allocated to the guest in advance. This means it can be set to the potential size to which memory will be dynamically assigned, as shown here:

SQL> show parameter sga_

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------------------
sga_max_size                         big integer 12G
sga_target                           big integer 2G

When memory is dynamically assigned to the guest, it is not immediately assigned to the Oracle instance. It is necessary to increase the sga_target parameter up to the new amount of memory assigned to the guest so that the additional memory is used by the Oracle RAC instance on that guest, as in this example:

SQL> alter system set sga_target=6g scope=both sid='PROD1';

System altered.

When allocating memory to guests, however, you should also ensure that sufficient memory is available for Dom0. By default, this is set to the minimum value of 512MB. Where I/O activity is intensive and additional memory is available, performance can benefit from increasing this value to 1024MB, as specified in the dom0_mem kernel parameter in the grub.conf file of the Oracle VM Server.
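
For example, a minimal sketch of the corresponding entry in /boot/grub/grub.conf is shown below; the kernel and initrd file names and the root device are illustrative and will vary between installations:

title Oracle VM Server (2.6.18-128.2.1.4.9.el5xen)
        root (hd0,0)
        kernel /xen.gz dom0_mem=1024M
        module /vmlinuz-2.6.18-128.2.1.4.9.el5xen ro root=LABEL=/
        module /initrd-2.6.18-128.2.1.4.9.el5xen.img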

In addition to the standard configuration options, there are several additional actions that can be performed by selecting an action from the More Actions: dropdown menu. One of the most powerful features under these options is Live Migration. As detailed earlier in this chapter, this option enables you to move a running guest from one Oracle VM Server to another. However, it is important to ensure that either the hardware within the hosts is identical or that the systems support a feature to enable migration between different platforms, such as Flex Migration on Intel platforms. Additionally, it is important to reiterate that Live Migration should not be performed while a RAC instance is running in the guest, because doing so will cause the node to be ejected from the cluster. Instead, it is necessary to stop the instance and restart it after the migration has taken place.
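
For example, assuming the database is named PROD with instance PROD1 running on the guest to be migrated (the names here are illustrative), the instance can be stopped before the migration and restarted afterwards with srvctl, with the ASM instance and Clusterware handled in the same manner where required:

[oracle@london1 ˜]$ srvctl stop instance -d PROD -i PROD1
[oracle@london1 ˜]$ srvctl start instance -d PROD -i PROD1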

After selecting the Live Migration option, on the Migrate To page select the system to migrate to and press Next. On the Confirm Information page, press Confirm, and the guest is migrated, as shown in Figure 5-32.

Live Migration

Figure 5.32. Live Migration

The time required to migrate is proportional to the memory allocated to the guest. When refreshing the Oracle VM Manager screen, the Server Name column will display the initial server first, and then the destination server. Finally, it will show the status as Running on the new server.

Oracle VM Manager CLI

In addition to the functionality provided by the Oracle VM Manager's graphical environment, if you have installed and configured the CLI (as explained previously in this chapter), then you can also accomplish the same management tasks directly from the command line. The CLI provides a shell mode, and logging in with the username and password configured during the Oracle VM Manager installation results in an interactive prompt:

[root@london5 ˜]# ovm -u admin -p admin shell
Type "help" for a list of commands.
ovm>

The help command provides details on the CLI commands available. At the top level, the help command lists all the available commands. Or, if given the argument for a particular subcommand, the command lists the next level of command options. For example, you can detail the subcommands for Server Pool management, as in this example:

ovm> help svrp
Server pool management:

svrp conf         ---   Configure a server pool
svrp del          ---   Delete a server pool
svrp info         ---   Get server pool information
svrp ls           ---   List server pools
svrp new          ---   Create a new server pool
svrp refresh      ---   Refresh all server pools
svrp restore      ---   Restore server pool information
svrp stat         ---   Get server pool status

"help <subcommand>" displays help message for that subcommand.
"help all" displays complete list of subcommands.

These subcommands can then be used to manage the environment. For example, this snippet shows the details of the Server Pool:

ovm> svrp ls
Server_Pool_Name Status HA
PRORAC           Active Enabled

The vm commands can be used to manage the guest virtual machines, while the vm ls command is available to query the configured guest environments:

ovm> vm ls
Name    Size(MB) Mem  VCPUs Status  Server_Pool
london1 6205     1024 1     Running PRORAC
london2 6205     1024 1     Running PRORAC

In the same vein, the vm info command provides full details on a particular guest:

ovm> vm info -n london1 -s PRORAC
                           ID: 12
         Virtual Machine Name: london1
         Virtual Machine Type: Paravirtualized
             Operating System: Oracle Enterprise Linux 5 64-bit
                       Status: Running
                   Running on: londonvs1
                     Size(MB): 6205
              Memory Size(MB): 1024
      Maximum Memory Size(MB): 1024
                  Virtual CPUs: 1
     VCPU Scheduling Priority: Intermediate
          VCPU Scheduling Cap: High
                  Boot Device: HDD
              Keyboard Layout: en-us
             High Availability: Enabled
         PVDriver Initialized: False
             Preferred Server: Auto
                   Group Name: My Workspace
                  Description: OEL5 PVM Template

However, the CLI is not restricted to only observing the configured environment. It can also be used to execute commands available from within the Oracle VM Manager graphical environment. For example, the following command can be used to perform live migration:

ovm> vm mig -n london1 -s PRORAC
Migrating.

It is important to note that the graphical manager will show a change in status when actions are taken with the CLI. For example, when using the preceding command, the migration initiated with the CLI will be observed as being in progress within Oracle VM Manager. This means you can use both the Oracle VM Manager and the CLI; they are not mutually exclusive.

The xm Command-Line Interface

In addition to Oracle VM Manager, there are a number of command-line tools that enable interaction with the Xen environment directly on the Oracle VM Server itself. These tools enable the creation, deletion, and management of domains. The most common management tools are xm and virsh, which offer similar functionality and different means of achieving the same goals. We will focus on the xm tool in this chapter because xm is dedicated to managing Xen domains, whereas virsh, as part of the libvirt virtualization toolkit, is a more general-purpose tool. The xm CLI is available to manage domains when logged into Dom0 with root user privileges.

Displaying Information

For a quick reference of the xm commands available, you can use xm help -l or xm help --long; these options display both the commands available and a description of each:

[root@londonvs1 ˜]# xm help -l
Usage: xm <subcommand> [args]

Control, list, and manipulate Xen guest instances.

xm full list of subcommands:

 console              Attach to <Domain>'s console.
 create               Create a domain based on <ConfigFile>.
 new                  Adds a domain to Xend domain management
 delete               Remove a domain from Xend domain management.
 destroy              Terminate a domain immediately.
 domid                Convert a domain name to domain id.
 domname              Convert a domain id to domain name.
 dump-core            Dump core for a specific domain.
 list                 List information about all/some domains.
...
<Domain> can either be the Domain Name or Id.
For more help on 'xm' see the xm(1) man page.
For more help on 'xm create' see the xmdomain.cfg(5)  man page.

In navigating with the xm CLI, the first step is to discover information about the running environment. The highest level command to display details on running domains is xm list:

[root@londonvs1 ˜]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
12_london1                                   1  1024     1     -b----      0.6
30_london2                                   2  1024     1     -b----      0.2
Domain-0                                     0   553     8     r-----   7044.0

The xm list output shows the name of the domain, the domain ID, the memory and number of VCPUs currently allocated, and the total run time. The state column shows the current state of the domain, which in normal operations will be either r for running or b for blocked. The blocked state signifies a wait state, such as waiting for an I/O interrupt or, more often, a sleep state for an idle domain. Dom0 should always be in a running state because it is the domain from which the xm list command is run. The other states of p for paused, c for crashed, and d for dying correspond to unavailable domains. These might be unavailable for a couple of reasons: first, in response to xm commands such as xm pause or xm destroy for the paused and dying states, respectively; second, due to an unplanned stoppage, in the case of a crash.

More detailed information, such as the domain configuration parameters, can be shown for the running domains. You do this by displaying the list in long format with the command xm list -l, as shown in the following example:

[root@londonvs1 ˜]# xm list -l | more
(domain
    (domid 1)
    (on_crash restart)
    (uuid 6b0723e6-b2f0-4b29-a37c-1e9115798548)
    (bootloader_args -q)
    (vcpus 1)
    (name 12_london1)
    (on_poweroff destroy)
    (on_reboot restart)
    (cpus (()))
    (bootloader /usr/bin/pygrub)
    (maxmem 1024)
    (memory 1024)
    (shadow_memory 0)
    (features )
    (on_xend_start ignore)
    (on_xend_stop ignore)
    (start_time 1260894910.44)
    (cpu_time 0.626292588)
    (online_vcpus 1)
...

For information about the Oracle VM Server itself as opposed to the running domains, the command xm info displays information such as the processor and memory configuration of the host:

[root@londonvs1 ˜]# xm info
host                   : londonvs1
release                : 2.6.18-128.2.1.4.9.el5xen
version                : #1 SMP Fri Oct 9 14:57:31 EDT 2009
machine                : i686
nr_cpus                : 8
nr_nodes               : 1
cores_per_socket       : 4
threads_per_core       : 1
cpu_mhz                : 2660
hw_caps                : bfebfbff:20100800:00000000:00000140:0004e3bd:00000000:00000001:00000000
virt_caps              : hvm
total_memory           : 16378
free_memory            : 13579
node_to_cpu            : node0:0-7
node_to_memory         : node0:13579
xen_major              : 3
xen_minor              : 4
xen_extra              : .0
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32
hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
...

The xm dmesg command is the primary resource for Xen boot information. The command shows the version of Xen, which can be useful for cross-referencing whether known Xen features are supported in a particular version of Oracle VM:

[root@londonvs1 ˜]# xm dmesg

(XEN) Xen version 3.4.0 ([email protected]) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) Fri Oct  2 12:01:40 EDT 2009
(XEN) Latest ChangeSet: unavailable
(XEN) Command line: dom0_mem=553M
...

You can use xm dmesg to learn additional information about the configuration. For example, you can use it to get detailed processor information, including feedback on whether hardware virtualization is enabled:

(XEN) HVM: VMX enabled

You can also use it to determine the scheduler being used:

(XEN) Using scheduler: SMP Credit Scheduler (credit)

For performance details, you can use either the xm top or xentop command to display runtime information on the domains, including the resources they are using for processor, memory, network, and block devices. The information is displayed in a format similar to what you see with the command top when it is run in a native Linux environment:

xentop - 17:32:47   Xen 3.4.0
3 domains: 1 running, 2 blocked, 0 paused, 0 crashed, 0 dying, 0 shutdown
Mem: 16771152k total, 2865872k used, 13905280k free    CPUs: 8 @ 2660MHz
      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR SSID
12_london1 --b---          0    0.0    1048576    6.3    1048576    6.3     1    2        0     5708    4        0        0       72    0
30_london2 --b---          0    0.0    1048576    6.3    1048576    6.3     1    2        2      345    1        0        4      143    0
  Domain-0 -----r       7061    6.8     566272    3.4   no limit    n/a     8    0        0        0    0        0        0        0    0

  Delay  Networks  vBds  VCPUs  Repeat header  Sort order  Quit

You can use the xm console command to connect to the console of a running domain. For example, you can combine this command with the argument of the domain id from xm list to connect to the console of that VM directly from the command line:

[root@londonvs1 ˜]# xm console 2

Enterprise Linux Enterprise Linux Server release 5.3 (Carthage)
Kernel 2.6.18-128.0.0.0.2.el5xen on an x86_64

London1 login: root
Password:
[root@london1 ˜]#

Managing Domains

Working under the /OVS/running_pool/ directory, you can start, stop, pause, and migrate domains. If the Xenstore is unaware of a domain, you need to run the xm create command with the configuration file as an argument to both import the configuration and start the domain:

[root@londonvs1 12_london1]# xm create ./vm.cfg
Using config file "././vm.cfg".
Started domain 12_london1 (id=3)

Similarly, a domain can be destroyed, as in the following example:

[root@londonvs1 12_london1]# xm destroy 12_london1

Under the default Oracle VM Server configuration, before you can start a domain (as opposed to creating one), you must import it. For example, attempting to start a domain shows that Xen is not aware of the configuration:

[root@londonvs1 12_london1]# xm start 12_london1
Error: Domain '12_london1' does not exist.

The command xm new imports the configuration and stores the persistent configuration under the directory /var/lib/xend/domains. For example, importing the configuration file of the domain london1 creates the persistent configuration information for this domain. Doing so also enables the domain to be started by name:

[root@londonvs1 12_london1]# xm new ./vm.cfg
Using config file "././vm.cfg".
[root@londonvs1 12_london1]# xm start 12_london1

The corresponding command to delete this configuration is xm delete. However, deletion can only take place once a domain has been halted. Graceful shutdowns and reboots of guest operating systems in domains can be achieved with xm shutdown and xm reboot, as opposed to destroying the underlying domain. A domain can be paused and resumed in a live state using xm pause and xm unpause. That said, the domain continues to hold its existing configuration in memory, even while paused:

[root@londonvs1 12_london1]# xm pause 12_london1
[root@londonvs1 12_london1]# xm list 12_london1
Name              ID   Mem VCPUs      State   Time(s)
12_london1         4  1024     1     --p---      7.7
[root@londonvs1 12_london1]# xm unpause 12_london1

Similar commands are xm save and xm restore; in this case, however, the domain's state is saved to storage, and it no longer consumes memory resources. It is also possible to migrate domains between Oracle VM Servers at the command line using the xm migrate command. Before doing so, you should enable the following settings in xend-config.sxp:

(xend-relocation-server yes)
(xend-relocation-port 8002)
(xend-relocation-hosts-allow '')

By default, a migration will pause the domain, relocate it, and then unpause it. However, with the --live argument, the migration will stream memory pages across the network and complete the migration without an interruption to service. Note that it is crucial that this migration not be performed while a RAC instance is operational; the database instance, the respective ASM instance, and the Clusterware should be shut down before migration takes place to prevent an Oracle RAC cluster node eviction. Assuming it is safe to do so, the following command initiates live migration for a domain:

[root@londonvs1 ˜]# xm migrate 30_london2 londonvs2 --live

The service continues to operate on the original server while the domain is in a blocked and paused state on the target server:

[root@londonvs1 ˜]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
12_london1                                   5  1024     1     -b----      0.5
Domain-0                                     0   553     8     r-----   7172.0
migrating-30_london2                         6  1024     1     -b----      0.0

Managing Resources

As discussed previously in this chapter, the scheduler is important in an Oracle VM environment because Dom0 is subject to the same scheduling as the other guest domains. However, Dom0 is also responsible for servicing physical I/O requests, which means careful management of scheduling can ensure optimal levels of throughput across all domains. Management of the scheduler should be considered in association with the other xm commands for managing CPU resources, namely xm vcpu-list, xm vcpu-set, and xm vcpu-pin. The first of these commands details the virtual CPUs configured on the system; the following listing shows a default configuration of Dom0 on a system with eight physical processor cores:

[root@londonvs1 ˜]# xm vcpu-list
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
Domain-0                             0     0     6   r--    1576.7 any cpu
Domain-0                             0     1     7   -b-     438.6 any cpu
Domain-0                             0     2     6   -b-     481.5 any cpu
Domain-0                             0     3     0   -b-     217.9 any cpu
Domain-0                             0     4     1   -b-     137.6 any cpu
Domain-0                             0     5     5   -b-     152.3 any cpu
Domain-0                             0     6     0   -b-     164.6 any cpu
Domain-0                             0     7     3   -b-     102.1 any cpu
12_london1                           2     0     4   -b-      36.9 any cpu
12_london1                           2     1     3   -b-      23.9 any cpu
30_london2                           4     0     1   -b-       5.1 any cpu
30_london2                           4     1     6   -b-       5.0 any cpu

The second of the xm commands for modifying processor resources, xm vcpu-set, lets you increase or reduce the number of VCPUs assigned to a domain. The following example decreases the number of virtual CPUs for a domain from the original setting of 2 to 1; the second VCPU then displays a paused state and is no longer allocated to any processor:

[root@londonvs1 ˜]# xm vcpu-set 12_london1 1
[root@londonvs1 ˜]# xm vcpu-list 12_london1
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
12_london1                           2     0     0   -b-      60.7 any cpu
12_london1                           2     1     -   --p      31.9 any cpu

The number of VCPUs can be increased, although the number cannot be increased beyond the original setting when the domain was started:

[root@londonvs1 ˜]# xm vcpu-set 12_london1 2
[root@londonvs1 ˜]# xm vcpu-list 12_london1
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
12_london1                           2     0     4   -b-      61.7 any cpu
12_london1                           2     1     0   -b-      31.9 any cpu

As you have already seen, the VCPUs do not by default correspond directly to physical cores; this is illustrated by the CPU Affinity column, which shows that the VCPUs can run on any cpu. To set CPU affinity, the command xm vcpu-pin can be used. This command is of particular importance because it is the approved method for implementing hard partitioning for the purposes of Oracle Database licensing: CPUs are pinned to particular guest domains and licensed accordingly, in contrast to licensing all of the CPUs physically installed in the server. The xm vcpu-pin command takes as arguments the domain to pin CPUs in, the VCPU to pin, and the physical CPU to pin it to, making it possible to dedicate cores to a particular environment. For example, the following listing shows VCPU 1 being pinned to physical CPU 1 for a domain:

[root@londonvs1 ˜]# xm vcpu-pin 12_london1 1 1
[root@londonvs1 ˜]# xm vcpu-list 12_london1
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
12_london1                           2     0     3   -b-      73.1 any cpu
12_london1                           2     1     1   -b-      38.8 1

The same configuration can also be set in the vm.cfg configuration file at the time a domain starts. For example, the following configuration gives the domain two VCPUs pinned to physical cores 1 and 2:

vcpus=2
cpus=[1,2]

The VCPU listing displays the CPU affinity. It is important to note that the pinning is assigned across the given cores and not on a 1-to-1 basis. Therefore, the listing in this case could show both VCPUs running on physical core 1, both on physical core 2, or one on each core, as in this example:

[root@londonvs1 12_london1]# xm vcpu-list 12_london1
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
12_london1                           5     0     1   -b-       2.4 1-2
12_london1                           5     1     1   -b-       3.1 1-2
[root@londonvs1 12_london1]# xm vcpu-list 12_london1
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
12_london1                           5     0     1   r--       5.2 1-2
12_london1                           5     1     2   r--       5.7 1-2
[root@londonvs1 12_london1]# xm vcpu-list 12_london1
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
12_london1                           5     0     2   -b-       5.7 1-2
12_london1                           5     1     1   -b-       6.5 1-2

It is important to note that the numbering of CPUs based on the physical core count starts at 0; for example, CPUs 0-7 will be available in an eight-core system. In these examples, physical CPU 0 has been reserved for Dom0. In practice, the number of physical CPU cores to assign to Dom0 will depend on the system configuration and the processors in question; as I/O demands increase, additional resources should be assigned to Dom0 to ensure optimal throughput. For a RAC configuration, one or two physical cores assigned to Dom0 should be sufficient. However, I/O demands that exceed the throughput of a single 4Gbit Fibre Channel HBA may require more CPU resources for Dom0. Only load testing a particular environment can determine exactly the right settings.
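
For example, a minimal sketch of dedicating a single physical core to Dom0 with the commands already covered would be to reduce Dom0 to one VCPU and pin it to physical CPU 0; to complete the partitioning, the guest domains should also exclude CPU 0 through the cpus parameter in their vm.cfg files, as shown earlier:

[root@londonvs1 ˜]# xm vcpu-set Domain-0 1
[root@londonvs1 ˜]# xm vcpu-pin Domain-0 0 0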

Coupled with the pinning of CPUs, the command xm sched-credit is also available to fine-tune CPU resource allocation between domains. If run without arguments, the command shows the current configuration:

[root@londonvs1 ˜]# xm sched-credit
Name                                ID Weight  Cap
Domain-0                             0    256    0
12_london1                           5    256    0
30_london2                           4    256    0

The output in the preceding example shows that each domain is assigned a weight and a cap value; by default, a weight of 256 and a cap of 0 are assigned to each domain. The weight is relative, and it determines the share of CPU cycles a domain will receive. For example, a domain with a weight of 256 will proportionally receive twice as many cycles as a domain with a weight of 128, and half as many cycles as a domain with a weight of 512. The cap value is given in percentage terms, and it defines the maximum percentage of CPU cycles that a domain may consume from a physical processor. The default value of 0 means that no cap is set for the domain. The percentage is expressed in terms of a single processor core on the system; therefore, it can exceed 100, which sets the limit above the resources provided by just one core. You can display the weight and cap for a particular domain, as in this example:

[root@londonvs1 ˜]# xm sched-credit -d 12_london1
Name                                ID Weight  Cap
12_london1                           5    256    0

You increase the weight of a domain relative to other domains like this:

[root@londonvs1 ˜]# xm sched-credit -d 12_london1 -w 512
[root@londonvs1 ˜]# xm sched-credit -d 12_london1
Name                                ID Weight  Cap
12_london1                           5    512    0

Similarly, the -c argument is used to modify the cap.
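
For example, assuming you wanted to prevent a domain from consuming more than the equivalent of two full cores, a sketch of this approach would be to set the cap to 200:

[root@londonvs1 ˜]# xm sched-credit -d 12_london1 -c 200
[root@londonvs1 ˜]# xm sched-credit -d 12_london1
Name                                ID Weight  Cap
12_london1                           5    512  200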

As discussed previously in this chapter, memory can be dynamically allocated to domains with the balloon memory management driver. The interface to this method is provided by the xm mem-set command, which takes two arguments: the domain whose memory allocation is to be modified and the amount of memory to assign, given in megabytes. In this example, viewed when logged into the guest domain as root, the guest has been allocated 4GB of memory:

[root@london1 ˜]# free
             total       used       free     shared    buffers     cached
Mem:       4096000    3219304     876696          0      65004    2299532
-/+ buffers/cache:     854768    3241232
Swap:      3047416          0    3047416

This example reduces the memory:

[root@londonvs1 ˜]# xm mem-set 12_london1 3500
[root@londonvs1 ˜]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   834     8     r-----   3299.2
12_london1                                   5  3500     2     -b----     63.1
30_london2                                   4  4000     2     -b----     16.0

In this result, you can see that the impact is immediately registered in the guest environment:

[root@london1 ˜]# free
             total       used       free     shared    buffers     cached
Mem:       3584000    3219552     364448          0      65004    2299676
-/+ buffers/cache:     854872    2729128
Swap:      3047416          0    3047416

The corresponding values in the vm.cfg file are maxmem and memory; the following values set these parameters to 6GB and 4GB, respectively:

maxmem = 6000
memory = 4000

Note that both of these values are dynamic; the corresponding xm mem-max and xm mem-set commands can be used together to vary the memory assigned to a domain beyond its original settings:

[root@londonvs1 ˜]# xm mem-max 12_london1 8000
[root@londonvs1 ˜]# xm mem-set 12_london1 6500
[root@londonvs1 ˜]# xm list
Name                                       ID   Mem VCPUs      State   Time(s)
Domain-0                                    0   834     8     r-----   3300.9
12_london1                                  5  6500     2     -b----     77.4
30_london2                                  4  4000     2     -b----     16.5

Summary

In this chapter, we have looked at the new possibilities presented by RAC when complemented with Oracle VM's virtualization. We began by covering the important concepts and reviewing the design considerations. Next, we looked at Oracle VM and the guest installation, and then at how to handle configuration for RAC guests, with a focus on enabling high availability at the virtualization layer. Finally, we detailed some of the options available for managing an Oracle VM virtualized environment.
