Chapter 1. Introduction

In this chapter, we will discuss reasons for deploying Oracle Real Application Clusters to protect your database-based application from an unplanned outage and to give your application high availability, fault tolerance, and many other benefits that cannot be obtained from running your application against a single-instance Oracle database. This chapter will also cover the history of RAC and the evolution of Oracle clustering products, culminating in the product we know today.

Introducing Oracle Real Application Clusters

Oracle Real Application Clusters (RAC) is an option that sits on top of the Oracle database. Using a shared disk architecture, the database runs across a set of computing nodes, offering increased availability, allowing applications to scale horizontally, and improving manageability at a lower cost of ownership. RAC is available for both the Enterprise Edition and the Standard Edition of the Oracle database.

When users think of RAC, Oracle also wants them to think of the grid, where the grid stands for having computing power as a utility. With Oracle's tenth major release of the database, the focus changed from the i for Internet that users were so familiar with (e.g., Oracle 9i) to a g for grid computing (e.g., Oracle 10g). The trend in the industry away from comparatively expensive proprietary SMP servers toward industry-standard hardware running the Linux operating system seems to support the idea that users want to treat computing power as a utility. And indeed, some of the largest physics experiments conducted today, including those that rely on the Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN) in Geneva, use industry-standard hardware and Oracle RAC for data processing.

The RAC option has been available since Oracle 9i Release 1 in the summer of 2001. Prior to that, the clustered Oracle database option was known as the Oracle Parallel Server option. RAC offers fundamental improvements over Oracle Parallel Server; in particular, the introduction of Cache Fusion has improved application scalability and inter-instance communication and has propelled RAC into mainstream use.

In a study published by the Gartner Group, analysts suggested that Oracle RAC in 9i Release 1 required skilled staff from various departments for successful RAC implementations. At the time, the analysts rated RAC as a reliable option, allowing users to increase scalability and availability; however, they also said that its complexity was an inhibitor to widespread adoption.

Since then, Oracle has worked hard to address these concerns. Key new features were added in Oracle 10g Release 1 that built on the successful components introduced previously. For example, Oracle Automatic Storage Management provided the functionality of a clustered logical volume manager, removing the dependency that required Oracle users to license such functionality from third-party software vendors (and thereby increasing the desire of Oracle's customers to implement it).

10g Release 1 also included Oracle Clusterware, a unified, portable clustering layer that performs tasks that previously often required third-party clustering software. Prior to 10g Release 1, Oracle's own cluster manager was available only on Windows and Linux; 10g Release 1 marked the release of an Oracle clustering layer for all major platforms. Of course, non-Oracle cluster-management software can still be used if needed, depending on Oracle's certification with a given stack.

These successful components have been further enhanced with the 11.2 release of the database. Increasing emphasis has been put on computing as a utility. Computing as a utility in this context means less administrator intervention and more automatically performed actions. For example, the new Grid Plug And Play deployment option allows users to easily add and remove nodes from a cluster. Clusters can be logically divided into subunits referred to as server pools. Such a server pool can be declared the home for a RAC database—shrinking and expanding the number of servers in the server pool automatically causes the database to adapt to the new environment by adding or removing database instances.

To summarize, Oracle 11g Release 2 Enterprise Edition RAC promises the following benefits to users:

  • High availability: The shared-everything architecture guarantees that node failures do not imply loss of service. The remaining nodes of the cluster will perform crash recovery for the failed instance, guaranteeing availability of the database.

  • Scalability: Multiple nodes allow an application to scale beyond the limits imposed by single-node databases.

  • Manageability: Multiple databases can be consolidated into a RAC cluster.

  • Reduced cost of ownership: RAC can be deployed on industry standard hardware, offsetting the licensing cost with lower hardware cost.

In addition to the aforementioned features, Oracle 11g Release 2 also includes a product called RAC One Node. Oracle has recognized the fact that some RAC deployments have been installed purely for high availability; it has also discovered that other (virtualization) products are increasingly being used. To counter that trend, RAC One Node builds on the RAC technology stack: Oracle Clusterware, Oracle Automatic Storage Management, and the Oracle database. Oracle RAC One Node will be discussed in more detail in Chapter 3.

Examining the RAC Architecture

Figure 1-1 provides an overview of the RAC technology stack (see Chapter 3 for a much more in-depth discussion of the RAC architecture).


Figure 1-1. The Oracle Real Application Clusters (RAC) software stack

As you can see in Figure 1-1, Oracle RAC is based around the following software components:

  • Oracle RAC runs on top of an operating system.

  • Oracle RAC builds on the Oracle software stack.

    • Oracle recommends installing Grid Infrastructure, the clustering software layer, with a dedicated user, usually grid. This account has to be created at the operating system level.

    • In the releases preceding 11g Release 2, it was possible to install the storage layer under a dedicated operating system account. With 11g Release 2, Oracle began bundling its cluster-aware logical volume manager, Automatic Storage Management (ASM), into the cluster software stack. Note that this approach no longer allows the strict separation of duties that was possible before. The Oracle RDBMS binaries are traditionally installed under the oracle account.

  • Depending on the choice of storage, Oracle provides libraries, delivered in the form of RPMs, to facilitate the discovery and management of shared storage (see the sketch following this list).

  • The Oracle cluster-aware layer is a prerequisite for running clustered Oracle databases. It must be installed before the database binaries are installed.

  • Oracle Real Application Clusters requires shared storage for the database files, such as online redo logs, control files, and data files. Various options are available for users to choose from. It appears that Oracle's strategic choice is to use ASM, its own cluster-aware logical volume manager.

  • Finally, the database binaries are installed.

  • A database is created after the software stack is installed and configured.
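
For example, when ASM is used on Linux, Oracle's ASMLib support packages are one such set of libraries delivered as RPMs. The following is only a rough sketch of how they might be installed and used to stamp a shared partition for ASM; package versions, the device path, and the disk label are illustrative, and the exact steps should be taken from the documentation for your platform and release.

```bash
# Sketch: installing ASMLib and stamping a shared disk (run as root).
# Package versions, device names, and the disk label are illustrative.
rpm -Uvh oracleasm-support-2.1.*.rpm \
         oracleasmlib-2.0.*.rpm \
         oracleasm-$(uname -r)-*.rpm

# Configure and load the kernel driver (sets owner/group of the ASM devices).
/usr/sbin/oracleasm configure -i
/usr/sbin/oracleasm init

# Stamp a shared partition so that every cluster node can discover it.
/usr/sbin/oracleasm createdisk DATA01 /dev/sdb1

# On the remaining nodes, rescan and list the stamped disks.
/usr/sbin/oracleasm scandisks
/usr/sbin/oracleasm listdisks
```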

From Oracle 10g Release 1 to Oracle 11g Release 1, Oracle's software components could be installed on certified cluster file systems such as Oracle's own OCFS2, a so-called shared Oracle home. Beginning with Oracle 11g Release 2, only the RDBMS software binaries can be installed as a shared home; Grid Infrastructure, the cluster foundation, can no longer be installed on a shared file system.

As also illustrated in Figure 1-1, you can see the following differences between a single-instance Oracle database and a two-node RAC:

  • A private interconnect is used for inter-node communication within the cluster; this interconnect relies on a private interconnect switch.

  • A public network is used for all client communication with the cluster.

  • To speed up detection of failed nodes, Oracle RAC employs virtual IP addresses as cluster resources. When a node fails, its virtual IP migrates to another node of the cluster. If that were not the case, clients would have to wait for TCP/IP timeouts (which can be very long) before trying the next node of the cluster. When migrated to another host, the virtual IP address can immediately signal that the node is down, triggering the client to try the next host in the local naming file.

  • Shared storage is required for the database files.

Deploying RAC

As we have seen, systems based on RAC offer a number of advantages over traditional single-instance Oracle databases. In the upcoming sections, we will explore such systems in more depth, focusing on the hallmarks of the RAC option: high availability, scalability, manageability and cost of ownership.

Maintaining High Availability

Compute clustering aims to provide system continuity in the event of (component) failure, thus guaranteeing high availability of the service. A multitude of ideas for dealing with the sudden failure of components have been developed over the past decades, and there is a lot of supporting research. Systems fail for many reasons. Most often, aging or faulty hardware leads to failures. However, operator errors, incorrect system specifications, improper configuration, and insufficient testing of critical application components can also cause systems to fail. The latter should be referred to as soft failures, as opposed to the hard failures mentioned previously.

Providing Fault Tolerance by Redundancy

The most common way to address hardware faults is to provide hardware fault tolerance through redundancy; this is common practice in IT today. Any so-called single point of failure (in other words, a component identified as critical to the system) should have adequate backup. Extreme examples lie in space travel: the space shuttles use four redundant computer systems running the same software, plus a fifth system with a different software release. Another example is automated control for public transport, where component failure could put lives at risk. Massive investment in methods and technology to keep hardware (processor cycles, memory, and so on) in sync is justified in such cases.

Today, users of Oracle RAC can use industry-standard components to protect against individual component failures; the brief interruption caused by instance recovery can usually be tolerated.

Most storage arrays are capable of providing various combinations of striping and mirroring of individual hard disks to protect against failure. Statistically, hard drives manufactured in the same batch are likely to fail at roughly the same time, so disk failure should be taken seriously when it happens. The connections between the array(s) and the database host should also be laid out in a redundant way, allowing multiple paths for the data to flow. This not only increases throughput; it also means that the failure of a host bus adapter or a SAN switch cannot bring down the system.

Of course, all critical production servers should also have redundancy for the most important internal components, such as power supply units. Ideally, components should be hot swappable, but this is becoming less of an issue in a RAC environment because servers can be easily added and removed from the cluster for maintenance, and there are few remaining roadblocks to performing planned maintenance in a rolling fashion.

One of the key benefits of Oracle RAC has always been its ability to provide a highly available database platform for applications. Oracle RAC uses a software layer to enable high availability; it accomplishes this by adding database instances that concurrently access a database. In the event of a node failure, the surviving node(s) can be configured to take over the workload from the failed instance. Again, it is important to design the cluster so that the surviving nodes can cope with the workload; otherwise, a complete loss of database service could follow an individual node failure.

Making Failover Seamless

In addition to adding database instances to mitigate node failure, Oracle RAC offers a number of technologies to make a node failover seamless to the application (and subsequently, to the end user), including the following:

  • Transparent Application Failover

  • Fast Connection Failover

Transparent Application Failover (TAF) is a client-side feature. The term refers to the failover and re-establishment of sessions in case of instance or node failures. TAF is not limited to RAC configurations; active/passive clusters can benefit equally from it. TAF can be defined through local naming in the client's tnsnames.ora file or, alternatively, as attributes of a RAC database service. The latter is the preferred way of configuring it. Note that this feature requires the use of the OCI libraries, so thin-client-only applications won't be able to benefit from it. With the introduction of the Oracle Instant Client, this problem can be alleviated somewhat by switching to the correct driver.
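
To make this more concrete, the following is a minimal sketch of a client-side TAF entry; the net service name, virtual host names, database service name, and retry values are invented for illustration.

```bash
# Sketch: append a TAF-enabled net service name to the client's tnsnames.ora.
# Host names, service name, and retry settings are illustrative only.
cat >> $TNS_ADMIN/tnsnames.ora <<'EOF'
REPORTING =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = ON)
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = london1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = london2-vip)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = reporting)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))
    )
  )
EOF
```

For the preferred, service-based approach, comparable failover attributes (failover type, method, retries, and delay) can be set when the service is created or modified with srvctl or the DBMS_SERVICE package; the exact options depend on the release in use.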

TAF can operate in two ways: it can either restore a session or re-execute a select statement in the event of a node failure. While this feature has been around for a long time, Oracle's Net Manager configuration assistant doesn't provide support for setting up client-side TAF. Also, TAF isn't the most elegant way of handling node failures because any in-flight transactions will be rolled back; TAF can resume running select statements only.
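
Whether a session is protected by TAF, and whether it has already failed over, can be checked from the database itself; the following query is a simple sketch using columns available in gv$session.

```bash
# Sketch: check TAF settings and failover status of current sessions.
# Assumes the environment of the Oracle software owner is already set.
sqlplus -s / as sysdba <<'EOF'
SELECT inst_id, username, failover_type, failover_method, failed_over
FROM   gv$session
WHERE  username IS NOT NULL;
EOF
```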

The Fast Connection Failover feature provides a different way of dealing with node failures and other types of events published by the RAC high availability framework (also known as Fast Application Notification, or FAN). It is more flexible than TAF.

Fast Connection Failover is currently supported with Oracle's JDBC implicit connection cache, Oracle's Universal Connection Pool, and Oracle Data Provider for .NET session pools, as well as OCI and a few other tools such as CMAN. When registered with the framework, clients can react to events published by it: instead of polling the database to detect potential problems, clients will be informed by way of a push mechanism; all sessions pertaining to a failed node will be marked as invalid and cleaned up. To compensate for the reduction in the number of available sessions, new sessions will be created on another cluster node. FAN uses the Oracle Notification Service (ONS) process or AQ to publish its events. ONS is created and configured by default during a RAC installation on all of the RAC nodes.

An added benefit: It's possible to define user callouts on the database node using FAN events to inform administrators about node up/down events.
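
Such a callout is simply an executable placed in the Grid Infrastructure callout directory (typically $GRID_HOME/racg/usrco/); Clusterware invokes every script found there and passes the FAN event payload as arguments. The script below is a hypothetical sketch: the log file location and mail recipient are invented.

```bash
#!/bin/bash
# Hypothetical FAN callout; deploy as an executable file in
# $GRID_HOME/racg/usrco/ on each cluster node. The FAN event payload
# (for example "NODE VERSION=1.0 host=... status=nodedown ...") arrives
# as the script's arguments. Log location and recipient are illustrative.
EVENT="$*"
echo "$(date '+%Y-%m-%d %H:%M:%S') ${EVENT}" >> /var/log/fan_events.log

case "${EVENT}" in
  *status=nodedown*|*status=down*)
    echo "${EVENT}" | mail -s "FAN event: node or service down" dba-team@example.com
    ;;
esac
```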

Putting the Technology Stack in Perspective

A word of caution at this stage: Focusing on the technology stack up to the database should never be anything other than the first step on the way to a highly available application. Other components in the application stack also need to be designed to allow for the failure of components. There are cases where well-designed database applications adhering to all the criteria mentioned previously are critically flawed because they use only a single network switch for all incoming user traffic. If the switch fails, such an application becomes inaccessible to end users, even though the underlying technology stack as a whole is fully functional.

Defining Scalability

Defining the term scalability is a difficult task, and an all-encompassing definition is probably out of scope for this book. The term is used in many contexts, and many database administrators and developers understand it differently. For RAC systems, we normally consider a system to scale if the application's response time or other key measurements remain constant as the workload increases.

Scoping Various Levels of Scalability

Similar to a single point of failure, the weakest link in an application stack—of which the database is really just one component—determines its overall throughput. For example, if your database nodes are connected using Infiniband for storage and the interconnect, but the public traffic coming in to the web servers only uses 100Mbit Ethernet, then you may have a scalability problem from the beginning, even if individual components of the stack perform within the required parameters.

Therefore, we find that scalability has to be considered from all of the following aspects:

  • Hardware scalability

  • Storage scalability

  • Operating system scalability

  • Database scalability

  • Application scalability

You will learn more about each of these scalability levels in later chapters of this book.

Scaling Vertically vs. Horizontally

Additional resources can be added to a system in two different ways:

  • Scale up: Before clustered computing became a widespread option, database servers were usually upgraded or extended to offer better performance. Often, big iron was purchased with some of the CPU sockets unpopulated or with other headroom that allowed for growth. When needed, components could be replaced or added, all within the same system image. This is also known as scaling vertically.

  • Scale out: The design advantage RAC offers over SMP servers lies in the fact that additional nodes can be added to the cluster to increase the overall throughput, whereas even the most powerful SMP server will run out of processor sockets eventually. This is also known as scaling horizontally.

Please bear in mind that, for certain workloads and applications, RAC might not be the best option because of the overhead associated with keeping the caches in sync and maintaining global locks. The CPU processing power available in industry standard hardware continues to increase at an almost exponential rate due to the fundamentals of Moore's Law (see Chapter 4 for more information about this topic).

Changing the underlying hardware can in principle have three different outcomes:

  • The throughput increases.

  • The throughput remains constant.

  • The throughput decreases.

Architects aim for linear scalability, where throughput grows in proportion to the resources added; in other words, doubling the number of nodes should also double the throughput of the application. Technical overhead, such as cache synchronization and global locking, prevents exact linear scalability in RAC; however, a well-designed application (one that uses business logic inside the database, bind variables, and other techniques equally applicable to single-instance Oracle systems) will most likely benefit greatly from RAC.

Generally speaking, the scalability achieved with RAC varies according to the application and database design.

Increasing Manageability

The cost of licensing RAC can be partly offset by the improved manageability it offers. For example, the technology behind the RAC technology stack makes it an ideal candidate for database consolidation. Data center managers are increasingly concerned with making optimal use of their available resources, especially with the more recent focus on and interest in green IT.

Achieving Manageability Through Consolidation

Server consolidation comes in many forms. Current trends include the consolidation of databases and their respective applications through virtualization or other forms of physically partitioning powerful hardware. Oracle RAC offers a very interesting avenue for Oracle database server consolidation. One of the arguments used in favor of consolidation is the fact that it is more expensive (and not only from a licensing point of view) to support a large number of small servers, each with its own storage and network connectivity requirements, than to support a large cluster with one or only a few databases. Also, users can get better service-level agreements, monitoring, and backup and recovery from a centrally managed system. Managers of data centers also like to see their servers working and well utilized. Underutilized hardware is often the target of consolidation or virtualization projects.

Several large companies are implementing solutions where business units can request access to a database, usually in the form of a schema that can then be provisioned with varying levels of service and resources, depending on the requirements. It is possible to assume a scenario where three clusters are employed for Gold, Silver, and Bronze levels of service. The infrastructure department would obviously charge the business users different amounts based on the level and quality of the service provided. A very brief description of such a setup might read as follows:

  • Gold: This cluster would be very closely monitored. It would also include multiple archive log destinations, standby databases, and 24×7 coverage by DBAs. Flashback features would be enabled, and multiple standby databases would be available in data centers located in secure remote locations. Frequent backups of the database and archived logs would guarantee optimal recoverability at any time. Such a cluster would be used for customer-facing applications that cannot afford downtime, and each application would be configured so that it is protected against node failures.

  • Silver: This cluster would offer a similar level of service, but it would be limited to business hours. It would be used for similarly important applications, with the exception that there would be no users connecting to them after business hours.

  • Bronze: This cluster would be intended for quality assurance, development, or test environments. Response times from the DBA team would be longer than for the Silver or Gold levels, and there wouldn't be backups because frequent refresh operations would allow testers and developers to roll out code.

The preceding examples don't represent strict implementation rules, obviously; your business requirements may be vastly different—hence an evaluation of your requirements should always precede any implementation.

Users of a database could specify their requirements in a very simple electronic form, making the provisioning of database access for applications quite easy and more efficient; this approach offers a high degree of automation.

Note that the Gold-Silver-Bronze scenario assumes that it doesn't matter for many applications if they have multiple schemas in their own database or share one database with other projects. The more static an application's data, the more suited that app is for consolidation.

A different approach to server consolidation is to have multiple databases run on the same cluster, instead of employing one database with multiple schemas. Tom Kyte's web site (http://asktom.oracle.com) includes an ongoing discussion where participants have been debating whether running multiple instances on the same physical host is recommended; the discussion is mostly centered on the fact that some Oracle background processes run in the real-time scheduling class, which could potentially starve other processes out of CPU time. Today's modern and powerful hardware, such as x86-64 processors with eight or more cores, has somewhat diminished the weight of such arguments.

Enabling Database Consolidation

Several features in the Oracle database help to make server consolidation successful:

  • The Resource Manager

  • Instance Caging

  • Workload management

The Resource Manager allows the administrator to use a variety of criteria to group users into a resource consumer group. A resource consumer group defines how many resources in a database can be assigned to users. Since Oracle 10, users can be moved into a lower resource consumer group when they cross the threshold for their allowed resource usage. Beginning with Oracle 11, they can also be upgraded once their calls are completed. This is especially useful in conjunction with connection pooling and web applications where one session can no longer be directly associated with an individual, as it was in the days of dedicated server connections and Oracle Forms applications. In the connection pooling scenario, the application simply grabs a connection out of the pool of available connections, performs its assigned task, and then returns the connection to the pool. Often these operations are very short in nature. Connection pooling offers a huge advantage over the traditional way of creating a dedicated connection each time a user performs an operation against the database, greatly reducing the overhead associated with establishing a dedicated server process.
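
As a sketch of how such a policy might be expressed, the following PL/SQL creates a consumer group and a plan whose directive demotes sessions once they exceed a threshold. All names and values are invented, and the exact semantics of the switch parameters should be verified against the documentation for your release.

```bash
# Sketch: a consumer group and plan that demote long-running work (run as SYSDBA).
sqlplus -s / as sysdba <<'EOF'
BEGIN
  DBMS_RESOURCE_MANAGER.create_pending_area();

  DBMS_RESOURCE_MANAGER.create_consumer_group(
    consumer_group => 'BATCH_GROUP',
    comment        => 'Demoted long-running work');

  DBMS_RESOURCE_MANAGER.create_plan(
    plan    => 'DAYTIME_PLAN',
    comment => 'Favour interactive work during business hours');

  -- Sessions in the default group are switched to BATCH_GROUP once they
  -- exceed the (illustrative) threshold.
  DBMS_RESOURCE_MANAGER.create_plan_directive(
    plan             => 'DAYTIME_PLAN',
    group_or_subplan => 'OTHER_GROUPS',
    comment          => 'Everything else',
    mgmt_p1          => 70,
    switch_group     => 'BATCH_GROUP',
    switch_time      => 60);

  DBMS_RESOURCE_MANAGER.create_plan_directive(
    plan             => 'DAYTIME_PLAN',
    group_or_subplan => 'BATCH_GROUP',
    comment          => 'Demoted batch work',
    mgmt_p1          => 10);

  DBMS_RESOURCE_MANAGER.validate_pending_area();
  DBMS_RESOURCE_MANAGER.submit_pending_area();
END;
/
EOF
```

In practice, a mapping (for example, by service name or Oracle user) and the corresponding switch privileges would also be configured so that sessions actually land in the intended consumer groups; that part is omitted here for brevity.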

Instance caging is a new Oracle 11.2 feature. It addresses a scenario where multiple databases run on the same cluster (instead of the single database/multiple schemas design discussed previously). In a nutshell, instance caging allows administrators to limit the number of CPUs available to a database instance by setting an initialization parameter. In addition, a resource manager plan needs to be active for the feature to take effect.
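
A minimal sketch of caging one instance follows; the plan name shown ships with recent releases, while the CPU count and instance name (SID) are purely illustrative.

```bash
# Sketch: cage one instance to four CPUs (run as SYSDBA on that instance).
sqlplus -s / as sysdba <<'EOF'
-- Any enabled resource manager plan activates the cage.
ALTER SYSTEM SET resource_manager_plan = 'DEFAULT_PLAN' SCOPE=BOTH SID='*';
-- Limit this particular instance (illustrative SID) to four CPUs.
ALTER SYSTEM SET cpu_count = 4 SCOPE=BOTH SID='PROD1';
EOF
```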

Finally, workload management allows you to logically subdivide your RAC using the concept of services. Services are a logical abstraction from the cluster, and they permit users and applications to connect to a specific number of nodes. Services are also vital for applications to recover from instance failure; a service can be defined to fail over to another node in case the instance it was running on has failed. Oracle allows the administrator to set up a list of nodes as the preferred nodes, the nodes to which the application preferably connects. Oracle also allows the administrator to specify available nodes in case one of the preferred nodes fails. Services can also be used for accounting. For example, you might use them to charge a business for the use of a cluster, depending on its resource consumption.
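
For illustration, the following sketch defines an administrator-managed service with preferred and available instances; the database, service, and instance names are invented.

```bash
# Sketch: a service preferring instances PROD1 and PROD2, with PROD3 available
# as a fallback. All names are illustrative.
srvctl add service -d PROD -s reporting -r PROD1,PROD2 -a PROD3
srvctl start service -d PROD -s reporting

# Check which instances currently offer the service.
srvctl status service -d PROD -s reporting
```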

Consolidating Servers

Server consolidation is a good idea, but it shouldn't be used excessively. For example, running the majority of business critical applications on the same cluster in the same data center is not a good idea.

Consolidation also requires input from many individuals, should the system have to switch to the Disaster Recovery (DR) site. Scheduled DR tests can also become difficult to organize as the number of parties increases. Last but not least, the more data is consolidated in the same database, the more difficult it becomes to perform point-in-time recoveries in cases of user error or data corruption, assuming there is a level of data dependence. If you have a situation where 1 product in 15 consolidated on a RAC system needs to revert to a particular point in time, it will be very difficult to get agreement from the other 14 products, which are perfectly happy with the state of the database and their data.

Assessing the Cost of Ownership

As discussed previously in the section on manageability, many businesses adopt RAC to save on their overall IT infrastructure cost. Most RAC systems in the UK are deployed on industry-standard components running the Linux operating system. This allows businesses to lower their investment in hardware, while at the same time getting more CPU power from their equipment than was possible a few years ago.

However, RAC can contribute considerably to the cost of the Oracle licenses involved, unless Standard Edition is deployed. On the other hand, Oracle Standard Edition doesn't include the Data Guard option, which means users must develop their own managed recovery solutions, including gap resolution.

Choosing RAC vs. SMP

Undeniably, the hardware cost of deploying a four-node RAC system based on industry-standard Intel x86-64 architecture is lower than the procurement of an SMP server based on a different processor architecture that is equipped with 16 CPUs. Before the advent of multicore systems, industry-standard servers were typically available with up to 8 CPU socket configurations, with each socket containing a single-processing core. Currently, systems are available with 64 cores in a single 8-socket x86-64 server that supports multiple terabytes of memory. Such configurations now enable Oracle single-instance processing capabilities on industry-standard hardware that was previously the domain of dedicated RISC and mainframe environments.

Further economies of scale could be achieved by using a standard-hardware model across the enterprise. Industry-standard x86-64 systems offer many features that modern databases need at a relatively low cost. Once the appropriate hardware platform is adopted, the IT department's Linux engineering team can develop a standardized system image to be distributed through local software repositories, making setup and patching of the platform very easy. Additionally, by using similar hardware for e-mail, file sharing, and databases, the cost of training staff, such as data center managers, system administrators, and (to a lesser degree) database administrators, can also be reduced. Taken together, these benefits also increase efficiency. Hardware maintenance contracts should also be cheaper in such a scenario because there is a much larger similar base for new systems and spares.

The final argument in favor of RAC is the fact that nodes can be added to the cluster on the fly. Technologies such as Grid Plug and Play introduced with Oracle 11.2 make this even simpler. Even the most powerful SMP server will eventually reach its capacity limit; RAC allows you to sidestep that problem by adding more servers.

Evaluating Service-Level Agreements

Many businesses have agreed on levels of service with other parties. Not meeting a contractually agreed level of service usually implies the payment of a fee to the other party. Unplanned downtime can contribute greatly to breaching service-level agreements, especially if the mean time to recovery (MTTR) is high. It is imperative that the agreed service levels are met at all times. Depending on the fees involved, the party offering the service might need to keep vendor support engineers on hot standby, so that failed components can be replaced or fixed as soon as they fail. Needless to say, that level of service comes at a premium.

The use of RAC can help reduce this cost. As described earlier in this introduction, Oracle RAC is a shared-everything environment, which implies that the failure of a node doesn't mean the complete loss of the service, as was the case in the earlier scenario that covered a single instance of Oracle. Further enhancements within the Oracle cluster layer make it truly possible to use computing as a utility. With server pools, hot spare servers can be part of the cluster without actively being used. Should a server pool running an Oracle RAC database fall below the minimum number of usable nodes, spare servers can be moved into the server pool, restoring full service in very little time. Grid Infrastructure is also able to take out a node from a different server pool with a lower priority to satisfy the minimum number of nodes requirement of the higher priority server pool; this enables powerful capacity management and is one of the many improvements offered by Oracle 11.2, which pushes the idea of grid computing to entirely new levels.
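
The following sketch illustrates how such a server pool and a policy-managed database might be defined with srvctl in 11.2; the pool name, sizes, importance, and Oracle home path are invented, and the exact options should be checked against your release.

```bash
# Sketch: a server pool with a minimum of two and a maximum of four servers,
# and a policy-managed database assigned to it. Names and paths are illustrative.
srvctl add srvpool -g prod_pool -l 2 -u 4 -i 10
srvctl add database -d PROD -o /u01/app/oracle/product/11.2.0/dbhome_1 -g prod_pool

# Review the current pool layout and server assignments.
srvctl status srvpool -a
crsctl status server -f
```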

However, it should be noted that RAC doesn't protect against site failure, except for the rare case where an extended distance cluster is employed.

Improving Database Management

When done right, server or database consolidation can offer great benefits for the staff involved, and economies of scale can be achieved by reducing the cost of database management. Take backups, for example: instead of having to deploy backup agents to a large number of hosts to allow the tape library to back up databases, only a few distinct systems need to be backed up by the media management library. Patching the backup agents will also become a much simpler task if fewer agents are involved. The consolidated backup can be much simpler to test and verify, as well.

With a consolidated RAC database, disaster-recovery scenarios can also become simpler. Many systems today are using data outside their own schema; in Oracle, database links are often employed when different databases are involved. Add in new technologies such as Service Oriented Architecture or BPEL, and it becomes increasingly difficult to track transactions across databases. This is not so much of a problem if the master database only reads from other sources. As soon as writing to other databases is involved, (disaster) recovery scenarios become very difficult. So instead of using multiple federated databases, a consolidated RAC system with intelligent grants across schemas can make recovery much simpler. Site failures could also be dealt with in a much simpler way by implementing failover across a couple of databases instead of dozens.

Factoring in the Additional Hardware Cost

Deploying RAC involves a more elaborate setup than running a single-instance Oracle database. In the most basic (but not the recommended) way, all that's needed to run an Oracle database is a server with sufficient memory and internal disk capacity running under an employee's desk—and you might be surprised by how many production systems are run that way! With RAC, this is not the case (we are omitting the case of running RAC in a virtual environment for this discussion).

To deploy RAC in a production environment with high availability requirements, you need the following:

  • A sufficiently robust data center environment: This data center must have enough power, rack space, cooling, and security.

  • A storage subsystem: This should be configured to provide redundancy for its disks.

  • A storage infrastructure: The predominant deployment configuration of RAC is on 4 or 8 Gbit/s Fibre Channel-based storage area networks (SANs), but you will also find RAC deployed using NFS, iSCSI, Fibre Channel over Ethernet, and protocols such as InfiniBand.

  • Multiple host bus adapters: These depend on the technology chosen, but they effectively allow communication between the database server and the storage backend.

  • Multipathing software: Multipathing software supports channel failover and multiple paths to the storage backend, thereby increasing throughput and offering fault tolerance. The Linux kernel offers the device-mapper-multipath toolset out-of-the-box, but most vendors of host bus adapters (HBAs) have their own multipathing software available for Linux.

  • Networking infrastructure: A private interconnect is required by RAC for inter-node communication within the cluster, as is a public interface for client communication with the RAC database. As with the connection to the storage backend, network cards should be teamed (bonded, in Linux terminology) to provide resilience; see the sketch following this list.

  • Management and monitoring software: The monitoring of a RAC database should be proactive; users of the system should never be the first to alert the administrators of problems with the database or application.

  • Backup software: Being able to back up and restore an Oracle database is the most important task of a database administrator. The most brilliant performance-tuning specialist would be at a loss if he couldn't get the database back online. Enterprise-grade backup solutions often have dedicated agents to communicate directly with the database through RMAN; these agents need to be licensed separately.

  • Operating system support: The Linux distribution must be certified for use with your Oracle release. You should also have vendor support for your Linux distribution.
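
As an example of the kind of configuration involved, the following sketch shows an active-backup bond for the private interconnect on a Red Hat style system; interface names, the IP address, and bonding options are illustrative, and a second slave interface would be configured in the same way as the first.

```bash
# Sketch: bonding two NICs for the private interconnect (Red Hat style files).
# Interface names, IP address, and bonding options are illustrative only.
cat > /etc/sysconfig/network-scripts/ifcfg-bond0 <<'EOF'
DEVICE=bond0
IPADDR=192.168.100.1
NETMASK=255.255.255.0
BOOTPROTO=none
ONBOOT=yes
BONDING_OPTS="mode=active-backup miimon=100"
EOF

cat > /etc/sysconfig/network-scripts/ifcfg-eth2 <<'EOF'
DEVICE=eth2
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
ONBOOT=yes
EOF
# A second slave (for example eth3) would get an equivalent ifcfg file.
```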

Many sites use dedicated engineering teams that certify a standard operating system build, including the version and patch level of the operating system, as well as all required drivers, which makes the roll-out of a new server simple. It also inspires confidence in the database administrator because the prerequisites for the installation of RAC are met. If such a validated product stack does not exist, it will most certainly be created after the decision to roll out RAC has been made.

Assessing the Staff and Training Cost

One of the main drawbacks cited against RAC in user community forums is the need to invest in training. It is true that RAC (and to a lesser degree, the introduction of Automatic Storage Management) has changed the requirements for an Oracle DBA considerably. While it was perfectly adequate a few years ago to know about the Oracle database only, the RAC DBA needs to have a broad understanding of networking, storage, the RAC architecture in detail, and many more things. In most cases, the DBA will know the requirements to set up RAC best, and it's her task to enlist the other teams as appropriate, such as networking, system administration, and storage. A well-versed multiplatform RAC DBA is still hard to find and naturally commands a premium.

Clustering with Oracle on Linux

In the final part of this chapter, we will examine the history of Oracle RAC.

Oracle RAC—though branded as an entirely new product when released with Oracle 9i Release 1—has a long track record. Initially known as Oracle Parallel Server (OPS), it was introduced with Oracle 6.0.35, which eventually was renamed Oracle 6.2. OPS was based on the VAX/VMS distributed lock manager because VAX/VMS machines essentially were the only clustered computers at the time; however, the DLM used proved too slow for OPS due to internal design limitations. So Oracle development wrote its own distributed lock manager, which saw the light of day with Oracle 6.2 for Digital.

The OPS code matured well over time in the Oracle 7, 8, and 8i releases. You can read a remarkable story about the implementation of OPS in Oracle Insights: Tales of the Oak Table (Apress, 2004).

Finally, with the advent of Oracle 9.0.1, OPS was relaunched as Real Application Clusters, and it has not been renamed since. Oracle was available on the Linux platform prior to 9i Release 1, but at that time no standard enterprise Linux distributions as we know them today were available. Linux, even though very mature by then, was still perceived to be lacking in support, so vendors such as Red Hat and SuSE released road maps and support for their distributions alongside their community versions. By 2001, these platforms had emerged as stable and mature, justifying the investment by Oracle and other big software players, who recognized the potential behind the open source operating system. Because it runs on almost all hardware, but most importantly on industry-standard components, Linux offers a great platform and cost model for running OPS and RAC.

At the time the name was changed from OPS to RAC, marketing material suggested that RAC was an entirely new product. However, RAC 9i was not entirely new at the time; portions of its code were leveraged from previous Oracle releases.

That said, there was a significant change between RAC and OPS in the area of cache coherency. The basic dilemma any shared-everything software has to solve is how to coordinate access to individual blocks. No two processes can be allowed to modify the same block at the same time; otherwise, the cluster would end up with inconsistent copies of that block. One approach to solving this problem is to simply serialize access to the block. However, that would lead to massive contention, and it wouldn't scale at all. So Oracle's engineers decided to coordinate multiple versions of a block in memory across different instances. At the time, parallel cache management was used in conjunction with a number of background processes (most notably the distributed lock manager, DLM). Oracle ensured that a particular block could only be modified by one instance at a time, using an elaborate system of locks. For example, if instance B needed a copy of a block instance A modified, then the dirty block had to be written to disk by instance A before instance B could read it. This was called block pinging, which tended to be slow because it involved disk activity. Therefore, avoiding or reducing block pinging was one of Oracle's design goals when tuning and developing OPS applications; a lot of effort was spent on ensuring that applications connecting to OPS changed only their own data.

Tip

Oracle documentation for older releases is still available; you can find more detail about the PCM concepts available from this URL: http://download.oracle.com/docs/cd/A58617_01/server.804/a58238/ch9_pcm.htm.

The introduction of Cache Fusion phase I in Oracle 8i proved a significant improvement. Block pings were no longer necessary for consistent read blocks and read-only traffic. However, they were still needed for current reads. The Cache Fusion architecture reduced the need to partition the workload across instances. The Oracle 8.1.5 "New Features" guide states that the changes to interinstance traffic include:

"... a new diskless ping architecture, called cache fusion, that provides copies of blocks directly from the holding instance's memory cache to the requesting instance's memory cache. This functionality greatly improves interinstance communication. Cache fusion is particularly useful for databases where updates and queries on the same data tend to occur simultaneously and where, for whatever reason, the data and users have not been isolated to specific nodes so that all activity can take place on a single instance. With cache fusion, there is less need to concentrate on data or user partitioning by instance."

This document, too, can be found online at: http://download-west.oracle.com/docs/cd/A87862_01/NT817CLI/server.817/a76962/ch2.htm.

In Oracle 9i Release 1, Oracle finally implemented Cache Fusion phase II, which uses a fast, high speed interconnect to provide cache-to-cache transfers between instances, completely eliminating disk IO and optimizing read/write concurrency. Finally, blocks could be shipped across the interconnect for current and consistent reads.

Oracle addressed two general weaknesses of its Linux port with RAC 9.0.1: previous versions lacked a cluster manager and a cluster file system. With Oracle 9i, Oracle shipped its own cluster manager, called Oracle Cluster Manager (ORACM), for Linux and Windows NT (all other platforms used a third-party cluster manager). ORACM provided a global view of the cluster and all nodes in it. It also controlled cluster membership, and it needed to be installed and configured before the actual binaries for RAC could be deployed.

Cluster configuration was stored in a server-management file on shared storage, and cluster membership was determined by using a quorum file or partition (also on shared storage).

Oracle also initiated the Oracle Cluster File System (OCFS) project for Linux 2.4 kernels (subsequently, OCFS2 was developed for 2.6 kernels; see below); this file system is released under the GNU public license. OCFS version one was not POSIX compliant; nevertheless, it allowed users to store Oracle files such as control files, online redo logs, and data files. However, it was not possible to store any Oracle binaries in OCFS for shared Oracle homes. OCFS partitions are configured just like normal file systems in the /etc/fstab configuration file. Equally, they are reported like an ordinary mount point in the output of the mount command. The main drawback was inherent fragmentation, which could not be addressed except by reformatting the file system.

Note

The file system fragmentation problem is described in My Oracle Support note 338080.1.

With the release of Oracle 10.1, Oracle delivered significant improvements in cluster manageability, many of which have already been discussed. Two of the main new features were Automatic Storage Management and Cluster Ready Services (which was renamed to Clusterware with 10.2 and 11.1, and is now called Grid Infrastructure). The ORACM cluster manager, which was available for Linux and Windows NT only, has been replaced by the Cluster Ready Services feature, which now offers the same "feel" for RAC on every platform. The server-management file has been replaced by the Oracle Cluster Registry, whereas the quorum disk is now known as the voting disk. With 10g Release 2, voting disks could be stored at multiple locations to provide further redundancy in case of logical file corruption. In 10.1, the files could only reside on raw devices; since 10.2, they can be moved to block devices, as well. The Oracle 11.1 installer finally allows the placement of the Oracle Cluster Registry and voting disks on block devices without also having to use raw devices. Raw devices have been deprecated in the Linux kernel in favor of the O_DIRECT flag. With Grid Infrastructure 11.2, the voting disk and cluster registry should be stored in ASM, and they are only allowed on block/raw devices during the migration phase. ASM is a clustered logical volume manager that's available on all platforms and is Oracle's preferred storage option—in fact, you have to use ASM with RAC Standard Edition.

In 2005, Oracle released OCFS2, which was now finally POSIX compliant and much more feature rich. It is possible to install Oracle binaries on OCFS2, but the binaries have to reside on a different partition than the datafiles because different mount options are required. It is no longer possible to install Grid Infrastructure, the successor to Clusterware, as a shared Oracle home on OCFS2; however, it is possible to install the RDBMS binaries on OCFS2 as a shared Oracle home.

Since the introduction of RAC, we've seen a gradual change from SMP servers to hardware based on the industry-standard x86 and x86-64 architectures. Linux has seen great acceptance in the industry, and it keeps growing, taking market share mainly from the established UNIX systems, such as IBM's AIX, HP-UX, and Sun Solaris. With the combined reduced costs for the hardware and the operating system, RAC is an increasingly viable option for businesses.

Running Oracle on Linux

A considerable advantage of choosing Oracle over many alternative commercial database environments has always been the wide availability of Oracle on different hardware and operating system environments. This freedom of choice has enabled Oracle's customers to maintain their competitive advantage by selecting the best technology available at any single point in time. No other operating system exemplifies this advantage more than Linux. Linux has proven to be a revolutionary operating system, and Oracle has been at the forefront of the revolution with the first commercial database available on the platform. And the Oracle commitment to Linux shows no sign of abating, given that the company's most recent high-profile announcements at the annual Oracle Openworld conference all had a Linux component. In 2006, Oracle announced the release of the first operating system to be directly supported by Oracle: Oracle Enterprise Linux. This was followed in 2007 by the release of Oracle VM, an Oracle Enterprise Linux-based virtualization product. In 2008, Oracle introduced its first two hardware products: the Exadata Storage Server and the HP Oracle Database Machine. Both of these are built on an Oracle Enterprise Linux foundation.

Linux has broken the trend of running Oracle only on proprietary operating systems available at significant expense on hardware from a single vendor. Similarly, clustered Oracle solutions were beyond the reach of many Oracle customers due to the requirement to purchase hardware interconnect technology and clustering software from those same vendors.

Linux offers a higher standard and a greater level of choice to Oracle customers who want to select the best overall environment for their needs. The wide adoption of this new standard is illustrated by the fact that in the year following the publication of the first edition of this book, analyst market research showed that deployments of Oracle on Linux grew at the rate of 72 percent, with more than half of all new RAC implementations being installed on Linux.

The openness of Linux also means that, for the first time, affordable clustered Oracle database solutions eliminate the requirement for third-party clustering software and hardware interconnects. By removing these barriers to entry for clustered database solutions, the increasing popularity of RAC has been closely related to the adoption of Linux as the platform of choice for Oracle customers.

If you run Oracle on Linux, examining the origins of the operating system and its historical context in terms of its relationship to commercial UNIX operating systems is useful. Possessing a level of knowledge about the nature of the GNU General Public License and open source development is also beneficial because it helps you more fully leverage the license models under which Linux and related software is available. We cover these topics in the sections that follow.

Understanding the Role of Unix

To truly understand Linux, you need to start by looking at the background of the Unix operating system. Unix was created in 1969 by Ken Thompson, a researcher at Bell Laboratories (a division of AT&T). It was designed from the outset to be an operating system with multitasking and multiuser capabilities. In 1973, Unix was rewritten in the new C programming language developed by Dennis Ritchie, making it a portable operating system that could easily be modified to run on hardware from different vendors. Further development proceeded in academic institutions, to which AT&T had made Unix available for a nominal fee.

AT&T took this course of action, as opposed to developing Unix as a commercial operating system, because, since 1956, AT&T had been bound by a consent decree arising from an antitrust complaint brought against Western Electric and AT&T in 1949. This decree prevented AT&T, as a regulated monopoly in the telephony industry, from engaging in commercial activity in other, non-telephony markets such as computing. The consent decree is often credited with being a significant milestone in the birth of the open source movement, enabling the wide and rapid dissemination of Unix technology. For example, one of the most important derivatives of Unix, the Berkeley Software Distribution (BSD), was developed at the University of California, Berkeley, as a result.

The judgment on which the consent decree was based was vacated in 1982, when the breakup of the Bell System was agreed, and AT&T developed and sold UNIX System III as a commercial product for the first time. In addition, all Unix derivatives now required a license fee to be paid to AT&T. AT&T combined features from the multiple versions of Unix in distribution, such as BSD, into a unified release of UNIX called System V Release 1, which was released in 1983. Subsequent commercial versions of UNIX were developed under a license from this System V code base, with improvements from other releases incorporated into System V, eventually resulting in the seminal release of System V Release 4 (SVR4) in 1989. Commercial variants of Unix licensed from AT&T source code were distinguished by the capitalization of the word UNIX; examples of UNIX included Hewlett-Packard's HP-UX, IBM's AIX, and Sun Microsystems' Solaris.

In 1991, AT&T formed the company UNIX System Laboratories (USL), which held the rights and source code to UNIX as a separate business entity. AT&T retained majority ownership until Novell acquired USL in 1993. A year later, the rights to the UNIX trademark and specification, now known as the Single UNIX Specification, were transferred by Novell to the X/Open Company. This marks the point at which the UNIX trademark was separated from the source code. In 1996, the X/Open Company merged with the Open Software Foundation (OSF) to form The Open Group. A year earlier (1995), certain licensing agreements regarding the UNIX source code and UnixWare operating system were purchased from Novell by SCO. In 2003, SCO filed a lawsuit against IBM and Sequent (which was subsequently acquired by IBM), claiming that IBM had copied a small section of the source code of UNIX into Linux; SCO sought damages for the unauthorized use of its intellectual property. In addition, SCO also sent a number of "Dear Linux User" letters to enterprise Linux users, warning them that the use of Linux violated SCO's UNIX copyright. However, Novell disputed the claim that SCO had actually purchased the copyright to the UNIX source code in their agreement. In 2007, the court case was ruled in Novell's favor, establishing Novell and not SCO as the rightful owner of the UNIX copyright. Hence, this ruling also ended SCO's claims against IBM and Linux users. At the time of writing, the Open Group owns the trademark UNIX in trust, while Novell, which owns SUSE Linux, retains the copyright to the UNIX source code.

Because the UNIX source code is separate from the UNIX trademark, there can be and are multiple implementations of UNIX. For an operating system to be defined as UNIX, it must adhere to the standards dictated by The Open Group's Single UNIX Specification; it must also license the rights from The Open Group to use the UNIX trademark. You can view a list of compliant UNIX operating systems on The Open Group's web site (www.opengroup.org).

Liberating Software

At the same time that AT&T began developing Unix commercially, Richard Stallman, a programmer at MIT, initiated a project to construct a Unix-like operating system for which the source code was to be freely available. Stallman's project was named the GNU Project, with the recursive acronym GNU standing for GNU's Not Unix. To guarantee the freedom of the software, Stallman created the Free Software Foundation (FSF); the definition of "free" in this case is related to the concept of liberty (i.e., freedom), as opposed to lack of revenue.

This concept of freedom for software is encapsulated in the GNU General Public License (GPL), which incorporates a modified form of copyright known as copyleft. The GNU GPL, which has become the most popular license for free software, grants its recipients the following rights:

  • The freedom to run the program for any purpose

  • The freedom to study how the program works and modify it (implying that the source code must be made freely available)

  • The freedom to redistribute copies

  • The freedom to improve the program and release the improvements to the public

GNU GPL–licensed software is always released in conjunction with the source code. As the recipient of software distributed with such a license, you are free to modify and distribute the software as you wish; however, you must subsequently grant the same rights for your version of the software that you received from the original. Therefore, you may not, for example, take GNU GPL software and modify it, copyright it, and subsequently sell executable-only versions.

The first major output of the GNU Project was the GNU C Compiler (GCC), whose release was followed by numerous other tools and utilities required for a fully functional Unix operating system. The Hurd project was also underway to create the kernel of this free Unix operating system; however, it was still far from completion when Linux originated.

Developing Linux

In 1991, Linus Torvalds, then a student at the University of Helsinki, bought a PC with an Intel 80386 processor and installed a commercially available operating system called Minix (miniature Unix), which had been developed by Andrew Tanenbaum. To fully exploit the potential of his new PC, Torvalds began to rewrite parts of the software to introduce desired operating system features; in August 1991, version 0.01 of the Linux kernel was released. Version 0.01 actually still ran wholly under the Minix operating system, and version 0.02 enabled a small number of GNU utilities, such as the bash shell, to be run. The first stable version of the Linux kernel, version 1.0, was released in 1994. The term kernel refers to the low-level system software that provides a hardware abstraction layer, disk and file system control, multitasking, load balancing, networking, and security enforcement. Torvalds continues to this day to oversee the development of the Linux kernel. Since the initial Linux version, thousands of developers around the world have contributed to the Linux kernel and operating system.

Linux is written almost entirely in C, with a small amount of assembly language. The Linux kernel is released under the GNU GPL and is therefore free software.

Major releases of the Linux kernel in recent years have included Linux 2.4.0 in January 2001 and 2.6.0 in December 2003. Linux kernel version numbers have the following format:

<kernel_version>.<major_version>.<minor_version>.<patch>

For example, recent versions have been numbered 2.4.37 and 2.6.28.7, and the latest versions can be found at www.kernel.org/. Until recently, only the kernel and major and minor version numbers were used. The patch number was added during version 2.6.

Within the Linux kernel version format, the kernel version number is changed least frequently—only when major changes in the code or conceptual changes occur. It has been changed twice in the history of the kernel: in 1994 (version 1.0) and in 1996 (version 2.0).

The second number denotes the major revision of the kernel. Even numbers indicate a stable release (i.e., one deemed fit for production use, such as 2.4 or 2.6); odd numbers indicate development releases (such as 2.5) and are intended for testing new features and drivers until they become sufficiently stable to be included in a production release.

The third number indicates the minor revision of the kernel. Prior to version 2.6.8, this was changed when security patches, bug fixes, new features, or drivers were implemented in the kernel. In version 2.6.8 and later, however, this number is changed only when new drivers or features are introduced; minor fixes are indicated by the fourth number.

The fourth number, or patch number, first occurred when a fatal error, which required immediate fixing, was encountered in the NFS code in version 2.6.8. However, there were not enough other changes to justify the release of a new minor revision (which would have been 2.6.9). So, version 2.6.8.1 was released, with the only change being the fix of that error. With version 2.6.11, the addition of the patch number was adopted as the new official versioning policy. Bug fixes and security patches are now managed by this fourth number, and bigger changes are implemented only in minor revision changes (the third number).
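To make the numbering scheme concrete, the following short Python sketch splits a kernel version string into its component numbers and reports whether the major revision denotes a stable or a development series. It is purely illustrative; the version strings shown and the parse_kernel_version helper are our own examples rather than part of any kernel or Oracle tool, and on a running system you would typically obtain the current version with the uname -r command.

    def parse_kernel_version(version):
        # Split a version string such as "2.6.28.7" into its numeric parts.
        parts = [int(p) for p in version.split(".")]
        kernel, major = parts[0], parts[1]
        minor = parts[2] if len(parts) > 2 else 0
        patch = parts[3] if len(parts) > 3 else 0
        # Even major revisions (2.4, 2.6) denote stable series; odd ones (2.5) denote development.
        series = "stable" if major % 2 == 0 else "development"
        return kernel, major, minor, patch, series

    # Example versions taken from the text above.
    for version in ("2.4.37", "2.5.0", "2.6.28.7"):
        print(version, parse_kernel_version(version))

Running the sketch prints the parsed components for each example, identifying 2.4 and 2.6 as stable series and 2.5 as a development series.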

Our emphasis here has been on the Linux kernel, but it is important to note that a Linux operating system is more correctly viewed as a GNU/Linux operating system: without the GNU tools and utilities, Linux would not be the fully featured Unix-like operating system on which Oracle RAC can be, and is, deployed, nor would it offer all the features that make it comparable (and more!) to commercial operating systems.

Clarifying the distinction between Linux and commercial UNIX is also worthwhile. Because the Linux community has not licensed the use of the UNIX trademark and is not fully compliant in all aspects with the Single UNIX Specification, it is by definition not a UNIX operating system. Similarly, no version of UNIX is available under GPL licensing. Later versions of glibc (the GNU Project's C standard library), however, do include levels of functionality as defined by the Single UNIX Specification, and the close relationship and common heritage between Linux and UNIX are readily apparent. Therefore, it is normal to see Linux referred to as a "Unix" or "Unix family" operating system, where the use of initial capitalization is intended to draw the distinction between the registered trademark UNIX held by The Open Group and the historical concepts and origins of the Unix operating system from which Linux emerged.

Expanding the Concept of Free with Open Source

Partly in response to the growing popularity of free software development inspired by the success of Linux, the term "open source" was coined in 1998 to clarify and expand on the definition of what had previously been described as "free" software. Open source is defined by the following nine rules:

  • Free redistribution: An open source license cannot prevent anyone from selling or giving away the software as a component of a larger aggregated software bundle, such as a Linux distribution.

  • Source code: The source code for any open source software must be available, either bundled with the executable form of the software or easily obtainable alongside it. The source code must be provided in the form a programmer would prefer to use for modification and cannot be deliberately obfuscated.

  • Derived works: This stipulation of open source is directly inherited from free software and ensures that redistribution of modified forms of the software is permitted under the same license as the original.

  • Integrity of the author's source code: This condition permits a greater level of restriction than free software does: a license may prevent the redistribution of modified source code, provided that it allows modifications to be distributed as patch files. The license may also prevent redistribution of modified software under the same name or version number as the original.

  • No discrimination against persons or groups: Open source licenses cannot discriminate against individuals or groups in terms of to whom the software is available. Open source software is available to all.

  • No discrimination against fields of endeavor: Open source licenses cannot place limitations on whether software can be used in business or commercial ventures.

  • Distribution of license: The rights attached to open source software must apply automatically to everyone who obtains it, without the need for any additional or intermediary license.

  • License must not be specific to a product: The license that applies to the open source software must apply directly to the software itself and cannot be applied selectively only when that software is released as part of a wider software distribution.

  • License must not restrict other software: The license cannot place requirements on the licensing conditions of other independent software that is distributed along with the open source software. It cannot, for example, insist that all other software distributed alongside it must also be open source.

For software to be correctly described as open source, it must adhere to each and every one of the preceding criteria. In some cases, software is described as open source simply to mean that the source code has been made available along with the executable version of the software. However, this form of "open source" is often accompanied by restrictions on what can be done with the source code once it has been obtained, especially in terms of modification and redistribution. Only through compliance with the preceding rules can software be officially termed open source; the software included in distributions of the Linux operating system is genuinely open source.

Combining Oracle, Open Source, and Linux

In 1998, the Oracle database became the first established commercial database to be available on Linux. Oracle Corporation's commitment to Linux has continued with all Oracle products being made available on the operating system.

At the time of writing, Oracle RAC is supported on the following Linux releases: Oracle Enterprise Linux, Red Hat Enterprise Linux, Novell's SUSE Linux Enterprise Server, and Asianux. For users based in the Americas, Europe, the Middle East, and Africa, the choice is between Oracle Enterprise Linux, Red Hat Enterprise Linux, and Novell's SUSE Linux Enterprise Server; Asianux, on the other hand, is supported in the Asia Pacific region only. We do not advocate any of these distributions over the others; all are ideal Linux platforms for running Oracle.

Tip

If you wish to know whether a particular Linux distribution is certified by Oracle and therefore qualifies for support, the definitive source of information is the My Oracle Support web site (support.oracle.com).

Although the Oracle database on Linux remains a commercial product that requires the purchase of a license for production installations, exactly as the Oracle database does on other commercial operating systems, Oracle maintains a much deeper relationship with Linux. Within Oracle Corporation is a Linux Projects development group responsible for the development of free and open source software. Oracle continues to work with the existing Linux distributors Red Hat and Novell to certify Oracle software on Red Hat Enterprise Linux and SUSE Linux Enterprise Server; however, Oracle's own Linux support is delivered first and foremost for Oracle Enterprise Linux. That said, the open source nature of any development for the Linux platform ensures that improvements made for one version of Linux can subsequently benefit all Linux distributions. In addition to Oracle Enterprise Linux and Oracle VM, Oracle also releases a number of products under open source licenses and develops software for incorporation into the Linux kernel, such as the second version of the Oracle Cluster File System (OCFS2).

Drilling Down on Unbreakable Linux

Unique among the platforms supported by Oracle, the Linux operating system is backed by Oracle's Unbreakable Linux initiative, which provides Oracle worldwide support for Oracle Enterprise Linux, a derivative of Red Hat Enterprise Linux that Oracle makes available for free download. It is important to distinguish between the Unbreakable Linux initiative based around Oracle Enterprise Linux and the previous Unbreakable Linux program, under which Oracle provided first-level support for selected Linux distributions: Red Hat Enterprise Linux, Novell's SUSE Linux Enterprise Server, and Asianux.

The Oracle Database and RAC continue to be supported on these distributions; however, Linux support issues must now be raised in the first instance with the Linux distributor in question and not with Oracle. Oracle provides Linux operating system support only for Oracle Enterprise Linux. Another difference: under the previous incarnation of the Unbreakable Linux initiative, it was a prerequisite to have a support contract with Oracle for Oracle products, as well as a standard support subscription contract for the operating system with Red Hat or Novell (or an Asianux alliance member in the Asia Pacific region). With the Unbreakable Linux initiative based on Oracle Enterprise Linux, it is not a prerequisite to have any level of support contract for the Linux operating system in conjunction with an Oracle Database support contract. In other words, as with the Oracle Database on other operating systems, Linux support is arranged and managed separately from the Oracle Database. You can use Oracle Enterprise Linux for free without support while still receiving support for the Oracle Database, or you can subscribe to Oracle Unbreakable Linux and receive support directly from Oracle for the Oracle Enterprise Linux operating system. You can also subscribe to Oracle Unbreakable Linux and receive support for Oracle Enterprise Linux without using other Oracle software, or you can use an alternative certified Linux distribution to run Oracle software and arrange support accordingly for both Linux and Oracle; the choice is entirely at the customer's discretion.

For those who choose Unbreakable Linux for the associated Linux support from Oracle, the choice of hardware platforms is narrower than the full range of platforms on which the Oracle Database 11g is available on Linux: Unbreakable Linux support is available for the x86/x86-64 and Itanium hardware platforms only.

In this book, our emphasis is on Oracle Enterprise Linux and the hardware platforms supported by Oracle under the Unbreakable Linux initiative. The most compelling reason to focus on this platform is that the installable CD images of Oracle Enterprise Linux are available without restriction, making it the most accessible Linux release and the one applicable to the widest range of Oracle RAC on Linux deployments. However, as we discuss in the following section, from a practical viewpoint Oracle Enterprise Linux and Red Hat Enterprise Linux are the same, and therefore every detail discussed is applicable to both versions of what is essentially a Red Hat Enterprise Linux distribution. Similarly, although we do not discuss installation or configuration of SUSE Linux Enterprise Server or Asianux in depth, it is important to note that all Linux distributions run the Linux kernel, so they share more similarities than differences. Some of the minor implementation details may differ; however, the majority of features and their usage will be the same, no matter which Linux distribution you use. As long as your particular Linux release is certified by Oracle, the choice between distributions should be based on business decisions, such as how well the support for a given distribution is tailored to your unique requirements, rather than on technical grounds; no particular release is significantly superior to the others technically.

Creating and Growing Red Hat Enterprise Linux

In 1994, Marc Ewing released his own distribution of Linux, which he called Red Hat Linux. The following year, ACC Corporation, a company formed by Bob Young in 1993, merged with Ewing's business and the resulting company became Red Hat Software.

Red Hat grew steadily over the next few years, expanding into Europe and Japan and introducing support, training, and the Red Hat Certified Engineer (RHCE) program.

In July 1998, Oracle announced support for Red Hat Linux. However, at the time, Linux was perceived by some as a complex platform to work with due to the rapid pace of development and number of releases available at any one time. The open source mantra of "release early, release often" presented difficulties for enterprise environments used to the slower, more genteel development cycles of commercial operating systems.

In March 2002, Red Hat announced its first enterprise-class Linux operating system, Red Hat Linux Advanced Server. Oracle, along with the hardware vendors Dell, IBM, and Hewlett-Packard, announced support for the platform. A policy was put in place to stabilize releases on this version for 18 months in order to allow partners such as Oracle to port, test, and deploy their applications. This policy has largely been successful, although Red Hat's quarterly updates still often contain significant changes. Red Hat has also undertaken to support each major release of Red Hat Enterprise Linux for seven years from the point of its initial release.

In March 2003, the Red Hat Enterprise Linux family of operating system products was launched. Red Hat Linux Advanced Server, which was aimed at larger systems, was rebranded Red Hat Enterprise Linux AS (Advanced Server). In addition, two more variants were added: Red Hat Enterprise Linux ES (Edge Server or Entry-level Server) for medium-sized systems and Red Hat Enterprise Linux WS (Workstation) for single-user clients.

Since 2003, Red Hat has focused on the business market and Red Hat Enterprise Linux. Red Hat Linux 9 was the final consumer release; this was eventually supplanted by the Fedora Project.

In Red Hat Enterprise Linux 3, Red Hat backported many of the features from the Linux 2.5 development kernel to the version of the Linux 2.4 kernel on which the release was based. Red Hat Enterprise Linux 3 was followed in February 2005 by Red Hat Enterprise Linux 4, which was based on the Linux 2.6 kernel. In March 2007, Red Hat Enterprise Linux 5 was released. Red Hat Enterprise Linux 4 is the earliest supported release for the Oracle Database 11g. With Red Hat Enterprise Linux 5, the terminology was also changed once more for the different variations of the release. Red Hat Enterprise Linux AS became Red Hat Enterprise Linux Advanced Platform, Red Hat Enterprise Linux ES became just Red Hat Enterprise Linux, and Red Hat Enterprise Linux WS became Red Hat Enterprise Linux Desktop.

Table 1.1 summarizes the major Red Hat Enterprise Linux releases to date.

Table 1.1. Red Hat Enterprise Linux Releases

Version                 Release Date
2.1 AS (Pensacola)      March 2002
2.1 ES (Panama)         May 2003
3 (Taroon)              October 2003
4 (Nahant)              February 2005
5 (Tikanga)             March 2007

Red Hat subscription pricing depends on the number of processor sockets and the level of support provided. At the time of writing, Standard and Premium subscriptions are offered for Red Hat Enterprise Linux Advanced Platform on systems with more than two processor sockets, with an additional Basic subscription available for Red Hat Enterprise Linux on systems with two processor sockets or fewer. Subscriptions are charged on a per-system basis for one or three years, and installable software is available for download to subscribed customers only.

CentOS and Scientific Linux are examples of popular Red Hat Enterprise Linux derivatives that are compatible with the corresponding Red Hat Enterprise Linux versions and are freely available for download.

Extending Red Hat with Oracle Enterprise Linux

Like CentOS and Scientific Linux, Oracle Enterprise Linux is a clone, or derivative, of Red Hat Enterprise Linux Advanced Platform. Oracle Enterprise Linux is made possible by the GNU GPL, under which the software included in Red Hat Enterprise Linux is released. Therefore, Oracle is able to redistribute a Linux operating system that has full source and binary compatibility with Red Hat Enterprise Linux. In other words, Oracle Enterprise Linux is in fact Red Hat Enterprise Linux, except with respect to Red Hat copyrighted material, such as logos and images, which has been removed. Consequently, from a technical standpoint, the implementations of the operating systems delivered by Red Hat and Oracle are indistinguishable from each other, because Oracle Enterprise Linux features full kABI (Kernel Application Binary Interface) compliance with Red Hat Enterprise Linux. In the default installation, the kernel is unmodified from the Red Hat version. However, Oracle Enterprise Linux also includes an additional kernel in which Oracle has fixed bugs found after the initial release; this kernel can be installed manually via RPM if you wish to use it.

Where the releases do differ is in the support programs available. As detailed in the earlier section on Unbreakable Linux, Oracle offers a support program for Oracle Enterprise Linux directly from Oracle, without the involvement of a third-party Linux distributor. Another significant difference is the availability of the software: Oracle enables the download of installable software from http://edelivery.oracle.com/linux and allows Oracle Enterprise Linux to be used for free, without restrictions. This accessibility means that you are entitled to run Oracle Enterprise Linux on as many systems as you wish, including production systems, without having purchased a support subscription. You may also copy, redistribute, and use the software as freely as you wish. This makes Oracle Enterprise Linux the only certified Enterprise Linux release available for download in both installable and source form without the prior purchase of a subscription. Oracle also provides additional levels of support. First, it offers an option to subscribe to the Unbreakable Linux Network for software updates between releases. Second, it offers what it calls Basic support (with varying charges for systems with up to two processors) and Premier support (with varying charges for systems with more than two processors).

Drilling Down on SuSE Linux Enterprise Server

SuSE was originally a German company founded in 1992 as a UNIX consulting group by Hubert Mantel, Burchard Steinbild, Roland Dyroff, and Thomas Fehr. SuSE is a German acronym that stands for Software und System Entwicklung, which translates to "software and system development" in English.

The company started by distributing a German version of Slackware Linux, but eventually decided to release its own distribution. The Jurix distribution developed by Florian LaRoche was used as a basis for the first SuSE distribution, released in 1996 as SuSE Linux 4.2.

May 2002 saw the formation of the United Linux consortium, in which SuSE played a prominent role. United Linux was a collaboration among a number of Linux distributors whose goal was to create a single enterprise Linux standard around which to unify their distributions' development, marketing, and support. The members of United Linux were SuSE, Turbolinux, Conectiva, and the SCO Group. The initial version (1.0) of United Linux was based on the 2.4.18 kernel; however, various factors meant that United Linux was ultimately unsuccessful in its aim of unification, despite support for the Oracle database being available with the initial release.

During this period, SuSE continued to release its own distributions of Linux. SuSE Linux Enterprise Server 8 (SLES8), based on the 2.4 kernel, was released in May 2002. This release provided a solid foundation on which to run Oracle9i.

In October 2003, SuSE released SLES9, based on the 2.6 kernel. SLES9 includes support for the Native POSIX Thread Library, a key feature of Linux 2.6 releases that significantly boosts the performance of multithreaded Linux applications.

The termination of the United Linux consortium was announced in January 2004 and coincided with the completion of Novell's acquisition of SuSE. Around this time, SuSE was renamed SUSE. SLES9 was followed in July 2006 by SLES10, the earliest SLES release supported for the Oracle Database 11g, and by SLES11 in March 2009.

Table 1.2 summarizes the major SUSE Linux Enterprise Server releases.

Table 1.2. SUSE Linux Enterprise Server Releases

Version     Release Date
8.0         April 2002
9.0         October 2003
10.0        July 2006
11.0        March 2009

SLES subscription pricing is charged per server for terms of one or three years, and it is available for Basic, Standard, and Priority levels of support.

Taking Linux to Asia

Asianux, as its name suggests, is a Linux operating system available in the Asia Pacific region. Asianux is the result of an alliance among five of Asia's leading Linux distributors to produce an enterprise Linux standard in the region. The distributors are China's Red Flag Software; Japan's Miracle Linux, which is over 50% owned by Oracle; Korea's Haansoft; Vietnam's VietSoftware; and Thailand's WTEC. Asianux Server 2.0 is the earliest supported release for Oracle Database 11g.

As our focus in this book is on the Linux distributions with the widest global Unbreakable Linux support provided by Oracle, we do not cover Asianux in subsequent chapters. However, Asianux is also based on Red Hat Enterprise Linux, so many of the implementation details discussed will be applicable.

Summary

In this chapter, we introduced Oracle Real Application Clusters and the reasons for implementing this database option. In particular, we explained the limitations of a single-instance Oracle database when it comes to instance failure. We also discussed other aspects of RAC, especially high availability, and considered the scalability of a RAC solution compared to symmetric multiprocessor machines. Manageability improvements can be a huge advantage when databases are consolidated into a RAC system, and we also touched on the various aspects of the total cost of ownership of a RAC system. Finally, we concluded with an overview of the history of Oracle clustering technology.

In the Linux part of the chapter, we examined the history of and concepts behind the Linux operating system, with a focus on understanding the features of the Linux platform that distinguish it from the alternative proprietary operating systems on which Oracle RAC is available. In particular, the aim of this chapter was to clarify the meaning of the terms free software and open source, as well as the relationship of Oracle to open source. In addition, we introduced the different versions of Linux on which you may consider deploying your Oracle RAC environment, paying particular attention to Oracle Enterprise Linux.
