Key functions and capabilities of IBM zEnterprise EC12
IBM zEnterprise EC12 is the follow-on to the IBM zEnterprise 196 and the flagship of the IBM Systems portfolio. Like its predecessor, the zEC12 offers five hardware models, but has a more powerful processor, more processor units, and new functions and features.
The superscalar design allows the zEC12 to deliver a record level of capacity over prior IBM System z® servers. It is powered by 120 of the world's most powerful microprocessors, which run at 5.5 GHz. This extreme scalability provides up to 50% more total capacity than its predecessor, the z196, making the zEC12 the industry's premier enterprise infrastructure choice for large-scale consolidation, secure data serving, and transaction processing.
The zEC12 supports heterogeneous platform requirements by introducing the zEnterprise BladeCenter Extension (zBX) Model 003 and an updated Unified Resource Manager. This support extends the platform's management strengths to other systems and workloads that run on AIX on POWER7, Linux on IBM System x, and Microsoft Windows on IBM System x servers. The zBX Model 003 can house the IBM WebSphere DataPower Integration Appliance XI50 for zEnterprise (DataPower XI50z), and select IBM BladeCenter PS701 Express blades or IBM BladeCenter HX5 (7873) blades for increased flexibility in “fit for purpose” application deployment.
In this chapter, we describe the following functions and capabilities:
3.1 Virtualization
The zEC12 is fully virtualized, with the goal of maximizing the utilization of its resources, lowering the total amount of resources that are needed and their cost. Virtualization is a key strength of the IBM System z® family. It is embedded in the architecture and built into the hardware, firmware, and operating systems.
Virtualization requires a hypervisor. A hypervisor is the control code that manages multiple independent operating system images. Hypervisors can be implemented in software or hardware, and the zEC12 has both. The hardware hypervisor for the zEC12 is known as IBM Processor Resource/Systems Manager™ (PR/SM). PR/SM is implemented in firmware as part of the base system, fully virtualizes the system resources, and does not require any additional software to run. The software hypervisor is implemented by the z/VM operating system. z/VM uses some PR/SM functions.
In the zBX, PowerVM Enterprise Edition is the hypervisor that offers a virtualization solution for any IBM Power Systems workload that runs on AIX. It allows use of the POWER7 processor-based PS blades and other physical resources, providing better scalability and reduced resource costs. IBM System x blades have a Kernel-based Virtual Machine (KVM) integrated hypervisor with identical objectives.
PowerVM is EAL4+ certified and is isolated on the intranode management network, providing intrusion prevention, integrity, and secure virtual switches with integrated consolidation. PowerVM, as well as the integrated hypervisor for the System x blades, is managed by the zEnterprise Unified Resource Manager, so it is shipped, deployed, monitored, and serviced at a single point.
We now describe the hardware and software virtualization capabilities of the zEC12.
3.1.1 zEC12 hardware virtualization
Processor Resource/Systems Manager (PR/SM) was first implemented in the mainframe in the late 1980s. It allows defining and managing subsets of the server resources that are known as logical partitions (LPARs). PR/SM virtualizes processors, memory, and I/O features. Certain features are purely virtual implementations. For example, HiperSockets work like a LAN but do not use any I/O hardware.
PR/SM is always active on the system and has been enhanced to provide more performance and platform management benefits. PR/SM technology on previous IBM System z systems has received Common Criteria EAL5+ security certification. Each logical partition is as secure as an isolated system.
Up to 60 LPARs can be defined on a zEC12. Each one can run any of the following supported operating systems:
z/OS
z/VM
z/VSE
z/TPF
Linux on System z
The LPAR definition includes a number of logical PUs, memory, and I/O devices. The z/Architecture (inherent in the zEC12 and its predecessors) is designed to meet these stringent requirements with low overhead and the highest security certification in the industry: Common Criteria EAL5+ with a specific target of evaluation (logical partitions). This design has been proven in many client installations over several decades.
The zEC12 can handle up to 60 LPARs and hundreds or even thousands of virtual servers under z/VM. Therefore, a high rate of context switching is to be expected, and access to memory, caches, and virtual I/O devices must be kept isolated.
Logical processors
Logical processors are defined and managed by PR/SM and are perceived by the operating systems as real processors. These logical processors can be of the following types:
CPs (central processors)
zAAPs (System z Application Assist Processors)
zIIPs (System z Integrated Information Processors)
IFLs (Integrated Facility for Linux)
ICFs (Internal Coupling Facilities)
SAPs (system assist processors) provide support for all LPARs, but they are never part of an LPAR configuration and do not need to be virtualized.
PR/SM is responsible for accepting requests for work on logical processors and dispatching those logical processors on physical processors. Under certain circumstances, logical zAAPs and zIIPs can be dispatched on physical CPs. Physical processors can be shared across LPARs, or they can be dedicated to an LPAR. However, an LPAR must have its logical processors either all shared or all dedicated.
The sum of logical processors (LPUs) defined in all of the LPARs that are activated in a central processor complex (CPC) might be well over the number of physical processors (PPUs). However, the number of LPUs that can be defined in a single LPAR cannot exceed the total number of PPUs that are available in the CPC. To achieve optimal internal throughput rate (ITR) performance when sharing LPUs, keep the total number of online LPUs to a minimum. This action reduces both software and hardware overhead.
PR/SM ensures that, when you switch a physical processor from one logical processor to another, the processor state is properly saved and restored, including all the registers. Data isolation, integrity, and coherence inside the system are strictly enforced at all times.
Logical processors can be dynamically added to and removed from LPARs. Operating system support is required to take advantage of this capability. Starting with z/OS V1R10, z/VM V5R4, and z/VSE V4R3, the ability to dynamically define and change the number and type of reserved PUs in an LPAR profile can be used for this purpose. No pre-planning is required; the new resources are immediately available to the operating systems and, in the case of z/VM, to its guests. Linux on System z provides the standby CPU activation/deactivation function, which is implemented in SLES 11 and RHEL 6.
z/VM-mode partitions
The z/VM-mode logical partition (LPAR) mode, first supported on IBM System z10®, is exclusively for running multiple workloads under z/VM. This LPAR mode provides increased flexibility and simplifies systems management by allowing z/VM to manage guests to perform the following tasks in the same z/VM LPAR:
Operate Linux on System z on IFLs or CPs.
Operate z/OS, z/VSE, and z/TPF on CPs.
Operate z/OS while fully allowing System z Application Assist Processor (zAAP) and System z Integrated Information Processor (zIIP) usage by workloads (such as WebSphere and DB2) for an improved economics environment.
Operate a complete Sysplex with ICF usage. This setup is especially valuable for testing and operations training; however, it is not recommended for production environments.
The z/VM-mode partitions require z/VM V5R4 or later and allow z/VM to use a wider variety of specialty processors in a single LPAR. The following processor types can be configured to a z/VM-mode partition:
CPs
IFLs
zIIPs
zAAPs
ICFs
If only Linux on System z is to be run under z/VM, then a z/VM-mode LPAR is not required and we suggest a Linux-only LPAR be used instead.
Memory
To ensure security and data integrity, memory cannot be concurrently shared by active LPARs. In fact, a strict isolation is maintained.
Using the plan-ahead capability, memory can be physically installed without being enabled. It can then be enabled when it is necessary. z/OS and z/VM support dynamically increasing the memory size of the LPAR.
A logical partition can be defined with both an initial and a reserved amount of memory. At activation time the initial amount is made available to the partition and the reserved amount can later be added, partially or totally. Those two memory zones do not have to be contiguous in real memory, but are displayed as logically contiguous to the operating system that runs in the LPAR.
z/OS is able to take advantage of this support by nondisruptively acquiring and releasing memory from the reserved area. z/VM V5R4 and later versions are able to acquire memory nondisruptively and immediately make it available to guests. z/VM virtualizes this support to its guests, which can also increase their memory nondisruptively. Releasing memory is still a disruptive operation.
LPAR memory is said to be virtualized in the sense that, in each LPAR, memory addresses are contiguous and start at zero. LPAR memory addresses are different from the absolute memory addresses of the system, which are contiguous and have a single “zero” byte. Do not confuse this with the operating system virtualizing its LPAR memory, which is done through the creation and management of multiple address spaces.
The z/Architecture has a robust virtual storage architecture that allows, per LPAR, the definition of an unlimited number of address spaces and the simultaneous use by each program of up to 1023 of those address spaces. Each address space can be up to 16 EB (1 exabyte = 2^60 bytes). Thus, the architecture has no real limits. Practical limits are determined by the available hardware resources, including disk storage for paging.
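A short worked check of that limit, stated in LaTeX notation (the arithmetic is not in the original text, but it follows directly from the figures above and from the 64-bit addressing mode described later in this section):

  16\ \text{EB} = 2^{4} \times 2^{60}\ \text{bytes} = 2^{64}\ \text{bytes}

That is, a single address space spans the full range of a 64-bit address, which is why the practical limits come from installed hardware and paging space rather than from the architecture itself.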
Isolation of the address spaces is strictly enforced by the Dynamic Address Translation hardware mechanism, which also validates a program’s right to read or write in each page frame. This validation is done by comparing the page key with the key of the program that is requesting access. Definition and management of the address spaces are under operating system control. This mechanism has been in use since System/370, and memory keys were part of, and used by, the original System/360 systems. Three addressing modes, 24-bit, 31-bit, and 64-bit, are simultaneously supported, providing compatibility with earlier versions and investment protection.
The zEC12 introduces 2 GB pages, in addition to the 4 KB and 1 MB pages, and an extension to the z/Architecture: the Enhanced Dynamic Address Translation-2 (EDAT-2) facility.
Operating systems can allow sharing of address spaces, or parts thereof, across multiple processes. For instance, under z/VM, a single copy of the read-only part of a kernel can be shared by all virtual machines which use that operating system, resulting in large savings of real memory and improvements in performance.
I/O virtualization
The zEC12 supports four Logical Channel Subsystems (LCSSs), each with 256 channels, for a total of 1024 channels. In addition to the dedicated use of channels and I/O devices by an LPAR, I/O virtualization allows concurrent sharing of channels, and of the I/O devices that are accessed through those channels, by several active LPARs. This function is known as the Multiple Image Facility (MIF). Shared channels can belong to different channel subsystems, in which case they are known as spanned channels.
Data streams for the sharing LPARs are carried on the same physical path with total isolation and integrity. For each active LPAR that has the channel configured online, PR/SM establishes one logical channel path. For availability reasons, multiple logical channel paths should exist for critical devices (for instance, disks which contain vital data sets).
When more isolation is required, configuration rules allow restricting the access of each logical partition to particular channel paths and specific I/O devices on those channel paths.
Many installations use the parallel access volume (PAV) function, which allows accessing a device by several addresses (normally one base address and an average of three aliases). This feature increases the throughput of the device by using more device addresses. HyperPAV takes the technology a step further by allowing the I/O Supervisor (IOS) in z/OS (and the equivalent function in the Control Program of z/VM) to create PAV structures dynamically. The structures are created depending on the current I/O demand in the system, lowering the need for manually tuning the system for PAV use.
In large installations, the total number of device addresses can be high. Thus, the concept of channel sets was introduced with the IBM System z9®. On the zEC12, up to three sets of 64 K device addresses are available. This availability allows the base addresses to be defined in set 0 (IBM reserves 256 subchannels in set 0) and the aliases in sets 1 and 2. In total, 196,352 subchannel addresses are available per channel subsystem. Channel sets are also used by the Metro Mirror function (also referred to as synchronous Peer-to-Peer Remote Copy (PPRC)): the Metro Mirror primary devices are defined in channel set 0, and the secondary devices can be defined in channel sets 1 and 2, freeing addresses in channel set 0 for more connectivity.
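The 196,352 figure can be reconstructed from the numbers above; in LaTeX notation:

  3 \times 65{,}536 - 256 = 196{,}608 - 256 = 196{,}352

that is, three subchannel sets of 64 K addresses each, minus the 256 subchannels that IBM reserves in set 0.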
To reduce the complexity of managing large I/O configurations further, starting with z/OS V1R10, System z introduced extended address volumes (EAVs). EAV is designed to build large disk volumes by using virtualization technology. In addition to z/OS, both z/VM (starting with V5R4 with APARs) and Linux on System z support EAV.
By extending the disk volume size, potentially fewer volumes can be required to hold the amount of data, making systems and data management less complex. EAV is supported by the IBM DS8000® series. Devices from other vendors should be checked for EAV compatibility.
The health checker function in z/OS V1R10 introduced a health check in the I/O Supervisor that can help system administrators identify single points of failure in the I/O configuration.
The dynamic I/O configuration function is supported by z/OS and z/VM. It provides the capability of concurrently changing the currently active I/O configuration. Changes can be made to channel paths, control units, and devices. The existence of a fixed HSA area in the zEC12 greatly eases the planning requirements and enhances the flexibility and availability of these reconfigurations.
3.1.2 zEC12 software virtualization
Software virtualization is provided by the IBM z/VM product. Strictly speaking, it is a function of the Control Program component of z/VM. Starting in 1967, IBM has continuously provided software virtualization in its mainframe servers.
z/VM uses the resources of the LPAR in which it is running to create functional equivalents of real System z servers, which are known as virtual machines (VMs) or guests. A z/VM virtual machine is the functional equivalent of a real server. In addition, z/VM can emulate I/O peripheral devices (for instance, printers) by using spooling and other techniques, and LAN switches and disks by using memory.
z/VM allows fine-grained dynamic allocation of resources. As an example, in the case of processor sharing, the minimum allocation is approximately 1/10,000 of a processor. As another example, disks can be subdivided into independent areas, which are known as minidisks, each of which is used by its users as a real disk, only smaller. Minidisks are shareable, and can be used for all types of data and also for temporary space in a pool of on-demand storage.
Under z/VM, virtual processors, virtual central and expanded storages, and all the virtual I/O devices of the VMs are dynamically definable (provisionable). z/VM supports the concurrent addition (but not the deletion) of memory to its LPAR and immediately makes it available to guests. Guests themselves can support the dynamic addition of memory. All other changes are concurrent. To make these concurrent definitions occur nondisruptively requires support by the operating system that is running in the guest.
Although z/VM imposes no limits on the number of defined VMs, the number of active VMs is limited by the available resources. On a large server, such as the zEC12, thousands of VMs can be activated.
In addition to server consolidation and image reduction by vertical growth, z/VM provides a highly sophisticated environment for application integration and co-residence with data, especially for mission-critical applications.
Virtualization provides hardware-enabled resource sharing, and can also be used for the following functions:
Isolate production, test, training, and development environments.
Support previous applications.
Test new hardware configurations without actually buying the hardware.
Enable parallel migration to new system or application levels, and provide easy back-out capabilities.
z/VM V6R2 introduced a new feature, single system image (SSI). SSI enables improved availability, better management of planned outages, and capacity growth by creating clusters of z/VM systems with simplified management.
With SSI, it is possible to cluster up to four z/VM images in a single logical image. SSI includes the following highlighted features:
Live Guest Relocation (LGR) for Linux, the ability to move virtual servers without disruption from one z/VM system to another in the SSI.
Management of resources with multi-system virtualization to allow up to four z/VM instances to be clustered as a single system image.
Scalability with up to four systems horizontally, even on mixed hardware generations.
Availability through non-disruptively moving work to available system resources and non-disruptively moving system resources to work.
For more information about SSI, see An introduction to z/VM Single System Image (SSI) and Live Guest Relocation (LGR), SG24-8006 and Using z/VM v 6.2 Single System Image (SSI) and Live Guest Relocation (LGR), SG24-8039.
The Unified Resource Manager uses the management application programming interface (API) of z/VM to provide a set of resource management functions for the z/VM environment. It is beyond the scope of this book to provide a more detailed description of z/VM or other highlights of its capabilities. For a deeper discussion of z/VM, see Introduction to the New Mainframe: z/VM Basics, SG24-7316, available from the following web page:
3.1.3 zBX virtualized environments
On the zBX, the IBM POWER7 processor-based PS701 blades run PowerVM Enterprise Edition to create a virtualized environment that is similar to the one found in IBM Power Systems servers. The POWER7 processor-based LPARs run the AIX operating system.
The IBM System x blades are also virtualized. The integrated System x hypervisor uses Kernel-based Virtual Machines (KVM). Support is provided for Linux and Microsoft Windows. Management of the zBX environment is done as a single logical virtualized environment by the Unified Resource Manager.
3.2 zEC12 technology improvements
zEC12 provides technology improvements in these areas:
Microprocessor
Capacity settings
Memory
Flash Express
I/O capabilities
Cryptography
These features are intended to provide a more scalable, flexible, manageable, and secure consolidation and integration to the platform, which contributes to a lower total cost of ownership.
3.2.1 Microprocessor
The zEC12 has a newly developed microprocessor chip and a newly developed storage control chip. The chips use CMOS 13S (32 nm) technology and represent a major step forward in technology use for IBM System z® products, resulting in increased packaging density.
As with the z196, the microprocessor chips and the storage control chips for the zEC12 are packaged together on a multi-chip module (MCM). The MCM contains six microprocessor chips (each having four, five, or six active cores) and two storage control (SC) chips. The MCM is installed inside a book, and the zEC12 can contain from one to four books. The book also contains the memory arrays, I/O connectivity infrastructure, and various other mechanical and power controls.
The book is connected to the PCI Express (PCIe) I/O drawers, I/O drawers, and I/O cages through one or more cables. These cables use the standard PCIe and InfiniBand protocols to transfer large volumes of data between the memory and the I/O cards that are in the PCIe I/O drawers, I/O drawers, and I/O cages.
zEC12 processor chip
The zEC12 chip provides more functions per chip (six cores on a single chip) because of technology improvements that allow designing and manufacturing more transistors per square inch. This configuration translates into using fewer chips to implement the needed functions, which helps enhance system availability.
System z microprocessor development followed the same basic design from the 9672-G4 (announced in 1997) through the z9. That basic design had been stretched to its maximum, so a fundamental change was necessary. The z10 chip introduced a high-frequency design, which was improved with the z196 and enhanced again with the zEC12. The zEC12 microprocessor chip (hex core) has an improved design when compared with the z196 (quad core).
The processor chip includes one co-processor per core for hardware acceleration of data compression and cryptography, I/O bus and memory controllers, and an interface to a separate storage controller/cache chip (see Figure 3-1).
Figure 3-1 zEC12 Hex-Core microprocessor chip
On-chip cryptographic hardware includes the full complement of the Advanced Encryption Standard (AES) algorithm, Secure Hash Algorithm (SHA), and the Data Encryption Standard (DES) algorithm, as well as the protected key implementation.
zEC12 processor design highlights
The z/Architecture offers a rich complex instruction set computer (CISC) Instruction Set Architecture (ISA) supporting multiple arithmetic formats.
The z196 introduced 110 new instructions and offered a total of 984, out of which 762 were implemented entirely in hardware. The zEC12 also introduces new instructions, notably for the Transactional Execution and the EDAT-2 facilities.
Compared to z196, the zEC12 processor design improvements and architectural extensions include the following features:
Balanced Performance Growth
 – 1.5 times more system capacity
 • 50% more cores in a central processor chip
 • Maximum number of cores increased from 96 to 120
 – Third Generation High Frequency processor
 • Frequency increased from 5.2 GHz to 5.5 GHz
 • Up to 25% faster uniprocessor performance as compared to z196
Innovative Local Data-Cache design with larger caches but shorter latency
 – Total L2 per core is 33% bigger
 – Total on-chip shared L3 is 100% larger
 – Unique private L2 cache (1 MB) design reduces L1 miss latency
Second Generation of Out of Order design (OOO) with increased resources and efficiency
 – Numerous pipeline improvements that are based on z10 and z196 designs
 – Number of instructions in flight is increased by 25%
Improved Instruction Fetching Unit
 – New second-level Branch Prediction Table with 3.5 times more branches
 – Improved sequential instruction stream delivery
Dedicated co-processor for each core with improved performance and more capability
 – New hardware support for Unicode UTF8<>UTF16 bulk conversions (CU12/CU21)
 – Improved startup latency
Multiple innovative architectural extensions for software usage
 – Transactional Execution (TX), known in academia as Hardware Transactional Memory (HTM). This feature allows software-defined “lockless” sequences to be treated as an atomic “transaction” and improves efficiency for highly parallelized applications and multiprocessor handling (see the sketch after this list)
 – Runtime Instrumentation. Allows dynamic optimization on code generation as it is being executed
 – Enhanced Dynamic Address Translation-2 (EDAT-2). Supports 2 GB page frames
Increased issue, execution, and completion throughput
 – Improved instruction dispatch and grouping efficiency
 – Millicode Handling
 – Next Instruction Access Intent
 – Load and Trap instructions
 – Branch Prediction Preload
 – Data Prefetch
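To make the Transactional Execution item above more concrete, the following minimal C sketch uses the GCC hardware transactional memory builtins for z/Architecture (compiled, for example, with -march=zEC12 -mhtm). The builtin and macro names follow the GCC documentation for the s390/System z target; the retry policy and the atomic fallback path are illustrative assumptions, not taken from this book.

  #include <htmintrin.h>   /* _HTM_TBEGIN_STARTED and the s390 HTM builtins */
  #include <stdio.h>

  static long shared_counter;

  /* Add delta to shared_counter inside a hardware transaction. If the
   * transaction keeps aborting (for example, because of a storage conflict
   * with another CPU), fall back to an ordinary atomic update. */
  static void add_to_counter(long delta)
  {
      for (int attempt = 0; attempt < 5; attempt++) {
          if (__builtin_tbegin((void *)0) == _HTM_TBEGIN_STARTED) {
              shared_counter += delta;   /* executes as one atomic "transaction" */
              __builtin_tend();          /* commit the transaction */
              return;
          }
          /* Transaction aborted; loop and try again. */
      }
      __atomic_add_fetch(&shared_counter, delta, __ATOMIC_SEQ_CST); /* fallback */
  }

  int main(void)
  {
      add_to_counter(42);
      printf("counter = %ld\n", shared_counter);
      return 0;
  }

The point of the facility is that the stores made between the tbegin and tend either all become visible to other processors or none do, which lets software replace fine-grained locks with optimistic, lock-free sequences.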
Hardware decimal floating point function
Hardware decimal floating point (HDFP) support was introduced with the z9 EC and enhanced with a new decimal floating point accelerator feature in the IBM zEnterprise 196. The zEC12 facilitates better performance for traditional zoned-decimal operations and broader use of the Decimal Floating Point facility by COBOL and PL/I programs.
This facility is designed to speed up such calculations and provide the necessary precision demanded mainly by the financial institutions sector. The decimal floating point hardware fully implements the new IEEE 754r standard.
Industry support for decimal floating point is growing, with IBM leading the open standard definition. Examples of support for the draft standard IEEE 754r include: Java BigDecimal, C#, XML, C/C++, GCC, COBOL, and other key software vendors such as Microsoft and SAP.
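As a small illustration of why decimal arithmetic matters for financial code, the following C sketch compares binary and decimal floating point. It is our example, not from this book, and it assumes a compiler and runtime with _Decimal64 support (for example, GCC with decimal floating point enabled on Linux on System z, where these operations map to the HDFP hardware).

  #include <stdio.h>
  #include <stdbool.h>

  int main(void)
  {
      /* Binary floating point cannot represent 0.1 or 0.2 exactly,
       * so this comparison is false on virtually every platform. */
      double b = 0.1 + 0.2;
      bool binary_exact = (b == 0.3);

      /* Decimal floating point (_Decimal64, the IEEE 754r/754-2008
       * decimal64 format) represents these values exactly, so the
       * sum compares equal. */
      _Decimal64 d = 0.10DD + 0.20DD;
      bool decimal_exact = (d == 0.30DD);

      printf("binary  0.1 + 0.2 == 0.3 ? %s\n", binary_exact  ? "yes" : "no");
      printf("decimal 0.1 + 0.2 == 0.3 ? %s\n", decimal_exact ? "yes" : "no");
      return 0;
  }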
Support and usage of HDFP varies with operating system and release. For a detailed description see IBM zEnterprise EC12 Technical Guide, SG24-8049. See also “Decimal floating point (z/OS XL C/C++ considerations)” on page 116.
Large system images
A single system image can control several processor units (PUs) such as CPs, zIIPs, zAAPs, and IFLs, as appropriate. See “Processor unit characterization” on page 7 for a description.
Table 3-1 on page 61 lists the maximum number of PUs supported for each operating system image. The physical limits of the hardware determine the usable number of PUs.
Table 3-1 Single system image software support

Operating system         Maximum number of (CPs + zIIPs + zAAPs)¹ or IFLs per system image
z/OS V1R11 and later     100
z/VM V5R4 and later      32² ³
z/VSE V4R3 and later     z/VSE Turbo Dispatcher can use up to four CPs and tolerates up to 10-way LPARs
z/TPF V1R1               86 CPs
Linux on System z        SUSE SLES 11: 64 CPs or IFLs
                         SUSE SLES 10: 64 CPs or IFLs
                         Red Hat RHEL 6: 80 CPs or IFLs
                         Red Hat RHEL 5: 80 CPs or IFLs

¹ The number of purchased zAAPs and the number of purchased zIIPs cannot each exceed the number of purchased CPs. A logical partition can be defined with any number of the available zAAPs and zIIPs. The total refers to the sum of these PU characterizations.
² z/VM guests can be configured with up to 64 virtual PUs.
³ The z/VM-mode LPAR supports CPs, zAAPs, zIIPs, IFLs, and ICFs.
3.2.2 Capacity settings
The zEC12 expands the offering of subcapacity settings. Finer granularity in capacity levels allows installed capacity to follow enterprise growth more closely, for a smoother, pay-as-you-go investment profile. The many performance and monitoring tools that are available in System z environments, coupled with the flexibility of the capacity on demand options (see 3.6, “zEC12 capacity on demand (CoD)” on page 84), help manage growth by making capacity available when it is needed.
zEC12
The zEC12 offers four distinct capacity levels for the first 20 central processors (CPs) (full capacity and three subcapacities). These processors deliver the scalability and granularity to meet the needs of medium-sized enterprises, while also satisfying the requirements of large enterprises that have large-scale, mission-critical transaction and data-processing requirements.
A capacity level is a setting of each CP to a subcapacity of the full CP capacity. Full capacity CPs are identified as CP7. On the zEC12 server, 101 CPs can be configured as CP7. Besides full capacity CPs, three subcapacity levels (CP6, CP5, and CP4) are offered for up to 20 CPs, independently of the zEC12 model installed. The four capacity levels are displayed in hardware descriptions as feature codes on the CPs.
If more than 20 CPs are configured in the system, they all must be full capacity because all CPs must be at the same capacity level. Granular capacity adds 60 subcapacity settings to the 101 capacity settings that are available with full capacity CPs (CP7). The 161 distinct capacity settings in the system provide for a range of over 1:320 in processing power.
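The 161 capacity settings can be reconstructed from the figures above; in LaTeX notation:

  101\ \text{(CP7 settings)} + 3 \times 20\ \text{(CP6, CP5, and CP4 on up to 20 CPs)} = 101 + 60 = 161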
A processor that is characterized as anything other than a CP, such as a zAAP, a zIIP, an IFL, or an ICF, is always set at full capacity. There is, correspondingly, a separate pricing model for non-CP processors regarding purchase and maintenance prices, and various offerings for software licensing.
On zEC12, the CP subcapacity settings relative to full capacity are the following amounts:
Model 7xx = 100%
Model 6xx = 63%
Model 5xx = 42%
Model 4xx = 16%
For administrative purposes, systems that have only ICF or IFL processors are now given a capacity setting of 400. Either of these systems can have up to 101 ICFs or IFLs, which always run at full capacity.
Figure 3-2 gives more details about zEC12 full capacity and subcapacity offerings.
Figure 3-2 zEC12 full and subcapacity CP offerings
To help size a System z server to fit your requirements, IBM provides a no-cost tool that reflects the latest IBM LSPR measurements, called the IBM Processor Capacity Reference for System z (zPCR). The tool can be downloaded from the following web page:
Also, see 3.8, “zEC12 performance” on page 86 for more information about LSPR measurements.
3.2.3 Memory
The z196 greatly increased the available memory capacity over previous systems, and the zEC12 continues this growth: it can have up to 3,040 GB of usable memory installed.
In addition, the zEC12 has doubled the size of the hardware system area (HSA) compared with its predecessor, the z196. The HSA is not included in the memory that the client orders and has a fixed size of 32 GB.
 
z/Architecture addressing modes: The z/Architecture simultaneously supports 24-bit, 31-bit, and 64-bit addressing modes. These modes provide compatibility with earlier versions and investment protection.
Operating system support for large memory is as follows:
z/OS V1R10 and later support up to 4 TB.
z/VM V5R4 and later support up to 256 GB.
z/VSE V4R2 and later support up to 32 GB.
z/TPF V1R1 supports up to 4 TB.
SUSE SLES 11 supports 4 TB and Red Hat RHEL 6 supports 3 TB.
Although some of the operating systems that are listed above can support more than 1 TB of memory by their design, on zEC12, the maximum memory size per logical partition is 1 TB.
Hardware system area
The zEC12 has a fixed-size hardware system area of 32 GB, which is in a reserved area of memory, separate from client-purchased memory. The HSA is large enough to accommodate all possible configurations for all logical partitions. Therefore, several operations that were disruptive on previous servers because of HSA size are now concurrent, improving availability. In addition, related planning needs are eliminated.
A fixed large HSA allows the Dynamic I/O capability to be enabled by default. It also enables the dynamic addition and removal, without planning, of the following features:
New logical partition to new or existing channel subsystem (CSS)
New CSS (up to four can be defined on zEC12)
New subchannel set (up to three can be defined on zEC12)
Devices, up to the maximum that is permitted, in each subchannel set
Logical processors by type
Cryptographic adapters
Plan-ahead memory
Planning for future memory requirements and installing dormant memory in the CPC allows upgrades to be done concurrently and, with appropriate operating system support, nondisruptively. A specific memory pricing model is available in support of this capability.
If a client can anticipate an increase of the required memory, a target memory size can be configured along with a starting memory size. The starting memory size is activated and the rest remains inactive. When more physical memory is required, it is fulfilled by activating the appropriate number of planned memory features. This activation is concurrent and can be nondisruptive to the applications depending on the operating system support. z/OS and z/VM support this function.
Do not confuse plan-ahead memory with flexible memory support. Plan-ahead memory is for a permanent increase of installed memory, whereas flexible memory provides a temporary replacement of a part of memory that becomes unavailable.
Flexible memory
Flexible memory was first introduced on the z9 EC as part of the design changes and offerings to support enhanced book availability. Flexible memory is used to temporarily replace the memory that becomes unavailable when performing maintenance on a book.
On zEC12, the additional resources that are required for the flexible memory configurations are provided through the purchase of planned memory features, along with the purchase of memory entitlement. Flexible memory configurations are available on multi-book models H43, H66, H89, and HA1, and range from 32 GB to 2272 GB, depending on the model.
Contact your IBM representative to help determine the appropriate configuration for your business.
Large page support
The size of pages and page frames has been 4 KB for a long time. Starting with the IBM System z10, System z servers can also have large pages with a size of 1 MB, in addition to supporting pages of 4 KB. This capability is a performance item that addresses particular workloads with large main storage usage. Both page frame sizes can be used simultaneously.
Large pages enable the translation lookaside buffer (TLB) to better represent the working set and suffer fewer misses by allowing a single TLB entry to cover more address translations. Users of large pages are better represented in the TLB and are expected to perform better.
This support benefits long-running applications that are memory access intensive. Large pages are not recommended for general use. Short-lived processes with small working sets are normally not good candidates for large pages and see little to no improvement. The use of large pages must be decided based on knowledge that is obtained from measurement of memory usage and page translation overhead for a specific workload.
The large page support function is not enabled without the required software support. Without the large page support, page frames are allocated at the current 4 KB size. At the time they were introduced, large pages were treated as fixed pages and were never paged out. With the availability of Flash Express, large pages might become pageable. They are only available for 64-bit virtual private storage such as virtual memory located above 2 GB.
2 GB large page support
The zEC12 introduces 2 GB page frames as an architectural extension. This support is aimed at increasing efficiency for DB2 buffer pools, the Java heap, and other large structures. Use of 2 GB pages increases TLB coverage without proportionally enlarging the TLB size, as the worked example after this list illustrates:
A 2 GB memory page is:
 – 2048 times larger than a large page of 1 MB size
 – 524,288 times larger than an ordinary base page with a size of 4 KB
A 2 GB page allows for a single TLB entry to fulfill many more address translations than either a large page or ordinary base page
A 2 GB page provides users with much better TLB coverage, and therefore provides:
 – Better performance by decreasing the number of TLB misses that an application incurs
 – Less time that is spent on converting virtual addresses into physical addresses
 – Less real storage that is used to maintain DAT structures
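As a worked check of the ratios above, and a rough illustration of TLB coverage (the 512-entry TLB used here is an arbitrary illustrative figure, not a zEC12 specification), in LaTeX notation:

  \frac{2\ \text{GB}}{1\ \text{MB}} = \frac{2048\ \text{MB}}{1\ \text{MB}} = 2048, \qquad \frac{2\ \text{GB}}{4\ \text{KB}} = \frac{2^{31}}{2^{12}} = 2^{19} = 524{,}288

With, say, 512 TLB entries, 4 KB pages cover only 512 × 4 KB = 2 MB of storage, 1 MB pages cover 512 MB, and 2 GB pages cover 1 TB, which is why large memory structures such as DB2 buffer pools see fewer TLB misses.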
 
Statement of Direction:
IBM plans for future maintenance roll-ups of 31-bit and 64-bit IBM SDK7 for z/OS Java to provide the use of new IBM zEnterprise EC12 features. Some of these new features include Flash Express and pageable large pages, the Transactional Execution Facility, the Miscellaneous-Instruction-Extension Facility, and 2 GB pages.
Beyond the existing DB2 support for zEC12, DB2 plans further enhancements that are designed to improve performance in two ways: 1) using pageable large (1 MB) pages and Flash Express, and 2) enabling support of 2 GB pages.
3.2.4 Flash Express
The zEC12 introduces the innovative Flash Express feature, which helps improve availability and performance so that you can compete more effectively in today's service-focused market. Flash Express provides the following capabilities:
Improved z/OS recovery and diagnostic times
Handling of workload shifts and coping with dynamic environments more smoothly
Use of pageable large pages yielding CPU performance benefits
Offloading GBps of random I/O from the I/O Fabric
Predictive paging
Flash Express is easy to configure, requires no special skills, and provides rapid time to value. The feature is designed to allow each logical partition to be configured with its own storage-class memory (SCM) address space, which can be used for paging. With Flash Express, 1 MB large pages can become pageable.
Flash Express PCIe cards use internal Flash solid-state disks (SSDs). Flash Express cards are installed in pairs to ensure a high level of availability and redundancy, offering a capacity of 1.4 TB of usable storage per pair. A maximum of four pairs of cards can be installed on a zEC12, providing a maximum capacity of 5.6 TB of storage.
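The maximum capacity quoted above follows directly; in LaTeX notation:

  4\ \text{pairs} \times 1.4\ \text{TB per pair} = 5.6\ \text{TB of usable Flash Express storage per zEC12}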
In the Flash Express environment, data privacy relies on a symmetric key that encrypts the data that temporarily resides on the SSDs. By using a smart card and an integrated smart card reader on the Support Element (SE), the encryption key is generated within the secure environment of the smart card. The key is tightly coupled to the SE serial number, which ensures that no other SE is able to share the key or the smart card that is associated with a specific SE. The generated key is replicated in a secure way to the alternate Support Element smart card. The key is transferred from the SE to the Flash Express adapter under the protection of a private and public key pair that is generated by the firmware that manages the Flash Express adapter.
Flash Express (FC 0402) is an optional, priced feature. It is supported by z/OS V1R13 with the z/OS V1R13 RSM Enablement Offering web deliverable installed, on images with more than 4 GB of real storage.
Additional functions of Flash Express are expected to be supported later, including 2 GB large pages and dynamic reconfiguration for Flash Express.
3.2.5 I/O capabilities
zEC12 has many I/O capabilities for supporting high-speed connectivity to resources inside and outside the system. The connectivity of the zEC12 is designed to maximize application performance and satisfy clustering, security, storage area network (SAN), and local area network (LAN) requirements.
Multiple subchannel sets
Multiple subchannel sets (MSS) are designed to provide greater I/O device configuration capabilities for large enterprises. Up to three subchannel sets for zEC12 can be defined to each channel subsystem.
For each additional subchannel set, the amount of addressable storage capacity is 64 K subchannels, which enable a larger number of storage devices. This increase complements other functions (such as large or extended address volumes) and HyperPAV. This can also help facilitate consistent device address definitions, simplifying addressing schemes for congruous devices.
The first subchannel set (SS0) allows the definition of any type of device (such as bases, aliases, secondaries, and those devices other than disks that do not implement the concept of associated aliases or secondaries). The second and third subchannel sets (SS1 and SS2) can be designated for use for disk alias devices (of both primary and secondary devices) and Metro Mirror secondary devices only. The third subchannel set applies to FICON and zHPF protocols and is used by z/OS and Linux on System z, and supported by z/VM for guest use.
Initial program load from an alternate subchannel set
zEC12 supports the initial program load (IPL) from subchannel set 1 (SS1) or subchannel set 2 (SS2). Devices that are used early during IPL processing can be accessed by using subchannel set 1 or subchannel set 2. This flexibility allows the users of Metro Mirror (PPRC) secondary devices that are defined using the same device number and a new device type in an alternate subchannel set to be used for IPL, input/output definition file (IODF), and stand-alone dump volumes, when needed.
I/O infrastructure
The FICON features in the zEC12 can provide connectivity to servers, FC switches, and various devices (control units, disk, tape, printers) in a SAN environment, while delivering improved throughput, reliability, availability, and serviceability.
High Performance FICON for System z
High Performance FICON for System z (zHPF), first provided on System z10, is a FICON architecture for protocol simplification and efficiency, reducing the number of information units (IUs) processed. Enhancements have been made to the z/Architecture and the FICON interface architecture to provide optimizations for online transaction processing (OLTP) workloads.
When used by the FICON channel, the z/OS operating system, and the control unit (appropriate levels of Licensed Internal Code are required), the FICON channel overhead can be reduced and performance can be improved. Additionally, the changes to the architecture provide end-to-end system enhancements to improve reliability, availability, and serviceability (RAS). The zHPF channel programs can be used, for instance, by z/OS OLTP I/O workloads, DB2, VSAM, PDSE, and zFS. zHPF requires matching support by the DS8000 series or similar devices from other vendors.
At announcement, zHPF supported the transfer of small blocks of fixed-size data (4 KB). This support was extended to multitrack operations on the z10 EC with a limit of 64 KB, and the z196 eliminated this 64 KB data transfer limit on multitrack operations.
zHPF is exclusive to zEnterprise systems and System z10. The FICON Express8S, FICON Express8, and FICON Express4 features (channel path identifier (CHPID) type FC) concurrently support both the existing FICON protocol and the zHPF protocol in the server Licensed Internal Code.
For more information about FICON channel performance, see the technical papers on the System z I/O connectivity website:
Channel subsystem enhancement for I/O resilience
The zEC12 channel subsystem incorporates an improved load balancing algorithm that is designed to provide improved throughput and reduced I/O service times, even when abnormal conditions occur. For example, degraded throughput and response times can be caused by multi-system workload spikes. Such degradation can also be caused by resource contention in storage area networks (SANs) or across control unit ports, SAN congestion, suboptimal SAN configurations, problems with initializing optics, dynamic fabric routing changes, and destination port congestion.
When such events occur, the channel subsystem is designed to dynamically select channels to optimize performance. The subsystem also minimizes imbalances in I/O performance characteristics (such as response time and throughput) across the set of channel paths to each control unit. This function is done by using the in-band I/O instrumentation and metrics of the System z FICON and zHPF protocols.
This channel subsystem enhancement is exclusive to the zEC12 and is supported on all FICON channels when configured as CHPID type FC. It is transparent to operating systems. In support of this function, z/OS V1.12 and V1.13 with a program temporary fix (PTF) provide an updated health check that is based on an I/O rate metric, rather than on initial control unit command response time.
Modified Indirect Data Address Word (MIDAW) facility
The MIDAW facility is a system architecture and software usage that is designed to improve FICON performance. This facility was introduced with z9 servers and is used by the media manager in z/OS.
The MIDAW facility provides a more efficient structure for certain categories of data-chaining I/O operations:
MIDAW can significantly improve FICON performance for extended format (EF) data sets. Non-extended data sets can also benefit from MIDAW.
MIDAW can improve channel use and can significantly improve I/O response time. This benefit reduces FICON channel connect time, director port use, and control unit overhead.
From IBM laboratory tests, it is expected that applications that use EF data sets (such as DB2 or long chains of small blocks) gain significant performance benefits by using the MIDAW facility.
For more information about FICON, FICON channel performance, and MIDAW, see the I/O Connectivity website:
For more information about the MIDAW facility, see the IBM Redpaper™ publication, How does the MIDAW Facility Improve the Performance of FICON Channels Using DB2 and other workloads?, REDP-4201, at the following website:
Also, see IBM TotalStorage DS8000 Series: Performance Monitoring and Tuning, SG24-7146.
Extended distance FICON
An enhancement to the industry-standard FICON architecture (FC-SB-3) can help avoid performance degradation at extended distances by implementing a protocol for persistent information unit (IU) pacing. Control units that use this enhancement can increase the pacing count (the number of IUs allowed to be in flight from channel to control unit). Extended distance FICON allows the channel to remember the last pacing update for use on subsequent operations, which helps avoid degradation of performance at the start of each new operation.
Improved IU pacing can optimize the use of the link (for example, it helps keep a 4 Gbps link fully used at 50 km) and allows channel extenders to work at any distance, with performance results similar to those experienced when using emulation.
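The 50 km figure can be motivated with a simple bandwidth-delay estimate (the roughly 5 µs/km fiber latency and the small-IU assumption used here are illustrative, not values from this book); in LaTeX notation:

  \text{RTT} \approx 2 \times 50\ \text{km} \times 5\ \mu\text{s/km} = 500\ \mu\text{s}, \qquad 4\ \text{Gbps} \times 500\ \mu\text{s} = 2\ \text{Mb} \approx 250\ \text{KB in flight}

If each IU carries only a few kilobytes, dozens of IUs must be outstanding to keep the link busy, which is more than a small default pacing window allows; persistent IU pacing lets the control unit keep that larger window open across operations.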
The requirements for channel extension equipment are simplified with the increased number of commands in flight. This simplification can benefit z/OS Global Mirror (also referred to as Extended Remote Copy, or XRC) applications because the channel extension kit is no longer required to simulate specific channel commands. Simplifying the channel extension requirements can help reduce the total cost of ownership of end-to-end solutions.
Extended Distance FICON is transparent to operating systems and applies to all the FICON Express4, FICON Express8, and FICON Express8S features carrying basic FICON traffic (CHPID type FC). For usage, the control unit must support the new IU pacing protocol.
Usage of extended distance FICON is supported by the IBM System Storage® DS8000 series with an appropriate level of Licensed Machine Code (LMC).
z/OS discovery and autoconfiguration
z/OS discovery and autoconfiguration for FICON channels (zDAC) is designed to perform a number of I/O configuration definition tasks automatically for new and changed disk and tape controllers that are connected to an FC switch when attached to a FICON channel.
Users can define a policy, by using the hardware configuration definition (HCD) dialog. Then, when new controllers are added to an I/O configuration or changes are made to existing controllers, the system is designed to discover them and propose configuration changes that are based on that policy. This policy can include preferences for availability and bandwidth, which includes PAV definitions, control unit numbers, and device number ranges.
zDAC is designed to perform discovery for all systems in a sysplex that support the function. The proposed configuration incorporates the current contents of the I/O definition file (IODF) with additions for newly installed and changed control units and devices. zDAC is designed to simplify I/O configuration on zEC12 running z/OS and reduce complexity and setup time. zDAC applies to all FICON features supported on zEC12 when configured as CHPID type FC.
FICON name server registration
The FICON channel provides the same information to the fabric as is commonly provided by open systems, registering with the name server in the attached FICON directors. This registration enables quicker and more efficient management of the storage area network (SAN), and improves problem determination and analysis.
Platform registration is a standard service that is defined in the Fibre Channel - Generic Services 3 (FC-GS-3) standard (INCITS (ANSI) T11.3 group). It allows a platform (storage subsystem, host, and so on) to register information about itself with the fabric (directors).
This zEC12 function is transparent to operating systems and applicable to all FICON Express8S, FICON Express8, and FICON Express4 features (CHPID type FC). For more information, see IBM System z Connectivity Handbook, SG24-5444.
Fibre Channel Protocol enhancements for small block sizes
The Fibre Channel Protocol (FCP) Licensed Internal Code was modified to help provide increased I/O operations per second for small block sizes. This FCP performance improvement is transparent to operating systems and applies to all the FICON Express8S, FICON Express8, and FICON Express4 features when configured as CHPID type FCP, communicating with SCSI devices.
For more information about FCP channel performance, see the performance technical papers on the System z I/O connectivity web page:
Recognizing that high reliability is important for maintaining the availability of business-critical applications, the System z Fibre Channel Protocol (FCP) has implemented support of the American National Standards Institute's (ANSI) T10 Data Integrity Field (DIF) standard. Data integrity protection fields are generated by the operating system and propagated through the storage area network (SAN). System z helps to provide added end-to-end data protection between the operating system and the storage device.
An extension to the standard, Data Integrity Extensions (DIX), provides checksum protection from the application layer through the host bus adapter (HBA), where cyclical redundancy checking (CRC) protection is implemented.
T10-DIF support by the FICON Express8S and FICON Express8 features, when defined as CHPID type FCP, is exclusive to zEnterprise CPCs. Usage of the T10-DIF standard requires support by the operating system and the storage device.
N_Port ID Virtualization (NPIV)
NPIV is designed to allow the sharing of a single physical FCP channel among operating system images, whether in logical partitions or as z/VM guests. This is achieved by assigning a unique worldwide port name (WWPN) for each operating system that is connected to the FCP channel. In turn, each operating system appears to have its own distinct WWPN in the SAN environment, hence enabling separation of the associated FCP traffic on the channel.
Access controls that are based on the assigned WWPN can be applied in the SAN environment. This function can be done by using standard mechanisms such as zoning in SAN switches and logical unit number (LUN) masking in the storage controllers.
WWPN tool
A part of the installation of your zEC12 server is the planning of the SAN environment. IBM has made a stand-alone tool available to assist with this planning before the installation. The tool, which is known as the WWPN tool, assigns WWPNs to each virtual Fibre Channel Protocol (FCP) channel/port. This function is done by using the same WWPN assignment algorithms that a system uses when assigning WWPNs for channels using NPIV. Thus, the SAN can be set up in advance, allowing operations to proceed much faster after the server is installed.
The WWPN tool takes a .csv file that contains the FCP-specific I/O device definitions and creates the WWPN assignments that are required to set up the SAN. A binary configuration file that can be imported later by the system is also created. The .csv file can either be created manually or exported from the Hardware Configuration Definition/Hardware Configuration Manager (HCD/HCM).
The WWPN tool is available for download at IBM Resource Link® and is applicable to all FICON channels defined as CHPID type FCP (for communication with SCSI devices) on zEC12. IBM Resource Link is available at the following website:
Fiber Quick Connect for FICON LX
Fiber Quick Connect (FQC), an optional feature on zEnterprise CPCs, is designed to reduce the amount of time that is required for on-site installation and setup of fiber optic cabling. FQC is offered for all FICON LX (single mode fiber) channels that are in all of the I/O cage, I/O drawer, or PCIe I/O drawer of the server.
FQC facilitates adds, moves, and changes of FICON LX fiber optic cables in the data center, and can reduce fiber connection time by up to 80%. FQC is for factory installation of IBM Facilities Cabling Services - Fiber Transport System (FTS) fiber harnesses for connection to channels in the I/O cage, I/O drawer, or PCIe I/O drawer. FTS fiber harnesses enable connection to FTS direct-attach fiber trunk cables from IBM Global Technology Services.
LAN connectivity
The zEC12 offers a wide range of functions that can help consolidate or simplify the LAN environment with the supported OSA-Express features, while also satisfying the demand for more throughput. Improved throughput (mixed inbound/outbound) is achieved by the data router function that was introduced in the OSA-Express3 features and enhanced in the OSA-Express4S features.
With the data router, the store and forward technique in DMA is no longer used. The data router enables a direct host memory-to-LAN flow. This function avoids a hop and is designed to reduce latency and to increase throughput for standard frames (1492 bytes) and jumbo frames (8992 bytes).
Queued direct I/O (QDIO) optimized latency mode (OLM)
QDIO OLM can help improve performance for applications that have a critical requirement to minimize response times for inbound and outbound data. OLM optimizes the interrupt processing as noted in the following configurations:
For inbound processing, the TCP/IP stack looks more frequently for available data to process, ensuring that any new data is read from the OSA-Express3 or OSA-Express4S without requiring more program controlled interrupts (PCIs).
For outbound processing, the OSA-Express3 or OSA-Express4S looks more frequently for available data to process from the TCP/IP stack, thus not requiring a Signal Adapter (SIGA) instruction to determine whether more data is available.
Inbound workload queuing (IWQ)
IWQ is designed to help reduce overhead and latency for inbound z/OS network data traffic and implement an efficient way for initiating parallel processing. This is achieved by using an OSA-Express4S or OSA-Express3 feature in QDIO mode (CHPID types OSD and OSX) with multiple input queues and by processing network data traffic that is based on workload types. The data from a specific workload type is placed in one of four input queues (per device), and a process is created and scheduled to run on one of multiple processors, independent from the other three queues. This improves performance because IWQ can use the symmetric multiprocessor (SMP) architecture of the zEC12.
zEnterprise ensemble connectivity
With the IBM zEnterprise System, two CHPID types were introduced to support the zEnterprise ensemble:
OSA-Express for Unified Resource Manager (OSM) for the intranode management network (INMN)
OSA-Express for zBX (OSX) for the intraensemble data network (IEDN)
The INMN is one of the ensemble’s two private and secure internal networks. INMN is used by the Unified Resource Manager functions in the primary HMC.
The z196 introduced the OSA-Express for Unified Resource Manager (OSM) CHPID type. The OSM connections are through the Bulk Power Hubs (BPHs) in the zEnterprise CPC. The BPHs are also connected to the INMN TOR switches in the zBX. The INMN requires two OSA-Express4S 1000BASE-T or OSA-Express3 1000BASE-T ports from separate features.
The IEDN is the ensemble’s other private and secure internal network. IEDN is used for communications across the virtualized images (LPARs and virtual machines).
The z196 introduced the OSA-Express for zBX (OSX) CHPID type. The OSX connection is from the zEnterprise CPC to the IEDN TOR switches in zBX. The IEDN requires two OSA-Express4S 10 GbE or OSA Express3 10 GbE ports from separate features.
Virtual local area network (VLAN) support
VLAN is a function of the OSA-Express features that takes advantage of the IEEE 802.1Q standard for virtual bridged LANs. VLANs allow easier administration of logical groups of stations that communicate as though they were on the same LAN. In the virtualized environment of System z, many TCP/IP stacks can exist, potentially sharing OSA-Express features. VLAN provides a greater degree of isolation by allowing contact with a server from only the set of stations that comprise the VLAN.
Virtual MAC (VMAC) support
When sharing OSA port addresses across LPARs, VMAC support enables each operating system instance to have a unique virtual MAC (VMAC) address. All IP addresses associated with a TCP/IP stack are accessible by using their own VMAC address, instead of sharing the MAC address of the OSA port. Advantages include a simplified configuration setup and improvements to IP workload load balancing and outbound routing.
This support is available for Layer 3 mode and is used by z/OS and supported by z/VM for guest usage.
QDIO data connection isolation for the z/VM environment
New workloads increasingly require multitier security zones. In a virtualized environment, an essential requirement is to protect workloads from intrusion or exposure of data and processes from other workloads.
The queued direct input/output (QDIO) data connection isolation enables the following elements:
Adherence to security and HIPAA guidelines and regulations for network isolation between the instances that share physical network connectivity.
Establishing security zone boundaries that are defined by the network administrators.
A mechanism to isolate a QDIO data connection (on an OSA port) by forcing traffic to flow to the external network. This feature ensures that all communication flows only between an operating system and the external network.
Internal routing can be disabled on a per-QDIO connection basis. This support does not affect the ability to share an OSA port. Sharing occurs as it does today, but the ability to communicate between sharing QDIO data connections can be restricted through this support.
QDIO data connection isolation (also known as VSWITCH port isolation) applies to the z/VM environment when using the Virtual Switch (VSWITCH) function and to all of the OSA-Express4S and OSA-Express3 features (CHPID type OSD) on zEC12. z/OS supports a similar capability.
QDIO interface isolation for z/OS
Some environments require strict controls for routing data traffic between servers or nodes. In certain cases, the LPAR-to-LPAR capability of a shared OSA port can prevent such controls from being enforced. With interface isolation, internal routing can be controlled on an LPAR basis. When interface isolation is enabled, the OSA discards any packets that are destined for a z/OS LPAR that is registered in the OSA Address Table (OAT) as isolated.
QDIO interface isolation is supported by Communications Server for z/OS V1R11 and later and for all OSA-Express4S and OSA-Express3 features on zEC12.
Open Systems Adapter for NCP (OSN)
OSN support provides channel connectivity from System z operating systems to the IBM Communication Controller for Linux on System z (CCL). This is done by using the Open Systems Adapter for the Network Control Program (OSA for NCP), which supports the Channel Data Link Control (CDLC) protocol.
When SNA solutions that require NCP functions are needed, CCL can be considered as a migration strategy to replace IBM Communications Controllers (374x). The CDLC connectivity option enables z/TPF environments to use CCL.
 
OSN: The OSN CHPID type is not supported on OSA-Express4S GbE features.
Network management: Query and display OSA configuration
As more complex functions are added to OSA, it becomes more difficult for the system administrator to display, monitor, and verify the specific current OSA configuration that is unique to each operating system. OSA-Express4S and OSA-Express3 allow the operating system to directly query and display the current OSA configuration information (similar to OSA/SF). z/OS uses this OSA capability by providing the TCP/IP operator command Display OSAINFO, which allows the operator to monitor and verify the current OSA configuration, helping to improve the overall management, serviceability, and usability of the OSA-Express4S and OSA-Express3 features.
The Display OSAINFO command is exclusive to OSA-Express4S and OSA-Express3 (CHPID types OSD, OSM, and OSX), the z/OS operating system, and z/VM for guest usage.
HiperSockets
HiperSockets have been called the “network in a box” because they simulate LANs entirely in the hardware. The data transfer is from LPAR memory to LPAR memory, mediated by microcode. zEC12 supports up to 32 HiperSockets. One HiperSockets network can be shared by up to 60 LPARs on a zEC12. Up to 4096 communication paths support a total of 12,288 IP addresses across all 32 HiperSockets.
HiperSockets Layer 2 support
The HiperSockets internal networks can support two transport modes:
Layer 2 (link layer)
Layer 3 (network or IP layer)
Traffic can be Internet Protocol (IP) Version 4 or Version 6 (IPv4, IPv6) or non-IP (such as AppleTalk, DECnet, IPX, NetBIOS, SNA, or others). HiperSockets devices are protocol-independent and Layer 3 independent. Each HiperSockets device has its own Layer 2 Media Access Control (MAC) address, which is designed to allow the use of applications that depend on the existence of Layer 2 addresses such as Dynamic Host Configuration Protocol (DHCP) servers and firewalls.
Layer 2 support can help facilitate server consolidation. Complexity can be reduced, network configuration is simplified and intuitive, and LAN administrators can configure and maintain the mainframe environment the same way as they do for a non-mainframe environment. HiperSockets Layer 2 support is supported by Linux on System z, and by z/VM for guest usage.
HiperSockets Multiple Write Facility
HiperSockets performance was enhanced to allow for the streaming of bulk data over a HiperSockets link between LPARs. The receiving LPAR can now process a much larger amount of data per I/O interrupt. This enhancement is transparent to the operating system in the receiving LPAR. HiperSockets Multiple Write Facility, with fewer I/O interrupts, is designed to reduce CPU use of the sending and receiving LPAR.
The HiperSockets Multiple Write Facility is supported in the z/OS environment.
zIIP-Assisted HiperSockets for large messages
In z/OS, HiperSockets are enhanced for zIIP usage. Specifically, the z/OS Communications Server allows the HiperSockets Multiple Write Facility processing for outbound large messages that originate from z/OS to be performed on a zIIP.
zIIP-Assisted HiperSockets can help make highly secure and available HiperSockets networking an even more attractive option. z/OS application workloads that are based on XML, HTTP, SOAP, Java, and traditional file transfer can benefit from zIIP enablement by lowering general-purpose processor use for such TCP/IP traffic.
When the workload is eligible, the TCP/IP HiperSockets device driver layer (write) processing is redirected to a zIIP, which unblocks the sending application.
zIIP-Assisted HiperSockets for large messages is available on zEC12 with z/OS V1R10 (plus service) and later releases.
HiperSockets Network Traffic Analyzer (HS NTA)
HS NTA is a function available in the zEC12 LIC. It can make problem isolation and resolution simpler by allowing Layer 2 and Layer 3 tracing of HiperSockets network traffic.
HS NTA allows Linux on System z to control tracing of the internal virtual LAN. It captures records into host memory and storage (file systems) that can be analyzed by system programmers and network administrators, using Linux on System z tools to format, edit, and process the trace records.
A customized HiperSockets NTA rule enables authorizing an LPAR to trace messages only from LPARs that are eligible to be traced by the NTA on the selected IQD channel.
HiperSockets Completion Queue
The HiperSockets Completion Queue function allows both synchronous and asynchronous transfer of data between logical partitions. With the asynchronous support, during high volume situations, data can be temporarily held until the receiver has buffers available in its inbound queue. This provides end-to-end performance improvement for LPAR to LPAR communication and can be especially helpful in burst situations.
HiperSockets Completion Queue function is supported on zEC12 running z/OS V1R13, Red Hat Enterprise Linux (RHEL) 6.2 for System z, or SUSE Linux Enterprise Server (SLES) 11 SP2 for System z. HiperSockets Completion Queue is planned to be supported in the z/VM and z/VSE environments in a future deliverable.
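As a conceptual illustration of the completion-queue behavior, the following Python sketch holds data when the receiver's inbound buffers are full and delivers it asynchronously once buffers become free again. The buffer counts and queue discipline are illustrative assumptions, not the Licensed Internal Code design.

from collections import deque

class Receiver:
    def __init__(self, inbound_buffers=2):
        self.free_buffers = inbound_buffers
        self.delivered = []

    def take(self, data):
        if self.free_buffers == 0:
            return False              # inbound queue full right now
        self.free_buffers -= 1
        self.delivered.append(data)
        return True

    def replenish(self):
        self.free_buffers += 1        # application consumed a buffer

def send(receiver, pending, data):
    if not receiver.take(data):       # synchronous path not possible...
        pending.append(data)          # ...hold the data (asynchronous path)

pending = deque()
rx = Receiver()
for i in range(5):                    # burst of 5 messages into 2 buffers
    send(rx, pending, f"msg{i}")

while pending:                        # drain held data as buffers free up
    rx.replenish()
    rx.take(pending.popleft())

print(rx.delivered)                   # all 5 messages arrive, none dropped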
HiperSockets integration with the intraensemble data network
The zEC12 server provides the capability to integrate HiperSockets connectivity with the intraensemble data network (IEDN). Thus, the reach of the HiperSockets network is extended outside the CPC to the entire ensemble, where it appears as a single Layer 2 network. Because HiperSockets and IEDN are both internal System z networks, the combination allows System z virtual servers to use an optimal path for communications.
The support of HiperSockets integration with the IEDN function is available on z/OS Communication Server V1R13 and z/VM 6.2 with PTFs.
HiperSockets Virtual Switch Bridge Support
The z/VM virtual switch is enhanced to transparently bridge a guest virtual machine network connection on a HiperSockets LAN segment. This bridge allows a single HiperSockets guest virtual machine network connection to also directly communicate with the following systems:
Other guest virtual machines on the virtual switch
External network hosts through the virtual switch OSA UPLINK port
z/VM 6.2, TCP/IP, and Performance Toolkit APARs are required for this support.
A HiperSockets channel by itself is capable of providing only intra-CPC communications. The HiperSockets Bridge Port gives a virtual switch that connects z/VM guests through real HiperSockets devices the ability to communicate with hosts that reside outside the CPC. The virtual switch HiperSockets Bridge Port eliminates the need to configure a separate next-hop router on the HiperSockets channel to provide connectivity to destinations that are outside of the HiperSockets channel.
z/VSE fast path to Linux support
Linux Fast Path (LFP) allows z/VSE TCP/IP applications to communicate with the TCP/IP stack on Linux without using a TCP/IP stack on z/VSE. LFP for use in a z/VM guest environment is supported on z/VSE V4R3 or higher. When LFP is used in an LPAR environment, it requires the HiperSockets Completion Queue function available on zEnterprise CPCs. LFP in an LPAR environment is supported on z/VSE V5R1.
3.2.6 Cryptography
zEC12 provides cryptographic functions that, from an application program perspective, can be grouped into the following functions:
Synchronous cryptographic functions, provided by the CP Assist for Cryptographic Function (CPACF)
Asynchronous cryptographic functions, provided by the Crypto Express features
CP Assist for Cryptographic Function (CPACF)
CPACF offers a set of symmetric cryptographic functions for high performance encryption and decryption with clear key operations for SSL/TLS, VPN, and data-storing applications that do not require FIPS 140-2 Level 4 security. The CPACF is integrated with the compression unit in the coprocessor (CoP) in the System z microprocessor core.
The CPACF protected key is a function that is designed to facilitate the continued privacy of cryptographic key material while keeping the wanted high performance. CPACF ensures that key material is not visible to applications or operating systems during encryption operations. CPACF protected key provides substantial throughput improvements for large-volume data encryption and low latency for encryption of small blocks of data.
The cryptographic assist includes support for the following functions:
Data Encryption Standard (DES) data encrypting and decrypting
DES supports the following key types:
 – Single-length key DES
 – Double-length key DES
 – Triple-length key DES (T-DES)
Advanced Encryption Standard (AES) for 128-bit, 192-bit, and 256-bit keys
Pseudo random number generation (PRNG)
Message Authentication Code (MAC)
Hashing algorithms: SHA-1 and SHA-2 support for SHA-224, SHA-256, SHA-384, and SHA-512
The SHA-1 and SHA-2 functions are shipped enabled on all servers and do not require the CPACF enablement feature. The CPACF functions are supported by z/OS, z/VM, z/VSE, z/TPF, and Linux on System z.
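CPACF is used transparently through the operating system's cryptographic stack, so application code is unchanged. As a conceptual illustration only, the following Python sketch performs the kind of symmetric operations listed above (SHA-2 hashing and a MAC); whether a particular run is actually offloaded to CPACF depends on the platform and on how the underlying crypto libraries are built.

# Illustrative use of symmetric primitives of the kind CPACF accelerates
# (SHA-2 hashing and a MAC). This sketch shows only the application-level
# view, which is the same whether or not the operation is offloaded.
import hashlib
import hmac
import os

data = b"bulk payload to be integrity-protected"

digest = hashlib.sha256(data).hexdigest()          # SHA-256 (SHA-2 family)
print("SHA-256:", digest)

key = os.urandom(32)                               # clear key, for illustration only
mac = hmac.new(key, data, hashlib.sha256).hexdigest()
print("HMAC-SHA-256:", mac)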
Crypto Express4S
The Crypto Express4S represents the newest generation of the Peripheral Component Interconnect Express (PCIe) cryptographic coprocessors. It is an optional and zEC12 exclusive feature. This feature provides a secure programming and hardware environment wherein Crypto processes are performed. Each cryptographic coprocessor includes a general-purpose processor, non-volatile storage, and specialized cryptographic electronics.
The Crypto Express4S has one PCIe adapter per feature. For availability reasons, a minimum of two features is required. Up to 16 Crypto Express4S features are supported (16 PCI Express adapters per zEC12). The Crypto Express4S feature occupies one I/O slot in a zEC12 PCIe I/O drawer.
Each adapter can be configured as a Secure IBM CCA coprocessor, a Secure IBM Enterprise PKCS #11 (EP11) coprocessor or as an accelerator.
The accelerator function is designed for maximum-speed Secure Sockets Layer/Transport Layer Security (SSL/TLS) acceleration, rather than for specialized financial applications for secure, long-term storage of keys or secrets. The Crypto Express4S can also be configured in one of the following coprocessor modes:
Secure IBM CCA coprocessor for Federal Information Processing Standard (FIPS) 140-2 Level 4 certification. This standard includes secure key functions and is optionally programmable to deploy more functions and algorithms using User Defined Extension (UDX).
Secure IBM Enterprise PKCS #11 (EP11) coprocessor, implementing an industry standardized set of services that adheres to the PKCS #11 specification v2.20 and more recent amendments. It was designed for extended FIPS and Common Criteria evaluations to meet industry requirements.
This new cryptographic coprocessor mode introduced the PKCS #11 secure key function.
 
TKE feature: The Trusted Key Entry (TKE) Workstation feature is required for supporting the administration of the Crypto Express4S when configured as an Enterprise PKCS #11 coprocessor.
When the Crypto Express4S PCI Express adapter is configured as a secure IBM CCA coprocessor, it still provides accelerator functions. However, up to three times better performance for those functions can be achieved if the Crypto Express4S PCI Express adapter is configured as an accelerator.
Crypto Express3
The Crypto Express3 feature is available on a carry-forward only basis when upgrading from earlier System z systems to zEC12. The Crypto Express3 feature has two PCI Express adapters, and each feature occupies one I/O slot in an I/O cage or in an I/O drawer.
Each adapter can be configured as a coprocessor or as an accelerator:
Secure coprocessor for Federal Information Processing Standard (FIPS) 140-2 Level 4 certification. This standard includes secure key functions and is optionally programmable to deploy more functions and algorithms using User Defined Extension (UDX).
Accelerator for public key and private key cryptographic operations that are used with Secure Sockets Layer/Transport Layer Security (SSL/TLS) processing.
When the Crypto Express3 PCI Express adapter is configured as a secure coprocessor, it still provides accelerator functions. However, up to three times better performance for those functions can be achieved if the Crypto Express3 PCI Express adapter is configured as an accelerator.
 
Statement of Direction: The zEC12 is planned to be the last IBM System z server to support Crypto Express3 feature. Enterprises should begin migrating from the Crypto Express3 feature to the Crypto Express4S feature.
Common Cryptographic Architecture (CCA) enhancements
The following functions were added to Crypto Express4S cryptographic feature and to the Crypto Express3 cryptographic feature when running on zEC12, through the ICSF web deliverable FMID HCR77A0:
Secure Cipher Text Translate
Improved wrapping key strength for security and standards compliance
DUKPT for derivation of Message Authentication Code (MAC) and encryption keys
Compliance with new Random Number Generator standards
EMV enhancements for applications which support American Express cards
Web deliverables
For z/OS downloads, see the z/OS website:
3.3 Hardware Management Console functions
The HMC and SE are appliances that provide hardware platform management for System z. Hardware platform management covers a complex set of setup, configuration, operation, monitoring, and service management tasks and services that are essential to the use of the System z hardware platform product.
The HMC also allows viewing and managing multi-nodal servers with virtualization, I/O networks, service networks, power subsystems, cluster connectivity infrastructure, and storage subsystems through the Unified Resource Manager. A task, Create Ensemble, allows the Access Administrator to create an ensemble that contains CPCs, images, workloads, virtual networks, and storage pools, either with or without an optional zBX.
An ensemble configuration requires a pair of HMCs that are designated as the primary and alternate HMC, and are assigned an ensemble identity. The HMC has a global (ensemble) management function, whereas the SE has local node management responsibility. When tasks are performed on the HMC, the commands are sent to one or more SEs, which issue commands to their CPCs and zBXs.
These Unified Resource Manager features must be ordered to equip an HMC to manage an ensemble:
Ensemble Membership Flag
Manage Firmware Suite
Automate/Advanced Management Firmware Suite (optional)
Additional features might be needed, depending on the selection of Blade options for the zBX.
HMC/SE Version 2.12.0 is the current version available for the zEC12. See Building an Ensemble Using IBM zEnterprise Unified Resource Manager, SG24-7921, for more information about these HMC functions and capabilities.
3.3.1 Hardware Management Console key enhancements for zEC12
The HMC application has several enhancements in addition to the Unified Resource Manager:
Tasks and panels are updated to support configuring and managing the zEC12 introduced Flash Express and IBM zAware features.
For STP NTP broadband security, authentication is added to the HMC’s NTP communication with NTP time servers.
Modem support is removed from HMC. The Remote Support Facility (RSF) for IBM support, service, and configuration update is only possible through an Ethernet broadband connection.
The Monitors Dashboard on the HMC and SE is enhanced with an adapter table for zEC12. The Crypto Utilization percentage is displayed on the Monitors Dashboard according to the PCHID number. The associated Crypto number (Adjunct Processor Number) for this PCHID is also shown in the table. It provides information about utilization rate on a system-wide basis. The adapter table also displays Flash Express.
The Environmental Efficiency Statistic Task provides historical power consumption and thermal information for the zEC12 on the HMC, along with a historical summary of processor and channel use. The initial chart display shows the 24 hours that precede the current time so that a full 24 hours of recent data is displayed. The data is presented in table form and in graphical (histogram) form, and it can also be exported to a .csv formatted file so that it can be imported into a spreadsheet (a post-processing sketch follows this list).
The microcode update to a specific bundle is now possible.
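Because the exported file is a plain .csv, it can be post-processed with ordinary tools. The following minimal Python sketch computes an average from such an export; the file name and column heading are assumptions and must be adjusted to match the actual export layout.

# Minimal sketch: summarize an exported Environmental Efficiency Statistics
# .csv file. The file name and column heading ("Power (kW)") are assumptions.
import csv

def average_power(path):
    total, count = 0.0, 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            total += float(row["Power (kW)"])
            count += 1
    return total / count if count else 0.0

print(f"Average power draw: {average_power('eestats_export.csv'):.2f} kW")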
For more information about the key capabilities and enhancements of the HMC, refer to IBM zEnterprise EC12 Technical Guide, SG24-8049.
3.3.2 Considerations for multiple Hardware Management Consoles
Often, multiple HMC instances are deployed to manage an overlapping collection of systems. Before the announcement of zEnterprise Systems, all HMCs were peer consoles to the managed systems. Furthermore, any management action was possible against any reachable system while logged in to a session on any of the HMCs (subject to access control).
With the definition of an ensemble, this resource management paradigm changes. Management actions that target a node of an ensemble can be done only from the primary HMC for that ensemble.
3.4 zEC12 common time functions
Each server must have an accurate time source to maintain a time-of-day value. Logical partitions use their system’s time. When system images participate in a Sysplex, coordinating the time across all the system images in the Sysplex is critical to its operation.
The zEC12 supports the Server Time Protocol and can participate in a coordinated timing network.
3.4.1 Server Time Protocol (STP)
Server Time Protocol (STP) is a message-based protocol in which timekeeping information is passed over data links between servers. The timekeeping information is transmitted over externally defined coupling links. The STP feature is the supported method for maintaining time synchronization between the zEC12 and coupling facilities (CFs) in Sysplex environments.
The STP design uses a concept called Coordinated Timing Network (CTN). A CTN is a collection of CPCs and CFs that are time synchronized to a time value called Coordinated Server Time (CST). Each CPC and CF to be configured in a CTN must be STP-enabled. STP is intended for CPCs that are configured to participate in a Parallel Sysplex, and for servers that are not in a Parallel Sysplex but must be time synchronized.
STP is implemented in LIC as a system-wide facility of zEC12 and other System z servers. STP presents a single view of time to PR/SM and provides the capability for multiple CPCs and CFs to maintain time synchronization with each other. A zEC12 server is enabled for STP by installing the STP feature code. Additional configuration is required for a zEC12 to become a member of a CTN.
 
Statement of Direction: IBM zEnterprise EC12 will be the last high-end server to support connections to an STP Mixed CTN. This includes the IBM Sysplex Timer® (9037). After zEC12, servers that require time synchronization, such as to support a base or Parallel Sysplex, will require Server Time Protocol (STP), and all servers in that network must be configured in STP-only mode.
STP provides the following additional value over the formerly used time synchronization method that is based on the Sysplex Timer:
STP supports a multi-site timing network of up to 100 km (62 miles) over fiber optic cabling, without requiring an intermediate site. This protocol allows a Parallel Sysplex to span these distances and reduces the cross-site connectivity that is required for a multi-site Parallel Sysplex.
The STP design allows more stringent synchronization between CPCs and CFs by using communication links that are already used for the sysplex connectivity. With the zEC12, STP supports coupling links over InfiniBand or ISC-3 links.
STP helps eliminate infrastructure requirements, such as power and space, needed to support the Sysplex Timers and helps eliminate maintenance costs that are associated with the Sysplex Timers.
STP can reduce the fiber optic infrastructure requirements in a multi-site configuration because it can use coupling links that are already in place.
 
Timing: Concurrent migration from an existing External Time Reference (ETR) network to a timing network using STP is supported only if a z10 EC or z10 BC is used for the Stratum 1 server. System z systems that precede the z10 cannot participate in the same CTN with zEC12.
Server Time Protocol (STP) recovery enhancement
When HCA3-O or HCA3-O LR coupling links are used, an unambiguous “going away signal” is sent when the server on which the HCA3 is running is about to enter a failed state. When the going away signal that is sent by the Current Time Server (CTS) in an STP-only CTN is received by the Backup Time Server (BTS), the BTS can safely take over as the CTS. The take over can occur without relying on the previous recovery methods of Offline Signal (OLS) in a two-server CTN or the Arbiter in a CTN with three or more servers.
The previously available STP recovery design is still available for cases when a going away signal is not received, or for failures other than a system failure.
3.4.2 Network Time Protocol (NTP) client support
The use of Network Time Protocol (NTP) servers as an external time source (ETS) usually fulfills a requirement for a time source or common time reference across heterogeneous platforms and for providing a higher time accuracy.
NTP client support is available in the Support Element (SE) code of the zEC12. The code interfaces with the NTP servers. This interaction allows an NTP server to become the single time source for zEC12 and for other servers that have NTP clients. NTP can be used only for an STP-only CTN environment.
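As a conceptual illustration of what an NTP client obtains from an external time source, the following Python sketch queries an NTP server using the third-party ntplib package; the server name is a placeholder, and this is not the SE's internal implementation.

# Minimal NTP client sketch (conceptual, not the SE implementation).
# Requires the third-party "ntplib" package; the server name is a placeholder.
import ntplib
from datetime import datetime, timezone

client = ntplib.NTPClient()
response = client.request("ntp.example.com", version=3)

print("stratum of the time source:", response.stratum)
print("clock offset (seconds):", response.offset)
print("server time (UTC):",
      datetime.fromtimestamp(response.tx_time, tz=timezone.utc))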
 
ETS access: ETS access through Modem is not supported on the zEC12 HMC.
Pulse per second (PPS) support
Certain NTP servers also provide a pulse per second (PPS) output signal. The PPS output signal is more accurate (within 10 microseconds) than that from the NTP server without PPS (within 100 milliseconds).
Two External Clock Facility (ECF) cards are shipped as a standard feature of the zEC12 system and provide a dual-path interface for the PPS signal. The redundant design allows continuous operation, in case of failure of one card, and concurrent card maintenance.
Each of the standard ECF cards of the zEC12 has a PPS port (for a coaxial cable connection) that can be used by STP with the NTP client.
NTP server on HMC with security enhancements
The NTP server capability on the HMC addresses the potential security concerns that users can have for attaching NTP servers directly to the HMC/SE LAN. When using the HMC as the NTP server, the pulse per second capability is not available.
HMC NTP broadband authentication support for zEC12
The HMC NTP authentication capability is provided by the HMC Level 2.12.0. SE NTP support stays unchanged. To use this option for STP, configure the HMC as the NTP server for the SE.
The authentication support of the HMC NTP server can be set up in either of two ways:
NTP requests are UDP socket packets and cannot pass through a proxy. If a proxy is used to access servers outside the corporate data center, that proxy must be configured as an NTP server to reach the target time servers on the web. Authentication can be set up on the client’s proxy to communicate with the target time sources.
If a firewall is used, HMC NTP requests must pass through the firewall. Clients in this configuration should use the HMC authentication to ensure untampered time stamps.
For a more in-depth discussion of STP, see the Server Time Protocol Planning Guide, SG24-7280, and the Server Time Protocol Implementation Guide, SG24-7281.
3.5 zEC12 power functions
As environmental concerns raise the focus on energy consumption, the zEC12 offers a holistic focus on the environment. New efficiencies and functions, such as an improved integrated cooling system, static power save mode, and cycle steering (used when the radiator or water cooling system is running on its backup air cooling), enable a dramatic reduction of energy usage and floor space when consolidating workloads from distributed servers.
3.5.1 High voltage DC power
In today’s data centers, many businesses are paying increasing electric bills and are also running out of power. The zEC12 High Voltage Direct Current power feature adds nominal 380 - 520 Volt DC input power capability to the existing System z, universal 3 phase, 50/60 hertz, totally redundant power capability (nominal 200–240VAC or 380–415VAC or 480VAC).
This feature allows CPCs to directly use the high voltage DC distribution in new, green data centers. A direct HV DC data center power design can improve data center energy efficiency by removing the need for a DC-to-AC inversion step. The zEC12’s bulk power supplies have been modified to support HV DC, so the only difference in shipped HW to implement the option is the DC power cords.
Because HV DC is a new technology, there are multiple proposed standards. The zEC12 supports both ground-referenced and dual-polarity HV DC supplies, such as +/-190V or +/-260V, or +380V. Beyond the data center uninterruptible power supply (UPS) and power distribution energy savings, a zEC12 run on HV DC power draws 1 - 3% less input power. HV DC does not change the number of power cords that a system requires.
3.5.2 Integrated battery feature (IBF)
IBF is an optional feature on the zEC12. See Figure 2-2 on page 25 for a pictorial view of the location of IBF. IBF provides the function of a local uninterrupted power source.
The IBF further enhances the robustness of the power design, increasing power line disturbance immunity. The feature provides battery power to preserve processor data if there is a loss of power on all four AC feeds from the utility company. The IBF can hold power briefly during a brownout, or for orderly shutdown in a longer outage.
3.5.3 Power capping and power saving
zEC12 supports power capping, which gives the ability to control the maximum power consumption and reduce cooling requirements (especially with zBX). To use power capping, the Automate Firmware Suite feature must be ordered. This feature enables the Automate suite of functionality that is associated with the Unified Resource Manager. The Automate suite includes representation of resources in a workload context, goal-oriented monitoring and management of resources, and energy management. A static power-saving mode is also available for the zEC12 when the Automate Firmware Suite feature is installed. It uses frequency and voltage reduction to reduce energy consumption, and can be set up ad hoc or as a scheduled operation. For example, in periods of low utilization, or on CBU systems, clients can put the system into static power-saving mode. Power saving functions are also provided for the blades in the zBX.
3.5.4 Power estimation tool
The power estimation tool for the zEC12 allows you to enter your precise server configuration to produce an estimate of power consumption. Log in to Resource Link and go to Planning → Tools → Power Estimation Tools. Specify the quantity for the features that are installed in your machine. This tool estimates the power consumption for the specified configuration. The tool does not verify that the specified configuration can be physically built.
 
Power consumption: The exact power consumption for your machine will vary. The objective of the tool is to produce an estimation of the power requirements to aid you in planning for your machine installation. Actual power consumption after installation can be confirmed on the HMC monitoring tools.
3.5.5 zEC12 radiator cooling system
The cooling system in zEC12 is redesigned for better availability and lower cooling power consumption. Water cooling technology is now fully used in zEC12 MCMs.
The Modular Refrigeration Unit (MRU) technology that cooled the processor modules in previous System z servers is replaced with a radiator design, a new closed-loop water cooling pump system. The radiator requires no connection to building chilled water; the water is added to the closed-loop system during installation through the new Fill and Drain Tool.
The radiator cooling system is designed with two pumps and two blowers, but a single working pump and blower can handle the entire load. Replacement of a pump or blower is concurrent, without any performance impact.
In the zEC12, the radiator is the primary cooling source for the MCMs and is backed up by an air cooling system in the rare case of a failure of the entire radiator. During the backup air cooling mode, hot air exits through the top of the system and the oscillator card is set to a slower cycle time. In this “cycle steering mode”, the system slows down to allow the degraded cooling capacity to maintain the proper temperature range. Running at a slower cycle time, the MCMs produce less heat. The slowdown process is done in steps, which are based on the temperature in the books.
3.5.6 zEC12 water cooling
The zEC12 continues to offer the possibility of using the building’s chilled water to cool the system, by employing the water cooling unit (WCU) technology. The MCM in the book is cooled by an internal, closed, water cooling loop. In the internal closed loop, water exchanges heat with building chilled water through a cold plate. The source of building chilled water is provided by the client.
In addition to the MCMs, the internal water loop also circulates through two heat exchangers that are in the path of the exhaust air in the rear of the frames. These heat exchangers remove approximately 60% - 65% of the residual heat from the I/O drawers, the air cooled logic in the books and the heat that is generated within the power enclosures. Almost two thirds of the total heat that is generated is removed from the room by the chilled water.
The zEC12 operates with two fully redundant water cooling units (WCUs). One water cooling unit can support the entire load and the replacement of WCU is fully concurrent. If there is a total loss of building chilled water or if both water cooling units fail, the backup blowers are turned on to keep the system running, similarly to the radiator cooling system. At that time, cycle time degradation is required.
Unlike the z196, the zEC12 books are the same in both the air and water-cooled systems. However, conversion between air and water cooling is not available.
3.5.7 IBM Systems Director Active Energy Manager
IBM Systems Director Active Energy Manager™ (AEM) is an energy management solution building block that returns true control of energy costs to the client. This feature enables you to manage the actual power consumption and resulting thermal loads that IBM servers place on the data center. It is an industry-leading cornerstone of the IBM energy management framework. In tandem with chip vendors such as Intel and AMD, and consortia such as The Green Grid, AEM advances the IBM initiative to deliver price performance per square foot.
AEM runs on Windows, Linux on System x, AIX, Linux on IBM System p®, and Linux on System z. For more information, see its documentation.
How AEM works
The following list is a brief overview of how AEM works:
Hardware, firmware, and systems management software in servers and blades can take inventory of components.
AEM adds up the power draw for each server or blade and tracks that usage over time.
When power is constrained, AEM allows power to be allocated on a server-by-server basis. Consider the following information:
 – Care must be taken that limiting power consumption does not affect performance.
 – Sensors and alerts can warn the user if limiting power to this server could affect performance.
Certain data can be gathered from the SNMP API on the HMC:
 – System name, machine type, model, serial number, firmware level
 – Ambient and exhaust temperature
 – Average and peak power (over a 1-minute period)
 – Other limited status and configuration information
3.5.8 zEC12 Top Exit Power
zEC12 introduces the Top Exit Power optional feature (FC 7901) offering. See Figure 3-3. This feature enables installing a radiator (air) cooled zEC12 on a non-raised floor, when the optional top exit I/O cabling feature (FC 7942) is also installed. Water-cooled zEC12 models cannot be installed on a non-raised floor as top exit support for water cooling systems is not available. On a raised floor, either radiator or water cooling is supported.
Figure 3-3 zEC12 Top Exit Power and I/O Cabling Option
3.6 zEC12 capacity on demand (CoD)
The zEC12 continues to deliver on-demand offerings. The offerings provide flexibility and control to the client, ease the administrative burden in the handling of the offerings, and give the client finer control over resources that are needed to meet the resource requirements in various situations.
The zEC12 can perform concurrent upgrades, providing more capacity with no server outage. In most cases, with operating system support, a concurrent upgrade can also be nondisruptive to the operating system. It is important to consider that these upgrades are based on the enablement of resources already physically present in the zEC12.
Capacity upgrades cover both permanent and temporary changes to the installed capacity. The changes can be done using the Customer Initiated Upgrade (CIU) facility, without requiring IBM service personnel involvement. Such upgrades are initiated through the web by using IBM Resource Link. Use of the CIU facility requires a special contract between the client and IBM, through which terms and conditions for online capacity on demand (CoD) buying of upgrades and other types of CoD upgrades are accepted. For more information, consult the IBM Resource Link web page:
For more information about the CoD offerings, see the IBM zEnterprise EC12 Technical Guide, SG24-8049.
3.6.1 Permanent upgrades
Permanent upgrades of processors (CPs, IFLs, ICFs, zAAPs, zIIPs, and SAPs) and memory, or changes to a server’s Model-Capacity Identifier, up to the limits of the installed processor capacity on an existing zEC12, can be performed by the client through the IBM Online Permanent Upgrade offering by using the CIU facility.
3.6.2 Temporary upgrades
Temporary upgrades of a zEC12 can be done by On/Off CoD, Capacity Backup (CBU), or Capacity for Planned Event (CPE) ordered from the CIU facility.
On/Off CoD function
On/Off CoD is a function that is available on the zEC12 that enables concurrent and temporary capacity growth of the CPC. On/Off CoD can be used for client peak workload requirements for any length of time; it has a daily hardware charge and can have an associated software charge. On/Off CoD offerings can be pre-paid or post-paid. Capacity tokens are available on zEC12. Capacity tokens are always present in pre-paid offerings and can be present in post-paid offerings if the client so desires. In both cases, capacity tokens are used to control the maximum resource and financial consumption.
When using the On/Off CoD function, the client can concurrently add processors (CPs, IFLs, ICFs, zAAPs, zIIPs, and SAPs), increase the CP capacity level, or both.
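The following Python sketch illustrates the token idea: a record is loaded with a pool of tokens, active temporary capacity consumes tokens over time, and the capacity is withdrawn when the pool is exhausted. The token unit, the consumption figures, and the daily metering model are illustrative assumptions only.

# Conceptual sketch of capacity-token metering for On/Off CoD (illustrative).
class OnOffCoDRecord:
    def __init__(self, tokens):
        self.tokens = tokens          # remaining capacity tokens
        self.active = False

    def activate(self):
        self.active = self.tokens > 0

    def meter_one_day(self, daily_consumption):
        if not self.active:
            return
        self.tokens -= daily_consumption
        if self.tokens <= 0:          # pool exhausted: capacity is withdrawn
            self.tokens = 0
            self.active = False

record = OnOffCoDRecord(tokens=300)   # e.g. 300 MSU-days, purely illustrative
record.activate()
day = 0
while record.active:
    record.meter_one_day(daily_consumption=45)
    day += 1
print(f"temporary capacity withdrawn after {day} days, tokens left: {record.tokens}")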
Capacity Backup (CBU) function
CBU allows the client to perform a concurrent and temporary activation of additional CPs, ICFs, IFLs, zAAPs, zIIPs, and SAPs, an increase of the CP capacity level, or both. This function can be used in the event of an unforeseen loss of System z capacity within the client's enterprise, or to perform a test of the client's disaster recovery procedures. The capacity of a CBU upgrade cannot be used for peak workload management.
CBU features are optional and require unused capacity to be available on installed books of the backup system, either as unused PUs or as a possibility to increase the CP capacity level on a subcapacity system, or both. A CBU contract must be in place before the LIC-CC code that enables this capability can be loaded on the system. An initial CBU record provides for one test for each CBU year (each up to 10 days in duration) and one disaster activation (up to 90 days in duration). The record can be configured to be valid for up to five years.
Proper use of the CBU capability does not incur any additional software charges from IBM.
Capacity for Planned Event (CPE) function
CPE allows the client to perform a concurrent and temporary activation of more CPs, ICFs, IFLs, zAAPs, zIIPs, and SAPs, an increase of the CP capacity level, or both. This function can be used in the event of a planned outage of System z capacity within the client’s enterprise (for example, data center changes or system maintenance). CPE cannot be used for peak workload management and can be active for a maximum of three days.
The CPE feature is optional and requires unused capacity to be available on installed books of the back-up system, either as unused PUs or as a possibility to increase the CP capacity level on a subcapacity system, or both. A CPE contract must be in place before the LIC-CC that enables this capability can be loaded on the system.
3.6.3 z/OS capacity provisioning
Capacity provisioning helps clients manage the CP, zAAP, and zIIP capacity of zEC12 that is running one or more instances of the z/OS operating system. Using z/OS Capacity Provisioning Manager (CPM) component, On/Off CoD temporary capacity can be activated and deactivated under control of a defined policy. Combined with functions in z/OS, the zEC12 provisioning capability gives the client a flexible, automated process to control the configuration and activation of On/Off CoD offerings.
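The following Python sketch illustrates the general shape of such a policy: activate temporary capacity when utilization stays above a threshold for a sustained period, and deactivate it when utilization falls back. The thresholds, durations, and sample values are hypothetical and do not represent an actual CPM policy.

# Conceptual sketch of threshold-based provisioning logic (illustrative only).
def provisioning_decision(samples, active, high=90, low=60, sustain=3):
    """samples: most recent utilization percentages, newest last."""
    recent = samples[-sustain:]
    if not active and len(recent) == sustain and min(recent) >= high:
        return "ACTIVATE"
    if active and max(recent) <= low:
        return "DEACTIVATE"
    return "NO_CHANGE"

history, active = [], False
for util in [70, 92, 95, 97, 96, 80, 55, 50, 45]:
    history.append(util)
    action = provisioning_decision(history, active)
    if action == "ACTIVATE":
        active = True
    elif action == "DEACTIVATE":
        active = False
    print(f"util={util:3d}%  action={action:10s}  temporary capacity active={active}")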
3.7 Throughput optimization with zEC12
The z990 was the first server to use the concept of books. The memory and cache structure implementation in the books has been enhanced from the z990, through successive system generations, to the zEC12 to provide sustained throughput and performance improvements. Although the memory is distributed across the books, and each book has its own levels of cache (private to the cores and shared by the cores), all processors have access to the highest level of cache and to all the memory. Thus, the system is managed as a memory-coherent symmetric multiprocessor (SMP).
Processors within the zEC12 book structure have different distance-to-memory attributes. As described in 2.4, “zEC12 processor cage, books, and multiple chip modules” on page 26, books are connected in a star configuration to minimize the distance. Other non-negligible effects result from data latency when grouping and dispatching work on a set of available logical processors. To minimize latency, one can aim to dispatch and later redispatch work to a group of physical CPUs that share the same cache levels.
PR/SM manages the use of physical processors by logical partitions by dispatching the logical processors on the physical processors. But PR/SM is not aware of which workloads are being dispatched by the operating system in what logical processors. The Workload Manager (WLM) component of z/OS has the information at the task level, but is unaware of physical processors. This disconnect is solved by enhancements that allow PR/SM and WLM to work more closely together. They can cooperate to create an affinity between task and physical processor rather than between logical partition and physical processor. This is known as HiperDispatch.
HiperDispatch
HiperDispatch, introduced with the z10 Enterprise Class, is enhanced in z196 and zEC12. It combines two functional enhancements, one in the z/OS dispatcher and one in PR/SM. This function is intended to improve efficiency both in the hardware and in z/OS.
In general, the PR/SM dispatcher assigns work to the minimum number of logical processors that are needed for the priority (weight) of the LPAR. PR/SM attempts to group the logical processors into the same book on a zEC12 and, if possible, in the same chip. The result is to reduce the multi-processor effects, maximize use of shared cache, and lower the interference across multiple partitions.
The z/OS dispatcher is enhanced to operate with multiple dispatching queues, and tasks are distributed among these queues. Specific z/OS tasks can be dispatched to a small subset of logical processors. PR/SM ties these logical processors to the same physical processors, thus improving the hardware cache reuse and locality of reference characteristics, such as reducing the rate of cross-book communication.
To use the correct logical processors, the z/OS dispatcher obtains the necessary information from PR/SM through interfaces that are implemented on the zEC12. The entire zEC12 stack (hardware, firmware, and software) now tightly collaborates to obtain the full potential of the hardware.
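The following Python sketch contrasts spreading work across all logical processors with a HiperDispatch-style placement that keeps related work on processors sharing cache. The topology and task counts are illustrative assumptions, not the PR/SM or z/OS dispatcher logic.

# Conceptual comparison of "spread everywhere" dispatch versus cache-affinity
# dispatch. Physical processor groups below model processors sharing a cache.
from collections import Counter

CACHE_GROUPS = {
    "chip0": ["cp0", "cp1", "cp2"],   # hypothetical chip/book sharing cache
    "chip1": ["cp3", "cp4", "cp5"],
}

def dispatch_spread(tasks):
    """Pre-HiperDispatch style: round-robin across every processor."""
    all_cps = [cp for group in CACHE_GROUPS.values() for cp in group]
    return {task: all_cps[i % len(all_cps)] for i, task in enumerate(tasks)}

def dispatch_affinity(tasks, home_group="chip0"):
    """HiperDispatch style: keep related work on processors sharing cache."""
    home = CACHE_GROUPS[home_group]
    return {task: home[i % len(home)] for i, task in enumerate(tasks)}

tasks = [f"task{i}" for i in range(9)]
for name, placement in [("spread", dispatch_spread(tasks)),
                        ("affinity", dispatch_affinity(tasks))]:
    groups = Counter(next(g for g, cps in CACHE_GROUPS.items() if placement[t] in cps)
                     for t in tasks)
    print(name, dict(groups))   # affinity keeps all work inside one cache group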
The HiperDispatch function is enhanced on the zEC12 to use the new hex-core chip and improve computing efficiency. It is possible to dynamically switch on and off HiperDispatch without requiring an IPL.
3.8 zEC12 performance
The System z microprocessor chip of the zEC12 has a high-frequency design that uses IBM leadership technology and offers more cache per core than other chips. In addition, an enhanced instruction execution sequence delivers world-class per-thread performance. z/Architecture is enhanced with more instructions that are intended to deliver improved CPU-centric performance. For CPU-intensive workloads, more gains can be achieved by multiple compiler-level improvements. The improved performance of the zEC12 is a result of the enhancements that we described in Chapter 2, “Hardware overview” on page 21 and 3.2, “zEC12 technology improvements” on page 57.
The zEC12 Model HA1 is designed to offer approximately 50% more capacity than the z196 Model M80 system. Uniprocessor performance has also increased significantly. A zEC12 Model 701 offers, based on an average workload, performance improvements of about 25% over the z196 Model 701.
However, the observed performance increase varies depending on the workload type.
To help in better understanding workload variations, IBM provides a no-cost tool that is called the IBM Processor Capacity Reference for System z (zPCR). The tool can be downloaded from the following web page:
IBM continues to measure performance of the systems by using various workloads and publishes the results in the Large Systems Performance Reference (LSPR) report. The LSPR is available at the following website:
The MSU ratings are available at the following website:
LSPR workload suite: zEC12 changes
Historically, LSPR capacity tables, including pure workloads and mixes, have been identified with application names or a software characteristic. Examples are CICS, IMS, OLTP-T, CB-L, LoIO-mix, and TI-mix. However, capacity performance is more closely associated with how a workload uses and interacts with a particular processor hardware design. Workload capacity performance is sensitive to three major factors:
Instruction path length
Instruction complexity
Memory hierarchy
With the availability of the CPU measurement facility (MF) data, it is now possible to gain insight into the interaction of workload and hardware design in production workloads. CPU MF data helps LSPR to adjust workload capacity curves that are based on the underlying hardware sensitivities, in particular the processor access to caches and memory. This is known as the relative nest intensity. With the IBM zEnterprise System, the LSPR introduced three new workload capacity categories that replace all prior primitives and mixes:
LOW (relative nest intensity)
A workload category that represents light use of the memory hierarchy. This category is similar to past high scaling primitives.
AVERAGE (relative nest intensity)
A workload category that represents average use of the memory hierarchy. This category is similar to the past LoIO-mix workload and is expected to represent most of the production workloads.
HIGH (relative nest intensity)
A workload category that represents heavy use of the memory hierarchy. This category is similar to the past TI-mix workload.
These categories are based on the relative nest intensity, which is influenced by many variables such as application type, I/O rate, application mix, CPU usage, data reference patterns, LPAR configuration, and the software configuration that is running, among others. CPU MF data can be collected by z/OS System Management Facilities (SMF) in SMF type 113 records.
Guidance in converting the previous LSPR categories to the new ones is provided, and built-in support is added to the IBM zPCR tool.
In addition to low, average, and high categories, the latest zPCR provides the low-average and average-high mixed categories, which allow better granularity for workload characterization.
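Conceptually, assigning a workload to one of these categories is a simple mapping from its relative nest intensity value, as the following Python sketch shows. The boundary values used here are hypothetical placeholders; the real category boundaries come from the LSPR and zPCR methodology.

# Toy mapping of a relative nest intensity (RNI) value to an LSPR category.
# The thresholds are invented for illustration and are NOT the LSPR boundaries.
def lspr_category(rni):
    if rni < 0.6:
        return "LOW"
    if rni < 0.9:
        return "LOW-AVERAGE"
    if rni < 1.1:
        return "AVERAGE"
    if rni < 1.4:
        return "AVERAGE-HIGH"
    return "HIGH"

for rni in (0.45, 0.85, 1.0, 1.25, 1.6):
    print(f"RNI {rni:.2f} -> {lspr_category(rni)}")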
The LSPR tables continue to rate all z/Architecture processors running in LPAR mode and 64-bit mode. The single-number values are based on a combination of the default mixed workload ratios, typical multi-LPAR configurations, and expected early-program migration scenarios. In addition to z/OS workloads used to set the single-number values, the LSPR tables contain information that pertains to Linux and z/VM environments.
The LSPR contains the internal throughput rate ratios (ITRRs) for the zEC12 and the previous generations of processors that are based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user might experience varies depending on factors such as the amount of multiprogramming in the user's job stream, the I/O configuration, and the workload processed.
Experience demonstrates that System z servers can be run at up to 100% use levels, sustained, although most clients prefer to leave a bit of white space and run at 90% or slightly under. For any capacity comparison, using “one number” such as the MIPS or MSU metrics is not a valid method. That is why, while doing capacity planning, we suggest using zPCR and involving IBM technical support. For more information about zEC12 performance, refer to the IBM zEnterprise EC12 Technical Guide, SG24-8049.
3.9 zEnterprise BladeCenter Extension Model 003
zEC12 introduces the zEnterprise BladeCenter Extension (zBX) Model 003.
The zEnterprise BladeCenter Extension (zBX) Model 003 continues to support workload optimization and integration. As an optional feature that is attached to the zEC12 by a secure high-performance private network, the zBX can house multiple environments that include AIX and Linux on System x and Windows, supporting a “fit for purpose” application deployment.
The zBX is tested and packaged together at the IBM manufacturing site and shipped as one unit, relieving complex configuration and set up requirements. With a focus on availability, the zBX has hardware redundancy that is built in at various levels: the power infrastructure, rack-mounted network switches, power and switch units in the BladeCenter chassis, and redundant cabling for support and data connections. The zEnterprise BladeCenter Extension (zBX) Model 003 components are configured, managed, and serviced the same way as the other components of the System z server.
Although the zBX processors are not z/Architecture PUs, the zBX is handled by System z firmware called zEnterprise Unified Resource Manager. The zBX hardware features are part of the mainframe, not add-ons.
 
Statement of Direction:
IBM intends to deliver automated multi-site recovery for zBX hardware components that are based upon GDPS technologies. These capabilities will help facilitate the management of planned and unplanned outages across IBM zEnterprise EC12.
IBM intends to deliver new functionality with IBM Systems Director offerings to support IBM zBX. Such planned new capabilities will be designed to provide virtual image management and enhanced energy management functions for the Power Systems and System x blades.
IBM intends to deliver workload-aware optimization for IBM System x Blades in the zBX. This allows virtual CPU capacity to be adjusted automatically across virtual servers within a hypervisor, helping to ensure that System x resources in the zBX are executing to the defined service level agreements (SLAs).
3.9.1 IBM blades
IBM offers select IBM BladeCenter PS701 Express blades that can be installed and operated on the zBX Model 003. These blades are virtualized by PowerVM Enterprise Edition. The virtual servers in PowerVM run the AIX operating system.
PowerVM handles all the access to the hardware resources, providing a Virtual I/O Server (VIOS) function and the ability to create logical partitions. The logical partitions can be either dedicated processor LPARs, which require a minimum of one core per partition, or shared processor LPARs (micro-partitions), which in turn can be as small as 0.1 core per partition.
A select set of IBM BladeCenter HX5 (7873) blades is available for the zBX. These blades have an integrated hypervisor, and their virtual machines run Linux on System x and Microsoft Windows.
Also available on the zBX is the IBM WebSphere DataPower XI50 for zEnterprise appliance. The DataPower XI50z is a multifunctional appliance that can help provide multiple levels of XML optimization, streamline and secure valuable service-oriented architecture (SOA) applications, and provide drop-in integration for heterogeneous environments by enabling core enterprise service bus (ESB) functionality. These functions include: routing, bridging, transformation, and event handling. It can help to simplify, govern, and enhance the network security for XML and web services.
Software that is supported on blades is described in more detail in 4.3, “Software support for zBX” on page 112.
3.10 Reliability, availability, and serviceability (RAS)
The IBM zEnterprise System family presents numerous enhancements in the RAS areas. In the availability area, focus was given to reduce the planning requirements, while continuing to improve the elimination of planned, scheduled, and unscheduled outages.
Enhanced driver maintenance (EDM) helps reduce the necessity and the eventual duration of a scheduled outage. One of the contributors to scheduled outages is LIC Driver updates that are performed in support of new features and functions. When properly configured, the zEC12 can concurrently activate a new LIC Driver level. Concurrent activation of the select new LIC Driver level is supported at specifically released synchronization points. However, there are certain LIC updates where a concurrent update or upgrade is not possible.
With enhanced book availability, the effect of book replacement is minimized. In a multiple-book system, a single book can be concurrently removed and reinstalled for an upgrade or repair. To ensure that the zEC12 configuration supports removal of a book with minimal effect on the workload, the flexible memory option should be considered.
The zEC12 provides a way to increase memory availability, called Redundant Array of Independent Memory (RAIM), where a fully redundant memory system can identify and correct memory errors without stopping. The implementation is similar to the RAID concept used in storage systems for a number of years. See IBM zEnterprise EC12 Technical Guide, SG24-8049, for a detailed description of the RAS features.
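As a simplified analogy only, the following Python sketch uses XOR parity across memory channels to show how the contents of a failed channel can be reconstructed from the surviving channels. The actual RAIM design uses ECC and channel redundancy implemented in hardware, not this scheme.

# Simplified "RAID for memory" analogy: data is spread across channels with a
# redundant parity channel, so a failed channel can be rebuilt. Illustrative only.
from functools import reduce

data_channels = [b"\x11\x22", b"\x33\x44", b"\x55\x66", b"\x77\x88"]

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

parity = reduce(xor_bytes, data_channels)          # redundant "fifth channel"

failed_index = 2                                   # pretend channel 2 fails
survivors = [c for i, c in enumerate(data_channels) if i != failed_index]
recovered = reduce(xor_bytes, survivors, parity)   # rebuild from parity

assert recovered == data_channels[failed_index]
print("recovered channel contents:", recovered.hex())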
To prevent both scheduled and unscheduled outages, there are several availability improvements in different components of zEC12. These enhancements include error detection and recovery improvements in both caches and memory, IBM zAware, Flash Express, Fibre Channel Protocol support for T10-DIF, a fixed HSA with its size doubled to 32 GB, OSA firmware changes to increase the capability of concurrent maintenance change level (MCL) updates, a radiator cooling system with N+1 redundancy, corrosion sensors, a new CFCC level, RMF reporting, and zBX connectivity.
zEC12 continues to support concurrent addition of resources, such as processors or I/O cards, to an LPAR to achieve better serviceability. If an additional system assist processor (SAP) is required on a zEC12 (for example, as a result of a disaster recovery situation), the SAPs can be concurrently added to the CPC configuration.
It is possible to concurrently add CP, zAAP, zIIP, IFL, and ICF processors to an LPAR. This function is supported by z/VM V5R4 and later, and also (with appropriate PTFs) by z/OS and z/VSE V4R3 and later. Previously, proper planning was required to add CPs, zAAPs, and zIIPs to a z/OS LPAR concurrently. It is also possible to concurrently add memory to an LPAR. This ability is supported by z/OS and z/VM.
zEC12 supports dynamically adding Crypto Express features to an LPAR by being able to change the cryptographic information in the image profiles without outage to the LPAR. Users can also dynamically delete or move Crypto Express features. This enhancement is supported by z/OS, z/VM, and Linux on System z.
3.10.1 IBM System z Advanced Workload Analysis Reporter (IBM zAware)
Introduced with the zEC12, the IBM zAware feature is an integrated expert solution that uses sophisticated analytics to help clients identify potential problems and improve overall service levels.
IBM zAware runs analytics in firmware and intelligently examines z/OS message logs for potential deviations, inconsistencies, or variations from the norm, providing out-of-band monitoring and machine learning of operating system health.
IBM zAware can accurately identify system anomalies in minutes. This feature analyzes massive amounts of processor data to identify problematic messages and provides information that can feed other processes or tools. The IBM zAware virtual appliance monitors the z/OS operations log (OPERLOG), which contains all messages that are written to the z/OS console, including application-generated messages. IBM zAware provides a graphical user interface (GUI) for easy drill-down into message anomalies, which can lead to faster problem resolution.
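The following Python sketch gives a toy illustration of this kind of analysis: message IDs in a recent interval are scored for rarity against a learned baseline, and unusually rare or unseen IDs are flagged. The message IDs, counts, scoring, and threshold are illustrative assumptions and do not represent the IBM zAware algorithm.

# Toy anomaly scoring of console message IDs against a learned baseline.
from collections import Counter
import math

baseline = Counter({"IEF403I": 5000, "IEF404I": 5000, "IEA989I": 1200,
                    "IOS071I": 3})                  # hypothetical "normal" counts

interval = ["IEF403I", "IEF404I", "IOS071I", "IXC101I", "IXC101I"]

total = sum(baseline.values())
def rarity(msg_id):
    # Higher score = rarer relative to the baseline (unseen IDs score highest).
    count = baseline.get(msg_id, 0)
    return -math.log((count + 1) / (total + 1))

for msg_id, n in Counter(interval).items():
    score = rarity(msg_id)
    flag = "ANOMALY" if score > 6.0 else "ok"       # hypothetical threshold
    print(f"{msg_id:8s} seen {n}x  rarity={score:5.2f}  {flag}")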
 
Statement of Direction: IBM plans to provide new capability within the IBM Tivoli® Integrated Service Management family of products that is designed to use analytics information from IBM zAware and to provide alert and event notification.
More detail about IBM zAware can be found in IBM zEnterprise EC12 Technical Guide, SG24-8049; IBM zAware Concept Guide, SG24-8070; and Advanced Workload Analysis Reporter (IBM zAware), SC27-2623.
3.10.2 RAS capability for the HMC
In an ensemble environment, the Unified Resource Manager routines are run in the HMC.
The Unified Resource Manager is an active part of the ensemble infrastructure. Thus, the HMC has a stateful environment that needs high-availability features to ensure that ensemble management survives an HMC failure.
Each ensemble requires two HMC workstations: a primary and a backup. The contents and activities of the primary are kept synchronously updated on the backup HMC so that the backup can automatically take over the activities of the primary if the primary fails. Although the primary HMC can perform classic HMC activities in addition to the Unified Resource Manager activities, the backup HMC can act only as the backup. No other tasks or activities can be performed at the backup HMC.
3.10.3 RAS capability for zBX
The zBX was built following traditional System z hardware quality of service (QoS) practices to include RAS capabilities. The zBX offering provides extended service capability through the hardware management structure. The HMC/SE functions of the zEC12 provide management and control functions for the zBX solution.
Except for a zBX configuration with only one chassis installed, the zBX is configured to provide N+1 components. The components can be replaced concurrently. In addition, zBX configuration upgrades can be performed concurrently.
The zBX has two Top of Rack (TOR) switches for each network (INMN and IEDN). These switches provide N + 1 connectivity for the private networks between the zEC12 and the zBX for monitoring, controlling, and managing the zBX components.
zBX firmware
The testing, delivery, installation, and management of the zBX firmware is handled the same way as for the zEC12. The same zEC12 processes and controls are used. Any fixes to the zBX machine are downloaded to the controlling zEC12’s SE and are applied to the zBX.
The MCLs for the zBX are concurrent and their status can be viewed at the zEC12’s HMC.
These and additional features are further described in IBM zEnterprise EC12 Technical Guide, SG24-8049.
3.11 High availability technology for zEC12
System z is renowned for its reliability, availability, and serviceability capabilities, of which Parallel Sysplex is a prime example. Extended availability technology with IBM PowerHA® for Power is available for blades in the zBX. First, we describe the System z Parallel Sysplex technology and then the PowerHA technology.
3.11.1 High availability for zEC12 with Parallel Sysplex
Parallel Sysplex technology is a clustering technology for logical and physical servers, allowing the highly reliable, redundant, and robust System z technology to achieve near-continuous availability. Both hardware and software tightly cooperate to achieve this result. The hardware components are made up of the following elements:
Coupling Facility (CF)
This is the cluster center. It can be implemented either as an LPAR of a stand-alone System z system or as an additional LPAR of a System z system where other loads are running. Processor units that are characterized as either CPs or ICFs can be configured to this LPAR. ICFs are often used because they do not incur any software license charges. Two CFs are recommended for availability.
System-managed CF structure duplexing
System-managed CF structure duplexing provides a general-purpose, hardware-assisted, easy-to-use mechanism for duplexing structure data held in CFs. This function provides a robust recovery mechanism for failures, such as loss of a single structure or loss of connectivity to a single CF. The recovery is done through rapid failover to the other structure instance of the duplex pair. A sample CFRM policy fragment that requests duplexing is shown after this list of components.
Clients that are interested in deploying system-managed CF structure duplexing can read the technical paper System-Managed CF Structure Duplexing, ZSW01975USEN, which can be accessed by selecting Learn More on the Parallel Sysplex website.
Coupling Facility Control Code (CFCC)
This IBM Licensed Internal Code is both the operating system and the application that runs in the CF. No other code runs in the CF.
CFCC can also run in a z/VM virtual machine (as a z/VM guest system). In fact, a complete Sysplex can be set up under z/VM allowing, for instance, testing and operations training. This setup is not recommended for production environments.
Coupling links
These are high-speed links that connect the several system images (each running in its own logical partition) that participate in the Parallel Sysplex. At least two connections between each physical server and the CF must exist. When all of the system images belong to the same physical server, internal coupling links are used.
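As mentioned in the description of system-managed CF structure duplexing, duplexing is requested for a structure through the DUPLEX keyword in the CFRM policy, which is maintained with the IXCMIAPU administrative data utility. The following fragment is a simplified sketch only; the structure name (APP1_LOCK1), sizes, coupling facility names (CF01, CF02), and policy name (CFRMPOL1) are placeholders, and a complete policy also contains CF definitions:

   STRUCTURE NAME(APP1_LOCK1)
             INITSIZE(32M)
             SIZE(64M)
             DUPLEX(ENABLED)
             PREFLIST(CF01,CF02)

DUPLEX(ENABLED) asks the system to establish and maintain the duplex pair automatically, and PREFLIST names the two CFs that can each host one instance of the structure. The updated policy is then activated from the z/OS console:

   SETXCF START,POLICY,TYPE=CFRM,POLNAME=CFRMPOL1

If DUPLEX(ALLOWED) is specified instead, duplexing is permitted but must be started by the operator, for example with SETXCF START,REBUILD,DUPLEX,STRNAME=APP1_LOCK1.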
On the software side, the z/OS operating system uses the hardware components to create a Parallel Sysplex.
 
z/TPF: z/TPF can also use the CF hardware components. However, the term Sysplex exclusively applies to z/OS usage of the CF.
Normally, two or more z/OS images are clustered to create a Parallel Sysplex, although it is possible to have a configuration with a single image, called a monoplex. Multiple clusters can span several System z servers, although a specific image (logical partition) can belong to only one Parallel Sysplex.
A z/OS Parallel Sysplex implements shared-all access to data. This is facilitated by System z I/O virtualization capabilities such as the multiple image facility (MIF). MIF allows several logical partitions to share I/O paths in a secure way, maximizing use and greatly simplifying the configuration and connectivity.
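To illustrate MIF sharing, a channel path can be defined as shared in the I/O configuration program (IOCP) input so that several logical partitions use the same physical channel. The following CHPID statement is an illustrative sketch only; the CHPID number, PCHID, partition names, and channel type are placeholders, and the exact syntax depends on the configuration:

   CHPID PATH=(CSS(0),50),SHARED,PARTITION=((LPAR1,LPAR2,LPAR3)),PCHID=140,TYPE=FC

Here, LPAR1, LPAR2, and LPAR3 can all drive I/O through the same FICON channel, which PR/SM virtualizes and isolates between the partitions.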
In short, a Parallel Sysplex comprises one or more z/OS operating system images that are coupled through one or more coupling facilities. A properly configured Parallel Sysplex cluster is designed to maximize availability at the application level. Rather than quick recovery from a failure, the Parallel Sysplex design objective is zero failure.
The major characteristics of a Parallel Sysplex include the following features:
Data sharing with integrity
The CF is key to the implementation of shared-all access to data. Every z/OS system image has access to all the data. Subsystems in z/OS declare resources to the CF. The CF accepts and manages lock and unlock requests on those resources, guaranteeing data integrity. A second CF further enhances availability. Key users of the data sharing capability are DB2, WebSphere MQ, WebSphere ESB, IMS, and CICS. Because these are major infrastructure components, applications that use them inherently benefit from sysplex characteristics. For instance, many large SAP implementations run the database component on DB2 for z/OS in a Parallel Sysplex.
Continuous (application) availability
Changes, such as software upgrades and patches, can be introduced one image at a time, while the remaining images continue to process work. For more details, see Parallel Sysplex Application Considerations, SG24-6523.
High capacity
A Parallel Sysplex scales from 2 to 32 images, and each image can have 1 - 100 processor units; for example, a 32-image sysplex in which each image has 100 processor units brings up to 3,200 processor units to the shared workload. CF scalability is near-linear. This structure contrasts with other forms of clustering that employ n-to-n messaging, whose performance degrades rapidly as the number of nodes grows.
Dynamic workload balancing
Because the Parallel Sysplex is viewed as a single logical resource, work can be directed to any of its operating system images where capacity is available.
Systems management
This architecture provides the infrastructure to satisfy a client requirement for continuous availability, while enabling techniques for achieving simplified systems management consistent with this requirement.
Resource sharing
A number of base z/OS components use CF shared storage. This usage enables the sharing of physical resources, with significant improvements in cost and performance and simplified systems management.
Single system image
The collection of system images in the Parallel Sysplex appears as a single entity to the operator, the user, the database administrator, and so on. A single system image reduces complexity from both operational and definition perspectives.
N-2 support
Multiple hardware generations (normally three) are supported in the same Parallel Sysplex. This configuration provides for a gradual evolution of the systems in the sysplex, without forcing all of them to be changed simultaneously. Similarly, multiple software releases or versions are supported.
Figure 3-4 illustrates the components of a Parallel Sysplex as implemented within the System z architecture. The diagram shows one of many possible Parallel Sysplex configurations.
Figure 3-4 Sysplex hardware overview
Figure 3-4 shows a zEC12 system that contains multiple z/OS sysplex partitions and an internal coupling facility (CF02), a z10 EC server that contains a stand-alone CF (CF01), and a z196 that contains multiple z/OS sysplex partitions. STP over coupling links provides time synchronization to all servers. The appropriate CF link technology (1x IFB or 12x IFB) depends on the server configuration and the physical distance between the servers. ISC-3 links can be carried forward to the zEC12 only when upgrading from either a z196 or a z10 EC.
Through this state-of-the-art cluster technology, the power of multiple z/OS images can be harnessed to work in concert on shared workloads and data. The System z Parallel Sysplex cluster takes the commercial strengths of the z/OS platform to improved levels of system management, competitive price/performance, scalable growth, and continuous availability.
3.11.2 PowerHA in zBX environment
An application that runs on AIX can be provided with high availability by using IBM PowerHA SystemMirror® for AIX (formerly known as IBM HACMP™10). PowerHA is easy to configure (menu driven) and provides high availability for applications that run on AIX.
PowerHA helps define and manage the resources that applications running on AIX require. It provides service and application continuity through platform resource and application monitoring, and through automated actions (start, manage, monitor, restart, move, and stop).
 
Terminology: Resource movement and application restart on the second server is known as failover.
Automating the failover process speeds up recovery and allows for unattended operations, thus providing improved application availability.
A PowerHA configuration or cluster consists of two or more servers11 (up to 32) that have their resources managed by PowerHA cluster services to provide automated service recovery for the applications managed. Servers can have physical or virtual I/O resources, or a combination of both.
PowerHA performs the following functions at the cluster level:
Manage and monitor operating systems and hardware resources.
Manage and monitor application processes.
Manage and monitor network resources.
Automate applications (start, stop, restart, move).
The virtual servers that are defined and managed in zBX use only virtual I/O resources. PowerHA can manage both physical and virtual I/O resources (virtual storage and virtual network interface cards).
PowerHA can be configured to perform automated service recovery for the applications that run in virtual servers that are deployed in zBX. PowerHA automates application failover from one virtual server in an IBM POWER processor-based blade to another virtual server in a different POWER processor-based blade that has a similar configuration.
Failover protects service (masks service interruption) in case of unplanned or planned (scheduled) service interruption. During failover, users might experience a short service interruption while resources are configured by PowerHA on the new virtual server.
The PowerHA configuration for the zBX environment is similar to that of standard POWER environments, except that it uses only virtual I/O resources. Currently, PowerHA for zBX support is limited to failover inside the same zBX.
Figure 3-5 shows a typical PowerHA cluster.
Figure 3-5 Typical PowerHA cluster diagram
A PowerHA configuration must cover the following functions:
Network planning (VLAN and IP configuration definitions for server connectivity)
Storage planning (shared storage must be accessible to all blades that provide resources for a PowerHA cluster)
Application planning (start/stop/monitoring scripts and OS, CPU, and memory resources)
PowerHA software installation and cluster configuration (see the command sketch after this list)
Application integration (integrating storage, networking, and application scripts)
PowerHA cluster testing and documentation
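As a hedged illustration of the installation and configuration step, PowerHA SystemMirror 7.1 and later provide the clmgr command-line interface in addition to the menu-driven interface. The following outline is a sketch only; the cluster, node, application, and resource group names and the script paths are placeholders, and the exact attributes can vary by PowerHA release:

   clmgr add cluster zbx_cluster NODES=vsrv1,vsrv2
   clmgr add application_controller app01 STARTSCRIPT=/ha/start_app.sh STOPSCRIPT=/ha/stop_app.sh
   clmgr add resource_group rg01 NODES=vsrv1,vsrv2 APPLICATIONS=app01
   clmgr sync cluster
   clmgr online resource_group rg01

The first three commands define the cluster topology, the application start and stop scripts, and the resource group that ties them together; the last two verify and synchronize the definition to all nodes and bring the resource group online.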
For more information about IBM PowerHA SystemMirror for AIX, see the product web page.

2 Targeted for 1Q2013* on z/OS V1R13 with web deliverable FMID and PTF installation.
3 Federal Information Processing Standards (FIPS) 140-2 Security Requirements for Cryptographic Modules
4 The systems management Crypto Express3-1P feature has one PCIe adapter and just one PCHID associated to it.
5 ISC-3 features are only available on zEC12 when carried forward during an upgrade.
6 Traditional online transaction processing workload (formerly known as IMS)
7 Commercial batch with long-running jobs
8 Low I/O Content Mix Workload
9 Transaction Intensive Mix Workload
10 High Availability Cluster Multi-Processing
11 Servers can be also virtual servers. One server equals one instance of the AIX Operating System.