Key functions and capabilities of IBM z13
IBM z13 is the follow-on to the IBM zEnterprise zEC12 and the flagship of the IBM Systems portfolio. Like its predecessor, the z13 offers five hardware models, but has a more powerful processor, more processor units, and new functions and features.
The superscalar design allows the z13 to deliver a record level of capacity over prior IBM z Systems servers. It is powered by up to 141 of the world's most powerful microprocessors, running at 5.0 GHz. The extreme scalability of the z13 provides up to 40% more total capacity than its predecessor, the zEC12. The z13 is the industry's premier enterprise infrastructure choice for large-scale consolidation, secure data serving, transaction processing, and analytics capabilities.
For existing users of the zEnterprise BladeCenter Extension (zBX), an upgrade is available with z13 to a stand-alone IBM z BladeCenter Extension (zBX) Model 004.
In this chapter, we highlight several z13 functions and capabilities and point out solution areas where they can be of special value.
3.1 Virtualization
The z13 servers are highly virtualized, with the goal of maximizing the utilization of computing resources, lowering the total amount of resources that are needed for defined workloads, and thereby lowering their cost. Virtualization is a key strength of z Systems. It is embedded in the architecture and built into the hardware, firmware, and operating systems.
Virtualization requires a hypervisor, which is the control code that manages resources that are required for multiple independent operating system images. Hypervisors can be implemented in software or hardware, and the z13 has both. The hardware hypervisor for the z13 is known as IBM Processor Resource/Systems Manager (PR/SM). PR/SM is implemented in firmware as part of the base system, fully virtualizes the system resources, and does not require any additional software to run. The software hypervisor is implemented by the z/VM operating system. z/VM uses some PR/SM functions.
 
Statement of Direction¹ (KVM offering for IBM z Systems): In addition to the continued investment in z/VM, IBM intends to support a kernel-based virtual machine (KVM) offering for z Systems that will host Linux on z Systems guest virtual machines. The KVM offering will be software that can be installed on z Systems processors like an operating system and can coexist with z/VM virtualization environments, z/OS, Linux on z Systems, z/VSE, and z/TPF. The KVM offering will be optimized for the z Systems architecture and will provide standard Linux and KVM interfaces for operational control of the environment. It will also support OpenStack interfaces for virtualization management, enabling enterprises to integrate Linux servers easily into their existing infrastructure and cloud offerings.

¹ All statements regarding IBM plans, directions, and intent are subject to change or withdrawal without notice. Any reliance on these statements of general direction is at the relying party’s sole risk and will not create liability or obligation for IBM.
In the zBX, PowerVM Enterprise Edition is the hypervisor that offers a virtualization solution for any IBM Power Systems workload that runs on AIX. It allows use of the POWER7 processor-based PS701 blades and other physical resources, providing better scalability and a reduction in resource costs. IBM System x blades have an integrated hypervisor based on KVM, with identical objectives.
Virtualization is key to establishing flexible infrastructures, with automated management and monitoring, such as those underpinning cloud offerings, including infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS). We return to this subject after we describe the hardware and software virtualization capabilities of the z13.
3.1.1 z13 hardware virtualization
PR/SM was first implemented in the mainframe in the late 1980s. It allows defining and managing subsets of the server resources, which are known as logical partitions (LPARs). PR/SM virtualizes processors, memory, and I/O features. Certain features are purely virtualized implementations. For example, HiperSockets works like a LAN but does not use any I/O hardware.
PR/SM is always active on the system and is enhanced to provide more performance and platform management benefits. PR/SM technology on zEC12 received Common Criteria EAL5+ security certification.
Each LPAR can run any of the following supported operating systems:
z/OS
z/VM
z/VSE
z/TPF
Linux on z Systems
The LPAR definition includes a number of logical PUs (LPUs), memory, and I/O devices. The z/Architecture (inherent in the z13 and its predecessors) is designed to meet these stringent isolation requirements with low overhead, and has achieved the highest security certification in the industry: Common Criteria EAL5+ with a specific Target of Evaluation (logical partitions). This design has been proven in many client installations over several decades.
On z13, up to 85 LPARs can be defined, and hundreds or even thousands of virtual servers can be run under z/VM. Therefore, a high rate of context switching is to be expected, and accesses to the memory, caches, and virtual I/O devices must be kept isolated.
Logical processors
Logical processors are defined and managed by PR/SM and are perceived by the operating systems as real processors. These processors can be characterized as follows:
Central processors (CP)
IBM z Integrated Information Processors (zIIP)
Integrated Facility for Linux (IFL)
Internal Coupling Facility (ICF)
In addition, pre-characterized processors that are part of the system base configuration are always present: system assist processors (SAPs) and the integrated firmware processor (IFP). They provide support for all LPARs but are never part of an LPAR configuration.
PR/SM is responsible for accepting requests for work on logical processors by dispatching logical processors on physical processors. Physical processors can be shared across LPARs, but can also be dedicated to an LPAR. However, an LPAR must have its logical processors either all shared or all dedicated.
The sum of logical processors (LPUs) defined in all of the LPARs activated in a central processor complex (CPC) might be well over the number of physical processor units (PPUs). The maximum number of LPUs that can be defined in a single LPAR cannot exceed the total number of physical PUs that are available in the CPC. To achieve optimal internal throughput rate (ITR) performance when sharing LPUs, keep the total number of online LPUs to a minimum. This action reduces both software and hardware overhead.
PR/SM ensures that, when switching a physical processor from one logical processor to another, the processor state is properly saved and restored, including all the registers. Data isolation, integrity, and coherence inside the system are strictly enforced at all times.
Logical processors can be dynamically added to and removed from LPARs. Operating system support is required to take advantage of this capability. Starting with z/OS V1R10, z/VM V5R4, and z/VSE V4R3, the ability to dynamically define and change the number and type of reserved PUs in an LPAR profile can be used for that purpose. No pre-planning is required.
The new resources are immediately available to the operating systems and, in the case of z/VM, to its guests. Linux on z Systems provides the Standby CPU activation/deactivation function, which is implemented in SLES 11 and RHEL 6.
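As a hedged illustration of the operating system side of this capability, the following minimal sketch uses the standard Linux CPU hotplug sysfs interface, which is how a standby logical CPU is brought online or offline in Linux on z Systems. The CPU number and the use of Java for the example are illustrative assumptions only, and root authority is required.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Minimal sketch: bring a standby logical CPU online in Linux on z Systems
 * through the standard Linux CPU hotplug sysfs interface.
 * The CPU number (cpu2) is a hypothetical example; root authority is required.
 */
public class StandbyCpuOnline {
    public static void main(String[] args) throws Exception {
        Path cpuOnline = Paths.get("/sys/devices/system/cpu/cpu2/online");
        // Writing "1" requests activation of the standby CPU;
        // writing "0" would deactivate it again.
        Files.write(cpuOnline, "1".getBytes());
        System.out.println("cpu2 online state: "
                + new String(Files.readAllBytes(cpuOnline)).trim());
    }
}
```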
z/VM mode partitions
The z/VM mode logical partition (LPAR), first supported on IBM System z10®, is exclusively for running multiple workloads under z/VM. This LPAR mode provides increased flexibility and simplifies systems management by allowing z/VM to manage guests to perform the following tasks in the same z/VM LPAR:
Operate Linux on z Systems on IFLs or CPs.
Operate z/OS, z/VSE, and z/TPF on CPs.
Operate z/OS while fully allowing zIIP usage by workloads (such as WebSphere and DB2) for improved economics.
Operate a complete Sysplex with ICF usage. This setup is especially valuable for testing and operations training; however, it is not recommended for production environments.
The z/VM-mode partitions require z/VM V5R4 or later and allow z/VM to use a wider variety of specialty processors in a single LPAR. The following processor types can be configured to a z/VM-mode partition:
CPs
IFLs
zIIPs
ICFs
If only Linux on z Systems is to be run under z/VM, then a z/VM mode LPAR is not required, and we suggest that a Linux-only LPAR be used instead.
Memory
To ensure security and data integrity, memory cannot be concurrently shared by active LPARs. In fact, strict isolation is maintained.
Using the plan-ahead facility, memory can be physically installed without being enabled. It can then be enabled when it is necessary. z/OS and z/VM support dynamically increasing the memory size of the LPAR.
A logical partition can be defined with both an initial and a reserved amount of memory. At activation time, the initial amount is made available to the partition and the reserved amount can later be added, partially or totally. Those two memory zones do not have to be contiguous in real memory, but are displayed as logically contiguous to the operating system that runs in the LPAR.
z/OS is able to take advantage of this support by nondisruptively acquiring and releasing memory from the reserved area. z/VM V6R2 and later versions are able to acquire memory nondisruptively and quickly make it available to guests. z/VM virtualizes this support to its guests, which can also increase their memory nondisruptively. Releasing memory is still a disruptive operation.
LPAR memory is said to be virtualized in the sense that, in each LPAR, memory addresses are contiguous and start at address zero. LPAR memory addresses are different from the absolute memory addresses of the system, which are also contiguous and have a single address zero. Do not confuse this capability with the virtualization of LPAR memory that the operating system performs through the creation and management of multiple address spaces.
The z/Architecture has a robust virtual storage architecture that allows, per LPAR, the definition of an unlimited number of address spaces and the simultaneous use by each program of up to 1023 of those address spaces. Each address space can be up to 16 EB (1 exabyte = 2^60 bytes). Thus, the architecture has no real limits. Practical limits are determined by the available hardware resources, including disk storage for paging.
Isolation of the address spaces is strictly enforced by the Dynamic Address Translation hardware mechanism. The validation of a program’s right to read or write in each page frame is accomplished by comparing the page key with the key of the program that is requesting access. This mechanism has been in use since the System/370. Memory keys were part of, and used by, the original System/360 systems. Definition and management of the address spaces is under operating system control. Three addressing modes (24-bit, 31-bit, and 64-bit) are simultaneously supported. This provides compatibility with earlier versions and investment protection.
The z13 supports 2 GB pages, introduced with the zEC12 and zBC12, in addition to the 4 KB and 1 MB pages, and an extension to the z/Architecture: the Enhanced Dynamic Address Translation-2 (EDAT-2). With additional hardware, 1 MB pages can be pageable.
Operating systems can allow sharing of address spaces, or parts of them, across multiple processes. For instance, under z/VM, a single copy of the read-only part of a kernel can be shared by all virtual machines that use that operating system, resulting in large savings of real memory and improvements in performance.
I/O virtualization
The z13 supports six Logical Channel Subsystems (LCSSs), each with 256 channels, for a total of 1536 channels. In addition to the dedicated use of channels and I/O devices by an LPAR, I/O virtualization allows concurrent sharing of channels. This architecture also allows sharing the I/O devices that are accessed through these channels by several active LPARs. This function is known as the multiple image facility (MIF). The shared channels can belong to different channel subsystems, in which case they are known as spanned channels.
Data streams for the sharing LPARs are carried on the same physical path with total isolation and integrity. For each active LPAR that has the channel configured online, PR/SM establishes one logical channel path. For availability reasons, multiple logical channel paths should exist for critical devices (for instance, disks that contain vital data sets).
When more isolation is required, configuration rules allow restricting the access of each logical partition to particular channel paths and specific I/O devices on those channel paths.
Many installations use the parallel access volume (PAV) function, which allows accessing a device by several addresses (normally one base address and an average of three aliases). This feature increases the throughput of the device by using more device addresses. HyperPAV takes the technology a step further by allowing the I/O Supervisor (IOS) in z/OS (and the equivalent function in the Control Program of z/VM) to create PAV structures dynamically. The structures are created depending on the current I/O demand in the system, lowering the need for manually tuning the system for PAV use.
In large installations, the total number of device addresses can be high. Thus, the concept of subchannel sets was introduced with the IBM System z9®. On the z13, up to four sets of approximately 64 K device addresses are available. This availability allows the base addresses to be defined on set 0 (IBM reserves 256 subchannels on set 0) and the aliases on set 1, set 2, and set 3. In total, 261,885 subchannel addresses are available per channel subsystem. Subchannel sets are used by the Metro Mirror (also referred to as synchronous Peer-to-Peer Remote Copy (PPRC)) function, which allows the Metro Mirror primary devices to be defined in subchannel set 0 and the secondary devices in subchannel sets 1, 2, and 3, providing more connectivity through subchannel set 0.
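As a worked check of that figure, assuming the 256 reserved subchannels come out of subchannel set 0 and each of the other three sets provides 64 K minus one subchannels:

```latex
\underbrace{(65\,536 - 256)}_{\text{set 0}}
+ \underbrace{3 \times (65\,536 - 1)}_{\text{sets 1, 2, 3}}
= 65\,280 + 196\,605
= 261\,885 \ \text{subchannels per channel subsystem}
```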
To reduce the complexity of managing large I/O configurations further, starting with z/OS V1R10, z Systems introduced extended address volumes (EAV). EAV provides large disk volumes. In addition to z/OS, both z/VM (starting with V5R4 with APARs) and Linux on z Systems support EAV.
By extending the disk volume size, potentially fewer volumes can be required to hold the same amount of data, making systems and data management less complex. EAV is supported by the IBM DS8000® series. Devices from other vendors should be checked for EAV compatibility.
The health checker function in z/OS V1R10 introduced a health check in the I/O Supervisor that can help system administrators identify single points of failure in the I/O configuration.
The dynamic I/O configuration function is supported by z/OS and z/VM. It provides the capability of concurrently changing the currently active I/O configuration. Changes can be made to channel paths, control units, and devices. The existence of a fixed hardware system area (HSA) in the z13 greatly eases the planning requirements and enhances the flexibility and availability of these reconfigurations.
3.1.2 IBM z Systems software virtualization
Software virtualization is provided by the IBM z/VM product. Strictly speaking, it is a function of the Control Program component of z/VM. Starting in 1972, IBM has continuously provided commercial software virtualization in its mainframe servers.
z/VM uses the resources of the LPAR in which it is running to create functional equivalents of real z Systems CPCs, which are known as virtual machines (VM) or guests. A z/VM virtual machine is the functional equivalent of a real server. In addition, z/VM can emulate I/O peripheral devices (for instance, printers) by using spooling and other techniques, and LAN switches and disks by using memory.
z/VM allows fine-grained dynamic allocation of resources. As an example, in the case of processor sharing, the minimum allocation is approximately 1/10,000 of a processor. As another example, disks can be subdivided into independent areas, which are known as minidisks, each of which is exploited by its users as a real disk, only smaller. Minidisks are shareable, and can be used for all types of data and also for temporary space in a pool of on-demand storage.
Under z/VM, virtual processors, virtual memory, and all the virtual I/O devices of the VMs are dynamically definable (provisionable). z/VM supports the concurrent addition (but not the deletion) of memory to its LPAR and immediately makes it available to guests. Guests themselves can support the dynamic addition of memory. All other changes are concurrent. Making these concurrent definitions occur nondisruptively requires support by the operating system that is running in the guest.
Although z/VM imposes no limits on the number of defined VMs, the number of active VMs is limited by the available resources. On a z13, thousands of VMs can be activated.
In addition to server consolidation and image reduction by vertical growth, z/VM provides a highly sophisticated environment for application integration and co-residence with data, especially for mission-critical applications.
Virtualization provides hardware-enabled resource sharing, and can also be used for the following functions:
Isolate production, test, training, and development environments.
Support previous applications.
Test new hardware configurations without actually buying the hardware.
Enable parallel migration to new system or application levels, and provide easy back-out capabilities.
z/VM V6R2 introduced a new feature, single system image (SSI). SSI enables improved availability, better management of planned outages, and capacity growth by creating clusters of z/VM systems with simplified management.
With SSI, clustering up to four z/VM images in a single logical image is possible. These are the highlights of the SSI features:
Live Guest Relocation (LGR) for Linux offers the ability to move executing virtual servers without disruption from one z/VM system to another in the SSI.
Management of resources with multi-system virtualization so that up to four z/VM instances are allowed to be clustered as a single system image.
Horizontal scalability with up to four systems, even on mixed hardware generations.
Availability, through nondisruptively moving work to available system resources and nondisruptively moving system resources to work.
An SSI cluster can contain both 6.2 and 6.3 members, and a member can be upgraded from 6.2 to 6.3 using the upgrade in place installation feature.
For more information about SSI, see the following resources:
An introduction to z/VM Single System Image (SSI) and Live Guest Relocation (LGR), SG24-8006
Using z/VM v 6.2 Single System Image (SSI) and Live Guest Relocation (LGR), SG24-8039
In light of the IBM cloud strategy and adoption of OpenStack, the management of z/VM environments in IBM z Unified Resource Manager (zManager) is now stabilized and will not be further enhanced. Therefore, zManager will not provide systems management support for z/VM V6R3 and later releases. However, zManager will continue to play a distinct and strategic role in the management of virtualized environments that are created by the integrated firmware hypervisors (PR/SM, PowerVM, and the System x hypervisor based on KVM) of z Systems.
The zManager uses the management application programming interface (API) of z/VM to provide a set of resource management functions for the z/VM V6R2 environment.
Providing a more detailed description of z/VM or other highlights of its capabilities is beyond the scope of this book. For a deeper discussion of z/VM, see Introduction to the New Mainframe: z/VM Basics, SG24-7316.
3.1.3 zBX virtualized environments
On the zBX Model 004, available as an upgrade of zBX Model 002 or 003, the IBM POWER7 processor-based PS701 blades run PowerVM Enterprise Edition to create a virtualized environment that is similar to the one found in IBM Power Systems servers. The POWER7 processor-based LPARs run the AIX operating system.
PowerVM is EAL4+ certified and is isolated on the intranode management network, providing intrusion prevention, integrity, and secure virtual switches with integrated consolidation.
The IBM System x blades are also virtualized. The integrated System x hypervisor uses kernel-based virtual machines (KVMs). Support is provided for Linux and Microsoft Windows.
PowerVM, and also the integrated hypervisor for the System x blades, is managed by the IBM z Unified Resource Manager, so it is shipped, deployed, monitored, and serviced at a single point.
Management of the zBX environment is done as a single logical virtualized environment by the Unified Resource Manager.
3.1.4 z Systems based clouds
Cloud computing is a paradigm for providing IT services. It capitalizes on the ability to rapidly and securely deliver standardized offerings, while retaining the capacity for customizing the environment. Elasticity, which accommodates the ebbs and flows of demand through just-in-time provisioning, is another requirement. We make no distinction here between private and public clouds, because both are well addressed by z Systems.
Virtualization is critical to the economic and financial viability of those offerings, because it minimizes the over-provisioning of resources and allows them to be reused at the end of the virtual server lifecycle.
Because of the extreme integration in the hardware, virtualization on the z13 is highly efficient (the best in the industry) and encompasses computing and I/O resources, including the definition of internal virtual networks with switches. These are all characteristics of Software Defined Environments, and they allow a single real server to support dense sets of virtual servers and server networks, with up to 100% sustained resource utilization and the highest levels of isolation and security. Therefore, the cloud solution costs, whether hardware, software, or management, are minimized.
Cloud elasticity requirements are covered by the z13 granularity offerings, including capacity levels and Capacity on Demand. These and other technology leadership characteristics that make the z Systems CPCs the gold standard of servers are discussed in the remainder of this chapter.
 
z/VM V6R3 integrated xCAT support: If you want to get started with cloud computing, the Extreme Cloud Administration Toolkit (xCAT), a scalable open source tool developed by IBM, can be used to provision, manage, and monitor physical and virtual machines. Because xCAT is integrated into z/VM V6R3, it no longer needs to be separately downloaded, installed, and configured. You can quickly deploy xCAT with a small amount of tailoring. Although xCAT provides rudimentary cloud management, z/VM V6R3 also provides OpenStack enablement for more sophisticated and complete solutions, such as IBM Cloud Manager.
3.1.5 GDPS Virtual Appliance
 
Statement of Direction¹: In the first half of 2015, IBM intends to deliver a GDPS/Peer to Peer Remote Copy (GDPS/PPRC) multiplatform resiliency capability for customers who do not run the z/OS operating system in their environment. This solution is intended to provide IBM z Systems customers who run z/VM and their associated guests, for instance, Linux on z Systems, with similar high availability and disaster recovery benefits to those who run on z/OS. This solution will be applicable for any IBM z Systems announced after and including the zBC12 and zEC12.

¹ All statements regarding IBM plans, directions, and intent are subject to change or withdrawal without notice. Any reliance on these statements of general direction is at the relying party’s sole risk and will not create liability or obligation for IBM.
To reduce IT costs and complexity, many enterprises are consolidating independent servers into Linux images (guests) running on the z Systems platform. Linux on z Systems can be implemented either as guests running under z/VM or as native Linux LPARs on z Systems. Workloads with an application server running on Linux on z Systems and a database server running on z/OS are common. Two examples are as follows:
WebSphere Application Server running on Linux, with CICS and DB2 running under z/OS
SAP application servers running on Linux and database servers running on z/OS
With a multitiered architecture, there is a need to provide a coordinated near-continuous availability and disaster recovery solution for both z/OS and Linux on z Systems.
The GDPS Virtual Appliance is a fully integrated continuous availability and disaster recovery solution for Linux on z Systems customers. It consists of these components:
An operating system image
The application components
An appliance management layer that makes the image self-contained
APIs and a UI for customization, administration, and operation, tailored for the appliance function.
GDPS Virtual Appliance can improve both consumability and time-to-value for customers. For more information, see IBM z13 Technical Guide, SG24-8251.
3.2 The z13 technology improvements
z13 provides technology improvements in these areas:
Microprocessor
Memory
Capacity and performance
Flash Express feature
10GbE RoCE Express feature
zEDC Express feature
Cryptography
I/O capabilities
These features are intended to provide a more scalable, flexible, manageable, and secure consolidation and integration to the platform, which contributes to a lower total cost of ownership.
3.2.1 Microprocessor
The z13 has a newly developed microprocessor chip and storage control chip. The chips use CMOS 14S0 (22nm) technology and represent a major step forward in technology use for the z Systems products, resulting in increased packaging density.
The microprocessor chip and the storage control chip for the z13 are each packaged in a single chip module (SCM). An SCM contains either one microprocessor (PU) chip or one storage control (SC) chip. The SCMs are installed inside a CPC drawer, and the z13 can contain from one to four CPC drawers. Each CPC drawer has two nodes, and each node has three microprocessor (PU) chips and one storage control chip. The CPC drawer also contains the memory arrays, I/O connectivity infrastructure, and various other mechanical and power controls.
The CPC drawer is connected to the PCI Express (PCIe) I/O drawers and I/O drawers through one or more cables.
Standard PCIe and InfiniBand protocols are used for fast transfer of large volumes of data between the memory in the CPC drawer and the I/O cards housed in the PCIe I/O drawers and I/O drawers.
z13 processor chip
The z13 processor chips provide more functions per chip (eight cores on a single chip) because of technology improvements that allow designing and manufacturing more transistors per unit of area. Each processor chip can have six, seven, or eight active cores. This configuration translates into using fewer chips to implement the needed functions, which helps enhance system availability.
The z Systems microprocessor development followed the same basic design from the 9672-G4 (announced in 1997) through the z9. That basic design was stretched to its maximum, so a fundamental change was necessary. The z10 chip introduced a high-frequency design, which was improved with the z196 and enhanced again with the hex-core zEC12 chip and with the eight-core microprocessor chip of the z13.
To allow an increased number of processors sharing larger caches with faster access time and improved capacity and performance, the z13 adjusted the cycle time to 0.2 nanoseconds (5.0 GHz).
Each core of the processor chip (Figure 3-1) includes one coprocessor for hardware acceleration of data compression and cryptography, I/O bus and memory controllers, and an interface to a separate storage controller/cache chip.
Figure 3-1 z13 eight-core microprocessor chip
On-chip cryptographic hardware includes the full complement of the Advanced Encryption Standard (AES) algorithm, Secure Hash Algorithm (SHA), and the Data Encryption Standard (DES) algorithm, and also the protected key implementation.
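These functions are reached through standard interfaces rather than new application APIs. As a minimal, hedged sketch, the Java example below uses only the portable javax.crypto and java.security classes; on z Systems, IBM Java security providers can route such AES and SHA operations to the CPACF coprocessor, but that provider configuration is an assumption and is not shown here.

```java
import java.security.MessageDigest;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

/**
 * Minimal sketch: AES encryption and SHA-256 hashing through the standard
 * Java security APIs. On z Systems, IBM Java providers can drive these
 * operations through the on-chip CPACF coprocessor; this example shows only
 * the portable API usage, not the provider configuration.
 */
public class CpacfSketch {
    public static void main(String[] args) throws Exception {
        byte[] clearText = "Sensitive payload".getBytes("UTF-8");

        // AES-128 encryption with a generated symmetric key
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] cipherText = cipher.doFinal(clearText);

        // SHA-256 hash of the same data
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(clearText);

        System.out.printf("cipher text: %d bytes, digest: %d bytes%n",
                cipherText.length, digest.length);
    }
}
```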
z13 processor design highlights
The z/Architecture offers a rich complex instruction set computer (CISC) Instruction Set Architecture (ISA) that supports multiple arithmetic formats.
The z196 introduced 110 instructions and offered a total of 984, out of which 762 were implemented entirely in hardware. The zEC12 also introduced new instructions, notably for the Transactional Execution and EDAT-2 facilities. The z13 introduces 139 new instructions, notably for analytics vector processing.
Compared to zEC12, the z13 processor design improvements and architectural extensions include the following features:
Balanced performance growth:
 – 40% more system capacity:
 • 33% more cores in a central processor chip (increased from 6 to 8).
 • Maximum number of cores increased from 120, on the zEC12, to 168 on the z13.
 • Maximum number of configurable cores increased from 101, on the zEC12, to 141 on the z13.
 – Fourth Generation High Frequency processor:
 • Although the frequency has been lowered from 5.5 GHz, on the zEC12, to 5.0 GHz on z13, its uniprocessor performance is up to 10% faster as compared to zEC12.
Innovative core-cache design (L1 and L2), processor chip-cache design (L3), and node-cache design (L4), optimized by HiperDispatch, with a focus on keeping more data closer to the processor, increasing the cache sizes, and decreasing the latency to access the next levels of cache:
 – Total L1 per core is 40% larger.
 – Total L2 per core is 100% larger.
 – Total on-chip shared L3 is 33% larger.
 – Total shared L4 is 266% larger, with non-data inclusive coherent (NIC) directory.
 – Unique private L2 cache (2 MB for instructions and 2 MB for data) design reduces L1 miss latency.
Re-optimized pipeline depth for power and performance:
 – Increased instruction pipeline width per core.
 – Number of instructions in flight is increased from seven to ten.
 – Greater integer execution bandwidth, with four fixed-point arithmetic execution units.
 – Improved fixed point and floating point divide.
 – Greater floating point execution bandwidth, with two binary and two decimal floating-point arithmetic execution units.
Improved Instruction Fetching Unit, with a new branch prediction and instruction fetch front end to support multithreading and to improve branch prediction throughput.
Wider instruction decode, dispatch, and completion bandwidth, increased to six instructions per cycle compared to three on the zEC12.
Dedicated co-processor for each core with improved performance and more capability:
 – The Central Processor Assist for Cryptographic Function (CPACF) is optimized to provide up to 2x faster encryption functions:
 • TDES with twice the throughput of the zEC12 CPACF for large blocks
 • AES with twice the throughput of the zEC12 CPACF for large blocks
 • SHA with four times the throughput of the zEC12 CPACF for large blocks
 – Hashing functions in CPACF are up to 3.5x faster as compared to the zEC12.
Multiple innovative architectural extensions for software usage:
 – Single-instruction, multiple-data (SIMD): A set of instructions that allows optimization of code for complex mathematical models and business analytics vector processing.
 – Simultaneous multithreading (SMT): Allows up to two active threads per core, sharing the IFL or zIIP core resources.
Increased instruction issue, execution, and completion throughput:
 – Improved instruction dispatch and grouping efficiency
 – Millicode handling
 – Next Instruction Access Intent
 – Load and Trap instructions
 – Branch Prediction Preload
 – Data prefetch
Hardware decimal floating point function
Hardware decimal floating point (HDFP) support was introduced with the z9 EC and enhanced with a new decimal floating point accelerator feature in the IBM zEnterprise 196. The zEC12 and zBC12 facilitate better performance on traditional zoned-decimal operations with a broader usage of the Decimal Floating Point facility by COBOL and PL/I programs. The z13 includes a decimal floating point packed conversion facility, with z/OS V2R1 and z/VM V6R3 support.
This facility is designed to speed up such calculations and provide the necessary precision demanded mainly by financial institutions. The decimal floating point hardware fully implements the IEEE 754r standard.
Industry support for decimal floating point is growing, with IBM leading the open standard definition. Examples of support for the draft standard IEEE 754r include Java BigDecimal, C#, XML, C/C++, GCC, and COBOL, and also key software vendors such as Microsoft and SAP.
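Java BigDecimal is one of the exploiters listed above. The short, self-contained example below illustrates why decimal arithmetic matters for financial calculations; the amounts are illustrative only, and whether the arithmetic is actually mapped to the DFP hardware depends on the SDK level, which is not asserted here.

```java
import java.math.BigDecimal;

/**
 * Minimal sketch: binary floating point cannot represent 0.1 exactly, while
 * decimal arithmetic (BigDecimal, which IBM SDKs can back with DFP hardware
 * on z Systems) keeps the exact value that financial applications expect.
 * The amounts are illustrative only.
 */
public class DecimalExample {
    public static void main(String[] args) {
        double binarySum = 0.0;
        BigDecimal decimalSum = BigDecimal.ZERO;
        BigDecimal tenCents = new BigDecimal("0.10");

        for (int i = 0; i < 10; i++) {
            binarySum += 0.10;                     // accumulates rounding error
            decimalSum = decimalSum.add(tenCents); // exact decimal arithmetic
        }
        System.out.println("binary double : " + binarySum);  // 0.9999999999999999
        System.out.println("BigDecimal    : " + decimalSum); // 1.00
    }
}
```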
Support and usage of HDFP varies with operating system and release. For a detailed description, see IBM z13 Technical Guide, SG24-8251. Also see “z/OS XL C/C++ considerations” on page 131.
Simultaneous multithreading (SMT)
Simultaneous multithreading (SMT) allows more than one thread to simultaneously execute in the same core, sharing all of its resources. This functionality is available in the z13 IFL and zIIP processor cores and allows up to two threads to execute in the same processor, thereby providing better utilization of the cores and increased processing capacity.
When a program accesses a memory location that is not in the cache, it is called a cache miss. Because the processor then must wait for the data to be fetched from the next cache level, or from main memory, before it can continue to execute, cache misses directly influence the performance and capacity of the core to execute instructions. With simultaneous multithreading exploitation, when one thread in the core is waiting, for example, for data to be fetched from the next cache levels or from main memory, the second thread in the core can use the shared resources rather than remain idle.
Exploitation support for SMT functionality is provided in z/OS V2R1 for zIIPs and z/VM V6R3 for IFLs.
Single-instruction, multiple-data (SIMD)
The z13 architecture is designed with a set of instructions that improve the performance of complex mathematical models and analytics workloads through vector processing: new complex instructions can process many data elements with a single instruction.
This new set of instructions, known as single-instruction, multiple-data (SIMD), is designed for parallel computing and can accelerate code with integer, string, character, and floating point data types, enabling better consolidation of analytics workloads and business transactions on the z Systems platform. SIMD provides the next phase of enhancements to the z Systems analytics capability.
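SIMD is used mainly through compilers and managed runtimes rather than explicit application coding. As a hedged sketch, the loop below shows the kind of data-parallel pattern that an optimizing compiler or JIT can map to z13 vector instructions; the example is plain, portable Java, and no claim is made about whether vectorization actually occurs for a given SDK or compiler level.

```java
/**
 * Minimal sketch: an element-wise array operation. Loops of this shape are
 * candidates for SIMD (vector) code generation by an optimizing compiler or
 * JIT on z13; the example itself is plain Java and makes no assumption about
 * whether vector instructions are actually emitted.
 */
public class SimdCandidate {
    static void scaleAndAdd(double[] result, double[] a, double[] b, double scale) {
        // One instruction stream, many data elements: each iteration applies
        // the same multiply-add to independent array slots.
        for (int i = 0; i < result.length; i++) {
            result[i] = a[i] * scale + b[i];
        }
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        double[] a = new double[n], b = new double[n], r = new double[n];
        java.util.Arrays.fill(a, 1.5);
        java.util.Arrays.fill(b, 2.0);
        scaleAndAdd(r, a, b, 3.0);
        System.out.println("r[0] = " + r[0]); // 1.5 * 3.0 + 2.0 = 6.5
    }
}
```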
Transactional Execution (TX) facility
This capability, which is known in the industry as hardware transactional memory, allows issuing a group of instructions atomically, that is, either all the results of the instructions in the group are committed or none are, in a true transactional way. The execution is optimistic: The instructions are issued, but previous state values are saved in a “transactional memory.” If the transaction succeeds, the saved values are discarded; otherwise, they are used to restore the original values. Software can test the execution’s success and re-drive the code, if needed, using the same or a different path.
The TX facility provides several instructions, including instructions to declare the beginning and end of a transaction and to cancel the transaction. TX is expected to provide significant performance benefits and scalability to workloads by avoiding most locks. This ability is especially important for heavily threaded applications, such as Java.
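Applications usually benefit from TX indirectly, for example when a JVM elides the lock on a contended synchronized block by running it as a hardware transaction. The sketch below shows such a short critical section; it is ordinary Java, and the assumption that a particular JVM level actually applies TX-based lock elision to it is exactly that, an assumption.

```java
/**
 * Minimal sketch: a short, contended critical section. Hardware transactional
 * memory (the z13 TX facility) lets a runtime execute such sections
 * speculatively and commit them atomically, falling back to the real lock
 * only on conflict. The code is ordinary Java; whether the JVM elides the
 * lock with TX depends on the SDK level and options (an assumption here).
 */
public class ContendedCounter {
    private long count = 0;

    // Short critical section: a classic candidate for lock elision,
    // because conflicts between threads are rare relative to acquisitions.
    public synchronized void increment() {
        count++;
    }

    public synchronized long value() {
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        ContendedCounter c = new ContendedCounter();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) {
                    c.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        System.out.println("final count = " + c.value()); // 400000
    }
}
```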
Runtime Instrumentation Facility
This facility provides managed runtimes and just-in-time compilers with enhanced feedback about application behavior, allowing dynamic optimization on code generation as it is being executed.
3.2.2 Memory
The z13 can have up to 10 TB of usable memory installed. This is significantly more than its predecessor, the zEC12, which had a 3 TB maximum.
In addition, the z13 increased the size of the hardware system area (HSA) to 96 GB, compared with 32 GB on the zEC12. The HSA has a fixed size of 96 GB and is not included in the memory that the client orders.
 
z/Architecture addressing modes: The z/Architecture simultaneously supports 24-bit, 31-bit, and 64-bit addressing modes. These modes provide compatibility with earlier versions and investment protection.
Support of large memory varies with the operating system, as follows:
z/OS V1R12 and later support up to 4 TB.
z/VM V6R3 supports up to 1 TB.
z/VM V6R2 supports up to 256 GB.
z/VSE V5R1 and later support up to 32 GB.
z/TPF V1R1 supports up to 4 TB.
SLES 11 supports 4 TB and RHEL 6 supports 3 TB.
The maximum memory size per logical partition has changed with the z13. Up to 10 TB can now be defined to a logical partition in the image profile. Each operating system can allocate central storage up to its individual supported maximum, as shown above.
Dynamic memory reallocation
On the z13, the memory allocation algorithm has changed. PR/SM tries to allocate memory in a single processor drawer, striped between the two nodes. Basically, the PR/SM memory and processor resource allocation goal is to place all partition resources on a single processor drawer, if possible. The resources (memory and processors) are assigned to the partitions when they are activated. Later, when all partitions have been activated, PR/SM can move memory between processor drawers to benefit performance, without operating system knowledge.
Plan-ahead memory
When a client can anticipate the requirements for future increases of the installed memory, the initial system order can contain both a starting memory size and an additional memory size. The additional memory is referred to as plan-ahead memory. A specific memory pricing model is available in support of this capability.
The starting memory size is activated at system installation time and the rest remains inactive. When more physical memory is required, it is fulfilled by activating the appropriate number of plan-ahead memory features. This activation is concurrent and can be nondisruptive to the applications depending on the operating system support. z/OS and z/VM support this function.
 
Do not confuse plan-ahead and flexible memory support:
Plan-ahead memory is for a permanent increase of installed memory.
Flexible memory provides a temporary replacement of a part of memory that becomes unavailable.
Flexible memory
Flexible memory was first introduced on the z9 EC as part of the design changes and offerings to support enhanced book availability (EBA). Flexible memory was used to temporarily replace the memory that becomes unavailable when performing maintenance on a book.
On z13, the additional resources that are required for the flexible memory configurations are provided through the purchase of planned memory features, along with the purchase of memory entitlement. Flexible memory configurations are available only on multi-CPC drawers (models N63, N96, NC9, and NE1) and range from 256 GB to 2.5 TB, depending on the model.
Contact your IBM representative to help determine the appropriate configuration for your business.
Large page support
The size of pages and page frames has remained at 4 KB for a long time. Starting with the IBM System z10, z Systems platforms are capable of having large pages of 1 MB, in addition to supporting pages of 4 KB. This capability is a performance item that addresses particular workloads and relates to large main storage usage. Both page frame sizes can be simultaneously used.
Large pages enable the translation lookaside buffer (TLB) to better represent the working set and suffer fewer misses by allowing a single TLB entry to cover more address translations. Users of large pages are better represented in the TLB and are expected to perform better.
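A small worked example, with a hypothetical TLB of 512 entries, shows why larger pages improve TLB coverage of the working set:

```latex
\text{TLB coverage} = \text{entries} \times \text{page size}:\qquad
512 \times 4\,\text{KB} = 2\,\text{MB}
\quad\text{versus}\quad
512 \times 1\,\text{MB} = 512\,\text{MB}
```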
This support benefits long-running applications that are memory access intensive. Large pages are not recommended for general use. Short-lived processes with small working sets are normally not good candidates for large pages and see little to no improvement. The use of large pages must be decided based on knowledge that is obtained from measurement of memory usage and page translation overhead for a specific workload.
The large page support function is not enabled without the required software support. Without the large page support, page frames are allocated at the current 4 KB size. At the time they were introduced, large pages were treated as fixed pages and were never paged out. Under z/OS, they are available only for 64-bit virtual private storage, such as virtual memory that is located above 2 GB. With the availability of the Flash Express hardware feature (see 3.2.4, “Flash Express” on page 75), large pages can become pageable.
Support for 2 GB large page
z13 uses 2 GB page frames, introduced with zEC12, as an architectural extension. This is to increase efficiency for DB2 buffer pools, Java heap, and other large structures. Use of 2 GB pages increases TLB coverage without proportionally enlarging the TLB size:
A 2 GB memory page has the following characteristics:
 – It is 2048 times larger than a large page of 1 MB size.
 – It is 524,288 times larger than an ordinary base page with a size of 4 KB.
A 2 GB page allows for a single TLB entry to fulfill many more address translations than either a large page or ordinary base page.
A 2 GB page provides users with much better TLB coverage, and therefore provides better performance in the following ways:
 – By decreasing the number of TLB misses that an application incurs
 – By spending less time on converting virtual addresses into physical addresses
 – By using less real storage to maintain DAT structures
 
2 GB large page exploitation: Exploitation of 2 GB pages is provided for the IBM 31-bit SDK for z/OS, Java Technology Edition, V7.0.0 (5655-W43) and the IBM 64-bit SDK for z/OS, Java Technology Edition, V7.0.0 (5655-W44).
3.2.3 Native PCIe features and integrated firmware processor
The zEC12 introduced new feature card types, known as native PCIe features, which require a different management design. The following native PCIe features are available:
10GbE RoCE Express
zEDC Express
These features are plugged exclusively into a PCIe I/O drawer, where they coexist with the other, non-native PCIe, I/O adapters and features, but they are managed in a different way from those other I/O adapters and features. The native PCIe feature cards have a PCHID assigned according to the physical location in the PCIe I/O drawer.
For non-native PCIe features, which are plugged into a PCIe I/O drawer or, on the z13, a supported carry-forward I/O drawer, all adaptation layer functions are integrated into the adapter hardware.
For the native PCIe features introduced by the zEC12 and supported by the z13, drivers are included in the operating system, and the adaptation layer is not needed. The adapter management functions (such as diagnostics and firmware updates) are provided by Resource Group partitions running on the integrated firmware processor (IFP).
The IFP is used to manage native PCIe adapters installed in a PCIe I/O drawer. The IFP is allocated from a pool of PUs that are available for the whole system. Because the IFP is exclusively used to manage native PCIe adapters, it is not taken from the pool of PUs that can be characterized for customer usage.
If a native PCIe feature is present in the system, the IFP is initialized and allocated during the system POR phase. Although the IFP is allocated to one of the physical PUs, it is not visible to the customer. In case of error or failover scenarios, the IFP will act like any other PU (that is, sparing is invoked).
3.2.4 Flash Express
The Flash Express feature helps to improve system and application availability and performance to compete more effectively in today’s service focused market. Flash Express capabilities enable the following features:
Improved z/OS recovery and diagnostic times
Handling of workload shifts and coping with dynamic environments more smoothly
Use of pageable large pages (1 MB) yielding CPU performance benefits
Offloading GBps of random I/O from the I/O Fabric
Predictive paging
Overflow areas for certain Coupling Facility list structures for WebSphere MQ
Flash Express is easy to configure, requires no special skills, and provides rapid time to value. This feature is designed to allow each logical partition to be configured with its own storage-class memory (SCM) address space, to be used for paging and act as an overflow area for CF structures. Pages that are 1 MB become pageable (z/OS only). Support is provided to configure SCM increments offline through a z/OS operator command, and allows the PLPA and COMMON paging data sets to be optional.
Flash Express data privacy
For Flash Express, the data privacy relies on a symmetric key that encrypts the data that is temporarily stored on the SSD. By using a smart card and an integrated smart card reader on the Support Element (SE), the encryption key is generated within the secure environment of the smart card. The key is tightly coupled to the SE serial number, which ensures that no other SE is able to share the key or the smart card that is associated with a specific SE. The generated key is replicated in a secure way to the alternate Support Element smart card. The key is transferred from the SE to the Flash Express adapter under the protection of a private and public key pair that is generated by the firmware that manages the Flash Express adapter.
CFCC exploitation of Flash Express
With CFCC Level 19 and later, the Flash Express feature can be exploited to help handle the overflow of WebSphere MQ shared queue structures. This is designed to allow structure data to be migrated to Flash Express memory as needed and migrated back to real memory to be processed. This requires WebSphere MQ for z/OS V7, z/OS V2R1 or V1R13, with additional service.
Software and operating system support
Exploitation of pageable 1 MB pages for z/OS includes these items:
IBM z/OS V1R13 Language Environment® when used with a run-time option.
Java, with the IBM 31-bit SDK for z/OS, Java technology Edition, V7.1.0 and IBM 64-bit SDK for z/OS, Java Technology Edition, V7.1.0.
DB2 10 and DB2 11 exploit pageable 1 MB frames for buffer pools and executable code.
Pageable large pages are used by IMS Common Queue Server (CQS) interface buffers and selected database storage pools on a zEC12 or z13.
Flash Express is an optional feature. It is supported by z/OS V1R13 and later, with the z/OS V1R13 RSM Enablement Offering web deliverable installed. It is fully supported by z/OS V2R1.
3.2.5 zEDC Express
zEDC Express is an optional feature, available on z13, zEC12, and zBC12 systems, and is designed to help to improve cross-platform data exchange, reduce CPU consumption, and save disk space by providing hardware-based acceleration for data compression and decompression for the enterprise. It provides data compression with lower CPU consumption than compression technology previously available on z Systems.
This capability is of special interest, for instance, to clients experiencing significant year-to-year growth in storage. Savings can be realized initially by making more efficient use of existing capacity, allowing more data to be kept active and online at lower cost, and, longer term, by extending the time frames for acquisitions of additional storage.
z/OS V2R1 zEnterprise Data Compression
Exploitation support of zEDC Express functionality is provided exclusively by z/OS V2R1 zEnterprise Data Compression for both data compression and decompression.
Support for data recovery (decompression) in the case that zEDC Express is not installed, or installed but not available, on the system, is provided through software on z/OS V2R1, and on V1R13 and V1R12 with appropriate PTFs. Software decompression is slow and uses considerable processor resources, thus it is not recommended for production environments.
z/OS guests running under z/VM V6R3 may exploit the zEDC Express feature. IBM zEnterprise Data Compression (zEDC) for z/OS V2R1 and the zEDC Express feature are designed to support a new data compression function to help provide high-performance, low-latency compression without significant CPU overhead. This may help to reduce disk usage, provide optimized cross-platform exchange of data, and provide higher write rates for System Management Facility (SMF) data.
z/OS V2R1 can use zEDC to compress SMF records. zEDC with z/OS SMF Logger alleviates SMF constraints across the entire lifecycle of a record using compression technology while storing data in System Logger and reducing Logger CPU usage.
Clients who have large sequential data that uses BSAM/QSAM extended format can use zEDC to help reduce disk space usage and improve effective bandwidth without significant CPU overhead. zEDC will also be used by DFSMSdss and DFSMShsm to deliver efficient compression when backing up and restoring data.
The IBM 31-bit and 64-bit SDK for z/OS, Java Technology Edition, Version 7 provides transparent exploitation of the zEDC Express feature.
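Because the exploitation is transparent, application code keeps using the standard java.util.zip classes. The minimal sketch below compresses a buffer with Deflater; on z/OS with the appropriate SDK and the zEDC Express feature, such calls can be offloaded to the accelerator, but the code itself makes no assumption about whether hardware compression is actually used.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

/**
 * Minimal sketch: compressing data through the standard java.util.zip API.
 * The code is ordinary, portable Java; on z/OS, the IBM SDK can transparently
 * drive such requests to the zEDC Express feature when it is installed and
 * enabled (an environment assumption, not something this code controls).
 */
public class ZedcCandidate {
    public static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int produced = deflater.deflate(buffer);
            out.write(buffer, 0, produced);
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = new byte[256 * 1024]; // highly compressible: all zeros
        byte[] compressed = compress(data);
        System.out.printf("original %d bytes, compressed %d bytes%n",
                data.length, compressed.length);
    }
}
```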
Similarly, the IBM Encryption Facility for z/OS (IBM EF) uses zEDC Express when used with z/OS Java Version 7. For IBM EF users not already using compression, compression with zEDC can provide IBM EF users a reduction of elapsed time and CPU times. This complements the software compression that exists today with Encryption Facility OpenPGP support.
IBM Sterling Connect:Direct for z/OS V5R2 can automatically leverage the zEDC Express Accelerator for file compression and decompression as files are transferred cross-platform. Usage may provide a reduction in elapsed time to transfer a file from z/OS to z/OS with minimal CPU increase.
Table 3-1 compares compression technologies.
Table 3-1 Compression technologies for z Systems
Each technology is characterized by what it is optimized for, its performance overhead, the data it supports, and the typical frequency of access after compression.
CMPSC compression on z Systems processor chip
 – Optimized for: DB2 and select DFSMS files
 – Performance overhead: On chip, relatively little CPU overhead and less I/O; fast
 – Supported data: DB2 (optimized where row-wise access to data is required); DFSMS VSAM and non-VSAM extended format data sets
 – Frequency of access post compression: Often
Other software compression (zlib, or similar)
 – Optimized for: Most compression in use in the industry today; used by many file types
 – Performance overhead: Higher CPU, because software instructions are executed (if Java, then eligible for zIIP or zAAP)
 – Supported data: Any; de facto standard for almost any type of data
 – Frequency of access post compression: Often
Tape hardware compression
 – Optimized for: Tape compression with large files, archival purposes
 – Performance overhead: Performed by the tape subsystem
 – Supported data: Any
 – Frequency of access post compression: Often or rare (application dependent)
Archival or backup compression
 – Optimized for: Archive data and data backup/copy
 – Performance overhead: CPU overhead, longer wall clock time
 – Supported data: DFSMShsm, DFSMSdss
 – Frequency of access post compression: Often or rare (application dependent)
Real-time compression
 – Optimized for: IBM NAS storage
 – Performance overhead: No performance degradation
 – Supported data: SAN Volume Controller
 – Frequency of access post compression: Designed for active primary data
zEDC Express
 – Optimized for: Active data, for cross-platform data exchange; enables compression of active and inactive data
 – Performance overhead: Processing on zEDC Express; expect minimal CPU overhead, low I/O latency
 – Supported data: SMF through Logger, zlib compatible, Java, BSAM/QSAM extended format, SOD: DFSMShsm/DFSMSdss, Encryption Facility
 – Frequency of access post compression: Frequent access required; useful also for files that previously used software compression
IBM z Batch Network Analyzer
The IBM z Batch Network Analyzer (zBNA) is a no-cost, “as-is” tool. It is available to clients, IBM Business Partners, and IBM employees.
zBNA replaces the BWATOOL. It is Windows based, provides graphical and text reports, including Gantt charts, and supports alternate processors.
zBNA can be used to analyze customer-provided SMF records to identify jobs and data sets that are candidates for zEDC compression across a specified time window, typically a batch window. zBNA is able to generate lists of data sets by job:
Those that already do hardware compression and might be candidates for zEDC
Those that might be zEDC candidates but are not in extended format
Thus, zBNA can help estimate utilization of zEDC features and help size the number of features needed.
Find zBNA at these web addresses:
IBM Clients can obtain zBNA and other CPS tools at this site:
IBM Business Partners can obtain zBNA and other CPS tools at this site:
IBM Employees can obtain zBNA and other CPS tools at this site:
3.2.6 10GbE RoCE Express
The 10 Gigabit Ethernet (10GbE) RoCE Express feature helps reduce consumption of CPU resources for applications that use the TCP/IP stack (such as WebSphere accessing a DB2 database).
Use of the 10GbE RoCE Express feature might also help to reduce network latency with memory-to-memory transfers utilizing Shared Memory Communications - Remote Direct Memory Access (SMC-R) in z/OS V2R1. It is transparent to applications and can be used for LPAR-to-LPAR communication on a single CPC or server-to-server communication in a multiple CPC environment.
z/OS V2R1 with PTFs supports the new sharing capability that is available for the Remote Direct Memory Access (RDMA) over Converged Ethernet (RoCE Express) features on z13 processors. This enhancement allows full use of the ports in the RoCE adapter and sharing of the adapters across up to 31 z/OS images on a z13 processor.
z/VM V6R3 also supports guest exploitation of RoCE.
In addition, Communications Server is enhanced to support selecting between the TCP/IP and RoCE transport layer protocols automatically, based on traffic characteristics. This feature is supported on the z13, zEC12, and zBC12, and is installed in the PCIe I/O drawer. A maximum of 16 features can be installed. On the z13, both ports of each feature are enabled for use, and each feature (two ports) can be shared by up to 31 logical partitions. On the zEC12 and zBC12, only one port can be used, and the port must be dedicated to a logical partition.
3.2.7 I/O capabilities
The z13 has many I/O capabilities for supporting high-speed connectivity to resources inside and outside the system. The connectivity of the z13 is designed to maximize application performance and satisfy clustering, security, storage area network (SAN), and local area network (LAN) requirements.
Multiple subchannel sets
Multiple subchannel sets (MSS) provide greater I/O device configuration capabilities for large enterprises. Up to four subchannel sets for z13 can be defined to each channel subsystem (CSS). Up to six channel subsystems can be defined on the z13.
For each additional subchannel set, the number of addressable devices is 65,535 (64 K = 65,536, minus one subchannel), which enables a larger number of storage devices. This increase complements other functions (such as large or extended address volumes) and HyperPAV. This can also help facilitate consistent device address definitions, simplifying addressing schemes for congruous devices.
The first subchannel set (SS0) allows the definition of any type of device (such as bases, aliases, secondaries, and those devices other than disks that do not implement the concept of associated aliases or secondaries). The second, third, and fourth subchannel sets (SS1, SS2, and SS3) can be designated for use for disk alias devices (of both primary and secondary devices) and Metro Mirror secondary devices only.
Initial program load from an alternate subchannel set
The z13 supports initial program load (IPL) from subchannel set 1 (SS1), subchannel set 2 (SS2), or subchannel set 3 (SS3). Devices that are used early during IPL processing can be accessed by using subchannel set 1, subchannel set 2, or subchannel set 3. This flexibility allows Metro Mirror (PPRC) secondary devices that are defined in an alternate subchannel set, using the same device number and a new device type, to be used for IPL, input/output definition file (IODF), and stand-alone dump volumes, when needed.
Channel subsystem enhancement for I/O resilience
The z13 channel subsystem incorporates an improved load-balancing algorithm that is designed to provide improved throughput and reduced I/O service times, even when abnormal conditions occur. For example, degraded throughput and response times can be caused by multi-system workload spikes. This reduction can also be caused by resource contention in storage area networks (SAN) or across control unit ports, SAN congestion, suboptimal SAN configurations, problems with initializing optics, dynamic fabric routing changes, and destination port congestion.
When such events occur, the channel subsystem is designed to dynamically select channels to optimize performance. The subsystem also minimizes imbalances in I/O performance characteristics (such as response time and throughput) across the set of channel paths to each control unit. This function is done by using the in-band I/O instrumentation and metrics of the z Systems FICON and zHPF protocols.
This channel subsystem enhancement is available on z13, zEC12, and zBC12 and is supported on all FICON channels when configured as CHPID type FC. The enhancement is transparent to operating systems. In support of this function, z/OS V1R12 and V1R13 with a program temporary fix (PTF), and later releases, provide an updated health check that is based on an I/O rate metric rather than on the initial control unit command response time.
FICON connectivity
The Fibre Connection (FICON) features in the z13 provide connectivity to servers, FC switches, and various devices (control units, disk, tape, printers) in a SAN environment. FICON builds on Fibre Channel technology and continues to evolve, delivering improved throughput, reliability, availability, and serviceability.
High Performance FICON for z Systems
High Performance FICON for z Systems (zHPF), first provided on System z10, is a FICON architecture for protocol simplification and efficiency, reducing the number of information units (IU) processed. Enhancements to the z/Architecture and the FICON interface architecture provide optimizations for online transaction processing (OLTP) workloads.
When used by the FICON channel, the z/OS operating system, and the control unit (appropriate levels of Licensed Internal Code are required), the FICON channel overhead can be reduced and performance can be improved. Additionally, the changes to the architecture provide end-to-end system enhancements to improve reliability, availability, and serviceability (RAS). The zHPF channel programs can be used, for instance, by z/OS OLTP I/O workloads, DB2, VSAM, PDSE, and zFS. zHPF requires matching support by the DS8000 series or similar devices from other vendors.
The zHPF is exclusive to z Systems. The FICON Express16S, FICON Express8S, and FICON Express8 (channel path identifier (CHPID) type FC) concurrently support both the existing FICON protocol and the zHPF protocol in the server Licensed Internal Code.
High Performance FICON for z Systems (zHPF) is enhanced to allow write operations larger than 64 KB, at distances of up to 100 km, to be run in a single round trip to the control unit, thereby not elongating the I/O service time for these write operations at extended distances. This is especially advantageous for IBM GDPS HyperSwap® configurations.
For more information about FICON channel performance, see the technical papers at the z Systems I/O connectivity website:
Modified Indirect Data Address Word (MIDAW) facility
The MIDAW facility is a system architecture and software function that is designed to improve FICON performance. This facility was introduced with the z9 servers and is used by the media manager in z/OS.
The MIDAW facility provides a more efficient structure for certain categories of data-chaining I/O operations, resulting in improved FICON performance and I/O response times, in particular for extended format data sets (DB2 is a major user). For more information about FICON, FICON channel performance, and MIDAW, see the following resources:
I/O Connectivity web page:
These IBM Redbooks publications:
 – How does the MIDAW Facility Improve the Performance of FICON Channels Using DB2 and other workloads?, REDP-4201
 – DS8000 Performance Monitoring and Tuning, SG24-7146
Extended distance FICON
Using an enhancement to the industry standard FICON architecture (FC-SB-3) can help avoid degradation of performance at extended distances by implementing a protocol for persistent information unit (IU) pacing. Control units that use the enhancement to the architecture can increase the pacing count (the number of IUs allowed to be in flight from channel to control unit). Extended distance FICON allows the channel to remember the last pacing update for use on subsequent operations to help avoid degradation of performance at the start of each new operation.
Improved IU pacing can optimize the use of the link (for example, it helps keep a 4 Gbps link fully used at 50 km) and allows channel extenders to work at any distance, with performance results similar to those experienced when using emulation.
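To see why more IUs must be in flight at distance, consider a rough bandwidth-delay calculation, sketched below in Python. The constants (about 5 µs of propagation delay per km of fiber and roughly 400 MB/s of effective payload bandwidth for a 4 Gbps link) are approximations chosen for illustration only; they are not taken from the FICON architecture documents.

    # Rough bandwidth-delay estimate for a 4 Gbps FICON link at 50 km.
    # All constants are approximations used only to illustrate the effect.
    distance_km = 50
    one_way_us_per_km = 5.0                                   # ~5 microseconds per km of fiber
    round_trip_s = 2 * distance_km * one_way_us_per_km / 1e6  # ~0.0005 s
    payload_bytes_per_s = 400e6                                # ~400 MB/s effective payload rate

    in_flight_bytes = payload_bytes_per_s * round_trip_s      # roughly 200 KB must be in flight
    print(f"Data in flight to keep the link busy: {in_flight_bytes / 1024:.0f} KB")

Because the pacing count limits how many IUs can be outstanding, a larger, persistent pacing count is what allows the channel to keep roughly that much data in flight without pausing at the start of each new operation.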
The requirements for channel extension equipment are simplified with the increased number of commands in flight. This can benefit z/OS Global Mirror (also referred to as Extended Remote Copy, or XRC) applications, because the channel extension kit is no longer required to simulate specific channel commands. Simplifying the channel extension requirements can help reduce the total cost of ownership of end-to-end solutions.
Extended Distance FICON is transparent to operating systems and applies to all the FICON Express16S, FICON Express8S, and FICON Express8 features carrying basic FICON traffic (CHPID type FC). For usage, the control unit must support the new IU pacing protocol.
Usage of extended distance FICON is supported by the IBM System Storage® DS8000 series with an appropriate level of Licensed Machine Code (LMC).
z/OS discovery and autoconfiguration
z/OS discovery and autoconfiguration for FICON channels (zDAC) automatically performs a number of I/O configuration definition tasks for new and changed disk and tape controllers that are connected to an FC switch, when attached to a FICON channel.
Users can define a policy, by using the hardware configuration definition (HCD) dialog. Then, when new controllers are added to an I/O configuration or changes are made to existing controllers, the system is designed to discover them and propose configuration changes that are based on that policy. This policy can include preferences for availability and bandwidth, which includes PAV definitions, control unit numbers, and device number ranges.
zDAC is designed to perform discovery for all systems in a sysplex that support the function. The proposed configuration incorporates the current contents of the I/O definition file (IODF) with additions for newly installed and changed control units and devices. zDAC is designed to simplify I/O configuration on z13 running z/OS and reduce complexity and setup time. zDAC applies to all FICON features supported on z Systems when configured as CHPID type FC.
FICON name server registration
The FICON channel provides the same information to the fabric as is commonly provided by open systems, registering with the name server in the attached FICON directors. This enables quicker and more efficient management of the storage area network (SAN) and of problem determination and analysis.
Platform registration is a standard service that is defined in the Fibre Channel - Generic Services 3 (FC-GS-3) standard (INCITS (ANSI) T11.3 group). It allows a platform (storage subsystem, host, and so on) to register information about itself with the fabric (directors).
This z13 function is transparent to operating systems and applies to all FICON Express16S, FICON Express8S, and FICON Express8 features (CHPID type FC). For more information, see IBM z Systems Connectivity Handbook, SG24-5444.
FCP connectivity
Fibre Channel Protocol is fully supported on the z13. It is commonly used with Linux on z Systems and supported by the z/VM, z/VSE, and Linux on z Systems operating systems.
Fibre Channel Protocol enhancements for small block sizes
The Fibre Channel Protocol (FCP) Licensed Internal Code was modified to help provide increased I/O operations per second for small block sizes. This FCP performance improvement is transparent to operating systems and applies to all FICON Express16S, FICON Express8S, and FICON Express8 features, when configured as CHPID type FCP, communicating with SCSI devices.
For more information about FCP channel performance, see the performance technical papers on the z Systems I/O Connectivity web page:
Recognizing that high reliability is important for maintaining the availability of business-critical applications, the z Systems Fibre Channel Protocol (FCP) has implemented support of the American National Standards Institute's (ANSI) T10 Data Integrity Field (DIF) standard. Data integrity protection fields are generated by the operating system and propagated through the storage area network (SAN). IBM z Systems help to provide added end-to-end data protection between the operating system and the storage device.
An extension to the standard, Data Integrity Extensions (DIX), provides checksum protection from the application layer through the host bus adapter (HBA), where cyclical redundancy checking (CRC) protection is implemented.
T10-DIF support by the FICON Express16S, FICON Express8S, and FICON Express8 features, when defined as CHPID type FCP, is available to z13 and to zEnterprise CPCs. Usage of the T10-DIF standard requires support by the operating system and the storage device.
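As a rough illustration of the guard check that DIF adds to each block, the Python sketch below computes a 16-bit CRC using the polynomial commonly documented for T10 DIF (0x8BB7). It only illustrates the concept; it is not IBM's implementation, and in practice the DIF fields are generated and checked by the operating system, the adapter, and the storage device along the I/O path.

    def crc16_t10dif(data: bytes, poly: int = 0x8BB7) -> int:
        """Bit-by-bit CRC-16 using the polynomial commonly cited for T10 DIF."""
        crc = 0x0000
        for byte in data:
            crc ^= byte << 8
            for _ in range(8):
                crc = ((crc << 1) ^ poly) if (crc & 0x8000) else (crc << 1)
                crc &= 0xFFFF
        return crc

    # Each protected block carries a small DIF field whose guard word is a CRC
    # of the block's data, verified at each point along the I/O path.
    block = bytes(512)                  # a 512-byte data block (all zeros here)
    print(f"Guard CRC: 0x{crc16_t10dif(block):04X}")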
N_Port ID Virtualization (NPIV)
NPIV is designed to allow the sharing of a single physical FCP channel among operating system images, whether in logical partitions or as z/VM guests. This is achieved by assigning a unique worldwide port name (WWPN) for each operating system that is connected to the FCP channel. In turn, each operating system appears to have its own distinct WWPN in the SAN environment, hence enabling separation of the associated FCP traffic on the channel.
Access controls that are based on the assigned WWPN can be applied in the SAN environment. This function can be done by using standard mechanisms such as zoning in SAN switches and logical unit number (LUN) masking in the storage controllers.
WWPN tool
Part of installing a z13 server is planning the SAN environment (if applicable). IBM provides a stand-alone tool to assist with this planning before the installation. The tool, which is known as the WWPN tool, assigns WWPNs to each virtual Fibre Channel Protocol (FCP) channel/port by using the same WWPN assignment algorithms that the system uses when assigning WWPNs for channels that use NPIV. Thus, the SAN can be set up in advance, allowing operations to proceed much faster after the server is installed.
The WWPN tool takes a CSV (.csv) file that contains the FCP-specific I/O device definitions and creates the WWPN assignments that are required to set up the SAN. A binary configuration file that can be imported later by the system is also created. The .csv file can either be created manually or exported from the Hardware Configuration Definition/Hardware Configuration Manager (HCD/HCM).
The WWPN tool is available for download from the IBM Resource Link and is applicable to all FICON channels defined as CHPID type FCP (for communication with SCSI devices) on z13.
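The overall flow of the tool (CSV device definitions in, predictable WWPN assignments out) can be pictured with the small Python sketch below. The column names, the WWPN prefix, and the derivation shown here are hypothetical; the real tool uses IBM's own assignment algorithm and also produces the binary configuration file for import into the system.

    import csv
    import hashlib

    def assign_wwpns(csv_path: str) -> dict:
        """Derive a stable placeholder WWPN for each FCP device entry.

        The CSV columns (pchid, css, chpid, devno) and the prefix are hypothetical;
        this only illustrates a deterministic CSV-in, WWPN-out flow.
        """
        assignments = {}
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                key = f"{row['pchid']}.{row['css']}.{row['chpid']}.{row['devno']}"
                digest = hashlib.sha256(key.encode()).hexdigest()[:10]
                assignments[key] = "c05076" + digest   # 16 hex digits, WWPN-like
        return assignments

Because the derivation is deterministic, rerunning the sketch on the same definitions yields the same WWPNs, which is the property that lets SAN zoning and LUN masking be prepared before the server arrives.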
LAN connectivity
The z13 offers a wide range of functions that can help consolidate or simplify the LAN environment with the supported OSA-Express features, while also satisfying the demand for more throughput. Improved throughput (mixed inbound/outbound) is achieved by the data router function that was introduced in the OSA-Express3 feature and enhanced in the OSA-Express4S and OSA-Express5S features.
With the data router, the store and forward technique in DMA is no longer used. The data router enables a direct host memory-to-LAN flow. This function avoids a hop and is designed to reduce latency and to increase throughput for standard frames (1492 bytes) and jumbo frames (8992 bytes).
Queued direct I/O (QDIO) optimized latency mode
QDIO optimized latency mode can help improve performance for applications that have a critical requirement to minimize response times for inbound and outbound data. It optimizes interrupt processing in the following ways:
For inbound processing, the TCP/IP stack looks more frequently for available data to process, ensuring that any new data is read from the OSA-Express5S or OSA-Express4S feature without requiring more program controlled interrupts (PCI).
For outbound processing, the OSA-Express5S or OSA-Express4S feature looks more frequently for available data to process from the TCP/IP stack, thus not requiring a Signal Adapter (SIGA) instruction to determine whether more data is available.
Inbound workload queuing (IWQ)
IWQ helps reduce overhead and latency for inbound z/OS network data traffic and implements an efficient way to initiate parallel processing. This is achieved by using an OSA-Express5S or OSA-Express4S feature in QDIO mode (CHPID types OSD and OSX) with multiple input queues and by processing network data traffic based on workload types. The data for a specific workload type is placed in one of four input queues (per device), and a process is created and scheduled to run on one of multiple processors, independent of the other three queues. This improves performance because IWQ can use the symmetric multiprocessor (SMP) architecture of the z13.
Virtual local area network (VLAN) support
VLAN is a function of the OSA-Express features that takes advantage of the IEEE 802.1Q standard for virtual bridged LANs. VLANs allow easier administration of logical groups of stations that communicate as though they were on the same LAN. In the virtualized environment of z Systems, many TCP/IP stacks can exist, potentially sharing OSA-Express features. VLAN provides a greater degree of isolation by allowing contact with a server only from the set of stations that comprise the VLAN.
Virtual MAC (VMAC) support
When sharing OSA port addresses across LPARs, VMAC support enables each operating system instance to have a unique virtual MAC (VMAC) address. All IP addresses associated with a TCP/IP stack are accessible by using their own VMAC address, instead of sharing the MAC address of the OSA port. Advantages include a simplified configuration setup and improvements to IP workload load balancing and outbound routing.
This support is available for Layer 3 mode and is used by z/OS and supported by z/VM for guest usage.
QDIO data connection isolation for the z/VM environment
New workloads increasingly require multitier security zones. In a virtualized environment, an essential requirement is to protect workloads from intrusion or exposure of data and processes from other workloads.
The QDIO data connection isolation enables the following elements:
Adherence to security and HIPAA security guidelines and regulations for network isolation between the instances that share physical network connectivity.
Establishment of security zone boundaries that are defined by the network administrators.
A mechanism to isolate a QDIO data connection (on an OSA port) by forcing traffic to flow to the external network. This feature ensures that all communication flows only between an operating system and the external network.
Internal routing can be disabled on a per-QDIO connection basis. This support does not affect the ability to share an OSA port. Sharing occurs as it does today, but the ability to communicate between sharing QDIO data connections can be restricted through this support.
QDIO data connection isolation (also known as VSWITCH port isolation) applies to the z/VM environment when using the Virtual Switch (VSWITCH) function and to all OSA-Express5S and OSA-Express4S features (CHPID type OSD) on z13. z/OS supports a similar capability.
QDIO interface isolation for z/OS
Some environments require strict controls for routing data traffic between servers or nodes. In certain cases, the LPAR-to-LPAR capability of a shared OSA port can prevent such controls from being enforced. With interface isolation, internal routing can be controlled on an LPAR basis. When interface isolation is enabled, the OSA discards any packets that are destined for a z/OS LPAR that is registered in the OAT as isolated.
QDIO interface isolation is supported by Communications Server for z/OS V1R11 and later and by all OSA-Express5S and OSA-Express4S features on z13.
Open Systems Adapter for NCP (OSN)
OSN support provides channel connectivity from z Systems operating systems to the IBM Communication Controller for Linux on z Systems (CCL). This is done by using the Open Systems Adapter for the Network Control Program (OSA for NCP), which supports the Channel Data Link Control (CDLC) protocol.
When SNA solutions that require NCP functions are needed, CCL can be considered as a migration strategy to replace IBM Communications Controllers (374x). The CDLC connectivity option enables z/TPF environments to use CCL.
 
OSN: The OSN CHPID type is not supported on OSA-Express5S GbE or OSA-Express4S GbE features.
Network management: Query and display OSA configuration
As more complex functions are added to OSA, it becomes more important for the system administrator to be able to display, monitor, and verify the specific OSA configuration that is in use by each operating system. OSA-Express5S and OSA-Express4S provide the capability for the operating system to query and display the current OSA configuration information (similar to OSA/SF) directly. z/OS uses this OSA capability by providing the TCP/IP operator command Display OSAINFO, which allows the operator to monitor and verify the current OSA configuration, helping to improve the overall management, serviceability, and usability of the OSA-Express5S and OSA-Express4S features.
The Display OSAINFO command is exclusive to OSA-Express5S and OSA-Express4S (CHPID types OSD, OSM, and OSX) and the z/OS operating system, and is supported by z/VM for guest use.
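For example, the configuration of an OSA port can be displayed from the z/OS console with a command of the following general form (shown here as a sketch; check the z/OS Communications Server documentation for the exact operands at your release level):

    D TCPIP,tcpipproc,OSAINFO,PORTNAME=osaport

where tcpipproc is the name of the TCP/IP started task and osaport is the OSA port name that is defined for the interface.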
z Systems ensemble connectivity
With the IBM zEnterprise Systems, two CHPID types have been introduced to support the z Systems ensemble:
OSA-Express for Unified Resource Manager (OSM) for the intranode management network (INMN)
OSA-Express for zBX (OSX) for the intraensemble data network (IEDN)
The INMN is one of the ensemble’s two private and secure internal networks. INMN is used by the Unified Resource Manager functions in the primary HMC.
The OSM connections are through the System Control Hub (SCH) in the z13. The INMN requires two OSA-Express5S 1000BASE-T, or OSA-Express4S 1000BASE-T ports from separate features.
The IEDN is the ensemble’s other private and secure internal network. IEDN is used for communications across the virtualized images (LPARs and virtual machines). The IEDN connections use MAC addresses, not IP addresses (Layer 2 connection).
The OSX connections are from the z Systems CPC to the IEDN TOR switches in zBX. The IEDN requires two OSA-Express5S 10 GbE, or OSA-Express4S 10 GbE ports from separate features.
HiperSockets
The HiperSockets feature has been referred to as the “network in a box.” HiperSockets simulates LANs entirely in the hardware. The data transfer is from LPAR memory to LPAR memory, mediated by microcode. The z13 supports up to 32 HiperSockets. One HiperSockets network can be shared by up to 85 LPARs on a z13. Up to 4096 communication paths support a total of 12,288 IP addresses across all 32 HiperSockets.
HiperSockets Layer 2 support
The HiperSockets internal networks can support two transport modes:
Layer 2 (link layer)
Layer 3 (network or IP layer)
Traffic can be Internet Protocol (IP) Version 4 or Version 6 (IPv4, IPv6) or non-IP (such as AppleTalk, DECnet, IPX, NetBIOS, SNA, or others). In Layer 2 mode, HiperSockets devices are independent of the Layer 3 protocol. Each HiperSockets device has its own Layer 2 Media Access Control (MAC) address, which is designed to allow the use of applications that depend on the existence of Layer 2 addresses, such as Dynamic Host Configuration Protocol (DHCP) servers and firewalls.
Layer 2 support can help facilitate server consolidation. Complexity can be reduced, network configuration is simplified and intuitive, and LAN administrators can configure and maintain the mainframe environment the same way as they do for a non-mainframe environment. HiperSockets Layer 2 support is provided by Linux on z Systems, and by z/VM for guest usage.
HiperSockets Multiple Write Facility
HiperSockets performance is enhanced to allow for the streaming of bulk data over a HiperSockets link between LPARs. The receiving LPAR can now process a much larger amount of data per I/O interrupt. This enhancement is transparent to the operating system in the receiving LPAR. HiperSockets Multiple Write Facility, with fewer I/O interrupts, reduces CPU use of the sending and receiving LPAR.
The HiperSockets Multiple Write Facility is supported in the z/OS environment.
zIIP-Assisted HiperSockets for large messages
In z/OS, HiperSockets are enhanced for zIIP usage. Specifically, the z/OS Communications Server allows the HiperSockets Multiple Write Facility processing for outbound large messages that originate from z/OS to be performed on a zIIP.
zIIP-Assisted HiperSockets can help make highly secure and available HiperSockets networking an even more attractive option. z/OS application workloads that are based on XML, HTTP, SOAP, Java, and traditional file transfer can benefit from zIIP enablement by lowering general-purpose processor use for such TCP/IP traffic.
When the workload is eligible, the TCP/IP HiperSockets device driver layer (write) processing is redirected to a zIIP, which unblocks the sending application.
zIIP-Assisted HiperSockets for large messages is available on z13 with z/OS V1R12 and later releases.
HiperSockets Network Traffic Analyzer (NTA)
HiperSockets NTA is a function that is available in the LIC of the z13. It can simplify problem isolation and resolution by allowing Layer 2 and Layer 3 tracing of HiperSockets network traffic.
HiperSockets NTA allows Linux on z Systems to control tracing of the internal virtual LAN. It captures records into host memory and storage (file systems) that can be analyzed by system programmers and network administrators, using Linux on z Systems tools to format, edit, and process the trace records.
A customized HiperSockets NTA rule enables authorizing an LPAR to trace messages only from LPARs that are eligible to be traced by the NTA on the selected IQD channel.
HiperSockets Completion Queue
The HiperSockets Completion Queue function allows both synchronous and asynchronous transfer of data between logical partitions. With the asynchronous support, during high volume situations, data can be temporarily held until the receiver has buffers available in its inbound queue. This provides end-to-end performance improvement for LPAR to LPAR communication and can be especially helpful in burst situations.
HiperSockets Completion Queue function is supported on the z13 running z/OS V1R13, z/VM V6R2 (with maintenance) and later, z/VSE V5R1 (with maintenance), Red Hat Enterprise Linux (RHEL) 6.2, or SUSE Linux Enterprise Server (SLES) 11 SP2 (with maintenance), and later.
HiperSockets integration with the intraensemble data network
The z13 servers provide the capability to integrate HiperSockets connectivity with the intraensemble data network (IEDN). Thus, the reach of the HiperSockets network is extended outside the CPC to the entire ensemble, which appears as a single Layer 2 network. Because HiperSockets and IEDN are both internal z Systems networks, the combination allows z Systems virtual servers to use an optimal path for communications.
The support of HiperSockets integration with the IEDN function is available starting with z/OS Communication Server V1R13.
HiperSockets Virtual Switch Bridge Support
The z/VM virtual switch is enhanced to transparently bridge a guest virtual machine network connection on a HiperSockets LAN segment. This bridge allows a single HiperSockets guest virtual machine network connection to also directly communicate with the following systems:
Other guest virtual machines on the virtual switch
External network hosts through the virtual switch OSA UPLINK port
z/VM V6R2 and later, TCP/IP, and Performance Toolkit APARs are required for this support.
A HiperSockets channel by itself is only capable of providing intra-CPC communications.
The HiperSockets Bridge Port gives a virtual switch the ability to connect z/VM guests that use real HiperSockets devices with hosts that reside outside the CPC. The virtual switch HiperSockets Bridge Port eliminates the need to configure a separate next-hop router on the HiperSockets channel to provide connectivity to destinations that are outside of a HiperSockets channel.
z/VSE fast path to Linux support
Linux Fast Path (LFP) allows z/VSE TCP/IP applications to communicate with the TCP/IP stack on Linux without using a TCP/IP stack on z/VSE. LFP has been supported since z/VSE V5R1, both in a z/VM guest environment and in an LPAR environment; use in an LPAR environment requires the HiperSockets Completion Queue function that is available on z Systems CPCs.
Coupling and Server Time Protocol connectivity
Coupling connectivity in support of Parallel Sysplex environments is provided on the z13 by the following features:
The new PCIe Gen3 Integrated Coupling Adapter (ICA SR), which provides two coupling link ports per feature, each operating at 8 GBps over distances of up to 150 m (492 feet).
HCA3-O 12x InfiniBand coupling links, offering up to 6 GBps of bandwidth between z13, zBC12, z196, and z114 systems, for a distance of up to 150 m (492 feet).
HCA3-O LR 1x InfiniBand coupling links, offering up to 5 Gbps of bandwidth between z13, zEC12, zBC12, z196, and z114 systems, for a distance of up to 10 km (6.2 miles).
Internal Coupling Channels (ICs), operating at memory speed.
All coupling link types can be used to carry Server Time Protocol (STP) messages. The z13 does not support ISC-3 connectivity. Also, HCA2-O 12x and HCA2-O LR 1x InfiniBand features are not supported in z13.
3.2.8 Cryptography
z13 provides cryptographic functions that, from an application program perspective, can be grouped as follows:
Synchronous cryptographic functions, provided by the CP Assist for Cryptographic Function (CPACF)
Asynchronous cryptographic functions, provided by the Crypto Express features
CP Assist for Cryptographic Function (CPACF)
CPACF offers a set of symmetric cryptographic functions for high-performance encryption and decryption with clear key operations for SSL/TLS, VPN, and data-storing applications that do not require FIPS 140-2 Level 4 security. The CPACF is integrated with the compression unit in the coprocessor (CoP) in the z13 microprocessor core.
CPACF protected key is a function that facilitates the continued privacy of cryptographic key material while maintaining high performance. CPACF ensures that key material is not visible to applications or operating systems during encryption operations. CPACF protected key provides substantial throughput improvements for large-volume data encryption and low latency for encryption of small blocks of data.
The cryptographic assist includes support for the following functions:
Data Encryption Standard (DES) data encrypting and decrypting.
DES supports the following key types:
 – Single-length key DES
 – Double-length key DES
 – Triple-length key DES (T-DES)
Advanced Encryption Standard (AES) for 128-bit, 192-bit, and 256-bit keys
Pseudo random number generation (PRNG)
Message Authentication Code (MAC)
Hashing algorithms: SHA-1 and SHA-2 support for SHA-224, SHA-256, SHA-384, and SHA-512
SHA-1 and SHA-2 support for SHA-224, SHA-256, SHA-384, and SHA-512 is shipped enabled on all servers and does not require the CPACF enablement feature. The CPACF functions are supported by z/OS, z/VM, z/VSE, z/TPF, and Linux on z Systems.
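The operations that CPACF accelerates are ordinary clear-key hashing, MAC, and symmetric encryption operations, of the kind shown in the short Python sketch below. The sketch only illustrates the type of work involved; whether such calls are actually routed to CPACF depends on the operating system and the cryptographic libraries in use (for example, ICSF on z/OS or hardware-enabled OpenSSL builds on Linux on z Systems).

    import hashlib
    import hmac

    data = b"transaction record 0001"
    key = b"0123456789abcdef"        # an illustrative 128-bit clear key

    digest = hashlib.sha256(data).hexdigest()                # SHA-2 hashing
    mac = hmac.new(key, data, hashlib.sha256).hexdigest()    # message authentication code
    print(digest, mac, sep="\n")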
Crypto Express5S
The Crypto Express5S represents the newest generation of the Peripheral Component Interconnect Express (PCIe) cryptographic coprocessors. It is an optional feature exclusive to the z13. This feature provides a secure programming and hardware environment wherein crypto processes are performed. Each cryptographic coprocessor includes a general-purpose processor, non-volatile storage, and specialized cryptographic electronics.
The Crypto Express5S has one PCIe adapter per feature. For availability reasons, a minimum of two features is required. Up to 16 Crypto Express5S features are supported (16 PCI Express adapters per z13). The Crypto Express5S feature occupies one I/O slot in a z13 PCIe I/O drawer.
Each adapter can be configured as a Secure IBM CCA coprocessor, a Secure IBM Enterprise PKCS #11 (EP11) coprocessor, or as an accelerator.
Crypto Express5S is enhanced to provide domain support for up to 85 logical partitions on IBM z13.
The accelerator function is designed for maximum-speed Secure Sockets Layer and Transport Layer Security (SSL/TLS) acceleration, rather than for specialized financial applications or for secure, long-term storage of keys or secrets. Alternatively, the Crypto Express5S can be configured as one of the following coprocessors:
Secure IBM CCA coprocessor for Federal Information Processing Standard (FIPS) 140-2 Level 4 certification. This standard includes secure key functions and is optionally programmable to deploy more functions and algorithms using User Defined Extension (UDX).
Secure IBM Enterprise PKCS #11 (EP11) coprocessor, implementing an industry standardized set of services that adheres to the PKCS #11 specification v2.20 and more recent amendments. It was designed for extended FIPS and Common Criteria evaluations to meet industry requirements.
This new cryptographic coprocessor mode introduced the PKCS #11 secure key function.
 
TKE feature: The Trusted Key Entry (TKE) Workstation feature is required for supporting the administration of the Crypto Express5S when configured as an Enterprise PKCS #11 coprocessor.
When the Crypto Express5S PCI Express adapter is configured as a secure IBM CCA coprocessor, it still provides accelerator functions. However, up to three times better performance for those functions can be achieved if the Crypto Express5S PCI Express adapter is configured as an accelerator.
Web deliverables
For z/OS downloads, see the z/OS website:
3.3 Capacity and performance
The z13 offers significant increases in capacity and performance over its predecessor, the zEC12. Many factors contribute to this, including the larger number of processors, improved individual processor performance, larger memory caches, two-way simultaneous multithreading (SMT-2), and new machine instructions, including the single-instruction, multiple-data (SIMD) instructions. Subcapacity settings continue to be offered.
3.3.1 Capacity settings
The z13 expands the subcapacity offerings. Finer granularity in capacity levels allows the installed capacity to more closely follow the enterprise's growth, for a smoother, pay-as-you-go investment profile. Many performance and monitoring tools are available for z Systems environments, and they are complemented by the flexibility of the capacity on demand options (see 3.3.2, “z13 Capacity on Demand (CoD)” on page 91). These features help to manage growth by making capacity available when needed.
Regardless of the installed model, the z13 offers four distinct capacity levels for the first 30 central processors (CP):
One full capacity
Three subcapacities
These processors deliver the scalability and granularity to meet the needs of medium-sized enterprises, while also satisfying the requirements of large enterprises that have large-scale, mission-critical transaction and data-processing requirements.
A capacity level is a setting of each CP to a subcapacity of the full CP capacity. The clock frequency of those processors remains unchanged. The capacity adjustment is achieved through other means.
Full capacity CPs are identified as CP7. On the z13 server, 141 CPs can be configured as CP7. The three subcapacity levels are identified by CP6, CP5, and CP4, respectively, and are displayed in hardware descriptions as feature codes on the CPs.
If more than 30 CPs are configured in the system, all of them must be full capacity because all CPs must be on the same capacity level. Granular capacity adds 90 subcapacity settings to the 141 capacity settings that are available with full capacity CPs (CP7). The resulting 231 distinct capacity settings provide a range of more than 1:320 in processing power.
A processor that is characterized as anything other than a CP, such as a zIIP, an IFL, or an ICF, is always set at full capacity. There is, correspondingly, a separate pricing model for non-CP processors regarding purchase and maintenance prices, and various offerings for software licensing.
On z13, the CP subcapacity levels are a fraction of full capacity, as follows:
Model 7xx = 100%
Model 6xx = 63%
Model 5xx = 44%
Model 4xx = 15%
For administrative purposes, systems that have only ICF or IFL processors are given a capacity setting of 400. Either of these systems can have up to 141 ICFs or IFLs, which always run at full capacity.
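As a simple illustration of how these ratios can be used for rough sizing, the Python sketch below applies the stated fractions to a processor count. It ignores multiprocessor scaling effects and workload characteristics, so it is no substitute for zPCR or the LSPR data.

    # Approximate relative capacity from the published subcapacity ratios.
    # Ignores MP scaling effects; use zPCR/LSPR for real sizing.
    capacity_ratio = {"7xx": 1.00, "6xx": 0.63, "5xx": 0.44, "4xx": 0.15}

    def relative_capacity(model: str, n_cps: int) -> float:
        return n_cps * capacity_ratio[model]

    # Example: ten CP5 engines compared with ten full-capacity CP7 engines
    print(relative_capacity("5xx", 10) / relative_capacity("7xx", 10))   # 0.44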
Figure 3-2 gives more details about z13 full capacity and subcapacity offerings.
Figure 3-2 z13 full and subcapacity CP offerings
To help size a z Systems platform to fit client requirements, IBM provides a no-cost tool that reflects the latest IBM LSPR measurements, called the IBM Processor Capacity Reference for z Systems (zPCR). The tool can be downloaded from the following web page:
For more information about LSPR measurements, see 3.3.3, “z13 performance” on page 92.
3.3.2 z13 Capacity on Demand (CoD)
The z13 continues to provide on-demand offerings. They provide flexibility and control to the client, ease the administrative burden in the handling of the offerings, and give the client finer control over resources that are needed to meet the resource requirements in various situations.
The z13 can perform concurrent upgrades, providing an increase of processor capacity with no server outage. In most cases, with operating system support, a concurrent upgrade can also be nondisruptive to the operating system. These upgrades are based on the enablement of resources that are already physically present in the z13.
Capacity upgrades cover both permanent and temporary changes to the installed capacity. The changes can be done by using the Customer Initiated Upgrade (CIU) facility, without requiring IBM service personnel involvement. Such upgrades are initiated through the web by using IBM Resource Link. Use of the CIU facility requires a special contract between the client and IBM, through which terms and conditions for online Capacity on Demand (CoD) buying of upgrades and other types of CoD upgrades are accepted. For more information, consult the IBM Resource Link.
For more information about the CoD offerings, see IBM z13 Technical Guide, SG24-8251.
Permanent upgrades
Permanent upgrades of processors (CP, IFL, ICF, zIIP, and SAP) and memory, or changes to a server’s Model-Capacity Identifier, up to the limits of the installed processor capacity on an existing z13, can be performed by the client through the IBM Online Permanent Upgrade offering by using the CIU facility.
Temporary upgrades
Temporary upgrades of a z13 can be done by On/Off CoD, Capacity Backup (CBU), or Capacity for Planned Event (CPE) ordered from the CIU facility.
On/Off CoD function
On/Off CoD is a function that is available on the z13 that enables concurrent and temporary capacity growth of the CPC. On/Off CoD can be used for client peak workload requirements, for any length of time; it has a daily hardware charge and can have an associated software charge. On/Off CoD offerings can be prepaid or post-paid. Capacity tokens are available on the z13: they are always present in prepaid offerings and can optionally be included in post-paid offerings. In both cases, capacity tokens are used to control the maximum resource and financial consumption.
When using the On/Off CoD function, the client can concurrently add processors (CP, IFL, ICF, zIIP, and SAP), increase the CP capacity level, or both.
Capacity Backup (CBU) function
CBU allows the client to perform a concurrent and temporary activation of additional CP, ICF, IFL, zIIP, and SAP, an increase of the CP capacity level, or both. This function can be used in the event of an unforeseen loss of z Systems capacity within the client’s enterprise, or to perform a test of the client’s disaster recovery procedures. The capacity of a CBU upgrade cannot be used for peak workload management.
CBU features are optional and require unused capacity to be available on the CPC drawers of the backup system, either as unused PUs or as the ability to increase the CP capacity level on a subcapacity system, or both. A CBU contract must be in place before the LIC-CC code that enables this capability can be loaded on the system. An initial CBU record provides for one test for each CBU year (each up to 10 days in duration) and one disaster activation (up to 90 days in duration). The record can be configured to be valid for up to five years. Clients can also order additional tests for a CBU record if needed, in increments of five tests, up to a maximum of 15.
Proper use of the CBU capability does not incur any additional software charges from IBM.
Capacity for Planned Event (CPE) function
CPE allows the client to perform a concurrent and temporary activation of additional CPs, ICFs, IFLs, zIIPs, and SAPs, an increase of the CP capacity level, or both. This function can be used in the event of a planned outage of z Systems capacity within the client’s enterprise (for example, data center changes, system or power maintenance). CPE cannot be used for peak workload management and can be active for a maximum of three days.
The CPE feature is optional and requires unused capacity to be available on the CPC drawers of the backup system, either as unused PUs or as the ability to increase the CP capacity level on a subcapacity system, or both. A CPE contract must be in place before the LIC-CC that enables this capability can be loaded on the system.
z/OS capacity provisioning
Capacity provisioning helps clients manage the CP and zIIP capacity of a z13 that is running one or more instances of the z/OS operating system. By using the z/OS Capacity Provisioning Manager (CPM) component, On/Off CoD temporary capacity can be activated and deactivated under the control of a defined policy. Combined with functions in z/OS, the z13 provisioning capability gives the client a flexible, automated process to control the configuration and activation of On/Off CoD offerings.
3.3.3 z13 performance
The z Systems microprocessor chip of the z13 has a high-frequency design that uses IBM leading technology and offers more cache per core than other chips. In addition, an enhanced instruction execution sequence, along with processing technologies such as SMT, delivers world-class per-thread performance. The z/Architecture is enhanced with more instructions, including SIMD, that are intended to deliver improved CPU-centric performance and analytics. For CPU-intensive workloads, more gains can be achieved through multiple compiler-level improvements. The improved performance of the z13 is a result of the enhancements that are described in Chapter 2, “Hardware overview” on page 25 and 3.2, “The z13 technology improvements” on page 67.
The z13 Model NE1 offers up to 40% more capacity than the largest zEC12 system. Uniprocessor performance also increased significantly. Based on an average workload, a z13 Model 701 offers performance improvements of up to 10% over the zEC12 Model 701. However, the observed performance increase varies depending on the workload type.
LSPR workload suite: z13 changes
To help you better understand workload variations, IBM provides a no-cost tool, IBM Processor Capacity Reference for z Systems (zPCR), which is available at this web page:
IBM continues to measure performance of the systems by using various workloads and publishes the results in the Large Systems Performance Reference (LSPR) report. The LSPR is available at the following web page:
The MSU ratings are available at the following web page:
Historically, LSPR capacity tables, including pure workloads and mixes, have been identified with application names or a software characteristic. Examples are as follows:
CICS
IMS
OLTP-T: Traditional online transaction processing workload (formerly known as IMS)
CB-L: Commercial batch with long-running jobs
LoIO-mix: Low I/O Content Mix Workload
TI-mix: Transaction Intensive Mix Workload
However, capacity performance is more closely associated with how a workload uses and interacts with a particular processor hardware design. Workload capacity performance is sensitive to three major factors:
Instruction path length
Instruction complexity
Memory hierarchy
With the availability of CPU measurement facility (CPU MF) data, it is now possible to gain insight into the interaction of workload and hardware design in production workloads. CPU MF data helps LSPR adjust workload capacity curves based on the underlying hardware sensitivities, in particular the processor's access to caches and memory, which is known as nest activity intensity. With the IBM zEnterprise System, the LSPR introduced three workload capacity categories that replace all prior primitives and mixes:
LOW (relative nest intensity):
A workload category that represents light use of the memory hierarchy. This category is similar to past high scaling primitives.
AVERAGE (relative nest intensity):
A workload category that represents average use of the memory hierarchy. This category is similar to the past LoIO-mix workload and is expected to represent most of the production workloads.
HIGH (relative nest intensity):
A workload category that represents heavy use of the memory hierarchy. This category is similar to the past TI-mix workload.
These categories are based on the relative nest intensity, which is influenced by many variables, such as application type, I/O rate, application mix, CPU usage, data reference patterns, LPAR configuration, and the software configuration that is running, among others. CPU MF data can be collected by the z/OS System Management Facilities (SMF) in SMF type 113 records.
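A typical, simplified way to gather this data on z/OS is to include SMF type 113 among the recorded record types and to run the Hardware Instrumentation Services (HIS) started task in counter collection mode. The lines below are an outline only; the exact parameters and defaults vary by z/OS release, so verify them against your system's documentation.

    SMFPRMxx:   SYS(TYPE(70:79,113))      record SMF type 113 along with the RMF types
    Console:    S HIS                     start Hardware Instrumentation Services
                F HIS,BEGIN               begin a collection run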
Guidance in converting LSPR previous categories to the new ones is provided, and built-in support is added to the IBM zPCR tool.
In addition to low, average, and high categories, the latest zPCR provides the low-average and average-high mixed categories, which allow better granularity for workload characterization.
The LSPR tables continue to rate all z/Architecture processors running in LPAR mode and 64-bit mode. The single-number values are based on a combination of the default mixed workload ratios, typical multi-LPAR configurations, and expected early-program migration scenarios. In addition to z/OS workloads used to set the single-number values, the LSPR tables contain information that pertains to Linux and z/VM environments.
The LSPR contains the internal throughput rate ratios (ITRR) for the z13 and the previous generations of processors that are based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user might experience varies depending on factors such as the amount of multiprogramming in the user's job stream, the I/O configuration, and the workload processed.
Experience demonstrates that z Systems servers can be run at up to 100% utilization levels, sustained, although most clients prefer to leave a bit of white space and run at 90% or slightly under. For any capacity comparison, using a single number such as the MIPS or MSU metrics is not a valid method. That is why, when doing capacity planning, we suggest using zPCR and involving IBM technical support. For more information about z13 performance, see the IBM z13 Technical Guide, SG24-8251.
Throughput optimization with z13
The z990 was the first server to use the concept of books. The memory and cache structure implementation in the z13 CPC drawers has been enhanced, from the z990 through successive system generations to the z13, to provide sustained throughput and performance improvements. Although memory is distributed across the CPC drawers, and each drawer has levels of cache that are private to individual cores or shared among cores, all processors have access to the highest level of cache and to all of the memory. Thus, the system is managed as a memory-coherent symmetric multiprocessor (SMP).
Processors within the z13 CPC drawer structure have different distance-to-memory attributes. As described in 2.3, “z13 CPC drawers, and single chip modules” on page 31, CPC drawers are connected in a star configuration to minimize the distance. Other non-negligible effects result from data latency when grouping and dispatching work on a set of available logical processors. To minimize latency, the system attempts to dispatch and later re-dispatch work to a group of physical CPUs that share cache levels.
PR/SM manages the use of physical processors by logical partitions by dispatching the logical processors on the physical processors. However, PR/SM is not aware of which workloads are being dispatched by the operating system on which logical processors. The Workload Manager (WLM) component of z/OS has this information at the task level, but is unaware of physical processors. This disconnect is solved by enhancements that allow PR/SM and WLM to work more closely together. They can cooperate to create an affinity between a task and a physical processor, rather than between a logical partition and a physical processor. This is known as HiperDispatch.
HiperDispatch
HiperDispatch, introduced with the z10 Enterprise Class, and evolved in z196 and zEC12, is further enhanced in z13. It combines two functional enhancements, one in the z/OS dispatcher and one in PR/SM. This function is intended to improve efficiency both in the hardware and in z/OS. z/VM HiperDispatch is introduced by z/VM V6R3.
In general, the PR/SM dispatcher assigns work to the minimum number of logical processors that are needed for the priority (weight) of the LPAR. On z13, PR/SM attempts to group the logical processors into the same node (see Figure 2-5 on page 32) or into a neighboring node in the same CPC drawer and, if possible, onto the same chip. This reduces multiprocessor effects, maximizes the use of shared caches, and lowers the interference across multiple partitions.
The z/OS dispatcher is enhanced to operate with multiple dispatching queues, and tasks are distributed among these queues. Specific z/OS tasks can be dispatched to a small subset of logical processors. PR/SM ties these logical processors to the same physical processors, thus improving the hardware cache reuse and locality of reference characteristics, such as reducing the rate of cross communication.
To use the correct logical processors, the z/OS dispatcher obtains the necessary information from PR/SM through interfaces that are implemented on the z13. The entire z13 stack (hardware, firmware, and software) now tightly collaborates to obtain the full potential of the hardware. z/VM HiperDispatch provides support similar to the z/OS one.
The HiperDispatch function is enhanced on the z13 to use the new eight-core chip and improve computing efficiency. It is possible to dynamically switch on and off HiperDispatch without requiring an initial program load (IPL).
 
Note: HiperDispatch is required if SMT is enabled.
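On z/OS, HiperDispatch is controlled by the HIPERDISPATCH parameter of the IEAOPTxx parmlib member and can be changed dynamically with the SET OPT command, which is how the switch without an IPL is accomplished. The lines below show the general form only; verify the defaults and exact syntax for your z/OS release.

    IEAOPTxx:   HIPERDISPATCH=YES
    Console:    T OPT=xx          activate the updated IEAOPTxx member without an IPL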
3.4 Common time functions of z Systems
Each server must have an accurate time source to maintain a time-of-day value. Logical partitions use their system’s time. When system images participate in a Sysplex, coordinating the time across all system images in the sysplex is critical to its operation.
The z13 supports the Server Time Protocol (STP) and can participate in a STP-only coordinated timing network (CTN).
3.4.1 Server Time Protocol (STP)
STP is a message-based protocol in which timekeeping information is passed over data links between servers. The timekeeping information is transmitted over externally defined coupling links. The STP feature is the supported method for maintaining time synchronization between the z13 and coupling facilities (CF) in sysplex environments.
The STP design uses a concept that is called Coordinated Timing Network (CTN). A CTN is a collection of CPCs that are time-synchronized to a time value called Coordinated Server Time (CST). Each CPC to be configured in a CTN must be STP-enabled. STP is intended for CPCs that are configured to participate in a Parallel Sysplex or CPCs that are not in a Parallel Sysplex, but must be time-synchronized.
STP is implemented in LIC as a system-wide facility of the z13 and other z Systems CPCs. STP presents a single view of time to PR/SM and provides the capability for multiple CPCs to maintain time synchronization with each other. The z13 server is enabled for STP by installing the STP feature code. Extra configuration is required for a z13 to become a member of a CTN.
 
Important: The IBM z13 cannot join a CTN that includes a z10 or earlier system as a member. Because the z10 was the last server that supported the IBM Sysplex Timer (9037) connectivity, the z13 cannot be configured as a member of a mixed CTN. The z13 can join only an STP-only CTN.
STP provides the following additional value over the formerly used time synchronization method, the Sysplex Timer:
STP supports a multi-site timing network of up to 100 km (62 miles) over fiber optic cabling, without requiring an intermediate site. This protocol allows a Parallel Sysplex to span these distances and reduces the cross-site connectivity that is required for a multi-site Parallel Sysplex.
The STP design allows more stringent synchronization between CPCs and CFs by using communication links that are already used for the sysplex connectivity. With the z13, STP supports coupling links over InfiniBand or Integrated Coupling Adapter links.
STP helps eliminate infrastructure requirements, such as power and space, needed to support the Sysplex Timers and helps eliminate maintenance costs that are associated with the Sysplex Timers.
STP can reduce the fiber optic infrastructure requirements in a multi-site configuration because it can use the coupling links that are already in use.
STP recovery enhancement
When HCA3-O, HCA3-O LR, or ICA SR coupling links are used, an unambiguous “going away” signal is sent when the server on which the HCA3 or ICA is installed is about to enter a failed state. When the going away signal that is sent by the Current Time Server (CTS) in an STP-only CTN is received by the Backup Time Server (BTS), the BTS can safely take over as the CTS. The takeover can occur without relying on the previous recovery methods of the offline signal (OLS) in a two-server CTN or the arbiter in a CTN with three or more servers.
The previously available STP recovery design is still available for the cases in which a going away signal is not received, or for failures other than a system failure.
3.4.2 Network Time Protocol (NTP) client support
The use of NTP servers as an external time source (ETS) usually fulfills a requirement for a time source or common time reference across heterogeneous platforms and for providing a higher time accuracy.
NTP client support is available in the Support Element (SE) code of the z13. The code interfaces with the NTP servers. This interaction allows an NTP server to become the single time source for z13 and for other servers that have NTP clients. NTP can be used only for an STP-only CTN environment.
Pulse per second (PPS) support
Two oscillator cards (OSC), included as a standard feature of the z13, provide a dual-path interface for the PPS signal. Each card has a BNC connector for PPS attachment at the rear of frame A of the CPC. The redundant design allows continuous operation if one card fails, and allows concurrent card maintenance.
STP tracks the highly stable, accurate PPS signal from the NTP server and maintains an accuracy of 10 µs as measured at the PPS input of the z13 CPC.
If STP uses an NTP server without PPS, a time accuracy of 100 ms to the ETS is maintained. A cable connection from the PPS port to the PPS output of an NTP server is required when the z13 is configured for using NTP with PPS as the ETS for time synchronization.
NTP server on HMC with security enhancements
The NTP server capability on the HMC addresses the potential security concerns that users might have about attaching NTP servers directly to the HMC/SE LAN. When the HMC is used as the NTP server, the pulse per second capability is not available.
HMC NTP broadband authentication support for z13
The HMC NTP authentication capability is provided by the HMC Level 2.12.0 and later. SE NTP support stays unchanged. To use this option for STP, configure the HMC as the NTP server for the SE.
The authentication support of the HMC NTP server can be set up in either of two ways:
NTP requests are UDP socket packets and cannot pass through a proxy. If a proxy is used to access servers outside the corporate data center, the proxy must be configured as an NTP server to reach the target time servers on the web. Authentication can be set up on the client's proxy to communicate with the target time sources.
If a firewall is used, HMC NTP requests must pass through the firewall. Clients in this configuration should use the HMC authentication to ensure untampered time stamps.
For more details about STP, see the following books:
Server Time Protocol Planning Guide, SG24-7280
Server Time Protocol Implementation Guide, SG24-7281
3.5 Hardware Management Console (HMC) functions
The HMC and SE are appliances that provide hardware platform management for z Systems. Hardware platform management covers a complex set of setup, configuration, operation, monitoring, and service management tasks and services that are essential to the use of the z Systems hardware platform product.
When tasks are performed on the HMC, the commands are sent to one or more SEs, which issue commands to their CPCs and zBXs.
HMC/SE Version 2.13.0 is the current version available for the z13. See IBM z13 Technical Guide, SG24-8251, for more information about these HMC functions and capabilities, and also zBX Model 004.
3.5.1 HMC key enhancements for z13
The HMC application has several enhancements:
Tasks and panels are updated to support configuring and managing Flash Express, IBM zAware, zEDC Express, and 10GbE RoCE Express features.
The backup for the HMC and SE can additionally be saved to an FTP server for z13.
OSA/SF is available on the HMC for specific OSA-Express features.
For STP NTP broadband security, authentication is added to the HMC's NTP communication with NTP time servers, and the panels to configure STP are redesigned.
Modem support is removed from HMC. The Remote Support Facility (RSF) for IBM support, service, and configuration update is only possible through an Ethernet broadband connection.
The Monitors Dashboard on the HMC and SE is enhanced with an adapter table. The Crypto Utilization percentage is displayed on the Monitors Dashboard according to the PCHID number. The adapter table also displays Flash Express. You can now display the activity for a logical partition (LPAR) by processor type, and the Monitors Dashboard is enhanced to show simultaneous multithreading (SMT) usage.
The Environmental Efficiency Statistic Task provides historical power consumption and thermal information for z13 on the HMC. This task provides similar data along with a historical summary of processor and channel use. The initial chart display shows the 24 hours that precede the current time so that a full 24 hours of recent data is displayed. The data is presented in table form, graphical (histogram) form, and it can also be exported to a .csv formatted file so that it can be imported into a spreadsheet.
Microcode can be updated to a specific bundle level.
 
Statements of Direction1:
Removal of support for Classic Style User Interface on the Hardware Management Console and Support Element: IBM z13 will be the last z Systems server to support Classic Style User Interface. In the future, user interface enhancements will be focused on the Tree Style User Interface.
Removal of support for the Hardware Management Console Common Information Model (CIM) Management Interface: IBM z13 will be the last z Systems server to support the Hardware Management Console Common Information Model (CIM) Management Interface. The Hardware Management Console Simple Network Management Protocol (SNMP) and Web Services application programming interfaces (APIs) will continue to be supported.

1 All statements regarding IBM plans, directions, and intent are subject to change or withdrawal without notice. Any reliance on these statements of general direction is at the relying party’s sole risk and will not create liability or obligation for IBM.
For more information about the key capabilities and enhancements of the HMC, see IBM z13 Technical Guide, SG24-8251.
3.6 z13 power and cooling functions
As environmental concerns raise the focus on energy consumption, the z13 offers a holistic focus on the environment. New efficiencies and functions, such as an improved integrated cooling system and static power save mode, enable a reduction of energy usage. The new design of the rear door covers addresses past data center issues regarding airflow challenges: the covers can be vectored up or down to direct the outgoing air.
3.6.1 High voltage DC power
In today’s data centers, many businesses are paying increasing electric bills and are also running out of available power.
This feature allows CPCs to directly use the high voltage DC distribution in new, green data centers. A direct HVDC13 data center power design can improve data center energy efficiency by removing the need for DC-to-AC and AC-to-DC inversion/conversion steps. The z13 bulk power supplies are able to support HVDC, so the only difference in the shipped hardware to implement the option is the DC power cords.
Because HVDC is a new technology, there are multiple proposed standards. The z13 supports both ground-referenced and dual-polarity (differential) HVDC supplies, such as +/-190V or +/-260V, or +380V. Beyond the data center uninterruptible power supply (UPS) and power distribution energy savings, a z13 running on HVDC power draws 1 - 3% less input power by eliminating the AC-to-DC internal conversion. HVDC does not change the number of power cords that a system requires.
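As a rough, hypothetical illustration of what the 1 - 3% input-power reduction can mean over a year, the following Python sketch computes annual energy and cost savings for an assumed configuration; the 20 kW input power and the electricity rate are placeholder values, not z13 specifications:

# Rough illustration of HVDC input-power savings (assumed figures, not z13 specs).
assumed_input_kw = 20.0       # hypothetical average input power of a configuration
savings_fraction = 0.02       # midpoint of the quoted 1 - 3% reduction
hours_per_year = 24 * 365
rate_per_kwh = 0.12           # hypothetical electricity price in USD

saved_kwh = assumed_input_kw * savings_fraction * hours_per_year
saved_cost = saved_kwh * rate_per_kwh
print(f"Estimated savings: {saved_kwh:.0f} kWh/year (~${saved_cost:.0f}/year)")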
3.6.2 Integrated battery feature (IBF)
IBF is an optional feature on the z13. See Figure 2-2 on page 29 or Figure 2-3 on page 30 for a view of the location of IBF. IBF provides the function of a local uninterrupted power source.
The IBF further enhances the robustness of the power design, increasing power line disturbance immunity. The feature provides battery power to preserve processor data if there is a total loss of power from the utility company. The IBF can hold power briefly during a brownout, or for orderly shutdown in a longer outage.
3.6.3 Power capping and power saving
Power capping limits the maximum power consumption and reduces the cooling requirements, especially for the zBX. The z13 server itself does not support power capping.
A static power-saving mode is also available for the z13 when the Unified Resource Manager Automate Firmware Suite feature is installed. It uses frequency and voltage reduction to reduce energy consumption and can be set up ad hoc or as a scheduled operation. For example, in periods of low utilization or on CBU systems, clients can place the system in static power-saving mode. Power-saving functions are also provided for the blades in the zBX.
3.6.4 Power estimation tool
The power estimation tool for z13 is a web-based tool that is available to registered users of IBM Resource Link. The tool allows entering the exact server configuration to produce an estimate of power consumption.
Log in to IBM Resource Link and go to Planning → Tools → Power Estimation Tools. Specify the quantity for the features that are installed in the machine. The tool estimates the power consumption for the specified configuration. The tool does not verify whether the specified configuration can be physically built.
 
Power consumption: The exact power consumption for a machine will vary. The objective of the tool is to produce an estimation of the power requirements to aid in planning for machine installation. Actual power consumption after installation can be confirmed with the HMC monitoring tools.
3.6.5 IBM Systems Director Active Energy Manager
IBM Systems Director Active Energy Manager™ is an energy management solution building block that returns true control of energy costs to the client. This feature enables management of the actual power consumption and resulting thermal loads that IBM servers place on the data center. It is an industry-leading cornerstone of the IBM energy management framework. In tandem with chip vendors Intel and AMD, and consortia such as the Green Grid, Active Energy Manager advances the IBM initiative to deliver price performance per unit of area.
Active Energy Manager runs on Windows, Linux on System x, AIX, Linux on IBM System p®, and Linux on z Systems. For more information, see the documentation for Active Energy Manager:
How Active Energy Manager works
The following list is a brief overview of how Active Energy Manager works:
Hardware, firmware, and systems management software in servers and blades can take inventory of components.
Active Energy Manager adds up the power draw for each server or blade and tracks that usage over time.
When power is constrained, Active Energy Manager allows power to be allocated on a server-by-server basis. Consider the following information:
 – Be careful that limiting power consumption does not affect performance.
 – Sensors and alerts can warn the user if limiting power to this server could affect performance.
Certain data can be gathered from the SNMP API on the HMC:
 – System name, machine type, model, serial number, firmware level
 – Ambient and exhaust temperature
 – Average and peak power (over a 1-minute period)
 – Other limited status and configuration information
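The following minimal Python sketch illustrates how a management tool such as Active Energy Manager could poll this kind of data through SNMP. It uses the open source pysnmp library; the HMC address, community string, and object identifiers (OIDs) shown are placeholders for illustration only and are not documented HMC values:

from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

# Placeholder values: substitute the real HMC address, community, and OIDs.
HMC_HOST = "hmc.example.com"
COMMUNITY = "public"
OIDS = {
    "average_power": "1.3.6.1.4.1.0.0.1",   # hypothetical OID
    "exhaust_temp":  "1.3.6.1.4.1.0.0.2",   # hypothetical OID
}

def poll(host, community, oid):
    """Issue a single SNMP GET and return the value as a string."""
    error_ind, error_status, _, var_binds = next(getCmd(
        SnmpEngine(),
        CommunityData(community),
        UdpTransportTarget((host, 161)),
        ContextData(),
        ObjectType(ObjectIdentity(oid))))
    if error_ind or error_status:
        raise RuntimeError(f"SNMP error: {error_ind or error_status}")
    return str(var_binds[0][1])

for name, oid in OIDS.items():
    print(name, poll(HMC_HOST, COMMUNITY, oid))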
3.6.6 Top Exit Power
IBM z13 supports the optional Top Exit Power feature. This feature enables installing a radiator (air) cooled z13 on a non-raised floor, when the optional top exit I/O cabling feature is also installed. Water-cooled z13 models cannot be installed on a non-raised floor as top exit support for water cooling systems is not available. On a raised floor, either radiator or water cooling is supported.
3.7 IBM z BladeCenter Extension (zBX) Model 004
z13 introduces the IBM z BladeCenter Extension (zBX) Model 004. The zBX Model 004 continues to support workload optimization and integration. As a stand-alone node of an existing ensemble, the zBX can house multiple environments, including AIX, Linux on System x, and Windows, supporting a “fit for purpose” application deployment.
The zBX is tested and packaged together at the IBM manufacturing site and shipped as one unit, relieving complex configuration and setup requirements. With a focus on availability, the zBX has hardware redundancy that is built in at various levels: the power infrastructure, rack-mounted network switches, power and switch units in the BladeCenter chassis, and redundant cabling for support and data connections. The zBX Model 004 components are configured, managed, and serviced using a pair of internal 1U rack-mounted Support Elements, as a node of an ensemble that is defined to the ensemble HMC like any other ensemble member.
Although the zBX processors are not z/Architecture PUs, the zBX is handled by z Systems firmware called IBM z Unified Resource Manager.
GDPS/PPRC and GDPS/GM support zBX hardware components, providing workload failover for automated multi-site recovery. These capabilities can help facilitate the management of planned and unplanned outages across IBM z13.
3.7.1 IBM blades
zBX Model 004 supports IBM AIX on IBM POWER7, Linux on System x, Microsoft Windows on System x and IBM WebSphere DataPower Integration Appliance XI50 for zEnterprise on a blade form factor, which are connected to the z Systems CPCs through virtual LANs supported on a high-speed private network.
IBM BladeCenter PS701 Express blades are virtualized by PowerVM Enterprise Edition. The virtual servers in PowerVM run the AIX operating system. PowerVM handles all access to the hardware resources, providing a Virtual I/O Server (VIOS) function and the ability to create logical partitions. The logical partitions can be either dedicated processor LPARs, which require a minimum of one core per partition, or shared processor LPARs (micro-partitions), which can be as small as 0.1 core per partition.
A select set of IBM BladeCenter HX5 (7873) blades can be used by the zBX. These blades have an integrated hypervisor, and their virtual machines run Linux on System x and Windows Server 2012.
When ordering a zBX Model 004 MES upgrade, a new entitlement record can be acquired by the client. This new entitlement record allows IBM System x blades or IBM POWER7 PS701 blades to be ordered and added to the zBX, up to the limit of available empty (unused) slots in the existing zBX BladeCenter chassis.
 
Unsupported: The addition of new racks or new BladeCenter chassis is not supported. The addition of the IBM WebSphere DataPower Integration Appliance XI50 for zEnterprise is also not supported.
3.8 Reliability, availability, and serviceability (RAS)
The IBM z Systems family presents numerous enhancements in the RAS areas. Focus was given to reducing the planning requirements, while continuing to reduce planned, scheduled, and unscheduled outages. One of the contributors to scheduled outages is LIC Driver updates that are performed in support of new features and functions. Enhanced driver maintenance (EDM) can help reduce the necessity and eventual duration of a scheduled outage. When properly configured, the z13 can concurrently activate a new LIC Driver level. Concurrent activation of the selected new LIC Driver level is supported at specifically released synchronization points. However, for certain LIC updates, a concurrent update or upgrade is not possible.
The effects of drawer repair and upgrade actions are minimized on the z13 with enhanced drawer availability (EDA). In a multiple drawer system, a single drawer can be concurrently removed and reinstalled for an upgrade or repair. To ensure that the z13 configuration supports removal of a drawer with minimal effect on the workload, consider the Flexible Memory option (see “Flexible memory” on page 73).
The z13 provides a method to increase memory availability, referred to as redundant array of independent memory14 (RAIM), where a fully redundant memory system can identify and correct memory errors without stopping. The implementation is similar to the RAID concept used in storage systems for a number of years. See IBM z13 Technical Guide, SG24-8251 for a detailed description of the RAS features.
To help prevent outages, improvements in several components of the z13 are introduced. These enhancements include changes to the following areas:
Physical packaging
Bus structures
Processor cores
Memory and cache hierarchy
Power subsystem
Thermal subsystem
Service subsystem
Integrated sparing
The z13 central processor complex (CPC) subsystem consists of horizontal drawers, each designed as a field replaceable unit (FRU). Connections among the drawers are established using symmetric multiprocessing (SMP) cables. Each drawer consists of two nodes, and each node contains three processor unit (PU) chips, one system cache (SC) chip, and 10 or 15 DDR3 DIMM slots. With a two-node drawer structure, the z13 design supports system activation with partial-drawer resources in a degraded mode, if necessary. The PU and SC chips are designed as single chip modules (SCMs) and FRUs.
A redundant pair of distributed converter assemblies (DCAs) step down the bulk power and connect to 10 point of load (POL) cards, which provide power conversion and regulation. Two redundant oscillators are connected to the drawers through an isolated backplane. Time domain reflectometry (TDR) techniques are applied to isolate failures on the SMP cables, between chips (PU-PU, PU-SC, and SC-SC), and between the PU chips and DIMMs.
Additional redundancy is designed into new N+1 system control hubs (SCHs) and associated power supplies, and 1U service elements (SEs). Improvements to the z13 I/O infrastructure reliability include better recovery of FICON channels facilitated through forward error correction code (FEC) technology.
An air-cooled configuration features a fully-redundant N+2 radiator pump design that cools the PU chips through a water manifold FRU.
Further RAS enhancements include integrated sparing, error detection and recovery improvements in caches and memory, refreshes to IBM zAware, Flash Express, RoCE, and PCIe coupling, Fibre Channel Protocol support for T10-DIF, a fixed HSA with its size increased to 96 GB on the z13, OSA firmware changes to increase the capability of concurrent maintenance change level (MCL) updates, a new radiator cooling system with N+2 redundancy, new CFCC level, and IBM RMF™ reporting.
z13 continues to support concurrent addition of resources, such as processors or I/O cards to an LPAR to achieve better serviceability. If an additional system assist processor (SAP) is required on a z13 (for example, as a result of a disaster recovery situation), the SAPs can be concurrently added to the CPC configuration.
Concurrently adding CP, zIIP, IFL, and ICF processors to an LPAR is possible. This function is supported by z/VM V5R415 and later, and also (with appropriate PTFs) by z/OS and z/VSE V4R3 and later. Previously, proper planning was required to add CP, zAAP, and zIIP to a z/OS LPAR concurrently. Concurrently adding memory to an LPAR is possible. This is supported by z/OS and z/VM.
z13 supports adding Crypto Express features to an LPAR dynamically by changing the cryptographic information in the image profiles. Users can also dynamically delete or move Crypto Express features. This enhancement is supported by z/OS, z/VM, and Linux on z Systems.
3.8.1 IBM z Advanced Workload Analysis Reporter (IBM zAware)
Introduced with the zEC12 and also available with the zBC12, the IBM zAware feature is an integrated expert solution that uses sophisticated analytics to help clients identify potential problems and improve overall service levels.
IBM zAware runs analytics in a dedicated logical partition (LPAR) and intelligently examines z/OS message logs for potential deviations, inconsistencies, or variations from the norm, providing out-of-band monitoring and machine learning of operating system health.
IBM zAware can accurately identify system anomalies in minutes. This feature analyzes massive amounts of processor data to identify problematic messages and provides information that can feed other processes or tools. The IBM zAware virtual appliance monitors the z/OS operations log (OPERLOG), which contains all messages that are written to the z/OS console, including application-generated messages. IBM zAware provides a graphical user interface (GUI) to help you easily drill down into message anomalies, which can lead to faster problem resolution.
IBM zAware is enhanced to support Linux on z Systems images running natively or as guests in z/VM, identifying unusual system behavior by analyzing the syslog.
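IBM does not publish the analytics that IBM zAware uses, but the underlying idea of flagging messages that are rare or unseen relative to a learned baseline can be pictured with a simple, purely conceptual Python sketch; the message IDs and the scoring are illustrative assumptions only:

from collections import Counter

# Conceptual illustration only: score messages by how rarely their ID appeared
# in a "training" window, the intuition behind message-anomaly analysis.
training_ids = ["IEF403I", "IEF404I", "IEF403I", "IEA989I", "IEF403I"]  # hypothetical history
baseline = Counter(training_ids)
total = sum(baseline.values())

def rarity_score(msg_id):
    """Higher score = rarer (or never seen) message ID."""
    freq = baseline.get(msg_id, 0) / total
    return 1.0 - freq

for msg_id in ["IEF403I", "IXC101I"]:        # the second ID is unseen in the baseline
    print(msg_id, f"score={rarity_score(msg_id):.2f}")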
 
Statement of Direction1: IBM intends to deliver IBM z Advanced Workload Analysis Reporter (IBM zAware) support for z/VM. This future release of IBM zAware is intended to help identify unusual behaviors of workloads running on z/VM in order to accelerate problem determination and improve service levels.

1 All statements regarding IBM plans, directions, and intent are subject to change or withdrawal without notice. Any reliance on these statements of general direction is at the relying party’s sole risk and will not create liability or obligation for IBM.
For more information about IBM zAware, see these sources:
IBM z13 Technical Guide, SG24-8251
Extending z/OS System Management Functions with IBM zAware, SG24-8070
IBM z Advanced Workload Analysis Reporter (IBM zAware) Guide V2.0, SC27-2632
3.8.2 RAS capability for the SE
Enhancements are made to the Support Element (SE) design for z13. Notebooks that were used on prior generations of z Systems servers have been replaced with rack-mounted 1U System x servers in a redundant configuration on z13. The new, more powerful 1U SEs offer RAS improvements such as ECC memory, redundant physical networks for SE networking requirements, redundant power modules, and better thermal characteristics.
3.8.3 RAS capability for the HMC
Enhancements are made to the HMC designs for z13 also. New for z13 is an option to order 1U System x servers for traditional and ensemble HMC configurations. This new 1U HMC offers the same RAS improvements as those of the 1U SE. The 1U HMC option is a customer-supplied rack and power consolidation solution that can save space in data centers. The MiniTower design used prior to z13 will still be available.
The Unified Resource Manager is an active part of the ensemble infrastructure. Thus, the HMC has a stateful environment that needs high-availability features to ensure survival of the system in case of an HMC failure.
Each ensemble requires two HMC workstations:
A primary
A backup (alternate)
The contents and activities of the primary are updated on the alternate HMC synchronously so that the alternate can take over the activities of the primary should the primary fail. Although the primary HMC can perform the classic HMC activities in addition to the Unified Resource Manager activities, the alternate HMC can be only a backup.
3.8.4 RAS capability for zBX
The zBX was built following the traditional z Systems hardware quality of service (QoS) to include RAS capabilities. The zBX Model 004 provides extended service capability as a member of the ensemble infrastructure. Flexibility and scale-out are improved by eliminating the management coupling between a controlling CPC and the zBX Model 004. CPC upgrade complexities are also reduced. With the zBX Model 004 configured as an independent node in the ensemble, serviceability on the zBX Model 004 does not affect other CPCs in the ensemble and vice versa. The ensemble HMC provides management and control functions for the zBX Model 004 solution.
Independent of the number of zBX racks installed, the zBX Model 004 is configured to provide N+1 redundancy. Two SEs with HMC network connectivity are included in the upgrade. Installed only on zBX Model 004’s first rack are four Top of Rack (TOR) switches, two for each network (INMN and IEDN). These switches provide N + 1 connectivity for the data network between a CPC and the zBX Model 004, and for the management network used for monitoring and controlling the zBX Model 004 components. The zBX components can be replaced concurrently.
zBX firmware
The testing, delivery, installation, and management of the zBX Model 004 firmware are handled the same way as for the z13. The same z13 processes and controls are used. Any fixes to the zBX Model 004 are downloaded and applied independent of the CPCs in the ensemble. The zBX Model 004 SE will contain only the applicable MCL streams for the zBX configurations. The z13 will not contain any of the blade MCL streams. Most MCLs for the zBX Model 004 are concurrent and their status can be viewed at the ensemble’s HMCs.
These and other features are described in IBM z13 Technical Guide, SG24-8251.
3.9 High availability
The z Systems platform is renowned for its reliability, availability, and serviceability capabilities, of which Parallel Sysplex is a prime example. Extended availability technology with IBM PowerHA® for AIX is available for blades in the zBX. We describe both the z Systems Parallel Sysplex technology and the PowerHA technology.
3.9.1 High availability for z Systems with Parallel Sysplex
Parallel Sysplex technology is a clustering technology for logical and physical servers, allowing highly reliable, redundant, and robust z Systems technology to achieve near-continuous availability. Hardware and software cooperate tightly to achieve this result.
A Parallel Sysplex has the following minimum components:
Coupling facility (CF)
This is the cluster center. It can be implemented either as an LPAR of a stand-alone z Systems CPC or as an additional LPAR of a z Systems CPC where other loads are running. Processor units that are characterized as either CPs or ICFs can be configured to this LPAR. ICFs are often used because they do not incur any software license charges. Two CFs are recommended for availability.
Coupling Facility Control Code (CFCC)
This IBM Licensed Internal Code is both the operating system and the application that runs in the CF. No other code runs in the CF. The code is used to create and maintain the structures, which are exploited under z/OS by software components such as z/OS itself, DB2 for z/OS, and WebSphere MQ, among others.
CFCC can also run in a z/VM virtual machine (as a z/VM guest system). In fact, a complete sysplex can be set up under z/VM, allowing, for instance, testing and operations training. This setup is not recommended for production environments.
Coupling links
These are high-speed links that connect the several system images (each running in its own logical partition) that participate in the Parallel Sysplex. At least two connections between each physical server and the CF must exist. When all of the system images belong to the same physical server, internal coupling links are used.
On the software side, the z/OS operating system uses the hardware components to create a Parallel Sysplex. One example of z/OS and CF collaboration is system-managed CF structure duplexing, which provides a general-purpose, hardware-assisted, easy-to-exploit mechanism for duplexing structure data held in CFs. This function provides a robust recovery mechanism for failures (such as the loss of a single structure in a CF or the loss of connectivity to a single CF). The recovery is done through rapid failover to the other structure instance of the duplex pair.
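The failover behavior of system-managed CF structure duplexing can be pictured with a minimal conceptual Python sketch: every update is propagated to both structure instances, and if one instance or its CF becomes unreachable, requests continue against the surviving copy. This is a thinking model only, not the CFCC or z/OS implementation:

class DuplexedStructure:
    """Conceptual model of a duplexed CF structure: two synchronized copies."""

    def __init__(self):
        self.copies = [{}, {}]          # primary and secondary structure instances
        self.available = [True, True]

    def write(self, key, value):
        # System-managed duplexing propagates each update to both instances.
        for i, copy in enumerate(self.copies):
            if self.available[i]:
                copy[key] = value

    def fail(self, index):
        # Simulate loss of one structure or of connectivity to its CF.
        self.available[index] = False

    def read(self, key):
        # Rapid failover: read from any surviving instance.
        for i, copy in enumerate(self.copies):
            if self.available[i]:
                return copy[key]
        raise RuntimeError("No surviving structure instance")

s = DuplexedStructure()
s.write("lock.DB2.PAGE42", "held-by-SYSA")   # hypothetical resource name
s.fail(0)                                    # lose the primary copy
print(s.read("lock.DB2.PAGE42"))             # still resolvable from the duplex partner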
If you are interested in deploying system-managed CF structure duplexing, read the technical paper System-Managed CF Structure Duplexing, ZSW01975USEN, which you can access by clicking Learn more on the Parallel Sysplex website:
 
z/TPF: z/TPF can also use the CF hardware components. However, the term sysplex exclusively applies to z/OS usage of the CF.
Normally, two or more z/OS images are clustered to create a Parallel Sysplex, although it is possible to have a configuration setting with a single image, called a monoplex. Multiple clusters can span several z Systems CPCs, although a specific image (logical partition) can belong to only one Parallel Sysplex.
A z/OS Parallel Sysplex implements shared-all access to data. This is facilitated by z Systems I/O virtualization capabilities such as the multiple image facility (MIF). MIF allows several logical partitions to share I/O paths in a secure way, maximizing use and greatly simplifying the configuration and connectivity.
In short, a Parallel Sysplex comprises one or more z/OS operating system images that are coupled through one or more coupling facilities. A properly configured Parallel Sysplex cluster is designed to maximize availability at the application level. Rather than quick recovery from a failure, the Parallel Sysplex design objective is zero failure.
The major characteristics of a Parallel Sysplex include the following features:
Data sharing with integrity
The CF is key to the implementation of a share-all access to data. Every z/OS system image has access to all the data. Subsystems in z/OS declare resources to the CF. The CF accepts and manages lock and unlock requests on those resources, guaranteeing data integrity. A duplicate CF further enhances the availability. Key users of the data sharing capability are DB2, WebSphere MQ, WebSphere ESB, IMS, and CICS. Because these are major infrastructure components, applications that use them inherently benefit from sysplex characteristics. For instance, many large SAP implementations have the database component on DB2 for z/OS, in a Parallel Sysplex.
Continuous (application) availability
Changes, such as software upgrades and patches, can be introduced one image at a time, while the remaining images continue to process work. For more details, see Improving z/OS Application Availability by Managing Planned Outages, SG24-8178.
High capacity
Parallel Sysplex scales from two to 32 images. Remember that each image can have from one to 128 (z/OS V2R1) processor units. CF scalability is near-linear. This structure contrasts with other forms of clustering that employ n-to-n messaging, which leads to rapidly degrading performance with a growing number of nodes.
Dynamic workload balancing
Viewed as a single logical resource, work can be directed to any of the Parallel Sysplex cluster operating system images where capacity is available.
Systems management
This architecture provides the infrastructure to satisfy a client requirement for continuous availability, while enabling techniques for achieving simplified systems management consistent with this requirement.
Resource sharing
A number of base z/OS components use CF shared storage. This usage enables the sharing of physical resources with significant improvements in cost, performance, and simplified systems management.
Single system image
The collection of system images in the Parallel Sysplex is displayed as a single entity to the operator, user, database administrator, and so on. A single system image ensures reduced complexity from both operational and definition perspectives.
N-2 support
Multiple hardware generations (normally three: the current one and the two previous ones) are supported in the same Parallel Sysplex. This configuration provides for a gradual evolution of the systems in the sysplex, without forcing all of them to change simultaneously. Similarly, multiple software releases or versions are supported.
Figure 3-3 illustrates the components of a Parallel Sysplex as implemented within the z Systems architecture. The diagram shows one of many possible Parallel Sysplex configurations.
Figure 3-3 Sysplex hardware overview
Figure 3-3 shows a z13 system that contains multiple z/OS sysplex partitions and an internal coupling facility (CF02), a z13 server containing a stand-alone CF (CF01), and a zEC12 containing multiple z/OS sysplex partitions. STP over coupling links provides time synchronization to all servers. The appropriate CF link technology (1x IFB, 12x IFB, or ICA-SR) depends on the server configuration and on the physical distance between servers. ICA-SR links can be used only between z13 servers, over short distances.
3.9.2 PowerHA in zBX environment
High availability for applications running on AIX is provided by IBM PowerHA SystemMirror® for AIX (formerly known as IBM HACMP™16). PowerHA is easy to configure (menu-driven) and helps define and manage the resources that applications running on AIX require, providing service and application continuity through platform resource and application monitoring and automated actions (start, manage, monitor, restart, move, and stop).
 
Terminology: Resource movement and application restart on the second server is known as failover.
Automating the failover process speeds up recovery and allows for unattended operations, thus providing improved application availability.
A PowerHA configuration or cluster consists of two or more servers17 (up to 32) that have their resources managed by PowerHA cluster services to provide automated service recovery for the managed applications. Servers can have physical or virtual I/O resources, or a combination of both.
PowerHA performs the following functions at the cluster level (a conceptual sketch of this automation follows the list):
Manage and monitor operating systems and hardware resources.
Manage and monitor application processes.
Manage and monitor network resources.
Automate applications (start, stop, restart, move).
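The following conceptual Python sketch illustrates the kind of monitor-and-failover loop that PowerHA automates: a health check detects a failed virtual server and the resource group is restarted on a standby virtual server in another blade. The node names, the health check, and the actions are hypothetical placeholders, not PowerHA interfaces:

import time

# Conceptual sketch of what a PowerHA-style cluster manager automates:
# monitor an application resource and move it to a standby node on failure.
NODES = ["blade_vs1", "blade_vs2"]       # hypothetical virtual servers on two blades

def application_healthy(node):
    """Placeholder health check (process, network, and storage probes in reality)."""
    return node != "blade_vs1"           # simulate a failure on the first node

def start_application(node):
    print(f"Starting application resources on {node}")

def failover_loop():
    active = NODES[0]
    start_application(active)
    for _ in range(3):                   # a real monitor would loop indefinitely
        if not application_healthy(active):
            standby = next(n for n in NODES if n != active)
            print(f"{active} failed; moving resource group to {standby}")
            active = standby
            start_application(active)
        time.sleep(1)

failover_loop()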
The virtual servers that are defined and managed in zBX use only virtual I/O resources. PowerHA can manage both physical and virtual I/O resources (virtual storage and virtual network interface cards).
PowerHA can be configured to perform automated service recovery for the applications that run in virtual servers that are deployed in zBX. PowerHA automates application failover from one virtual server in an IBM POWER® processor-based blade to another virtual server in a different POWER processor-based blade that has a similar configuration.
Failover protects service (masks service interruption) in case of unplanned or planned (scheduled) service interruption. During failover, users might experience a short service interruption while resources are configured by PowerHA on the new virtual server.
The PowerHA configuration for the zBX environment is similar to standard POWER environments, with the particularity that it uses only virtual I/O resources. Currently, PowerHA for zBX support is limited to failover inside the same zBX.
Figure 3-4 shows a typical PowerHA cluster.
Figure 3-4 Typical PowerHA cluster diagram
For more information about IBM PowerHA SystemMirror, see the following web page:
3.10 IBM z Systems and emerging paradigms
Having reviewed the most recent and important characteristics of z Systems, we conclude this discussion with observations on the role that z13 can play in today’s leading IT initiatives.
We are witnessing a transformation of the interaction between users and systems, increasingly based on mobile devices and instrumented devices (“the Internet of things”). This front office transformation requires highly responsive and dynamic transaction systems, and demands high security. As described in 3.1.4, “z Systems based clouds” on page 66, and evidenced by the descriptions of z13, these systems can answer the infrastructure hardware requirements, whether for I/O bandwidth, dynamic computing scalability, or security.
In addition, software requirements are covered. We note that several software licensing offerings are available on z Systems to cater to various environments and workloads. In particular, Linux on z Systems closely follows the distributed paradigm; see Appendix A, “Software licensing” on page 159.
Several transactional servers are available in the Linux on z Systems and z/OS environments, which can be used by mobile applications such as those developed with the state-of-the-art IBM MobileFirst Foundation software. Those applications can benefit from the unmatched reliability, availability, and serviceability (RAS) features offered by z Systems.
IBM DB2 Analytics Accelerator transparently (that is, without application modification) provides radical acceleration of complex queries, such as those used by business intelligence and data analytics, enabling their insertion into online applications.
Finally, but no less important, the need for a coherent security landscape across the enterprise is increasingly being recognized. The z13 has specialized offerings, such as the Enterprise Key Management Foundation, that enable its security features, such as Crypto Express and secure key management, to be used by the larger enterprise.
For further information about how to benefit from IBM z13, see Chapter 5, “A system of insight for digital business” on page 135.
 

1 The zEC12 and zBC12 were the last z Systems servers to offer support for zAAP specialty engine processors. IBM supports running zAAP workloads on zIIP processors (“zAAP on zIIP”).
2 z/VM V5R4 is not supported on z13.
3 z/VSE V4R3 is not supported on z13.
4 z/OS support only
5 Commercial name was VM/370.
6 zManager does not support systems management of z/VM on z13.
7 Service is required.
8 IMS 12 needs PTF for CQS interface buffers and IMS 13 includes this support also.
9 BSAM/QSAM: basic sequential access method and queued sequential access method
10 Feature sharing and dual port are not supported on zEC12 and zBC12.
11 Federal Information Processing Standards (FIPS) 140-2 Security Requirements for Cryptographic Modules
12 Processor Capacity Index (PCI)
13 HVDC: high voltage direct current
14 Meaney, P.J.; Lastras-Montano, L.A.; Papazova, V.K.; Stephens, E.; Johnson, J.S.; Alves, L.C.; O'Connor, J.A.; Clarke, W.J., “IBM zEnterprise redundant array of independent memory subsystem,” IBM Journal of Research and Development, vol.56, no.1.2, pp.4:1,4:11, Jan.-Feb. 2012, doi: 10.1147/JRD.2011.2177106
15 z/VM V5R4 is not supported on z13.
16 High Availability Cluster Multi-Processing
17 Servers can be also virtual servers. One server equals one instance of the AIX Operating System.