Virtualization
Virtualization is a key factor for productive and efficient use of IBM Power Systems servers. In this chapter, you find a brief description of the virtualization technologies that are available for POWER9 processor-based servers. The following IBM Redbooks publications provide more information about the virtualization features.
IBM PowerVM Best Practices, SG24-8062
IBM PowerVM Virtualization Introduction and Configuration, SG24-7940
IBM PowerVM Virtualization Active Memory Sharing, REDP-4470
IBM PowerVM Virtualization Managing and Monitoring, SG24-7590
IBM Power Systems SR-IOV: Technical Overview and Introduction, REDP-5065
3.1 POWER Hypervisor
IBM Power Systems servers that are combined with PowerVM technology offer key capabilities that can help you consolidate and simplify your IT environment:
Improve server usage and share I/O resources to reduce the total cost of ownership (TCO) and better use IT assets.
Improve business responsiveness and operational speed by dynamically reallocating resources to applications as needed to better match your changing business needs or handle unexpected changes in demand.
Simplify IT infrastructure management by making workloads independent of hardware resources so that you can make business-driven policies to deliver resources based on time, cost, and service-level requirements.
Combined with features in the POWER9 processors, the IBM POWER Hypervisor delivers functions that enable other system technologies, including logical partitioning technology, virtualized processors, IEEE VLAN-compatible virtual switches, virtual SCSI adapters, virtual Fibre Channel adapters, and virtual consoles. The POWER Hypervisor is a basic component of the system’s firmware and offers the following functions:
Provides an abstraction between the physical hardware resources and the logical partitions (LPARs) that use them.
Enforces partition integrity by providing a security layer between LPARs.
Controls the dispatch of virtual processors to physical processors.
Saves and restores all processor state information during a logical processor context switch.
Controls hardware I/O interrupt management facilities for LPARs.
Provides virtual LAN channels between LPARs that help reduce the need for physical Ethernet adapters for inter-partition communication.
Monitors the service processor and performs a reset or reload if it detects the loss of the service processor, notifying the operating system if the problem is not corrected.
The POWER Hypervisor is always active, regardless of the system configuration, even when the system is not connected to a managed console. It requires memory to support the resource assignment to the LPARs on the server. The amount of memory that is required by the POWER Hypervisor firmware varies according to several factors:
Memory usage for hardware page tables (HPTs)
Memory usage for I/O devices
Memory usage for virtualization features
Memory usage for hardware page tables
Each partition on the system has its own HPT that contributes to hypervisor memory usage. The HPT is used by the operating system to translate from effective addresses to physical real addresses in the hardware. This translation from effective to real addresses allows multiple operating systems to run simultaneously in their own logical address space. Whenever a virtual processor for a partition is dispatched on a physical processor, the hypervisor indicates to the hardware the location of the partition HPT that should be used when translating addresses.
The amount of memory for the HPT is based on the maximum memory size of the partition and the HPT ratio. The default HPT ratio is either 1/64th of the maximum (for IBM i partitions) or 1/128th (for AIX, Virtual I/O Server (VIOS), and Linux partitions) of the maximum memory size of the partition. AIX, VIOS, and Linux use larger page sizes (16 KB and 64 KB) instead of using 4 KB pages. Using larger page sizes reduces the overall number of pages that must be tracked so that the overall size of the HPT can be reduced. As an example, for an AIX partition with a maximum memory size of 256 GB, the HPT would be 2 GB. When defining a partition, the maximum memory size that is specified should be based on the amount of memory that can be dynamically added to the partition (DLPAR) without having to change the configuration and restart the partition.
In addition to setting the maximum memory size, the HPT ratio can also be configured. The hpt_ratio parameter on the chsyscfg Hardware Management Console (HMC) command can be issued to define the HPT ratio to be used for a partition profile. The valid values are 1:32, 1:64, 1:128, 1:256, or 1:512. Specifying a smaller absolute ratio (1/512 is the smallest value) decreases the overall memory that is assigned to the HPT. Testing is required when changing the HPT ratio because a smaller HPT may incur more CPU consumption because the operating system might need to reload the entries in the HPT more frequently. Most customers choose to use the IBM provided default values for the HPT ratios.
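The HPT sizing rule above (maximum partition memory divided by the HPT ratio) can be sketched as a small calculation. This is an illustration of the arithmetic described in the text, not a reproduction of the hypervisor's exact sizing logic:

```python
def hpt_size_bytes(max_mem_bytes, ratio_denominator):
    """Estimate the hardware page table size for a partition.

    The HPT is sized as a fraction of the partition's maximum memory:
    1/64 by default for IBM i partitions, and 1/128 for AIX, VIOS,
    and Linux partitions. Smaller ratios (larger denominators, down
    to 1:512) can be set with the HMC chsyscfg hpt_ratio parameter.
    """
    return max_mem_bytes // ratio_denominator

GIB = 1024 ** 3

# The example from the text: an AIX partition with a 256 GB maximum
# memory size and the default 1/128 ratio gets a 2 GB HPT.
print(hpt_size_bytes(256 * GIB, 128) // GIB)   # 2

# With the smallest ratio (1:512), the same partition's HPT shrinks
# to 512 MB, at the possible cost of more frequent HPT reloads.
print(hpt_size_bytes(256 * GIB, 512) // (1024 ** 2))   # 512
```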
Memory usage for I/O devices
In support of I/O operations, the hypervisor maintains structures that are called the Translation Control Entries (TCEs), which provide an information path between I/O devices and partitions. The TCEs provide the address of the I/O buffer, indication of read versus write requests, and other I/O-related attributes.
There are many TCEs in use per I/O device so that multiple requests can be active simultaneously to the same physical device. To provide better affinity, the TCE entries are spread across multiple processor chips or drawers to improve performance while accessing the TCEs. For physical I/O devices, the base amount of space for the TCEs is defined by the hypervisor based on the number of I/O devices that are supported. A system that supports high-speed adapters can also be configured to allocate more memory to improve I/O performance.
Currently, Linux is the only operating system that uses these additional TCEs, so the memory can be freed for use by partitions if the system is using only AIX or IBM i.
Memory usage for virtualization features
Virtualization requires more memory to be allocated by the POWER Hypervisor for hardware statesave areas and various virtualization technologies. For example, on POWER9 processor-based systems, each processor core supports up to eight simultaneous multithreading (SMT) threads of execution and each thread contains over 80 different registers. The POWER Hypervisor must set aside save areas for the register contents for the maximum number of virtual processors that is configured. The greater the number of physical hardware devices, the greater the number of virtual devices, the greater the amount of virtualization, and the more hypervisor memory is required. For efficient memory usage, wanted and maximum values for various attributes (processors, memory, and virtual adapters) should be based on business needs, and not set to values that are significantly higher than the actual requirements.
Predicting memory that is used by the POWER Hypervisor
The IBM System Planning Tool (SPT) is a resource that you can use to estimate the amount of hypervisor memory that is required for a specific server configuration. After the SPT executable file is downloaded and installed, a configuration can be defined by selecting the appropriate hardware platform, selecting installed processors and memory, and defining partitions and partition attributes. Given a configuration, the SPT can estimate the amount of memory that will be assigned to the hypervisor, which is helpful when you change an existing configuration or deploy new servers.
The POWER Hypervisor provides the following types of virtual I/O adapters:
3.1.1 Virtual SCSI
The POWER Hypervisor provides a virtual SCSI mechanism for the virtualization of storage devices. The storage virtualization is accomplished by using two paired adapters: a virtual SCSI server adapter and a virtual SCSI client adapter.
3.1.2 Virtual Ethernet
The POWER Hypervisor provides a virtual Ethernet switch function that allows partitions on the same server to communicate quickly and securely without any physical interconnection. Connectivity outside of the server is possible if a Layer 2 bridge to a physical Ethernet adapter is set in one VIOS partition, also known as a Shared Ethernet Adapter (SEA).
3.1.3 Virtual Fibre Channel
A virtual Fibre Channel adapter is a virtual adapter that provides client LPARs with a Fibre Channel connection to a storage area network through the VIOS partition. The VIOS partition provides the connection between the virtual Fibre Channel adapters on the VIOS partition and the physical Fibre Channel adapters on the managed system.
3.1.4 Virtual (TTY) console
Each partition must have access to a system console. Tasks such as operating system installation, network setup, and various problem analysis activities require a dedicated system console. The POWER Hypervisor provides the virtual console by using a virtual TTY or serial adapter and a set of Hypervisor calls to operate on them. Virtual TTY does not require the purchase of any additional features or software, such as the PowerVM Edition features.
3.2 POWER processor modes
Although they are not virtualization features, the POWER processor modes are described here because they affect various virtualization features.
On Power Systems servers, partitions can be configured to run in several modes, including the following modes:
POWER7 compatibility mode
This is the mode for POWER7+ and POWER7 processors, implementing Version 2.06 of the IBM Power Instruction Set Architecture (ISA). For more information, see IBM Knowledge Center.
POWER8 compatibility mode
This is the native mode for POWER8 processors implementing Version 2.07 of the IBM Power ISA. For more information, see IBM Knowledge Center.
POWER9 compatibility mode
This is the native mode for POWER9 processors implementing Version 3.0 of the IBM Power ISA. For more information, see IBM Knowledge Center.
Figure 3-1 shows available processor modes on a POWER9 processor-based server.
Figure 3-1 POWER9 processor modes
Processor compatibility mode is important when Live Partition Mobility (LPM) migration is planned between servers of different generations. An LPAR that might be migrated to a server with processors from another generation must be activated in a specific compatibility mode.
Table 3-1 shows an example where a processor mode must be selected when migration from a POWER9 processor-based server to a POWER8 processor-based server is planned.
Table 3-1 Processor compatibility modes for a POWER9 to POWER8 migration

Source environment: POWER9 processor-based server
Destination environment: POWER8 processor-based server

Source (wanted / current mode)   Active migration (wanted / current)        Inactive migration (wanted / current)
POWER9 / POWER9                  Fails: wanted mode not supported on destination   Fails: wanted mode not supported on destination
POWER9 / POWER8                  Fails: wanted mode not supported on destination   Fails: wanted mode not supported on destination
Default / POWER9                 Fails: wanted mode not supported on destination   Default / POWER8
POWER8 / POWER8                  POWER8 / POWER8                            POWER8 / POWER8
Default / POWER8                 Default / POWER8                           Default / POWER8
POWER7 / POWER7                  POWER7 / POWER7                            POWER7 / POWER7
3.3 Active Memory Expansion
Active Memory Expansion is an optional feature for the Power H924 server when you select #4794 or for the Power H922 server when you select #4793 in the e-Config tool.
This feature enables memory expansion on the system. By compressing and decompressing memory content, it can effectively expand the maximum memory capacity, providing more server workload capacity and performance.
Active Memory Expansion is a technology that allows the effective maximum memory capacity to be much larger than the true physical memory maximum. Compression and decompression of memory content can allow memory expansion up to 125% for AIX partitions, which in turn enables a partition to perform more work or support more users with the same physical amount of memory. Similarly, it can allow a server to run more partitions and do more work for the same physical amount of memory.
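The effective capacity gain from a given expansion factor can be sketched as simple arithmetic. This is an illustration of the figures quoted above; the compression ratio actually achieved depends on how compressible the workload's data is:

```python
def effective_capacity_gb(physical_gb, expansion_pct):
    """Effective memory capacity with Active Memory Expansion.

    expansion_pct is the expansion expressed as a percentage increase
    over physical memory (up to 125% for AIX partitions per the text).
    """
    return physical_gb * (1 + expansion_pct / 100)

# A partition with 100 GB of physical memory and the maximum 125%
# expansion behaves as though it had 225 GB of memory.
print(effective_capacity_gb(100, 125))   # 225.0
```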
 
Note: The Active Memory Expansion feature is not supported by IBM i or Linux.
3.4 Single root I/O virtualization
Single root I/O virtualization (SR-IOV) is an extension to the Peripheral Component Interconnect Express (PCIe) specification that allows multiple operating systems to simultaneously share a PCIe adapter with little or no runtime involvement from a hypervisor or other virtualization intermediary.
SR-IOV is PCI standard architecture that enables PCIe adapters to become self-virtualizing. It enables adapter consolidation, through sharing, much like logical partitioning enables server consolidation. With an adapter capable of SR-IOV, you can assign virtual slices of a single physical adapter to multiple partitions through logical ports; all of this is done without the need for a VIOS.
POWER9 processor-based servers provide the following SR-IOV enhancements:
Faster ports: 10 Gb, 25 Gb, 40 Gb, and 100 Gb.
More virtual functions (VFs) per port: The target has 60 VFs per port (120 VFs per adapter) for 100 Gb adapters.
vNIC and vNIC failover support for Linux.
For more information, see IBM Power Systems SR-IOV: Technical Overview and Introduction, REDP-5065.
3.5 PowerVM
The PowerVM platform is the family of technologies, capabilities, and offerings that delivers industry-leading virtualization on Power Systems servers. It is the umbrella branding term for Power Systems virtualization (logical partitioning, IBM Micro-Partitioning®, POWER Hypervisor, VIOS, LPM, and more). As with Advanced Power Virtualization in the past, PowerVM is a combination of hardware enablement and software.
 
The Power H922 and the Power H924 servers come with PowerVM Enterprise Edition
(#5228) by default. Furthermore, a temporary PowerVM Enterprise license
(#ELPM) is included so that older servers can make a seamless move to POWER9 at no additional cost.
Logical partitions
LPARs and virtualization increase the usage of system resources and add a level of configuration possibilities.
Logical partitioning is when a server runs as though it were two or more independent servers. When you logically partition a server, you divide the resources on the server into subsets that are called LPARs. You can install software on an LPAR, and the LPAR runs as an independent logical server with the resources that you allocated to the LPAR. An LPAR is the equivalent of a virtual machine (VM).
You can assign processors, memory, and input/output devices to LPARs. You can run AIX, IBM i, Linux, and VIOS in LPARs. VIOS provides virtual I/O resources to other LPARs with general-purpose operating systems.
LPARs share a few system attributes, such as the system serial number, system model, and processor FC. All other system attributes can vary from one LPAR to another.
Micro-Partitioning
When you use the Micro-Partitioning technology, you can allocate fractions of processors to an LPAR. An LPAR that uses fractions of processors is also known as a shared processor partition or micropartition. Micropartitions run over a set of processors that is called a shared processor pool (SPP), and virtual processors are used to let the operating system manage the fractions of processing power that are assigned to the LPAR. From an operating system perspective, a virtual processor cannot be distinguished from a physical processor unless the operating system is enhanced to determine the difference. Physical processors are abstracted into virtual processors that are available to partitions.
On the POWER9 processors, a partition can be defined with a processor capacity as small as 0.05 processing units. This number represents 0.05 of a physical core. Each physical core can be shared by up to 20 shared processor partitions, and the partition’s entitlement can be incremented fractionally by as little as 0.01 of the processor. The shared processor partitions are dispatched and time-sliced on the physical processors under control of the POWER Hypervisor. The shared processor partitions are created and managed by the HMC.
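The entitlement rules above (a 0.05 minimum, 0.01 granularity, and up to 20 partitions per core) can be expressed as a small validity check. This is an illustrative sketch of those constraints, not an HMC API:

```python
MIN_ENTITLEMENT = 0.05        # smallest processor capacity per micropartition
MAX_PARTITIONS_PER_CORE = 20  # 20 partitions x 0.05 units = 1.0 core

def valid_entitlement(units):
    """Check that an entitlement meets the POWER9 micropartition rules:
    at least 0.05 processing units, in multiples of 0.01.

    Work in hundredths of a unit to avoid floating-point comparison
    surprises (0.05 * 100 is not exactly 5.0 in binary floating point).
    """
    hundredths = round(units * 100)
    return hundredths >= 5 and abs(units * 100 - hundredths) < 1e-6

print(valid_entitlement(0.05))   # True: the minimum entitlement
print(valid_entitlement(0.04))   # False: below the 0.05 minimum
print(valid_entitlement(0.055))  # False: not a multiple of 0.01
```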
The Power H922 server supports up to 20 cores in a single system. Here are the maximum numbers:
20 dedicated partitions
400 micropartitions (a maximum of 20 micropartitions per physical active core)
The Power H924 server supports up to 24 cores in a single system. Here are the maximum numbers:
24 dedicated partitions
480 micropartitions (a maximum of 20 micropartitions per physical active core)
The maximum amounts are supported by the hardware, but the practical limits depend on application workload demands.
Processing mode
When you create an LPAR, you can assign entire processors for dedicated use, or you can assign partial processing units from an SPP. This setting defines the processing mode of the LPAR.
Dedicated mode
In dedicated mode, physical processors are assigned as a whole to partitions. The SMT feature in the POWER9 processor core allows the core to run instructions from two, four, or eight independent software threads simultaneously.
Shared dedicated mode
On POWER9 processor-based servers, you can configure dedicated partitions to become processor donors for idle processors that they own, allowing for the donation of spare CPU cycles from dedicated processor partitions to an SPP. The dedicated partition maintains absolute priority for dedicated CPU cycles. Enabling this feature can help increase system usage without compromising the computing power for critical workloads in a dedicated processor.
Shared mode
In shared mode, LPARs use virtual processors to access fractions of physical processors. Shared partitions can define any number of virtual processors (the maximum number is 20 times the number of processing units that are assigned to the partition). The POWER Hypervisor dispatches virtual processors to physical processors according to the partition’s processing units entitlement. One processing unit represents one physical processor’s processing capacity. All partitions receive a total CPU time equal to their processing unit’s entitlement. The logical processors are defined on top of virtual processors. So, even with a virtual processor, the concept of a logical processor exists, and the number of logical processors depends on whether SMT is turned on or off.
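The relationships described above (at most 20 virtual processors per assigned processing unit, and one logical processor per SMT thread of each virtual processor) can be sketched as follows. This is an illustration of the sizing rules in the text, not a system interface:

```python
import math

def max_virtual_processors(processing_units):
    """A shared partition can define up to 20 virtual processors per
    processing unit assigned to it (the 1e-9 guards against
    floating-point rounding of the entitlement)."""
    return math.floor(processing_units * 20 + 1e-9)

def logical_processors(virtual_processors, smt_mode):
    """Logical processors seen by the OS: one per SMT thread of each
    virtual processor (smt_mode is 1, 2, 4, or 8 on POWER9)."""
    return virtual_processors * smt_mode

# A partition entitled to 1.5 processing units can define up to
# 30 virtual processors; with SMT8, 4 virtual processors appear to
# the operating system as 32 logical processors.
print(max_virtual_processors(1.5))   # 30
print(logical_processors(4, 8))      # 32
```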
3.5.1 Multiple shared processor pools
Multiple shared processor pools (MSPPs) are supported on POWER9 processor-based servers. This capability allows a system administrator to create a set of micropartitions with the purpose of controlling the processor capacity that can be used from the physical SPP.
Micropartitions are created and then identified as members of either the default processor pool or a user-defined SPP. The virtual processors that exist within the set of micropartitions are monitored by the POWER Hypervisor, and processor capacity is managed according to user-defined attributes.
If the Power Systems server is under heavy load, each micropartition within an SPP is ensured its processor entitlement plus any capacity that it might be allocated from the reserved pool capacity if the micropartition is uncapped.
If certain micropartitions in an SPP do not use their capacity entitlement, the unused capacity is ceded and other uncapped micropartitions within the same SPP are allocated the additional capacity according to their uncapped weighting. In this way, the entitled pool capacity of an SPP is distributed to the set of micropartitions within that SPP.
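The redistribution policy above can be sketched as a proportional split of ceded capacity by uncapped weight. This is an illustrative model of the behavior described, not the hypervisor's actual dispatch algorithm; the partition names and weights are hypothetical:

```python
def distribute_ceded(ceded_units, uncapped_weights):
    """Split ceded processor capacity among the uncapped
    micropartitions of one shared processor pool, proportionally
    to their uncapped weights."""
    total = sum(uncapped_weights.values())
    if total == 0:
        return {name: 0.0 for name in uncapped_weights}
    return {name: ceded_units * weight / total
            for name, weight in uncapped_weights.items()}

# With 0.6 processing units ceded and lpar1 weighted twice as
# heavily as lpar2, lpar1 receives twice lpar2's share.
shares = distribute_ceded(0.6, {"lpar1": 128, "lpar2": 64})
print(shares)
```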
All Power Systems servers that support the MSPPs capability have a minimum of one (the default) SPP and up to a maximum of 64 SPPs.
3.5.2 Virtual I/O Server
The VIOS is part of PowerVM. It is a specific appliance that allows the sharing of physical resources between LPARs to allow more efficient usage (for example, consolidation). In this case, the VIOS owns the physical resources (SCSI, Fibre Channel, network adapters, or optical devices) and allows client partitions to share access to them, thus minimizing the number of physical adapters in the system. The VIOS eliminates the requirement that every partition owns a dedicated network adapter, disk adapter, and disk drive. The VIOS supports OpenSSH for secure remote logins. It also provides a firewall for limiting access by ports, network services, and IP addresses.
Figure 3-2 shows an overview of a VIOS configuration.
Figure 3-2 Architectural view of the VIOS
As a preferred practice, run two VIOSes per physical server.
Shared Ethernet Adapter
You can use a SEA to connect a physical Ethernet network to a virtual Ethernet network. The SEA provides this access by connecting the POWER Hypervisor VLANs with the VLANs on the external switches. Because the SEA processes packets at Layer 2, the original MAC address and VLAN tags of the packet are visible to other systems on the physical network. IEEE 802.1 VLAN tagging is supported.
The SEA also provides the ability for several client partitions to share one physical adapter. With an SEA, you can connect internal and external VLANs by using a physical adapter. The SEA service can be hosted only in the VIOS, not in a general-purpose AIX or Linux partition, and acts as a Layer 2 network bridge to securely transport network traffic between virtual Ethernet networks (internal) and one or more (Etherchannel) physical network adapters (external). These virtual Ethernet network adapters are defined by the POWER Hypervisor on the VIOS.
Virtual SCSI
Virtual SCSI is used to “see” a virtualized implementation of the SCSI protocol. Virtual SCSI is based on a client/server relationship. The VIOS LPAR owns the physical resources and acts as a server or, in SCSI terms, a target device. The client LPARs access the virtual SCSI backing storage devices that are provided by the VIOS as clients.
The virtual I/O adapters (virtual SCSI server adapter and a virtual SCSI client adapter) are configured by using a managed console or through the Integrated Virtualization Manager (IVM) on smaller systems. The virtual SCSI server (target) adapter is responsible for running any SCSI commands that it receives. It is owned by the VIOS partition. The virtual SCSI client adapter allows a client partition to access physical SCSI and SAN-attached devices and LUNs that are assigned to the client partition. The provisioning of virtual disk resources is provided by the VIOS.
N_Port ID Virtualization
N_Port ID Virtualization (NPIV) is a technology that allows multiple LPARs to access independent physical storage through the same physical Fibre Channel adapter. This adapter is attached to a VIOS partition that acts only as a pass-through, managing the data transfer through the POWER Hypervisor.
Each partition has one or more virtual Fibre Channel adapters, each with their own pair of unique worldwide port names, enabling you to connect each partition to independent physical storage on a SAN. Unlike virtual SCSI, only the client partitions see the disk.
For more information and requirements for NPIV, see IBM PowerVM Virtualization Managing and Monitoring, SG24-7590.
3.5.3 Live Partition Mobility
LPM allows you to move a running LPAR from one system to another without disruption. Inactive partition mobility allows you to move a powered-off LPAR from one system to another.
LPM provides systems management flexibility and improves system availability through the following functions:
Avoid planned outages for hardware upgrade or firmware maintenance.
Avoid unplanned downtime. With preventive failure management, if a server indicates a potential failure, you can move its LPARs to another server before the failure occurs.
For more information and requirements for LPM, see IBM PowerVM Live Partition Mobility, SG24-7460.
3.5.4 Active Memory Sharing
Active Memory Sharing provides system memory virtualization capabilities, allowing multiple partitions to share a common pool of physical memory.
The physical memory of an IBM Power System can be assigned to multiple partitions in either dedicated or shared mode. A system administrator can assign some physical memory to a partition and some physical memory to a pool that is shared by other partitions. A single partition can have either dedicated or shared memory:
With a pure dedicated memory model, the system administrator’s task is to optimize available memory distribution among partitions. When a partition suffers degradation because of memory constraints and other partitions have unused memory, the administrator can manually issue a dynamic memory reconfiguration.
With a shared memory model, the system automatically decides the optimal distribution of the physical memory to partitions and adjusts the memory assignment based on partition load. The administrator reserves physical memory for the shared memory pool, assigns partitions to the pool, and provides access limits to the pool.
3.5.5 Active Memory Deduplication
In a virtualized environment, systems might store a considerable amount of duplicated information in RAM because each partition runs its own operating system, and some partitions might even run the same kinds of applications. On heavily loaded systems, this behavior can lead to a shortage of available memory resources, forcing paging by the Active Memory Sharing partition operating systems, the Active Memory Deduplication pool, or both, which might decrease overall system performance.
Active Memory Deduplication allows the POWER Hypervisor to map dynamically identical partition memory pages to a single physical memory page within a shared memory pool. This way enables a better usage of the Active Memory Sharing shared memory pool, increasing the system’s overall performance by avoiding paging. Deduplication can cause the hardware to incur fewer cache misses, which also leads to improved performance.
Active Memory Deduplication depends on the Active Memory Sharing feature being available, and it uses CPU cycles that are donated by the Active Memory Sharing pool’s VIOS partitions to identify deduplicated pages. The operating systems that are running on the Active Memory Sharing partitions can “hint” to the POWER Hypervisor that some pages (such as frequently referenced read-only code pages) are good for deduplication.
3.5.6 Remote Restart
Remote Restart is a high availability option for partitions. If an error causes a server outage, a partition that is configured for Remote Restart can be restarted on a different physical server. Typically, a remote restart completes faster than restarting the failed server and then restarting its partitions, so it can be used for faster reprovisioning of a partition when the original server would take a long time to return to service.
The Remote Restart function relies on technology similar to LPM, where a partition is configured with storage on a SAN that is shared (accessible) by the server that hosts the partition.
HMC V9R1 brings the following enhancements to the Remote Restart feature:
Remote restart a partition with reduced or minimum CPU/memory on the target system.
Remote restart by choosing a different virtual switch on the target system.
Remote restart the partition without powering on the partition on the target system.
Remote restart the partition for test purposes when the source-managed system is in the Operating or Standby state.
Remote restart by using the REST API.
 