Client experience
In this appendix, we describe the experiences of a large Parallel Sysplex client as they migrated two of their data centers from z10 processors with ICB4 links to z196 processors with InfiniBand links.
Overview of the client experience
For high-speed coupling, ICB4 links offered the best response time prior to the introduction of HCA3 adapters. Because of the large number of CF requests generated by the client's workload (over 300,000 requests a second in one of their sysplexes), they used ICB4 links.
The client’s initial configuration in one of the data centers consisted of six z10 processors running z/OS, and two z10 processors that acted as stand-alone Coupling Facilities. The sysplex in that site consisted of 10 LPARs and two 5-way CFs. Because of the performance requirements of that sysplex, it used only ICB4 links.
The other data center had four z10 processors running z/OS and two z10 processors containing only CF LPARs. That data center contained six sysplexes. Because of the connectivity requirements of all those sysplexes, that site originally used both ICB4 and ISC3 links, and it migrated to InfiniBand over the course of two sets of processor upgrades.
Although the client found that the ICB4 links generally provided excellent response times, they wanted to be able to have more of them, specifically to address the following issues:
Times when bursts of CF requests resulted in subchannel busy and path busy conditions
The need for better response times in sysplexes other than the large production sysplexes
The client saw the migration to InfiniBand technology as an opportunity to address both of these issues.
Large production sysplex
The production sysplex in the first data center consisted of ten z/OS LPARs and two CFs, running on eight z10 processors that were to be migrated to z196 processors. Two of these were single-book, CF-only processors that each supported a maximum of 16 ICB4 links. The CFs were home to the PTS and BTS STP roles, so two of the 16 ports on each CF processor were used for CF-to-CF connections that allow STP signals to be exchanged between those processors (STP is required for the migration to z196). The remaining 14 links on each CF processor were used as coupling links to the six z/OS processors.
Four of the processors were configured with two ICB4 links to each CF, and the remaining two processors (with the highest volume CF traffic) were configured with three ICB4 links, thus accounting for all 16 ports on the external CF processors.
Because of the heavy DB2 data sharing activity, the client used external CFs to avoid the need for System Managed Duplexing of the DB2 lock structure, which they believed would have an unacceptable impact on performance.
The CF LPARs were configured with five dedicated ICF engines, each of which normally ran between 20% and 30% utilization, with the exception of several heavy batch intervals. During those batch intervals, CF utilization increased to the 40% to 50% range. Bursts of DB2 activity during those intervals tended to overrun the subchannels, causing more synchronous requests to be converted to asynchronous and increasing service times.
With the elimination of ICB4 support on the z196 processors, the client was somewhat concerned that implementing InfiniBand technology might result in a significant increase in response times; the general guideline was that replacing ICB4 links with InfiniBand technology results in response times increasing by about 40%. The client was concerned that the combination of moving to a faster processor technology with slower CF links might result in an unacceptable increase in coupling overhead (see Table 1-2 on page 9 for more information about coupling overhead for various combinations of processor models and link types).
As part of the client’s involvement in the z196 Early Support Program, they conducted a number of measurements with a combination of z10 and z196 processors. The evaluation sysplex consisted of two z/OS LPARs and up to eight CF LPARs. The hardware consisted of one z10 and one z196.
The processors were connected with both ICB4 links (wrapped on the z10, from the z/OS LPARs to the CF LPAR) and HCA2-O 12X InfiniBand links (to both the z10 and z196 CFs). All the LPARs could be brought up on either processor. During the measurements, the client varied the number of CFs, the number of engines in each CF, the type and number of CF links, and the placement of various important (high-use) structures. The results of this exercise gave them the confidence to proceed with the migration to InfiniBand and also led them to adjust their CF infrastructure.
Production migration to z196 with InfiniBand links
One effect of this extensive testing was that the client discovered how easy it is to move structures between CFs using the SETXCF REALLOCATE and MAINTMODE commands. Combined with the results of various measurements, that led the client to implement the new z196 hardware with an entirely new CF configuration.
Instead of having two CFs (each with five engines) as used previously, the client decided to migrate to a 4-CF LPAR configuration on the two z196 processors. The new configuration would have two 2-way CFs and two 3-way CFs. The reason for this change was to reduce the n-way effect and achieve more usable capacity from the same installed configuration.
To facilitate the migration, HCA2-O 12X adapters were installed in the z/OS z10 processors in preparation for the first z196 installs. The first z196s to be installed were the external CF processors, which expanded their sysplex configuration to 10 processors as shown in Figure D-1. The ICB4 links continued to be used to connect to the z10 CFs, and the HCA2-O links were used to connect the z10 z/OS processors to the z196 CFs. In the interests of readability, the figure only shows the logical connectivity. However, in reality, two InfiniBand links were installed between every z10 and each of the two z196 CFs.
Figure D-1 Interim configuration with both z10 and z196 CFs
Implementing InfiniBand consisted of expanding the structure candidate list in the CFRM policy to include all six CF LPARs: the original two on the z10s, and the four on the new z196 CF processors.
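As an illustrative sketch only, the following fragment of input to IXCMIAPU (the administrative data utility used to define CFRM policies) shows the general pattern that such a change follows. The policy name, structure name DSNDB0P_LOCK1, structure sizes, and the CF names CF3 through CF6 for the four z196 CF LPARs are assumptions; only CF1 and CF2 are named in this appendix, and a complete policy must also define every CF and every structure in the sysplex.
DATA TYPE(CFRM) REPORT(YES)
DEFINE POLICY NAME(CFRM01) REPLACE(YES)
  /* CF statements for CF1 through CF6 would appear here */
  STRUCTURE NAME(DSNDB0P_LOCK1)
            SIZE(131072)
            INITSIZE(65536)
            PREFLIST(CF3,CF4,CF5,CF6,CF1,CF2)
In the client's case, the actual movement of the structures was then driven by the MAINTMODE and REALLOCATE commands described next.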
The migration (which was carried out at a time of low activity) consisted of issuing the following commands:
SETXCF START,MAINTMODE,CFNAME=(CF1,CF2)
SETXCF START,REALLOCATE
When the REALLOCATE command completed, all structures had been moved from the z10 CFs to the z196 CFs. The client then waited for the high volume processing to start. Just in case the performance proved to be unacceptable, they were fully prepared to return to the z10 CF processors by simply stopping MAINTMODE on the z10 CFs, placing the z196 CF LPARs into MAINTMODE, and then reissuing the REALLOCATE command.
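A sketch of that fallback sequence, again using the hypothetical names CF3 through CF6 for the four z196 CF LPARs, would look like this:
SETXCF STOP,MAINTMODE,CFNAME=(CF1,CF2)
SETXCF START,MAINTMODE,CFNAME=(CF3,CF4,CF5,CF6)
SETXCF START,REALLOCATE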
Monitoring the peak processing times provided a pleasant surprise. Rather than the anticipated 40% increase in response times, the new configuration delivered more than a 20% improvement over the service times that the client had experienced with the ICB4 configuration. CF utilization dropped to between 8% and 12%. Additionally, the high number of NO SUBCHANNEL events that had been an ongoing issue when batch kicked off was no longer evident. The client attributed these unexpected improvements to the new 4-CF configuration and the increase in the number of available subchannels.
The addition of the two new CF LPARs was enabled by the migration to InfiniBand. The performance issues the client had experienced from time to time with ICB4 indicated a requirement for more subchannels and link buffers. However, the only way to get more of these was by adding more ICB4 links to the CF processors, which required installing another book in each of those processors and an RPQ to support more than 16 ICB4 links.
With InfiniBand, the client was able to define four CHPIDs between the z/OS and CF processors on the same physical connection, as shown in Figure D-2. Two of the four CHPIDs on each physical link were connected to each CF LPAR. With a minimum of two physical (failure-isolated) links between the processors, they increased the number of available subchannels from 14 to 28 per CF LPAR.
Furthermore, because the client now had four CFs, each reached through 28 subchannels, each z/OS image had a total of 112 subchannels for the CFs rather than the 28 it had previously. This increase in subchannels was a primary reason for the improved tolerance of their batch burst activity.
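As an illustrative sketch only, the following IOCP fragment shows the general pattern of defining four CIB CHPIDs over a single HCA2-O fanout port. The CHPID numbers, adapter ID (AID), and port number are assumptions, and the matching CF-side CHPID definitions (which determine which two of the CHPIDs reach each CF LPAR) are not shown:
*  FOUR COUPLING (CIB) CHPIDS SHARING ONE HCA2-O FANOUT PORT
CHPID PATH=(CSS(0),A0),SHARED,TYPE=CIB,AID=08,PORT=1
CHPID PATH=(CSS(0),A1),SHARED,TYPE=CIB,AID=08,PORT=1
CHPID PATH=(CSS(0),A2),SHARED,TYPE=CIB,AID=08,PORT=1
CHPID PATH=(CSS(0),A3),SHARED,TYPE=CIB,AID=08,PORT=1
The second physical link is defined in the same way on a different adapter, so each CF LPAR is reached by four CHPIDs (two per physical link), which yields the 28 subchannels (four CHPIDs times seven subchannels each) per CF LPAR described above.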
 
Note: Only consider splitting CF LPARs if the original CF LPAR has multiple ICF engines. CF LPARs require dedicated engines for optimum performance, so splitting the CF LPAR is only valuable if you can dedicate a minimum of one engine for each CF LPAR.
Figure D-2 Sharing InfiniBand links between CF LPARs
Based on their experiences, the client attributed the performance improvements to these factors:
The new z196 CFs have faster engines.
The 2-way and 3-way CFs had more usable capacity than 5-way CFs (because of the reduced multiprocessor effect).
Based on projections from the zPCR tool, configuring a z196 in this manner can deliver 9.3% more capacity than configuring it as a single 5-way CF LPAR.
Each CF processed a subset of the prior activity and structures.
Additional subchannels were better able to handle burst activity processing.
Structure allocation was divided according to the attributes of the workloads. The structures that were most sensitive to short response times were placed in the 3-way CFs, and structures that had mainly asynchronous requests were placed in the 2-way CFs.
Serialization delays that might have been caused by having all structures in only two CFs were reduced.
Because this client is an intensive user of DB2 data sharing, the response time of the DB2 lock structure is critical to their applications. Table D-1 lists the request rate and response time for the DB2 lock structure when using ICB4 links compared to HCA2-O links.
Table D-1 Critical DB2 lock structure

Configuration    Average Sync Request Rate (per second)    Average Sync Resp Time (in microseconds)
z10/ICB4         51,342                                    12.4
z196/HCA2-O      55,560                                    9.7
The performance of the Coupling Facilities and the increased number of coupling subchannels also had an impact on overall z/OS CPU utilization, batch job elapsed times, and online response times; see Table D-2.
Table D-2 Overall average synchronous requests over 12-hour shift

Configuration    Average Sync Request Rate (per second)    Average Sync Resp Time (in microseconds)
z10/ICB4         166,043                                   14.7
z196/HCA2-O      180,360                                   11.8
As shown, although more requests were processed every second, the average synchronous response time was reduced by about 20%.
One effect of the lack of subchannels in the ICB4 configuration was that synchronous requests were converted to asynchronous, with a corresponding increase in response times. Therefore, one of the objectives of the InfiniBand configuration was to reduce the number of instances of subchannel busy. Table D-3 shows that moving to InfiniBand resulted in nearly a 70% reduction in the number of these events.
Table D-3 Subchannel busy conditions

CF configuration    Number of subchannel busy conditions over 12 hours
z10/ICB4            2,753,375
z196/HCA2-O         841,730
The net result was that the migration to z196 and HCA2-O 12X links was quite successful for the client’s largest production sysplex.
Exploiting InfiniBand for link consolidation
The second data center contained six sysplexes spread over four z10 processors running z/OS and two z10s running only CF LPARs. The sysplex workloads in this data center were smaller, but the large number of sysplexes meant that many CF links were required for connectivity reasons rather than for performance reasons. The client's primary requirement here was to support multiple sysplexes, with optimal performance for each one, while minimizing the number of coupling links.
As part of the previous upgrade to z10 processors, the client had installed InfiniBand links alongside the ICB4 links that were carried over from the previous processors. With six sysplexes spread over four processors, ICB4 links would not have been able to provide all the connectivity they required. z10 CF processors support a maximum of 16 ICB4 links (32 with an RPQ modification).
Because a single book can support eight fanout cards, one book can provide the full complement of 16 ICB4 connections. Each ICB4 link supports a single CHPID, and each CHPID can be used by only one sysplex. In an environment with many sysplexes, the ICB4 links were quickly exhausted.
As part of the upgrade to the z196s, the client planned to eliminate the remaining ICB4 links and consolidate on just two physical InfiniBand links between each z/OS processor and each CF.
Figure D-3 shows how each sysplex had its own ICB4 links prior to the installation of the first InfiniBand links (in reality, each sysplex had two ICB4 links for availability).
Figure D-3 Coupling requirements pre-InfiniBand
 
Figure D-4 shows how the four sysplexes were able to use a single InfiniBand link to replace the four ICB4 links.
Figure D-4 Coupling connectivity with PSIFB links
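This consolidation relies on the same multiple-CHPIDs-per-link capability: each sysplex is given its own CIB CHPID (or pair of CHPIDs, for availability) on the shared physical link. As an illustrative sketch only, with hypothetical CHPID numbers, adapter ID, and partition names, the z/OS-side IOCP definitions might follow this pattern:
*  ONE CIB CHPID PER SYSPLEX, ALL SHARING ONE HCA2-O FANOUT PORT
CHPID PATH=(CSS(0),C0),PARTITION=(PLEXAOS1),TYPE=CIB,AID=0A,PORT=1
CHPID PATH=(CSS(0),C1),PARTITION=(PLEXBOS1),TYPE=CIB,AID=0A,PORT=1
CHPID PATH=(CSS(0),C2),PARTITION=(PLEXCOS1),TYPE=CIB,AID=0A,PORT=1
CHPID PATH=(CSS(0),C3),PARTITION=(PLEXDOS1),TYPE=CIB,AID=0A,PORT=1
Because each sysplex uses only its own CHPIDs, one physical InfiniBand link can carry the coupling traffic that previously required four separate ICB4 links.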
Because of the consolidation capabilities of InfiniBand, this client was able to reduce the number of links on each CF from the 48 that were in use before they started introducing InfiniBand to only 24 links today. Those 24 links also provide enough spare capacity to enable that data center to act as a disaster recovery backup for the site discussed in "Large production sysplex" on page 254.