Performance and monitoring
This chapter describes the factors that determine and influence the performance of the IBM Virtualization Engine TS7700. It also describes what actions to take, when necessary, to improve the TS7700 Virtualization Engine’s performance.
This chapter includes the following information:
An overview of the shared tasks that are running in the TS7700 Virtualization Engine server
A description of a TS7700 Virtualization Engine monitoring and performance evaluation methodology
A walkthrough of a TS7700 Virtualization Engine capacity planning case study
A review of bulk volume information retrieval (BVIR) and VEHSTATS reporting
Scenarios show the effect of various algorithms on z/OS and TS7700 Virtualization Engine R2.0 device allocation. These scenarios help you understand how settings and definitions affect device allocation.
TS7700 Virtualization Engine shared resources are also described so that you can understand the impact that contention for these resources has on the performance of the TS7700 Virtualization Engine.
The monitoring section can help you understand the performance-related data recorded in the TS7700 Virtualization Engine. It discusses the performance issues that might arise with the TS7700 Virtualization Engine. This chapter can also help you recognize the symptoms that indicate that the TS7700 Virtualization Engine configuration is at or near its maximum performance capability. The information provided can help you evaluate the options available to improve the throughput and performance of the TS7700 Virtualization Engine.
The capacity planning case study illustrates guidelines and techniques for the management of virtual and stacked volumes associated with the TS7700 Virtualization Engine.
This chapter includes the following topics:
Performance Evaluation Tools available on Techdocs
10.1 TS7700 Virtualization Engine performance characteristics
The TS7700 Virtualization Engine can provide significant benefits to a tape processing environment. In general, performance depends on factors such as total system configuration, Tape Volume Cache (TVC) capacity, the number of physical tape drives available to the TS7700 Virtualization Engine, the number of channels, the read/write ratio, and data characteristics, such as blocksize and mount pattern. You might experience deviations from the presented figures in your environment. The measurements are based on a theoretical workload profile and cannot be fully compared with a varying workload. The performance factors and numbers for configurations are shown in the following pages.
Based on initial modeling and measurements, and assuming a 2.66:1 compression ratio, Figure 10-1 shows the evolution in the write performance with the TS7700 family, which is also described in more detail in the IBM Virtualization Engine TS7720 and TS7740 Releases 1.6, 1.7, 2.0 and R3.0 Performance white paper, WP102247. The following charts are for illustrative purposes only. Always use the most recently published performance white papers available on the Techdocs website at the following address:
Figure 10-1 VTS and TS7700 maximum write bandwidth
Figure 10-1 shows the evolution of performance in the IBM Virtualization Engine TS7700 family compared with the previous member of the IBM Tape Virtualization family, the IBM Virtual Tape Server (VTS). The numbers were obtained from runs with 128 concurrent jobs, each job writing 800 MB (uncompressed) using 32 KB blocks, QSAM BUFNO = 20.
Figure 10-2 shows the read hit performance numbers in a similar fashion.
Figure 10-2 VTS and TS7700 maximum read hit bandwidth
The numbers shown in Figure 10-2 were obtained with 128 concurrent jobs in all runs, each job reading 800 MB (uncompressed) using 32 KB blocks, QSAM BUFNO = 20.
Compared with the VTS, the TS7700 has introduced faster performing hardware components (such as faster FICON channel adapters, the more powerful TS7700 Virtualization Engine controller, and faster disk cache) along with the new TS7700 Virtualization Engine architecture, providing for improved performance and throughput characteristics of the TS7700 Virtualization Engine. From a performance aspect, the architecture offers these important characteristics:
With the selection of DB2 as the central repository, the TS7700 Virtualization Engine provides a standard SQL interface to the data, and all data is stored and managed in one place. DB2 also allows for more control over back-end performance.
The cluster design with virtualization node (vNode) and hierarchical data storage management node (hNode) provides increased configuration flexibility over the monolithic design of the VTS.
The use of TCP/IP instead of Fibre Channel connection (FICON) for site-to-site communication eliminates the requirement to use channel extenders.
10.1.1 Performance overview
This section describes several performance attributes of the TS7700.
Types of throughput
Because the TS7720 is a disk-cache-only cluster, its read and write data rates have been found to be fairly consistent throughout a given workload.
Because the TS7740 contains physical tapes to which the cache data is periodically written, recalls from tape to cache occur, and Copy Export and reclaim activities occur, the TS7740 exhibits four basic throughput rates:
Peak write
Sustained write
Read-hit
Recall
These four rates are described next.
Peak and sustained write throughput
For the TS7740, a measurement is not begun until all previously written data has been copied, or premigrated, from the disk cache to physical tape. Starting with this initial condition, data from the host is first written into the TS7740 disk cache with little if any premigration activity taking place. This approach allows for a higher initial data rate, and is termed the peak data rate.
After the amount of non-premigrated data reaches a pre-established threshold, the amount of premigration is increased, which can reduce the host write data rate. This threshold is called the premigration priority threshold, and has a default value of 1600 GB. When a further threshold of non-premigrated data is reached, the incoming host activity is actively throttled to allow for increased premigration activity. This throttling mechanism operates to achieve a balance between the amount of data coming in from the host and the amount of data being copied to physical tape. The resulting data rate for this mode of behavior is called the “sustained” data rate, and can theoretically continue indefinitely, given a constant supply of logical and physical scratch tapes.
This second threshold is called the premigration throttling threshold, and has a default value of 2000 GB. These two thresholds can be used in conjunction with the peak data rate to project the duration of the peak period. Both the priority and throttling thresholds can be increased through a host command line request, which is described later in this chapter.
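As a rough illustration of that projection, the following Python sketch estimates how long a peak period can last before each threshold is reached. It is a simplification, not the TS7740 algorithm: the function name, the example data rates, and the assumption that the backlog of non-premigrated data grows at a constant net rate (compressed host write rate minus any premigration rate) are all hypothetical. Only the 1600 GB and 2000 GB defaults come from the text.

```python
PRIORITY_THRESHOLD_GB = 1600    # default premigration priority threshold
THROTTLING_THRESHOLD_GB = 2000  # default premigration throttling threshold


def peak_period_hours(threshold_gb: float,
                      peak_rate_mbps: float,
                      premigration_rate_mbps: float = 0.0) -> float:
    """Hours until the non-premigrated backlog reaches threshold_gb.

    Rates are compressed MB/s. Assumes the measurement starts with an
    empty backlog and that 1 GB = 1000 MB (both are assumptions).
    """
    net_rate = peak_rate_mbps - premigration_rate_mbps  # backlog growth, MB/s
    if net_rate <= 0:
        raise ValueError("backlog does not grow; threshold is never reached")
    seconds = threshold_gb * 1000.0 / net_rate
    return seconds / 3600.0


if __name__ == "__main__":
    # Hypothetical example: 400 MB/s compressed host writes, no premigration.
    for name, gb in (("priority", PRIORITY_THRESHOLD_GB),
                     ("throttling", THROTTLING_THRESHOLD_GB)):
        print(f"{name}: {peak_period_hours(gb, 400.0):.2f} h")
```

For example, at a hypothetical 400 MB/s compressed write rate with no premigration, the priority threshold would be reached after 4000 seconds, or about 1.1 hours.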
Read-hit and recall throughput
Similar to write activity, there are two types of TS7740 read performance:
Read-hit (also referred to as peak) occurs when the data requested by the host is currently in the disk cache.
Recall (also referred to as read-miss) occurs when the data requested is no longer in the disk cache, and must be first read in from physical tape.
Read-hit data rates are typically higher than recall data rates.
The two read performance metrics, along with peak and sustained write performance, are sometimes referred to as the four corners of virtual tape performance. Recall performance is dependent on several factors that can vary greatly from installation to installation, such as number of physical tape drives, spread of requested logical volumes over physical volumes, location of the logical volumes on the physical volumes, and length of the physical media.
Grid considerations
Up to six TS7700 clusters can be linked together to form a grid configuration. The connection between these clusters is provided by two or four 1 Gbps TCP/IP links per cluster, or two 10 Gbps links. Data written to one TS7700 cluster can be optionally copied to the other cluster (or clusters). Up to six TS7700 clusters can be host-driven, depending on individual requirements.
Data can be copied between the clusters in either Synchronous, RUN (also known as Immediate), or Deferred copy mode. Synchronous mode means that the data is copied instantly to a second cluster. Synchronous mode copies occur during the writes from the host and are synchronized when a tape synch event occurs. Typically, when the Rewind Unload (RUN) is issued, the Synchronous copy has completed. With an immediate copy (RUN), the second copy is not started until the RUN command is received. In RUN copy mode, the rewind-unload at job end is held up until the received data is copied to the other cluster (or clusters).
In Deferred copy mode, data is queued for copying, but the copy does not have to occur before job end. Deferred copy mode allows for a temporarily higher data rate than Synchronous or RUN copy mode, and can be useful for meeting peak workload demands. Be sure, however, that there is sufficient recovery time for Deferred copy mode so that the deferred copies can be completed before the next peak demand.
Deferred copies are controlled during heavy host I/O with Deferred Copy Throttling (DCT). More priority can be given to deferred copying by lowering the DCT value. DCT and how to modify its default value are described in detail in 10.7.3, “Adjusting the TS7700” on page 713.
10.2 TS7700 components and task distribution
In the process of writing scratch volumes, or premigrating and recalling virtual volumes from physical stacked volumes, hardware components are shared by tasks running on the TS7700 Virtualization Engine. Some of these tasks represent users’ work, such as scratch mounts, and other tasks are associated with the internal operations of the TS7700 Virtualization Engine, such as reclamation in a TS7740 Virtualization Engine.
All these tasks will be sharing the same resources, especially the TS7700 Virtualization Engine Server processor, the TVC, and the physical tape drives attached to a TS7740 Virtualization Engine. Contention might occur for these resources when high workload demands are placed on the TS7700 Virtualization Engine. To manage the use of shared resources, the TS7700 Virtualization Engine uses various resource management algorithms, which can have a significant impact on the level of performance achieved for a specific workload.
This section discusses the effects on performance of the following shared resources:
TS7700 Virtualization Engine processor
TS7740 Virtualization Engine physical tape drives
TS7740 Virtualization Engine physical stacked volumes
Figure 10-3 gives an overview of all the tasks. This section describes the tasks that the TS7700 performs, how those tasks correlate to the components involved, and the tuning points that can be used to favor certain tasks over others.
Figure 10-3 Tasks performed by the TS7700
10.2.1 Tasks performed by CPU (processor cycles)
All tasks running in the TS7700 Virtualization Engine server require a portion of processor cycles. From one cluster’s perspective, the following tasks are included:
Operating system: TS7700 runs on an IBM AIX operating system.
Host read and write data: Data being written to or being read from TS7700 cache. This data is compressed coming into the TS7700 and decompressed going back to host by the host bus adapters (HBAs). Internally, data passes from HBAs through buffers in the CPU memory, and from the memory to the Fibre Channel Protocol (FCP) adapter on its way to the cache. The data rate is limited by the number of Performance Increments installed. Also, the Host Write throttle affects the read/write rate.
Copy data to other clusters: This is data that is copied from this cluster’s cache to other clusters as synchronous (S), immediate (RUN or R), or deferred (D) copies. Deferred Copies are regulated by the DCT, which is also known as the Deferred Copy Read Throttle. Synchronous and immediate copies are not throttled by the originating cluster.
Copy data from other clusters: This is data that is copied into this cluster’s cache from other clusters as synchronous (S), immediate (RUN or R), or deferred (D) copies. Immediate and Deferred copies are regulated by the Copy Throttle. Synchronous copies are not regulated by the Copy Throttle.
Cross-cluster mounts: Either from another cluster accessing a volume in this cluster’s cache or this cluster accessing remote volumes in other clusters’ cache. Host write throttle has control over cross-cluster mount for write from another cluster to this cluster. This also applies to a Synchronous mode copy, which is essentially a remote write. Cross-cluster mounts do not pass data through the cache of the cluster whose virtual logical device was allocated for the mount.
Copy Export (TS7740 only): When a Copy Export occurs, data to be exported that resides only in cache is premigrated to physical tape. Also, each physical tape to be exported is mounted and a snapshot of the TS7740 database is written to it.
Premigrate (TS7740 only): Logical volumes that reside only in cache are written to physical tape. Premigration includes both primary and secondary copies. There are several algorithms in premigration:
 – Idle Premigration
 – Fast Host Write Premigration
 – Somewhat Busy Premigration Ramp Up
 – Preferred Premigration
 – Limit number of drives for premigration in physical volume pool definitions
Recall (TS7740 only): Bringing data from a physical tape to cache to satisfy a specific mount.
Reclaim (TS7740 only): Transferring logical volumes from one physical tape to another. This data passes through the CPU’s memory; cache is not used in this operation. Reclaim is influenced by the number of available drives, the Reclaim Threshold, the Inhibit Reclaim schedule, and the maximum number of reclaim tasks (set using Host Console Request RCLMMAX).
Management interface: The management interface (MI) is a task that consumes CPU power. The MI is used to configure, operate, and monitor the TS7700.
The R2.0 server models V07 and VEB have more CPU power and therefore provide better overall performance. The additional CPU power allows more processing, such as premigration activity, to occur in parallel.
10.2.2 Tasks performed by the TS7700 Tape Volume Cache
The TS7700 cache is the focal point for all data transfers except reclaim activity. When evaluating performance, remember that the cache has a finite size and I/O bandwidth to satisfy a variety of tasks. Equally important to remember is that all data moved within the TS7700 is compressed host data. The HBAs compress and decompress the data as it is written and read by the host. The following tasks move data through the cache:
Host read/write
Copies to/from other clusters
Remote write and read (cross-cluster mounts)
Clarification: Cross-cluster mounts to other clusters do not move data through local cache. Also, reclaim data does not move through the cache.
10.2.3 Tasks performed by the grid
The grid is not a physical component that can be clearly identified like the TS7700 Virtualization Engine (CPU) or the TS7700 cache. Instead, it is a functional module that runs within the TS7700 using internal resources, such as CPU and memory, and external resources, such as the gigabit link infrastructure.
That functional module is responsible for the following items:
Copies to/from other clusters:
 – Synchronous copies
 – Immediate copies (RUN)
 – Deferred copies
Cross-cluster mounts:
 – Using another cluster’s cache
 – Another cluster using this cluster’s cache
Cluster coordination traffic:
 – Ownership takeover
 – Volume attribute changes
 – Logical volume insert
 – Configuration
10.2.4 Tasks performed by the TS7740 Tape Drives
The TS7740 physical back-end tape drives read and write data for a variety of reasons. The back-end drives are shared among these tasks. Balancing their usage is controlled by default algorithms that can be adjusted somewhat by the user. It is important to monitor the use of the physical drives to ensure that there are enough back-end drives to handle the workload. The back-end drives are used by the following tasks:
Copy Export:
 – Volumes that have not been premigrated are copied to physical tape
 – Process writes the TS7740 database to each exported volume
Premigrate:
 – Logical volumes are copied from cache to physical tape
 – Primary and secondary copies
 – Secondary copies can be Copy Export volumes
Recall:
 – Private (non-Fast Ready) mount
 – Whole logical volume read into cache
Reclaim:
 – Two drives per task
 – Process runs to completion (source empty)
10.3 Throttling, tasks, and knobs
The main goal is to help you understand how data flows within the TS7700, how the internal resources are used by the regular tasks, where the limits are, and how to adjust the available tuning points (knobs) to get the maximum performance from your TS7700, considering your variables and particular needs.
To use the tunable parameters available to you in the best manner, you need a good understanding of the types of throttling within the TS7700, and the underlying mechanisms. Throttling is the mechanism adopted to control and balance several tasks that run at the same time within the TS7700, prioritizing certain tasks over others. These mechanisms are called upon only when the system reaches higher levels of utilization, where the components are used almost to their maximum capacity and bottlenecks start to show. The criteria balance the user needs with the vital resources needed for the TS7700 to function.
This control is accomplished by delaying the launch of new tasks and prioritizing more important tasks over the other tasks. After the tasks are dispatched and running, control over the execution is accomplished by slowing down a specific functional area by introducing calculated amounts of delay in the operations. This alleviates stress on an overloaded component or leaves extra CPU cycles to another needed function, or simply waits for a slower operation to finish.
The following sections describe the throttling algorithms and the control knobs that can be used to customize the TS7700 behavior to your particular needs.
10.3.1 Throttling in the TS7700
The TS7700 subsystem aggregated I/O rate is the sum of all data transfers (reads or writes) that are performed by all active logical drives mounted at a certain time in a TS7700. Each logical drive that is mounted and transferring data gets a proportional share of the total data throughput.
For instance, if there are 50 mounted and active virtual drives, each one will receive, on average, two percent of the current host data transfer. Important factors that can affect host data transfer bandwidth are the regular operational tasks, such as premigrating, immediate and deferred copies, Copy Export tasks, and reclaim activities.
The subsystem has a series of self-regulatory mechanisms that try to optimize the shared resources usage. Subsystem resources, such as CPU, cache bandwidth, cache size, host channel bandwidth, grid network bandwidth, physical drives, and so on, are limited, and they must be shared by all tasks moving data throughout the subsystem.
The resources will implicitly throttle by themselves when reaching their limits. The TS7700 introduces a variety of explicit throttling methods to give higher priority tasks more of the shared resources. The following list shows normally running tasks that move data:
Immediate copies
Copy Export
Host I/O, which includes Sync Mode Copy writes
Deferred copies
In certain situations, the TS7700 will grant higher priority to activities in order to solve a problem state, for example:
Panic reclamation: The TS7740 detects that the number of empty physical volumes has dropped below the minimum value and reclaims need to be done immediately to increase the count.
Cache fills with copy data: To protect from having uncopied volumes removed from cache, the TS7740 throttles data coming into the cache.
Cache overfills: If no more data can be placed into the cache before data is removed, other tasks trying to add to the cache are heavily throttled.
There are three types of throttling:
Host Write throttle
Copy throttle
Deferred Copy throttle (DCT)
10.3.2 Host Write Throttle
This mechanism is applied to limit the amount of data written into cache from the host.
It throttles incoming host writes from the channel and host I/O due to mounts from other clusters. If a cluster is applying host write throttle, a Sync Mode Copy write into the cluster is also slowed down, because a Sync Mode Copy is treated as a host write.
This throttle is triggered by the following items:
Full Cache: A full cache of data is to be copied.
Immediate Copy: Large amounts of immediate copy data need to be moved.
Premigrate: Large amounts of data in cache have not been written to tape yet.
Free Space: Cache is nearly full of any kind of data.
Figure 10-4 is a visual representation of the Host Write throttle mechanism and where it applies.
Figure 10-4 Host Write Throttle
10.3.3 Copy throttle
Copy throttle is used to limit the amount of data being written into cache from other clusters’ copy data. This mechanism throttles incoming copies, both immediate and deferred, that come from other clusters in the grid. The number of threads/tasks used to copy data between clusters depends on the number of links that are enabled. The default value for a V07/VEB is 20 for clusters with two 1 Gbps Ethernet links, and 40 for clusters with four 1 Gbps Ethernet links or two 10 Gbps Ethernet links. The default for a V06/VEA is 20 regardless of the connection.
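The default copy task counts given above can be summarized as a small lookup. The following Python sketch is illustrative only; the function name and parameters are hypothetical, and link configurations that the text does not mention are rejected rather than guessed.

```python
def default_copy_tasks(model: str, links_1gbps: int = 0, links_10gbps: int = 0) -> int:
    """Default number of copy tasks for a cluster, per the values in the text."""
    model = model.upper()
    if model in ("V06", "VEA"):
        return 20  # 20 regardless of the connection
    if model in ("V07", "VEB"):
        if links_1gbps == 4 or links_10gbps == 2:
            return 40
        if links_1gbps == 2:
            return 20
        raise ValueError("link configuration not covered by the text")
    raise ValueError(f"unknown server model: {model}")


if __name__ == "__main__":
    print(default_copy_tasks("V07", links_1gbps=2))   # 20
    print(default_copy_tasks("VEB", links_10gbps=2))  # 40
```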
Copy throttling is triggered by the following items:
Full Cache
Amount of data to be premigrated
Figure 10-5 shows the mechanism and where it applies.
Figure 10-5 Copy throttle
10.3.4 Deferred Copy throttle
Deferred Copy throttle (DCT) acts on the rate of deferred copies going to other clusters. This throttle is applied with the intention of saving CPU cycles to favor host I/O data transfer. Here, you are trading the deferred copy rate for higher host I/O throughput. This throttle is also known as Deferred Copy Read throttle, because other clusters are “reading” data from this cluster. See Figure 10-6 on page 665.
Figure 10-6 Deferred Copy throttle (Deferred Read Copy throttle)
Remember: The 100 MBps threshold is the default. It can be changed through the Host Console Request.
Causes for host write and copy throttling
Host write throttle and copy throttle are triggered by the same factors:
Full cache: Cache is full of data that needs to be copied to another cluster.
 – Amount of data to be copied to another cluster is greater than 95% of cache size and the TS7700 has been up more than 24 hours.
 – This is reported as Write throttle and Copy throttle in VEHSTATS.
Immediate copy: Immediate copies to other clusters, where this cluster is the source, are taking too long or are predicted to take too long:
 – The TS7700 evaluates the need for this throttle every two minutes.
 – The TS7700 examines the depth of the immediate copy queue and the amount of time that the copies have been in the queue to determine whether to apply the throttle:
The algorithm looks at the age of the oldest immediate copy in the queue:
 • If the oldest immediate copy is 10 - 30 minutes old, the throttle ramps linearly from 0.00166 seconds to 2 seconds over that range.
 • The maximum throttle (2 seconds) is applied immediately if an immediate copy has been in the queue for 30 minutes or longer.
Look at quantity of data, and calculate how long the transfer will take:
 • If greater than 35 minutes, the TS7700 sets the throttle to the maximum (2 seconds).
 • If 5 - 35 minutes, the TS7700 ramps the throttle linearly from 0.01111 seconds to 2 seconds over that range.
 – Immediate copy is reported as Write Throttle in VEHSTATS.
Example: A 4000 MB immediate copy takes five times as long as an 800 MB immediate copy.
 – Host Write Throttle, because of Immediate Copies taking too long, can be disabled by using the Host Console Request.
Premigrate: Amount of data to be premigrated is above the threshold (default 2000 GB):
 – Premigrate is reported as Write throttle and Copy throttle in VEHSTATS.
 – These throttle values will be equal if premigrate is the sole reason for throttling.
Free space: Invoked when cache is nearly full of data.
 – Used to ensure that there is enough cache to handle the currently mounted volumes.
 – Free space is reported as Write Throttle in VEHSTATS.
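The two linear ramps described under the immediate copy trigger can be sketched in Python as follows. This is an interpretation of the prose, not the TS7700 code: the function names are hypothetical, a delay of zero below the start of each ramp is an assumption, and the text does not say how the two results are combined, so they are kept as separate functions.

```python
MAX_THROTTLE_S = 2.0  # maximum throttle delay, in seconds


def linear_ramp(x: float, x_lo: float, x_hi: float,
                y_lo: float, y_hi: float) -> float:
    """Interpolate linearly between (x_lo, y_lo) and (x_hi, y_hi).

    Returns 0.0 below x_lo (assumed: no throttle) and y_hi at or above x_hi.
    """
    if x < x_lo:
        return 0.0
    if x >= x_hi:
        return y_hi
    return y_lo + (y_hi - y_lo) * (x - x_lo) / (x_hi - x_lo)


def throttle_from_queue_age(oldest_copy_age_min: float) -> float:
    """Delay (s) based on the age of the oldest queued immediate copy."""
    return linear_ramp(oldest_copy_age_min, 10.0, 30.0, 0.00166, MAX_THROTTLE_S)


def throttle_from_transfer_time(predicted_transfer_min: float) -> float:
    """Delay (s) based on the predicted time to transfer the queued data."""
    return linear_ramp(predicted_transfer_min, 5.0, 35.0, 0.01111, MAX_THROTTLE_S)


if __name__ == "__main__":
    for age in (5, 10, 20, 30, 45):
        print(age, round(throttle_from_queue_age(age), 5))
```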
What activates the Deferred Copy Throttle
Starting with R2.0, DCT is applied in the following conditions:
DCTAVGTD: DCT 20 Minute Average Threshold looks at the 20 minute average of the compressed host read and write rate. The threshold defaults to 100 MB/s.
Cache Write Rate: Compressed writes to disk cache, which include host write, recall write, grid copy-in write, and cross-cluster write to this cluster. The threshold is fixed at 150 MB/s.
Cluster utilization looks at both the CPU usage and the disk cache usage. The threshold is when either one is 85% busy or more.
DCT is applied when Cluster Utilization > 85% or the Cache Write Rate is more than 150 MB/s, and the 20-minute average compressed host I/O rate is more than DCTAVGTD. This algorithm was added in R2.0.
The Cache Write Rate was introduced at R2.0 due to the increased CPU power on the POWER7 processor. The CPU usage is quite often below 85% during peak host I/O periods. Prior to R2.0, the Cache Write Rate was not considered.
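The R2.0 activation rule can be expressed as a single boolean test. The following Python sketch reads the condition as "(cluster utilization high OR cache write rate high) AND 20-minute average host I/O above DCTAVGTD", which is one interpretation of the prose. The function and parameter names are hypothetical, and the sketch treats exactly 85% utilization as busy.

```python
def dct_applies(cpu_busy_pct: float,
                cache_busy_pct: float,
                cache_write_mbps: float,
                host_io_20min_avg_mbps: float,
                dctavgtd_mbps: float = 100.0) -> bool:
    """Return True if Deferred Copy Throttle would be applied (R2.0 rule)."""
    # Cluster utilization looks at both CPU and disk cache usage.
    cluster_busy = max(cpu_busy_pct, cache_busy_pct) >= 85.0
    # Fixed threshold on compressed writes to disk cache.
    cache_write_high = cache_write_mbps > 150.0
    # Tunable threshold on the 20-minute average compressed host I/O rate.
    host_io_high = host_io_20min_avg_mbps > dctavgtd_mbps
    return (cluster_busy or cache_write_high) and host_io_high


if __name__ == "__main__":
    # CPU below 85% busy, but cache writes and host I/O are high.
    print(dct_applies(60.0, 40.0, 200.0, 180.0))  # True
    # Busy cluster, but the host I/O average is below DCTAVGTD.
    print(dct_applies(90.0, 40.0, 200.0, 50.0))   # False
```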
DCT remains in effect for the subsequent 30-second interval, after which it is reevaluated. The default DCT value is 125 ms. The default value of 125 ms severely slows deferred copy activity (125 ms is added between each 32 K block of data sent for a volume).
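To see why a 125 ms delay slows deferred copies so severely, consider the arithmetic for a single copy stream. The Python sketch below is an approximation under stated assumptions: a "32 K block" is taken as 32 KiB, one delay is charged per block, and the actual transfer time is ignored. The function names are hypothetical.

```python
BLOCK_BYTES = 32 * 1024  # assumed size of one "32 K" block


def dct_added_delay_seconds(volume_bytes: int, dct_ms: float = 125.0) -> float:
    """Total delay DCT adds while one volume is copied."""
    blocks = -(-volume_bytes // BLOCK_BYTES)  # ceiling division
    return blocks * dct_ms / 1000.0


def dct_max_rate_kib_per_s(dct_ms: float = 125.0) -> float:
    """Upper bound on one copy stream's rate while DCT is in effect."""
    return (BLOCK_BYTES / 1024) / (dct_ms / 1000.0)


if __name__ == "__main__":
    volume = 800 * 1024 * 1024  # hypothetical 800 MiB volume as stored in cache
    print(dct_added_delay_seconds(volume))  # 3200.0 seconds
    print(dct_max_rate_kib_per_s())         # 256.0 KiB/s
```

With these assumptions, an 800 MiB volume incurs about 3200 seconds of added delay, and each copy stream is held to at most 256 KiB/s while the throttle is active.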
The DCT can be set using Host Console Request. The setting of the DCT is described in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User’s Guide, which is available on Techdocs. Use the SETTING, THROTTLE, DCOPYT keywords.
The DCT threshold can also be set by using Host Console Request. The setting of the DCT threshold is described in detail in the same User’s Guide. Use the SETTING, THROTTLE, DCTAVGTD keywords.
10.3.5 Managing tasks within TS7700
This section describes how the TS7700 manages and prioritizes several of its common tasks.
Prioritizing copies within the grid for TS7700
In a multicluster grid, copies of volumes are made from one cluster to another. Copies are made in either an immediate or deferred manner. There are three classifications of copies: immediate, immediate-deferred, and deferred. Immediate-deferred are volumes that were originally defined as immediate copies but were changed to deferred copies. An immediate copy can be changed to an immediate-deferred copy if the cluster that is to receive the copy is not online. Volumes in the source cluster are defined as either Preference Level 0 (PG0) or Preference Level 1 (PG1) volumes through the Storage Class. Within the three copy classifications, the TS7740 gives priority to PG0 volumes over PG1 volumes. Prioritizing PG0 over PG1 allows the PG0 volumes to be removed from cache as soon as possible, allowing more PG1 volumes to reside in cache.
The copies are processed in the following order:
1. Synchronous-Deferred PG0
2. Synchronous-Deferred PG1
3. Immediate PG0
4. Immediate PG1
5. Immediate-Deferred PG0
6. Immediate-Deferred PG1
7. Deferred PG0
8. Deferred PG1
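The ordering rule behind this list (copy classification first, then PG0 ahead of PG1 within each classification) can be sketched as a sort key. The Python below is illustrative only; the `CopyJob` type, its field names, and the sample volume serial numbers are hypothetical.

```python
from typing import List, NamedTuple

# Lower rank is processed first.
CLASS_RANK = {
    "synchronous-deferred": 0,
    "immediate": 1,
    "immediate-deferred": 2,
    "deferred": 3,
}


class CopyJob(NamedTuple):
    volser: str
    classification: str
    preference_group: int  # 0 for PG0, 1 for PG1


def processing_order(jobs: List[CopyJob]) -> List[CopyJob]:
    """Sort by classification, then PG0 before PG1 (stable for ties)."""
    return sorted(jobs, key=lambda j: (CLASS_RANK[j.classification],
                                       j.preference_group))


if __name__ == "__main__":
    queue = [
        CopyJob("A00001", "deferred", 0),
        CopyJob("A00002", "immediate-deferred", 1),
        CopyJob("A00003", "immediate", 1),
        CopyJob("A00004", "synchronous-deferred", 1),
        CopyJob("A00005", "immediate-deferred", 0),
    ]
    for job in processing_order(queue):
        print(job.volser, job.classification, f"PG{job.preference_group}")
```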
Premigration tasks
The TS7740 uses a variety of criteria to manage the number of premigration tasks. The TS7740 looks at these criteria every five seconds to determine whether one or more premigration tasks must be added. Adding a premigration task is based on the following factors and others:
Host-compressed write rate
CPU activity
The amount of data that needs to be premigrated per pool
The amount of data that needs to be premigrated in total
Figure 10-7 shows a migration demonstration. A premigration task does not preempt a recall, reclaim, or Copy Export task. Four algorithms work in concert to determine whether to start another premigration task. Only general details are described here; the actual algorithm has several nuances that are not covered.
Figure 10-7 Migration demonstration
Consider the following factors:
Idle premigration:
 – If the CPU is idle more than 5% of the time, a premigrate task is started, if appropriate.
 – The number of tasks is limited to six or the maximum premigration drives defined by pool properties, whichever is lower.
Fast host write premigration mode:
 – Compressed host write rate is higher than 30 MBps and CPU idle time is less than 1%.
 – Premigration tasks are limited to two, for all pools, and lowered to one or zero if the mode continues.
Premigration ramp-up:
 – Compressed host write is less than 30 MBps and CPU idle time is less than 5%.
 – Indicator that the TS7700 is somewhat busy.
 – Ramp-up is limited to available back-end drives, minus one. Also, it is limited by the “Maximum number of premigrate drives” setting in the pool properties.
Preferred premigration:
 – Amount of non-premigrated data exceeds the preferred premigration threshold. The default threshold is 1600 GB.
 – Limited to available back-end drives, minus one. Also, limited by the “Maximum number of premigrate drives” setting in the pool properties.
 – Preferred premigration takes precedence over the other algorithms.
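The four algorithms above can be condensed into a simplified decision function. The Python sketch below covers only the conditions stated in the list and omits the nuances of the real algorithm. The function name, parameter names, and the "unspecified" result for combinations the list does not cover are assumptions, and the fast host write limit shown is the initial cap of two tasks.

```python
from typing import Optional, Tuple


def premigration_mode(cpu_idle_pct: float,
                      host_write_mbps: float,
                      non_premigrated_gb: float,
                      available_drives: int,
                      pool_max_premig_drives: int,
                      preferred_threshold_gb: float = 1600.0
                      ) -> Tuple[str, Optional[int]]:
    """Return (mode, task limit) for the conditions described in the text."""
    # Available back-end drives minus one, capped by the pool setting.
    drive_limit = max(0, min(available_drives - 1, pool_max_premig_drives))

    # Preferred premigration takes precedence over the other algorithms.
    if non_premigrated_gb > preferred_threshold_gb:
        return ("preferred", drive_limit)
    if host_write_mbps > 30.0 and cpu_idle_pct < 1.0:
        return ("fast host write", 2)  # later lowered to one or zero
    if host_write_mbps < 30.0 and cpu_idle_pct < 5.0:
        return ("ramp-up", drive_limit)
    if cpu_idle_pct > 5.0:
        return ("idle", min(6, pool_max_premig_drives))
    return ("unspecified", None)  # combination not covered by the text


if __name__ == "__main__":
    print(premigration_mode(10.0, 20.0, 100.0, 8, 6))   # ('idle', 6)
    print(premigration_mode(0.5, 50.0, 100.0, 8, 6))    # ('fast host write', 2)
    print(premigration_mode(3.0, 20.0, 1700.0, 4, 6))   # ('preferred', 3)
```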
Immediate-copy set to immediate-deferred state
The goal of an immediate copy is to complete one or more RUN consistency point copies of a logical volume before surfacing status of the RUN command to the mounting host. If one or more of these copies cannot complete, the replication state of the targeted volume will enter the immediate-deferred state. The volume will remain in an immediate-deferred state until all of the requested RUN consistency points contain a valid copy. The immediate-deferred volume will replicate with a priority greater than standard deferred copies, but lower than non-deferred immediate copies.
There are numerous reasons why a volume might enter the immediate-deferred state. For example, it might not complete within 40 minutes, or one or more clusters targeted to receive an immediate copy are not available. Independently of why a volume might enter the immediate-deferred state, the host application or job that is associated with the volume is not aware that its previously written data has entered the immediate-deferred state.
The reasons why a volume moves to the immediate-deferred state are contained in the Error Recovery Action (ERA) 35 sense data. The codes are divided into unexpected and expected reasons. From a z/OS host view, the ERA is part of message IOS000I (Example 10-1).
Example 10-1 Message IOS000I
IOS000I 1029,F3,EQC,0F,0E00,,**,489746,HADRMMBK  745                          
New failure content is introduced into the CCW(RUN) ERA35 sense data:
Byte 14 FSM Error - If set to 0x1C (Immediate Copy Failure), the additional new fields are populated.
Byte 18 Bits 0:3 - Copies Expected: Indicates how many RUN copies were expected for this volume.
Byte 18 Bits 4:7 - Copies Completed: Indicates how many RUN copies were actually verified as successful before surfacing Sense Status Information (SNS).
Byte 19 - Immediate Copy Reason Code:
 – Unexpected - 0x00 to 0x7F: The reasons are based on unexpected failures:
 • 0x01 - A valid source to copy was unavailable.
 • 0x02 - Cluster targeted for a RUN copy is not available (unexpected outage).
 • 0x03 - Forty minutes have passed and one or more copies have timed out.
 • 0x04 - Is downgraded to immediate-deferred because of health/state of RUN target clusters.
 • 0x05 - Reason is unknown.
 – Expected - 0x80 to 0xFF: The reasons are based on the configuration or a result of planned outages:
 • 0x80 - One or more RUN target clusters are out of physical back-end scratch.
 • 0x81 - One or more RUN target TS7720 clusters are low on available cache (95%+ full).
 • 0x82 - One or more RUN target clusters are in service-prep or service.
 • 0x83 - One or more clusters have copies explicitly disabled through the Library Request operation.
 • 0x84 - The volume cannot be reconciled and is currently “Hot” against peer clusters.
The additional data contained within the CCW(RUN) ERA35 sense data can be used within a z/OS custom user exit to act on a job moving to the immediate-deferred state. Because the requesting application that results in the mount has already received successful status before the issuing of the CCW(RUN), it cannot act on the failed status. However, future jobs can be suspended or other custom operator actions can be taken using the information provided within the sense data.
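The field layout listed above can be sketched as a small decoder. This is a minimal illustration, not IBM-supplied code: the function and class names are hypothetical, byte offsets are assumed to be zero-based into the sense data, and bit 0 is assumed to be the most significant bit of the byte (so bits 0:3 form the high-order half of byte 18). A real z/OS exit would be written in assembler against the actual sense buffer.

```python
from dataclasses import dataclass
from typing import Optional

IMMEDIATE_COPY_FAILURE = 0x1C  # byte 14 (FSM Error) value from the field list

# Byte 19 reason codes, copied from the list in the text.
REASONS = {
    0x01: "A valid source to copy was unavailable",
    0x02: "Cluster targeted for a RUN copy is not available (unexpected outage)",
    0x03: "Forty minutes have passed and one or more copies have timed out",
    0x04: "Downgraded because of health/state of RUN target clusters",
    0x05: "Reason is unknown",
    0x80: "One or more RUN target clusters are out of physical back-end scratch",
    0x81: "One or more RUN target TS7720 clusters are low on available cache",
    0x82: "One or more RUN target clusters are in service-prep or service",
    0x83: "Copies explicitly disabled through the Library Request operation",
    0x84: "The volume cannot be reconciled and is currently Hot against peers",
}


@dataclass
class ImmediateCopyFailure:
    copies_expected: int
    copies_completed: int
    reason_code: int
    expected: bool  # True for the 0x80 to 0xFF (configuration/planned) range
    reason: str


def decode_era35(sense: bytes) -> Optional[ImmediateCopyFailure]:
    """Decode the immediate-copy fields, or return None if byte 14 is not 0x1C."""
    if len(sense) < 20 or sense[14] != IMMEDIATE_COPY_FAILURE:
        return None
    code = sense[19]
    return ImmediateCopyFailure(
        copies_expected=sense[18] >> 4,     # bits 0:3 (assumed high-order nibble)
        copies_completed=sense[18] & 0x0F,  # bits 4:7 (assumed low-order nibble)
        reason_code=code,
        expected=code >= 0x80,
        reason=REASONS.get(code, "Reason code not listed"),
    )
```

An automation routine could use the `expected` flag to decide, for example, whether to alert an operator (unexpected reasons) or only log the event (planned outages).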
Synchronous mode copy set to synchronous deferred
The default behavior of Synchronous mode copy (SMC) is to fail a write operation if either of the two clusters with the “S” copy policy is unavailable or becomes unavailable during write operations. You can enable the Synchronous Deferred on Write Failure (SDWF) option to permit update operations to continue to any valid consistency point in the grid. If there is a write failure, the failed “S” locations are set to a state of “synchronous-deferred”. After the volume is closed, any synchronous-deferred locations are updated to an equivalent consistency point through asynchronous replication. If the SDWF option is not checked (default) and a write failure occurs at either of the “S” locations, host operations fail and you must view only content up to the last successful sync point as valid.
For example, imagine a three-cluster grid and a copy policy of Sync-Sync-Deferred (SSD): synchronous copies to Cluster 0 and Cluster 1 and a deferred copy to Cluster 2. The host is connected to Cluster 0 and Cluster 1. With this option disabled, both Cluster 0 and Cluster 1 must be available for write operations. If either one becomes unavailable, write operations fail. With the option enabled, if either Cluster 0 or Cluster 1 becomes unavailable, write operations continue. The second “S” copy becomes a synchronous-deferred copy.
In the previous example, if the host is attached to Cluster 2 only and the option is enabled, the write operations continue even if both Cluster 0 and Cluster 1 become unavailable. The “S” copies become synchronous-deferred copies.
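The two examples above can be expressed as a small decision function. This is only a sketch of the behavior as described in this section: the function name and the dictionary representation of the copy policy are hypothetical, and it assumes that with SDWF enabled a write can continue as long as at least one available cluster has a consistency point other than “N” (no copy).

```python
def write_outcome(policy, available, sdwf_enabled):
    """Return (write_continues, synchronous_deferred_clusters).

    policy:    cluster id -> consistency point, for example {0: "S", 1: "S", 2: "D"}
    available: set of cluster ids that are currently available
    """
    sync_clusters = [c for c, point in policy.items() if point == "S"]
    failed = [c for c in sync_clusters if c not in available]
    if not failed:
        return True, []   # both "S" locations are written synchronously
    if not sdwf_enabled:
        return False, []  # default behavior: the host write fails
    # Assumption: any available cluster with a valid consistency point
    # is enough for the update to continue.
    can_continue = any(c in available and point != "N"
                       for c, point in policy.items())
    return can_continue, (failed if can_continue else [])


ssd = {0: "S", 1: "S", 2: "D"}
print(write_outcome(ssd, {0, 2}, sdwf_enabled=False))
print(write_outcome(ssd, {0, 2}, sdwf_enabled=True))
print(write_outcome(ssd, {2}, sdwf_enabled=True))
```

The locations returned in the second element are the ones that would later be brought to an equivalent consistency point through asynchronous replication.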
The synchronous-deferred volume replicates with a priority greater than immediate-deferred and standard-deferred copies.
See the “IBM Virtualization Engine TS7700 Series Best Practices - Synchronous Mode Copy” white paper on Techdocs for detailed information concerning Synchronous mode copy.
10.3.6 TS7740 tape drives and overall cluster performance
The physical tape drives are managed by the TS7740 Virtualization Engine internal management software and cannot be accessed from any other attached host. These drives are used exclusively by the TS7740 Virtualization Engine for the mounts required for copying virtual volumes to stacked volumes, recalling virtual volumes into the cache, and reclaiming stacked volume space.
The availability of TS7740 Virtualization Engine physical tape drives for certain functions can significantly affect TS7740 Virtualization Engine performance. The TS7740 Virtualization Engine manages the internal allocation of these drives as required for various functions, but it always reserves at least one physical drive for recall and one drive for premigration.
TVC management algorithms also influence the allocation of back-end physical tape drives, as described in the following examples:
Cache freespace low: The TS7740 Virtualization Engine increases the number of drives available to the premigration function and reduces the number of drives available for recalls.
Premigration threshold crossed: The TS7740 Virtualization Engine reduces the number of drives available for recall down to a minimum of one drive to make drives available for the premigration function.
The number of drives available for recall or copy is also reduced during reclamation.
Restricting the number of drives available for premigration can limit the number of virtual volumes in the cache that can be migrated, which can in turn cause free space or copy queue throttling to be applied.
Restricting the number of drives available for recall can lengthen virtual mount times for logical volumes that are being recalled.
Recall performance is highly dependent on both the placement of the recalled logical volumes on stacked volumes and the order in which the logical volumes are recalled. To minimize the effects of volume pooling on sustained write performance, volumes are premigrated by using a different distribution algorithm.
This algorithm chains several volumes together on the same stacked volume for the same pool. This can change recall performance, sometimes making it better, sometimes making it worse. Other than variations in performance because of differences in distribution over the stacked volumes, recall performance should be constant.
Reclaim policies must be set in the MI for each volume pool. Reclamation occupies drives and can affect performance. Using multiple physical pools can cause a higher usage of physical drives for premigration and reclaim. In general, the more pools are used, the more drives are needed.
The Inhibit Reclaim schedule is also set from the MI, and it can prevent reclamation from running during specified time frames during the week. If Secure Data Erase is used, fewer physical tape drives might be available even during times when reclamation is inhibited. If you use Secure Data Erase, limit it to a specific group of data. Inhibit Reclaim specifications only partially apply to Secure Data Erase.
Note: Secure Data Erase does not honor your settings and, therefore, can run erasure operations as long as there are physical volumes to be erased.
The use of Copy Export and Selective Dual Copy also increases the use of physical tape drives. Both are used to create two copies of a logical volume in a TS7740 Virtualization Engine.
Figure 10-8 on page 672 shows how the sustained write data rate can be affected by the back-end physical tape drives. The chart shows sustained write data rates achieved in the laboratory for a stand-alone TS7740 with various numbers of TS1130 back-end physical tape drives. The data for the chart was measured with no TS7740 activity other than the sustained writes and premigration (host write balanced with premigration to tape).
Figure 10-8 TS7740 stand-alone sustained write versus the number of online drives
All runs were made with 128 concurrent jobs. Each job wrote 800 MB (uncompressed) using 32-KB blocks, data compression 2.66:1, QSAM BUFNO = 20, and four 4-Gb FICON channels from a z10 logical partition (LPAR).
Figure 10-9 on page 673 shows the TS7740 premigration rates. The rate at which cache-resident data is copied to physical tapes depends on the number of drives available for premigration.
Clarification: No TS7740 activity other than premigration is measured in this chart.
Figure 10-9 TS7740 stand-alone premigration rate versus premigration drives
All runs were made with 128 concurrent jobs. Each job wrote 800 MB (uncompressed) using 32-KB blocks, data compression 2.66:1, QSAM BUFNO = 20, and four 4-Gb FICON channels from a z10 logical partition (LPAR).
10.4 Virtual Device Allocation in z/OS with JES2
This section describes general z/OS (JES2 only) allocation characteristics and shows how the allocation algorithms are influenced by the z/OS allocation parameter settings EQUAL and BYDEVICES, by the TS7700 Virtualization Engine Copy Consistency Points, by override settings, and by the Allocation Assistance functions. Carefully plan for your device allocation requirements because improper use of the parameters, functions, and settings can have unpredictable results. Various scenarios are presented showing the influence of the algorithms involved in virtual device allocation.
The scenarios use a configuration with two three-cluster grids, named GRID1 and GRID2. Each grid has a TS7720 Virtualization Engine (Cluster 0) and a TS7740 Virtualization Engine (Cluster 1) at the primary Production Site, and a TS7740 Virtualization Engine (Cluster 2) at the Disaster Site. The TS7720 Cluster 0 in the Production Site can be considered as a deep cache for the TS7740 Cluster 1 in the scenarios that are described next.
Figure 10-10 gives a general overview of the configuration.
Figure 10-10 Grid configuration overview
In Figure 10-10, the host in the Production Site has direct access to the local clusters in the Production Site, and has access over the extended FICON fabric to the remote clusters in the Disaster Site. The extended FICON fabric can include dense wavelength division multiplexing (DWDM) connectivity, or can use FICON tape acceleration technology over IP. Assume that connections to the remote clusters have limited bandwidth.
Furthermore, there is one storage management subsystem (SMS) Storage Group per grid. The groups are defined in the SMS Storage Group routine as GRID1 and GRID2. SMS manages the Storage Groups equally: the order in the definition statement does not influence the allocations.
The following scenarios are described. Each scenario adds functions to the previous scenario so that you can better understand the effects of the added functions:
EQUAL allocation: Describes the allocation characteristics of the default load-balancing algorithm (EQUAL) and its behavior across the sample TS7700 Virtualization Engine configuration with two grids. See 10.4.1, “EQUAL allocation” on page 675.
BYDEVICES allocation: Adds the new BYDEVICES algorithm to the configuration. It explains how this algorithm can be activated and the differences from the default EQUAL algorithm. See 10.4.2, “BYDEVICES allocation” on page 677.
Allocation and Copy Consistency Point setting: Adds information about the effect of the Copy Consistency Point on the cache data placement. The various TS7700 Virtualization Engine override settings influence this data placement. See 10.4.3, “Allocation and Copy Consistency Point setting” on page 679.
Allocation and device allocation assistance (DAA): DAA is activated, and the effects on allocation are described. The unavailability of cluster and device information influences the allocation when DAA is enabled. DAA is enabled, by default. See 10.4.4, “Allocation and device allocation assistance” on page 681.
Allocation and scratch allocation assistance (SAA): SAA is activated, and its effects are described. A sample workload in this scenario is presented to clarify SAA. The advantages of SAA and the consequences of the unavailability of clusters and devices are explained. SAA is disabled, by default. See 10.4.5, “Allocation and scratch allocation assistance” on page 683.
10.4.1 EQUAL allocation
For non-specific (scratch) allocations, by default, MVS device allocation will first randomize across all eligible libraries and then, after a library is selected, will randomize on the eligible devices within that library. In terms of the TS7700 Virtualization Engine, “library” refers to a composite library because the MVS allocation has no knowledge of the underlying clusters (distributed libraries). The default algorithm (EQUAL) works well if the libraries under consideration have an equal number of online devices. For example, if two libraries are eligible for a scratch allocation and each library has 128 devices, over time, each library will receive approximately half of the scratch allocations. If one of the libraries has 128 devices and the other library has 256 devices, each of the libraries will still receive approximately half of the scratch allocations. The allocations are independent of the number of online devices in the libraries.
Remember: With EQUAL allocation, the scratch allocations will randomize across the libraries. EQUAL allocation is not influenced by the number of online devices in the libraries.
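The two-stage randomization described above can be illustrated with a short simulation. This is a sketch of the idea only, not the MVS allocation code: the library names are hypothetical, and it models just the two steps named in the text (pick a library uniformly, then pick a device uniformly within it).

```python
import random
from collections import Counter


def equal_scratch_allocation(libraries, rng):
    """Pick a library at random, then a device at random within that library.

    libraries: library name -> number of online devices
    """
    library = rng.choice(sorted(libraries))
    device = rng.randrange(libraries[library])
    return library, device


def simulate(libraries, trials=100_000, seed=1):
    """Return the fraction of scratch allocations that each library receives."""
    rng = random.Random(seed)
    counts = Counter(equal_scratch_allocation(libraries, rng)[0]
                     for _ in range(trials))
    return {name: counts[name] / trials for name in libraries}


# One library with 128 online devices and one with 256: each library still
# receives roughly half of the allocations.
shares = simulate({"LIB_A": 128, "LIB_B": 256})
print(shares)
```

Because the library is chosen before any device is considered, each individual device in the smaller library is selected about twice as often as a device in the larger library.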
In this first scenario, both DAA and SAA are assumed to be disabled. With the TS7700 Virtualization Engine, you can control both assistance functions with the LIBRARY REQUEST command. DAA is ENABLED by default and can be DISABLED with the command. SAA is DISABLED by default and can be ENABLED with the command. Furthermore, none of the TS7700 Virtualization Engine override settings are used.
Assuming that the Management Class for the logical volumes has a Copy Consistency Point of [R,R,R] in all clusters and that the number of available virtual drives is the same in all clusters, the allocations are spread evenly across the two grids (composite libraries). The multicluster grids are running in BALANCED mode, so there is no preference for one cluster over another.
With the default algorithm EQUAL, the distribution of allocations across the clusters (in a multiple cluster grid) depends on the order in which the library port IDs were initialized during IPL (or input/output definition file (IODF) activate). The distribution of allocations across the clusters also depends on whether the library port IDs in the list (returned by the DEVSERV QLIB,composite-library-id command) randomly represent each of the clusters or if the library port IDs in the list tend to favor the library port IDs in one cluster first, followed by the next cluster, and so on. The order in which the library port IDs are initialized and appear in this DEVSERV list can vary across IPLs or IODF activates, and can influence the randomness of the allocations across the clusters.
So with the default algorithm EQUAL, there might be times when device randomization within the selected library (composite library) appears unbalanced across clusters in a TS7700 Virtualization Engine that have online devices. As the number of eligible library port IDs increases, the likelihood of this imbalance occurring also increases. If this imbalance affects the overall throughput rate of the library, consider enabling the BYDEVICES algorithm described in 10.4.2, “BYDEVICES allocation” on page 677.
Remember: Exceptions to this can also be caused by z/OS JCL backward referencing specifications (UNIT=REF and UNIT=AFF).
With z/OS V1R11 and later, as well as z/OS V1R8 through V1R10 with APAR OA26414 installed, it is possible to change the selection algorithm to BYDEVICES. The algorithm EQUAL, which is the default algorithm used by z/OS, can work well if the libraries (composite libraries) under consideration have an equal number of online devices and the cluster behavior above is understood.
The non-specific (scratch) allocation distribution is shown in Figure 10-11.
Figure 10-11 ALLOC EQUAL scratch allocations
For specific allocations (DAA DISABLED in this scenario), it is first determined which of the composite libraries, GRID1 or GRID2, has the requested logical volume. That grid is selected and the allocation can go to any of the clusters in the grid. If it is assumed that the logical volumes were created with the EQUAL allocation setting (the default), it can be expected that specific device allocation to these volumes will be distributed equally between the two grids. However, how well the allocations are spread across the clusters depends on the order in which the library port IDs were initialized (as discussed earlier) and whether this order was randomized across the clusters.
In a TS7740 Virtualization Engine multicluster grid configuration, only the original copy of the volume will stay in cache, normally in the mounting cluster’s TVC for a Copy Consistency Point setting of [R,R,R]. The copies of the logical volume in the other clusters will be managed as a TVC Preference Level 0 (PG0 - remove from cache first) unless a Storage Class specifies Preference Level 1 (PG1 - stay in cache) for these volumes.
A number of possibilities can influence the cache placement:
You can define a Storage Class for the volume with Preference Level 0 (PG0). The logical volume will not stay in the I/O TVC cluster.
You can set the CACHE COPYFSC option, with a LIBRARY REQUEST,GRID[1]/[2],SETTING,CACHE,COPYFSC,ENABLE command. When the ENABLE keyword is specified, the logical volumes copied into the cache from a peer TS7700 cluster are managed using the actions defined for the Storage Class construct associated with the volume as defined at the TS7740 cluster receiving the copy. Therefore, a copy of the logical volume will also stay in cache in each non-I/O TVC cluster where a Storage Class is defined as Preference Level 1 (PG1). However, because the TS7720 is used as a deep cache, there are no obvious reasons to do so.
In the hybrid multicluster grid configuration used in the example, there are two cache allocation schemes, depending on the I/O TVC cluster selected when creating the logical volume. Assume a Storage Class setting of Preference Level 1 (PG1) in the TS7740 Cluster 1 and Cluster 2.
If the mounting cluster for the non-specific request is the TS7720 Cluster 0, only the copy in that cluster stays. The copies in the TS7740 Cluster 1 and Cluster 2 will be managed as Preference Level 0 (PG0) and will be removed from cache after placement of the logical volume on a stacked physical volume. If a later specific request for that volume is directed to a virtual device in one of the TS7740s, a cross-cluster mount from Cluster 1 or Cluster 2 occurs to Cluster 0’s cache.
If the mounting cluster for the non-specific request is the TS7740 Cluster 1 or Cluster 2, not only the copy in that cluster stays, but also the copy in the TS7720 Cluster 0. Only the copy in the other TS7740 cluster will be managed as Preference Level 0 (PG0) and will be removed from cache after placement of the logical volume on a stacked physical volume. Cache preferencing is not valid for the TS7720 cluster. A later specific request for that logical volume creates only a cross-cluster mount if the mount point is the vNode of the TS7740 cluster not used at data creation of that volume.
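The two cache allocation schemes above can be summarized in a few lines. This is a simplified model under the assumptions stated in the text (Storage Class PG1 in the TS7740 clusters, CACHE COPYFSC not enabled, no removal policy acting on the TS7720); the function names and the dictionary describing the grid are hypothetical.

```python
def retained_in_cache(cluster_models, mounting_cluster):
    """Return the clusters whose copy is expected to stay in cache.

    The mounting cluster keeps its copy, and every TS7720 keeps its copy
    because cache preferencing does not apply to it. Copies in the other
    TS7740 clusters are managed as PG0 and leave the cache after premigration.
    """
    retained = {mounting_cluster}
    retained.update(c for c, model in cluster_models.items()
                    if model == "TS7720")
    return retained


def needs_cross_cluster_mount(mount_point, retained):
    """A later specific mount is cross-cluster if its mount point holds no cached copy."""
    return mount_point not in retained


# Hybrid grid from the example: Cluster 0 is a TS7720, Clusters 1 and 2 are TS7740s.
GRID = {0: "TS7720", 1: "TS7740", 2: "TS7740"}

print(retained_in_cache(GRID, mounting_cluster=0))
print(retained_in_cache(GRID, mounting_cluster=1))
```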
With the EQUAL allocation algorithm used for specific mount requests, there will always be cross-cluster mounts when the cluster where the device is allocated is not the cluster where the data resides. Cache placement can limit the number of cross-cluster mounts but cannot avoid them. Cross-cluster mounts over the extended fabric are likely not acceptable, so vary the devices of Cluster 2 offline.
10.4.2 BYDEVICES allocation
The alternative algorithm BYDEVICES will randomize scratch allocations across all devices. For example, if two libraries are eligible for a scratch allocation and each library has 128 devices, over time each library will receive approximately half of the scratch allocations, similar to the EQUAL algorithm. Again, in terms of the TS7700 Virtualization Engine, “library” refers to a composite library because MVS allocation has no knowledge of the underlying clusters (distributed libraries). However, if one of the libraries has 128 devices and the other library has 256 devices, over time, the library that has 128 devices will receive approximately 1/3 of the scratch allocations and the library that has 256 devices will receive approximately 2/3 of the scratch allocations. This differs completely from the default algorithm EQUAL, which does not take the number of online devices in a library into consideration.
Clarification: With BYDEVICES, the scratch allocation will randomize across all devices in the libraries and will be influenced by the number of online devices.
With z/OS V1R11 and later, as well as z/OS V1R8 through V1R10 with APAR OA26414 installed, it is possible to influence the selection algorithm. The BYDEVICES algorithm can be enabled through the ALLOCxx PARMLIB member by using the SYSTEM TAPELIB_PREF(BYDEVICES) parameter or it can be enabled dynamically through the SETALLOC operator command by issuing SETALLOC SYSTEM,TAPELIB_PREF=BYDEVICES.
The alternate BYDEVICES algorithm can be replaced by the default EQUAL algorithm by specifying EQUAL through the SETALLOC command or the ALLOCxx PARMLIB member in a similar manner. Before enabling the new load balancing support, care must be taken to ensure that the desired results will be achieved. This is particularly important if the libraries are being shared across multiple systems and the systems are at different levels of support, or if manual actions have already been taken to account for the behavior of algorithms used in previous releases.
Restriction: The SETALLOC operator command support is available only in z/OS V1R11 or later releases. In earlier z/OS releases, BYDEVICES must be enabled through the ALLOCxx PARMLIB member.
Let us assume now that GRID1 has a total of 60 virtual devices online and GRID2 has 40 virtual devices online. For each grid, the distribution of online virtual drives is 50% for Cluster 0, 25% for Cluster 1, and 25% for Cluster 2. The expected distribution of the scratch allocations will be as shown in Figure 10-12.
Figure 10-12 ALLOC BYDEVICES scratch allocations
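The expected shares under BYDEVICES follow directly from the device counts, as the following arithmetic sketch shows. It assumes only what the text states, namely that every online device is equally likely to be selected; the function and variable names are hypothetical.

```python
from fractions import Fraction


def bydevices_shares(online_devices):
    """Expected share of scratch allocations, proportional to online devices."""
    total = sum(online_devices.values())
    return {name: Fraction(count, total)
            for name, count in online_devices.items()}


# Library level: 128 devices versus 256 devices.
library_shares = bydevices_shares({"LIB_A": 128, "LIB_B": 256})
print(library_shares)  # LIB_A gets 1/3, LIB_B gets 2/3

# Cluster level: GRID1 has 60 online devices, GRID2 has 40, and each grid
# splits its devices 50% / 25% / 25% over Cluster 0, Cluster 1, and Cluster 2.
grids = {"GRID1": 60, "GRID2": 40}
split = {"Cluster 0": Fraction(1, 2),
         "Cluster 1": Fraction(1, 4),
         "Cluster 2": Fraction(1, 4)}
devices = {(grid, cluster): count * part
           for grid, count in grids.items()
           for cluster, part in split.items()}
cluster_shares = bydevices_shares(devices)

for (grid, cluster), share in cluster_shares.items():
    print(f"{grid} {cluster}: {float(share):.0%}")
```

With these numbers, GRID1 receives 30%, 15%, and 15% of the scratch allocations on Cluster 0, Cluster 1, and Cluster 2, and GRID2 receives 20%, 10%, and 10%.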
As stated in 10.4.1, “EQUAL allocation” on page 675, DAA is ENABLED by default and was DISABLED by using the LIBRARY REQUEST command. Furthermore, none of the TS7700 Virtualization Engine override settings are activated.
For specific allocations (DAA DISABLED in this scenario), it is first determined which one of the composite libraries, GRID1 or GRID2, has the requested logical volume. That grid is selected, and the allocations can go to any cluster in the grid and are proportionately distributed based on the number of online devices in each cluster.
The logical volume cache placement possibilities and the two allocation schemes, both described in 10.4.1, “EQUAL allocation” on page 675, are also applicable for the BYDEVICES allocation.
With the BYDEVICES allocation algorithm used for specific mount requests, there will always be cross-cluster mounts when the cluster where the device is allocated is not the cluster where the data resides. Cache placement can limit the number of cross-cluster mounts but cannot avoid them. Cross-cluster mounts over the extended fabric are likely not acceptable, so vary the devices of Cluster 2 offline.
10.4.3 Allocation and Copy Consistency Point setting
By defining the Copy Consistency Point, you control if and how a given volume will be placed in a determined cluster of the grid. If you plan to use the TS7720 Cluster 0 as a deep cache, you probably prefer to define the Management Class Construct as [R,D,D]. By defining this, Cluster 0 will be the primary placeholder of the data. At job completion time, only this cluster will have a valid copy of the data. The other clusters will create deferred copies of that logical volume afterward.
For more information about Copy Consistency Points, see the “IBM TS7700 Best practices - Synchronous Mode Copy” and “IBM TS7700 Best practices - Copy Consistency Point” white papers.
It is further assumed that the allocation characteristics apply as described in 10.4.2, “BYDEVICES allocation” on page 677. Both DAA and SAA are DISABLED in this scenario, too, and none of the TS7700 Virtualization Engine override settings are used.
For non-specific (scratch) allocations, the BYDEVICES algorithm will randomize across all devices, resulting in allocations on all three clusters of each grid. I/O TVC selection will subsequently assign the TVC of Cluster 0 as the I/O TVC due to the Copy Consistency Point setting. There are many factors that might influence this selection, as explained in 2.2.2, “Tape volume cache” on page 27, but normally the cluster with a Copy Consistency Point of R(un) will get preference over other clusters. As a consequence, the TVC of Cluster 0 is selected as the I/O TVC and cross-cluster mounts will be issued from both Cluster 1 and Cluster 2.
By activating the override setting “Prefer Local Cache for Fast Ready Mount Requests” in both clusters in the Disaster Site, cross-cluster mounts are avoided but the copy to Cluster 0 is made before the job ends, caused by the R(un) Copy Consistency Point setting for this cluster. By further defining a family for the Production Site clusters, Cluster 1 will retrieve its copy from Cluster 0 in the Production Site location, therefore avoiding using the remote links between the locations.
The method most commonly used today to prevent device allocations at the Disaster Site is to vary all of the remote virtual devices offline. The disadvantage is that, if a cluster in the Production Site is lost, an operator must manually vary online the virtual devices of Cluster 2 of the grid with the failing cluster.
With the TS7700 Virtualization Engine R2.0, an alternative solution is to exploit scratch allocation assistance (SAA), which is described in 10.4.5, “Allocation and scratch allocation assistance” on page 683.
For specific allocations, the algorithm described in 10.4.2, “BYDEVICES allocation” on page 677 applies when DAA is disabled. It is first determined which of the composite libraries, GRID1 or GRID2, has the requested logical volume. That grid is selected and the allocation over the clusters is subsequently randomized. It can be assumed that, if the requested logical volumes were earlier created with the BYDEVICES allocation scenario, these logical volumes are spread over the two grids and allocation distribution within the grid over the three clusters has been determined by the number of the online devices in each of the clusters.
Cluster 0 is likely to have a valid copy of the logical volume in the cache due to the Copy Consistency Point setting of [R,D,D]. If the vNodes of Cluster 1 and Cluster 2 are selected as mount points, it results in cross-cluster mounts. It might happen that this volume has been removed by a policy in place for TS7720 Cluster 0, resulting in the mount point TVC as the I/O TVC.
Activating, in the TS7700 Virtualization Engine, the override “Force Local TVC to have a copy of the data” will first result in a recall of the virtual volume from a stacked volume. If there is no valid copy in the cluster or if the recall fails, a copy will be retrieved from one of the other clusters before the mount completes. Activating the override setting “Prefer Local Cache for non-Fast Ready Mount Requests” recalls a logical volume from tape instead of using the grid links for retrieving the data of the logical volume from Cluster 0. This might result in longer mount times.
With the TS7700 Virtualization Engine, an alternative solution is to exploit device allocation assistance (DAA), which is described in 10.4.4, “Allocation and device allocation assistance” on page 681. DAA is enabled by default.
Figure 10-13 shows the allocation results of specific and non-specific allocations when the devices of the remote clusters in the Disaster Site are not online. Allocation BYDEVICES is used. GRID1 has a total of 60 devices online and GRID2 has 40 devices online. For each grid, the distribution of online devices is 75% for Cluster 0 and 25% for Cluster 1. Cross-cluster mounts might apply for the specific allocations in Cluster 1 because it is likely that only the TS7720 Cluster 0 will have a valid copy in cache. The red arrows show the data flow as a result of these specific allocations.
Figure 10-13 Allocation and Copy Consistency Points set at R,D,D
10.4.4 Allocation and device allocation assistance
Device allocation assistance (DAA) allows the host to query the TS7700 Virtualization Engine to determine which clusters should be preferred for a private (specific) mount request before the actual mount is requested. DAA returns to the host a ranked list of clusters (the preferred cluster is listed first) where the mount should be executed. If DAA is enabled, it is enabled for the composite library, and it will be exploited by all z/OS JES2 LPARs having the proper level of supporting software.
The selection algorithm orders the clusters first by those having the volume already in cache, then by those having a valid copy on tape, and then by those without a valid copy. Subsequently, host processing will attempt to allocate a device from the first cluster returned in the list. If an online device is not available within that cluster, it will move to the next cluster in the list and try again until a device is chosen. This allows the host to direct the mount request to the cluster that will result in the fastest mount, typically the cluster that has the logical volume resident in cache. If the mount is directed to a cluster without a valid copy, a cross-cluster mount will result. Thus, in special cases, even if DAA is enabled, cross-cluster mounts and recalls can still occur.
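The ordering and fallback described above can be sketched as follows. This models only the three ranking tiers and the host's walk through the returned list as stated in the text; the names are hypothetical, and the real ranking performed by the TS7700 can take additional factors into account.

```python
from enum import IntEnum


class CopyState(IntEnum):
    """Ranking tiers from the text; a lower value is preferred."""
    IN_CACHE = 0
    ON_TAPE = 1
    NO_VALID_COPY = 2


def rank_clusters(copy_state):
    """Order clusters: volume in cache, then valid copy on tape, then no copy."""
    return sorted(copy_state, key=lambda cluster: copy_state[cluster])


def choose_cluster(ranked, online_devices):
    """Host side: take the first cluster in the list that has an online device."""
    for cluster in ranked:
        if online_devices.get(cluster, 0) > 0:
            return cluster
    return None  # no eligible device in any listed cluster


state = {
    "Cluster 1": CopyState.ON_TAPE,
    "Cluster 0": CopyState.IN_CACHE,
    "Cluster 2": CopyState.NO_VALID_COPY,
}
ranked = rank_clusters(state)

# Cluster 0 is preferred, but it has no online devices here, so the host
# falls back to the next cluster in the list.
print(choose_cluster(ranked, {"Cluster 0": 0, "Cluster 1": 16}))
```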
If the default allocation algorithm EQUAL is used, it supports an ordered list for the first seven library port IDs returned in the list. After that, if an eligible device is not found, all of the remaining library port IDs are considered equal. The alternate allocation algorithm BYDEVICES removes the ordered library port ID limitation. With the TS7700 Virtualization Engine, install the additional APAR OA30718 before enabling the new BYDEVICES algorithm. Without this APAR, the ordered library port ID list might not be honored correctly, causing specific allocations to appear randomized.
In the scenario that is described in 10.4.3, “Allocation and Copy Consistency Point setting” on page 679, if you enable DAA (this is the default) by issuing the command LIBRARY REQUEST,GRID[1]/[2],SETTING,DEVALLOC,PRIVATE,ENABLE, it will influence the specific requests in the following manner. The Copy Consistency Point is defined as [R,D,D]. It is assumed that there are no mount points in Cluster 2. It is further assumed that the data is not in the cache of the TS7740 Virtualization Engine Cluster 1 anymore because this data is managed as TVC Preference Level 0 (PG0), by default. It is first determined which of the composite libraries, GRID1 or GRID2, has the requested logical volume. That grid is selected and the allocation over the clusters is subsequently determined by DAA. The result is that all allocations will select the TS7720 Cluster 0 as the preferred cluster.
Remember that you can influence the placement in cache by setting the CACHE COPYFSC option with the LIBRARY REQUEST,GRID[1]/[2],SETTING,CACHE,COPYFSC,ENABLE command. When the ENABLE keyword is specified, the logical volumes copied into the cache from a peer TS7700 cluster are managed using the actions defined for the Storage Class construct associated with the volume as defined at the TS7740 cluster receiving the copy. Therefore, a copy of the logical volume will also stay in cache in each non-I/O TVC cluster where a Storage Class is defined as Preference Level 1 (PG1). However, because the TS7720 is used as a deep cache, there are no obvious reasons to do so.
There are two major reasons why Cluster 0 might not be selected:
No online devices are available in Cluster 0, but online devices are available in Cluster 1.
The defined removal policies in the TS7720 caused Cluster 0 to not have a valid copy of the logical volume anymore.
In both situations, DAA will select the TS7740 Virtualization Engine Cluster 1 as the preferred cluster:
When the TS7740 Cluster 1 is selected due to lack of online virtual devices on Cluster 0, cross-cluster mounts might happen unless the TS7700 Virtualization Engine override settings, as described in 10.4.3, “Allocation and Copy Consistency Point setting” on page 679, are preventing this from happening.
When the TS7740 Cluster 1 is selected because the logical volume is not in the TS7720 Cluster 0 cache anymore, its cache is selected for the I/O TVC and, because the Copy Consistency Point setting is [R,D,D], a copy to the TS7720 Cluster 0 will be made as part of successful Rewind Unload processing.
Even when DAA is enabled, there might be specific mounts for which the device affinity call is not made. For example, DFSMShsm, when appending a volume, will go to allocation requiring that a scratch volume be mounted. Then, when a device is allocated and a volume is to be mounted, it will select from the list of hierarchical storage management (HSM)-owned volumes. In this case, because the allocation started as a scratch request, the device affinity call is not made for this specific mount. The MARKFULL option can be specified in DFSMShsm to mark migration and backup tapes that are partially filled during tape output processing as full. This forces a scratch tape to be selected the next time that the same function begins.
Figure 10-14 on page 683 shows the allocation result of specific allocations. The devices of the remote clusters in the Disaster Site are not online. GRID1 has in total 60% of specific logical volumes and GRID2 has 40% of the specific logical volumes. This was the result of earlier BYDEVICES allocations when the logical volumes were created. The expected distribution of the specific allocations will be as shown. Cross-cluster mounts might apply in situations where DAA has selected the vNode of Cluster 1 as mount point. The red arrows show the data flow for both the creation of the copy of the data for scratch allocations and for specific allocations.
Figure 10-14 Allocation and DAA
With DAA, you can vary the devices in the Disaster Site Cluster 2 online without changing the allocation preference for the TS7720 cache as long as the logical volumes exist in this cluster and as long as this cluster is available. If these conditions are not met, DAA will manage the local Cluster 1 and the remote Cluster 2 as equal and cross-cluster mounts over the extended fabric will be issued in Cluster 2. A new copy of the logical volume will be created due to the Management Class setting [R] for Cluster 0. This is likely not an acceptable scenario and so, even with DAA ENABLED, vary the devices of Cluster 2 offline.
If you plan to have an alternate Management Class setup for the Disaster Site (perhaps for the Disaster Test LPARs), you must carefully plan the Management Class settings, the device ranges that must be online, and whether DAA will be enabled. You will probably read production data and create test data using a separate category code. If you do not want the grid links overloaded with test data, vary the devices of Cluster 0 and Cluster 1 offline on the disaster recovery (DR) host only and activate the TS7700 Virtualization Engine Override Setting “Force Local TVC” to have a copy of the data. A specific volume request enforces a mount in Cluster 2 even if there is a copy in the deep cache of the TS7720 Cluster 0.
10.4.5 Allocation and scratch allocation assistance
The scratch allocation assistance (SAA) function can be used when there is a need for a method to have z/OS JES2 allocate to specific clusters (candidate clusters) for a given workload. For example, DFSMShsm Migration Level 2 (ML2) migration can be directed to a TS7720 Virtualization Engine cluster with its deep cache while archive workload needs to be directed to a TS7740 Virtualization Engine cluster within the same grid configuration. SAA is an extension of the DAA function for scratch mount requests. SAA filters the list of clusters in a grid to return to the host a smaller list of candidate clusters specifically designated as scratch mount candidates. By identifying a subset of clusters in the grid as sole candidates for scratch mounts, SAA optimizes scratch mounts to a TS7700 grid.
When a composite library supports/enables the SAA function, the host will issue an SAA handshake to all SAA-enabled composite libraries and provide the Management Class that will be used for the upcoming scratch mount. A cluster is designated as a candidate for scratch mounts by using the Scratch Mount Candidate option on the Management Class construct, which is accessible from the TS7700 Management Interface, as shown in Figure 10-15. By default, all clusters are considered candidates.
Figure 10-15 Scratch Mount Candidate definition
The targeted composite library will use the provided Management Class definition and the availability of the clusters within the same composite library to filter down to a single list of candidate clusters. Clusters that are unavailable or in service are excluded from the list. If the resulting list has zero clusters present, the function will then view all clusters as candidates. In addition, if the filtered list returns clusters that have no devices configured within z/OS, all clusters in the grid become candidates. The candidate list is not ordered, meaning that all candidate clusters are viewed as equals and all clusters excluded from the list are not candidates.
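The filtering rules above can be sketched as follows. This is a minimal sketch, not the TS7700 implementation: the function and parameter names are hypothetical, and it assumes that "clusters that have no devices configured within z/OS" means that none of the filtered clusters has devices configured to the host.

```python
def saa_candidate_clusters(all_clusters, mc_candidates, unavailable, configured_in_zos):
    """Return the unordered set of scratch mount candidate clusters.

    all_clusters:      every cluster id in the composite library
    mc_candidates:     clusters flagged as Scratch Mount Candidate in the Management Class
    unavailable:       clusters that are unavailable or in service
    configured_in_zos: clusters that have devices configured to the host
    """
    everyone = set(all_clusters)
    # Keep only flagged clusters that are currently available.
    candidates = (everyone & set(mc_candidates)) - set(unavailable)

    if not candidates:
        # An empty list means that all clusters are viewed as candidates.
        return everyone
    if not (candidates & set(configured_in_zos)):
        # Assumption: no filtered cluster has devices configured within z/OS.
        return everyone
    return candidates
```

The result is a set because the candidate list is not ordered: all candidate clusters are viewed as equals.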
Because this function introduces overhead into the z/OS scratch mount path, a new LIBRARY REQUEST option is introduced to globally enable or disable the function across the entire multicluster grid. SAA is disabled by default. When this option is enabled, the z/OS JES2 software obtains the candidate list of mount clusters from a given composite library. Use the LIBRARY REQUEST,GRID[1]/[2],SETTING,DEVALLOC,SCRATCH,ENABLE command to enable SAA. All clusters in the multicluster grid must be at the R2.0 level before SAA is operational. The supporting z/OS APAR OA32957 is required to use SAA in a z/OS JES2 environment. z/OS environments with earlier code can coexist, but they continue to function in the traditional way with respect to scratch allocations.
Assume that there are two main workloads. The application workload consists of logical volumes that are created and subsequently retrieved on a regular, daily, weekly, or monthly basis. This workload can best be placed in the TS7720 deep cache. The backup workload is normally never retrieved and can best be placed directly in the TS7740 Cluster 1. SAA will help to direct the mount point to the most efficient cluster for the workload:
The application workload can best be set up in the following manner. In the Management Class construct, the Management Class is defined with a Copy Consistency Point of [R,D,D]. Cluster 0 is selected in all clusters as Scratch Mount Candidate. In Cluster 1, the Storage Class can best be set as TVC Preference Level 1. This is advised because in cases where Cluster 0 is not available or no online devices are available in that cluster, Cluster 1 can be activated as the mount point. Cluster 2 can be set to Preference Level 0. You can control the placement in cache per cluster by setting the SETTING CACHE COPYFSC option. When the ENABLE keyword is specified, the logical volumes copied into the cache from a peer TS7700 cluster are managed using the actions defined for the Storage Class construct associated with the volume as defined at the TS7740 cluster receiving the copy. The Storage Class in Cluster 0 needs to have a Volume Copy Retention Group of Prefer Keep. This setting still allows logical volumes to be removed from the TS7720 deep cache if additional space is needed.
The backup workload can best be set up in the following manner. In the Management Class construct, the Management Class is defined with a Copy Consistency Point of [D,R,D] or [N,R,D]. Cluster 1 is selected in all clusters as Scratch Mount Candidate. In Cluster 1 and Cluster 2, the Storage Class can best be set as TVC Preference Level 0. There is no need to keep the data in cache. The Storage Class in Cluster 0 can have a Volume Copy Retention Group of Prefer Remove. If Cluster 0 is activated as mount point because of the unavailability of Cluster 1 or because there are no online devices in that cluster, the logical volumes with this Management Class can be removed first when cache removal policies in the TS7720 require the removal of volumes from cache.
With these definitions, the scratch allocations for the application workload will be directed to TS7720 Cluster 0 and the scratch allocations for the Backup workload are directed to TS7740 Cluster 1. The devices of the remote clusters in the Disaster Site are not online. Allocation “BYDEVICES” is used. GRID1 has in total 60 devices online and GRID2 has 40 devices online. For each grid, the distribution of online devices is now not determined within the grid by the number of online devices, as in the scenario BYDEVICES, but is determined by the SAA setting of the Management Class.
The expected distribution of the scratch allocations is shown in Figure 10-16.
Figure 10-16 Allocation and SAA
Clusters not included in the list are never used for scratch mounts unless those clusters are the only clusters known to be available and configured to the host. If all candidate clusters have either all their devices varied offline to the host or have too few devices varied online, z/OS will not revert to devices within non-candidate clusters. Instead, the host will go into allocation recovery. In allocation recovery, the existing z/OS allocation options for device allocation recovery (WTOR | WAITHOLD | WAITNOH | CANCEL) are used.
Any time that a service outage of candidate clusters is expected, the SAA function needs to be disabled during the entire outage by using the LIBRARY REQUEST command. If left enabled, the devices that are varied offline can result in zero candidate devices, causing z/OS to enter the allocation recovery mode. After the cluster or clusters are again available and their devices are varied back online to the host, SAA can be enabled again.
If you vary offline too many devices within the candidate cluster list, z/OS might have too few devices to contain all concurrent scratch allocations. When many devices are taken offline, first disable SAA by using the LIBRARY REQUEST command and then re-enable SAA after they have been varied back on.
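The order of operations around a service outage can be sketched as a short checklist generator. This is an illustrative sketch only: the helper names are hypothetical, the ENABLE form of the command is taken from the text, and the DISABLE keyword is assumed to be its counterpart. The device vary steps are left as plain descriptions because the device ranges depend on your configuration.

```python
def library_request(grid, *keywords):
    """Build the text of a LIBRARY REQUEST command for a composite library."""
    return ",".join(["LIBRARY REQUEST", grid, *keywords])


def saa_outage_steps(grid):
    """Return the ordered steps for taking candidate cluster devices offline.

    The DISABLE keyword is an assumption; only ENABLE appears in the text.
    """
    return [
        library_request(grid, "SETTING", "DEVALLOC", "SCRATCH", "DISABLE"),
        "Vary the devices of the affected candidate clusters offline",
        "Perform the service action",
        "Vary the devices back online to the host",
        library_request(grid, "SETTING", "DEVALLOC", "SCRATCH", "ENABLE"),
    ]


if __name__ == "__main__":
    for number, step in enumerate(saa_outage_steps("GRID1"), start=1):
        print(f"{number}. {step}")
```

Disabling SAA first avoids the situation where zero candidate devices remain online and z/OS enters allocation recovery.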
If you plan to have an alternate Management Class setup for the Disaster Site (perhaps for the Disaster Test LPARs), carefully plan the Management Class settings, the device ranges that need to be online, and whether SAA will be used. Read production data and create test data using a separate category code. If you use the same Management Class as used in the Production LPAR and if you define in Cluster 2 the Management Class with SAA for Cluster 2 and not for Cluster 0 or 1 (as determined by the type of workload), it might happen that Cluster 2 will be selected for allocations in the Production LPARs. SAA randomly selects one of the clusters for determining the scratch mount candidate clusters in the Management Class constructs. Therefore, the devices in Cluster 2 must not be made available to the Production LPARs and the devices in the clusters in the Production Site must not be made available in the Disaster Site.
Furthermore, the Copy Consistency Point for the Management Classes in the Disaster Site can be defined as [D,D,R] or even [N,N,R] if it is test only. If it is kept equal with the setting in the Production Site, with an [R] for Cluster 0 or Cluster 1, cross-cluster mounts might occur. If you do not want the grid links overloaded with test data, update the Copy Consistency Point setting or use the TS7700 Virtualization Engine override setting “Prefer Local Cache for Fast Ready Mount Requests” in Cluster 2 in the Disaster Site. Cross-cluster mounts are then avoided, but the copy to Cluster 0 or 1 is still made before the job ends because of the Production R(un) Copy Consistency Point setting for these clusters. By further defining a family for the Production Site clusters, the clusters will source their copies from the other clusters in the Production Site location, therefore optimizing the usage of the remote links between the locations.
10.5 Data movement through Tape Volume Cache
The data flow through the TS7700 cache in several configurations is defined to help you understand the data traffic and the pieces of data movement that must share the resources of the TS7700.
Clarification: Reclaim data in a TS7740 is transferred from one tape drive to another, not passed through the cache.
The TS7720 data flow is the basic TS7700 configuration in terms of cache data flow.
For both the TS7720 and TS7740, the following data flows through the subsystem:
Uncompressed host data is compressed by the host bus adapters (HBAs) and the compressed data is written to cache.
Compressed data is read from the cache and uncompressed by the HBA, as shown in Figure 10-17.
Figure 10-17 TS7720 stand-alone data flow
For more information about cache management, see IBM Virtualization Engine TS7700 Series Best Practices: Cache Management in the TS7720 V1.5 at the Techdocs web site:
Compare this with a stand-alone TS7740, as shown in Figure 10-18 on page 688:
If a read is requested and the logical volume does not exist in the cache, a stacked physical tape is mounted and the logical volume is read into cache. The host then reads the logical volume from the TS7740 cache.
Host data will be written from cache to the physical stacked volumes in a process called premigrate.
The back-end drives account for the difference. In a TS7740, the tape drives are used to recall data from the stacked volumes and to premigrate data to physical tape.
Figure 10-18 TS7740 stand-alone data flow
10.5.1 Cache data flow in a TS7700 two-cluster grid
Figure 10-19 shows a TS7720 two-cluster grid being used in balanced mode. Host accesses are available to both clusters, and you operate with logical drives online in both clusters. For a scratch mount, host allocation can select a virtual device from either cluster. The figure shows data flow from a cache perspective in Cluster 0.
Figure 10-19 TS7720 Two-cluster grid showing balanced mode data flow
For both the TS7720 and TS7740, the following data is moved through the subsystem:
Local write with no remote copy, that is, a Copy Consistency Point of run-none [R,N] includes writing the compressed host data to cache.
Local write with remote copy (Copy Consistency Point of [D,D], [R,R], or [S,S]) includes writing the compressed host data to cache and to the grid:
 – For a Copy Consistency Point of [S,S], the logical volume data is written to both clusters at the same time. When a tape sync event occurs, either explicit or implicit, the sync event is not complete until all data written up to that point has been written to non-volatile storage in both clusters.
 – For a Copy Consistency Point of [R,R], the copy is immediate and must complete before Rewind Unload (RUN) is complete. Copies are placed in the immediate copy queue.
 – For a Copy Consistency Point of [D,D], the copy is deferred where the completion of the RUN is not tied to the completion of the copy operation. Copies are placed in the Deferred Copy Queue.
Remote write with no local copy (Copy Consistency Point of [N,R]) includes writing compressed host data to the grid.
Local read with a local cache hit. Here, the compressed host data is read from the local cache.
Local read with a remote cache hit. Here, the compressed host data is read from the remote cache through the grid link.
Synchronous, immediate, and deferred copies from the remote cluster. Here, compressed host data is received on the grid link and copied into the local cache.
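The meaning of the Copy Consistency Point letters used in this list can be summarized in a short sketch. The function names are hypothetical and the code only restates the rules given above; it does not model TVC selection or the grid itself.

```python
# Meaning of each Copy Consistency Point letter, as described in the text.
COPY_MODES = {
    "S": "synchronous",  # written to both clusters at the same time
    "R": "immediate",    # copy must complete before Rewind Unload (RUN) completes
    "D": "deferred",     # RUN completion is not tied to the copy
    "N": "none",         # no copy on this cluster
}

# Queue that holds the copy job for each mode, where the text names one.
COPY_QUEUES = {
    "immediate": "immediate copy queue",
    "deferred": "Deferred Copy Queue",
}


def parse_ccp(ccp):
    """Parse a string such as '[R,D]' into one mode name per cluster."""
    letters = [part.strip().upper() for part in ccp.strip().strip("[]").split(",")]
    for letter in letters:
        if letter not in COPY_MODES:
            raise ValueError(f"unknown Copy Consistency Point letter: {letter!r}")
    return [COPY_MODES[letter] for letter in letters]


def must_complete_before_run(ccp):
    """Return the cluster indexes whose data must be in place before RUN completes."""
    modes = parse_ccp(ccp)
    return [i for i, mode in enumerate(modes) if mode in ("synchronous", "immediate")]
```

For a Copy Consistency Point of [R,D], only Cluster 0 is tied to Rewind Unload completion, and the copy to Cluster 1 is placed in the Deferred Copy Queue.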
The TS7740 has the back-end tape drives for recalls and premigrates.
Figure 10-20 shows a similar configuration with TS7740, and shows data flow from the cache perspective in Cluster 0.
Figure 10-20 TS7740 two-cluster grid with a balanced mode data flow
Consider the following information:
A write with copy or no copy to another cluster includes the premigrate process.
A read with local cache miss has one of the following results:
 – A cross-cluster mount without recall
 – A recall into local cache from a local stacked volume
 – A cross-cluster mount requiring recall from a remote stacked volume
Host data is written from cache to the physical stacked volumes in a premigrate process, including data written as a result of a local mount, volumes copied from other clusters, and a cross-cluster mount to this cluster for write (not shown).
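The read outcomes listed above can be expressed as a small classifier. This is a sketch under stated assumptions: the function name and arguments are hypothetical, and the cluster chosen as the I/O TVC is taken as an input, because the selection of that cluster depends on factors not modeled here.

```python
def classify_read(mount_cluster, tvc_cluster, volume_in_tvc_cache):
    """Classify a TS7740 read by where the data is found.

    mount_cluster:       cluster whose virtual device was allocated
    tvc_cluster:         cluster whose cache is used as the I/O TVC
    volume_in_tvc_cache: True if the logical volume is already in that cache
    """
    local = mount_cluster == tvc_cluster
    if local and volume_in_tvc_cache:
        return "local cache hit"
    if local:
        return "recall into local cache from a local stacked volume"
    if volume_in_tvc_cache:
        return "cross-cluster mount without recall"
    return "cross-cluster mount requiring recall from a remote stacked volume"
```

Only the first outcome avoids both the grid links and the back-end drives; each of the other three adds traffic on one or both of these shared resources.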
The grid can be used in a preferred mode. Preferred mode means that only one cluster will have the logical drives varied online. Host allocation will select a virtual device only from the cluster with varied on virtual devices. Data movement through the cache in this mode is a subset of the balanced mode model.
10.5.2 Cache data flow in a TS7700 three-cluster grid
This model covers a three-cluster grid configuration where two clusters are in the production site and are used in high availability (HA) (balanced) mode. That is, both clusters in the production site operate with their logical drives online and can be selected by host allocation for mounts. The third cluster is in a remote site, kept in a vault for the DR site, with its virtual devices varied offline.
Consider data movement through the subsystem. Balanced mode means the host has virtual devices in both clusters varied online. Host allocation can select a virtual device from either cluster. For both the TS7720 and TS7740, the following data is moved through the subsystem:
Local write with no remote copy (Copy Consistency Point of [R,N,N]) includes writing the compressed host data to cache.
Local write with an HA copy (Copy Consistency Point of [R,D,N], [R,R,N], or [S,S,N]) includes writing the compressed host data to cache and to the grid:
 – For a Copy Consistency Point of [S,S,N], the logical volume data is written to both HA clusters at the same time. When a tape sync event occurs, either explicit or implicit, the sync event is not complete until all data written up to that point has been written to non-volatile storage in both clusters.
 – For a Copy Consistency Point of [R,R,N], the copy to the other HA cluster is immediate and must complete before rewind-unload (RUN) is complete. Copies are placed in the immediate copy queue.
 – For a Copy Consistency Point of [D,D,N], the copy to the other HA cluster is deferred where the completion of the RUN is not tied to the completion of the copy operation. Copies are placed in the Deferred Copy Queue.
Local write with HA and remote copy (Copy Consistency Point of [R,R,D], [R,D,D]/[D,D,D], or [S,S,D]) includes writing the compressed host data to cache and to the grid:
 – For a Copy Consistency Point of [S,S,D], the logical volume data is written to both HA clusters at the same time. When a tape sync event occurs, either explicit or implicit, the sync event is not complete until all data written up to that point has been written to non-volatile storage in both clusters. The copy to the remote cluster is deferred.
 – For a Copy Consistency Point of [R,R,D], the copy to the HA cluster is immediate. The copy to the remote cluster is deferred. The immediate copy will be sourced from the mounting cluster. The remote copy will be sourced from either of the two HA clusters. The grid link performance and other factors are used to determine from which cluster the remote cluster will source the deferred copy.
 – For a Copy Consistency Point of [R,D,D]/[D,D,D], the copies to the other HA cluster and remote cluster are deferred. Each cluster can source the copy from either of the other two clusters. Which cluster has a valid copy, the grid link performance, and other factors are used to determine from which cluster the two clusters will source the deferred copy.
Remote write with no local copy (Copy Consistency Point of [N,R,N] or [N,N,R]) includes writing compressed host data to the grid.
Local read with a local cache hit. Here, the compressed host data is read from the local cache.
Local read with a remote cache hit. Here, the compressed host data is read from one of the other two clusters’ cache through the grid link.
Immediate and deferred copies from the other HA cluster. Here, compressed host data is received on the grid link and copied into the local cache.
Figure 10-21 TS7720 three-cluster grid with HA and DR mode data flow
The TS7740 adds back-end tape drives for recalls and premigrates:
A write with no copy and a write with copy also include the premigrate process.
A read with local cache miss has one of the following results:
 – A cross-cluster mount without recall
 – A recall into local cache from a local stacked volume
 – A cross-cluster mount requiring a recall from a remote stacked volume
Host data will be written from cache to the physical stacked volumes in a premigrate process. This addition includes data written as a result of a local mount for write, volumes copied from other clusters, and a mount for write from the HA cluster (not shown).
Figure 10-22 on page 692 shows the TS7740 data movement through cache in this grid.
Figure 10-22 TS7740 three-cluster grid with HA and DR mode data flow
10.5.3 TS7700 four-cluster grid considerations
One possible configuration for a four-cluster grid is two local clusters (Cluster 0 and Cluster 1) and two remote clusters (Cluster 2 and Cluster 3). Configure the Copy Consistency Points so that data written to Cluster 0 is replicated to Cluster 2, and data written to Cluster 1 is replicated to Cluster 3. In this way, you have two two-cluster grids within the four-cluster grid. With this configuration, two copies of the data are in the grid. All data is accessible by all clusters, either within the cluster or through the grid. Therefore, all data remains available when one of the clusters is not available. See Figure 10-23 for the four-cluster grid example.
Figure 10-23 Two two-cluster grids in one
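The claim that all data remains available when one cluster is unavailable can be checked with a small sketch. The Copy Consistency Point strings below are assumptions chosen to match the description (an [R] at the mounting cluster and a deferred copy at its remote partner); the text does not state the exact values, and the names are hypothetical.

```python
# Hypothetical Copy Consistency Points, keyed by the cluster that receives the mount.
# Positions are Cluster 0, 1, 2, 3.
POLICY = {
    0: "R,N,D,N",  # data written to Cluster 0 is replicated to Cluster 2
    1: "N,R,N,D",  # data written to Cluster 1 is replicated to Cluster 3
}


def clusters_with_copy(ccp):
    """Return the set of clusters that end up holding a copy of the volume."""
    return {i for i, letter in enumerate(ccp.split(",")) if letter.strip() != "N"}


def survives_single_outage(policy, cluster_count=4):
    """True if every volume keeps at least one copy when any one cluster is down."""
    for ccp in policy.values():
        copies = clusters_with_copy(ccp)
        for failed in range(cluster_count):
            if not (copies - {failed}):
                return False
    return True
```

With these assumed settings, each volume has exactly two copies, one per site, so losing any single cluster still leaves one copy reachable through the grid.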
With DAA, which is supported by JES2 but not JES3, the host can allocate a virtual device for a private mount on the best cluster. The best cluster is typically the cluster that contains the logical volume in its cache.
Remember: The same configuration considerations apply for five-cluster grid configurations and six-cluster grid configurations. These configurations are available with a request for price quotation (RPQ).
10.5.4 TS7700 hybrid grid considerations
Hybrid grids provide many possible configurations. Six basic combinations are available (one 2-way, two 3-way, and three 4-way possibilities). The same performance considerations apply to hybrid and homogeneous grids. An interesting hybrid grid, which is illustrated in Figure 10-24, is one where two or three TS7720 clusters are attached to the production hosts and a single, perhaps remote, TS7740 cluster serves as the DR cluster. The TS7720 clusters do not replicate to each other, but all of the TS7720 clusters replicate to the single TS7740. The advantages are a large front-end cache as presented by the TS7720 clusters, and a deep back end for archiving and DR. However, the replication traffic from all of the TS7720 clusters travels across the same grid network. It is essential that adequate network bandwidth be provided to handle the traffic to the TS7740. Also, the network needs to have enough bandwidth to retrieve logical volumes that reside only in the TS7740 cluster. Figure 10-24 shows three TS7720 production clusters and one TS7740 DR cluster.
Figure 10-24 Hybrid grid (three TS7720 production clusters and one TS7740 DR cluster)
10.5.5 Cluster families and cooperative replication
When two clusters are at one site and the other two are at a remote site, and the two remote clusters need a copy of the data, cluster families make it so that only one copy of the data is sent across the long grid link. Also, when deciding where to source a volume, a cluster will give higher priority to a cluster in its family over a cluster in another family.
Family members are given higher weight when deciding which cluster to prefer for TVC selection.
Because only one copy is required to be transferred to a family, the family is consistent after the one copy is complete. Because a family member prefers to get its copy from another family member instead of getting the volume across the long grid link, the copy time is much shorter for the family member. Because each family member is pulling a copy of a separate volume, this will make a consistent copy of all volumes to the family quicker.
With cooperative replication, a family will prefer retrieving a new volume that the family does not have a copy of yet, over copying a volume within a family. When fewer than 20 new copies are to be made from other families, the family clusters will copy among themselves. This means second copies of volumes within a family are deferred in preference to new volume copies into the family. When a copy within a family has been queued for 12 hours or more, it is given equal priority with copies from other families. This prevents family copies from stagnating in the copy queue.
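The two thresholds in this paragraph can be captured in a short sketch. The function name, argument names, and return strings are hypothetical; only the values 20 and 12 come from the text, and the real copy scheduler considers more factors than are shown here.

```python
NEW_COPY_THRESHOLD = 20   # new copies still to be made from other families
STALE_COPY_HOURS = 12     # queue time after which a family copy stops being deferred


def family_copy_priority(pending_copies_from_other_families, hours_queued):
    """Return how a copy between members of the same family is treated."""
    if pending_copies_from_other_families < NEW_COPY_THRESHOLD:
        # Few new volumes are waiting, so the family clusters copy among themselves.
        return "copy within family now"
    if hours_queued >= STALE_COPY_HOURS:
        # Prevents family copies from stagnating in the copy queue.
        return "equal priority with copies from other families"
    return "deferred behind new copies into the family"
```

For example, a family copy that has waited 3 hours while 50 new volumes are pending from other families stays deferred, but the same copy is treated as equal once it has waited 12 hours.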
See the following resources for details about cluster families:
IBM Virtualization Engine TS7700 Series Best Practices -TS7700 Hybrid Grid Usage:
TS7700 Technical Update (R1.7) and Hybrid Grid Best Practices:
10.6 Monitoring TS7700 Virtualization Engine performance
The IBM Virtualization Engine TS7700 series is the latest in the line of tape virtualization products that has revolutionized the way that mainframes use their tape resources. As the capability of tape virtualization has grown, so has the need to efficiently manage the large number of logical volumes that the system supports. Internal to the TS7700 Virtualization Engine, a large amount of information is captured and maintained about the state and operational aspects of the resources within the TS7700 Virtualization Engine. The TS7700 Virtualization Engine provides a management interface based on open standards through which a storage management application can request specific information that the TS7700 Virtualization Engine maintains. The open standards are not currently supported for applications running under z/OS, so an alternative method is needed to provide the information to mainframe applications.
You can use the following interfaces, tools, and methods to monitor the TS7700 Virtualization Engine:
IBM System Storage TS3500 Tape Library Specialist (TS7740 only)
TS7700 Virtualization Engine management interface (MI)
Bulk Volume Information Retrieval function (BVIR)
The specialist and MI are web-based. With the BVIR function, various types of monitoring and performance-related information can be requested through a host logical volume in the TS7700 Virtualization Engine. Finally, the VEHSTATS tools can be used to format the BVIR responses, which are in a binary format, to create usable statistical reports.
In conjunction with the VEHSTATS data, performance evaluation tools are now available on Techdocs that quickly create performance-related charts. Performance tools are provided to analyze 24 hours' worth of 15-minute data, 7 days' worth of one-hour interval data, and 90 days' worth of daily summary data. See the following link to Techdocs:
All interfaces, tools, and methods to monitor the TS7700 Virtualization Engine are explained in detail next. An overview of these interfaces, tools, and methods is shown in Figure 10-25.
Figure 10-25 Interfaces, tools, and methods to monitor the TS7700 Virtualization Engine
10.6.1 Using the TS3500 Tape Library Specialist for monitoring
The TS3500 Tape Library Specialist, available only with the TS7740 Virtualization Engine, allows users to manage and monitor items related to the TS3500 Tape Library. Initially, the web user interface to the TS3500 Tape Library supported only a single user at any given time. Now, each Ethernet-capable frame on the TS3500 Tape Library allows five simultaneous users of the web user interface, so multiple users can access the TS3500 Tape Library Specialist interface at the same time.
Figure 10-26 on page 696 shows the TS3500 Tape Library System Summary window.
Figure 10-26 TS3500 Tape Library Specialist System Summary window
The TS3500 Tape Library Specialist session will time out after a default setting of ten minutes. This is different from the TS7700 Virtualization Engine MI. You can change the default values through the TS3500 Tape Library Specialist by selecting Manage Access > Operator Panel Security, which opens the window shown in Figure 10-27 on page 697.
Figure 10-27 TS3500 Tape Library Specialist Operator Panel Security window
Some information provided by the TS3500 Tape Library Specialist is in a display-only format and there is no option to download data. Other windows provide a link for data that is available only when downloaded to a workstation. The data, in comma-separated value (CSV) format, can be downloaded directly to a computer and then used as input for snapshot analysis for the TS3500. This information refers to the TS3500 and its physical drive usage statistics from a TS3500 standpoint only.
For more information, including how to request and use this data, see IBM TS3500 Tape Library with System z Attachment A Practical Guide to Enterprise Tape Drives and TS3500 Tape Automation, SG24-6789.
The following information is available:
Accessor Usage: Display only:
 – Activity of each Accessor and gripper
 – Travel meters of Accessors
Drive Status and Activity: Display only
Drive Statistics: Download only:
 – Last VOLSER on this drive
 – Write and Read MB per drive
 – Write and Read errors corrected per drive
 – Write and Read errors uncorrected per drive
Mount History for cartridges: Download only:
 – Last Tape Alert
 – Number of Mounts of a specific cartridge
 – Number of Write and Read retries of a specific cartridge in the life cycle
 – Number of Write and Read permanent errors of a specific cartridge in the life cycle
Fibre Port statistics: Download only
The Fibre Port statistics include fiber errors, aborts, resets, and recoveries between the TS7700 Virtualization Engine and the physical tape drives in the TS3500 Tape Library.
Restriction: This statistic does not provide information from the host to the TS7700 Virtualization Engine or from the host to the controller.
Library statistics, on an hourly basis: Download only:
 – Total Mounts
 – Total Ejects
 – Total Inserts
 – Average and Maximum amount of time that a cartridge was mounted on a drive (residency)
 – Average and Maximum amount of time that was needed to perform a single mount
 – Average and Maximum amount of time that was needed to perform an eject
These statistics can be downloaded to a workstation for more analysis. These statistics are not included in the Bulk Volume Information Retrieval (BVIR) records processed by the TS7700 Virtualization Engine.
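After the hourly library statistics are downloaded, a few lines of code are enough for a first look at the data. The sketch below is a rough illustration only: the column names `Hour` and `Total Mounts` are hypothetical placeholders, so check the header row of the file that you actually download and adjust the arguments to match.

```python
import csv
import io


def busiest_hour(csv_text, hour_col="Hour", mounts_col="Total Mounts"):
    """Return (hour label, mount count) for the row with the most mounts.

    hour_col and mounts_col are assumed column names, not the documented
    layout of the TS3500 download.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("no data rows found")
    best = max(rows, key=lambda row: int(row[mounts_col]))
    return best[hour_col], int(best[mounts_col])
```

To analyze a downloaded file, read it as text and pass the contents to the function, for example `busiest_hour(open(path, newline="").read())`, where `path` is the location of your file.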
10.6.2 Using the TS7700 Virtualization Engine management interface to monitor IBM storage products
The TS7700 Virtualization Engine management interface (MI) belongs to the family of tools used for reporting and monitoring IBM storage products. These tools do not provide reports, but they can be used for online queries about the status of the TS7700 Virtualization Engine, its components, and the distributed libraries. They also provide information about the copies that have not completed yet and the amount of data to be copied.
The TS7700 Virtualization Engine MI is based on a web server that is installed in each TS7700 Virtualization Engine. You can access this interface with any standard web browser.
The TS7700 Virtualization Engine MI is a Storage Management Initiative - Specification (SMI-S)-compliant interface that provides you with a single access point to remotely manage resources through a standard web browser. The MI is required for implementation and operational purposes. In a TS7700 Virtualization Engine configuration, two possible web interfaces are available:
The TS3500 Tape Library Specialist
The TS7700 Virtualization Engine MI
A link is available to the TS3500 Tape Library Specialist from the TS7700 Virtualization Engine MI, as shown at the lower left corner of Figure 10-28 on page 699. This link might not be available if not configured during TS7740 installation or for a TS7720 Virtualization Engine.
The Performance and Statistics windows of the TS7700 Virtualization Engine MI are described next.
Performance and statistics
This section describes how to view performance information and statistics for the TS7700 Virtualization Engine in single and multicluster grid configurations. Unless stated otherwise in the description of a window, the graphical views display snapshots of the processing activities from the last 15 minutes. You can access the following selections by navigating to the Performance & Statistics section in the TS7700 Virtualization Engine MI. The examples are taken from different cluster configurations.
The navigation pane is available on the left side of the MI, as shown in the Grid Summary window shown in Figure 10-28 on page 699.
Figure 10-28 TS7700 Virtualization Engine MI - Performance
Historical Summary
This window (Figure 10-29 on page 700) shows the various performance statistics over a period of 24 hours. Data is retrieved from the Historical Statistic Records. It presents data in averages over 15-minute periods:
Throughput from Tape
Throughput to Tape
Throughput read by Host Compressed
Throughput written by Host Compressed
Throughput in Raw GB Read/Written by host
Throughput copied in over the Grid Network
Throughput copied out over the Grid Network
GB to copy
Reclaim mounts
Maximum virtual drives mounted
Figure 10-29 TS7700 MI Historical Summary
While this is a snapshot, the performance evaluation tools on Techdocs provide a 24-hour or 90-day overview of these numbers. Review the numbers to help you with these tasks:
Identify your workload peaks and possible bottlenecks.
See trends to identify increasing workload.
Identify schedule times for reclaim.
If performance issues occur, first check how many throughput increments are enabled on the cluster. The following figures (Figure 10-30 on page 701 and Figure 10-31 on page 701) show how to determine this information.
Figure 10-30 Feature code license entry picture
Figure 10-31 shows the installed increments (Feature Code (FC) 5268). In this example, four increments are installed.
Figure 10-31 Feature code license example
Active Data Distribution
Make your selection from the Physical volumes - Active data distribution drop-down menu. This window (Figure 10-32 on page 702) shows the active pools and the corresponding data distribution (number of cartridges by occupancy percentage range).
Figure 10-32 Pool Active Data Distribution
Click a pool link. Information for the pool is displayed, as shown in Figure 10-33.
Figure 10-33 Physical volume pool active data distribution detail
Review your Active Data Distribution. A low utilization percentage results in a higher number of stacked volumes. Also, ensure that you monitor the number of empty stacked volumes to avoid an “out of stacked volumes” condition. If you have defined multiple physical pools, you might need to check this on a per pool basis, depending on your Borrow/Return policies. In this example, Pool 3 has the parameter “borrow,return” enabled.
Pending Updates window
Use this window (Figure 10-34 on page 703) to view the pending updates for the IBM Virtualization Engine TS7700 Grid. The existence of pending updates indicates that updates occurred while a cluster was offline, in service prep mode, or in service mode. Before any existing pending updates can take effect, all clusters must be online.
Figure 10-34 Grid Pending Updates window
This window provides a summary of the number of outstanding updates for each cluster in an IBM Virtualization Engine TS7700 Grid. You can also use this window to monitor the progress of pending immediate-deferred copies, which, like pending updates, normally result from changes made while a cluster is offline, in service prep mode, or in service mode.
Remember: Avoid pending immediate-deferred copies. They can result from overload or grid network problems.
Virtual Mounts window
Use this window (Figure 10-35) to view virtual mount statistics for the TS7700 Virtualization Engine. The virtual mount statistics for each cluster are displayed in two bar graphs and tables: One for the number of mounts and one for average mount time. The example in Figure 10-35 is from a TS7700 Virtualization Engine Cluster that is part of a six-cluster grid configuration.
Figure 10-35 TS7700 Virtualization Engine MI Virtual Mounts window
The “Number of logical mounts during last 15 minutes” table has the following information:
Cluster The cluster name
Fast Ready Number of logical mounts completed using the scratch (Fast Ready) method
Cache Hits Number of logical mounts completed from cache
Cache Misses Number of mount requests that were not fulfilled from cache
Total Total number of logical mounts
The “Average mount times (ms) during last 15 minutes” table has the following information:
Cluster The cluster name
Fast Ready Average mount time for scratch (Fast Ready) logical mounts
Cache Hits Average mount time for logical mounts completed from cache
Cache Misses Average mount time for requests that are not fulfilled from cache
This view shows only whether a cluster as a whole is running out of virtual drives. Depending on your environment, it does not show whether a specific LPAR or sysplex has a shortage of virtual drives. In particular, if you define virtual drives statically to an LPAR (without an allocation manager), that LPAR might not have enough drives. To ensure that a specific LPAR has enough virtual drives, analyze your environment with Tapetools MOUNTMON.
Physical Mounts window
Use this window (Figure 10-36) to view physical mount statistics for the TS7740 Virtualization Engine. The physical mount statistics for each cluster are displayed in two bar graphs and tables: One for the number of mounts by category and one for average mount time per cluster. The example in Figure 10-36 is from a TS7740 Virtualization Engine cluster that is part of a multicluster grid configuration (four-cluster grid).
Figure 10-36 TS7740 Virtualization Engine MI Physical Mounts window
The table cells show the following items:
Cluster The cluster name
Pre-migrate Number of premigrate mounts
Reclaim Number of reclaim mounts
Recall Number of recall mounts
Secure Data Erase Number of Secure Data Erase mounts
Total Total physical mounts
Mount Time Average mount time for physical mounts
Review the number of physical drives in use to help you with the following tasks:
Identify upcoming bottlenecks.
Determine whether it is appropriate to add or remove physical pools. Using a larger number of pools requires more physical drives to handle the premigration, recall, and reclaim activity.
Determine possible timelines for Copy Export operations.
Host Throughput window
You can use this window (Figure 10-37) to view host throughput statistics for the TS7700 Virtualization Engine. The information is provided in 15-second intervals, unlike the 15-minute intervals of other performance data.
Use this window to view statistics for each cluster, vNode, host adapter, and host adapter port in the grid. At the top of the window is a collapsible tree where you view statistics for a specific level of the grid and cluster. Click the grid to view information for each cluster. Click the cluster link to view information for each vNode. Click the vNode link to view information for each host adapter. Click a host adapter link to view information for each port.
The example in Figure 10-37 is from a TS7700 Virtualization Engine cluster that is part of a multicluster grid configuration (four-cluster grid).
Figure 10-37 TS7700 Virtualization Engine MI Host Throughput window
The host throughput data is displayed in two bar graphs and one table. The bar graphs are for raw data coming from the host to the HBA, and for compressed data going from the HBA to the virtual drive on the vNode.
The letter next to the table heading corresponds with the letter in the diagram above the table. Data is available for a cluster, vNode, host adapter, and host adapter port. The table cells include the following items:
Cluster The cluster or cluster component for which data is being displayed (vNode, host adapter, or host adapter port)
Compressed Read (A) Amount of data read between the virtual drive and HBA
Raw Read (B) Amount of data read between the HBA and host
Read Compression Ratio Ratio of compressed read data to raw read data
Compressed Write (D) Amount of data written from the HBA to the virtual drive
Raw Write (C) Amount of data written from the host to the HBA
Write Compression Ratio Ratio of compressed written data to raw written data
While this is a snapshot, the performance evaluation tools on Techdocs provide you with a 24-hour, 7-day, or 90-day overview of these numbers.
Review these numbers to help you with the following tasks:
Identify the compression ratio in your environment for cache and stacked volume planning.
Identify any bottlenecks in the host throughput (FC enablement).
Cache Throttling window
You can use this window (Figure 10-38 on page 707) to view cache throttling statistics for the TS7700 Virtualization Engine. The example in Figure 10-38 on page 707 is from a TS7700 Virtualization Engine cluster that is part of a multicluster grid configuration (four-cluster grid).
Figure 10-38 TS7700 Virtualization Engine MI Cache Throttling window
Cache throttling is a time interval applied to TS7700 Virtualization Engine internal functions to improve throughput performance to the host. The cache throttling statistics for each cluster that relate to copy and write are displayed both in a bar graph form and in a table. The table shows the following items:
Cluster The cluster name
Copy The amount of time inserted between internal copy operations
Write The amount of time inserted between host write operations
Cache Utilization window
You can use this window (Figure 10-39 on page 708) to view cache utilization statistics for the TS7700 Virtualization Engine. The example in Figure 10-39 on page 708 is from a TS7740 Virtualization Engine cluster that is part of a multicluster grid configuration (four-cluster grid).
Figure 10-39 TS7740 Virtualization Engine MI Cache Utilization window
The cache utilization statistics can be selected for each cluster. Various aspects of cache performance are displayed for each cluster. Select them from the “Select cache utilizations statistics” drop-down menu. The data is displayed in both bar graph and table form, and can be displayed also by preference groups 0 and 1.
The following cache utilization statistics are available:
Cache Preference Group possible values:
 – 0: Volumes in this group have a preference for removal from cache over other volumes.
 – 1: Volumes in this group have a preference to be retained in cache over other volumes.
Number of logical volumes currently in cache: The number of logical volumes present in the cache preference group.
Total amount of data currently in cache: Total amount of data present in volumes assigned to the cache preference group.
Median duration that volumes have remained in cache: Rolling average of the cache age of volumes migrated out of this cache preference group for the specified amount of time (last four hours, 48 hours, and 35 days).
Number of logical volumes migrated: Rolling average of the number of volumes migrated to this cache preference group (four hours, 48 hours, and 35 days). Bar graph is not used.
Clarification: Median Duration in Cache and Number of Logical Volumes Migrated statistics have a table column for each of the time periods mentioned in parentheses.
Review this data with the performance evaluation tool from Techdocs to identify the following conditions:
Cache shortages, especially in your TS7720
Improvement capabilities for your cache usage through the adjustment of copy policies
Grid Network Throughput window
Use this window (Figure 10-40) to view network path utilization (Grid Network Throughput) statistics for the TS7700 Virtualization Engine Cluster.
Restriction: The Grid Network Throughput option is not available in a stand-alone cluster.
This window presents information about cross-cluster data transfer rates. This selection will be present only in a multicluster grid configuration. If the TS7700 Virtualization Engine grid only has one cluster, there is no cross-cluster data transfer through the Ethernet adapters.
The example in Figure 10-40 is from a TS7700 Virtualization Engine Cluster that is part of a multicluster grid configuration (four-cluster grid).
Figure 10-40 TS7700 Virtualization Engine MI grid network throughput in a six-cluster grid
The table displays data for cross-cluster data transfer performance (MBps) during the last 15 minutes. The table cells show the following items:
Cluster The cluster name
Outbound Access Data transfer rate for host operations that move data from the specified cluster into one or more remote clusters
Inbound Access Data transfer rate for host operations that move data into the specified cluster from one or more remote clusters
Copy Outbound Data transfer rate for copy operations that pull data out of the specified cluster into one or more remote clusters
Copy Inbound Data transfer rate for copy operations that pull data into the specified cluster from one or more remote clusters
Review this data with the performance evaluation tools on Techdocs to help you with the following tasks:
Identify upcoming performance problems due to grid link usage.
Identify the amount of transferred data to review your settings, such as DAA, SAA, override policies, and Copy Consistency Points.
10.6.3 Feature license code
Figure 10-41 shows the feature licenses.
Figure 10-41 Feature code licenses
Figure 10-41 shows a peak throughput of 400 MB/s. The throughput is limited by the number of installed 100 MB/s increments.
10.7 Tuning the TS7700
To adjust the settings (“knobs”) that alter the behavior of the TS7700 Virtualization Engine, you must be able to collect the statistics data from your clusters, use the available tools to format and plot the binary data, and understand the resulting graphs.
Support is available at the Techdocs website:
Figure 10-42 shows an example of cache throughput plotted from VEHSTATS.
Figure 10-42 TS7740 activity graph
10.7.1 Performance evaluation tool - Plotting cache throughput from VEHSTATS
When evaluating performance, the cache throughput graph for a cluster reveals a significant amount of information succinctly.
Performance tools are available on Techdocs that take 24 hours of 15-minute VEHSTATS data, seven days of 1-hour VEHSTATS data, or 90 days of daily summary data and create a set of charts for you. See the following Techdocs sites for the performance tools and a class replay with detailed information about how to use them:
Class replay:
The 24-hour, 15-minute data spreadsheets include the cache throughput chart. The cache throughput chart has two major components: the uncompressed host I/O line and a stacked bar chart that shows the cache throughput.
The cache throughput chart includes the following components. All values are in MiB/s.
Compressed host write: This is the MiB/s of the data written to cache. This bar is hunter green.
Compressed host read: This is the MiB/s of the data read from cache. This bar is lime green.
Data copied out from this cluster to other clusters: This is the rate at which copies of data to other clusters are made. This cluster is the source of the data and includes copies to all other clusters in the grid. The DCT value applied by this cluster applies to this data. For a two-cluster grid, this is a single value. For a three-cluster grid, there are two values, one for copies to each of the other clusters. For a four-cluster grid, there are three values, one for copies to each of the other clusters. The following descriptions are for plotting Cluster 0’s cache throughput. Use the appropriate columns when plotting other clusters. These bars are cyan.
Data copied to this cluster from other clusters: This is the rate at which other clusters are copying data into this cluster. This cluster is the target of the data and includes copies from all other clusters in the grid. The same DCT rules apply as for “data copied out”. These bars are light blue.
Compressed data premigrated from cache to tape: This is the rate at which data is being read from cache and written to physical tape. This bar is yellow.
Compressed data recalled from tape to cache: This is the rate at which data is being read from tape into cache for a mount requiring a recall. This bar is dark blue.
Compressed remote reads from this cluster: This is the rate at which other clusters use this TVC as I/O cache for read. This bar is orange.
Compressed remote writes to this cluster: This is the rate of synchronous copies. This bar is burnt orange.
This tool contains spreadsheets, data collection requirements, and a 90-day trending evaluation guide to assist you in the evaluation of the TS7700 performance. Spreadsheets for 90-day, one-week, and 24-hour evaluations are provided.
One 90-day evaluation spreadsheet can be used for one-cluster, two-cluster, three-cluster, or four-cluster grids and the other evaluation spreadsheet can be used for five-cluster and six-cluster grids. There is an accompanying data collection guide for each. The first worksheet in each spreadsheet has instructions for populating the data into the spreadsheet. A guide to help with the interpretation of the 90-day trends is also included.
There are separate one-week spreadsheets for two-cluster, three-cluster, four-cluster, five-cluster, and six-cluster grids. The spreadsheets use the one-hour interval data to produce charts for the one-week period. There is also a data collection guide.
There are separate 24-hour spreadsheets for two-cluster, three-cluster, four-cluster, five-cluster, and six-cluster grids. The spreadsheets use the 15-minute interval data to produce charts for the 24-hour period. There is also a data collection guide.
These spreadsheets are intended for experienced TS7700 users. A detailed knowledge of the TS7700 is expected, as well as familiarity with using spreadsheets.
10.7.2 Interpreting cache throughput
The TS7700 cache has a finite bandwidth, depending on your actual environment. See the performance white paper. This TVC bandwidth is shared between the host I/O (compressed), copy activity, premigration activity, and remote write and read. The TS7700 balances these tasks using various thresholds and controls in an effort to prefer host I/O.
The two major maintenance or “housekeeping” tasks at work are the premigration of data from cache to tape, and deferred copies to and from other clusters. The TS7740 delays these housekeeping tasks to give preference to host I/O.
Fast Host Write premigration algorithm
The Fast Host Write algorithm limits the number of premigration tasks to two, one, or zero. This limit occurs when the compressed host write rate is greater than 30 MiB/s and the CPU is more than 99% busy. The circle on the graph (Figure 10-43) illustrates this algorithm in effect. During the 16:15 to 16:45 intervals, the amount of premigrate activity is limited. During the next six intervals, the premigration activity is zero. After this period of intense host activity and CPU usage, the premigrate tasks are allowed to start again.
Figure 10-43 Fast Host Write premigration algorithm
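The trigger condition for the Fast Host Write algorithm can be checked against interval statistics with a few lines of code. The following Python sketch is illustrative only: the function names and the sample values are hypothetical, and it does not model how the TS7700 chooses between two, one, or zero premigration tasks. Only the two thresholds (30 MiB/s and 99% CPU busy) come from the description above.

```python
# Illustrative sketch only: names and sample data are hypothetical.
# The two thresholds are taken from the Fast Host Write description.
HOST_WRITE_THRESHOLD_MIB_S = 30.0  # compressed host write rate
CPU_BUSY_THRESHOLD_PCT = 99.0      # CPU utilization


def fast_host_write_limits_premig(host_write_mib_s, cpu_busy_pct):
    """Return True when both conditions described in the text hold."""
    return (host_write_mib_s > HOST_WRITE_THRESHOLD_MIB_S
            and cpu_busy_pct > CPU_BUSY_THRESHOLD_PCT)


def limited_intervals(samples):
    """Return the labels of intervals in which premigration would be limited.

    samples: iterable of (interval_label, host_write_mib_s, cpu_busy_pct).
    """
    return [label for label, write_rate, cpu in samples
            if fast_host_write_limits_premig(write_rate, cpu)]


if __name__ == "__main__":
    # Made-up 15-minute interval values, not taken from Figure 10-43.
    demo = [("16:00", 25.0, 80.0), ("16:15", 310.0, 99.5), ("16:30", 295.0, 99.8)]
    print(limited_intervals(demo))
```

Running a check like this over VEHSTATS interval data can help explain periods in which the premigration bars on the cache throughput chart shrink or disappear.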
10.7.3 Adjusting the TS7700
The various adjustments that are available to tune the TS7700 are described.
Important: Library commands change the behavior of the whole cluster. If a cluster is attached to multiple LPARs from the same client, or to a multi-tenant environment, the change executed from one LPAR will influence all attached LPARs.
If you have a shared TS7700, consider restricting the usage of the Library command.
Deferred Copy throttle value and threshold
The DCT is used to regulate outgoing deferred copies to other clusters to prefer host throughput. For some, host throughput is more important than the deferred copies, but for others, deferred copies are just as important. Adjusting the DCT value and threshold can allow you to tune the performance of the deferred copies.
Deferred Copy throttle value
When the DCT threshold is reached, the TS7700 adds a delay to each block of deferred copy data sent across the grid links from a cluster. The larger the delay, the slower the overall copy rate becomes.
The performance of the grid links is also affected by the latency time of the connection. The latency has a significant influence on the maximum grid throughput. For example, with a one-way latency of 20 - 25 ms on a 2x1Gb grid link with 20 copy tasks on the receiving cluster, the maximum grid bandwidth will be approximately 140 MB/s. Increasing the number of copy tasks on the receiving cluster increases the grid bandwidth closer to 200 MB/s.
The default DCT is 125 ms. The effect on host throughput as the DCT is lowered is not linear. Field experience shows the knee of the curve is at approximately 30 ms. As the DCT value is lowered toward 30 ms, the host throughput is affected somewhat and deferred copy performance improves somewhat. At and below 30 ms, both host throughput and deferred copy performance are affected more significantly. If the DCT needs to be adjusted from the default value, the initial recommended DCT value is between 30 ms and 40 ms. Favor a value closer to 30 ms if the client is more concerned with deferred copy performance, or closer to 40 ms if the client is concerned about sacrificing host throughput.
After you adjust the DCT, monitor the host throughput and Deferred Copy Queue to see whether the desired balance of host throughput and deferred copy performance is achieved. Lowering the DCT will improve deferred copy performance at the expense of host throughput.
The DCT value can be set by using the Host Console Request command. The setting of this throttle is discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User’s Guide, which is available from the Techdocs website using the keywords SETTING, THROTTLE, DCOPYT:
Deferred Copy throttle value threshold
This value is used to determine the average host I/O rate at which to keep deferred copy throttling on. The average host I/O rate is a rolling average of the I/O rate over a 20-minute period. When this average rate exceeds the DCT threshold, the deferred copies are delayed as specified by the DCOPYT value.
The DCTAVGTD – DCT 20 Minute Average Threshold looks at the 20-minute average of the compressed host read and write rate. The threshold defaults to 100 MB/s.
The Cache Write Rate – Compressed writes to disk cache includes host write, recall write, grid copy-in write, and cross-cluster write to this cluster. The threshold is fixed at 150 MB/s.
Cluster Utilization looks at both the CPU usage and the disk cache usage. The threshold is when either one is 85% busy or more.
DCT is applied when both of the following conditions are true:
Cluster utilization is greater than 85% or the cache write rate is more than 150 MB/s.
The 20-minute average compressed host I/O rate is more than DCTAVGTD.
The preceding algorithm was added in R2.0. The cache write rate was introduced in R2.0 because of the increased CPU power of the POWER7 processor: the CPU usage is often below 85% during peak host I/O periods. Prior to R2.0, the cache write rate was not considered.
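The two conditions that must both hold for the DCT to be applied can be sketched as a small predicate. This Python fragment is a reading aid, not TS7700 code: the function and parameter names are hypothetical, and only the thresholds (85%, 150 MB/s, and the 100 MB/s DCTAVGTD default) come from the text. Because the text states the utilization threshold both as “85% busy or more” and as “greater than 85%”, the sketch assumes the inclusive comparison.

```python
# Illustrative sketch only: function and parameter names are hypothetical.
CLUSTER_UTIL_THRESHOLD_PCT = 85.0   # CPU or disk cache busy percentage
CACHE_WRITE_THRESHOLD_MB_S = 150.0  # fixed threshold according to the text
DEFAULT_DCTAVGTD_MB_S = 100.0       # default 20-minute average threshold


def rolling_average(samples):
    """Average of the host I/O samples that cover the last 20 minutes."""
    samples = list(samples)
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(samples) / len(samples)


def dct_is_applied(cpu_busy_pct, cache_busy_pct, cache_write_mb_s,
                   avg_host_io_mb_s, dctavgtd_mb_s=DEFAULT_DCTAVGTD_MB_S):
    """Return True when both DCT conditions from the text are met."""
    # Condition 1: cluster utilization or cache write rate is high.
    # Assumption: ">=" is used for the 85% utilization comparison.
    busy = (max(cpu_busy_pct, cache_busy_pct) >= CLUSTER_UTIL_THRESHOLD_PCT
            or cache_write_mb_s > CACHE_WRITE_THRESHOLD_MB_S)
    # Condition 2: the 20-minute average compressed host I/O rate
    # exceeds DCTAVGTD.
    return busy and avg_host_io_mb_s > dctavgtd_mb_s
```

For example, a cluster at 60% CPU with a cache write rate of 180 MB/s and a 20-minute host I/O average of 120 MB/s satisfies both conditions, which is the case that the cache write rate check is meant to catch.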
Use the following parameters with the LIBRARY command to modify the DCT value and the DCTAVGTD:
Application of Deferred Copy Throttle
The next two charts illustrate the use of DCT. In Figure 10-44, the amount of data being copied out is small because the DCT is being applied: the compressed host I/O is above the DCT threshold, which is set to the default of 100 MB/s. Figure 10-45 on page 716 shows the compressed host I/O dropping below the 100 MB/s threshold and, as a result, the rate of deferred copies to other clusters is increased substantially.
Figure 10-44 shows the behavior when the DCT is used. The deferred copies are limited (light blue bars), while Host I/O (green bar) and Premigration (yellow bar) are preferred.
Figure 10-44 DCT being applied
In Figure 10-45 on page 716, you see the effect when DCT is “turned off”, because the host throughput drops under 100 MB/s (green bar). The rate of deferred copy writes, in MB/s, increases (light blue bar).
Figure 10-45 DCT turned off
Preferred premigration and premigration throttling thresholds
These two thresholds are triggered by the amount of non-premigrated data in the cache. The preferred premigration threshold defaults to 1600 GB. The premigration throttling threshold defaults to 2000 GB. You can modify the thresholds by using the Host Console Request. The amount of non-premigrated data is available with the Host Console Request CACHE request.
For instance, in Example 10-2, 750 GB of data has yet to be premigrated.
Example 10-2 Data waiting to be premigrated
LI REQ,lib_name,CACHE
        0  2000 1880     0 1880   750     0   14    14
The TS7700 historical statistics, which are available through BVIR and VEHSTATS, show the amount of non-premigrated data at the end of each reporting interval. This value is also available on the TS7700 MI as a point-in-time statistic. Two host warning messages, low and high, can be configured for the TS7700 by using the Host Console Request function. See the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User’s Guide, which is available on the Techdocs website. Use the following keywords:
The Techdocs website is at the following URL:
Preferred premigration threshold
When this threshold is crossed and the number of premigration processes increases, the host throughput tends to decrease from the peak I/O rate. Lowering this value shortens the peak throughput period. It also delays the point at which premigration throttling can occur. The purpose is to premigrate data faster and so avoid reaching the premigration throttling threshold. See “Premigration tasks” on page 667 for details about how the premigration tasks are added.
You might want to adjust this threshold lower to provide a larger gap between this threshold and the premigration throttling threshold. Do this if you want the gap to be larger but you do not want to raise the premigration throttling threshold. This threshold can be raised, along with the premigration throttling threshold, to defer premigration until after a peak period. This can improve the host I/O rate because the premigration tasks are not ramped up as soon as they are with a lower threshold. This trades off an increased amount of non-premigrated data for a higher host I/O rate during heavy production periods. The preferred premigration threshold is set using the Host Console Request function. The setting of this threshold is discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User’s Guide, which is available on the Techdocs website. Use the keywords SETTING, CACHE, PMPRIOR.
Premigration throttling threshold
When this threshold is crossed, the host write throttle and copy throttle are both invoked. The purpose is to slow incoming data to allow the amount of non-premigrated data to be reduced and not rise above this threshold. See 10.3.2, “Host Write Throttle” on page 663 and 10.3.3, “Copy throttle” on page 663 for more details about these throttles. You might want to adjust this threshold if there are periods where the amount of data entering the subsystem increases for a period of time, and the existing threshold is being crossed for a short period of time. Raising the threshold will avoid the application of the throttles, and keep host and copy throughput higher. However, the exposure is more non-premigrated data in cache. The extra non-premigrated data will take a longer period to be premigrated. You need to determine the balance. The premigration throttling threshold is set using the Host Console Request function. Details about this setting are described in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User’s Guide, which is available on the Techdocs website. Use the keywords SETTING, CACHE, PMTHLVL.
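The relationship between the two thresholds and the amount of non-premigrated data can be summarized in a short classification routine. The following Python sketch is an illustration under stated assumptions: the function name and the returned labels are hypothetical, and the sketch treats “crossed” as strictly greater than the threshold. The default values of 1600 GB and 2000 GB are the defaults given earlier for the preferred premigration and premigration throttling thresholds.

```python
# Illustrative sketch only: the function name and labels are hypothetical.
DEFAULT_PMPRIOR_GB = 1600  # preferred premigration threshold default
DEFAULT_PMTHLVL_GB = 2000  # premigration throttling threshold default


def premigration_state(non_premigrated_gb,
                       pmprior_gb=DEFAULT_PMPRIOR_GB,
                       pmthlvl_gb=DEFAULT_PMTHLVL_GB):
    """Classify the cache state from the amount of non-premigrated data.

    Assumption: a threshold counts as crossed when the amount of
    non-premigrated data is strictly greater than the threshold.
    """
    if non_premigrated_gb > pmthlvl_gb:
        return "premigration throttling"
    if non_premigrated_gb > pmprior_gb:
        return "preferred premigration"
    return "normal"


if __name__ == "__main__":
    # 750 GB is the amount of non-premigrated data cited for Example 10-2.
    print(premigration_state(750))
```

With the default thresholds, the 750 GB cited for Example 10-2 is well below both thresholds, so neither additional premigration tasks nor throttling would be expected.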
Disabling host write throttle because of immediate copy
Host write throttle can be turned on because of immediate copies taking too long to copy to other clusters in the grid. Host write throttling is applied for various reasons, including when the oldest copy in the queue is 10 or more minutes old. The TS7700 changes an immediate copy to immediate-deferred if the immediate copy has not started after 40 minutes in the immediate copy queue. The reason for this approach is to avoid triggering the 45-minute missing interrupt handler (MIH) on the host. When a copy is changed to immediate-deferred, the Rewind Unload task is completed, and the immediate copy becomes a high priority deferred copy. See “Immediate-copy set to immediate-deferred state” on page 669 for more information.
You might decide to turn off host write throttling because of immediate copies taking too long (if having the immediate copies take longer is acceptable). However, avoid the 40-minute limit where the immediate copies are changed to immediate-deferred.
In grids where a large portion of the copies are immediate, better overall performance has been seen when the host write throttle because of immediate copies is turned off. You are trading off host I/O for length of time required to complete an immediate copy. The enabling and disabling of the host write throttle because of immediate copies is discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User’s Guide, which is available on Techdocs. Use the keywords SETTING, THROTTLE, ICOPYT.
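The time limits that govern immediate copies (10 minutes for host write throttling and 40 minutes for the change to immediate-deferred) can be sketched as follows. This Python fragment is a simplified illustration: the function name, the returned labels, and the `icopyt_enabled` parameter are hypothetical, and the sketch assumes that a limit is reached when the age is equal to or greater than the stated number of minutes.

```python
# Illustrative sketch only: names and labels are hypothetical.
HOST_WRITE_THROTTLE_AGE_MIN = 10  # oldest queued copy age that can trigger throttling
IMMEDIATE_DEFERRED_AGE_MIN = 40   # unstarted copy becomes immediate-deferred


def classify_immediate_copy(minutes_in_queue, started, icopyt_enabled=True):
    """Classify one immediate copy by its time in the immediate copy queue.

    icopyt_enabled models whether host write throttling because of
    immediate copies is turned on (an assumption for this sketch).
    """
    if not started and minutes_in_queue >= IMMEDIATE_DEFERRED_AGE_MIN:
        return "immediate-deferred"
    if icopyt_enabled and minutes_in_queue >= HOST_WRITE_THROTTLE_AGE_MIN:
        return "host write throttle candidate"
    return "within limits"
```

The sketch makes the trade-off visible: with `icopyt_enabled=False`, a copy that waits 15 minutes no longer contributes to host write throttling, but a copy that has not started after 40 minutes is still changed to immediate-deferred.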
Making your cache deeper
A deeper cache will improve the likelihood of a volume being in cache for a recall. A cache-hit for a recall improves performance when compared to a cache-miss that requires a recall from physical tape. The TS7700 statistics provide a cache hit ratio for read mounts that can be monitored to ensure that the cache-hit rate is not too low. Generally, you want to keep the cache-hit ratio above 80%. Your cache can be made deeper in several ways:
Add more cache.
For TS7740, use the Storage Class construct to use Preference Level 0 (PG0). PG0 volumes are removed from cache soon after they are premigrated to physical tape. PG0 volumes are actively removed from cache and do not wait for the cache to fill before being removed. This approach leaves more room for the PG1 volumes, which remain in cache as long as possible, to be available for recalls. Many clients have effectively made their cache deeper by examining their jobs and identifying which of them are most likely not to be recalled. Use Storage Class to assign these jobs to PG0.
For TS7720, set the Storage Class constructs to use prefer remove for volumes you do not expect to be mounted. Use pinned for those you know you will be mounting and prefer keep for the others. Prefer keep is the default Storage Class action.
With the Storage Class construct PG1, the volume on the selected TVC for I/O operations is preferred to reside in the cache of that cluster. The copy made on the other clusters is preferred to be removed from cache. If the TS7700 is used for the copy, ensure that this default setting is not overridden by the Host Console command. The behavior can be set by using SETTING, CACHE, COPYFSC:
 – When disabled, logical volumes copied into cache from a Peer TS7700 Virtualization Engine are managed as PG0 (prefer to be removed from cache).
 – When the ENABLE keyword is specified, the logical volumes copied into the cache from a peer TS7700 are managed using the actions defined for the Storage Class construct associated with the volume as defined at the TS7700.
This setting works on a distributed library level. It needs to be specified on each cluster. For a deep cache, DISABLE is the preferred keyword.
By default, logical volumes that are recalled into cache are managed as though they were newly created or modified. You can modify cache behavior by using the SETTING Host Console command: SETTING, CACHE, RECLPG0:
 – When enabled, logical volumes that are recalled into cache are managed as PG0 (prefer to be removed from cache). This overrides the actions defined for the Storage Class associated with the recalled volume.
 – When the DISABLE keyword is specified, logical volumes that are recalled into cache are managed using the actions defined for the Storage Class construct associated with the volume as defined at the TS7700.
This setting works on a distributed library level. It needs to be specified on each cluster. The preferred keyword is dependent on your requirements. ENABLE is the best setting if it is likely that the recalled logical volumes are used only once. With the setting DISABLE, the logical volume stays in cache for further retrieval if the Storage Class is defined as PG1 in the cluster used for the I/O TVC.
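The 80% cache-hit guideline mentioned at the start of this topic can be checked with a simple calculation. The following Python sketch assumes that the ratio is computed from the Cache Hits and Cache Misses mount counts and that scratch (Fast Ready) mounts are excluded. This is an assumption for illustration, not the documented definition of the TS7700 statistic, and the function names are hypothetical.

```python
# Illustrative sketch only: names are hypothetical, and the formula is an
# assumption (cache hits divided by all non-scratch mounts).
CACHE_HIT_GUIDELINE = 0.80


def cache_hit_ratio(cache_hits, cache_misses):
    """Return the cache-hit ratio, or None when there were no read mounts."""
    total = cache_hits + cache_misses
    if total == 0:
        return None
    return cache_hits / total


def meets_cache_hit_guideline(cache_hits, cache_misses,
                              guideline=CACHE_HIT_GUIDELINE):
    """Return True when the ratio is above the guideline (or undefined)."""
    ratio = cache_hit_ratio(cache_hits, cache_misses)
    return ratio is None or ratio > guideline
```

A cluster with 90 cache hits and 10 cache misses in an interval has a ratio of 0.9 and meets the guideline. A cluster with 70 hits and 30 misses does not, which suggests reviewing the Storage Class assignments described above.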
Back-end drives
It is important to ensure that enough back-end drives are available. General guidelines are provided for the number of back-end drives versus the number of performance increments installed in the TS7740. If there are insufficient back-end drives, the performance of the TS7740 will suffer.
As a guideline, use the ranges of back-end drives listed in Table 10-1, based on the host throughput configured for the TS7740. The lower number of drives in each range is for scenarios that have few recalls. The upper number is for scenarios that have numerous recalls. Remember, these are guidelines, not rules.
Table 10-1 Performance increments versus back-end drives
Host throughput       Back-end drives
100 MiB/s             4 - 6
200 MiB/s             5 - 8
300 MiB/s             7 - 10
400 MiB/s             9 - 12
500 MiB/s             10 - 14
600 - 1000 MiB/s      12 - 16
It is important both to install the correct number of back-end drives and to keep those drives available for use. “Available” means that they are operational and might be idle or in use. The Host Console Request function can be used to set up warning messages that are issued when the number of available drives drops. Setting the Available Physical Drive Low and High warning levels is discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User’s Guide, which is available on Techdocs. Use these keywords:
Use this website:
Grid links
The grid link performance and the setting of the performance warning threshold are described.
Providing sufficient bandwidth
The network between the TS7700s must have sufficient bandwidth to account for the total replication traffic. If you are sharing network switches among multiple TS7700 paths or with other network traffic, the sum total of bandwidth on that network needs to be sufficient to account for all of the network traffic.
The TS7700 uses the TCP/IP protocol for moving data between each cluster. In addition to the bandwidth, there are other key factors that affect the throughput that the TS7700 can achieve. The following factors directly affect performance:
Latency between the TS7700s
Network efficiency (packet loss, packet sequencing, and bit error rates)
Network switch capabilities
Flow control to pace the data from the TS7700s
Inter-switch link (ISL) capabilities, such as flow control, buffering, and performance
The TS7700s attempt to drive the network links at the full 1-Gbps rate for the two or four 1-Gbps links, or at the highest possible load on the two 10-Gbps links, which might be much higher than the network infrastructure is able to handle. The TS7700 supports IP flow control frames so that the network can pace the rate at which the TS7700 attempts to drive the network. The best performance is achieved when the TS7700 is able to match the capabilities of the underlying network, resulting in fewer dropped packets.
Important: When the system attempts to give the network more data than it can handle, the network begins to discard the packets that it cannot handle. This process causes TCP to stop, resynchronize, and resend data, resulting in a less efficient use of the network.
To maximize network throughput, you must ensure the following items regarding the underlying network:
The underlying network must have sufficient bandwidth to account for all network traffic expected to be driven through the system - eliminate network contention.
The underlying network must be able to support flow control between the TS7700s and the switches, allowing the switch to pace the TS7700 to the wide area network’s (WAN’s) capability.
Flow control between the switches is also a potential factor to ensure that the switches are able to pace with each other’s rate.
Be sure that the performance of the switch is capable of handling the data rates expected from all of the network traffic.
Latency between the sites is the primary factor. However, packet loss, whether because of bit error rates or because the network cannot sustain the maximum capacity of the links, causes TCP to resend data, which multiplies the effect of the latency.
Grid link performance monitoring
The TS7700 generates a host message when it detects that grid performance is degraded. If the degraded condition persists, a Call Home is generated. The performance of the grid links is monitored periodically, and if one link is performing worse than the other link by an IBM service support representative (SSR)-alterable value, a warning message is generated and sent to the host. The purpose of this warning is to alert you that an abnormal grid performance difference exists. The value must be adjusted so that warning messages are not generated because of normal variations of the grid performance. For example, a setting of 60% means that if one link is running at 100%, any remaining link is marked as degraded if it is running at less than 60% of the rate of the 100% link. The grid link performance is available with the Host Console Request function and on the TS7700 MI. The monitoring of the grid link performance using the Host Console Request function is described in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User’s Guide, which is available on Techdocs. Use the STATUS and GRIDLINK keywords:
The grid link degraded threshold also includes two other values that can be set by the SSR:
Number of degraded iterations: The number of consecutive five-minute intervals that link degradation was detected before reporting an attention message. The default value is 9.
Generate Call Home iterations: The number of consecutive five-minute intervals that link degradation was detected before generating a Call Home. The default value is 12.
The default values are set to 60% for the threshold, nine iterations before an attention message is generated, and 12 iterations before a Call Home is generated.
Use the default values unless you are receiving intermittent warnings and support indicates that the values need to be changed. If you receive intermittent warnings, let the SSR change the threshold and iteration to the suggested values from support.
For example, clusters in a two-cluster grid are 2,000 miles apart with a round-trip latency of approximately 45 ms. The normal variation seen is 20 - 40%. In this example, the threshold value is at 25% and the iterations are set to 12 and 15.
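The interaction of the threshold and the two iteration counts can be illustrated with a short sketch. This is only a model of the behavior described in this section, not the TS7700 implementation: the function name, the input layout (one list of link rates per five-minute interval), and the event labels are hypothetical.

```python
def evaluate_grid_links(samples, threshold_pct=60, attention_after=9, call_home_after=12):
    """Return (interval_index, event) tuples for a series of per-interval link rates.

    samples: list of lists; each inner list holds the throughput of every grid
    link during one five-minute interval (any consistent unit). Hypothetical layout.
    """
    events = []
    consecutive = 0  # consecutive intervals in which degradation was detected
    for index, rates in enumerate(samples):
        best = max(rates)
        # A link counts as degraded when it runs below threshold_pct of the best link.
        degraded = best > 0 and any(r < best * threshold_pct / 100 for r in rates)
        consecutive = consecutive + 1 if degraded else 0
        if consecutive == attention_after:
            events.append((index, "attention"))
        if consecutive == call_home_after:
            events.append((index, "call_home"))
    return events


# Twelve consecutive intervals with one link at half the rate of the other.
print(evaluate_grid_links([[100, 50]] * 12))
```

With the default values, a single good interval resets the count, which is why intermittent degradation can go unreported until the SSR adjusts the threshold or the iteration values.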
Copy count control
There can be several reasons for tuning the counts of the number of concurrent copy jobs over the grid links. There are two major cases where you might need the number of concurrent copy threads increased to improve copy performance:
When four links are present either in old or new hardware or if the 10 Gbps grid links are installed
When either a distant high-latency link or a high-packet-loss link is present, as determined by an investigation of the network
Also, if you have limited network bandwidth (for example, less than 100 MiB/s), decreasing the number of concurrent copy tasks can reduce the overhead that is allocated and consumed simultaneously. This, in turn, can help prevent copies from reaching the three-hour copy timeout and being retried indefinitely as deferred copy tasks.
Values can be set for the number of concurrent RUN copy threads and the number of Deferred copy threads. The allowed values for the copy thread count are 5 - 128. The default value is 20 for clusters with two 1-Gbps Ethernet links, and 40 for clusters with four 1-Gbps or two 10-Gbps Ethernet links. Use the following parameters with the LIBRARY command:
Reclaim operations
Reclaim operations consume two back-end drives per reclaim task. Reclaim operations also consume CPU cycles. If needed, the TS7740 can allocate pairs of idle drives for reclaim operations, making sure to leave one drive available for recall. Reclaim operations affect host performance, especially during peak workload periods. Tune your reclaim tasks by using both the reclaim threshold and the Inhibit Reclaim schedule.
Reclaim threshold
The reclaim threshold directly affects how much data is moved during each reclaim operation. The default setting is 10% for each pool. Clients tend to raise this threshold too high because they want to store more data on their stacked volumes. The result is that reclaim operations must move larger amounts of data and consume back-end drive resources that are needed for recalls and premigration. After a reclaim task is started, it does not free up its back-end drives until the volume being reclaimed is empty. Table 10-2 on page 722 shows the reclaim threshold and the amount of data that must be moved, depending on the stacked tape capacity and the reclaim percentage. When the threshold is reduced from 40% to 20%, only half of the data needs to be reclaimed, therefore cutting the time and resources needed for reclaim in half.
Table 10-2 Reclaim threshold
Stacked volume        Reclaim threshold
capacity              10%         20%         30%         40%
300 GB                30 GB       60 GB       90 GB       120 GB
500 GB                50 GB       100 GB      150 GB      200 GB
640 GB                64 GB       128 GB      192 GB      256 GB
700 GB                70 GB       140 GB      210 GB      280 GB
1000 GB               100 GB      200 GB      300 GB      400 GB
4000 GB               400 GB      800 GB      1200 GB     1600 GB
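The values in Table 10-2 follow directly from multiplying the stacked volume capacity by the reclaim threshold. The following sketch reproduces that arithmetic; the function name is hypothetical and is used only for illustration.

```python
def reclaim_data_gb(capacity_gb, threshold_pct):
    """Data (GB) that must be moved when a stacked volume reaches the reclaim threshold."""
    return capacity_gb * threshold_pct / 100


# Reproduce the rows of the table for the listed stacked volume capacities.
for capacity in (300, 500, 640, 700, 1000, 4000):
    row = [reclaim_data_gb(capacity, pct) for pct in (10, 20, 30, 40)]
    print(capacity, row)
```

Because the relationship is linear, halving the threshold halves the amount of data that each reclaim task must move.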
Inhibit Reclaim schedule
Use the Inhibit Reclaim schedule to inhibit reclaims during your busy periods, leaving back-end drives available for recall and premigration tasks. We suggest that you start the inhibit 60 minutes before the heavy workload period. This setting allows any started reclaim tasks to complete before the heavy workload period.
Adjusting the maximum number of reclaim tasks
Reclaim operations consume two back-end drives per task, and CPU cycles as well. For this reason, use the Inhibit Reclaim schedule to turn off reclaim operations during heavy production periods. When reclaim operations are not inhibited, you might want to limit the number of reclaim tasks. Perhaps, there is moderate host I/O during the uninhibited period and reclaim is consuming too many back-end drives, CPU cycles, or both.
With the Host Library Request command, you can limit the number of reclaim tasks in the TS7740. The second keyword RECLAIM can be used along with the third keyword of RCLMMAX. This expansion only applies to the TS7740. Also, the Inhibit Reclaim schedule is still honored. The limit is turned off by setting the value to -1 (minus 1).
The maximum number of reclaim tasks is limited by the TS7740, based on the number of available back-end drives, as listed in Table 10-3.
Table 10-3 Reclaim tasks
Number of available drives
Maximum number of reclaim tasks
Limiting the number of premigration drives (maximum drives)
Each storage pool allows you to define the maximum number of back-end drives to be used for premigration tasks. There are several triggers that cause the TS7740 to ramp up the number of premigration tasks. If a ramp-up of premigration tasks occurs, followed by the need for more than one recall, the recall must wait until a premigration task is complete for a back-end drive to free up. A single premigration task can move up to 30 GB at one time. Having to wait for a back-end drive delays a logical mount that requires a recall.
If this ramping up is causing too many back-end drives to be used for premigration tasks, you can limit the number of premigration drives in the Pool Properties window. For a V06, the maximum number of premigration drives per pool must not exceed 4. Additional drives will not increase the copy rate to the drives. For a V07, premigration can benefit from having eight to 10 drives available for premigration.
The limit setting is in the TS7740 MI. For Copy Export pools, it is advisable that the maximum number of premigration drives be set appropriately. If you are exporting a small amount of data each day (one or two cartridges’ worth of data), limit the premigration drives to two. If more data is being exported, set the maximum to four. This setting limits the number of partially filled export volumes.
To size the number of premigration drives per pool, look at the amount of data (MB or GB) written to the pool, convert it to a data rate in MiB/s, and determine the maximum and average rates. Then, compute the number of premigration drives for the pool by assuming 50 - 70 MB/s per drive for drive models before the TS1140 and approximately 100 MB/s per drive for a TS1140.
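The sizing arithmetic can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and input layout are hypothetical, the per-interval amounts are assumed to come from your own statistics reports, and the difference between MB/s and MiB/s is ignored because the per-drive rates are approximate guidelines.

```python
import math


def premigration_drives(mib_written_per_interval, interval_seconds, drive_rate_mib_s):
    """Estimate premigration drives for one pool from per-interval data written.

    mib_written_per_interval: MiB written to the pool in each interval (hypothetical input).
    interval_seconds: length of each interval, for example 900 for 15 minutes.
    drive_rate_mib_s: assumed sustained rate per drive, taken from the guideline above.
    """
    rates = [mib / interval_seconds for mib in mib_written_per_interval]
    peak = max(rates)
    average = sum(rates) / len(rates)
    return {
        "peak_mib_s": peak,
        "average_mib_s": average,
        # Round up, and never suggest fewer than one drive.
        "drives_for_peak": max(1, math.ceil(peak / drive_rate_mib_s)),
        "drives_for_average": max(1, math.ceil(average / drive_rate_mib_s)),
    }


# Three 15-minute intervals, using the conservative 50 MiB/s per-drive figure.
print(premigration_drives([90000, 180000, 45000], 900, 50))
```

The result is only a starting point. Compare it with the model-specific limits described earlier in this section before changing the pool properties.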
Avoid Copy Export during heavy production periods
Because a Copy Export operation requires each physical volume to be exported to be mounted, the best approach is to perform the operation during a slower workload time.
10.8 TS7700 Virtualization Engine statistical data
The Virtualization Engine TS7700 statistics content and processes differ significantly from the VTS predecessor. In addition to completely new statistical record formats, major changes have been made to the frequency of collection, data storage, host data retrieval, and reporting tools.
10.8.1 Types of statistical records
The TS7700 Virtualization Engine provides two types of statistics:
Point-in-time statistics
These statistics are performance-related. The point-in-time information is intended to supply information about what the system is doing the instant that the request is made to the system. This information is not persistent on the system: The TS7700 updates these statistics every 15 seconds, but it does not retain them. This information focuses on the individual components of the system and their current activity. These statistics report operations over the last full 15-second interval. You can retrieve the point-in-time statistics from the TS7700 at any time by using the Bulk Volume Information Retrieval (BVIR) facility. A subset of point-in-time statistics is also available on the TS7700 MI.
Historical (HIS) statistics
These statistics encompass a wide selection of performance and capacity planning information. They are intended to help with capacity planning and tracking system use over an extended period of time. The information focuses more on the system as a whole, and the movement of data through the system. These statistics are collected by the TS7700 every 15 minutes, and they are stored for 90 days in a TS7700 database. The user can also retrieve these records by using BVIR. A subset of the historical statistics is also available on the TS7700 MI. More information is available in 10.7, “Tuning the TS7700” on page 711.
Both point-in-time statistics and historical statistics are recorded. The point-in-time records present data from the most recent interval, providing speedometer-like statistics. The historical statistics provide data that can be used to observe historical trends.
These statistical records are available to a host through the BVIR facility. See 10.10, “IBM Tape Tools” on page 744 for more information about how to format and analyze these records.
Each cluster in a grid has its own set of point-in-time and historical statistics for both the vNode and hNode.
For a complete description of the records, see IBM Virtualization Engine TS7700 Series Statistical Data Format White Paper Version 2.0, which is available on the Techdocs website at the following URL and the VEHSTATS Decoder:
Point-in-time statistics
The data provided by this type of statistics is a snapshot of the TS7700 Virtualization Engine activity over the last 15-second interval. Each new 15-second interval’s data overlays the prior interval’s data. However, not all data is updated every 15 seconds (primarily hNode data). These statistics contain single and multicluster grid information.
You can obtain the point-in-time statistics through the appropriate BVIR request. The response returns the last point-in-time snapshot from the TS7700 Virtualization Engine. The data returned is not in a human-readable format. It is primarily binary data, so use the DCB=(RECFM=U,BLKSIZE=24000) parameter.
Basically, the point-in-time statistics provide the following record types:
vNode Virtual Device Point in Time Record
vNode Adapter Point in Time Record
hNode HSM Point in Time Record
hNode Grid Point in Time Record
A variable number of data records are returned depending on the number of vNodes and hNodes (if in a multicluster grid configuration).
Each record has a common header that contains the following information:
Record length
Record version
Record type
Node and distributed library ID
Time stamp
Machine type, model, and hardware serial number
Code level
Point-in-time statistics: vNode virtual device
These kinds of point-in-time statistics provide the state of the reporting vNode:
Overall state (online, offline, going online, and so on)
Maximum configured throughput, and throughput delay
Number of installed virtual devices
In addition, the state and usage of each virtual drive are presented:
Volume mounted or last mounted
Distributed library access point
Mount state (unloaded/mount in progress, failed, mounted, and so on)
Device state (ready, write mode, BOT, and so on)
Buffer wait condition count (whether the subsystem is pacing the channel or vice versa)
Data transferred
Point-in-time statistics: vNode host adapter
These kinds of point-in-time statistics provide the status for each of the four host adapter positions:
Adapter type installed
State (online, offline, reloading, and so on)
Location (drawer and slot)
In addition, the state and usage of each adapter port are provided:
Port interface ID
Data transferred before and after compression
Point-in-time statistics: hNode HSM
These kinds of point-in-time statistics provide the status for the management tasks running in the hNode:
Premigrate and recall queue counts
Throttling values (write, copy, and deferred copy)
Library sequence number
State and usage of the physical tape drives:
 – Device type
 – Physical volume mounted and pool ID
 – Device state (offline or online)
 – Device role (idle, premig, recall, reclaim, and so on)
 – Logical volume being processed
 – Data transferred
 – Device Serial number
 – Media format
hNode HSM does not establish a relationship to the Hierarchical Storage Manager (DFSMShsm), the z/OS storage management software.
For TS7720 Virtualization Engine statistics, because physical tape drives are not used, there is no information provided for functions related to back-end drive activity, such as migration, recall, and reclamation.
Point-in-time statistics: hNode Grid
These point-in-time statistics provide the grid status from this hNode’s perspective:
Status and usage for the distributed library:
 – Run and deferred copy queue counts for this distributed library
 – Active copies for this distributed library
 – Link status
Data transferred to and from each cluster
Historical statistics
The data provided by this type of statistics is captured over a 15-minute interval in the TS7700 Virtualization Engine. Each new 15-minute interval’s data does not overlay the prior interval’s data. However, not all data is updated every 15 minutes. Those statistics contain single and multicluster grid information. Up to 96 intervals per day can be acquired, and each day, a file is generated that contains the historical statistics for that day. The historical statistics are kept in the TS7700 Virtualization Engine for 90 rolling days.
You can obtain the complete set or a subset of these historical statistics through the appropriate BVIR request (for more details, see “Historical statistics” on page 726). The request will specify the day or the days of needed historical statistics. For the current day, records up to the last 15-minute interval are returned. The data returned is not in a human-readable format. It is primarily binary data. Therefore, use the following parameter:
The record length depends on the record type. For more information about the format of the statistics records, see the IBM Virtualization Engine TS7700 Series Statistical Data Format White Paper Version 2.0 and the VEHSTATS Decoder on the Techdocs website at the following addresses:
VEHSTATS is a tool that is available for formatting these records into reports. For more information about using VEHSTATS, see 10.10, “IBM Tape Tools” on page 744.
The historical statistics provide the following record types:
vNode: Virtual Device
vNode: Host Adapter
hNode: HSM
hNode: Library
hNode: Grid
hNode: Import/Export
The number of records returned varies depending on the number of vNodes and hNodes (if in a multicluster grid configuration).
Each record has a common header that contains the following information:
Record length
Record type
Node and distributed library ID
Time stamp
Machine type, model, and hardware serial number
Code level
Historical statistics: vNode virtual device
These types of historical statistics will provide the following information about the usage for a vNode’s virtual devices:
Number of installed virtual devices
Virtual device type
Blocksizes being written
Configured throughput
Minimum, maximum, and average virtual devices mounted
Maximum and average delay
Historical statistics: vNode host adapter
You can use these historical statistics to determine the status for each of the two or four host adapter positions. The provided statistics contain the following information:
Adapter type installed
State (online, offline, and so on)
Location (drawer or slot)
Both the state and usage of each adapter port are provided:
Port interface ID
Interface data transfer rate setting (capable and actual)
Data transferred before and after compression
Selective and system reset counts
Historical statistics: hNode HSM
This data portion within the historical statistics will give you hNode HSM information. You will get the VOLSER of the physical volume with the latest snapshot of the database on it.
You also see the state and status of the TVC. The provided statistics contain the following information:
Usable size in GB
Throttling values
State of each cache partition:
 – Partition size
 – Number of scratch (Fast Ready), cache hit, or cache miss mounts
 – Average scratch (Fast Ready), cache hit, or cache miss mount times
 – Number of volumes in cache by preference group
 – Space occupied by the volumes in cache by preference group
 – Volume aging by preference group
 – Migration and replication
 – Average CPU usage percentage
Historical statistics: hNode library
This part of the historical statistics gives you library information:
The attached physical library
The physical tape devices in the library
The common scratch pool media in the library
Each physical volume pool in the library
The TS7720 Virtualization Engine does not use a physical tape library. Therefore, hNode physical library information is not available from a TS7720 Virtualization Engine.
With a TS7740 Virtualization Engine, the attached physical tape library might have multiple subsystems sharing the TS3500 Tape Library through Advanced Library Management System (ALMS) partitioning. Only data related to the partition attached to this TS7740 Virtualization Engine is provided in the hNode library report.
Historical statistics: hNode grid
This last set of historical statistics gives you information in a multicluster grid environment for the following items:
Status and usage for the distributed library:
 – Number of logical volumes and data to copy for this distributed library
 – Average age of the deferred and immediate copy queues for this distributed library
 – Number of distributed libraries in the grid configuration
Data transferred to and from each cluster
Historical statistics: Import/Export
Historical statistics are provided about volumes and data that are imported and exported:
Physical volumes imported and exported
Logical volumes imported and exported
Amount of data imported and exported
10.8.2 Collecting and analyzing TS7700 Virtualization Engine statistical records
Consider the following aspects when you work with the statistics and reporting mechanisms that are provided by TS7700 Virtualization Engine:
Getting the statistics and reports: The main interface to access statistics and reports from the TS7700 Virtualization Engine is the BVIR facility. Depending on the request, you receive readable output or, for the TS7700 Virtualization Engine point-in-time and historical statistics, binary data. For binary data, more description and tools are needed.
Formatting and displaying the information: Some of the response data of the BVIR functions is already in a readable format. For the remaining binary format data provided by the point-in-time and historical statistics, you need a formatting tool. IBM provides a tool called VEHSTATS. More information about where to download this tool and how to use it is in 10.10, “IBM Tape Tools” on page 744.
10.9 Bulk Volume Information Retrieval
With the potential to support hundreds of thousands of logical volumes in a TS7700 Virtualization Engine subsystem, providing a set of information for all of those volumes through normal channel control type commands is not practical. Luckily, the functions of a TS7700 Virtualization Engine subsystem that allow it to virtualize a tape volume also allow for a simple and effective method to transfer the information to a requesting application. The TS7700 Virtualization Engine converts the format and storage conventions of a tape volume into a standard file managed by a file system within the subsystem.
With BVIR, you are able to obtain information about all of the logical volumes managed by a TS7700 Virtualization Engine. The following data is available from a TS7700 Virtualization Engine:
Volume Status Information
Cache Contents Information
Physical Volume to Logical Volume Mapping Information
Point-in-Time Statistics
Historical Statistics
Physical Media Pools
Physical Volume Status
Copy Audit
For more information, see the IBM Virtualization Engine TS7700 Series Bulk Volume Information Retrieval Function User’s Guide at the following URL:
10.9.1 Overview of the BVIR function
The TS7700 Virtualization Engine converts the format and storage conventions of a tape volume into a standard file managed by a file system within the subsystem. It uses an IBM standard labeled tape volume to both initiate a request for information and return the results. By using a standard tape volume, no special interfaces or access methods are needed for an application to use this facility. In practice, no specific applications are required because standard IBM utilities, such as IEBGENER, provide the function needed to request and obtain the information.
The following steps obtain information by using this function:
1. A single data set with the information request is written to a logical volume. The logical volume can be any logical volume in the subsystem from which the information is to be obtained. Either a scratch or specific volume request can be used. The data set contains a minimum of two records and a maximum of three records that specify the type of data being requested. The records are in human-readable form, that is, lines of character data. The data set can be cataloged or uncataloged (although cataloging the data set can make it easier for subsequent access to the data). On closing the volume, the TS7700 Virtualization Engine server recognizes it as a request volume and “primes” the subsystem for the next step.
Remember: Some information obtained through this function is specific to the cluster on which the logical volume is written, for example, cache contents or a logical-physical volume map. In a TS7700 Virtualization Engine grid configuration with multiple clusters, use a Management Class for the volume to obtain statistics for a specific cluster.
Historical statistics for a multicluster grid can be obtained from any of the clusters.
2. The request volume is again mounted, this time as a specific mount. Seeing that the volume was primed for a data request, the TS7700 Virtualization Engine appends the requested information to the data set. The process of obtaining the information and creating the records to append can take up to several minutes, depending on the request and, from a host’s viewpoint, is part of the mount processing time. After the TS7700 Virtualization Engine has completed appending to the data set, the host is notified that the mount has completed. The requested data can then be accessed like any other tape data set.
In a JES2 environment, the JCL to perform the two steps can be combined into a single job. However, in a JES3 environment, they must be run in separate jobs because the volume will not be demounted and remounted between job steps in a JES3 environment.
After the response data set has been written to the request logical volume, that logical volume functions identically to any other logical volume in the subsystem. Subsequent mount requests and read accesses to the logical volume do not affect its contents. Write accesses to the logical volume will overwrite its contents. The logical volume can be returned to SCRATCH status and reused by any application.
Figure 10-46 shows the process flow of BVIR.
Figure 10-46 BVIR process flow
The building of the response information requires a small amount of resources from the TS7700 Virtualization Engine. Do not use the BVIR function to “poll” for a specific set of information, and issue only one request at a time. Certain requests, for example, the volume map, might take several minutes to complete. To prevent “locking” out another request during that time, the TS7700 Virtualization Engine is designed to handle two concurrent requests. If more than two concurrent requests are issued, the additional requests are processed as previous requests are completed.
Although the request data is always in a human-readable format, the data returned from the TS7700 Virtualization Engine can be in human-readable or binary form, depending on the request. See the response sections for the specifics of the returned data.
The general format for the request/response data set is shown in Example 10-3.
Example 10-3 BVIR output format
11/20/2008 12:27:00 VERSION 02
S/N: 0F16F LIB ID: DA01A
P00024   GK0000    P 000001 1 OF 1    23.45 M
P00024   GK0020    P 000002 1 OF 1    76.50 M
P00024   GK0010    P 000003 1 OF 1   134.24 M
Clarification: When records are listed in this chapter, there will be an initial record showing “1234567890123...” This record does not exist, but it is provided to improve readability.
Record 0 is identical for all requests and is not part of the output; it is shown only as a reading aid for records 1 through 5. Records 6 and higher contain the requested output, which differs depending on the request:
Records 1 and 2 contain the data request commands.
Record 3 contains the date and time when the report was created and the version of BVIR.
Record 4 contains both the hardware serial number and the distributed library ID of the TS7700 Virtualization Engine.
Record 5 contains all blanks.
Records 6 through N contain the requested data. The information is described in general terms. Detailed information about these records is in the IBM Virtualization Engine TS7700 Series Bulk Volume Information Retrieval Function User’s Guide at the following URL:
10.9.2 Prerequisites
Any logical volume defined to a TS7700 Virtualization Engine can be used as the request/response volume. Logical volumes in a TS7700 Virtualization Engine are formatted as IBM standard-labeled volumes. Although a user can reformat a logical volume with an ANSI standard label or as an unlabeled tape volume, those formats are not supported for use as a request/response volume.
There are no restrictions regarding the prior use of a volume used as a request/response volume and no restrictions regarding its subsequent use for any other application. Use normal scratch allocation methods for each request (that is, use the DISP=(NEW,CATLG) parameter). In this way, any of the available scratch logical volumes in the TS7700 Virtualization Engine can be used. Likewise, when the response volume’s data is no longer needed, the logical volume must be returned to SCRATCH status through the normal methods (typically by deletion of the data set on the volume and a return-to-scratch policy based on data set deletion).
10.9.3 Request data format
Several types of data can be requested. The type of data requested is indicated in the request data set. The request data set must be the only data set on the volume, and must be written with a record format of F and a logical record size of 80 bytes in uncompressed data format (TRTCH=NOCOMP). Request information is in EBCDIC character form, beginning in the first character position of the record and padded with blank characters on the right to fill out the record.
The request fields must be as shown. A request that does not begin in the first character position of the record, or that uses extra blanks between words, fails.
The file must be written in uncompressed format to have it correctly interpreted by the TS7700 Virtualization Engine.
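The record layout rules above can be sketched in a few lines of Python. This is only an illustration of the padding and spacing rules, not part of the BVIR function: the helper name build_request_record is hypothetical, and the use of code page cp037 is an assumption (it is one common EBCDIC code page; your host might use another). Writing the records to tape still requires a job that uses RECFM=F, LRECL=80, and TRTCH=NOCOMP.

```python
RECORD_LENGTH = 80  # logical record size required for the request data set


def build_request_record(text: str) -> bytes:
    """Return one 80-byte EBCDIC record for a BVIR request (hypothetical helper).

    The text must start in the first character position, must not contain
    extra blanks between words, and is padded on the right with blanks.
    """
    body = text.rstrip()
    if body != body.lstrip():
        raise ValueError("request text must begin in the first character position")
    if "  " in body:
        raise ValueError("request text must not contain extra blanks between words")
    if len(body) > RECORD_LENGTH:
        raise ValueError("request text is longer than 80 characters")
    # Assumption: cp037 stands in for the EBCDIC code page used by the host.
    return body.ljust(RECORD_LENGTH).encode("cp037")


record2 = build_request_record("VOLUME STATUS ABC1_0")
```

The checks mirror the two failure conditions named in the text, so a malformed request is rejected before it is written rather than after a mount and demount cycle.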
Although the request data format uses fixed records, not all response records are fixed. For the point-in-time and historical statistics responses, the data records are of variable length and the record format used to read them is the Undefined (U) format. See Appendix F, “Sample JCL” on page 925 for more information.
In a multi-site TS7700 Virtualization Engine grid configuration, the request volume must be created on the cluster for which the data is being requested. The Management Class assigned to the volume needs to specify the particular cluster that is to have the copy of the request volume.
The format for the request data set records is listed in the following sections.
Record 1
Record 1 must contain the command exactly as shown in Example 10-4.
Example 10-4 BVIR request record 1
The format for the request’s data set records is shown in Table 10-4.
Table 10-4 BVIR request record 1
Record 1: Request identifier
Bytes 1 - 28: Request identifier
Bytes 29 - 80: Blank padding
Record 2
With Record 2, you can specify which data you want to obtain. The following options are available:
The format for the request’s data set records is shown in Table 10-5.
Table 10-5 BVIR request record 2
Record 2: Request identifier
1 - 80
‘VOLUME STATUS zzzzzz’ or
left-aligned, padded with blanks on the right
For the Volume Status and Physical Volume Status Volume requests, ‘zzzzzz’ specifies the volume serial number mask to be used. By using the mask, from one to thousands of volume records can be retrieved for the request. The mask must be six characters in length, with the underscore character ( _ ) representing a positional wildcard.
For example, assuming that volumes in the range of ABC000 through ABC999 have been defined to the cluster, a request of VOLUME STATUS ABC1_0 returns database records that exist for ABC100, ABC110, ABC120, ABC130, ABC140, ABC150, ABC160, ABC170, ABC180, and ABC190.
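The masking rule can be checked with a short sketch. The function name volser_matches is hypothetical, and the code only models the positional wildcard described above; it does not query a TS7700.

```python
def volser_matches(mask: str, volser: str) -> bool:
    """Return True if volser matches a six-character mask.

    An underscore in the mask matches any character in the same position;
    every other mask character must match exactly.
    """
    if len(mask) != 6:
        raise ValueError("mask must be exactly six characters")
    if len(volser) != 6:
        return False
    return all(m == "_" or m == v for m, v in zip(mask, volser))


# Volumes ABC000 through ABC999, as in the example above.
defined = [f"ABC{n:03d}" for n in range(1000)]
matches = [v for v in defined if volser_matches("ABC1_0", v)]
```

With the mask ABC1_0, matches contains the same ten volumes that are listed in the text.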
For the Historical Statistics request, ‘xxx’ specifies the Julian day being requested. Optionally, ‘-yyy’ can also be specified to indicate that historical statistics from xxx through yyy are being requested. Valid days are 001 through 366 (to account for leap years). For leap years, February 29 is Julian day 060 and December 31 is Julian day 366. For other years, Julian day 060 is March 1, and December 31 is Julian day 365. If historical statistics do not exist for the day or days requested, that is indicated in the response record. (This can occur if a request is issued for a day before the system was installed, for a day or days when the system was powered off, or for a day after the current day before a rolling year has been accumulated.) If a request spans the end of the year, for example, a request specified as HISTORICAL STATISTICS FOR 364-002, responses are provided for days 364, 365, 366, 001, and 002, regardless of whether the year was a leap year.
For Copy Audit, INCLUDE or EXCLUDE is specified to indicate which TS7700 clusters in a grid configuration are to be included in or excluded from the audit. COPYMODE is an option for taking a volume’s copy mode for a cluster into consideration. If COPYMODE is specified, a single space must separate it from INCLUDE or EXCLUDE. The libid parameter specifies the library sequence numbers of the distributed libraries associated with each of the TS7700 clusters either to include or exclude in the audit. Multiple libid parameters are separated by commas. At least one libid parameter must be specified.
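The day numbering and the year-end wraparound can be illustrated as follows. Both function names are hypothetical, and the sketch only reproduces the arithmetic described above; it does not build or submit a request.

```python
import datetime
from typing import List, Optional


def julian_day(date: datetime.date) -> str:
    """Return the three-digit day of the year for a calendar date."""
    return f"{date.timetuple().tm_yday:03d}"


def expand_day_range(first: int, last: Optional[int] = None) -> List[str]:
    """List the days covered by a request for 'first' or 'first-last'.

    A range whose last day is lower than its first day is treated as spanning
    the end of the year, and day 366 is always included in that case.
    """
    last = first if last is None else last
    for day in (first, last):
        if not 1 <= day <= 366:
            raise ValueError("valid days are 001 through 366")
    if first <= last:
        days = list(range(first, last + 1))
    else:
        days = list(range(first, 367)) + list(range(1, last + 1))
    return [f"{d:03d}" for d in days]


year_end = expand_day_range(364, 2)
```

For the range 364-002, year_end holds days 364, 365, 366, 001, and 002, which matches the behavior described for a request that spans the end of the year.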
For the Physical Volume Status Pool request, ‘xx’ specifies the pool for which the data is to be returned. If there are no physical volumes currently assigned to the specified pool, that will be indicated in the response record. Data might be requested for pools 0 through 32.
For point-in-time and historical statistics requests, any additional characters provided in the request record past the request itself are retained in the response data, but otherwise ignored. In a TS7700 grid configuration, the request volume must be valid only on the specific cluster from which the data is to be obtained. Use a specific Management Class that has a copy policy defined to indicate that only the desired cluster is to have a copy of the data. When there is only one copy of the request volume, any virtual device address on any of the clusters in the same grid configuration can be used to request and access the data; host connectivity to the specific cluster is not required. If the Management Class indicates that more than one cluster is to have a valid copy of the request volume, the response data can be unpredictable.
10.9.4 Response data format
After the request data set has been written to the volume, closed, and demounted, the next mount of the volume causes the TS7700 Virtualization Engine to validate the contents of the request volume and append the requested data records to the data set.
Human-readable appended records vary in length between 80 bytes and 640 bytes, depending on the report requested. Binary appended records are variable in length, up to 24000 bytes. The data set is now a response data set. The appropriate block counts in the end of file (EOF) records are updated to reflect the total number of records written to the volume.
These records contain the specific response records based on the request. If the request cannot be understood or was invalid, that is indicated. The record length for each type of response data is listed in Table 10-6.
Table 10-6 Record length of response data
BVIR request    Record length in bytes
After appending the records and updating the EOF records, the host that requested the mount is signaled that the mount is complete and can read the contents of the volume. If the contents of the request volume are not valid, either one or more error description records will be appended to the data set or the data set will be unmodified before signaling the host that the mount completed, depending on the problem encountered.
All human-readable response records begin in the first character position of the record and are padded with blank characters on the right to fill out the record. All binary records are variable in length and are not padded.
In the response records, the dates and times presented are all based on the internal clock of the TS7700 Virtualization Engine handling the request. The internal clock of a TS7700 Virtualization Engine is not synchronized to the host, but it is synchronized with all other TS7700 Virtualization Engines.
The host and the TS7740 Virtualization Engine can each be synchronized to a Network Time Protocol (NTP) server, but they use different NTP servers with different timing protocols. Slight time differences are still possible when NTP is used.
The response data set contains the two request records that are described in 10.9.3, “Request data format” on page 732, three explanatory records (Records 3 - 5), and, starting with Record 6, the actual response to the data request.
The detailed description of the record formats of the response record is in the following white papers:
IBM Virtualization Engine TS7700 Series Bulk Volume Information Retrieval Function User’s Guide:
IBM Virtualization Engine TS7700 Series Statistical Data Format White Paper:
The response data set has this general format:
Records 1 - 2
These records contain the contents of request records 1 and 2.
Record 3
This record contains the date and time that the response data set was created and a format version number for the results.
Record 4
This record contains both the five-character hardware serial number of the TS7700 Virtualization Engine, and the five-character distributed library sequence number of the cluster that generated the response.
Record 5
This record contains all blank characters.
Records 6 - N
These records contain the specific response records based on the request. If the request cannot be understood or was invalid, that will be indicated.
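The general format above lends itself to a simple split of a human-readable response into its parts. The following is a minimal sketch that assumes the records have already been read from the volume and decoded into strings; the names BvirResponse and split_response are hypothetical, and the internal layout of Records 3 and 4 is deliberately not parsed because it is defined in the white papers, not here.

```python
from typing import List, NamedTuple


class BvirResponse(NamedTuple):
    request_records: List[str]  # Records 1 - 2, echoed from the request
    created_record: str         # Record 3: date, time, and format version
    identity_record: str        # Record 4: serial number and library sequence number
    data_records: List[str]     # Records 6 - N: the actual response


def split_response(records: List[str]) -> BvirResponse:
    """Split decoded response records according to the general format."""
    if len(records) < 5:
        raise ValueError("a response data set has at least five records")
    if records[4].strip():
        raise ValueError("Record 5 is expected to contain only blanks")
    return BvirResponse(
        request_records=[r.rstrip() for r in records[0:2]],
        created_record=records[2].rstrip(),
        identity_record=records[3].rstrip(),
        data_records=[r.rstrip() for r in records[5:]],
    )
```

If the request was not understood, the data records hold the error description instead of report data, so a caller should inspect data_records before treating them as a report.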
10.9.5 Interpreting the BVIR response data
This section explains how to interpret the BVIR response data set for each of the following types of request information:
Volume Status Information
Cache Contents Information
Physical Volume to Logical Volume Mapping Information
Point in Time Statistics
Historical Statistics
Physical Media pools
Physical Volume Status
Copy Audit Request
Clarification: When records are listed in this chapter, an initial record shows “1234567890123...”. This record does not exist, but is provided to improve readability.
Volume status information
A database is maintained on each TS7740 cluster that contains information related to the management of the logical volumes on the cluster and copy and resynchronization processes when the TS7700 Virtualization Engines are in a grid configuration. Several returned database fields can be useful in handling operational exceptions at one or more clusters in a grid configuration.
The volume status information returned represents the status of the volume on the cluster to which the request volume was written. In a TS7700 Virtualization Engine grid configuration, separate requests must be issued to each cluster to obtain the volume status information for the individual clusters. Using the volume serial number mask specified in the request, a response record is written for each matching logical volume that exists in the cluster. A response record consists of the database fields described in the white paper. Fields are presented in the order defined in the table and are comma-separated. The overall length of each record is 640 bytes with blank padding after the last field, as needed. The first few fields of the record returned for VOLSER ABC123 are shown in Example 10-5.
Example 10-5 BVIR volume status information
Important information is derived from the records:
Data Inconsistent
This field indicates whether the cluster has a valid version of the data. If it indicates that the data on the logical volume is not valid, this means that the same volume on another TS7700 Virtualization Engine in the grid has been modified and it has not yet been copied. If you use the deferred Copy Consistency Point (which is typically when there is significant distance between the TS7700 Virtualization Engines in the grid configuration), there will be some number of volumes that are not consistent between the TS7700 Virtualization Engines at any point in time.
If a situation renders the site where the source data resides inoperable, you can issue the Volume Status request to an operable TS7700 Virtualization Engine and use this field to identify the volumes that were not copied before the situation occurred, so that appropriate recovery steps can be performed for them.
MES Volume
This field indicates that the logical volume was created in the TS7700 Virtualization Engine Cluster, or even within a VTS, before being merged into a grid configuration. Volumes that existed in a TS7700 Virtualization Engine cluster before being included in a grid configuration are not automatically copied to the other TS7700 Virtualization Engine clusters in the configuration until they have been accessed and closed. This field can be used to determine which volumes in each TS7700 Virtualization Engine cluster have not been copied, to build a set of jobs to access them, and force the copy. The PRESTAGE program from the TAPETOOL FTP site can help you do that job efficiently. The VEHSYNC job can be used to identify volumes needing copies.
Additional information about various tools available for monitoring your TS7700 Virtualization Engine is provided in 10.11.1, “VEHSTATS tool overview” on page 750. You can also access the TAPETOOL FTP site at the following URL:
Copy Required for Cluster n
This field indicates that a copy to another TS7700 Virtualization Engine Cluster in a grid configuration is required. In cases where Deferred mode copy is used, this field can be used to determine whether a critical set of volumes have completed their copy operations to specific clusters.
Volume Ownership and Volume Ownership Taken
At any point in time, a logical volume is owned by a specific cluster. If required, ownership is transferred as part of mount processing. If a logical volume is mounted on a virtual drive anywhere in the composite library, ownership will not be transferred until the volume is unloaded. Ownership can transfer in one of two ways:
 – Through communication with the current owning cluster
 – Through a recovery process called ownership takeover
Normally, if the cluster receiving a mount command does not have ownership of the volume, it requests the transfer of volume ownership from the current owning cluster. If the volume is not in use, ownership is transferred.
However, if the cluster receiving the mount request cannot communicate with the owning cluster, that method does not work. In this case, the requesting clusters cannot determine whether the owning cluster has failed or just the grid network links to it have failed. Operator intervention is required to indicate that the owning cluster has failed and that ownership takeover by the other clusters is allowed. Two types of ownership takeover are available:
 – Write ownership takeover (WOT): The cluster taking over ownership of the volume has complete freedom to modify the contents of the volume or modify any of the properties associated with the volume. This includes scratch mounts.
 – Read ownership takeover (ROT): The cluster taking over ownership of the volume is restricted to reading the volume’s data only. Therefore, a cluster in ROT mode fails a scratch mount request for which it is unable to acquire volume ownership.
Current and Pending Category
One of the key properties associated with a volume is the category to which it is assigned. The primary use for a category is to group scratch volumes together. A volume’s category assignment changes as the volume is used. The current category field indicates the category the volume is assigned to within the TS7700 Virtualization Engine Integrated Library Manager function. The pending category field indicates that a new category assignment is in progress for the volume. These fields can be used to determine whether the category assignments are in sync between the clusters and the host databases.
Data Deleted
As part of normal processing in a TS7700 Virtualization Engine Cluster, you can specify that after a certain period of time after being returned to scratch, the contents of a volume can be deleted. This field indicates whether or not the data associated with the volume has been deleted on the cluster.
Removal State
As part of normal processing in a TS7700 Virtualization Engine Grid configuration where a mixture of both TS7740 Virtualization Engine and TS7720 Virtualization Engine clusters exists, a data removal or migration process occurs where data is removed from TS7720 Virtualization Engine clusters to prevent TS7720 Virtualization Engine clusters from overrunning their TVC. This field, and the removal time stamp, can be used to determine whether the data associated with the volume has been removed.
This field represents the cluster’s view of which clusters have down-level token or volume metadata information as a result of a cluster outage. When clusters are unavailable because of expected or unexpected outages, the remaining clusters mark the unavailable cluster for pending reconciliation by updating this hot mask. The field represents both Insert or Eject pending updates, or regular pending updates. Insert/Eject updates are related to volumes being inserted or ejected during the outage. Regular pending updates are for updates that occur to the volume during an outage as a result of normal operations, such as host I/O. Each bit within the mask represents which clusters are viewed as needing reconciliation.
Cache content information
Volumes that are accessed by a host are maintained in the TVC, which is managed by each cluster. The cache can be partitioned into up to eight partitions. The TS7700 Virtualization Engine controls the movement of logical volumes out of a cache partition as space is needed for newly created or recalled volumes for that partition. The primary goal of the cache management algorithms in the TS7700 Virtualization Engine is to maximize the utilization of its cache for volumes that have some likelihood to be accessed again.
The cache management function of the TS7700 Virtualization Engine arranges the volumes in a cache partition in the anticipated order in which they are to be removed when space is needed. A volume can be removed from cache only after it has been premigrated (copied to a physical tape). For this reason, it is possible that volumes with a higher order number are removed from cache first. As part of the Advanced Policy Management functions of the TS7700 Virtualization Engine, the Storage Class construct provides control of the partition for a volume’s data and cache preferencing policies for the management of the volume in cache.
Two preferencing policies are supported:
Preference Level 0 (PG0)
When space is needed in the cache, premigrated volumes assigned to PG0 are removed from cache before volumes assigned to preference group 1. Within PG0, the volumes are ordered for removal from cache by largest volumes first.
Clarification: Volumes that are assigned to PG0 can also be removed from the cache, independently of the need for cache space, as a background task within the TS7700 Virtualization Engine.
Preference Level 1 (PG1)
When space is needed in the cache and there are no premigrated PG0 volumes to remove, premigrated volumes that are assigned to PG1 are removed. Within PG1, the volumes are ordered for removal from cache based on the time since the last access or “least recently used” (LRU).
Tip: The order of removal of a volume from cache might also be influenced by other storage construct settings for a volume, so the order that is presented in the response data must not be relied on to be exact.
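The two preferencing policies can be modeled with a short sketch. This is a simplified illustration of the ordering rules stated above (premigrated PG0 volumes by largest size first, then premigrated PG1 volumes by least recent access), not the actual TS7700 algorithm; as the tip notes, other construct settings can change the real order. The CachedVolume class and all of its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CachedVolume:
    volser: str
    preference_group: int  # 0 for PG0, 1 for PG1
    size_mb: int
    last_access: float     # larger value means accessed more recently
    premigrated: bool


def removal_order(volumes: List[CachedVolume]) -> List[str]:
    """Return VOLSERs in the anticipated order of removal from cache."""
    # Only premigrated volumes are candidates for removal.
    candidates = [v for v in volumes if v.premigrated]
    # PG0: largest volumes first.
    pg0 = sorted((v for v in candidates if v.preference_group == 0),
                 key=lambda v: v.size_mb, reverse=True)
    # PG1: least recently used first.
    pg1 = sorted((v for v in candidates if v.preference_group == 1),
                 key=lambda v: v.last_access)
    return [v.volser for v in pg0 + pg1]
```

A volume that has not been premigrated never appears in the result, which is why a volume with a higher order number can leave the cache before one that is still waiting to be copied to physical tape.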
The contents of the cache that are associated with the specific cluster that the request volume is written to are returned in the response records. In a TS7700 grid configuration, separate requests must be issued to each cluster to obtain the cache contents.
The response records are written in 80-byte fixed block (FB) format.
The generation of the response might take several minutes to complete depending on the number of volumes in the cache and how busy the TS7700 cluster is at the time of the request.
The contents of the cache typically are all private volumes. However, some might have been returned to SCRATCH status soon after being written. The TS7700 does not filter the cache contents based on the private or SCRATCH status of a volume.
Physical volume to logical volume mapping information
The TS7700 Virtualization Engine maintains the mapping between logical and physical volumes in a database on each cluster. It is possible that there are inconsistencies in the mapping information provided with this function. This results when a logical volume is being moved from one physical volume to another. For a period of time, the volume is shown on more than one physical volume. This can result in a small number of logical volumes reported as being on physical volumes that they were located on in the past, but are not presently located on.
Even with inconsistencies, the mapping data is useful if you want to design jobs that recall data efficiently from physical volumes. If the logical volumes reported on a physical volume are recalled together, the efficiency of the recalls is increased. If a logical volume with an inconsistent mapping relationship is recalled, it is recalled correctly, but an additional mount of a separate physical volume might be required.
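Grouping recalls by physical volume can be sketched as follows. The input is assumed to be a list of (physical volume, logical volume) pairs that you have already extracted from the response records; the record layout itself is not modeled here, and the function name group_recalls is hypothetical. When a logical volume is reported on more than one physical volume, this sketch keeps the first occurrence, which is an arbitrary choice.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_recalls(mapping: Iterable[Tuple[str, str]],
                  wanted: Iterable[str]) -> Dict[str, List[str]]:
    """Group the wanted logical volumes by the physical volume they are on."""
    wanted_set = set(wanted)
    seen = set()
    groups: Dict[str, List[str]] = defaultdict(list)
    for physical, logical in mapping:
        if logical in wanted_set and logical not in seen:
            seen.add(logical)  # ignore later, possibly stale, mappings
            groups[physical].append(logical)
    return dict(groups)
```

Each resulting group can then be recalled by one job or job step so that the physical volume is mounted once for all of its logical volumes.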
The physical volume to logical volume mapping that is associated with the physical volumes managed by the specific cluster to which the request volume is written is returned in the response records. In a TS7700 grid configuration, separate requests must be issued to each cluster to obtain the mapping for all physical volumes.
The response records are written in 80-byte FB format.
Tip: The generation of the response can take several minutes to complete depending on the number of active logical volumes in the library and how busy the TS7700 cluster is at the time of the request.
Point in time statistics
A TS7700 Virtualization Engine is continually logging information about the activities within it. The logged information is referred to as statistical information and is recorded in two forms: point in time and historical. Point-in-time statistics indicate the state and operational aspects of the TS7700 Virtualization Engine over a short interval of time. The time interval is currently approximately 15 seconds. A request for point-in-time statistics returns the data accumulated in the interval that completed just before the request was processed. Because of this, the state information reported might lag behind the actual state of the TS7700 Virtualization Engine by an interval.
Other than an information header, point-in-time statistics are provided in a mixture of character and binary format fields. The record sizes and format of the statistical records are defined in the IBM Virtualization Engine TS7700 Series Statistical Data Format White Paper Version 2.0, which is available at the following URL:
The point-in-time statistics for all clusters are returned in the response records. In a TS7700 grid configuration, this means that the request volume can be written to any cluster to obtain the information for the entire configuration.
The response records are written in binary undefined (U) format with a maximum length of 24000 bytes.
If a cluster or node is not available at the time that the point in time statistics are recorded, except for the headers, all the data fields for that cluster or node will be zeros.
The request records are written in FB format. To read the response records, use the Undefined (U) format with a maximum blocksize of 24000. The response records are variable in length.
Historical statistics
A TS7700 Virtualization Engine is continually logging information regarding the activities within it. The logged information is referred to as statistical information and is recorded in two forms: point-in-time and historical. Historical statistics indicate the operational aspects of the TS7700 Virtualization Engine accumulated over a 15-minute interval of time. The data from each 15-minute interval is maintained and logged within the TS7700 Virtualization Engine. A request for historical statistics results in a response file that contains all of the data logged up to that point for the requested Julian day.
Other than an information header, historical statistics are provided in character and binary format fields. The sizes and format of the statistical records are defined in the IBM Virtualization Engine TS7700 Series Statistical Data Format White Paper Version 2.0:
The historical statistics for all clusters are returned in the response records. In a TS7700 grid configuration, this means that the request volume can be written to any cluster to obtain the information for the entire configuration.
The response records are written in a binary undefined (U) format of a maximum of 24000 bytes.
If a cluster or node is not available at the time that the historical statistics are recorded, except for the headers, all the data fields for that cluster or node will be zeros.
The TS7700 Virtualization Engine retains 90 days’ worth of historical statistics. If you want to keep statistics for a longer period of time, be sure that you retain the logical volumes that are used to obtain the statistics.
The request records are written in FB format. To read the response records, use the undefined (U) format with a maximum blocksize of 24000. The response records are variable in length.
Physical media pools
The TS7700 Virtualization Engine supports separating the physical volumes that it manages into pools. The supported pools include a common scratch pool that contains scratch (empty) volumes, and up to 32 pools that can contain scratch (empty) and data (filling/full) volumes. Pools can borrow volumes from, and return volumes to, the common scratch pool. Each pool can contain several types of media.
For pool 0 (common scratch pool), because it only contains empty volumes, only the empty count is returned. Volumes that have been borrowed from the common pool are not included.
For pools 1 - 32, a count of the physical volumes that are empty, are empty and waiting for erasure, are in the process of being filled, and have been marked as full is returned. The count for empty includes physical volumes that have been specifically assigned to the pool and volumes that were borrowed from the common scratch pool but have not yet been returned. The count of volumes that are marked as Read Only or Unavailable (including destroyed volumes) is returned. Also, the full data volumes contain a mixture of valid and invalid data. Response records are provided for the distribution of active data on the data volumes marked as full for a pool.
Information is returned for the common pool and all other pools that are defined and have physical volumes associated with them.
The physical media pool information managed by the specific cluster to which the request volume is written is returned in the response records. In a TS7700 grid configuration, separate requests must be issued to each cluster to obtain the physical media pool information for all clusters.
The response records are written in 80-byte FB format. Counts are provided for each media type associated with the pool (up to a maximum of eight).
Physical volume status
A database is maintained on each TS7740 cluster that contains information related to the management of the physical volumes on the cluster. The physical volume status information that is returned represents the status of the volume or volumes on the cluster to which the request volume is written. In a TS7700 grid configuration, separate requests must be issued to each cluster to obtain the physical volume status information for the individual clusters. A response record is written for each physical volume, selected based on the volume serial number mask or pool number specified in the request, that exists in the cluster. A response record consists of the database fields defined in the following table. Fields are presented in the order defined in the table and are comma-separated (,).
The overall length of each record is 400 bytes with blank padding after the last field, as needed. For example, the first few fields of the record returned for VOLSER A03599 are shown:
Tip: The generation of the response might take several minutes to complete depending on the number of volumes requested and how busy the TS7700 cluster is at the time of the request.
Copy audit request
A database is maintained on each TS7740 Virtualization Engine cluster that contains status information about the logical volumes defined to the grid. Two key pieces of information are whether the cluster contains a valid copy of a logical volume and whether the copy policy for the volume indicates that it must have a valid copy.
This request performs an audit of the databases on a set of specified TS7700 Virtualization Engine distributed libraries to determine whether there are any volumes that do not have a valid copy on at least one of them. If the COPYMODE option is specified, whether the volume is supposed to have a copy on the distributed library is taken into account when determining whether that distributed library has a valid copy. If COPYMODE is specified and the copy policy for a volume on a specific cluster is “R” or “D”, that cluster is considered during the audit. If COPYMODE is specified and the copy policy for a volume on a specific cluster is “N”, the volume’s validity state is ignored because that cluster does not need to have a valid copy. The request then returns a list of any volumes that do not have a valid copy, subject to the copy mode if the COPYMODE option is specified, on the TS7700 Virtualization Engine clusters specified as part of the request.
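One way to read the audit rule is sketched below. This is an interpretation of the description above, not the TS7700 implementation: without COPYMODE, a volume is reported if none of the audited clusters has a valid copy; with COPYMODE, clusters whose copy policy for the volume is “N” are ignored, and a volume for which no audited cluster requires a copy is not reported. The function name and the shape of the input data are hypothetical.

```python
from typing import Dict, List, Tuple

# For one volume: {distributed library ID: (has_valid_copy, copy_policy)},
# where copy_policy is "R", "D", or "N". This structure is an assumption.
VolumeState = Dict[str, Tuple[bool, str]]


def volumes_without_valid_copy(volumes: Dict[str, VolumeState],
                               audited: List[str],
                               copymode: bool = False) -> List[str]:
    """Return the VOLSERs that have no valid copy on any audited cluster."""
    missing = []
    for volser, state in volumes.items():
        clusters = list(audited)
        if copymode:
            # Only clusters that are supposed to hold a copy are considered.
            clusters = [c for c in clusters if state[c][1] in ("R", "D")]
            if not clusters:
                continue  # no audited cluster needs a copy of this volume
        if not any(state[c][0] for c in clusters):
            missing.append(volser)
    return missing
```

The COPYMODE branch is the part most open to interpretation, so verify the behavior against the BVIR User’s Guide before relying on it for planning.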
The specified clusters might not have a copy for several reasons:
The copy policy associated with the volume did not specify that any of the clusters specified in the request were to have a copy and the COPYMODE option was not specified. This might be because of a mistake in defining the copy policy or because it was intended. For example, volumes used in a disaster recovery test only need to reside on the disaster recovery TS7700 Virtualization Engine and not on the production TS7700 Virtualization Engines. If the request specified only the production TS7700 Virtualization Engines, all of the volumes used in the test are returned in the list.
The copies have not yet been made from a source TS7700 Virtualization Engine to one or more of the specified clusters. This can be because the source TS7700 Virtualization Engine or the links to it are unavailable, or because a copy policy of Deferred was specified and a copy had not completed when the audit was performed. In addition, one or more of the specified clusters might have completed their copy and then had their copy automatically removed as part of the TS7700 Virtualization Engine hybrid automated removal policy function. Automatic removal can only take place on TS7720 Virtualization Engine clusters in a hybrid configuration.
Each of the specified clusters contained a valid copy at one time but has since removed it as part of the TS7700 Virtualization Engine hybrid automated removal policy function. Automatic removal can only take place on TS7720 Virtualization Engine clusters in a hybrid configuration.
The Copy Audit request is intended to be used for the following situations:
A TS7700 Virtualization Engine is to be removed from a grid configuration. Before its removal, you want to ensure that the TS7700 Virtualization Engines that are to remain in the grid configuration have a copy of all the important volumes that were created on the TS7700 Virtualization Engine that is to be removed.
A condition has occurred (because of a site disaster or as part of a test procedure) where one of the TS7700 Virtualization Engines in a grid configuration is no longer available and you want to determine which, if any, volumes on the remaining TS7700 Virtualization Engines do not have a valid copy.
In the Copy Audit request, you need to specify which TS7700 Virtualization Engine clusters are to be audited. The clusters are specified by using their associated distributed library ID (this is the unique five-character library sequence number defined when the TS7700 Virtualization Engine Cluster was installed). If more than one distributed library ID is specified, they are separated by a comma. The following rules determine which TS7700 Virtualization Engine clusters are to be included in the audit:
When the INCLUDE parameter is specified, all specified distributed library IDs will be included in the audit. All clusters associated with these IDs must be available or the audit will fail.
When the EXCLUDE parameter is specified, all specified distributed library IDs will be excluded from the audit. All other clusters in the grid configuration must be available or the audit will fail.
Distributed library IDs specified are checked for being valid in the grid configuration. If one or more of the specified distributed library IDs are invalid, the Copy Audit fails and the response will indicate the IDs that are considered invalid.
Distributed library IDs must be specified or the Copy Audit fails.
Here are examples of valid requests (assume a three-cluster grid configuration with distributed library IDs of DA01A, DA01B, and DA01C):
 – COPY AUDIT INCLUDE DA01A: Audits the copy status of all volumes on only the cluster associated with distributed library ID DA01A.
 – COPY AUDIT COPYMODE INCLUDE DA01A: Audits the copy status of volumes that also have a valid copy policy on only the cluster associated with distributed library ID DA01A.
 – COPY AUDIT INCLUDE DA01B,DA01C: Audits the copy status of volumes on the clusters associated with distributed library IDs DA01B and DA01C.
 – COPY AUDIT EXCLUDE DA01C: Audits the copy status of volumes on the clusters in the grid configuration associated with distributed library IDs DA01A and DA01B.
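The request rules above can be sketched in a few lines of Python. This is an illustration only: the function names `build_copy_audit_request` and `audited_clusters` are hypothetical and are not part of any IBM tool, and the ID check assumes that a distributed library ID is exactly five alphanumeric characters, as the description of the library sequence number suggests. The sketch builds the request text and works out which clusters an INCLUDE or EXCLUDE request would cover in the three-cluster example grid.

```python
import re

# Assumption: a distributed library ID is five alphanumeric characters.
_ID_PATTERN = re.compile(r"^[A-Z0-9]{5}$")


def build_copy_audit_request(mode, library_ids, copymode=False):
    """Return the text of a Copy Audit request (hypothetical helper)."""
    if mode not in ("INCLUDE", "EXCLUDE"):
        raise ValueError("mode must be INCLUDE or EXCLUDE")
    if not library_ids:
        # Distributed library IDs must be specified or the audit fails.
        raise ValueError("at least one distributed library ID is required")
    for lib_id in library_ids:
        if not _ID_PATTERN.match(lib_id):
            raise ValueError("invalid distributed library ID: " + lib_id)
    parts = ["COPY AUDIT"]
    if copymode:
        parts.append("COPYMODE")
    parts.append(mode)
    # Multiple IDs are separated by a comma.
    parts.append(",".join(library_ids))
    return " ".join(parts)


def audited_clusters(mode, library_ids, grid_ids):
    """Return the grid's library IDs that the audit would cover."""
    unknown = [i for i in library_ids if i not in grid_ids]
    if unknown:
        # IDs that are not valid in the grid make the Copy Audit fail.
        raise ValueError("IDs not in grid: " + ",".join(unknown))
    if mode == "INCLUDE":
        return [i for i in grid_ids if i in library_ids]
    return [i for i in grid_ids if i not in library_ids]


grid = ["DA01A", "DA01B", "DA01C"]
print(build_copy_audit_request("INCLUDE", ["DA01B", "DA01C"]))
print(audited_clusters("EXCLUDE", ["DA01C"], grid))
```

Keeping the availability check out of the sketch is deliberate: whether a cluster is available can only be determined by the grid itself at the time of the request.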
On completion of the audit, a response record is written for each logical volume that did not have a valid copy on any of the specified clusters. Volumes that have never been used, have had their associated data deleted, or have been returned to scratch are not included in the response records. The record includes the volume serial number and the copy policy definition for the volume. The VOLSER and the copy policy definitions are comma separated, as shown in Example 10-6.
The response records are written in 80-byte FB format.
Example 10-6 BVIR message when Copy Audit is requested
The output for Copy Audit includes Copy Consistency Points for up to eight TS7700 Virtualization Engine clusters. This is to provide for future expansion of the number of clusters supported in a TS7700 Virtualization Engine Grid to the architected maximum.
Copy Audit might take more than an hour to complete depending on the number of logical volumes that have been defined, how many clusters are configured in the grid configuration, and how busy the TS7700 Virtualization Engines are at the time of the request.
Unknown or invalid request
If the request file does not contain the correct number of records or the first record is incorrect, the request file on the volume is unchanged and no error is indicated.
If the request file contains the correct number of records and the first record is correct but the second is not, the response file will indicate in Record 6 that the request is unknown, as shown in Example 10-7.
Example 10-7 BVIR message when an unknown or invalid request is submitted
If the request file contains the correct number of records, the first record is correct, and the second is recognized but includes a variable that is not within the range supported for the request, the response file will indicate in record 6 that the request is invalid, as shown in Example 10-8.
Example 10-8 BVIR message when an invalid variable is specified
10.10 IBM Tape Tools
A set of tape tools is available on an as-is basis to help you monitor your tape environment. Several of these tools, such as VEHSTATS, are specific to the TS7700 and are based on BVIR data. This section first gives a general overview of the IBM Tape Tools and guides you through installing them in a z/OS environment. VEHSTATS and the reports that you can obtain by running the VEHSTATS jobs are described afterward.
10.10.1 Introduction to IBM Tape Tools
All the TS7700 Virtualization Engine monitoring and evaluation tools that are described in this chapter are available at the following FTP site:
Example 10-9 on page 745 lists the content of the Readme.txt file that provides basic information about the tape tools.
Example 10-9 Readme.txt from the IBM Tape Tools website
Program enhancements will be made to handle data format changes when they occur. If you try to run new data with old program versions, results will be unpredictable. To avoid this situation, you need to be informed of these enhancements so you can stay current. To be informed of major changes to any of the tools distributed via this ftp site, send an email message to:
In the subject, specify NOTIFY. Nothing else is required in the body of the note. This will add you to our change distribution list.
The UPDATES.TXT file will contain a chronological history of all changes made to the tools. You should review that file on a regular basis, at least monthly, perhaps weekly, so you can see if any changes apply to you.
Look in file, OVERVIEW.PDF, for an overview of all currently available tools.
The JCL, CNTL, and LOAD libraries for all the tools are zipped into IBMTOOLS.EXE.
IBMTOOLS.TXT explains the complete installation procedure.
Most tools have their own xxxxxx.txt file with more detail. There are no formal documentation manuals. The intent is to have enough JCL comment to allow the user to run the jobs without difficulty and to have adequate column headings and footnotes to make report output obvious without needing additional documentation.
If you feel that the JCL or report output needs more explanation, please send an email to the address above indicating the area needing attention.
Most of these tools are z/OS-based and included in the ibmtools.exe file. A complete list of all tools that are included in the ibmtools.exe file is available in the overview.pdf file. Tools that might be interesting for you are presented in Table 10-7.
Table 10-7 Tape tools selection
Major use | Benefit | Inputs | Outputs
Identify small VTS blocksizes | Improve VTS performance, make jobs run faster | | VOLSER, Jobname, and Dsname for VTS volumes with small blocksizes
Get historical stats from TS7700 | Creates U, VB, SMF format | | Statistics file
Identify available scratch by pool | Reports all pools at once | BVIR file | Physical media by pool
Reclaim Copy Export volumes | Based on active GB, not % | BVIR file | Detailed report of data on volumes
Identify VTS virtual volumes by owner | Determine which applications or users have virtual volumes | | Logical volumes by jobname or dsname, logical to physical reports
Point in Time stats as write to operator (WTO) | Immediately available | | Point in Time stats as WTO
Copy lvols from old VTS | Recall lvols based on selected applications | | IEBGENER to recall lvols and copy to new VTS
Identify multi-file volumes with different expiration dates | Prevent single file from not allowing volume to return to scratch | | List of files not matching file 1 expiration date
Replace *.HMIGTAPE.DATASET in SMF 14 with actual recalled dsname | Allows TapeWise and other tools using SMF 14/15 data to report actual recalled data set | FSR records plus SMF 14, 15, 21, 30, 40, and so on | Updated SMF 14s plus all other SMF records as they were
Get VOLSERs from list of dsns | Automate input to PRESTAGE | | VOLSERs for requested dsns
Report job elapsed times | Show runtime improvements | SMF 30 records | Job step detailed reporting
Monitor mount pending and volume allocations | Determine accurate mount times and concurrent drive allocations | Samples tape UCBs | Detail, summary, distribution, hourly, TGROUP, and system reporting
Identify orphan data sets in Tape Management Catalog | Cleanup tool | | Listing file showing all multiple occurrence generation data group (GDGs) that have not been created in the last nn days
Recall lvols to VTS | Ordered and efficient | | Jobs submitted to recall lvols
IFASMFDP exit or E15 exit | Filters SMF records to keep just tape activity. Generates “tape” records to simulate optical activity | SMF data | Records for tape activity plus optional TMM or optical activity
Show current tape compression ratios | See how well data will compress in VTS | Logrec MDR or EREP history file | Shift and hourly reports showing current read and write compression ratios
Identify tape usage improvement opportunities | Shows UNIT=AFF, early close, UNIT=(TAPE,2), multi-mount, DISP=MOD, recalls | SMF 14, 15, 21, 30, and 40 | Detail, summary, distributions, hourly, TGROUP, and system reporting
Identify tape configuration database (TCDB) versus Tape Catalog mismatches | List VOLSER mismatches | | ERRRPT with mismatched volumes
Identify data sets with create date equal to last ref date | Get candidate list for VTS PG0 | | Filter list of potential PG0 candidates
Graphing package | Graphs TS7700 activity | VEHSTATS flat files | Many graphs of TS7700 activity
Dump fields in historical statistics file | Individual field dump | BVIR stats file | DTLRPT for selected interval
TS7700 historical performance reporting | Show activity on and performance of TS7700 | | Reports showing mounts, data transfer, and box usage
TS7700 point-in-time statistics | Snapshot of last 15 seconds of activity plus current volume status | BVIRPIT data file | Reports showing current activity and status
Synchronize TS7700 after new cluster added | Identify lvols that need copies | | List of all VOLSERs to recall by application
Show all active VOLSERs from tape management catalog. Also get volume counts by group, size, and media. | Used to get a picture of user data set naming conventions. See how many volumes are allocated to different applications. | | Dsname, VOLSER, create date, and volseq. Group name, counts by media type.
10.10.2 Tools download and installation
Public access is provided to the IBM Tape Tools library, which contains various tools that can help you analyze your tape environment. This set of utilities also includes the VEHSTATS and VEPSTATS tools, which use the Bulk Volume Information Retrieval (BVIR) reports for comprehensive performance analysis.
Figure 10-47 shows several tools that are available from the FTP site.
Figure 10-47 Tape tools catalog
The index is at the following web address:
For most tools, a text file is available. In addition, each job to run a tool contains a detailed description of the function of the tool and parameters that need to be specified.
Important: For the IBM Tape Tools, there are no warranties, expressed or implied, including the warranties of merchantability and fitness for a particular purpose.
To obtain the tape tools, download the ibmtools.exe file to your computer or use FTP from Time Sharing Option (TSO) on your z/OS system to directly upload the files contained in the ibmtools.exe file.
The ibmtools.exe file is a self-extracting .zip file that is expanded into four separate files:
IBMJCL.XMI Contains the execution JCL for current tape analysis tools.
IBMCNTL.XMI Contains parameters needed for job execution, but that do not need to be modified by the user.
IBMLOAD.XMI Contains the load library for executable load modules.
IBMPAT.XMI Contains the data pattern library, which is only needed if you will run the QSAMDRVR utility.
The ibmtools.txt file contains detailed information about how to download and install the tools libraries.
After you have created the three or four libraries on the z/OS host, be sure that you execute the following steps:
1. Copy, edit, and submit userid.IBMTOOLS.JCL($$CPYLIB) to create a new JCL library that has a unique second node (&SITE symbolic). This step creates a private JCL library for you from which you can submit jobs while leaving the original as is. CNTL and LOAD can then be shared by multiple users running jobs from the same system.
2. Edit and submit userid.SITENAME.IBMTOOLS.JCL($$TAILOR) to tailor the JCL according to your system requirements.
The updates.txt file contains all fixes and enhancements made to the tools. Review this file regularly to determine whether any of the programs that you use have been modified.
To ensure that you are not working with outdated tools, the tools are controlled through an EXPIRE member. Every three months, a new EXPIRE value will be issued that is good for the next 12 months. When you download the latest tools package any time during the year, you have at least nine months remaining on the EXPIRE value. New values are issued in the middle of January, April, July, and October.
If your IBM tools jobs stop running because the expiration date has passed, download the ibmtools.exe file again to get the latest IBMTOOLS.JCL(EXPIRE) member.
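The EXPIRE arithmetic can be checked with a short Python sketch. It assumes that "the middle of" the month means the 15th; the exact issue day is not stated here, and the function names are hypothetical. Given a download date, the sketch finds the most recent issue date and adds 12 months to obtain the date until which the downloaded package remains usable.

```python
from datetime import date

ISSUE_MONTHS = (1, 4, 7, 10)  # January, April, July, October
ISSUE_DAY = 15                # assumption: "middle of the month" = 15th


def latest_issue_date(download):
    """Most recent EXPIRE issue date on or before the download date."""
    candidates = [
        date(year, month, ISSUE_DAY)
        for year in (download.year - 1, download.year)
        for month in ISSUE_MONTHS
    ]
    return max(d for d in candidates if d <= download)


def expire_date(download):
    """Each EXPIRE value is good for 12 months from its issue date."""
    issued = latest_issue_date(download)
    return issued.replace(year=issued.year + 1)


download = date(2024, 3, 1)  # illustrative date only
print(latest_issue_date(download), expire_date(download))
print((expire_date(download) - download).days, "days remaining")
```

Because a new value is issued every three months, the value in a fresh download is never more than about three months old, which leaves at least about nine months of validity.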
10.10.3 IBM Tape Tools for TS7700 monitoring
This section describes several tape tools that can help you better understand your tape processing with regard to TS7700 operation and migration.
The IOSTATS tool is part of the ibmtools.exe file, which is available at the following URL:
You can use the IOSTATS tool to measure job execution times. For example, you might want to compare the TS7700 Virtualization Engine performance before and after configuration changes.
IOSTATS can be run for a subset of job names for a certain period of time before the hardware installation. SMF type 30 records are required as input. The reports list the number of disk and tape I/O operations that were done for each job step, and the elapsed job execution time.
With the TS7700 Virtualization Engine running in a multicluster grid configuration, IOSTATS can be used for the following purposes:
To evaluate the effect of the multicluster grid environment and to compare job execution times before implementation of the multicluster grid to those after migration, especially if you are operating in immediate copy (RUN, RUN data consistency point) mode.
To evaluate the effect of hardware upgrades and to compare job execution times before and after upgrading components of the TS7700 Virtualization Engine. For example, you might want to verify the performance impact of a larger TVC capacity or the number of TS1130/TS1120/3592 tape drives.
To evaluate the effect of changing the copy mode of operation on elapsed job execution time.
As with IOSTATS, the TAPEWISE tool is available from the IBM Tape Tools FTP site. Based on input parameters, TAPEWISE can generate several reports:
Tape activity analysis
Mounts and MBs processed by hour
Input and output mounts by hour
Mounts by SYSID during an hour
Concurrent open drives used
Long VTS mounts (recalls)
As with IOSTATS, MOUNTMON is available from the IBM Tape Tools FTP site. MOUNTMON runs as a started task or batch job and monitors all tape activity on the system. The program must be authorized program facility (APF)-authorized and, if it runs continuously, it writes statistics for each tape volume allocation to SMF or to a flat file.
Based on data that is gathered from MOUNTMON, the MOUNTRPT program can report on the following information:
How many tape mounts are necessary
How many are scratch
How many are private
How many by host system
How many by device type
How much time is needed to mount a tape
How long are tapes allocated
How many drives are being used at any given time
What is the most accurate report of concurrent drive usage
Which jobs are allocating too many drives
10.11 Using VEHSTATS and VEHGRXCL for monitoring and reporting
This section shows how to work with the binary reports for point-in-time and historical statistics after you use the BVIR functions that are described in Appendix F, “Sample JCL” on page 925. Some of the response data of the BVIR functions is already in a readable format. For the remaining binary format data provided by the point-in-time statistics and historical statistics, you need a formatting tool. IBM provides a tool called VEHSTATS. Further information about where to download this tool and how to use it is in 10.10, “IBM Tape Tools” on page 744.
To convert the binary BVIR response records into a readable format, use the IBM tool VEHSTATS for historical statistics and the IBM tool VEPSTATS for point-in-time statistics. See 10.10.2, “Tools download and installation” on page 747 for specifics about where to obtain these tools. Details about using BVIR are in the IBM Virtualization Engine TS7700 Series Bulk Volume Information Retrieval Function User’s Guide. The most recently published white papers are available at the Techdocs website by searching for TS7700 Virtualization Engine at the following address:
With the record layout of the binary BVIR response data, you can decode the binary file or you can use the record layout to program your own tool for creating statistical reports.
10.11.1 VEHSTATS tool overview
The TS7700 Virtualization Engine’s activity is recorded in the subsystem. There are two types of statistics:
Point-in-time statistics: A snapshot of activity in the last 15 seconds
Historical statistics: Up to 90 days in 15-minute increments
Both sets of statistics can be obtained through the BVIR functions (see Appendix F, “Sample JCL” on page 925).
Because both types of statistical data are delivered in binary format from the BVIR functions, you must translate the content into a readable format. You can do this task manually by using the information provided in the following documents:
IBM Virtualization Engine TS7700 Series Statistical Data Format White Paper Version 2.0:
IBM Virtualization Engine TS7700 Series VEHSTATS Decoder:
Or, you can use an existing automation tool. IBM provides a historical statistics tool called VEHSTATS. Like the other IBM Tape Tools, the program is provided as-is, without official support, for the single purpose of showing how the data might be reported. There is no guarantee of its accuracy, and there is no additional documentation available for this tool. Guidance for interpretation of the reports is available in 10.11.3, “VEHSTATS reports” on page 752.
You can use VEHSTATS to monitor TS7700 Virtualization Engine virtual and physical back-end tape drives, and TVC activity to do trend analysis reports, based on BVIR binary response data. The tool summarizes TS7700 Virtualization Engine activity on a specified time basis, up to 90 days in time sample intervals of 15 minutes or one hour, depending upon the data reported.
Figure 10-47 on page 747 highlights three files that might be helpful in reading and interpreting VEHSTATS reports:
The TS7700.VEHSTATS.Decoder.V10.pdf file contains a description of the fields listed in the various VEHSTATS reports.
The VEHGRXCL.txt file contains the description for the graphical package contained in VEHGRXCL.EXE.
The VEHGRXCL.EXE file contains VEHSTATS_Model.ppt and VEHSTATS_Model.xls. You can use these files to create graphs of cluster activity based on the flat files created with VEHSTATS. Follow the instructions in the VEHSTATS_Model.xls file to create these graphs.
10.11.2 Running the VEHSTATS jobs
You have several output options for VEHSTATS, and you must submit separate jobs depending on your requirements. The IBMTOOLS.JCL member VEHSTATS (Example 10-10) provides guidance about which job to choose.
Example 10-10 Member VEHSTATS
In addition to the VEHSTATS tool, sample BVIR jobs are included in the IBMTOOLS libraries. These jobs help you obtain the input data from the TS7700 Virtualization Engine. With these jobs, you can control where the historical statistics are accumulated for long-term retention. The TS7700 Virtualization Engine still maintains historical statistics for the previous 90 days, but you can have the pulled statistics recorded directly to the SMF log file or continue to use the disk flat file method. The flat files can be recorded as either RECFM=U or RECFM=VB.
Three specific jobs in IBMTOOLS.JCL are designed to fit your particular needs:
BVIRHSTS To write statistics to the SMF log file
BVIRHSTU To write statistics to a RECFM=U disk file
BVIRHSTV To write statistics to a RECFM=VB disk file
The VEHSTATS reporting program accepts any or all of the various formats of BVIR input. Define which input is to be used through a data definition (DD) statement in the VEHSTATS job. The three input DD statements are optional, but at least one of the statements shown in Example 10-11 must be specified.
Example 10-11 VEHSTATS input DD statements
//* DSN=&USERHLQ..#&VTSID..BVIRHIST.D070205.D070205
//* DSN=&USERHLQ..#&VTSID..BVIRHIST.D070206.D070206
The SMF input file can contain all SMF record types kept by the user. The SMFNUM parameter defines which record number is processed when you specify the STATSMF statement.
The fields shown in the various reports depend on which ORDER member in IBMTOOLS.JCL is being used. Use the following steps to ensure that the reports and the flat file contain the complete information that you want in the reports:
1. Review which member is defined in the ORDER= parameter in the VEHSTATS job member.
2. Verify that none of the fields that you want to see have been deactivated by an asterisk in the first column. Example 10-12 on page 752 shows sample active and inactive definitions in the ORDERV12 member of IBMTOOLS.JCL. The sample statements define whether you want the amount of data in cache to be displayed in MB or in GB.
Example 10-12 Sample statements in the ORDERV12 member
If you are planning to create graphics from the flat file using the graphics package from the IBM Tape Tools FTP site, specify the ORDERV12 member because it contains all the fields that are used when creating the graphics, and verify that all statements are activated for all clusters in your environment.
10.11.3 VEHSTATS reports
VEHSTATS can be used to monitor TS7700 Virtualization Engine drive and TVC activity, and to perform trend analysis to see where the performance bottlenecks are. Also, comparative analysis can be used to determine whether an upgrade, such as adding additional physical tape drives, might improve the overall performance of the TS7740 Virtualization Engine. VEHSTATS is not a projection tool, but it provides the basis for an overall health check of the TS7700 Virtualization Engine.
VEHSTATS produces a large amount of information. The following list shows the most important reports available for the TS7700 Virtualization Engine. The sections that follow the list describe selected reports and how to analyze their results:
H20VIRT: Virtual Device Historical Records
H21ADP00: vNode Adapter Historical Activity
H21ADPXX: vNode Adapter Historical Activity combined (by adapter)
H21ADPSU: vNode Adapter Historical Activity combined (total)
H30TVC1: hNode HSM Historical Cache Partition
H31IMEX: hNode Export/Import Historical Activity
H32TDU12: hNode Library Historical Drive Activity
H32CSP: hNode Library Hist Scratch Pool Activity
H32GUPXX: General Use Pools 01/02 through General Use Pools 31/32
H33GRID: hNode Historical Peer-to-Peer Activity
AVGRDST: Hrs Interval Average Recall Mount Pending Distribution
DAYMRY: Daily Summary
MONMRY: Monthly Summary
COMPARE: Interval Cluster Comparison
HOURFLAT: 15-minute interval or one-hour interval flat file
DAYHSMRY: Daily flat file
Tip: Be sure that you have a copy of TS7700.VEHSTATS.Decoder.V10.pdf available when you familiarize yourself with the VEHSTATS reports.
Virtual Device Activity
Example 10-13 on page 753 shows the report for Virtual Device Activity. This report gives you an overview, per 15-minute interval, of the relevant time frame and shows the following information:
The minimum, average, or maximum (MIN, AVG, or MAX) mounted virtual drives
The number of channel blocks written based on blocksize
Clarification: The report is provided per cluster in the grid. The report title includes the cluster number in the DIST_LIB_ID field.
Example 10-13 VEHSTATS report for Virtual Drive Activity
     TIME INST MIN AVG MAX THRPUT <=2048 <=4096 <=8192 <=16384 <=32768 <=65536 >65536
4:15:00 256 114 124 127 MAX 630 0 0 0 2485298 0 0
4:30:00 256 117 125 127 MAX 631 0 0 0 2026062 0 0
4:45:00 256 113 124 127 MAX 530 0 0 0 2620099 0 0
5:00:00 256 117 125 127 MAX 474 0 0 0 3118714 0 0
The most important fields in this report are CHANNEL BLOCKS WRITTEN FOR BLOCKSIZES. In general, the largest number of blocks is written at a blocksize of 32768 or higher, but this is not a fixed rule. For example, DFSMShsm writes a 16384 blocksize and DB2 writes a 4096 blocksize. A detailed analysis of the effect of blocksize on I/O performance is outside the scope of this book.
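As an illustration of how the blocksize columns can be read, the following Python sketch parses the rows of Example 10-13 and reports which blocksize bucket received the most channel blocks in each interval. It assumes that the whitespace-separated values line up one-to-one with the column headings as printed in the example; the function names are hypothetical and are not part of VEHSTATS.

```python
HEADER = ("TIME INST MIN AVG MAX THRPUT "
          "<=2048 <=4096 <=8192 <=16384 <=32768 <=65536 >65536").split()
BUCKETS = HEADER[6:]  # the seven blocksize columns

# Rows copied from Example 10-13.
ROWS = [
    "4:15:00 256 114 124 127 MAX 630 0 0 0 2485298 0 0",
    "4:30:00 256 117 125 127 MAX 631 0 0 0 2026062 0 0",
    "4:45:00 256 113 124 127 MAX 530 0 0 0 2620099 0 0",
    "5:00:00 256 117 125 127 MAX 474 0 0 0 3118714 0 0",
]


def blocks_by_size(line):
    """Map each blocksize bucket to the number of blocks written."""
    fields = dict(zip(HEADER, line.split()))
    return {name: int(fields[name]) for name in BUCKETS}


def dominant_bucket(line):
    """Return the busiest bucket and its share of all blocks written."""
    counts = blocks_by_size(line)
    name = max(counts, key=counts.get)
    total = sum(counts.values())
    share = counts[name] / total if total else 0.0
    return name, share


for line in ROWS:
    name, share = dominant_bucket(line)
    print(line.split()[0], name, f"{share:.2%}")
```

In this sample, nearly all blocks fall into the <=32768 bucket in every interval, which matches the general observation above.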
vNode Host Adapter Activity
The next example report provides details about the vNode Host Adapter Activity. Although there is a large amount of information available (one report per distributed library per FICON adapter), the vNode Adapter Historical Activity Combined report is usually sufficient to provide an overall view of the FICON channel performance. As always, one report exists for each distributed library. This report is on an hourly basis with the following information:
Total throughput per distributed library every hour
Read and write channel activity
Read and write device activity with compression rate achieved
Example 10-14 shows a sample report for Adapter 3 of Cluster 0.
Example 10-14 Adapter 3 sample report
19JUL10MO PORT 0 MiB is 1024 based, MB is 1000 based PORT 1
   RECORD GBS MB/ ----CHANNEL-------------- ----------DEVICE--------- GBS MB/ ---------CHANNEL-------   ----------DEVICE-----
01:00:00 4 20 25827 7 49676 13 7741 3.33 19634 2.53 0 0   0 0 0 0 0 0
02:00:00 4 7 9204 2 18030 5 2100 4.38 6480 2.78 0 0   0 0 0 0 0 0
03:00:00 4 1 2248 0 4550 1 699 3.21 1154 3.94 0 0   0 0 0 0 0 0
04:00:00 4 0 0 0 69 0 0 24 2.87 0 0   0 0 0 0 0 0
05:00:00 4 0 1696 0 1655 0 550 3.08 540 3.06 0 0   0 0 0 0 0 0
06:00:00 4 9 8645 2 24001 6 3653 2.36 13589 1.76 0 0   0 0 0 0 0 0
07:00:00 4 4 6371 1 10227 2 2283 2.79 3503 2.91 0 0   0 0 0 0 0 0
08:00:00 4 2 5128 1 4950 1 2048 2.50 1985 2.49 0 0   0 0 0 0 0 0
09:00:00 4 3 6270 1 7272 2 2530 2.47 3406 2.13 0 0   0 0 0 0 0 0
The following fields are the most important fields in this report:
TOTAL_MB/S: This is the total throughput of the cluster. In a multicluster configuration, combine every distributed library’s hourly figures to calculate the total throughput, giving a better perspective of performance. For example, at the sixteenth time frame, you can see 253 MBps in both distributed libraries. This gives a total throughput of 506 MBps during this hour.
DEVICE_COMP (to the right of the WR-GB column): This is the real rate of compression achieved each hour. The compression ratio is highly dependent on the nature of the data.
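The figures in Example 10-14 appear consistent with two simple calculations, sketched below in Python. Both are inferred from the sample rows and are assumptions rather than documented VEHSTATS behavior: the compression ratio looks like channel MB divided by device MB, truncated to two decimals, and the total MB/s looks like the sum of the channel MB for the hour divided by 3600 seconds, truncated to a whole number. The function names are hypothetical.

```python
SECONDS_PER_INTERVAL = 3600  # the sample report is hourly


def compression_ratio(channel_mb, device_mb):
    """Channel MB divided by device MB, truncated to two decimals."""
    if device_mb == 0:
        return 0.0
    # Integer arithmetic avoids floating-point surprises when truncating.
    return (channel_mb * 100 // device_mb) / 100


def mb_per_second(total_mb, seconds=SECONDS_PER_INTERVAL):
    """Whole MB per second over the interval (truncated)."""
    return total_mb // seconds


# Values taken from the 01:00:00 row of Example 10-14 (port 0).
print(compression_ratio(25827, 7741))    # first channel/device pair
print(compression_ratio(49676, 19634))   # second channel/device pair
print(mb_per_second(25827 + 49676))      # combined channel throughput
```

Applied across a grid, the same per-hour throughput values can be added up for every distributed library to obtain the total for that hour.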
The host adapter activity is summarized per adapter and as a total of all adapters. This result is also shown in the vNode Adapter Throughput Distribution report shown in Example 10-15.
Example 10-15 Adapter Throughput Distribution report
1 - 50 477 64.4 64.4
51 - 100 191 25.8 90.2
101 - 150 52 7.0 97.2
151 - 200 17 2.2 99.5
201 - 250 1 0.1 99.7
251 - 300 2 0.2 100.0
This report summarizes the overall host throughput and shows how many one-hour intervals fell into each throughput range. For example, look at the second line of the report data:
The throughput was 51 - 100 MBps in 191 intervals.
191 intervals are 25.8% of the entire measurement period.
In 90.2% of the measurement intervals, the throughput was 100 MBps or lower.
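The percentage and cumulative percentage columns can be reproduced from the interval counts alone. The following Python sketch is an illustration, not the VEHSTATS implementation; the function name is hypothetical, and truncating to one decimal place (rather than rounding) is an assumption chosen because it matches the sample values in Example 10-15.

```python
def throughput_distribution(buckets):
    """Return (range, intervals, percent, cumulative percent) rows.

    Percentages are truncated to one decimal place, an assumption that
    matches the sample report.
    """
    total = sum(count for _, count in buckets)
    rows = []
    running = 0
    for label, count in buckets:
        running += count
        percent = (count * 1000 // total) / 10
        cumulative = (running * 1000 // total) / 10
        rows.append((label, count, percent, cumulative))
    return rows


# Throughput ranges (MBps) and interval counts from Example 10-15.
SAMPLE = [
    ("1 - 50", 477),
    ("51 - 100", 191),
    ("101 - 150", 52),
    ("151 - 200", 17),
    ("201 - 250", 1),
    ("251 - 300", 2),
]

for label, count, percent, cumulative in throughput_distribution(SAMPLE):
    print(f"{label:>9} {count:4d} {percent:5.1f} {cumulative:5.1f}")
```

The six counts add up to 740 one-hour intervals, so the second line works out to 191 / 740 = 25.8% and a cumulative 668 / 740 = 90.2%.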
Cache Partition Activity
This report provides details of Cache Partition Activity in the TS7700 Virtualization Engine. You can identify the following information for each 15-minute interval:
The percentage of read, write, and deferred copy throttling
The number of scratch (Fast Ready) mounts, cache hits, and cache misses
The capacity and number of logical volumes by preference group (0 or 1) in cache
The report also shows information about the Preference Groups.
The following fields are the most important fields in this report:
The ratio between FAST_RDY_MOUNTS, CACHE_HIT_MOUNTS, and CACHE_MISS_MOUNTS. In general, a high number of CACHE_MISSES might mean that additional cache capacity is needed or cache management policies need to be adjusted.
FAST_RDY_AVG_SECS and CACHE_HIT_AVG_SECS need to show only a few seconds. CACHE_MIS_AVG_SECS can list values higher than a few seconds, but higher values (more than two or three minutes) might indicate a lack of back-end physical tape drives. See “Physical Drive Activity” on page 754 for more information.
Physical Drive Activity
Another important report is the report for Physical Drive Activity, grouped by device type. See the information for TS1130 drives from a sample report in Example 10-16 on page 755. From there, you can identify the following items for each 15-minute interval taken for the report:
How many physical tape drives were installed
How many physical tape drives were available
How many drives (min/avg/max) were mounted
How much time (min/avg/max in seconds) the mount took
The number of physical mounts sorted by purpose:
 – STG: Recalls of logical volumes back into cache
 – MIG: Premigration of logical volumes from cache to physical tape
 – RCM: Reclamation
 – SDE: Secure Data Erase
Example 10-16 VEHSTATS for Physical Drives Activity
08JUL10TH -----------PHYSICAL_DRIVES_3592-E06-------------------
01:00:00 16 16 2 9 16 20 32 53 3 15 0 0 18
02:00:00 16 16 3 8 16 20 25 39 6 4 0 0 10
03:00:00 16 16 1 4 9 20 20 21 4 2 0 0 6
04:00:00 16 16 1 2 3 19 21 23 0 2 0 0   2
The following fields are the most important fields in this report:
PHYSICAL_DRIVE_MOUNTED_AVG: If this value is equal or close to the maximum drives available during several hours, this might mean that more physical tape drives are required.
MOUNT_FOR (RCL MIG RCM SDE): This field presents the reason for each physical mount. If the percentage value in the Recall (RCL) column is high compared to the total number of mounts, this might indicate a need to evaluate the cache size or cache management policies. However, this is not a fixed rule and further analysis is required. For example, if HSM migration is into a TS7740 Virtualization Engine, you might see high recall activity during the morning, which can be driven by temporary development or user activity. This is normal and not a problem in itself.
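The two checks described above can be sketched in Python against the rows of Example 10-16. The column names are assumptions, because the column heading line is not reproduced in the example: the values are taken to be, in order, drives installed, drives available, mounted drives (min/avg/max), mount time in seconds (min/avg/max), mounts for recall, premigration, reclamation, and Secure Data Erase, and the total number of mounts. The 90% threshold and the function names are also hypothetical.

```python
# Assumed column order; the heading line is not part of the example.
COLUMNS = [
    "time", "installed", "available",
    "mounted_min", "mounted_avg", "mounted_max",
    "secs_min", "secs_avg", "secs_max",
    "stg", "mig", "rcm", "sde", "total",
]

# Rows copied from Example 10-16.
ROWS = [
    "01:00:00 16 16 2 9 16 20 32 53 3 15 0 0 18",
    "02:00:00 16 16 3 8 16 20 25 39 6 4 0 0 10",
    "03:00:00 16 16 1 4 9 20 20 21 4 2 0 0 6",
    "04:00:00 16 16 1 2 3 19 21 23 0 2 0 0 2",
]


def parse_row(line):
    """Split one report line into a dictionary keyed by COLUMNS."""
    fields = line.split()
    record = {"time": fields[0]}
    for name, value in zip(COLUMNS[1:], fields[1:]):
        record[name] = int(value)
    return record


def recall_share(record):
    """Fraction of all physical mounts that were recalls."""
    return record["stg"] / record["total"] if record["total"] else 0.0


def drives_near_limit(record, threshold=0.9):
    """True if the average mounted drives approach the available drives."""
    return record["mounted_avg"] >= threshold * record["available"]


for line in ROWS:
    rec = parse_row(line)
    print(rec["time"], f"{recall_share(rec):.0%}", drives_near_limit(rec))
```

In this sample, the average number of mounted drives stays well below the 16 available drives in every interval, and the recall share varies between intervals, so a single hour is not enough to draw a conclusion.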
Common Scratch Pool
The report for Common Scratch Pool presented in Example 10-17 shows you the number of scratch tapes per physical media type. Review this report often to avoid a shortage of scratch stacked volumes.
Example 10-17 VEHSTATS report for Common Scratch Pool
RECORD   3590J  3590K  3592JA  3592JJ  NONE  NONE  3592JB  NONE
4:15:00      0      0      42       0     0     0       0     0
4:30:00      0      0      42       0     0     0       0     0
4:45:00      0      0      42       0     0     0       0     0
5:00:00      0      0      41       0     0     0       0     0
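A regular review of this report can be supported by a simple check, sketched below in Python with the values from Example 10-17. The threshold and the function names are hypothetical; choose a threshold that reflects how quickly your installation consumes scratch stacked volumes.

```python
# Media type headings from Example 10-17 (NONE marks unused columns).
MEDIA = ["3590J", "3590K", "3592JA", "3592JJ", "NONE", "NONE", "3592JB", "NONE"]

# Scratch counts per interval, copied from Example 10-17.
ROWS = [
    ("4:15:00", [0, 0, 42, 0, 0, 0, 0, 0]),
    ("4:30:00", [0, 0, 42, 0, 0, 0, 0, 0]),
    ("4:45:00", [0, 0, 42, 0, 0, 0, 0, 0]),
    ("5:00:00", [0, 0, 41, 0, 0, 0, 0, 0]),
]


def scratch_count(counts, media):
    """Number of scratch stacked volumes for one media type."""
    return counts[MEDIA.index(media)]


def intervals_below(rows, media, threshold):
    """Times at which the scratch count fell below the threshold."""
    return [time for time, counts in rows
            if scratch_count(counts, media) < threshold]


# Hypothetical threshold of 42, used only to show a hit in the sample data.
print(intervals_below(ROWS, "3592JA", 42))
```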
General Pool Use
The General Pool Use report is shown in Example 10-18. A single report always shows two pools. In this example, the report shows Pool 01 and Pool 02. You can see the following details per pool for each recorded time frame:
The number of active logical volumes
The amount of active data in GB
The amount of data written in MB
The amount of data read in MB
The current reclamation threshold and target pool
Example 10-18 VEHSTATS report for General Pool Use
4:15:00 65079 18052 5412 0 2 56 25 01 00 00 0 0 0 0 0 0 25 02 00 00
4:30:00 65079 18052 37888 0 2 56 25 01 00 00 0 0 0 0 0 0 25 02 00 00
4:45:00 65079 18052 83895 0 2 56 25 01 00 00 0 0 0 0 0 0 25 02 00 00
5:00:00 65630 18206 94721 0 2 57 25 01 00 00 0 0 0 0 0 0 25 02 00 00
5:15:00 65630 18206 98630 0 2 57 25 01 00 00 0 0 0 0 0 0 25 02 00 00
5:30:00 65630 18206 124490 0 2 57 25 01 00 00 0 0 0 0 0 0 25 02 00 00
5:45:00 65630 18206 119979 0 2 57 25 01 00 00 0 0 0 0 0 0 25 02 00 00
6:00:00 67069 18610 108854 0 2 57 25 01 00 00 0 0 0 0 0 0 25 02 00 00
6:15:00 67069 18610 108854 0 2 57 25 01 00 00 0 0 0 0 0 0 25 02 00 00
6:30:00 67069 18610 97126 0 2 57 25 01 00 00 0 0 0 0 0 0 25 02 00 00
Peer-to-Peer Activity
The Peer-to-Peer Activity report shown in Example 10-19 provides various performance metrics of grid activity. This report can be useful for installations working in Deferred copy mode. This report allows, for example, the analysis of subsystem performance during peak grid network activity, such as determining the maximum delay during the batch window.
For the duration of the report, you can identify, in 15-minute increments, the following items:
The number of logical volumes to be copied (valid only for a multicluster grid configuration)
The amount of data to be copied (in MB)
The average age of copy jobs on the deferred and immediate copy queue
The amount of data (in MB) to and from the TVC driven by copy activity
The amount of data (in MB) copied from other clusters (inbound data) to the cluster on which the report was executed
Tip: Analyzing the report shown in Example 10-19, you see three active clusters with write operations from a host. This might not be a common configuration, but it is an example of a scenario to show the possibility of having three copies of a logical volume in a multicluster grid.
Example 10-19 VEHSTATS report for Peer-to-Peer Activity
01:00:00 1 13 1 0 139077 38.6 43 1 346 61355 17.0 746 0.2 156 0.0
02:00:00 6 1518 7 0 150440 41.7 84 462 11410 64536 17.9 4448 1.2 1175 0.3
03:00:00 2 3239 3 0 88799 24.6 38 8 44 57164 15.8 1114 0.3 166 0.0
04:00:00 2 574 4 0 241205 67.0 4 82 29 109850 30.5 1409 0.3 401 0.1
05:00:00 3 1055 2 0 70637 19.6 9 390 136 51464 14.2 2488 0.6 0
06:00:00 16 9432 2 0 187776 52.1 33 1519 491 100580 27.9 2526 0.7 463 0.1
07:00:00 0 0 0 0 86624 24.0 19 63 12649 50139 13.9 6036 1.6 1988 0.5
08:00:00 1 484 0 0 46314 12.8 26 30 12292 23216 6.4 9563 2.6 1971 0.5
The following fields are the most important fields in this report:
MB_TO_COPY: The amount of data pending a copy function to other clusters (outbound).
MB_FR: The amount of data (MB) copied from the cluster (inbound data) identified in the column heading. The column heading 1-->2 indicates Cluster 1 is the copy source and Cluster 2 is the target.
CALC_MB/SEC: This number shows the true throughput achieved when replicating data between the clusters identified in the column heading.
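The sample rows in Example 10-19 suggest how the rate columns relate to the MB columns. Each rate matches the preceding MB value divided by the 3600 seconds of the hourly interval and truncated to one decimal place (for example, 139077 MB gives 38.6). The following Python sketch reproduces that arithmetic. The function name and the truncation rule are assumptions inferred from the sample values, not a documented VEHSTATS formula.

```python
def calc_mb_per_sec(mb_copied, interval_seconds=3600):
    """Return MB per second for one reporting interval, truncated to one decimal.

    Assumption: mb_copied is a whole number of MB, and the report truncates
    rather than rounds (1409 MB in one hour is shown as 0.3, not 0.4).
    """
    # Work in tenths with integer arithmetic so the truncation is exact.
    tenths = (mb_copied * 10) // interval_seconds
    return tenths / 10


# MB values taken from the 01:00:00 and 04:00:00 rows of Example 10-19.
for mb in (139077, 61355, 241205, 1409):
    print(mb, calc_mb_per_sec(mb))
```

For a 15-minute report, pass interval_seconds=900 instead of the default.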
Summary reports
In addition to daily and monthly summary reports per cluster, VEHSTATS also provides a side-by-side comparison of all clusters for the entire measurement interval. Examine this report for an overall view of the grid, and for significant or unexpected differences between the clusters.
10.11.4 VEHGRXCL tool overview
VEHGRXCL is a tool that can be downloaded from the IBM Tape Tools site and used as a graphical front end for the records provided by VEHSTATS. The VEHGRXCL.EXE file contains VEHSTATS_Model.ppt and VEHSTATS_Model.xls. You can use these files to create graphs of cluster activity based on the flat files created with VEHSTATS. Detailed instructions about how to include your data in the tool are given in the first worksheet of the VEHSTATS_Model.xls file that is created as part of the installation procedure.
The following steps describe the sequence of actions in general to produce the graphs of your grid environment:
1. Run the BVIRHSTV program to collect the TS7700 BVIR history data for the selected period (31 days is suggested). Then run the VEHSTATS program for the period to be analyzed (a maximum of 31 days is used).
2. Select one day during the analysis period to analyze in detail, and run the VEHSTATS hourly report for that day. You can import the hourly data for all days and then select the day later in the process. You also decide which cluster will be reported by importing the hourly data of that cluster.
3. File transfer the two space-separated files from VEHSTATS (one daily and one hourly) to your workstation.
4. Start Microsoft Excel and open this workbook, which must be in the directory C:\VEHSTATS.
5. Import the VEHSTATS daily file into the “Daily data” sheet, using a special parsing option.
6. Import the VEHSTATS hourly file into the “Hourly data” sheet, using a special parsing option. Copy 24 hours of data for your selected day and cluster and paste it into the top section of the “Hourly data” sheet.
7. Open the accompanying VEHSTATS_MODEL.PPT Microsoft PowerPoint presentation and ensure that automatic links are updated.
8. Save the presentation with a new name so as not to modify the original VEHSTATS_MODEL.PPT.
9. Verify that the PowerPoint presentation is correct, or make any corrections necessary.
10. Break the links between the workbook and the presentation.
11. Edit or modify the saved presentation to remove blank or unneeded charts. Save the presentation with the links broken.
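Before importing the space-separated files in steps 5 and 6, it can help to confirm that every data row has the same number of fields, because a short row shifts the columns in the worksheet. The following Python sketch is one way to do that check on the workstation. The function name is hypothetical, and the sketch assumes that only data rows (no report headings) are passed in.

```python
from collections import Counter


def rows_with_unexpected_width(lines):
    """Return (row_number, field_count) for rows whose width differs from the norm.

    The most common field count among the non-blank rows is treated as the
    expected width. Row numbers count non-blank and blank lines alike,
    starting at 1.
    """
    rows = [(number, line.split())
            for number, line in enumerate(lines, start=1)
            if line.strip()]
    if not rows:
        return []
    expected = Counter(len(fields) for _, fields in rows).most_common(1)[0][0]
    return [(number, len(fields)) for number, fields in rows
            if len(fields) != expected]


# Three rows copied from Example 10-19; the 05:00:00 row is one field short.
sample = [
    "04:00:00 2 574 4 0 241205 67.0 4 82 29 109850 30.5 1409 0.3 401 0.1",
    "05:00:00 3 1055 2 0 70637 19.6 9 390 136 51464 14.2 2488 0.6 0",
    "06:00:00 16 9432 2 0 187776 52.1 33 1519 491 100580 27.9 2526 0.7 463 0.1",
]
print(rows_with_unexpected_width(sample))
```

Any row that the check reports should be inspected in the flat file before the import is repeated.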
The following examples of PowerPoint slides give an impression of the type of information that is provided with the tool. You can easily update these slides and include them in your own capacity management reports.
Figure 10-48 gives an overview of all the sections included in the PowerPoint presentation.
Figure 10-48 Sample VEHGRXCL: Agenda
Figure 10-49 gives an overview of the reported period.
Figure 10-49 Sample VEHGRXCL: Overview
Figure 10-50 is an example of throughput, expressed in MBps.
Figure 10-50 Sample VEHGRXCL: Maximum and average throughput
Figure 10-51 is an example of physical mounts.
Figure 10-51 Sample VEHGRXCL: All physical mounts
10.12 z/OS commands for monitoring
In addition to the methods and options for monitoring the TS7700 Virtualization Engine that were introduced previously, the following z/OS commands offer further subsystem monitoring.
10.12.1 Display SMS command
Several DISPLAY SMS commands exist to display the OAM status, the composite and distributed libraries, and volume details.
Several of these commands (shown in bold) and their responses are listed in Example 10-20, separated by a dashed line.
Example 10-20 DISPLAY SMS command responses
CBR1100I OAM status: 770
2 2 1 0 1 0 138 138 136 27826
There are also 1 VTS distributed libraries defined.
CBR1110I OAM library status: 866
TLIB AL 3584-L22 10 10 8 6019 5319 316 Y Y
TVTS VCL 3957-V06 128 128 128 0 0 27510 Y Y
TVTSD VDL 3957-V06 0 0 0 6000 5700 0 N Y
IGD002I 00:20:45 DISPLAY SMS 944
***************************** LEGEND *****************************
CBR1110I OAM library status: 074
TVTS VCL 3957-V06 128 128 128 0 0 27510 Y Y
MEDIA1 100 0 0001
MEDIA2 27410 500 0002
Library supports import/export.
Library supports outboard policy management.
Library supports logical WORM.
CBR1110I OAM library status: 233
TLIB AL 3584-L22 10 10 9 6019 5319 316 Y Y
MEDIA5 312 20 0005
MEDIA9 4 0 0009
Convenience I/O station installed.
Convenience I/O station in Input mode.
Convenience I/O station Empty.
Bulk input/output not configured.
CBR1180I OAM tape volume status: 403
CREATION DATE: 2007-02-08 EXPIRATION DATE: 2011-09-26
09.18.16 d sms,vol(hyd210)
09.18.16 STC00098 CBR1180I OAM tape volume status: 870
Logical volume.
Volume is logical WORM.
For more information, see Chapter 9, “Operation” on page 413 and z/OS DFSMS Object Access Method Planning, Installation, and Storage Administration Guide for Tape Libraries, SC35-0427.
10.12.2 Library command
The LIBRARY command and the LIBRARY REQUEST command, also known as the Host Console Request function, can be used to check for missing virtual drives or for the status of the grid links. Example 10-21 shows the output of the LIBRARY DD command that you can use to verify whether all virtual drives are available.
Example 10-21 Sample response for LI DD,libname command
CBR1220I Tape drive status: 338
5F00 3490 ATVIGA ......N N Y N A NONE N
5F01 3490 ATVIGA ......N N Y N A NONE N
5F02 3490 ATVIGA ...... N N Y N A NONE N
5F03 3490 ATVIGA ,,,,,,N N Y N A NONE N
5F04 3490 ATVIGA ,,,,,,N N Y N A NONE N
5F05 3490 ATVIGA ......N N Y N A NONE N
5F06 3490 ATVIGA ......N N Y N A NONE N
5F07 3490 ATVIGA ......N N Y N A NONE N
5F08 3490 ATVIGA ......N N Y N A NONE N
5F09 3490 ATVIGA ......N N Y N A NONE N
5F0A 3490 ATVIGA ..... N N Y N A NONE N
5F0B 3490 ATVIGA ..... N N Y N A NONE N
5F0C 3490 ATVIGA ..... N N Y N A NONE N
5F0D 3490 ATVIGA ..... N N Y N A NONE N
5F0E 3490 ATVIGA ..... N N Y N A NONE N
5F0F 3490 ATVIGA ..... N N Y N A NONE N
5F10 3490 ATVIGA ..... N N Y N A NONE N
5F11 3490 ATVIGA ..... N N Y N A NONE N
5F12 3490 ATVIGA ..... N N Y N A NONE N
5F13 3490 ATVIGA ..... N N Y N A NONE N
 5FFA 3490 ATVIGA ..... N N Y N A NONE N
 5FFB 3490 ATVIGA .....  N N Y N A NONE N
 5FFC 3490 ATVIGA ..... N N Y N A NONE N
 5FFD 3490 ATVIGA ..... N N Y N A NONE N
 5FFE 3490 ATVIGA ..... N N Y N A NONE N
 5FFF 3490 ATVIGA ,,,,, N N Y N A NONE N
For more information about the LIBRARY command, see Chapter 9, “Operation” on page 413 and z/OS DFSMS Object Access Method Planning, Installation, and Storage Administration Guide for Tape Libraries, SC35-0427.
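The check for missing virtual drives can be automated once the response to the LI DD,libname command has been captured as text. The following Python sketch lists the device numbers in an expected range that do not appear in the response. The function names are hypothetical, and the sketch assumes only that each drive line starts with a four-digit hexadecimal device number followed by at least two more fields, as in Example 10-21. It does not interpret the status columns.

```python
import re

# A drive line starts with a 4-digit hex device number and at least two more
# fields. Message lines such as "CBR1220I ..." do not match, because "CBR1"
# is not a hexadecimal number.
DEVICE_LINE = re.compile(r"^\s*([0-9A-Fa-f]{4})\s+\S+\s+\S+")


def reported_devices(response_text):
    """Return the set of device numbers (as integers) found in the response."""
    devices = set()
    for line in response_text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            devices.add(int(match.group(1), 16))
    return devices


def missing_devices(response_text, first, last):
    """Return the device numbers in first..last that the response does not list."""
    seen = reported_devices(response_text)
    return [format(device, "04X")
            for device in range(first, last + 1)
            if device not in seen]


# Shortened, illustrative response in which device 5F02 is absent.
sample = """CBR1220I Tape drive status: 338
5F00 3490 ATVIGA N N Y N A NONE N
5F01 3490 ATVIGA N N Y N A NONE N
5F03 3490 ATVIGA N N Y N A NONE N"""
print(missing_devices(sample, 0x5F00, 0x5F03))
```

In practice, pass the full device range that is defined for the composite library and report any device number that the function returns.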
10.13 What to look for and where
This chapter describes tools, provides considerations, and gives you important information to help you monitor and understand the performance indicators of your TS7700 grid. Table 10-8 summarizes where you can find information and what observations you can make.
The checks that you make in each shift are intended to ensure that the TS7700 environment is operating as expected. The checks that are made daily or weekly are intended for tuning and longer-term trend analysis.
The information in this table is intended as a basis for monitoring. You can tailor this information to best fit your needs.
Table 10-8 Monitoring summary
Information | Source | Tracking | Reporting interval | Observation
All virtual drives online | LI DD,libname | Display each composite library and each system | Each shift | Report or act on any missing drive
Virtualization Engine health check | TS7700 MI | Display each composite library | Each shift | Report any offline or degraded status
Library online and operational |  | Display each composite library and each system | Each shift | Verify availability to systems
Exits enabled |  | Display each system | Each shift | Report any disabled exits
Virtual scratch volumes |  | Display each composite library | Each shift | Report each shift
Physical scratch tapes |  | Display each composite library | Each shift | Report each shift
 |  | Display each composite library | Each shift | Report or act on any interventions
Grid link status |  | Display each composite library | Each shift | Report any errors or elevated Retransmit%
Number of volumes on the deferred copy queue | TS7700 MI > Logical Volumes > Incoming Copy Queue | Display for each cluster in the grid | Each shift | Report and watch for gradual or sudden increases
Copy queue depths | TS7700 MI | Display for each system | Each shift | Report if queue depth is higher than usual
Virtual mounts per day |  | Rolling weekly trend |  | Increase over time indicates increased workload
MB transferred per day |  | Rolling weekly trend |  | Increase over time indicates increased workload
Virtual volumes managed |  | Rolling weekly trend |  | Capacity planning: maximum one million per grid
MB stored |  | Rolling weekly trend |  | Capacity planning and general awareness
Back-end drive utilization |  | Rolling weekly trend |  | Check for periods of 100%
Daily throttle indicators |  | Rolling weekly trend |  | Key performance indicator
Average virtual mount time |  | Rolling weekly trend |  | Key performance indicator
Cache hit percentage |  | Rolling weekly trend |  | Key performance indicator
scratch count |  | Rolling weekly trend |  | Capacity planning and general awareness
Available slot count |  | Rolling weekly trend |  | Capacity planning and general awareness
Available virtual scratch volumes |  | Rolling weekly trend |  | Drive insert
Data distribution |  | Watch for healthy distribution |  | Use for reclaim tuning
Times in cache |  | Watch for healthy distribution |  | Preference group tuning indicator
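Several entries in Table 10-8 are tracked as a rolling weekly trend. The following Python sketch shows one way to compute such a trend as a trailing seven-day mean over daily values, for example the daily virtual mount counts. The function name and the sample numbers are hypothetical, and a trailing mean is only one possible definition of a rolling trend.

```python
def rolling_weekly_mean(daily_values, window=7):
    """Return the trailing mean for every day that has a full window of data."""
    means = []
    running_total = 0
    for index, value in enumerate(daily_values):
        running_total += value
        if index >= window:
            # Drop the value that has just left the window.
            running_total -= daily_values[index - window]
        if index >= window - 1:
            means.append(running_total / window)
    return means


# Hypothetical daily virtual mount counts for eight consecutive days.
mounts = [1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700]
print(rolling_weekly_mean(mounts))
```

A steady rise in the returned values over several weeks corresponds to the "increase over time" observation in the table.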