Connectivity
This chapter describes how to set up the connectivity to the remote site. It also gives an overview of the required components and how to set up the different options and functions.
This chapter includes the following sections:
3.1, “Inter-site connectivity”
3.2, “Fibre Channel to IP conversion”
3.3, “Bandwidth estimation”
3.4, “Long-distance link considerations”
3.5, “DS8000 configuration considerations”
3.1 Inter-site connectivity
When Global Mirror is used, there are typically two data center sites, separated by a considerable distance, connected with limited bandwidth, or both. The data connection between the sites is usually implemented with high-speed WAN connections provided by a third-party network provider. The providers typically offer IP-based links with dedicated bandwidth. Today, these links are often based on Synchronous Optical Network (SONET) and Synchronous Digital Hierarchy (SDH) connections. Depending on the global region, other standards might be available.
The available bandwidths are defined by the Optical Carrier transmission rate standard. They are named by the acronym OC-n, where n is an integer that represents the bit stream transmission rate. Table 3-1 gives an overview of the available transmission rates.
Table 3-1 SONET / SDH transmission rates

OC level   Data rate         SONET / SDH
OC-1       51.84 Mbps        STS-1 / STM-0
OC-3       155.52 Mbps       STS-3 / STM-1
OC-12      622.08 Mbps       STS-12 / STM-4
OC-24      1,244.16 Mbps     STS-24
OC-48      2,488.32 Mbps     STS-48 / STM-16
OC-192     9,953.28 Mbps     STS-192 / STM-64
OC-768     39,813.12 Mbps    STS-768 / STM-256
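Because each OC-n rate is n times the OC-1 base rate of 51.84 Mbps, the table can be reproduced with a short calculation. The following Python sketch illustrates this relationship (the function name is illustrative):

```python
# OC-n line rates are integer multiples of the OC-1 base rate (51.84 Mbps).
OC1_MBPS = 51.84

def oc_rate_mbps(n: int) -> float:
    """Return the nominal line rate of an OC-n link in Mbps."""
    return n * OC1_MBPS

for level in (1, 3, 12, 24, 48, 192, 768):
    print(f"OC-{level}: {oc_rate_mbps(level):,.2f} Mbps")
```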
These connections are typically IP-based and are delivered to a data center at a so-called access point. The provider installs a communication device in the data center that provides IP ports at one of these data rates.
3.2 Fibre Channel to IP conversion
The DS8880 provides only Fibre Channel-based I/O ports for PPRC connections. Therefore, a gateway that converts Fibre Channel to IP must be implemented in each data center that hosts a DS8000. Figure 3-1 on page 35 shows a possible setup for a Fibre Channel to IP conversion.
In this setup, the Fibre Channel connections are connected directly to the FC/IP gateway. However, it is possible to use Fibre Channel switches in between. For more information about such a configuration, see “Fibre Channel flow control” on page 39.
To provide sufficient redundancy for the inter-site communication, at least two Fibre Channel connections should be provided to connect the FC/IP gateway. Further Fibre Channel connections can be added to provide sufficient bandwidth. To meet the redundancy requirement, select an even number of connections. The most common environments use two, four, six, or eight Fibre Channel connections at each site.
On the IP-side of the gateway, the number of connections must be sufficient to at least match the total bandwidth on the Fibre Channel side of the gateway. For example, with four 8 Gbps FC connections that amount to a total bandwidth of 32 Gbps, four 10 Gbps IP links must be connected to the access point of the service provider. To realize a 32 Gbps link between the two sites, four OC-192 links are required.
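The sizing rule in this example, aggregate IP bandwidth at least matching aggregate Fibre Channel bandwidth, can be sketched as a short calculation. This is a rough estimate that ignores protocol overhead, and the function name is illustrative:

```python
import math

def oc_links_needed(fc_links: int, fc_gbps: float, oc_level: int) -> int:
    """Number of OC-n circuits whose aggregate bandwidth covers the FC side.
    Ignores protocol overhead; a sizing sketch, not a vendor rule."""
    fc_total_gbps = fc_links * fc_gbps
    oc_gbps = oc_level * 51.84 / 1000.0  # OC-1 base rate is 51.84 Mbps
    return math.ceil(fc_total_gbps / oc_gbps)

# Example from the text: four 8 Gbps FC links require four OC-192 circuits.
print(oc_links_needed(4, 8.0, 192))
```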
For additional information about bandwidth requirements, see 3.3, “Bandwidth estimation”.
Figure 3-1 Converting Fibre Channel to IP traffic
3.3 Bandwidth estimation
Estimating the bandwidth requirement for asynchronous replication requires a little more attention than for synchronous replication. For synchronous replication, it is easy to determine the bandwidth with these steps:
1. Get a representative measurement of the write performance of the primary storage system.
2. Take the peak load.
3. Size the bandwidth of the replication link according to this peak.
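For synchronous replication, these three steps reduce to taking the peak of the measured write load. A minimal sketch, where the sample values and the safety margin are assumptions, not measured or specified values:

```python
def size_sync_link_mbps(write_samples_mbps, headroom=1.2):
    """Size a synchronous replication link from the peak measured write load.
    The 20% headroom is an assumed safety margin, not a specified value."""
    return max(write_samples_mbps) * headroom

samples = [120, 310, 95, 280, 340]   # hypothetical write rates in Mbps
print(size_sync_link_mbps(samples))  # peak of 340 Mbps plus headroom
```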
For asynchronous replication, another important parameter comes into play, as introduced in 1.1, “Global Mirror overview” on page 2: the recovery point objective (RPO). The reason is simple: With asynchronous replication, the data is transmitted to the remote site block by block, without any regard for the write order from the host. In other words, the data is inconsistent for some time, and a portion of data would be lost in a site disaster. This portion is measured by the RPO, which is given in units of time, such as minutes or hours.
It is important that you understand this characteristic of asynchronous replication and that you specify how much data you are willing to lose in a disaster. This decision might take some discussion because most companies do not want to lose any data. The correct approach is to understand that minimizing the data loss requires either more bandwidth, which costs money, or a shorter distance to the secondary site, which reduces signal latency.
This discussion produces a number of minutes or hours for the RPO. This value is the main input parameter for the bandwidth estimation. The required bandwidth calculation can now be based on the following parameters:
The write load of all volumes that will participate in the data replication.
The latency of the physical link between the data centers. This value can be requested from the link provider.
The number of participating DS8000 volumes. As you can see in Figure 1-1 on page 3, one of the essential components of Global Mirror is the FlashCopy at the secondary site. This FlashCopy is a major component for Global Mirror to provide consistency at the remote site. FlashCopy operations are time consuming and thus must be taken into consideration.
Expected compression rate. The FC/IP gateways offer hardware and software data compression. See 3.4.3, “Compression” on page 41.
One way to calculate the required bandwidth is to use DiskMagic. However, DiskMagic does not take a specific RPO into account.
The better approach is to contact your IBM representative. IBM can do bandwidth studies by using its own Global Mirror RPO and Bandwidth Calculator. The main input for this tool is a set of write performance data that can be provided by using IBM Spectrum Control data or RMF data. The result consists of two corresponding graphs that show the expected RPO for a particular bandwidth. The bandwidth can be changed interactively so that the required RPO can be adjusted.
For example, as illustrated in Figure 3-2, a calculation of the RPO was done based on a rough assumption of 20 Mbps. The first graph shows the following data:
The black line represents the data transfer pipe.
The green line shows the data to be sent to the secondary site.
The blue line shows the emerging write rate at the primary storage system.
The red line shows the accumulated writes that could not be sent to the secondary site because the bandwidth was not sufficient.
The second graph shows the RPO profile. In every interval where data falls behind because more is written than the link can transmit, the RPO ramps up until all accumulated writes can be transmitted to the secondary site.
In conclusion, with this write profile and a bandwidth of only 20 Mbps, the RPO can rise up to 9602 seconds and the maximum backlog is 13.01 GB.
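The logic behind these two graphs can be sketched as a simple interval simulation. The write profile, link bandwidth, and sampling interval below are hypothetical values, and the model is much simpler than the actual Global Mirror RPO and Bandwidth Calculator:

```python
def simulate_backlog(write_mbps, link_mbps, interval_s):
    """Per interval, accumulate writes that exceed the link capacity (the
    backlog in MB) and express the RPO as the time to drain that backlog."""
    backlog_mb, rpo_s = 0.0, []
    for rate in write_mbps:
        backlog_mb = max(backlog_mb + (rate - link_mbps) * interval_s / 8, 0.0)
        rpo_s.append(backlog_mb * 8 / link_mbps)
    return rpo_s

# Hypothetical 10-minute samples of the primary write rate, 20 Mbps link:
profile = [10, 40, 30, 5, 5]
print(simulate_backlog(profile, link_mbps=20, interval_s=600))
# -> [0.0, 600.0, 900.0, 450.0, 0.0]
```

The RPO ramps up while the write rate exceeds the link bandwidth and falls back to zero once the backlog has drained, which mirrors the shape of the second graph.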
Figure 3-2 Bandwidth and RPO estimation with 20 Mbps
In Figure 3-3, the bandwidth was adjusted in such a way that the RPO will not be higher than the data sampling rate of the measurement. The RPO will be close to 10 minutes, or 600 seconds. The data backlog will be at roughly 4 GB. All of these goals can be achieved with a bandwidth of 47 Mbps.
Figure 3-3 Adjusted Bandwidth estimation with optimized RPO achievement
As you can see, the Bandwidth Calculator is a great tool to facilitate the discussion of RPO and bandwidth.
3.4 Long-distance link considerations
When data is transmitted over longer distances, it might encounter changes on its way to the target site, such as sections with different bandwidths or different transmission protocols. These circumstances influence attributes like throughput and response time, which are observed by the participating DS8000 disk systems. They can even influence host I/O if the data flow coordination or the sizing of the different link components is not done properly.
This section provides an overview of the most common attributes that require attention.
3.4.1 Fibre Channel flow control
A simple inter-site connection is shown in Figure 3-1 on page 35. However, additional Fibre Channel switches might be needed between the DS8000 and the FCIP gateway. If so, the resulting inter-switch links (ISLs) need special attention.
First, the total bandwidth of the ISLs must be sufficient. For example, if four links from the DS8000 are in use for the replication, the ISLs should also consist of four links of the same speed. If the ISL ports run at different speeds, enough ISLs must be supplied so that their aggregate bandwidth matches the aggregate speed of the DS8000 links.
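This matching rule can be expressed as a small calculation. It is a sketch only; a real design must also weigh redundancy requirements and vendor guidance:

```python
import math

def isls_needed(repl_links: int, repl_gbps: float, isl_gbps: float) -> int:
    """ISL count so that the aggregate ISL bandwidth matches the aggregate
    bandwidth of the DS8000 replication links."""
    return math.ceil(repl_links * repl_gbps / isl_gbps)

print(isls_needed(4, 8.0, 8.0))   # same port speed: 4 ISLs
print(isls_needed(4, 8.0, 16.0))  # faster ISL ports: 2 ISLs suffice
```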
Especially when multiple Fibre Channel hops are in place, it becomes important to take a closer look at Fibre Channel flow control. The basic unit of data transmission is a frame. The Fibre Channel protocol includes a flow control mechanism named buffer-to-buffer credits that allows maximum utilization of the ISLs, and therefore a contiguous data flow from source to target.
Buffer credits are portions of memory in the target communication device in which the received frames are held until they are processed. If frames are arriving while the receiver is busy processing earlier frames and all buffer credits are used up, the transmitter will not be able to send data anymore. The data flow is then disrupted.
To calculate the correct buffer credits, consider the following items:
The distance to the remote site and the speed of the link
The longer the distance, the more frames can be sent to fill up the link. For example, on a 10 km link with a bandwidth of 1 Gbps, roughly one single full Fibre Channel frame fits on the link. With 2 Gbps, it is two full frames, with 4 Gbps four frames, and so on.
The average size of the Fibre Channel frames
A Fibre Channel frame is a maximum of 2148 bytes long, of which up to 2112 bytes can be payload. With data replication, the maximum payload is typically used. However, verify what the actual average frame size is.
The round-trip time
Although the distance between the two replication sites is fixed and the expected latency can be calculated, the total round-trip time should be the foundation for the buffer-to-buffer credit calculation. The total round-trip time can be measured by using the FCIP gateway.
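A first-order estimate of the required buffer-to-buffer credits follows from these three items. The sketch below assumes roughly 5 microseconds per kilometer of propagation delay in fiber and full-size frames; it is an illustration of the relationship, not a vendor sizing rule:

```python
import math

def bb_credits(round_trip_km: float, gbps: float, frame_bytes: int = 2148) -> int:
    """Credits needed to keep the link full: the number of frames in flight
    during one round trip. Assumes ~5 us/km propagation delay in fiber and
    the nominal data rate; a rough estimate only."""
    rtt_s = round_trip_km * 5e-6
    frame_time_s = frame_bytes * 8 / (gbps * 1e9)
    return math.ceil(rtt_s / frame_time_s)

# 10 km one way (20 km round trip) on a 4 Gbps link:
print(bb_credits(20, 4.0))
```

Longer distances, higher link speeds, and smaller average frames all increase the credit requirement, which matches the three considerations listed above.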
3.4.2 Configuring the FCIP gateway
The following are considerations when configuring the FCIP gateway:
Tunneling
The FCIP gateway allows you to transmit Fibre Channel frames transparently over IP networks. This result is achieved by building one or more tunnels between the FCIP gateways. At each end of a tunnel, a Fibre Channel port is presented to the participating Fibre Channel devices.
Depending on the gateway vendor, multiple IP links or IP circuits can be trunked. This configuration enables optimized bandwidth utilization and management of link redundancy. For more information, see the documentation of the implemented FCIP gateway vendor.
Max/min bandwidth
In IP networks, the traffic load is controlled by an algorithm that avoids overcommitment of the network, which would lead to packet losses and performance problems. The algorithm works as follows: When a batch of IP data must be delivered, a segment window of data is defined and sent to the remote site. The system then waits for an acknowledgment. If the acknowledgment is received quickly, the segment window is doubled until a defined segment window threshold is reached. If the traffic can still be handled, the segment window is then increased in a linear manner.
When the ceiling of the available bandwidth is reached, depending on the implemented algorithm, the segment window is either set back to one or reduced to 50%, and the algorithm starts over.
This process sometimes leads to saturation of the links, displaying a sawtooth shape instead of a flat line close to the ceiling of the bandwidth. In this case, the overall throughput of the link does not reach the capacity of the link as ordered from the provider. With FCIP routers, this effect can be reduced by supplying a threshold value for the expected maximum bandwidth and a minimum bandwidth. The FCIP routers then use their own congestion algorithm that helps keep the bandwidth deviation to a minimum.
For more information, see the user documentation of your FCIP routers. Monitor the bandwidth behavior and, if necessary, adjust both values.
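The window behavior described above can be illustrated with a short simulation. This is a simplified model of slow start and congestion avoidance, not any particular vendor's congestion algorithm:

```python
def congestion_window_trace(threshold: int, ceiling: int, steps: int):
    """Window doubles up to the threshold (slow start), then grows linearly
    (congestion avoidance), and is halved when the bandwidth ceiling is hit,
    which produces the sawtooth shape described in the text."""
    window, trace = 1, []
    for _ in range(steps):
        trace.append(window)
        if window >= ceiling:
            window = max(window // 2, 1)  # back off after congestion
        elif window < threshold:
            window *= 2                   # slow start: exponential growth
        else:
            window += 1                   # congestion avoidance: linear growth
    return trace

print(congestion_window_trace(threshold=8, ceiling=12, steps=10))
# -> [1, 2, 4, 8, 9, 10, 11, 12, 6, 12]
```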
Keepalive timeout value
As mentioned, the FCIP gateways use a tunnel with typically two or four IP links or IP circuits. The tunnel itself is stateless and thus cannot be monitored, except for whether the connection is still available or not. But the underlying IP links can be monitored by using keepalive messages that are returned from the remote gateway. If one link is not responding, the sender waits for a timeout period before the link is brought down. The embedded routing protocol of the gateway then uses the next IP link. If this link has the same problem, the sender again waits for the response until the keepalive timeout expires.
If the keepalive timeout values are larger than the timeout value of the PPRC links, the DS8000 marks the link that was most recently used to send data as degraded. This process occurs even if other IP links are still available. To avoid this situation, the sum of the keepalive timeout values should be less than the PPRC timeout value for a link, which is typically 6 seconds.
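This budget can be checked with a quick calculation, assuming the worst case where every IP link in the tunnel times out in sequence (the function and values are illustrative):

```python
def keepalive_budget_ok(ip_links: int, keepalive_timeout_s: float,
                        pprc_timeout_s: float = 6.0) -> bool:
    """True if the summed keepalive timeouts across all IP links stay below
    the PPRC link timeout (typically 6 seconds), assuming the worst case
    where every link times out in sequence."""
    return ip_links * keepalive_timeout_s < pprc_timeout_s

print(keepalive_budget_ok(4, 1.0))  # 4 s total, within budget
print(keepalive_budget_ok(4, 2.0))  # 8 s total, exceeds the 6 s budget
```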
SCSI Fast Write or SCSI Write Acceleration
In the standard SCSI model, each SCSI write command is processed in two phases. A command request is first sent from the initiator to the SCSI target. The target sends a response back to the initiator that indicates whether the target is ready or not. When it is ready, in a second phase the write command, including the data, is sent to the target.
With SCSI Fast Write or SCSI Write Acceleration, the sender gateway sends the command request and the data to the receiver in one operation. The receiver then sends back only one acknowledgment to the sender. The aim of this function is to reduce protocol interaction and save transmission time.
Newer DS8000 code levels tolerate SCSI write acceleration, but the DS8000 does not gain any advantage from it. Therefore, in this case disable SCSI write acceleration.
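The protocol saving can be expressed as a round-trip count: two round trips per write in the standard two-phase model versus one with write acceleration. A simplified latency model that ignores transmission and processing time:

```python
def write_protocol_latency_ms(rtt_ms: float, fast_write: bool) -> float:
    """Protocol latency per SCSI write: two round trips in the standard
    two-phase model, one with SCSI Fast Write / Write Acceleration.
    Transmission and processing time are ignored in this sketch."""
    return rtt_ms * (1 if fast_write else 2)

print(write_protocol_latency_ms(10.0, fast_write=False))  # -> 20.0
print(write_protocol_latency_ms(10.0, fast_write=True))   # -> 10.0
```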
3.4.3 Compression
Compression is a method to reduce the amount of data before transmission. With compression, the data stream is analyzed for redundant patterns in a specific window of the data stream. When patterns are found, they are replaced by a shorter representation.
This function requires some processing resources on both the sender and receiver gateways. Different implementations are available. In general, a hardware-based implementation allows a compression ratio between 1:1.5 and 1:3, and software-based implementations can achieve higher compression ratios. The hardware-based solutions are faster, while the software-based solutions consume some processing time, which adds to the link latency.
Remember that the compression ratio is not a fixed value. It can vary depending on the data that is transmitted. In the diagrams for the bandwidth estimation in Figure 3-3 on page 38, you can see that a compression ratio of 2 has been assumed. In customer implementations, this value has commonly been achieved.
 
Tip: When sizing the links between both sites, do not assume a compression ratio that is too optimistic. The effect of compression can then provide a buffer of bandwidth capacity.
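The intent of the tip can be sketched as follows: size the link as if no compression occurs, and treat any real compression gain as spare capacity (the values are illustrative):

```python
def effective_bandwidth_mbps(link_mbps: float, compression_ratio: float = 1.0) -> float:
    """Effective throughput of a link for a given compression ratio.
    Sizing with ratio 1.0 keeps any real compression gain as spare capacity."""
    return link_mbps * compression_ratio

print(effective_bandwidth_mbps(47.0))       # sized without compression
print(effective_bandwidth_mbps(47.0, 2.0))  # observed 2:1 ratio doubles throughput
```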
3.5 DS8000 configuration considerations
The following list summarizes some considerations about the host adapter setting and configuration:
Always isolate host connections from Remote Copy connections (MM, GM, z/GM, GC, and MGM) on a host adapter basis. Isolate CKD host connections from FB host connections on a host adapter basis.
Always have symmetric paths by connection type (that is, use the same number of paths on all host adapters that are used by each connection type). For z/OS, all path groups should be symmetric (that is, have a uniform number of ports per HA), and spread path groups as widely as possible across all CKD HAs.
When possible, isolate asynchronous from synchronous copy connections on a host adapter basis.
When possible, use the same number of host adapter connections (especially for z Systems) as the number of connections that come from the hosts.
Size the number of host adapters needed based on expected aggregate maximum bandwidth and maximum IOPS (use Disk Magic or other common sizing methods that are based on actual or expected workload).
For optimal performance with 2 Gb and 4 Gb HAs, avoid using adjacent ports by using the information in Table 3-2. For 8 Gb HAs, the port order is less important, except that for 8-port cards, it is preferable to use ports 1 - 4 before ports 5 - 8.
 