Business continuity scenarios
In this chapter, we describe several business continuity scenarios for IBM i with Storwize HyperSwap.
5.1 Business continuity scenarios for IBM i full system with HyperSwap
In this section, we describe business continuity solutions with Storwize HyperSwap and full system IBM i.
5.1.1 Outage of the Storwize I/O group on Site 1
To trigger the outage of the I/O group on Site 1, we place both Storwize Node 1 and Node 2 into service state.
When the outage occurs, the I/O rate automatically transfers to Node 3 and Node 4 at Site 2. The IBM i workload keeps running, and there are no relevant messages in the IBM i message queues.
The IBM i LUNs now have the following path status:
8 failed paths: The paths from Node 1 and Node 2.
4 active paths and 4 passive paths: The paths from Node 3 and Node 4. Active paths are the paths through the preferred node of a LUN, and the passive paths are through the non-preferred node of the LUN.
To end the outage of the I/O group at Site 1, we take both nodes at Site 1 out of service state. The IBM i I/O rate automatically transfers back to Node 1 and Node 2, the IBM i workload keeps running, and there are no relevant messages in the IBM i message queues. After failback, the status of the IBM i paths is the same as it was initially: 4 active and 16 passive paths.
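The canister status can also be checked from the Storwize CLI during the test. The following is a minimal sketch, assuming an SSH session to the cluster management IP; the canister names are examples and might differ in your configuration:

   lsnodecanister
      (lists all node canisters with their status, I/O group, and site;
      while the outage is simulated, the Site 1 canisters are expected to
      report a service status instead of online)
   lsnodecanister node1
      (detailed view of a single canister, here the example name node1)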
The I/O rate on the nodes during the outage is shown in the IBM Spectrum Control™ graph in Figure 5-1.
Figure 5-1 Test: I/O group on Site 1 fails
The IBM i paths during the outage are shown in Figure 5-2. Each DMPxxx resource represents one path to a LUN. Note that different DMPxxx resources are active than before the outage, which means that the I/O now uses different paths.
Figure 5-2 IBM i paths during the outage of the I/O group on Site 1
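The DMPxxx resources can also be displayed from the IBM i command line. A minimal sketch, assuming a 5250 session on the partition; the detailed path status of each disk unit (active, passive, or failed) is available in System Service Tools under the disk path status displays:

   WRKHDWRSC TYPE(*STG)
      (Work with Hardware Resources for storage; the multipath DMPxxx
      resources typically appear among the listed storage resources)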
5.1.2 Outage of Power hardware on Site 1
In the Power HMC, we power down the IBM i partition on Site 1 to simulate the failure of the Power hardware on Site 1.
Failover to Site 2
Perform the following steps to fail over the IBM i workload to Site 2:
1. In the HyperSwap cluster, unmap the HyperSwap LUNs from the host of IBM i on Site 1 and map them to the host of IBM i on Site 2, as shown in Figure 5-3 and Figure 5-4 on page 37. A command-line alternative is sketched after these steps.
Figure 5-3 Power HW on Site 1 fails: unmap the LUNs
Figure 5-4 Power HW on Site 1 fails: remap the LUNs
2. In the Power HMC, activate the IBM i partition at Site 2 and perform an IPL.
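The same failover can be sketched on the command line. This is only an outline, assuming SSH access to the Storwize cluster and to the HMC; the host, volume, managed system, partition, and profile names are examples and must be replaced with the names in your configuration:

   (Storwize CLI: move the LUN mappings from the Site 1 host to the Site 2 host)
   lshostvdiskmap IBM_i_site1
   rmvdiskhostmap -host IBM_i_site1 IBM_i_LUN1      (repeat for each LUN)
   mkvdiskhostmap -host IBM_i_site2 IBM_i_LUN1      (repeat for each LUN)

   (HMC CLI: activate the IBM i partition at Site 2 so that it IPLs)
   chsysstate -m Power_site2 -r lpar -n IBM_i_LPAR2 -o on -f default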
The LUNs in IBM i on Site 2 are shown in Figure 5-5. By the LUN serial numbers, you can see that they are the same LUNs that were used by IBM i on Site 1.
Figure 5-5 IBM i LUNs on Site 2
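One way to cross-check that these are the same volumes is to list them on the Storwize CLI and compare the volume unique identifiers with the serial numbers that IBM i reports. A sketch with an example volume name:

   lsvdisk IBM_i_LUN1
      (the detailed view includes the vdisk_UID field, which uniquely
      identifies the volume and can be compared with the serial number
      that IBM i shows for the corresponding disk unit)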
The LUNs in IBM i on Site 2 have 4 active paths and 12 passive paths, as shown in Figure 5-6.
Figure 5-6 Paths to IBM i LUNs on Site 2
Failback to Site 1 after the outage
Perform the following steps to fail back to Site 1 after the outage ends:
1. End the jobs in IBM i on Site 2 and power down IBM i from the command line (see the sketch after these steps).
2. After IBM i is powered down, unmap the IBM i LUNs from the host of Site 2 in the Storwize V7000 and map them to the host of Site 1, as shown in Figure 5-7 and Figure 5-8 on page 39.
Figure 5-7 Failback: unmap LUNs in Storwize
Figure 5-8 Failback: remap LUNs in Storwize
3. IPL IBM i on Site 1.
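Step 1 and step 2 can also be done from the command line. A minimal sketch, with the same example names as in the failover; the remap simply reverses the failover mapping:

   (IBM i CL: power down the partition; *IMMED can be used instead of a
   controlled end if needed)
   PWRDWNSYS OPTION(*CNTRLD) DELAY(600) RESTART(*NO)

   (Storwize CLI: map the LUNs back to the Site 1 host, repeated per LUN)
   rmvdiskhostmap -host IBM_i_site2 IBM_i_LUN1
   mkvdiskhostmap -host IBM_i_site1 IBM_i_LUN1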
The I/O rate on the nodes, as captured by IBM Spectrum Control, is shown in the graph in Figure 5-9.
The I/O rate initially runs on both nodes on Site 1: Node 1 and Node 2. After the workload is started in IBM i on Site 2, HyperSwap transfers the I/O to Node 3 and Node 4 on Site 2.
After we fail back to Site 1, HyperSwap transfers the I/O rate back to the nodes on Site 1.
Figure 5-9 IBM Spectrum Control graph of I/O rate
5.1.3 Disaster at Site 1
To simulate a disaster at Site 1, we trigger an outage of both the Power hardware and Storwize Node 1 and Node 2. To do so, we use the Power HMC to power down the IBM i LPAR and, at the same time, use the Storwize GUI to enter service state for Node 1 and Node 2.
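The Power part of this simulated disaster can also be triggered from the HMC command line. This is only a sketch; the managed system and partition names are examples:

   chsysstate -m Power_site1 -r lpar -n IBM_i_LPAR1 -o shutdown --immed
      (immediately powers off the IBM i LPAR, simulating the loss of the
      Power hardware at Site 1)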
Failover to Site 2
After the simulated disaster, we fail over to Site 2. In Storwize, we remap the IBM i LUNs to the host of IBM i on Site 2, and then complete the following steps:
1. In the HMC on Site 2, we IPL IBM i on Site 2 and restart the workload.
2. After the failover, there are 8 paths in IBM_i_site2: 4 active and 4 passive. Failed paths are not indicated because we started the partition after the paths to Node 1 and Node 2 had already failed. The paths in IBM i on Site 2 are shown in Figure 5-10.
Figure 5-10 Paths after failover
3. The I/O rate is transferred to Storwize Node 3 and Node 4 on Site 2.
4. To end the simulated disaster, we exit service state on Node 1 and Node 2 of the Storwize system by using the GUI, and then power on IBM i by using the Power HMC. After the nodes at Site 1 are enabled, IBM i on Site 2 sees 16 paths: 8 active and 8 passive. This is shown in Figure 5-11 on page 41; a CLI check is sketched after the figure.
Figure 5-11 Paths after enabling nodes at Site 1
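After the Site 1 nodes are enabled again, the HyperSwap (active-active) relationships of the IBM i volumes resynchronize automatically. One way this could be verified from the Storwize CLI:

   lsrcrelationship
      (lists the remote copy relationships; the active-active relationships
      of the HyperSwap volumes should return to the consistent_synchronized
      state when resynchronization completes)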
Failback to Site 1
To fail back, complete the following steps:
1. Power down IBM_i_site2 from the operating system.
2. Change the LUN mapping to the host of IBM i on Site 1, and start IBM i on Site 1.
3. Restart the workload on Site 1.
The I/O rate during the failover to Site 2 and the failback to Site 1 is shown in the IBM Spectrum Control graph in Figure 5-12 on page 41.
Figure 5-12 IBM Spectrum Control graph of I/O rate
5.2 Business continuity scenarios with Live Partition Mobility and Storwize HyperSwap
Our setup fulfills the following requirements for Live Partition Mobility (LPM):
IBM POWER7® firmware level.
SSH communication between the two HMCs is enabled.
The IBM i partition has only virtual resources: both the disk capacity and the Ethernet connectivity are virtualized.
The second WWPN of each virtual FC adapter in IBM i is zoned in the SAN switches in the same way as the first WWPN.
There is a host in Storwize for the second WWPN, and the HyperSwap LUNs are mapped to both hosts: the one with the first WWPN and the one with the second WWPN (a CLI sketch follows the note below).
For the complete checklist of prerequisites for LPM, see IBM PowerVM Virtualization Managing and Monitoring, SG24-7590.
 
Note: The Storwize host with the second WWPN is assigned to Site 2, as shown in Figure 5-20 on page 46.
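The host object for the second WWPN can be created and assigned to Site 2 from the Storwize CLI. A minimal sketch; the host name, the WWPN, and the volume name are placeholders for illustration only:

   mkhost -name IBM_i_wwpn2 -fcwwpn C05076000000000A -type generic
   chhost -site site2 IBM_i_wwpn2
   mkvdiskhostmap -host IBM_i_wwpn2 IBM_i_LUN1      (repeat for each HyperSwap LUN)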
5.2.1 Planned outage with Live Partition Mobility and Storwize HyperSwap
To validate that the IBM i partition is eligible for LPM, complete the following steps:
1. Select the partition in the HMC. From the menu, select Operations → Mobility → Validate, as shown in Figure 5-13. (A command-line alternative is sketched after these steps.)
Figure 5-13 Validate LPM
2. Enter the IP address and user ID of the server at Site 2 to migrate to, as shown in Figure 5-14.
Figure 5-14 Insert credentials of site 2
3. After successful validation, click Migrate as shown in Figure 5-15.
Figure 5-15 Migrate to site 2
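The validation and the migration can also be driven from the HMC command line with the migrlpar command. A sketch only; the managed system names, the partition name, the remote HMC address, and the user ID are examples:

   migrlpar -o v -m Power_site1 -t Power_site2 -p IBM_i_LPAR --ip hmc_site2 -u hscroot
      (validate the partition for migration to the Site 2 server)
   migrlpar -o m -m Power_site1 -t Power_site2 -p IBM_i_LPAR --ip hmc_site2 -u hscroot
      (perform the migration; the same command with the source and target
      reversed can be used for the migration back that is described in 5.2.2)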
During the migration, the progress status is shown in an HMC window, as can be seen in Figure 5-16. When the migration finishes, the window indicates success (Figure 5-17).
Figure 5-16 Migration progress
Figure 5-17 Migration success
During the migration, the IBM i partition with its workload keeps running. After the migration, the IBM i LPAR runs on the Power server at Site 2, as can be seen in Figure 5-18.
Figure 5-18 Migrated LPAR
The IBM i workload runs the whole time, and there are no messages in QSYSOPR. During the migration, the I/O rate transfers from the Storwize V7000 nodes at Site 1 to the nodes at Site 2. The paths to the IBM i LUNs in the migrated partition are shown in Figure 5-19.
Figure 5-19 Paths to IBM i LUNs in migrated LPAR
The Storwize V7000 hosts after the partition is migrated are shown in Figure 5-20.
Figure 5-20 Storwize hosts after partition is migrated
5.2.2 Migration back to site 1
To migrate back to site 1, complete the following steps:
1. When the planned outage is finished, validate the migration back to Site 1.
2. After successful validation, start the migration by clicking Migrate, as shown in Figure 5-21.
Figure 5-21 Migrate back to site 1
3. The HMC shows the progress of the migration (Figure 5-22), and the migration back to Site 1 finishes successfully.
Figure 5-22 Progress of migration back to site 1
The I/O rate on the Storwize nodes, as captured by IBM Spectrum Control, is shown in Figure 5-23. During the LPM migration to Site 2, the I/O rate transfers from the nodes on Site 1 to the nodes on Site 2. When migrating back, the I/O rate transfers back to the nodes on Site 1.
Figure 5-23 IBM Spectrum Control graph of I/O rate on nodes