What’s new with IBM Cluster Aware AIX and Reliable Scalable Cluster Technology
This chapter provides details about what is new with IBM Cluster Aware AIX (CAA) and with IBM Reliable Scalable Cluster Technology (RSCT).
This chapter describes the following topics:
CAA
Automatic repository update for the repository disk
Reliable Scalable Cluster Technology overview
IBM PowerHA, RSCT, and CAA
4.1 CAA
This section describes in more detail some of the new CAA features.
4.1.1 CAA tunables
This section and other places in this book mention CAA tunables and how they behave. Example 4-1 shows the list of the CAA tunables with IBM AIX V7.2.0.0 and IBM PowerHA V7.2.0. Newer versions can have more tunables, different defaults, or both.
 
Attention: Do not change any of these tunables without the explicit permission of IBM.
In general, you should never modify these values yourself, because they are set and managed by PowerHA.
Example 4-1 List of CAA tunables
# clctrl -tune -a
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).communication_mode = u
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).config_timeout = 240
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).deadman_mode = a
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).link_timeout = 30000
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).local_merge_policy = m
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).network_fdt = 20000
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).no_if_traffic_monitor = 0
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).node_down_delay = 10000
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).node_timeout = 30000
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).packet_ttl = 32
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).remote_hb_factor = 1
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).repos_mode = e
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).site_merge_policy = p
#
4.1.2 Overview of what is new in CAA
The following new features are included in CAA:
Automatic Repository Update (ARU)
Also known as Automatic Repository Replacement (ARR)
See 4.2, “Automatic repository update for the repository disk” on page 77 for more details.
Monitor /var usage
New -g option for the lscluster command
Interface Failure Detection:
 – Tuning for Interface Failure Detection
 – Send multicast packet to generate incoming traffic
 – Implementation of network monitor (NETMON) within CAA
Functional Enhancements
 – Reduce dependency of CAA node name on hostname
 – Roll back on mkcluster failure or partial success
Reliability, Availability, and Serviceability (RAS) Enhancements
 – Message improvements
 – Several syslog.caa serviceability improvements
 – Enhanced Dead Man Switch (DMS) error logging
4.1.3 Monitoring /var usage
Starting with PowerHA V7.2, the /var file system is monitored by default. This monitoring is done by the clconfd subsystem. The following default values are used:
Threshold: 75% (range: 70 - 95%)
Interval: 15 minutes (range: 5 - 30 minutes)
To change the default values, use the chssys command. The -t option specifies the threshold in percent, and the -i option specifies the interval in minutes:
chssys -s clconfd -a "-t 80 -i 10"
To check which values are currently in use, you have two options: use ps -ef | grep clconfd, or use the odmget -q "subsysname='clconfd'" SRCsubsys command. Example 4-2 shows the output of both commands with the default values in effect. When the defaults are used, the cmdargs line in the odmget output is empty, and ps -ef shows no arguments after clconfd.
Example 4-2 Check clconfd (when default values are used)
# ps -ef | grep clconfd
    root 3713096 3604778 0 17:50:30 - 0:00 /usr/sbin/clconfd
#
# odmget -q "subsysname='clconfd'" SRCsubsys
SRCsubsys:
 
subsysname = "clconfd"
synonym = ""
cmdargs = ""
path = "/usr/sbin/clconfd"
uid = 0
auditid = 0
standin = "/dev/null"
standout = "/dev/null"
standerr = "/dev/null"
action = 1
multi = 0
contact = 2
svrkey = 0
svrmtype = 0
priority = 20
signorm = 2
sigforce = 9
display = 1
waittime = 20
grpname = "caa"
Example 4-3 shows what happens when you change the default values, and what the output of the odmget and ps -ef commands looks like after that change.
 
Important: You must stop and restart the subsystem for your changes to take effect.
Example 4-3 Change monitoring for /var
# chssys -s clconfd -a "-t 80 -i 10"
0513-077 Subsystem has been changed
#
# stopsrc -s clconfd
0513-044 The clconfd Subsystem was requested to stop.
#
# startsrc -s clconfd
0513-059 The clconfd Subsystem has been started. Subsystem PID is 13173096.
# ps -ef | grep clconfd
    root 13173096 3604778 0 17:50:30 - 0:00 /usr/sbin/clconfd -t 80 -i 10
#
# odmget -q "subsysname='clconfd'" SRCsubsys
 
SRCsubsys:
subsysname = "clconfd"
synonym = ""
cmdargs = "-t 80 -i 10"
path = "/usr/sbin/clconfd"
uid = 0
auditid = 0
standin = "/dev/null"
standout = "/dev/null"
standerr = "/dev/null"
action = 1
multi = 0
contact = 2
svrkey = 0
svrmtype = 0
priority = 20
signorm = 2
sigforce = 9
display = 1
waittime = 20
grpname = "caa"
If the threshold is exceeded, an entry is written to the AIX error log. Example 4-4 shows what such an error entry can look like.
Example 4-4 Error message of /var monitoring
LABEL: CL_VAR_FULL
IDENTIFIER: E5899EEB
 
Date/Time: Fri Nov 13 17:47:15 2015
Sequence Number: 1551
Machine Id: 00F747C94C00
Node Id: esp-c2n1
Class: S
Type: PERM
WPAR: Global
Resource Name: CAA (for RSCT)
 
Description
/var filesystem is running low on space
 
Probable Causes
Unknown
 
Failure Causes
Unknown
 
Recommended Actions
RSCT could malfunction if /var gets full
Increase the filesystem size or delete unwanted files
 
Detail Data
Percent full
          81
Percent threshold
          80
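To check quickly whether this monitoring has already raised an alert on a node, you can search the AIX error log for the label and identifier shown in Example 4-4. This is a minimal sketch that uses standard errpt options (-J selects entries by error label, and -a -j shows the detailed entry for an identifier):
# errpt -J CL_VAR_FULL
# errpt -a -j E5899EEB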
4.1.4 New lscluster option -g
Starting with AIX V7.1 TL4 and AIX V7.2, there is an additional option for the CAA lscluster command.
The new option -g lists the used communication paths of CAA.
 
Note: At the time this publication was written, this option was not available in AIX versions earlier than AIX V7.1.4.
The lscluster -i command lists all of the communication paths that are seen by CAA, but it does not show whether all of them can potentially be used for heartbeating. This is particularly relevant if you use a network that is set to private, or if you have removed a network from the PowerHA configuration.
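A quick way to see which interfaces CAA has discovered but does not consider for heartbeating is to compare the interface lists of the two commands. The following sketch, which assumes only standard grep and diff behavior, extracts the interface lines from both outputs and shows the differences:
# lscluster -i | grep "Interface number" > /tmp/lscluster_i.out
# lscluster -g | grep "Interface number" > /tmp/lscluster_g.out
# diff /tmp/lscluster_i.out /tmp/lscluster_g.out
The sections that follow show complete outputs for the configurations that produce such differences.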
Using all interfaces
When using the standard way to configure a cluster, all networks configured in AIX are added to the PowerHA and CAA configuration. In our test cluster, we configured two IP interfaces in AIX. Example 4-5 shows the two networks in our PowerHA configuration, both set to public.
Example 4-5 The cllsif command with all interfaces on public
> cllsif
Adapter Type Network Net Type Attribute Node IP Address Hardware Address Interface Name Global Name Netmask Alias for HB Prefix Length
 
n1adm boot adm_net ether public powerha-c2n1 10.17.1.100 en1 255.255.255.0 24
powerha-c2n1 boot service_net ether public powerha-c2n1 172.16.150.121 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n1 172.16.150.125 255.255.0.0 16
n2adm boot adm_net ether public powerha-c2n2 10.17.1.110 en1 255.255.255.0 24
powerha-c2n2 boot service_net ether public powerha-c2n2 172.16.150.122 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n2 172.16.150.125 255.255.0.0 16
#
In this case, the lscluster -i output looks like that shown in Example 4-6.
Example 4-6 The lscluster -i command (all interfaces on public)
> lscluster -i
Network/Storage Interface Query
 
Cluster Name: ha72cluster
Cluster UUID: 63d12f4e-e61b-11e5-8016-4217e0ce7b02
Number of nodes reporting = 2
Number of nodes stale = 0
Number of nodes expected = 2
 
Node powerha-c2n1.munich.de.ibm.com
Node UUID = 63b68a36-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 3
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.121 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, en1
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:05
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.17.1.100 broadcast 10.17.1.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 3, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
 
Node powerha-c2n2.munich.de.ibm.com
Node UUID = 63b68a86-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 3
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.122 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, en1
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:05
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.17.1.110 broadcast 10.17.1.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 3, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
root@powerha-c2n1:/>
Example 4-7 shows the output of the lscluster -g command. When you compare it with the lscluster -i output, you should not find any differences, because in this example all of the networks are allowed to potentially be used for heartbeating.
Example 4-7 The lscluster -g command output in relation to cllsif output
> lscluster -g
Network/Storage Interface Query
 
Cluster Name: ha72cluster
Cluster UUID: 63d12f4e-e61b-11e5-8016-4217e0ce7b02
Number of nodes reporting = 2
Number of nodes stale = 0
Number of nodes expected = 2
 
Node powerha-c2n1.munich.de.ibm.com
Node UUID = 63b68a36-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 3
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.121 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, en1
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:05
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.17.1.100 broadcast 10.17.1.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 3, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
 
Node powerha-c2n2.munich.de.ibm.com
Node UUID = 63b68a86-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 3
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.122 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, en1
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:05
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.17.1.110 broadcast 10.17.1.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 3, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
root@powerha-c2n1:/>
One network set to private
The examples in this section describe the lscluster command output when you change one or more networks to private. Example 4-8 shows the starting point. In our test environment, we changed one network to private.
 
Note: Private networks cannot be used for any services. When you want to use a service IP address, the network must be public.
Example 4-8 The cllsif command (one network set to private)
# cllsif
Adapter Type Network Net Type Attribute Node IP Address Hardware Address Interface Name Global Name Netmask Alias for HB Prefix Length
 
n1adm service adm_net ether private powerha-c2n1 10.17.1.100 en1 255.255.255.0 24
powerha-c2n1 boot service_net ether public powerha-c2n1 172.16.150.121 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n1 172.16.150.125 255.255.0.0 16
n2adm service adm_net ether private powerha-c2n2 10.17.1.110 en1 255.255.255.0 24
powerha-c2n2 boot service_net ether public powerha-c2n2 172.16.150.122 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n2 172.16.150.125 255.255.0.0 16
#
Because we did not change the architecture of our cluster, the output of the lscluster -i command is still the same, as shown in Example 4-6 on page 68.
 
Remember: You must synchronize your cluster before the change to private is visible
in CAA.
Example 4-9 shows the lscluster -g command output after the synchronization. If you now compare the output of the lscluster -g command with the lscluster -i command or with the lscluster -g output from the previous example, you see that the entries about en1 (in our example) do not appear any longer. In other words, the list of networks potentially allowed to be used for heartbeat is shorter.
Example 4-9 The lscluster -g command (one private network)
# lscluster -g
Network/Storage Interface Query
 
Cluster Name: ha72cluster
Cluster UUID: 55430510-e6a7-11e5-8035-4217e0ce7b02
Number of nodes reporting = 2
Number of nodes stale = 0
Number of nodes expected = 2
 
Node powerha-c2n1.munich.de.ibm.com
Node UUID = 55284db0-e6a7-11e5-8035-4217e0ce7b02
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.121 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
 
Node powerha-c2n2.munich.de.ibm.com
Node UUID = 55284df6-e6a7-11e5-8035-4217e0ce7b02
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.122 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
#
Remove networks from PowerHA
The examples in this section describe the lscluster command output when you remove one or more networks from the list of known networks in PowerHA. Example 4-10 shows the starting point for this example. In our test environment, we removed the adm_net network.
Example 4-10 The cllsif command (removed network)
# cllsif
Adapter Type Network Net Type Attribute Node IP Address Hardware Address Interface Name Global Name Netmask Alias for HB Prefix Length
 
powerha-c2n1 boot service_net ether public powerha-c2n1 172.16.150.121 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n1 172.16.150.125 255.255.0.0 16
powerha-c2n2 boot service_net ether public powerha-c2n2 172.16.150.122 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n2 172.16.150.125 255.255.0.0 16
#
Because we did not change the architecture of our cluster, the output of the lscluster -i command is still the same as listed in Example 4-6 on page 68.
Remember that you must synchronize your cluster before the removal of the network is visible in CAA.
Example 4-11 shows the lscluster -g output after the synchronization. If you now compare the output of the lscluster -g command with the previous lscluster -i command, or with the lscluster -g output in “Using all interfaces” on page 68, you see that the entries about en1 (in our example) do not appear.
When you compare the content of Example 4-11 with the content of Example 4-9 on page 73 in “One network set to private” on page 72, you see that the output of the lscluster -g commands is identical.
Example 4-11 The lscluster -g command output (removed network)
# lscluster -g
Network/Storage Interface Query
 
Cluster Name: ha72cluster
Cluster UUID: 63d12f4e-e61b-11e5-8016-4217e0ce7b02
Number of nodes reporting = 2
Number of nodes stale = 0
Number of nodes expected = 2
 
Node powerha-c2n1.munich.de.ibm.com
Node UUID = 63b68a36-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.121 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
 
Node powerha-c2n2.munich.de.ibm.com
Node UUID = 63b68a86-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.122 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
root@powerha-c2n1:/>
#
4.1.5 Interface failure detection
PowerHA V7.1 used a fixed network failure detection time of about 5 seconds. In PowerHA V7.2, the default is now 20 seconds, and it is controlled by the tunable named network_fdt.
 
Note: The network_fdt tunable is also available in PowerHA V7.1.3. To get it for your PowerHA V7.1.3 version, you must open a PMR and request the Tunable FDT IFix bundle.
The self-adjusting network heartbeating behavior of CAA, which was introduced with PowerHA V7.1.0, is still present and still used. It has no impact on the network failure detection time.
The network_fdt tunable can be set to zero to maintain the default behavior, or it can be set to a value in the range of 5 - 10 seconds less than the node_timeout value.
The default recognition time for a network problem is not affected by this tunable: it is 0 seconds for hard failures and 5 seconds for soft failures (since PowerHA V7.1.0). CAA continues to check the network, but it waits until the end of the defined timeout before it creates a network down event.
For PowerHA nodes, when the effective level of CAA is 4 (also known as the 2015 release), CAA automatically sets network_fdt to 20 seconds and node_timeout to 30 seconds.
 
Note: At the time that this publication was written, the only way to find out whether CAA level 4 is installed is to use the lscluster -c command. In the lscluster -c output, check whether AUTO_REPOS_REPLACE is listed in the effective cluster-wide capabilities.
For instance, use the following command:
# lscluster -c | grep "Effective cluster-wide capabilities"
Effective cluster-wide capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
#
Example 4-12 shows how to both check and change the CAA network tunable attribute using the CAA native clctrl command.
Example 4-12 Using clctrl to change CAA network tunable
# clctrl -tune -o network_fdt
HA72a_cluster(641d80c2-bd87-11e5-8005-96d75a7c7f02).network_fdt = 20000
 
# clctrl -tune -o network_fdt=10000
1 tunable updated on cluster PHA72a_cluster
 
# clctrl -tune -o network_fdt
PHA72a_cluster(641d80c2-bd87-11e5-8005-96d75a7c7f02).network_fdt = 10000
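If you experimented with the value as shown in Example 4-12, you can restore the default and confirm that the margin to node_timeout described earlier is preserved. The following lines are a sketch that reuses only the clctrl syntax shown above (all values are in milliseconds):
# clctrl -tune -o node_timeout
# clctrl -tune -o network_fdt=20000
# clctrl -tune -o network_fdt
With node_timeout at its default of 30000 ms, a network_fdt of 20000 ms keeps the 10-second margin between the two tunables.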
4.2 Automatic repository update for the repository disk
This section discusses the new PowerHA Automatic Repository Update (ARU) feature for the PowerHA repository disk.
4.2.1 Introduction to the automatic repository update
Starting with PowerHA V7.1.0, PowerHA uses a shared disk, called the PowerHA repository disk, for various purposes. The availability of this repository disk is critical to the operation of PowerHA clustering and its nodes. The initial implementation of the repository disk, in PowerHA V7.1.0, did not allow PowerHA cluster services to operate if the repository disk failed, which made it a single point of failure.
With later versions of PowerHA, features were added to make the cluster more resilient to a repository disk failure. The ability to survive a repository disk failure, and to manually replace a repository disk without an outage, increased the resiliency of PowerHA. PowerHA V7.2.0 introduces a feature that increases the resiliency further: Automatic Repository Update (ARU).
The purpose of ARU is to automate the replacement of a failed active repository disk, without intervention from a system administrator and without affecting active cluster services. All that is needed is to define the backup repository disks that PowerHA can use if the active repository disk fails.
When the active repository disk fails, PowerHA detects the failure and verifies that the disk is no longer usable. If it is unusable, PowerHA attempts to switch to the backup repository disk. If the switch is successful, the backup repository disk becomes the active repository disk.
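After the backup repository disks are defined, you can verify at any time which disk is the active repository and which disks are configured as backups. The following sketch uses only commands that appear elsewhere in this chapter (the active repository disk carries the caavg_private volume group, and the clmgr repository queries are referenced in Example 4-17):
# lspv | grep caavg_private
# clmgr query repository
# clmgr view report repository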
4.2.2 Requirements for Automatic Repository Update
The Automatic Repository Update has the following requirements:
AIX V7.1.4 or AIX V7.2.0.
PowerHA V7.2.0.
The storage used for the backup repository disk has the same requirements as the primary repository disk.
See the following website for the PowerHA repository disk requirements:
4.2.3 Configuring Automatic Repository Update
The configuration of ARU is automatic when you configure a backup repository disk for PowerHA; there is no separate ARU configuration step.
This section shows an example in a 2-site, 2-node cluster. The cluster configuration is similar to the diagram shown in Figure 4-1.
Figure 4-1 Storage example for PowerHA ARU showing linked and backup repository disks
For the purposes of this example, we configure a backup repository disk for each site of this 2-site cluster.
Configuring a backup repository disk
The following process details how to configure a backup repository disk. For our example, we perform this process for each site in our cluster:
1. Using AIX’s SMIT, run smitty sysmirror and select Cluster Nodes and Networks → Manage Repository Disks → Add a Repository Disk. Because our example is a 2-site cluster, you are prompted for a site, and then you are given a selection of possible repository disks. The screen captures in the following sections provide more details.
When you select Add a Repository Disk, you are prompted to select a site, as shown in Example 4-13.
Example 4-13 Selecting “Add a Repository Disk” in multi-site cluster
Manage Repository Disks
 
Move cursor to desired item and press Enter.
 
Add a Repository Disk
Remove a Repository Disk
Show Repository Disks
 
Verify and Synchronize Cluster Configuration
 
 
+--------------------------------------------------------------------------+
| Select a Site |
| |
| Move cursor to desired item and press Enter. |
| |
| primary_site1 |
| standby_site2 |
| |
| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
2. After selecting primary_site1, we are shown the repository disk menu (Example 4-14).
Example 4-14 Add a repository disk screen
Add a Repository Disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
 
[Entry Fields]
Site Name primary_site1
* Repository Disk [] +
 
 
 
F1=Help F2=Refresh F3=Cancel F4=List
F5=Reset F6=Command F7=Edit F8=Image
F9=Shell F10=Exit Enter=Do
3. Next, press F4 on the Repository Disk field, and you are shown the repository disk selection list, as shown in Example 4-15.
Example 4-15 Backup repository disk selection
Add a Repository Disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
Site Name primary_site1
* Repository Disk [] +
 
+--------------------------------------------------------------------------+
| Repository Disk |
| |
| Move cursor to desired item and press F7. |
| ONE OR MORE items can be selected. |
| Press Enter AFTER making all selections. |
| |
| hdisk3 (00f61ab295112078) on all nodes at site primary_site1 |
| hdisk4 (00f61ab2a61d5bc6) on all nodes at site primary_site1 |
| hdisk5 (00f61ab2a61d5c7e) on all nodes at site primary_site1 |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F7=Select F8=Image F10=Exit |
F5| Enter=Do /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
4. After selecting the appropriate disk, the choice is shown in Example 4-16.
Example 4-16 Add a repository disk preview screen
Add a Repository Disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
 
[Entry Fields]
Site Name primary_site1
* Repository Disk [(00f61ab295112078)] +
 
 
 
 
F1=Help F2=Refresh F3=Cancel F4=List
F5=Reset F6=Command F7=Edit F8=Image
F9=Shell F10=Exit Enter=Do
5. Next, after pressing the Enter key to make the changes, the confirmation screen appears, as shown in Example 4-17.
Example 4-17 Backup repository disk addition confirmation screen
COMMAND STATUS
 
Command: OK stdout: yes stderr: no
 
Before command completion, additional instructions may appear below.
 
Successfully added one or more backup repository disks.
To view the complete configuration of repository disks use:
"clmgr query repository" or "clmgr view report repository"
 
 
 
 
 
 
F1=Help F2=Refresh F3=Cancel F6=Command
F8=Image F9=Shell F10=Exit /=Find
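As an alternative to the SMIT path, the same definition can be made from the command line. The following lines are a sketch only; they assume that the clmgr repository object class that is referenced in the confirmation screen in Example 4-17 also supports an add action, and they use the disk and site names from our example:
# clmgr add repository hdisk3 SITE=primary_site1
# clmgr query repository
As with the SMIT method, the cluster configuration must be verified and synchronized afterward.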
4.2.4 Automatic Repository Update operations
PowerHA ARU operations are automatic when a backup repository disk is configured.
Successful ARU operation
As previously mentioned, ARU operations are automatic when a backup repository disk is defined. Our scenario has a 2-site cluster and a backup repository disk per site.
To induce a failure of the primary repository disk, we logged in to the Virtual I/O Server (VIOS) partitions that present storage to the cluster LPARs and deallocated the disk LUN that corresponds to the primary repository disk on one site of our cluster. This disables the primary repository disk; PowerHA ARU detects the failure and automatically activates the backup repository disk as the new active repository disk.
This section presents the following examples that were captured during this process:
1. Before disabling the primary repository disk, we look at the lspv command output and note that the active repository disk is hdisk1, as shown in Example 4-18.
Example 4-18 Output of the lspv command in an example cluster
hdisk0 00f6f5d09570f647 rootvg active
hdisk1 00f6f5d0ba49cdcc caavg_private active
hdisk2 00f6f5d0a621e9ff None
hdisk3 00f61ab2a61d5c7e None
hdisk4 00f61ab2a61d5d81 testvg01 concurrent
hdisk5 00f61ab2a61d5e5b testvg01 concurrent
hdisk6 00f61ab2a61d5f32 testvg01 concurrent
2. We then proceed to log in to the VIOS servers that present the repository disk to this logical partition (LPAR) and de-allocate that logical unit (LUN) so that the cluster LPAR no longer has access to that disk. This causes the primary repository disk to fail.
3. At this point, PowerHA ARU detects the failure and activates the backup repository disk as the active repository disk. You can verify this behavior in the syslog.caa log file. This log file logs the ARU activities and shows the detection of the primary repository disk failure, and the activation of the backup repository disk. See Example 4-19.
Example 4-19 The /var/adm/ras/syslog.caa file showing repository disk failure and recovery
Nov 12 09:13:29 primo_s2_n1 caa:info cluster[14025022]: caa_config.c run_list 1377 1 = = END REPLACE_REPOS Op = = POST Stage = =
Nov 12 09:13:30 primo_s2_n1 caa:err|error cluster[14025022]: cluster_utils.c cluster_repository_read 5792 1 Could not open cluster repository device /dev/rhdisk1: 5
Nov 12 09:13:30 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_kern_repos_check 11769 1 Could not read the respository.
Nov 12 09:13:30 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11862 1 START '/usr/sbin/importvg -y caavg_private_t -O hdisk1'
Nov 12 09:13:32 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11893 1 FINISH return = 1
Nov 12 09:13:32 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11862 1 START '/usr/sbin/reducevg -df caavg_private_t hdisk1'
Nov 12 09:13:32 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11893 1 FINISH return = 1
Nov 12 09:13:33 primo_s2_n1 caa:err|error cluster[14025022]: cluster_utils.c cluster_repository_read 5792 1 Could not open cluster repository device /dev/rhdisk1: 5
Nov 12 09:13:33 primo_s2_n1 caa:info cluster[14025022]: cl_chrepos.c destroy_old_repository 344 1 Failed to read repository data.
Nov 12 09:13:34 primo_s2_n1 caa:err|error cluster[14025022]: cluster_utils.c cluster_repository_write 5024 1 return = -1, Could not open cluster repository device /dev/rhdisk1: I/O error
Nov 12 09:13:34 primo_s2_n1 caa:info cluster[14025022]: cl_chrepos.c destroy_old_repository 350 1 Failed to write repository data.
Nov 12 09:13:34 primo_s2_n1 caa:warn|warning cluster[14025022]: cl_chrepos.c destroy_old_repository 358 1 Unable to destroy repository disk hdisk1. Manual intervention is required to clear the disk of cluster identifiers.
Nov 12 09:13:34 primo_s2_n1 caa:info cluster[14025022]: cl_chrepos.c automatic_repository_update 2242 1 Replaced hdisk1 with hdisk2
Nov 12 09:13:34 primo_s2_n1 caa:info cluster[14025022]: cl_chrepos.c automatic_repository_update 2255 1 FINISH rc = 0
Nov 12 09:13:34 primo_s2_n1 caa:info cluster[14025022]: caa_protocols.c recv_protocol_slave 1542 1 Returning from Automatic Repository replacement rc = 0
4. As an extra verification, note that the AIX error log has an entry showing that a successful repository disk replacement has occurred, as shown in Example 4-20.
Example 4-20 AIX error log showing successful repository disk replacement message
LABEL: CL_ARU_PASSED
IDENTIFIER: 92EE81A5
 
Date/Time: Thu Nov 12 09:13:34 2015
Sequence Number: 1344
Machine Id: 00F6F5D04C00
Node Id: primo_s2_n1
Class: H
Type: INFO
WPAR: Global
Resource Name: CAA ARU
Resource Class: NONE
Resource Type: NONE
Location:
 
Description
Automatic Repository Update succeeded.
 
Probable Causes
Primary repository disk was replaced.
 
Failure Causes
A hardware problem prevented local node from accessing primary repository disk.
 
Recommended Actions
Primary repository disk was replaced using backup repository disk.
 
Detail Data
Primary Disk Info
hdisk1 6c1b76e1-3e0a-ff3c-3c43-cb6c3881c3bf
Replacement Disk Info
hdisk2 5890b139-e987-1451-211e-24ba89e7d1df
At this point, it is safe to remove the failed repository disk and replace it. The replacement disk can become the new backup repository disk by following the steps in “Configuring a backup repository disk” on page 79.
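Before you remove the failed disk, you can confirm on the node that the swap occurred. The following sketch reuses commands that are already shown in this section (the lspv output format from Example 4-18 and the error label from Example 4-20):
# lspv | grep caavg_private
# errpt -J CL_ARU_PASSED
After the swap, the former backup disk (hdisk2 in our example) carries the caavg_private volume group, and the CL_ARU_PASSED entry confirms the successful replacement.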
Possible ARU failure situations
Note that some activities can affect the operation of ARU. Specifically, any administrative activity that uses the backup repository disk can affect ARU. If a volume group was previously created on a backup repository disk and the disk was not cleaned up afterward, ARU cannot operate properly. A quick check for this condition is sketched after Example 4-24.
In our sample scenario, we completed the following steps:
1. Configure a backup repository disk that previously contained an AIX volume group (VG).
2. Export the AIX VG so that the lspv command no longer displays a volume group for the disk. However, we did not delete the volume group information from the disk, so the disk itself still contained it.
3. For our example, we ran the AIX command, lspv. Our backup repository disk is hdisk2. The disk shows a PVID but no volume group, as shown in Example 4-21.
Example 4-21 Output of lspv command in an example cluster showing hdisk2
hdisk0 00f6f5d09570f647 rootvg active
hdisk1 00f6f5d0ba49cdcc caavg_private active
hdisk2 00f6f5d0a621e9ff None
hdisk3 00f61ab2a61d5c7e None
hdisk4 00f61ab2a61d5d81 testvg01 concurrent
hdisk5 00f61ab2a61d5e5b testvg01 concurrent
hdisk6 00f61ab2a61d5f32 testvg01 concurrent
4. At this point, we disconnected the primary repository disk from the LPAR by going to the VIOS and de-allocating the disk LUN from the cluster LPAR. This made the primary repository disk fail immediately.
ARU then attempted to perform the following actions:
a. Verify that the primary repository disk is no longer accessible.
b. Switch to the backup repository disk (this action failed).
5. ARU left an error message in the AIX error report, as shown in Example 4-22.
Example 4-22 Output of AIX errpt command showing failed repository disk replacement
LABEL: CL_ARU_FAILED
IDENTIFIER: F63D60A2
 
Date/Time: Wed Nov 11 17:15:17 2015
Sequence Number: 1263
Machine Id: 00F6F5D04C00
Node Id: primo_s2_n1
Class: H
Type: INFO
WPAR: Global
Resource Name: CAA ARU
Resource Class: NONE
Resource Type: NONE
Location:
 
Description
Automatic Repository Update failed.
 
Probable Causes
Unknown.
 
Failure Causes
Unknown.
 
Recommended Actions
Try manual replacement of cluster repository disk.
 
Detail Data
Primary Disk Info
hdisk1 6c1b76e1-3e0a-ff3c-3c43-cb6c3881c3bf
6. In addition, the CAA log file /var/adm/ras/syslog.caa shows that ARU verified the primary repository disk and could not read it, as shown in Example 4-23.
Example 4-23 Selected messages from /var/adm/ras/syslog.caa log file
Nov 12 09:13:20 primo_s2_n1 caa:info unix: *base_kernext_services.c aha_thread_queue 614 The AHAFS event is EVENT_TYPE=REP_DOWN DISK_NAME=hdisk1 NODE_NUMBER=2 NODE_ID=0xD9DDB48A889411E580106E8DDB7B3702 SITE_NUMBER=2 SITE_ID=0xD9DE2028889411E580106E8DDB7B3702 CLUSTER_ID=0xD34E8658889411E580026E8DDB
Nov 12 09:13:20 primo_s2_n1 caa:info unix: caa_sock.c caa_kclient_tcp 231 entering caa_kclient_tcp ....
Nov 12 09:13:20 primo_s2_n1 caa:info unix: *base_kernext_services.c aha_thread_queue 614 The AHAFS event is EVENT_TYPE=VG_DOWN DISK_NAME=hdisk1 VG_NAME=caavg_private NODE_NUMBER=2 NODE_ID=0xD9DDB48A889411E580106E8DDB7B3702 SITE_NUMBER=2 SITE_ID=0xD9DE2028889411E580106E8DDB7B3702 CLUSTER_ID=0xD34E8
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11862 1 START '/usr/lib/cluster/caa_syslog '
Nov 12 09:13:20 primo_s2_n1 caa:info unix: kcluster_event.c find_event_disk 742 Find disk called for hdisk4
Nov 12 09:13:20 primo_s2_n1 caa:info unix: kcluster_event.c ahafs_Disk_State_register 1504 diskState set opqId = 0xF1000A0150301A00
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11893 1 FINISH return = 0
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: caa_message.c inherit_socket_inetd 930 1 IPv6=::ffff:127.0.0.1
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_kern_repos_check 11769 1 Could not read the respository.
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: caa_message.c cl_recv_req 172 1 recv successful, sock = 0, recv rc = 32, msgbytes = 32
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: caa_protocols.c recv_protocol_slave 1518 1 Automatic Repository Replacement request being processed.
7. ARU then attempted to activate the backup repository disk, but it failed because an AIX VG already existed on this disk, as shown in Example 4-24.
Example 4-24 Messages from the /var/adm/ras/syslog.caa log file showing ARU failure
Nov 12 09:11:26 primo_s2_n1 caa:info unix: kcluster_lock.c xcluster_lock 659 xcluster_lock: nodes which responded: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: cluster_utils.c cl_run_log_method 11862 1 START '/usr/sbin/mkvg -y caavg_private_t hdisk2'
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: cluster_utils.c cl_run_log_method 11893 1 FINISH return = 1
Nov 12 09:11:26 primo_s2_n1 caa:err|error cluster[8716742]: cl_chrepos.c check_disk_add 2127 1 hdisk2 contains an existing vg.
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: cl_chrepos.c automatic_repository_update 2235 1 Failure to move to hdisk2
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: cl_chrepos.c automatic_repository_update 2255 1 FINISH rc = -1
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: caa_protocols.c recv_protocol_slave 1542 1 Returning from Automatic Repository replacement rc = -1
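A candidate backup repository disk can be checked for such a leftover volume group before you rely on ARU. The following sketch assumes the low-level lqueryvg command that is shipped with the AIX LVM; if it reports volume group data for a disk that lspv shows as None, the disk must be cleaned before it can serve as a backup repository:
# lspv | grep hdisk2
# lqueryvg -Atp hdisk2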
Recovering from a failed ARU event
The previous section, “Possible ARU failure situations” on page 83, gave an example of what can prevent a successful repository disk replacement by ARU. To recover from that failed event, we manually switched the repository disks by using the PowerHA SMIT panels.
Complete the following steps:
1. Using AIX’s SMIT, run smitty sysmirror, and select Problem Determination Tools → Replace the Primary Repository Disk. Because our sample cluster has multiple sites, a menu is shown to select a site, as shown in Example 4-25.
Example 4-25 Site selection prompt after selecting “Replace the Primary Repository Disk”
Problem Determination Tools
 
Move cursor to desired item and press Enter.
 
[MORE...1]
View Current State
PowerHA SystemMirror Log Viewing and Management
Recover From PowerHA SystemMirror Script Failure
Recover Resource Group From SCSI Persistent Reserve Error
Restore PowerHA SystemMirror Configuration Database from Active Configuration
Release Locks Set By Dynamic Reconfiguration
Cluster Test Tool
+--------------------------------------------------------------------------+
| Select a Site |
| |
| Move cursor to desired item and press Enter. |
| |
| primary_site1 |
| standby_site2 |
| |
[M| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
2. In our example, we select standby_site2 and a screen is shown with an option to select the replacement repository disk, as shown in Example 4-26.
Example 4-26 Prompt to select a new repository disk
Select a new repository disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
Site Name standby_site2
* Repository Disk [] +
 
 
F1=Help F2=Refresh F3=Cancel F4=List
F5=Reset F6=Command F7=Edit F8=Image
F9=Shell F10=Exit Enter=Do
3. Pressing the F4 key displays the available backup repository disks, as shown in Example 4-27.
Example 4-27 SMIT menu prompting for replacement repository disk
Select a new repository disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
 
[Entry Fields]
Site Name standby_site2
* Repository Disk [] +
 
 
 
+--------------------------------------------------------------------------+
| Repository Disk |
| |
| Move cursor to desired item and press Enter. |
| |
| 00f6f5d0ba49cdcc |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F8=Image F10=Exit Enter=Do |
F5| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
4. Selecting the backup repository disk leads to the SMIT panel showing the selected disk, as shown in Example 4-28.
Example 4-28 SMIT panel showing the selected repository disk
Select a new repository disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
 
[Entry Fields]
Site Name standby_site2
* Repository Disk [00f6f5d0ba49cdcc] +
 
 
 
 
 
F1=Help F2=Refresh F3=Cancel F4=List
F5=Reset F6=Command F7=Edit F8=Image
F9=Shell F10=Exit Enter=Do
5. Finally, pressing the Enter key runs the repository disk replacement. After the repository disk is replaced, the screen shown in Example 4-29 is displayed.
Example 4-29 SMIT panel showing successful repository disk replacement
COMMAND STATUS
 
Command: OK stdout: yes stderr: no
 
Before command completion, additional instructions may appear below.
 
chrepos: Successfully modified repository disk or disks.
 
New repository "hdisk1" (00f6f5d0ba49cdcc) is now active.
The configuration must be synchronized to make this change known across the cluster.
 
 
 
 
 
F1=Help F2=Refresh F3=Cancel F6=Command
F8=Image F9=Shell F10=Exit /=Find
n=Find Next
At this point, it is safe to remove the failed repository disk and replace it. The replacement disk can become the new backup repository disk by following the steps described in “Configuring a backup repository disk” on page 79.
4.3 Reliable Scalable Cluster Technology overview
This section provides an overview of Reliable Scalable Cluster Technology (RSCT), its components, and the communication paths between these components. This section also describes which parts of RSCT are used by PowerHA. The items described here are not new, but they are needed for a basic understanding of the underlying PowerHA infrastructure.
4.3.1 What Reliable Scalable Cluster Technology is
Reliable Scalable Cluster Technology (RSCT) is a set of software components that together provide a comprehensive clustering environment for AIX, Linux, Solaris, and Microsoft Windows operating systems. RSCT is the infrastructure used by various IBM products to provide clusters with improved system availability, scalability, and ease of use.
4.3.2 Reliable Scalable Cluster Technology components
This section describes the RSCT components and how they communicate with each other.
Reliable Scalable Cluster Technology components overview
For a more detailed description of the RSCT components, see the IBM RSCT for AIX: Guide and Reference, SA22-7889 on the following website:
The main RSCT components are explained in this section:
Resource Monitoring and Control (RMC) subsystem
This is the scalable and reliable backbone of RSCT. RMC runs on a single machine or on each node (operating system image) of a cluster, and provides a common abstraction for the resources of the individual system or the cluster of nodes. You can use RMC for single-system monitoring or for monitoring nodes in a cluster. In a cluster, RMC provides global access to subsystems and resources throughout the cluster, thus providing a single monitoring and management infrastructure.
RSCT core resource managers
A resource manager is a software layer between a resource (a hardware or software entity that provides services to some other component) and RMC. A resource manager maps programmatic abstractions in RMC into the actual calls and commands of a resource.
RSCT cluster security services
This RSCT component provides the security infrastructure that enables RSCT components to authenticate the identity of other parties.
Group Services subsystem
This RSCT component provides cross-node/process coordination on some cluster configurations.
Topology Services subsystem
This RSCT component provides node and network failure detection on some cluster configurations.
Communication between RSCT components
The RMC subsystem and the RSCT core resource managers (RMs) are currently the only components that use the RSCT cluster security services. Since the availability of PowerHA V7, RSCT Group Services can use either Topology Services or CAA. Figure 4-2 on page 90 shows the RSCT components and their relationships.
The RMC application programming interface (API) is the only interface that applications can use to exchange data with the RSCT components. RMC manages the RMs and receives data from them. Group Services is a client of RMC. If PowerHA V7 is installed, Group Services connects to CAA; otherwise, it connects to RSCT Topology Services.
Figure 4-2 shows RSCT component relationships.
Figure 4-2 RSCT components
RSCT domains
An RSCT management domain is a set of nodes with resources that can be managed and monitored from one of the nodes, which is designated as the management control point (MCP). All other nodes are considered managed nodes. Topology Services and Group Services are not used in a management domain. Figure 4-3 shows the high-level architecture of an RSCT management domain.
Figure 4-3 RSCT managed domain (architecture)
An RSCT peer domain is a set of nodes that have a consistent knowledge of the existence of each other and of the resources shared among them. On each node within the peer domain, RMC depends on a core set of cluster services, which include Topology Services, Group Services, and cluster security services. Figure 4-4 shows the high-level architecture of an RSCT peer domain.
Figure 4-4 RSCT peer domain (architecture)
Group Services are used in peer domains. If PowerHA V7 is installed, Topology Services are not used, and CAA is used instead. Otherwise, Topology Services are used too.
Combination of management and peer domains
You can have a combination of both types of domains (management domain and peer domains).
Figure 4-5 on page 92 shows the high-level architecture for how an RSCT managed domain and RSCT peer domains can be combined. In this example, Node Y is an RSCT management server. You have three nodes as managed nodes (Node A, Node B, and Node C). Node B and Node C are part of an RSCT peer domain.
You can have multiple peer domains within a managed domain. A node can be part of a managed domain and a peer domain. A given node can only belong to a single peer domain, as shown in Figure 4-5.
Figure 4-5 Management and peer domain (architecture)
 
Important: A node can only belong to one RSCT peer domain.
Example of a management and a peer domain
The example here is greatly simplified. It shows one Hardware Management Console (HMC) that manages three LPARs, two of which are used for a 2-node PowerHA cluster.
In a Power Systems environment, the HMC is always the management server in the RSCT management domain. From an RSCT point of view, the LPARs are clients of this server. For instance, this management domain is used to perform dynamic LPAR (DLPAR) operations on the different LPARs.
Figure 4-6 shows this simplified setup.
Figure 4-6 Example management and peer domain
RSCT peer domain on Cluster Aware AIX (CAA)
When RSCT operates on nodes in a CAA cluster, a peer domain is created that is equivalent to the CAA cluster. This RSCT peer domain presents largely the same set of functions to users and software as peer domains that are not based on CAA. A peer domain that operates without CAA autonomously manages and monitors the configuration and liveness of the nodes and interfaces that it comprises.
The peer domain that represents a CAA cluster acquires its configuration information and liveness results from CAA. This introduces some differences in the mechanics of peer domain operations, but very few in the view of the peer domain that is available to users.
Only one CAA cluster can be defined on a set of nodes. Therefore, if a CAA cluster is defined, the peer domain that represents it is the only peer domain that can exist, and it exists and is online for the life of the CAA cluster.
Figure 4-7 illustrates the relationship discussed in this section.
Figure 4-7 RSCT peer domain and CAA
When your cluster is configured and synchronized, you can check the RSCT peer domain by using the lsrpdomain command. To list the nodes in this peer domain, use the lsrpnode command. Example 4-30 shows sample output of these commands.
The RSCTActiveVersion number in the lsrpdomain output can show a back-level version number. This is the lowest RSCT version that is required by a new joining node. In a PowerHA environment, there is no need to modify this value.
A value of yes for MixedVersions means that at least one node has a higher version than the displayed RSCT version. The lsrpnode command lists the RSCT version that is actually used on each node.
Example 4-30 List RSCT peer domain information
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
c2n1_cluster Online 3.1.5.0 Yes 12347 12348
# lsrpnode
Name OpState RSCTVersion
c2n2.munich.de.ibm.com Online 3.2.1.0
c2n1.munich.de.ibm.com Online 3.2.1.0
#
Update the RSCT peer domain version
If you like, you can upgrade the RSCT version of the RSCT peer domain that is reported by the lsrpdomain command. To do this, use the command listed in Example 4-31.
To be clear, such an update does not give you any advantage in a PowerHA environment. In fact, if you delete the cluster and then re-create it, either manually or by using an existing snapshot, the RSCT peer domain version reverts to the original version, which was 3.1.5.0 in our example.
Example 4-31 Update RSCT peer domain
# export CT_MANAGEMENT_SCOPE=2; runact -c IBM.PeerDomain CompleteMigration Options=0
#
Check for CAA
To do a quick check on the CAA cluster, you can use, for instance, the lscluster -c command or the lscluster -m command. Example 4-32 shows example output of these two commands. In most situations, if the lscluster command returns output, CAA is up and running. To be sure, use the lscluster -m command, which shows the state of each node.
Example 4-32 shows that, in our case, CAA is up and running on the local node (powerha-c2n2), where we ran the lscluster command, but CAA is stopped on the remote node (powerha-c2n1).
To stop CAA on that node, we used the clmgr off node powerha-c2n1 STOP_CAA=yes command.
Example 4-32 The lscluster -c and lscluster -m commands
# lscluster -c
Cluster Name: c2n1_cluster
Cluster UUID: d19995ae-8246-11e5-806f-fa37c4c10c20
Number of nodes in cluster = 2
Cluster ID for node c2n1.munich.de.ibm.com: 1
Primary IP address for node c2n1.munich.de.ibm.com: 172.16.150.121
Cluster ID for node c2n2.munich.de.ibm.com: 2
Primary IP address for node c2n2.munich.de.ibm.com: 172.16.150.122
Number of disks in cluster = 1
Disk = caa_r0 UUID = 12d1d9a1-916a-ceb2-235d-8c2277f53d06 cluster_major = 0 cluster_minor = 1
Multicast for site LOCAL: IPv4 228.16.150.121 IPv6 ff05::e410:9679
Communication Mode: unicast
Local node maximum capabilities: AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
Effective cluster-wide capabilities: AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
#
# lscluster -m | egrep "Node name|State of node"
Node name: powerha-c2n1.munich.de.ibm.com
State of node: DOWN
Node name: powerha-c2n2.munich.de.ibm.com
State of node: UP  NODE_LOCAL
#
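To bring CAA back online on the stopped node, the reverse operation can be used. The following lines are a sketch only; they assume that START_CAA is the counterpart of the STOP_CAA option used above:
# clmgr on node powerha-c2n1 START_CAA=yes
# lscluster -m | egrep "Node name|State of node"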
Peer domain on CAA linked clusters
Starting with PowerHA V7.1.2, linked clusters can be used. An RSCT peer domain that operates on linked clusters encompasses all nodes at each site. The nodes that comprise each site cluster are all members of the same peer domain.
Figure 4-8 shows how this looks from an architecture point of view.
Figure 4-8 RSCT peer domain and CAA linked cluster
Example 4-33 shows what the RSCT peer domain looks like in our 2-node linked cluster.
Example 4-33 Output of the lsrpdomain command
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
primo_s1_n1_cluster Online 3.1.5.0 Yes 12347 12348
# lsrpnode
Name OpState RSCTVersion
primo_s2_n1 Online 3.2.1.0
primo_s1_n1 Online 3.2.1.0
#
Because we defined each of our nodes to a different site, the lscluster -c command lists only one node per site. Example 4-34 shows an example output from node 1.
Example 4-34 Output of the lscluster command (node 1)
# lscluster -c
Cluster Name: primo_s1_n1_cluster
Cluster UUID: d34e8658-8894-11e5-8002-6e8ddb7b3702
Number of nodes in cluster = 2
Cluster ID for node primo_s1_n1: 1
Primary IP address for node primo_s1_n1: 192.168.100.20
Cluster ID for node primo_s2_n1: 2
Primary IP address for node primo_s2_n1: 192.168.100.21
Number of disks in cluster = 4
Disk = hdisk2 UUID = 2f1b2492-46ca-eb3b-faf9-87fa7d8274f7 cluster_major = 0 cluster_minor = 1
Disk = UUID = 6c1b76e1-3e0a-ff3c-3c43-cb6c3881c3bf cluster_major = 0 cluster_minor = 2
Disk = hdisk3 UUID = 20d93b0c-97e8-85ee-8b71-b880ccf848b7 cluster_major = 0 cluster_minor = 3
Disk = UUID = 5890b139-e987-1451-211e-24ba89e7d1df cluster_major = 0 cluster_minor = 4
Multicast for site primary_site1: IPv4 228.168.100.20 IPv6 ff05::e4a8:6414
Multicast for site standby_site2: IPv4 228.168.100.21 IPv6 ff05::e4a8:6415
Communication Mode: unicast
Local node maximum capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
Effective cluster-wide capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
#
Example 4-35 shows the output from node 2.
Example 4-35 Output of the lscluster command (node 2)
# lscluster -c
Cluster Name: primo_s1_n1_cluster
Cluster UUID: d34e8658-8894-11e5-8002-6e8ddb7b3702
Number of nodes in cluster = 2
Cluster ID for node primo_s1_n1: 1
Primary IP address for node primo_s1_n1: 192.168.100.20
Cluster ID for node primo_s2_n1: 2
Primary IP address for node primo_s2_n1: 192.168.100.21
Number of disks in cluster = 4
Disk = UUID = 2f1b2492-46ca-eb3b-faf9-87fa7d8274f7 cluster_major = 0 cluster_minor = 1
Disk = UUID = 20d93b0c-97e8-85ee-8b71-b880ccf848b7 cluster_major = 0 cluster_minor = 3
Disk = hdisk2 UUID = 5890b139-e987-1451-211e-24ba89e7d1df cluster_major = 0 cluster_minor = 4
Disk = hdisk1 UUID = 6c1b76e1-3e0a-ff3c-3c43-cb6c3881c3bf cluster_major = 0 cluster_minor = 2
Multicast for site standby_site2: IPv4 228.168.100.21 IPv6 ff05::e4a8:6415
Multicast for site primary_site1: IPv4 228.168.100.20 IPv6 ff05::e4a8:6414
Communication Mode: unicast
Local node maximum capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
Effective cluster-wide capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
#
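The site names that are shown in the lscluster output correspond to the PowerHA site definitions. If you want to cross-check them from the PowerHA side, you can query the sites with the clmgr command. A sketch, assuming the site names from Example 4-34 (the exact output format depends on your PowerHA level):
# clmgr query site
primary_site1
standby_site2
#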
4.4 IBM PowerHA, RSCT, and CAA
Starting with PowerHA V7.1, the CAA component is used instead of RSCT Topology Services in a PowerHA V7 setup. Figure 4-9 shows the connections between PowerHA V7, RSCT, and CAA. Mainly, the connection from PowerHA to RSCT Group Services, and from there to CAA and back, is used. The potential communication path to RMC is rarely used.
Figure 4-9 PowerHA, RSCT, CAA overview
4.4.1 Configuring PowerHA, RSCT, and CAA
There is no need to configure RSCT or CAA separately. You need only to configure or migrate PowerHA, as shown in Figure 4-10 on page 99. To set it up, use the smitty sysmirror panels or the clmgr command. The different migration processes operate in a similar way.
Figure 4-10 Set up PowerHA, RSCT, and CAA
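For example, a new 2-node cluster can be defined and deployed entirely with the clmgr command; the CAA cluster and the RSCT peer domain are created for you when the configuration is synchronized. A minimal sketch (the cluster name, node names, and repository disk are placeholders for your environment):
# clmgr add cluster my_cluster NODES=node1,node2 REPOSITORY=hdisk2
# clmgr sync cluster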
4.4.2 Relationship between PowerHA, RSCT, CAA
This section describes, from a high-level point of view, the relationship between PowerHA, RSCT, and CAA. The intention of this section is to give you a general understanding of what is running in the background. The examples that are used in this section are based on a 2-node cluster.
In normal situations, there is no need to use CAA or RSCT commands directly, because these components are managed by PowerHA.
All PowerHA components are up
In a cluster where the state of PowerHA is up on all nodes, you also have all of the RSCT and CAA services up and running, as shown in Figure 4-11.
Figure 4-11 All cluster services are up
To check if the services are up, you can use different commands. In the following examples, we use the clmgr, clRGinfo, lsrpdomain, and lscluster commands. Example 4-36 shows the output of the clmgr and clRGinfo PowerHA commands.
Example 4-36 Check PowerHA when all is up
# clmgr -a state query cluster
STATE="STABLE“
# clRGinfo
-----------------------------------------------------------------------------
Group Name                   Group State      Node
-----------------------------------------------------------------------------
Test_RG                      ONLINE           CL1_N1
                             OFFLINE          CL1_N2
#
To check if RSCT is up and running, use the lsrpdomain command. Example 4-37 shows the output of the command.
Example 4-37 Check for RSCT when all components are running
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Online 3.1.5.0 Yes 12347 12348
#
To check whether CAA is running correctly, we use the lscluster command. The lscluster command requires an option; we used the -m option, as shown in Example 4-38. In most cases, any other valid option can be used as well, but to be certain about the state of each node, use the -m option.
In general, if the command returns valid output, CAA is running. Otherwise, you get an error message that states that the cluster services are not active.
Example 4-38 Check for CAA when all is up
# lscluster -m | egrep "Node name|State of node"
Node name: powerha-c2n1
State of node: UP
Node name: powerha-c2n2
State of node: UP  NODE_LOCAL
#
One node stopped with Unmanage
In a cluster where one node is stopped with the Unmanage option, all of the underlying components (RSCT and CAA) stay up and running. Figure 4-12 illustrates what happens when LPAR A is stopped with the Unmanage option (a command sketch follows the figure).
Figure 4-12 One node where all RGs are unmanaged
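To put a node into this state, stop the cluster services on that node without bringing its resource groups offline. A sketch using the clmgr command (the node name is a placeholder; this corresponds to selecting Unmanage Resource Groups in the SMIT stop panel):
# clmgr stop node CL1_N1 MANAGE=unmanage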
The following examples use the same commands as in “All PowerHA components are up” on page 99 to check the status of the different components. Example 4-39 shows the output of the clmgr and clRGinfo PowerHA commands.
Example 4-39 Check PowerHA, one node in state unmanaged
# clmgr -a state query cluster
STATE="WARNING“
# clRGinfo
-----------------------------------------------------------------------------
Group Name                   Group State      Node
-----------------------------------------------------------------------------
Test_RG                      UNMANAGED        CL1_N1
                             UNMANAGED        CL1_N2
#
As expected, the output of the lsrpdomain RSCT command shows that RSCT is still online (see Example 4-40).
Example 4-40 Check RSCT, one node in state unmanaged
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Online 3.1.5.0 Yes 12347 12348
#
Also as expected, checking for CAA shows that it is up and running, as shown in Example 4-41.
Example 4-41 Check CAA, one node in state unmanaged
# lscluster -m | egrep "Node name|State of node"
Node name: powerha-c2n1
State of node: UP
Node name: powerha-c2n2
State of node: UP  NODE_LOCAL
#
PowerHA stopped on all nodes
When you stop PowerHA on all cluster nodes, you get the situation that is illustrated in Figure 4-13 (a command sketch follows the figure): PowerHA is stopped on all cluster nodes, but RSCT and CAA are still up and running. You have the same situation after a system reboot of all of your cluster nodes (assuming that you do not use the automatic startup of PowerHA).
Figure 4-13 PowerHA stopped on all cluster nodes
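To reach this state, stop PowerHA without the STOP_CAA argument so that RSCT and CAA remain active. A sketch using the clmgr command, which brings all resource groups offline on all nodes:
# clmgr stop cluster WHEN=now MANAGE=offline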
Again, we use the same commands as in “All PowerHA components are up” on page 99 to check the status of the different components. Example 4-42 shows the output of the PowerHA commands clmgr and clRGinfo.
As expected, the clmgr command shows that PowerHA is offline, and clRGinfo returns an error message.
Example 4-42 Check PowerHA, PowerHA stopped on all cluster nodes
# clmgr -a state query cluster
STATE="OFFLINE“
# clRGinfo
Cluster IPC error: The cluster manager on node CL1_N1 is in ST_INIT or NOT_CONFIGURED state and cannot process the IPC request.
#
As mentioned previously, the output of the RSCT lsrpdomain command shows that RSCT is still online (Example 4-43).
Example 4-43 Check RSCT, PowerHA stopped on all cluster nodes
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Online 3.1.5.0 Yes 12347 12348
#
And as expected, the check for CAA shows that it is running, as shown in Example 4-44.
When RSCT is running, CAA must be up as well. Keep in mind that this statement is true only for a PowerHA cluster.
Example 4-44 Check CAA, PowerHA stopped on all cluster nodes
# lscluster -m | egrep "Node name|State of node"
Node name: powerha-c2n1
State of node: UP
Node name: powerha-c2n2
State of node: UP  NODE_LOCAL
#
All cluster components are stopped
Remember that, by default, CAA and RSCT are automatically started as part of an operating system restart (if they are configured by PowerHA).
There are situations when you need to stop all three cluster components, for instance when you need to change the RSCT or CAA code, as shown in Figure 4-14.
For example, to stop all cluster components, use clmgr off cluster STOP_CAA=yes. For more details about starting and stopping CAA, see 4.4.3, “How to start and stop CAA and RSCT” on page 104.
Figure 4-14 All cluster services stopped
Example 4-45 shows the status of the cluster with all services stopped. As in the previous examples, we used the clmgr and clRGinfo commands.
Example 4-45 Check PowerHA, all cluster services stopped
# clmgr -a state query cluster
STATE="OFFLINE“
root@CL1_N1:/home/root# clRGinfo
Cluster IPC error: The cluster manager on node CL1_N1 is in ST_INIT or NOT_CONFIGURED state and cannot process the IPC request.
#
The lsrpdomain command shows that the RSCT cluster is offline, as shown in Example 4-46.
Example 4-46 Check RSCT, all cluster services stopped
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Offline 3.1.5.0 Yes 12347 12348
#
As mentioned previously, when CAA is not active, the lscluster command returns an error message, as shown in Example 4-47.
Example 4-47 Check CAA, all cluster services stopped
# lscluster -m
lscluster: Cluster services are not active on this node because it has been stopped.
#
4.4.3 How to start and stop CAA and RSCT
CAA and RSCT are stopped and started together. As mentioned in the previous section, CAA and RSCT are automatically started as part of an operating system boot (if they are configured by PowerHA).
If you want to stop CAA and RSCT, you must use the clmgr command with the STOP_CAA=yes argument (at the time of writing, SMIT does not support this operation). The argument stops both CAA and RSCT, and it can be applied to the complete cluster or to a set of nodes.
Remember that the fact that you stopped CAA manually is preserved across an operating system reboot. Therefore, if you want to start PowerHA on a node where CAA and RSCT were stopped deliberately, you must use the START_CAA=yes argument.
To start CAA and RSCT, you can use the clmgr command with the argument START_CAA=yes. Remember that this command also starts PowerHA.
Example 4-48 shows how to stop or start CAA and RSCT. Remember that all of these examples stop all three components or start all three components.
Example 4-48 Using clmgr to start and stop CAA, RSCT
To Stop CAA and RSCT:
- clmgr off cluster STOP_CAA=yes
- clmgr off node system-a STOP_CAA=yes
 
To Start CAA and RSCT:
- clmgr on cluster START_CAA=yes
- clmgr on node system-a START_CAA=yes
Starting with AIX V7.1 TL4 or AIX V7.2, you can also use the clctrl command to stop or start CAA and RSCT. To stop them, use the -stop option of the clctrl command. Remember that this also stops PowerHA. To start CAA and RSCT, use the -start option. If -start is used, only CAA and RSCT are started; to start PowerHA, you must use the clmgr command or SMIT afterward.
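The clctrl command takes the CAA cluster name and operates on either all nodes or a list of nodes. A sketch, assuming a cluster that is named c2n1_cluster (substitute your own cluster and node names):
# clctrl -stop -n c2n1_cluster -a
# clctrl -stop -n c2n1_cluster -m powerha-c2n2
# clctrl -start -n c2n1_cluster -a
The first command stops CAA and RSCT on all nodes, the second stops them on a single node only, and the third starts them again on all nodes.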