What’s new with IBM Cluster Aware AIX and Reliable Scalable Cluster Technology
This chapter provides details about what is new with IBM Cluster Aware AIX (CAA) and with IBM Reliable Scalable Cluster Technology (RSCT).
This chapter describes the following topics:
CAA
Automatic repository update for the repository disk
Reliable Scalable Cluster Technology overview
IBM PowerHA, RSCT, and CAA
4.1 CAA
This section describes in more detail some of the new CAA features.
4.1.1 CAA tunables
This section and other places in this book mention CAA tunables and how they behave. Example 4-1 shows the list of the CAA tunables with IBM AIX V7.2.0.0 and IBM PowerHA V7.2.0. Newer versions can have more tunables, different defaults, or both.
 
Attention: Do not change any of these tunables without the explicit permission of IBM.
In general, you should never modify these values yourself, because they are set and managed by PowerHA.
Example 4-1 List of CAA tunables
# clctrl -tune -a
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).communication_mode = u
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).config_timeout = 240
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).deadman_mode = a
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).link_timeout = 30000
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).local_merge_policy = m
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).network_fdt = 20000
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).no_if_traffic_monitor = 0
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).node_down_delay = 10000
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).node_timeout = 30000
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).packet_ttl = 32
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).remote_hb_factor = 1
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).repos_mode = e
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).site_merge_policy = p
#
4.1.2 Overview of what is new in CAA
The following new features are included in CAA:
Automatic Repository Update (ARU)
Also known as Automatic Repository Replacement (ARR)
See 4.2, “Automatic repository update for the repository disk” on page 77 for more details.
Monitor /var usage
New -g option for the lscluster command
Interface Failure Detection:
 – Tuning for Interface Failure Detection
 – Send multicast packet to generate incoming traffic
 – Implementation of network monitor (NETMON) within CAA
Functional Enhancements
 – Reduce dependency of CAA node name on hostname
 – Roll back on mkcluster failure or partial success
Reliability, Availability, and Serviceability (RAS) Enhancements
 – Message improvements
 – Several syslog.caa serviceability improvements
 – Enhanced Dead Man Switch (DMS) error logging
4.1.3 Monitoring /var usage
Starting with PowerHA V7.2, the /var file system is monitored by default. This monitoring is done by the clconfd subsystem. The following default values are used:
Threshold: 75% (range: 70 - 95%)
Interval: 15 minutes (range: 5 - 30 minutes)
To change the default values, use the chssys command. The -t option specifies the threshold in percent, and the -i option specifies the interval in minutes:
chssys -s clconfd -a "-t 80 -i 10"
To check which values are currently in use, you have two options: use ps -ef | grep clconfd, or use the odmget -q "subsysname='clconfd'" SRCsubsys command. Example 4-2 shows the output of both commands with the default values in effect. When the defaults are used, the cmdargs line in the odmget output is empty, and ps -ef shows no arguments after clconfd.
Example 4-2 Check clconfd (when default values are used)
# ps -ef | grep clconfd
    root 3713096 3604778 0 17:50:30 - 0:00 /usr/sbin/clconfd
#
# odmget -q "subsysname='clconfd'" SRCsubsys
SRCsubsys:
 
subsysname = "clconfd"
synonym = ""
cmdargs = ""
path = "/usr/sbin/clconfd"
uid = 0
auditid = 0
standin = "/dev/null"
standout = "/dev/null"
standerr = "/dev/null"
action = 1
multi = 0
contact = 2
svrkey = 0
svrmtype = 0
priority = 20
signorm = 2
sigforce = 9
display = 1
waittime = 20
grpname = "caa"
Example 4-3 shows what happens when you change the default values, and what the output of the odmget and ps -ef commands looks like after that change.
 
Important: You must stop and restart the subsystem for your changes to take effect.
Example 4-3 Change monitoring for /var
# chssys -s clconfd -a "-t 80 -i 10"
0513-077 Subsystem has been changed
#
# stopsrc -s clconfd
0513-044 The clconfd Subsystem was requested to stop.
#
# startsrc -s clconfd
0513-059 The clconfd Subsystem has been started. Subsystem PID is 13173096.
# ps -ef | grep clconfd
    root 13173096 3604778 0 17:50:30 - 0:00 /usr/sbin/clconfd -t 80 -i 10
#
# odmget -q "subsysname='clconfd'" SRCsubsys
 
SRCsubsys:
subsysname = "clconfd"
synonym = ""
cmdargs = "-t 80 -i 10"
path = "/usr/sbin/clconfd"
uid = 0
auditid = 0
standin = "/dev/null"
standout = "/dev/null"
standerr = "/dev/null"
action = 1
multi = 0
contact = 2
svrkey = 0
svrmtype = 0
priority = 20
signorm = 2
sigforce = 9
display = 1
waittime = 20
grpname = "caa"
If the threshold is exceeded, an entry is written to the AIX error log. Example 4-4 shows what such an error entry can look like.
Example 4-4 Error message of /var monitoring
LABEL: CL_VAR_FULL
IDENTIFIER: E5899EEB
 
Date/Time: Fri Nov 13 17:47:15 2015
Sequence Number: 1551
Machine Id: 00F747C94C00
Node Id: esp-c2n1
Class: S
Type: PERM
WPAR: Global
Resource Name: CAA (for RSCT)
 
Description
/var filesystem is running low on space
 
Probable Causes
Unknown
 
Failure Causes
Unknown
 
Recommended Actions
RSCT could malfunction if /var gets full
Increase the filesystem size or delete unwanted files
 
Detail Data
Percent full
          81
Percent threshold
          80
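To check quickly whether this monitoring has already raised an alert on a node, you can search the AIX error log for the label and identifier shown in Example 4-4. This is a minimal sketch that uses standard errpt options (-J selects entries by error label, and -a -j shows the detailed entry for an identifier):
# errpt -J CL_VAR_FULL
# errpt -a -j E5899EEB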
4.1.4 New lscluster option -g
Starting with AIX V7.1 TL4 and AIX V7.2, there is an additional option for the CAA lscluster command.
The new option -g lists the used communication paths of CAA.
 
Note: At the time this publication was written, this option was not available in AIX versions earlier than AIX V7.1.4.
The lscluster -i command lists all of the communication paths that are seen by CAA, but it does not show whether all of them can potentially be used for heartbeating. This is particularly relevant if you use a network that is set to private, or if you have removed a network from the PowerHA configuration.
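A quick way to see which interfaces CAA has discovered but does not consider for heartbeating is to compare the interface lists of the two commands. The following sketch, which assumes only standard grep and diff behavior, extracts the interface lines from both outputs and shows the differences:
# lscluster -i | grep "Interface number" > /tmp/lscluster_i.out
# lscluster -g | grep "Interface number" > /tmp/lscluster_g.out
# diff /tmp/lscluster_i.out /tmp/lscluster_g.out
The sections that follow show complete outputs for the configurations that produce such differences.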
Using all interfaces
When using the standard way to configure a cluster, all networks configured in AIX are added to the PowerHA and CAA configuration. In our test cluster, we configured two IP interfaces in AIX. Example 4-5 shows the two networks in our PowerHA configuration, both set to public.
Example 4-5 The cllsif command with all interfaces on public
> cllsif
Adapter Type Network Net Type Attribute Node IP Address Hardware Address Interface Name Global Name Netmask Alias for HB Prefix Length
 
n1adm boot adm_net ether public powerha-c2n1 10.17.1.100 en1 255.255.255.0 24
powerha-c2n1 boot service_net ether public powerha-c2n1 172.16.150.121 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n1 172.16.150.125 255.255.0.0 16
n2adm boot adm_net ether public powerha-c2n2 10.17.1.110 en1 255.255.255.0 24
powerha-c2n2 boot service_net ether public powerha-c2n2 172.16.150.122 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n2 172.16.150.125 255.255.0.0 16
#
In this case, the lscluster -i output looks like that shown in Example 4-6.
Example 4-6 The lscluster -i command (all interfaces on public)
> lscluster -i
Network/Storage Interface Query
 
Cluster Name: ha72cluster
Cluster UUID: 63d12f4e-e61b-11e5-8016-4217e0ce7b02
Number of nodes reporting = 2
Number of nodes stale = 0
Number of nodes expected = 2
 
Node powerha-c2n1.munich.de.ibm.com
Node UUID = 63b68a36-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 3
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.121 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, en1
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:05
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.17.1.100 broadcast 10.17.1.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 3, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
 
Node powerha-c2n2.munich.de.ibm.com
Node UUID = 63b68a86-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 3
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.122 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, en1
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:05
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.17.1.110 broadcast 10.17.1.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 3, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
root@powerha-c2n1:/>
Example 4-7 shows the output of the lscluster -g command. When you compare it with the lscluster -i output, you should not find any differences, because in this example all of the networks are allowed to potentially be used for heartbeating.
Example 4-7 The lscluster -g command output in relation to cllsif output
> lscluster -g
Network/Storage Interface Query
 
Cluster Name: ha72cluster
Cluster UUID: 63d12f4e-e61b-11e5-8016-4217e0ce7b02
Number of nodes reporting = 2
Number of nodes stale = 0
Number of nodes expected = 2
 
Node powerha-c2n1.munich.de.ibm.com
Node UUID = 63b68a36-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 3
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.121 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, en1
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:05
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.17.1.100 broadcast 10.17.1.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 3, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
 
Node powerha-c2n2.munich.de.ibm.com
Node UUID = 63b68a86-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 3
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.122 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, en1
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:05
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.17.1.110 broadcast 10.17.1.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 3, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
root@powerha-c2n1:/>
One network set to private
The examples in this section describe the lscluster command output when you change one or more networks to private. Example 4-8 shows the starting point. In our test environment, we changed one network to private.
 
Note: Private networks cannot be used for any services. When you want to use a service IP address, the network must be public.
Example 4-8 The cllsif command (one network set to private)
# cllsif
Adapter Type Network Net Type Attribute Node IP Address Hardware Address Interface Name Global Name Netmask Alias for HB Prefix Length
 
n1adm service adm_net ether private powerha-c2n1 10.17.1.100 en1 255.255.255.0 24
powerha-c2n1 boot service_net ether public powerha-c2n1 172.16.150.121 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n1 172.16.150.125 255.255.0.0 16
n2adm service adm_net ether private powerha-c2n2 10.17.1.110 en1 255.255.255.0 24
powerha-c2n2 boot service_net ether public powerha-c2n2 172.16.150.122 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n2 172.16.150.125 255.255.0.0 16
#
Because we did not change the architecture of our cluster, the output of the lscluster -i command is still the same, as shown in Example 4-6 on page 68.
 
Remember: You must synchronize your cluster before the change to private is visible
in CAA.
Example 4-9 shows the lscluster -g command output after the synchronization. If you now compare the output of the lscluster -g command with the lscluster -i command or with the lscluster -g output from the previous example, you see that the entries about en1 (in our example) do not appear any longer. In other words, the list of networks potentially allowed to be used for heartbeat is shorter.
Example 4-9 The lscluster -g command (one private network)
# lscluster -g
Network/Storage Interface Query
 
Cluster Name: ha72cluster
Cluster UUID: 55430510-e6a7-11e5-8035-4217e0ce7b02
Number of nodes reporting = 2
Number of nodes stale = 0
Number of nodes expected = 2
 
Node powerha-c2n1.munich.de.ibm.com
Node UUID = 55284db0-e6a7-11e5-8035-4217e0ce7b02
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.121 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
 
Node powerha-c2n2.munich.de.ibm.com
Node UUID = 55284df6-e6a7-11e5-8035-4217e0ce7b02
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.122 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
#
Remove networks from PowerHA
The examples in this section describe the lscluster command output when you remove one or more networks from the list of known networks in PowerHA. Example 4-10 shows the starting point for this example. In our test environment, we removed the adm_net network.
Example 4-10 The cllsif command (removed network)
# cllsif
Adapter Type Network Net Type Attribute Node IP Address Hardware Address Interface Name Global Name Netmask Alias for HB Prefix Length
 
powerha-c2n1 boot service_net ether public powerha-c2n1 172.16.150.121 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n1 172.16.150.125 255.255.0.0 16
powerha-c2n2 boot service_net ether public powerha-c2n2 172.16.150.122 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n2 172.16.150.125 255.255.0.0 16
#
Because we did not change the architecture of our cluster, the output of the lscluster -i command is still the same as listed in Example 4-6 on page 68.
Remember that you must synchronize your cluster before the removal of the network is visible in CAA.
Example 4-11 shows the lscluster -g output after the synchronization. If you now compare the output of the lscluster -g command with the previous lscluster -i command, or with the lscluster -g output in “Using all interfaces” on page 68, you see that the entries about en1 (in our example) do not appear.
When you compare the content of Example 4-11 with the content of Example 4-9 on page 73 in “One network set to private” on page 72, you see that the output of the lscluster -g commands is identical.
Example 4-11 The lscluster -g command output (removed network)
# lscluster -g
Network/Storage Interface Query
 
Cluster Name: ha72cluster
Cluster UUID: 63d12f4e-e61b-11e5-8016-4217e0ce7b02
Number of nodes reporting = 2
Number of nodes stale = 0
Number of nodes expected = 2
 
Node powerha-c2n1.munich.de.ibm.com
Node UUID = 63b68a36-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.121 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
 
Node powerha-c2n2.munich.de.ibm.com
Node UUID = 63b68a86-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.122 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
root@powerha-c2n1:/>
#
4.1.5 Interface failure detection
PowerHA V7.1 used a fixed network failure detection time of about 5 seconds. In PowerHA V7.2, the default is now 20 seconds, and it is controlled by the tunable named network_fdt.
 
Note: The network_fdt tunable is also available in PowerHA V7.1.3. To get it for your PowerHA V7.1.3 version, you must open a PMR and request the Tunable FDT IFix bundle.
The self-adjusting network heartbeating behavior of CAA, which was introduced with PowerHA V7.1.0, is still present and still used. It has no impact on the network failure detection time.
The network_fdt tunable can be set to zero to maintain the default behavior, or it can be set to a value in the range of 5 - 10 seconds less than the node_timeout value.
The default recognition time for a network problem is not affected by this tunable: it is 0 seconds for hard failures and 5 seconds for soft failures (since PowerHA V7.1.0). CAA continues to check the network, but it waits until the end of the defined timeout before it creates a network down event.
For PowerHA nodes, when the effective level of CAA is 4 (also known as the 2015 release), CAA automatically sets network_fdt to 20 seconds and node_timeout to 30 seconds.
 
Note: At the time that this publication was written, the only way to find out whether CAA level 4 is installed is to use the lscluster -c command. In the lscluster -c output, check whether AUTO_REPOS_REPLACE is listed in the effective cluster-wide capabilities.
For instance, use the following command:
# lscluster -c | grep "Effective cluster-wide capabilities"
Effective cluster-wide capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
#
Example 4-12 shows how to both check and change the CAA network tunable attribute using the CAA native clctrl command.
Example 4-12 Using clctrl to change CAA network tunable
# clctrl -tune -o network_fdt
HA72a_cluster(641d80c2-bd87-11e5-8005-96d75a7c7f02).network_fdt = 20000
 
# clctrl -tune -o network_fdt=10000
1 tunable updated on cluster PHA72a_cluster
 
# clctrl -tune -o network_fdt
PHA72a_cluster(641d80c2-bd87-11e5-8005-96d75a7c7f02).network_fdt = 10000
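If you experimented with the value as shown in Example 4-12, you can restore the default and confirm that the margin to node_timeout described earlier is preserved. The following lines are a sketch that reuses only the clctrl syntax shown above (all values are in milliseconds):
# clctrl -tune -o node_timeout
# clctrl -tune -o network_fdt=20000
# clctrl -tune -o network_fdt
With node_timeout at its default of 30000 ms, a network_fdt of 20000 ms keeps the 10-second margin between the two tunables.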
4.2 Automatic repository update for the repository disk
This section discusses the new PowerHA Automatic Repository Update (ARU) feature for the PowerHA repository disk.
4.2.1 Introduction to the automatic repository update
Starting with PowerHA V7.1.0, PowerHA uses a shared disk, called the PowerHA repository disk, for various purposes. The availability of this repository disk is critical to the operation of PowerHA clustering and its nodes. The initial implementation of the repository disk, in PowerHA V7.1.0, did not allow PowerHA cluster services to operate if the repository disk failed, which made it a single point of failure.
With later versions of PowerHA, features were added to make the cluster more resilient to a repository disk failure. The ability to survive a repository disk failure, and to manually replace a repository disk without an outage, increased the resiliency of PowerHA. PowerHA V7.2.0 introduces a feature that increases the resiliency further: Automatic Repository Update (ARU).
The purpose of ARU is to automate the replacement of a failed active repository disk, without intervention from a system administrator and without affecting active cluster services. All that is needed is to define the backup repository disks that PowerHA can use if the active repository disk fails.
When the active repository disk fails, PowerHA detects the failure and verifies that the disk is no longer usable. If it is unusable, PowerHA attempts to switch to the backup repository disk. If the switch is successful, the backup repository disk becomes the active repository disk.
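After the backup repository disks are defined, you can verify at any time which disk is the active repository and which disks are configured as backups. The following sketch uses only commands that appear elsewhere in this chapter (the active repository disk carries the caavg_private volume group, and the clmgr repository queries are referenced in Example 4-17):
# lspv | grep caavg_private
# clmgr query repository
# clmgr view report repository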
4.2.2 Requirements for Automatic Repository Update
The Automatic Repository Update has the following requirements:
AIX V7.1.4 or AIX V7.2.0.
PowerHA V7.2.0.
The storage used for the backup repository disk has the same requirements as the primary repository disk.
See the following website for the PowerHA repository disk requirements:
4.2.3 Configuring Automatic Repository Update
The configuration of ARU is automatic when you configure a backup repository disk for PowerHA; there is no separate ARU configuration step.
This section shows an example in a 2-site, 2-node cluster. The cluster configuration is similar to the diagram shown in Figure 4-1.
Figure 4-1 Storage example for PowerHA ARU showing linked and backup repository disks
For the purposes of this example, we configure a backup repository disk for each site of this 2-site cluster.
Configuring a backup repository disk
The following process details how to configure a backup repository disk. For our example, we perform this process for each site in our cluster:
1. Using AIX’s SMIT, run smitty sysmirror and select Cluster Nodes and Networks → Manage Repository Disks → Add a Repository Disk. Because our example is a 2-site cluster, you are prompted for a site, and then you are given a selection of possible repository disks. The screen captures in the following sections provide more details.
When you select Add a Repository Disk, you are prompted to select a site, as shown in Example 4-13.
Example 4-13 Selecting “Add a Repository Disk” in multi-site cluster
Manage Repository Disks
 
Move cursor to desired item and press Enter.
 
Add a Repository Disk
Remove a Repository Disk
Show Repository Disks
 
Verify and Synchronize Cluster Configuration
 
 
+--------------------------------------------------------------------------+
| Select a Site |
| |
| Move cursor to desired item and press Enter. |
| |
| primary_site1 |
| standby_site2 |
| |
| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
2. After selecting primary_site1, we are shown the repository disk menu (Example 4-14).
Example 4-14 Add a repository disk screen
Add a Repository Disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
 
[Entry Fields]
Site Name primary_site1
* Repository Disk [] +
 
 
 
F1=Help F2=Refresh F3=Cancel F4=List
F5=Reset F6=Command F7=Edit F8=Image
F9=Shell F10=Exit Enter=Do
3. Next, press F4 on the Repository Disk field, and you are shown the repository disk selection list, as shown in Example 4-15.
Example 4-15 Backup repository disk selection
Add a Repository Disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
Site Name primary_site1
* Repository Disk [] +
 
+--------------------------------------------------------------------------+
| Repository Disk |
| |
| Move cursor to desired item and press F7. |
| ONE OR MORE items can be selected. |
| Press Enter AFTER making all selections. |
| |
| hdisk3 (00f61ab295112078) on all nodes at site primary_site1 |
| hdisk4 (00f61ab2a61d5bc6) on all nodes at site primary_site1 |
| hdisk5 (00f61ab2a61d5c7e) on all nodes at site primary_site1 |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F7=Select F8=Image F10=Exit |
F5| Enter=Do /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
4. After selecting the appropriate disk, the choice is shown in Example 4-16.
Example 4-16 Add a repository disk preview screen
Add a Repository Disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
 
[Entry Fields]
Site Name primary_site1
* Repository Disk [(00f61ab295112078)] +
 
 
 
 
F1=Help F2=Refresh F3=Cancel F4=List
F5=Reset F6=Command F7=Edit F8=Image
F9=Shell F10=Exit Enter=Do
5. Next, after pressing the Enter key to make the changes, the confirmation screen appears, as shown in Example 4-17.
Example 4-17 Backup repository disk addition confirmation screen
COMMAND STATUS
 
Command: OK stdout: yes stderr: no
 
Before command completion, additional instructions may appear below.
 
Successfully added one or more backup repository disks.
To view the complete configuration of repository disks use:
"clmgr query repository" or "clmgr view report repository"
 
 
 
 
 
 
F1=Help F2=Refresh F3=Cancel F6=Command
F8=Image F9=Shell F10=Exit /=Find
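As an alternative to the SMIT path, the same definition can be made from the command line. The following lines are a sketch only; they assume that the clmgr repository object class that is referenced in the confirmation screen in Example 4-17 also supports an add action, and they use the disk and site names from our example:
# clmgr add repository hdisk3 SITE=primary_site1
# clmgr query repository
As with the SMIT method, the cluster configuration must be verified and synchronized afterward.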
4.2.4 Automatic Repository Update operations
PowerHA ARU operations are automatic when a backup repository disk is configured.
Successful ARU operation
As previously mentioned, ARU operations are automatic when a backup repository disk is defined. Our scenario has a 2-site cluster and a backup repository disk per site.
To induce a failure of the primary repository disk, we logged in to the Virtual I/O Server (VIOS) partitions that present storage to the cluster LPARs and deallocated the disk LUN that corresponds to the primary repository disk on one site of our cluster. This disables the primary repository disk; PowerHA ARU detects the failure and automatically activates the backup repository disk as the new active repository disk.
This section presents the following examples that were captured during this process:
1. Before disabling the primary repository disk, we look at the lspv command output and note that the active repository disk is hdisk1, as shown in Example 4-18.
Example 4-18 Output of the lspv command in an example cluster
hdisk0 00f6f5d09570f647 rootvg active
hdisk1 00f6f5d0ba49cdcc caavg_private active
hdisk2 00f6f5d0a621e9ff None
hdisk3 00f61ab2a61d5c7e None
hdisk4 00f61ab2a61d5d81 testvg01 concurrent
hdisk5 00f61ab2a61d5e5b testvg01 concurrent
hdisk6 00f61ab2a61d5f32 testvg01 concurrent
2. We then proceed to log in to the VIOS servers that present the repository disk to this logical partition (LPAR) and de-allocate that logical unit (LUN) so that the cluster LPAR no longer has access to that disk. This causes the primary repository disk to fail.
3. At this point, PowerHA ARU detects the failure and activates the backup repository disk as the active repository disk. You can verify this behavior in the syslog.caa log file. This log file logs the ARU activities and shows the detection of the primary repository disk failure, and the activation of the backup repository disk. See Example 4-19.
Example 4-19 The /var/adm/ras/syslog.caa file showing repository disk failure and recovery
Nov 12 09:13:29 primo_s2_n1 caa:info cluster[14025022]: caa_config.c run_list 1377 1 = = END REPLACE_REPOS Op = = POST Stage = =
Nov 12 09:13:30 primo_s2_n1 caa:err|error cluster[14025022]: cluster_utils.c cluster_repository_read 5792 1 Could not open cluster repository device /dev/rhdisk1: 5
Nov 12 09:13:30 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_kern_repos_check 11769 1 Could not read the respository.
Nov 12 09:13:30 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11862 1 START '/usr/sbin/importvg -y caavg_private_t -O hdisk1'
Nov 12 09:13:32 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11893 1 FINISH return = 1
Nov 12 09:13:32 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11862 1 START '/usr/sbin/reducevg -df caavg_private_t hdisk1'
Nov 12 09:13:32 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11893 1 FINISH return = 1
Nov 12 09:13:33 primo_s2_n1 caa:err|error cluster[14025022]: cluster_utils.c cluster_repository_read 5792 1 Could not open cluster repository device /dev/rhdisk1: 5
Nov 12 09:13:33 primo_s2_n1 caa:info cluster[14025022]: cl_chrepos.c destroy_old_repository 344 1 Failed to read repository data.
Nov 12 09:13:34 primo_s2_n1 caa:err|error cluster[14025022]: cluster_utils.c cluster_repository_write 5024 1 return = -1, Could not open cluster repository device /dev/rhdisk1: I/O error
Nov 12 09:13:34 primo_s2_n1 caa:info cluster[14025022]: cl_chrepos.c destroy_old_repository 350 1 Failed to write repository data.
Nov 12 09:13:34 primo_s2_n1 caa:warn|warning cluster[14025022]: cl_chrepos.c destroy_old_repository 358 1 Unable to destroy repository disk hdisk1. Manual intervention is required to clear the disk of cluster identifiers.
Nov 12 09:13:34 primo_s2_n1 caa:info cluster[14025022]: cl_chrepos.c automatic_repository_update 2242 1 Replaced hdisk1 with hdisk2
Nov 12 09:13:34 primo_s2_n1 caa:info cluster[14025022]: cl_chrepos.c automatic_repository_update 2255 1 FINISH rc = 0
Nov 12 09:13:34 primo_s2_n1 caa:info cluster[14025022]: caa_protocols.c recv_protocol_slave 1542 1 Returning from Automatic Repository replacement rc = 0
4. As an extra verification, note that the AIX error log has an entry showing that a successful repository disk replacement has occurred, as shown in Example 4-20.
Example 4-20 AIX error log showing successful repository disk replacement message
LABEL: CL_ARU_PASSED
IDENTIFIER: 92EE81A5
 
Date/Time: Thu Nov 12 09:13:34 2015
Sequence Number: 1344
Machine Id: 00F6F5D04C00
Node Id: primo_s2_n1
Class: H
Type: INFO
WPAR: Global
Resource Name: CAA ARU
Resource Class: NONE
Resource Type: NONE
Location:
 
Description
Automatic Repository Update succeeded.
 
Probable Causes
Primary repository disk was replaced.
 
Failure Causes
A hardware problem prevented local node from accessing primary repository disk.
 
Recommended Actions
Primary repository disk was replaced using backup repository disk.
 
Detail Data
Primary Disk Info
hdisk1 6c1b76e1-3e0a-ff3c-3c43-cb6c3881c3bf
Replacement Disk Info
hdisk2 5890b139-e987-1451-211e-24ba89e7d1df
At this point, it is safe to remove the failed repository disk and replace it. The replacement disk can become the new backup repository disk by following the steps in “Configuring a backup repository disk” on page 79.
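Before you remove the failed disk, you can confirm on the node that the swap occurred. The following sketch reuses commands that are already shown in this section (the lspv output format from Example 4-18 and the error label from Example 4-20):
# lspv | grep caavg_private
# errpt -J CL_ARU_PASSED
After the swap, the former backup disk (hdisk2 in our example) carries the caavg_private volume group, and the CL_ARU_PASSED entry confirms the successful replacement.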
Possible ARU failure situations
Note that some activities can affect the operation of ARU. Specifically, any administrative activity that uses the backup repository disk can affect ARU. If a volume group was previously created on a backup repository disk and the disk was not cleaned up afterward, ARU cannot operate properly. A quick check for this condition is sketched after Example 4-24.
In our sample scenario, we completed the following steps:
1. Configure a backup repository disk that previously contained an AIX volume group (VG).
2. Export the AIX VG so that the lspv command no longer displays a volume group for the disk. However, we did not delete the volume group information from the disk, so the disk itself still contained it.
3. For our example, we ran the AIX command, lspv. Our backup repository disk is hdisk2. The disk shows a PVID but no volume group, as shown in Example 4-21.
Example 4-21 Output of lspv command in an example cluster showing hdisk2
hdisk0 00f6f5d09570f647 rootvg active
hdisk1 00f6f5d0ba49cdcc caavg_private active
hdisk2 00f6f5d0a621e9ff None
hdisk3 00f61ab2a61d5c7e None
hdisk4 00f61ab2a61d5d81 testvg01 concurrent
hdisk5 00f61ab2a61d5e5b testvg01 concurrent
hdisk6 00f61ab2a61d5f32 testvg01 concurrent
4. At this point, we disconnected the primary repository disk from the LPAR by going to the VIOS and de-allocating the disk LUN from the cluster LPAR. This made the primary repository disk fail immediately.
ARU then attempted to perform the following actions:
a. Verify that the primary repository disk is no longer accessible.
b. Switch to the backup repository disk (this action failed).
5. ARU left an error message in the AIX error report, as shown in Example 4-22.
Example 4-22 Output of AIX errpt command showing failed repository disk replacement
LABEL: CL_ARU_FAILED
IDENTIFIER: F63D60A2
 
Date/Time: Wed Nov 11 17:15:17 2015
Sequence Number: 1263
Machine Id: 00F6F5D04C00
Node Id: primo_s2_n1
Class: H
Type: INFO
WPAR: Global
Resource Name: CAA ARU
Resource Class: NONE
Resource Type: NONE
Location:
 
Description
Automatic Repository Update failed.
 
Probable Causes
Unknown.
 
Failure Causes
Unknown.
 
Recommended Actions
Try manual replacement of cluster repository disk.
 
Detail Data
Primary Disk Info
hdisk1 6c1b76e1-3e0a-ff3c-3c43-cb6c3881c3bf
6. In addition, the CAA log file /var/adm/ras/syslog.caa shows that ARU verified the primary repository disk and could not read it, as shown in Example 4-23.
Example 4-23 Selected messages from /var/adm/ras/syslog.caa log file
Nov 12 09:13:20 primo_s2_n1 caa:info unix: *base_kernext_services.c aha_thread_queue 614 The AHAFS event is EVENT_TYPE=REP_DOWN DISK_NAME=hdisk1 NODE_NUMBER=2 NODE_ID=0xD9DDB48A889411E580106E8DDB7B3702 SITE_NUMBER=2 SITE_ID=0xD9DE2028889411E580106E8DDB7B3702 CLUSTER_ID=0xD34E8658889411E580026E8DDB
Nov 12 09:13:20 primo_s2_n1 caa:info unix: caa_sock.c caa_kclient_tcp 231 entering caa_kclient_tcp ....
Nov 12 09:13:20 primo_s2_n1 caa:info unix: *base_kernext_services.c aha_thread_queue 614 The AHAFS event is EVENT_TYPE=VG_DOWN DISK_NAME=hdisk1 VG_NAME=caavg_private NODE_NUMBER=2 NODE_ID=0xD9DDB48A889411E580106E8DDB7B3702 SITE_NUMBER=2 SITE_ID=0xD9DE2028889411E580106E8DDB7B3702 CLUSTER_ID=0xD34E8
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11862 1 START '/usr/lib/cluster/caa_syslog '
Nov 12 09:13:20 primo_s2_n1 caa:info unix: kcluster_event.c find_event_disk 742 Find disk called for hdisk4
Nov 12 09:13:20 primo_s2_n1 caa:info unix: kcluster_event.c ahafs_Disk_State_register 1504 diskState set opqId = 0xF1000A0150301A00
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11893 1 FINISH return = 0
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: caa_message.c inherit_socket_inetd 930 1 IPv6=::ffff:127.0.0.1
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_kern_repos_check 11769 1 Could not read the respository.
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: caa_message.c cl_recv_req 172 1 recv successful, sock = 0, recv rc = 32, msgbytes = 32
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: caa_protocols.c recv_protocol_slave 1518 1 Automatic Repository Replacement request being processed.
7. ARU then attempted to activate the backup repository disk, but it failed because an AIX VG already existed on this disk, as shown in Example 4-24.
Example 4-24 Messages from the /var/adm/ras/syslog.caa log file showing ARU failure
Nov 12 09:11:26 primo_s2_n1 caa:info unix: kcluster_lock.c xcluster_lock 659 xcluster_lock: nodes which responded: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: cluster_utils.c cl_run_log_method 11862 1 START '/usr/sbin/mkvg -y caavg_private_t hdisk2'
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: cluster_utils.c cl_run_log_method 11893 1 FINISH return = 1
Nov 12 09:11:26 primo_s2_n1 caa:err|error cluster[8716742]: cl_chrepos.c check_disk_add 2127 1 hdisk2 contains an existing vg.
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: cl_chrepos.c automatic_repository_update 2235 1 Failure to move to hdisk2
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: cl_chrepos.c automatic_repository_update 2255 1 FINISH rc = -1
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: caa_protocols.c recv_protocol_slave 1542 1 Returning from Automatic Repository replacement rc = -1
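A candidate backup repository disk can be checked for such a leftover volume group before you rely on ARU. The following sketch assumes the low-level lqueryvg command that is shipped with the AIX LVM; if it reports volume group data for a disk that lspv shows as None, the disk must be cleaned before it can serve as a backup repository:
# lspv | grep hdisk2
# lqueryvg -Atp hdisk2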
Recovering from a failed ARU event
The previous section, “Possible ARU failure situations” on page 83, gave an example of what can prevent a successful repository disk replacement by ARU. To recover from that failed event, we manually switched the repository disks by using the PowerHA SMIT panels.
Complete the following steps:
1. Using AIX’s SMIT, run smitty sysmirror, and select Problem Determination Tools → Replace the Primary Repository Disk. Because our sample cluster has multiple sites, a menu is shown to select a site, as shown in Example 4-25.
Example 4-25 Site selection prompt after selecting “Replace the Primary Repository Disk”
Problem Determination Tools
 
Move cursor to desired item and press Enter.
 
[MORE...1]
View Current State
PowerHA SystemMirror Log Viewing and Management
Recover From PowerHA SystemMirror Script Failure
Recover Resource Group From SCSI Persistent Reserve Error
Restore PowerHA SystemMirror Configuration Database from Active Configuration
Release Locks Set By Dynamic Reconfiguration
Cluster Test Tool
+--------------------------------------------------------------------------+
| Select a Site |
| |
| Move cursor to desired item and press Enter. |
| |
| primary_site1 |
| standby_site2 |
| |
[M| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
2. In our example, we select standby_site2 and a screen is shown with an option to select the replacement repository disk, as shown in Example 4-26.
Example 4-26 Prompt to select a new repository disk
Select a new repository disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
Site Name standby_site2
* Repository Disk [] +
 
 
F1=Help F2=Refresh F3=Cancel F4=List
F5=Reset F6=Command F7=Edit F8=Image
F9=Shell F10=Exit Enter=Do
3. Pressing the F4 key displays the available backup repository disks, as shown in Example 4-27.
Example 4-27 SMIT menu prompting for replacement repository disk
Select a new repository disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
 
[Entry Fields]
Site Name standby_site2
* Repository Disk [] +
 
 
 
+--------------------------------------------------------------------------+
| Repository Disk |
| |
| Move cursor to desired item and press Enter. |
| |
| 00f6f5d0ba49cdcc |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F8=Image F10=Exit Enter=Do |
F5| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
4. Selecting the backup repository disk leads to the SMIT panel showing the selected disk, as shown in Example 4-28.
Example 4-28 SMIT panel showing the selected repository disk
Select a new repository disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
 
[Entry Fields]
Site Name standby_site2
* Repository Disk [00f6f5d0ba49cdcc] +
 
 
 
 
 
F1=Help F2=Refresh F3=Cancel F4=List
F5=Reset F6=Command F7=Edit F8=Image
F9=Shell F10=Exit Enter=Do
5. Finally, pressing the Enter key runs the repository disk replacement. After the repository disk is replaced, the screen shown in Example 4-29 is displayed.
Example 4-29 SMIT panel showing successful repository disk replacement
COMMAND STATUS
 
Command: OK stdout: yes stderr: no
 
Before command completion, additional instructions may appear below.
 
chrepos: Successfully modified repository disk or disks.
 
New repository "hdisk1" (00f6f5d0ba49cdcc) is now active.
The configuration must be synchronized to make this change known across the cluster.
 
 
 
 
 
F1=Help F2=Refresh F3=Cancel F6=Command
F8=Image F9=Shell F10=Exit /=Find
n=Find Next
At this point, it is safe to remove the failed repository disk and replace it. The replacement disk can become the new backup repository disk by following the steps described in “Configuring a backup repository disk” on page 79.
4.3 Reliable Scalable Cluster Technology overview
This section provides an overview of Reliable Scalable Cluster Technology (RSCT), its components, and the communication paths between these components. This section also describes which parts of RSCT are used by PowerHA. The items described here are not new, but they are needed for a basic understanding of the underlying PowerHA infrastructure.
4.3.1 What Reliable Scalable Cluster Technology is
Reliable Scalable Cluster Technology (RSCT) is a set of software components that together provide a comprehensive clustering environment for AIX, Linux, Solaris, and Microsoft Windows operating systems. RSCT is the infrastructure used by various IBM products to provide clusters with improved system availability, scalability, and ease of use.
4.3.2 Reliable Scalable Cluster Technology components
This section describes the RSCT components and how they communicate with each other.
Reliable Scalable Cluster Technology components overview
For a more detailed description of the RSCT components, see the IBM RSCT for AIX: Guide and Reference, SA22-7889 on the following website:
The main RSCT components are explained in this section:
Resource Monitoring and Control (RMC) subsystem
This is the scalable and reliable backbone of RSCT. RMC runs on a single machine or on each node (operating system image) of a cluster, and provides a common abstraction for the resources of the individual system or the cluster of nodes. You can use RMC for single-system monitoring or for monitoring nodes in a cluster. In a cluster, RMC provides global access to subsystems and resources throughout the cluster, thus providing a single monitoring and management infrastructure.
RSCT core resource managers
A resource manager is a software layer between a resource (a hardware or software entity that provides services to some other component) and RMC. A resource manager maps programmatic abstractions in RMC into the actual calls and commands of a resource.
RSCT cluster security services
This RSCT component provides the security infrastructure that enables RSCT components to authenticate the identity of other parties.
Group Services subsystem
This RSCT component provides cross-node/process coordination on some cluster configurations.
Topology Services subsystem
This RSCT component provides node and network failure detection on some cluster configurations.
Communication between RSCT components
The RMC subsystem and the RSCT core resource managers (RMs) are currently the only components that use the RSCT cluster security services. Since the availability of PowerHA V7, RSCT Group Services can use either Topology Services or CAA. Figure 4-2 on page 90 shows the RSCT components and their relationships.
The RMC application programming interface (API) is the only interface that applications can use to exchange data with the RSCT components. RMC manages the RMs and receives data from them. Group Services is a client of RMC. If PowerHA V7 is installed, Group Services connects to CAA; otherwise, it connects to RSCT Topology Services.
Figure 4-2 shows RSCT component relationships.
Figure 4-2 RSCT components
RSCT domains
An RSCT management domain is a set of nodes with resources that can be managed and monitored from one of the nodes, which is designated as the management control point (MCP). All other nodes are considered managed nodes. Topology Services and Group Services are not used in a management domain. Figure 4-3 shows the high-level architecture of an RSCT management domain.
Figure 4-3 RSCT managed domain (architecture)
An RSCT peer domain is a set of nodes that have a consistent knowledge of the existence of each other and of the resources shared among them. On each node within the peer domain, RMC depends on a core set of cluster services, which include Topology Services, Group Services, and cluster security services. Figure 4-4 shows the high-level architecture of an RSCT peer domain.
Figure 4-4 RSCT peer domain (architecture)
Group Services are used in peer domains. If PowerHA V7 is installed, Topology Services are not used, and CAA is used instead. Otherwise, Topology Services are used too.
Combination of management and peer domains
You can have a combination of both types of domains (management domain and peer domains).
Figure 4-5 on page 92 shows the high-level architecture for how an RSCT managed domain and RSCT peer domains can be combined. In this example, Node Y is an RSCT management server. You have three nodes as managed nodes (Node A, Node B, and Node C). Node B and Node C are part of an RSCT peer domain.
You can have multiple peer domains within a managed domain. A node can be part of a managed domain and a peer domain. A given node can only belong to a single peer domain, as shown in Figure 4-5.
Figure 4-5 Management and peer domain (architecture)
 
Important: A node can only belong to one RSCT peer domain.
Example of a management and a peer domain
The example here is greatly simplified. It shows one Hardware Management Console (HMC) that manages three LPARs, two of which are used for a 2-node PowerHA cluster.
In a Power Systems environment, the HMC is always the management server in the RSCT management domain. From an RSCT point of view, the LPARs are clients of this server. For instance, this management domain is used to perform dynamic LPAR (DLPAR) operations on the different LPARs.
Figure 4-6 shows this simplified setup.
Figure 4-6 Example management and peer domain
RSCT peer domain on Cluster Aware AIX (CAA)
When RSCT operates on nodes in a CAA cluster, a peer domain is created that is equivalent to the CAA cluster. This RSCT peer domain presents largely the same set of functions to users and software as peer domains that are not based on CAA. A peer domain that operates without CAA autonomously manages and monitors the configuration and liveness of the nodes and interfaces that it comprises.
The peer domain that represents a CAA cluster acquires its configuration information and liveness results from CAA. This introduces some differences in the mechanics of peer domain operations, but very few in the view of the peer domain that is available to users.
Only one CAA cluster can be defined on a set of nodes. Therefore, if a CAA cluster is defined, the peer domain that represents it is the only peer domain that can exist, and it exists and is online for the life of the CAA cluster.
Figure 4-7 illustrates the relationship discussed in this section.
Figure 4-7 RSCT peer domain and CAA
When your cluster is configured and synchronized, you can check the RSCT peer domain by using the lsrpdomain command. To list the nodes in this peer domain, use the lsrpnode command. Example 4-30 shows sample output of these commands.
The RSCTActiveVersion number in the lsrpdomain output can show a back-level version number. This is the lowest RSCT version that is required by a new joining node. In a PowerHA environment, there is no need to modify this value.
A value of yes for MixedVersions means that at least one node has a higher version than the displayed RSCT version. The lsrpnode command lists the RSCT version that is actually used on each node.
Example 4-30 List RSCT peer domain information
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
c2n1_cluster Online 3.1.5.0 Yes 12347 12348
# lsrpnode
Name OpState RSCTVersion
c2n2.munich.de.ibm.com Online 3.2.1.0
c2n1.munich.de.ibm.com Online 3.2.1.0
#
Update the RSCT peer domain version
If you like, you can upgrade the RSCT version of the RSCT peer domain that is reported by the lsrpdomain command. To do this, use the command listed in Example 4-31.
To be clear, such an update does not give you any advantage in a PowerHA environment. In fact, if you delete the cluster and then re-create it, either manually or by using an existing snapshot, the RSCT peer domain version reverts to the original version, which was 3.1.5.0 in our example.
Example 4-31 Update RSCT peer domain
# export CT_MANAGEMENT_SCOPE=2; runact -c IBM.PeerDomain CompleteMigration Options=0
#
Check for CAA
To do a quick check on the CAA cluster, you can use, for instance, the lscluster -c command or the lscluster -m command. Example 4-32 shows example output of these two commands. In most situations, if the lscluster command returns output, CAA is up and running. To be sure, use the lscluster -m command, which shows the state of each node.
Example 4-32 shows that, in our case, CAA is up and running on the local node (powerha-c2n2), where we ran the lscluster command, but CAA is stopped on the remote node (powerha-c2n1).
To stop CAA on that node, we used the clmgr off node powerha-c2n1 STOP_CAA=yes command.
Example 4-32 The lscluster -c and lscluster -m commands
# lscluster -c
Cluster Name: c2n1_cluster
Cluster UUID: d19995ae-8246-11e5-806f-fa37c4c10c20
Number of nodes in cluster = 2
Cluster ID for node c2n1.munich.de.ibm.com: 1
Primary IP address for node c2n1.munich.de.ibm.com: 172.16.150.121
Cluster ID for node c2n2.munich.de.ibm.com: 2
Primary IP address for node c2n2.munich.de.ibm.com: 172.16.150.122
Number of disks in cluster = 1
Disk = caa_r0 UUID = 12d1d9a1-916a-ceb2-235d-8c2277f53d06 cluster_major = 0 cluster_minor = 1
Multicast for site LOCAL: IPv4 228.16.150.121 IPv6 ff05::e410:9679
Communication Mode: unicast
Local node maximum capabilities: AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
Effective cluster-wide capabilities: AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
#
# lscluster -m | egrep "Node name|State of node"
Node name: powerha-c2n1.munich.de.ibm.com
State of node: DOWN
Node name: powerha-c2n2.munich.de.ibm.com
State of node: UP  NODE_LOCAL
#
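To bring CAA back online on the stopped node, the reverse operation can be used. The following lines are a sketch only; they assume that START_CAA is the counterpart of the STOP_CAA option used above:
# clmgr on node powerha-c2n1 START_CAA=yes
# lscluster -m | egrep "Node name|State of node"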
Peer domain on CAA linked clusters
Starting with PowerHA V7.1.2, linked clusters can be used. An RSCT peer domain that operates on linked clusters encompasses all nodes at each site. The nodes that comprise each site cluster are all members of the same peer domain.
Figure 4-8 shows how this looks from an architecture point of view.
Figure 4-8 RSCT peer domain and CAA linked cluster
Example 4-33 shows what the RSCT peer domain looks like in our 2-node linked cluster.
Example 4-33 Output of the lsrpdomain command
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
primo_s1_n1_cluster Online 3.1.5.0 Yes 12347 12348
# lsrpnode
Name OpState RSCTVersion
primo_s2_n1 Online 3.2.1.0
primo_s1_n1 Online 3.2.1.0
#
Because we defined each of our nodes to a different site, the lscluster -c command lists only one node per site. Example 4-34 shows an example output from node 1.
Example 4-34 Output of the lscluster command (node 1)
# lscluster -c
Cluster Name: primo_s1_n1_cluster
Cluster UUID: d34e8658-8894-11e5-8002-6e8ddb7b3702
Number of nodes in cluster = 2
Cluster ID for node primo_s1_n1: 1
Primary IP address for node primo_s1_n1: 192.168.100.20
Cluster ID for node primo_s2_n1: 2
Primary IP address for node primo_s2_n1: 192.168.100.21
Number of disks in cluster = 4
Disk = hdisk2 UUID = 2f1b2492-46ca-eb3b-faf9-87fa7d8274f7 cluster_major = 0 cluster_minor = 1
Disk = UUID = 6c1b76e1-3e0a-ff3c-3c43-cb6c3881c3bf cluster_major = 0 cluster_minor = 2
Disk = hdisk3 UUID = 20d93b0c-97e8-85ee-8b71-b880ccf848b7 cluster_major = 0 cluster_minor = 3
Disk = UUID = 5890b139-e987-1451-211e-24ba89e7d1df cluster_major = 0 cluster_minor = 4
Multicast for site primary_site1: IPv4 228.168.100.20 IPv6 ff05::e4a8:6414
Multicast for site standby_site2: IPv4 228.168.100.21 IPv6 ff05::e4a8:6415
Communication Mode: unicast
Local node maximum capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
Effective cluster-wide capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
#
Example 4-35 shows the output from node 2.
Example 4-35 Output of the lscluster command (node 2)
# lscluster -c
Cluster Name: primo_s1_n1_cluster
Cluster UUID: d34e8658-8894-11e5-8002-6e8ddb7b3702
Number of nodes in cluster = 2
Cluster ID for node primo_s1_n1: 1
Primary IP address for node primo_s1_n1: 192.168.100.20
Cluster ID for node primo_s2_n1: 2
Primary IP address for node primo_s2_n1: 192.168.100.21
Number of disks in cluster = 4
Disk = UUID = 2f1b2492-46ca-eb3b-faf9-87fa7d8274f7 cluster_major = 0 cluster_minor = 1
Disk = UUID = 20d93b0c-97e8-85ee-8b71-b880ccf848b7 cluster_major = 0 cluster_minor = 3
Disk = hdisk2 UUID = 5890b139-e987-1451-211e-24ba89e7d1df cluster_major = 0 cluster_minor = 4
Disk = hdisk1 UUID = 6c1b76e1-3e0a-ff3c-3c43-cb6c3881c3bf cluster_major = 0 cluster_minor = 2
Multicast for site standby_site2: IPv4 228.168.100.21 IPv6 ff05::e4a8:6415
Multicast for site primary_site1: IPv4 228.168.100.20 IPv6 ff05::e4a8:6414
Communication Mode: unicast
Local node maximum capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
Effective cluster-wide capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
#
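The site names that are shown in the lscluster output correspond to the PowerHA site definitions. If you want to cross-check them from the PowerHA side, you can query the sites with the clmgr command. A sketch, assuming the site names from Example 4-34 (the exact output format depends on your PowerHA level):
# clmgr query site
primary_site1
standby_site2
#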
4.4 IBM PowerHA, RSCT, and CAA
Starting with PowerHA V7.1, the CAA component is used instead of RSCT Topology Services in a PowerHA V7 setup. Figure 4-9 shows the connections between PowerHA V7, RSCT, and CAA. Mainly, the connection from PowerHA to RSCT Group Services, and from there to CAA and back, is used. The potential communication path to RMC is rarely used.
Figure 4-9 PowerHA, RSCT, CAA overview
4.4.1 Configuring PowerHA, RSCT, and CAA
There is no need to configure RSCT or CAA separately. You need only to configure or migrate PowerHA, as shown in Figure 4-10 on page 99. To set it up, use the smitty sysmirror panels or the clmgr command. The different migration processes operate in a similar way.
Figure 4-10 Set up PowerHA, RSCT, and CAA
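For example, a new 2-node cluster can be defined and deployed entirely with the clmgr command; the CAA cluster and the RSCT peer domain are created for you when the configuration is synchronized. A minimal sketch (the cluster name, node names, and repository disk are placeholders for your environment):
# clmgr add cluster my_cluster NODES=node1,node2 REPOSITORY=hdisk2
# clmgr sync cluster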
4.4.2 Relationship between PowerHA, RSCT, CAA
This section describes, from a high-level point of view, the relationship between PowerHA, RSCT, and CAA. The intention of this section is to give you a general understanding of what is running in the background. The examples that are used in this section are based on a 2-node cluster.
In normal situations, there is no need to use CAA or RSCT commands directly, because these components are managed by PowerHA.
All PowerHA components are up
In a cluster where the state of PowerHA is up on all nodes, you also have all of the RSCT and CAA services up and running, as shown in Figure 4-11.
Figure 4-11 All cluster services are up
To check if the services are up, you can use different commands. In the following examples, we use the clmgr, clRGinfo, lsrpdomain, and lscluster commands. Example 4-36 shows the output of the clmgr and clRGinfo PowerHA commands.
Example 4-36 Check PowerHA when all is up
# clmgr -a state query cluster
STATE="STABLE“
# clRGinfo
-----------------------------------------------------------------------------
Group Name                   Group State      Node
-----------------------------------------------------------------------------
Test_RG                      ONLINE           CL1_N1
                             OFFLINE          CL1_N2
#
To check if RSCT is up and running, use the lsrpdomain command. Example 4-37 shows the output of the command.
Example 4-37 Check for RSCT when all components are running
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Online 3.1.5.0 Yes 12347 12348
#
To check whether CAA is running correctly, we use the lscluster command. The lscluster command requires an option; we used the -m option, as shown in Example 4-38. In most cases, any other valid option can be used as well, but to be certain about the state of each node, use the -m option.
In general, if the command returns valid output, CAA is running. Otherwise, you get an error message that states that the cluster services are not active.
Example 4-38 Check for CAA when all is up
# lscluster -m | egrep "Node name|State of node"
Node name: powerha-c2n1
State of node: UP
Node name: powerha-c2n2
State of node: UP  NODE_LOCAL
#
One node stopped with Unmanage
In a cluster where one node is stopped with the Unmanage option, all of the underlying components (RSCT and CAA) stay up and running. Figure 4-12 illustrates what happens when LPAR A is stopped with the Unmanage option (a command sketch follows the figure).
Figure 4-12 One node where all RGs are unmanaged
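To put a node into this state, stop the cluster services on that node without bringing its resource groups offline. A sketch using the clmgr command (the node name is a placeholder; this corresponds to selecting Unmanage Resource Groups in the SMIT stop panel):
# clmgr stop node CL1_N1 MANAGE=unmanage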
The following examples use the same commands as in “All PowerHA components are up” on page 99 to check the status of the different components. Example 4-39 shows the output of the clmgr and clRGinfo PowerHA commands.
Example 4-39 Check PowerHA, one node in state unmanaged
# clmgr -a state query cluster
STATE="WARNING“
# clRGinfo
-----------------------------------------------------------------------------
Group Name                   Group State      Node
-----------------------------------------------------------------------------
Test_RG                      UNMANAGED        CL1_N1
                             UNMANAGED        CL1_N2
#
As expected, the output of the lsrpdomain RSCT command shows that RSCT is still online (see Example 4-40).
Example 4-40 Check RSCT, one node in state unmanaged
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Online 3.1.5.0 Yes 12347 12348
#
Also as expected, checking for CAA shows that it is up and running, as shown in Example 4-41.
Example 4-41 Check CAA, one node in state unmanaged
# lscluster -m | egrep "Node name|State of node"
Node name: powerha-c2n1
State of node: UP
Node name: powerha-c2n2
State of node: UP  NODE_LOCAL
#
PowerHA stopped on all nodes
When you stop PowerHA on all cluster nodes, you get the situation that is illustrated in Figure 4-13 (a command sketch follows the figure): PowerHA is stopped on all cluster nodes, but RSCT and CAA are still up and running. You have the same situation after a system reboot of all of your cluster nodes (assuming that you do not use the automatic startup of PowerHA).
Figure 4-13 PowerHA stopped on all cluster nodes
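To reach this state, stop PowerHA without the STOP_CAA argument so that RSCT and CAA remain active. A sketch using the clmgr command, which brings all resource groups offline on all nodes:
# clmgr stop cluster WHEN=now MANAGE=offline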
Again, we use the same commands as in “All PowerHA components are up” on page 99 to check the status of the different components. Example 4-42 shows the output of the PowerHA commands clmgr and clRGinfo.
As expected, the clmgr command shows that PowerHA is offline, and clRGinfo returns an error message.
Example 4-42 Check PowerHA, PowerHA stopped on all cluster nodes
# clmgr -a state query cluster
STATE="OFFLINE“
# clRGinfo
Cluster IPC error: The cluster manager on node CL1_N1 is in ST_INIT or NOT_CONFIGURED state and cannot process the IPC request.
#
As mentioned previously, the output of the RSCT lsrpdomain command shows that RSCT is still online (Example 4-43).
Example 4-43 Check RSCT, PowerHA stopped on all cluster nodes
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Online 3.1.5.0 Yes 12347 12348
#
And as expected, the check for CAA shows that it is running, as shown in Example 4-44.
When RSCT is running, CAA must be up as well. Keep in mind that this statement is true only for a PowerHA cluster.
Example 4-44 Check CAA, PowerHA stopped on all cluster nodes
# lscluster -m | egrep "Node name|State of node"
Node name: powerha-c2n1
State of node: UP
Node name: powerha-c2n2
State of node: UP  NODE_LOCAL
#
All cluster components are stopped
Remember that, by default, CAA and RSCT are automatically started as part of an operating system restart (if they are configured by PowerHA).
There are situations when you need to stop all three cluster components, for instance when you need to change the RSCT or CAA code, as shown in Figure 4-14.
For example, to stop all cluster components, use clmgr off cluster STOP_CAA=yes. For more details about starting and stopping CAA, see 4.4.3, “How to start and stop CAA and RSCT” on page 104.
Figure 4-14 All cluster services stopped
Example 4-45 shows the status of the cluster with all services stopped. As in the previous examples, we used the clmgr and clRGinfo commands.
Example 4-45 Check PowerHA, all cluster services stopped
# clmgr -a state query cluster
STATE="OFFLINE“
root@CL1_N1:/home/root# clRGinfo
Cluster IPC error: The cluster manager on node CL1_N1 is in ST_INIT or NOT_CONFIGURED state and cannot process the IPC request.
#
The lsrpdomain command shows that the RSCT cluster is offline, as shown in Example 4-46.
Example 4-46 Check RSCT, all cluster services stopped
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Offline 3.1.5.0 Yes 12347 12348
#
As mentioned previously, when CAA is not active, the lscluster command returns an error message, as shown in Example 4-47.
Example 4-47 Check CAA, all cluster services stopped
# lscluster -m
lscluster: Cluster services are not active on this node because it has been stopped.
#
4.4.3 How to start and stop CAA and RSCT
CAA and RSCT are stopped and started together. As mentioned in the previous section, CAA and RSCT are automatically started as part of an operating system boot (if they are configured by PowerHA).
If you want to stop CAA and RSCT, you must use the clmgr command with the STOP_CAA=yes argument (at the time of writing, SMIT does not support this operation). The argument stops both CAA and RSCT, and it can be applied to the complete cluster or to a set of nodes.
Remember that the fact that you stopped CAA manually is preserved across an operating system reboot. Therefore, if you want to start PowerHA on a node where CAA and RSCT were stopped deliberately, you must use the START_CAA=yes argument.
To start CAA and RSCT, you can use the clmgr command with the argument START_CAA=yes. Remember that this command also starts PowerHA.
Example 4-48 shows how to stop or start CAA and RSCT. Remember that all of these examples stop all three components or start all three components.
Example 4-48 Using clmgr to start and stop CAA, RSCT
To Stop CAA and RSCT:
- clmgr off cluster STOP_CAA=yes
- clmgr off node system-a STOP_CAA=yes
 
To Start CAA and RSCT:
- clmgr on cluster START_CAA=yes
- clmgr on node system-a START_CAA=yes
Starting with AIX V7.1 TL4 or AIX V7.2, you can also use the clctrl command to stop or start CAA and RSCT. To stop them, use the -stop option of the clctrl command. Remember that this also stops PowerHA. To start CAA and RSCT, use the -start option. If -start is used, only CAA and RSCT are started; to start PowerHA, you must use the clmgr command or SMIT afterward.
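The clctrl command takes the CAA cluster name and operates on either all nodes or a list of nodes. A sketch, assuming a cluster that is named c2n1_cluster (substitute your own cluster and node names):
# clctrl -stop -n c2n1_cluster -a
# clctrl -stop -n c2n1_cluster -m powerha-c2n2
# clctrl -start -n c2n1_cluster -a
The first command stops CAA and RSCT on all nodes, the second stops them on a single node only, and the third starts them again on all nodes.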