Chapter 8. Good practices

This chapter provides useful information, suggestions, and good practices for managing the Hardware Management Console (HMC). It includes suggestions on planning for an HMC, initial configurations, security, problem determination, and code installation and maintenance.

This chapter describes the following topics:

•Planning

•Initial configuration

•Security

•Problem determination

•Maintaining Licensed Internal Code

•Maintaining system firmware

8.1 Planning

Planning should be done for any major change in any computing environment, including adding new servers, performing upgrades, and implementing software changes. Careful planning involves creating a timeline and dividing the project into phases, each with a specific, stated outcome. In effective planning, you draw up a list of assignments and responsibilities, and also document the current environment and the desired result.

To plan for the HMC, begin with some fairly simple questions:

•Is an HMC needed?

•What models are available?

•Where are they set up?

•How do they connect to servers they manage?

•How are they maintained and by whom?

•Where can I find documentation?

8.1.1 Do I need an HMC

Midrange and enterprise Power Servers need an HMC to create and manage logical partitions, dynamically reallocate resources, invoke Capacity on Demand (CoD), use Service Focal Point, and facilitate hardware control. Two HMCs are suggested for enhanced availability (see 2.8.1, “Dual HMC and redundancy” on page 149). Mission-critical solutions, even those hosted on entry or midrange Power Servers, might benefit from having dual HMCs.

HMCs might not be cost-effective for distributed, entry-level systems that nevertheless require the capabilities of Advanced Power Virtualization. Entry-level servers without an HMC can be configured with a hosting partition called the IBM Integrated Virtualization Manager (IVM). It provides a subset of HMC functions and a single point of control for small system virtualization. IVM does not offer the full range of management capabilities found on an HMC, but it might be sufficient for a small server with one to eight processors.

Another solution might be the vHMC, with which you can run an HMC in a virtualized environment somewhere else in your environment, if you have spare resources available. For further information, see 3.2, “Virtual appliances installations” on page 185.

8.1.2 HMC models

The available hardware models for the HMC are described in 1.3, “Hardware Models” on page 40. The number of servers each HMC can manage varies by server size and complexity. The HMC performance can vary depending on the unique combination of servers and the number of partitions and I/O drawers implemented.

8.1.3 Physical location of an HMC

Locate an HMC close to the servers it manages, nominally within 50 feet (15.24 meters). This proximity is normally not necessary for remote administration. It is necessary, however, for service personnel who use the HMC and its service applications to maintain systems and record service actions, because they must be able to go back and forth between the HMC and a managed server during a service call.

8.1.4 Planning for network connectivity

Two types of networks are possible for an HMC:

•An open network

An open network is the easiest to describe. It means any standard network connection, such as might be used to connect an HMC and a logical partition, or an HMC and a remote workstation.

•A private network

The private network is a non-routable subnet. It is sometimes referred to as a service network. In the context of the HMC, a single HMC will nearly always be the DHCP server for a private network.

A server with dual HMCs would be connected to two private networks with each HMC acting as a DHCP server on a unique, non-routable subnet (see 2.8.1, “Dual HMC and redundancy” on page 149).
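
The requirement that each HMC serve DHCP on its own non-routable subnet can be checked at planning time. The following Python sketch uses only the standard ipaddress module. The subnets shown are hypothetical values chosen for illustration, not ranges taken from an HMC, and the function name is an assumption of this example.

```python
import ipaddress

def validate_private_subnets(subnet_a: str, subnet_b: str) -> list[str]:
    """Return a list of problems found with a pair of planned HMC DHCP subnets."""
    problems = []
    net_a = ipaddress.ip_network(subnet_a)
    net_b = ipaddress.ip_network(subnet_b)
    for name, net in (("HMC 1", net_a), ("HMC 2", net_b)):
        # is_private is True for address ranges reserved for private use.
        if not net.is_private:
            problems.append(f"{name} subnet {net} is not in a private address range")
    # Each HMC must act as DHCP server on a unique subnet.
    if net_a.overlaps(net_b):
        problems.append(f"subnets {net_a} and {net_b} overlap")
    return problems

# Hypothetical subnets, for illustration only.
print(validate_private_subnets("192.168.0.0/24", "192.168.1.0/24"))  # no problems
print(validate_private_subnets("10.0.0.0/16", "10.0.128.0/17"))      # overlap reported
```

An empty list means that both planned subnets are private and distinct. Any other result should be resolved before the second HMC is connected.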

To attach multiple managed systems to one or a pair of HMCs, network switches might be required. If you are planning to implement private networks over a switch that supports virtual local area network (VLAN) technology, be sure that a broadcast from the service processor reaches the HMC DHCP server before the service processor port falls back to its default IP address. For example, if the switch port must have spanning tree enabled, it should also have PortFast or the equivalent enabled.

As an additional step, determine whether the switch requires that the network interface on the HMC be set to a specific speed or whether auto-detect may be used. Hubs generally require the HMC to be set to a specific speed and duplex setting.

8.1.5 Private versus open

The network connection between the HMC and the service processor can be either private or open. Private is preferred, and therefore a good practice.

On a private network, the HMC acts as a DHCP server for the managed system service processors. The IP address is assigned from a range of non-routable addresses selected by you when you configure DHCP on the HMC. The non-routable subnets isolate the HMC and the service processors from other HMC network interfaces.

An HMC can also manage service processors over an open network on low-end and mid-range systems. This scenario requires that the service processors be network reachable from the HMC. All HMC-to-service-processor communication is SSL-encrypted, whether over a private or open network.

In an open configuration, the service processor IP addresses must be set manually on each managed server. They cannot be DHCP clients of any server other than a managing HMC.

Addresses can be set by using the Advanced System Management Interface (ASMI) on the service processor. This involves directly connecting a notebook to one of the ports on the service processor and using HTTPS to log in to one of the two predefined IP addresses (see 6.10.2, “Connecting to ASMI” on page 472). If no notebook is available, an ASCII terminal can be used on the native serial port to access the FSP (service processor) menus in character mode.

Open networks are used for communications between a logical partition and the HMC. This connection is largely to facilitate traffic over the Resource Monitoring and Control (RMC) subsystem, which is the backbone of Service Focal Point (SFP) and required for dynamic resource allocation. The open network also is the means by which remote workstations might access the HMC, and it might be the path by which an HMC communicates with IBM Service through an Internet connection.

Regardless of which type of network is involved, you must provide your own networking infrastructure, such as cables, switches, or hubs. Switches that support virtual networks (VLAN) may be used to create one or more private or open networks as conditions require.

8.1.6 Customer setup

The HMC is a customer setup machine. Contracts for support can be purchased for one-year and three-year periods. Hardware service contracts for on-site hardware support are available beyond the initial warranty period.

Customers are responsible for installing and updating the Licensed Machine Code on all HMCs and managed servers. An update strategy is discussed later. Systems administrators should become familiar with the information and tools available on the Fix Central website:

http://www.ibm.com/support/fixcentral/

As a system administrator, consider signing up for subscription service at the Fix Central website. With a subscription you will receive email notification of new releases of software and firmware.

8.1.7 Documentation

Documentation of the IBM Power Servers and the HMC is available through the IBM Knowledge Center:

http://www.ibm.com/support/knowledgecenter/POWER8/p8hdx/POWER8welcome.htm

The Knowledge Center enables you to print documents or portions of them, and also bookmark pages for easy reference in the future.

Carefully consider the interdependencies between HMC code levels and system firmware on Power servers. These relationships can be found on the firmware support web pages. Be sure you understand that new system firmware might require an upgrade of the HMC code. Because upgrading the HMC does not disrupt partition operations, and because dual or redundant HMCs are supported, keeping HMC code on the most current level is a preferred practice. The general rule is that the HMC must support the highest level of system firmware on any server that it manages.

8.2 Initial configuration

The HMC includes preinstalled licensed machine code, but you might need to reinstall it if the code is superseded or you have a disk failure. Customized setup and installation instructions are included with all new machines or upgrades. Before you receive a new system, review the IBM Knowledge Center to determine what specific preparations might be needed:

http://www.ibm.com/support/knowledgecenter/POWER8/p8hdx/POWER8welcome.htm

8.2.1 Install and configure the HMC first

If you will be using a private network, install and configure your HMC before connecting it to a network or powering on the servers the HMC will manage. This approach enables you to configure the networks properly, start DHCP in an orderly manner, and ensure that the managed servers will connect properly.

The HMC includes a setup wizard (see 3.3, “HMC Install Wizard” on page 223) that can be used by a systems administrator to customize the system. The wizard starts the first time you log in after a new installation. You are not required to use it. Administrators who are comfortable with the HMC Configuration menus may use them instead. The setup wizard can be run at any time from the main menu on the HMC console (see 5.2.1, “Console Settings” on page 310).

8.2.2 Changing passwords

Whether you use the setup wizard or the HMC Configurations menu, the first task you should do is change the hscroot and root passwords from their defaults (see “Predefined user IDs and passwords” on page 3). Logging in as root is disabled. It can be enabled with assistance from IBM support when performing problem determination. Be sure to save all passwords in a secure location where they can be retrieved in an emergency.

8.2.3 Creating user IDs

Create additional user IDs on the HMC so that not every user is accessing the system with the same user ID and password, and not necessarily with the same level of authority.

Administrators with hmcsuperadmin authority, which is what hscroot has, should have their own user IDs and passwords. This will facilitate auditing administrative actions on the HMC. Other users may have other predefined roles with more restricted authority.

A special, optional user ID, hscpe, can be created initially or when needed. This is the user ID needed to gain root access to the HMC. The hscpe user can enter a password obtained from IBM that allows this user to run the pesh command to override the restricted shell and switch-user to root. This also requires that the user knows the root password. The password used to override the restricted shell is good for one day and must be obtained by contacting IBM Support and providing the HMC’s serial number.

8.2.4 Configuring the call-home capability

For many years, IBM has offered a call-home capability. This feature is the ability to automatically notify IBM Service in the event of a hardware problem, and also the ability to transmit other service related or vital product data as required.

On the HMC, the setup wizard prompts for the necessary information to configure call-home (see 6.4, “Connectivity” on page 434). You can also use the HMC menus for configuring customer information and customizing outbound and inbound communications for call-home. If customer information is not configured correctly, the HMC will not be able to “call home” for support.

Note: Inbound communication is optional. When allowed, you have real-time control over the inbound session, and it can be terminated any time.

The inbound and outbound communications are secured by SSL and you can use an SSL-Proxy that allows you to use Network Address Translation (NAT) firewalls between an HMC and IBM Support. In this way, the HMC’s true IP address can be hidden behind a corporate firewall and encrypted information can be sent to IBM.

8.2.5 Configure customer notification

During initial configuration, you can decide how events should be called to your attention.

Customer notifications can be configured to send email to customer accounts when service events are generated. The emails can be sent to distribution lists. Most customers will want to use the filters to ensure that only serviceable events are sent by email, and not every message generated by the HMC. Optionally, SNMP traps may also be configured to send notifications to specific IP addresses when an event occurs. This can be used in conjunction with a network or system monitoring program.

8.3 Security

Physical security of the HMC is a customer responsibility. The HMC should be located in a secure room, if possible. Usually, because of its proximity to the servers it manages, the HMC will be located in a secured data center. However, when that is not possible, there are ways of providing additional protection against unauthorized physical access. These protections are provided mainly through the BIOS settings of the HMC:

•Change the startup device settings in BIOS to prevent the use of a recovery DVD to boot into single-user mode.

•Assign a power-on password in BIOS to prevent unauthorized changes to BIOS settings.

•Unattended start mode can be set in BIOS to allow the HMC to reboot without the power-on password following restoration of power after an unplanned outage. However, the keyboard and mouse at the local console will remain locked until the power-on password is entered.

8.3.1 Network security

The HMC must be properly networked to perform its server management functions. The private or service network is used to communicate with service processors, and the open network is used to collect serviceable events from managed systems and to dynamically reallocate resources. A network is also the means by which remote administrators access and manage the HMC itself.

The HMC enables a firewall to block all incoming network traffic, with the exception of a well-known set of ports, which are listed in Table 8-1 on page 541. Within these well-known ports, further access restrictions can be customized based on IP address or host name.

Table 8-1 HMC port information

Port	Protocol	Application	Enabled by default	Notes
22	TCP	ssh	No
443, 12443, 9960	TCP	https	Yes
5989	TCP	Open Pegasus	No	Open source CIM
657	TCP/UDP	RMC	Yes
9920	TCP	FCS	Yes	Call home
9900	UDP	FCS	Yes	Call home
2300, 2301	TCP	5250 console	Yes
n/a	ICMP	ping	Yes
123	UDP	NTP	No
427	UDP	SLP	Yes	Used in cluster
12347, 12348	UDP	RSCT Peer Domain	Yes
162	TCP/UDP	SNMP traps	No
161	TCP/UDP	SNMP Agent	No
2049	TCP	NFS	No
69	TCP	TFTP	No
500, 4500	UDP	IPSec	No	VPN

The firewall interface allows you to customize remote access to the HMC by IP address and network mask. For further information, see “LAN Adapters Details - Firewall Settings” on page 265.
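
When planning firewall changes, it can help to treat Table 8-1 as data. The following Python sketch transcribes the table rows (the ICMP ping row is omitted because it has no port number) and lists which of the requested services are not enabled by default. The type and function names are assumptions of this example and are not part of any HMC interface.

```python
from typing import NamedTuple

class PortRule(NamedTuple):
    ports: tuple
    protocol: str
    application: str
    enabled_by_default: bool

# Rows transcribed from Table 8-1 (ICMP ping omitted: it has no port number).
HMC_PORTS = [
    PortRule((22,), "TCP", "ssh", False),
    PortRule((443, 12443, 9960), "TCP", "https", True),
    PortRule((5989,), "TCP", "Open Pegasus", False),
    PortRule((657,), "TCP/UDP", "RMC", True),
    PortRule((9920,), "TCP", "FCS", True),
    PortRule((9900,), "UDP", "FCS", True),
    PortRule((2300, 2301), "TCP", "5250 console", True),
    PortRule((123,), "UDP", "NTP", False),
    PortRule((427,), "UDP", "SLP", True),
    PortRule((12347, 12348), "UDP", "RSCT Peer Domain", True),
    PortRule((162,), "TCP/UDP", "SNMP traps", False),
    PortRule((161,), "TCP/UDP", "SNMP Agent", False),
    PortRule((2049,), "TCP", "NFS", False),
    PortRule((69,), "TCP", "TFTP", False),
    PortRule((500, 4500), "UDP", "IPSec", False),
]

def ports_needing_explicit_enable(applications):
    """Return the table rows for the given applications that are off by default."""
    wanted = {name.lower() for name in applications}
    return [
        rule for rule in HMC_PORTS
        if rule.application.lower() in wanted and not rule.enabled_by_default
    ]

# Example: a site that wants SSH, NTP, and RMC must open only the first two.
for rule in ports_needing_explicit_enable(["ssh", "NTP", "RMC"]):
    print(rule.application, rule.ports, rule.protocol)
```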

8.3.2 Network access between HMC and service processor

The HMC communicates with the service processor to perform its management functions. To do this, it establishes a Secure Sockets Layer (SSL) connection with port 30000 and 30001 of the service processor’s Ethernet port. Be sure that the network used for this communication channel is private, although an open network is supported.

8.3.3 Restricted shell on the HMC

The HMC provides a rich set of commands that covers most of the tasks found in the graphical user interface. These commands can be accessed through SSH. By itself, however, SSH can give an authenticated user full access to the shell. To protect the HMC from users who try to gain higher privileges by exploiting the system, a restricted shell is enforced when connecting remotely to the HMC using SSH or when opening a local terminal on the HMC console. In the restricted shell environment, users have access only to a small subset of operating system commands, along with the HMC commands. Users cannot use the cd command and cannot use redirection.

For more information about the command line, see 5.1.2, “Command-line interface (CLI)” on page 307.

8.3.4 Auditing capabilities of the HMC

A secure system also requires strong auditing capabilities. This section describes some of the logging and auditing functions on the HMC.

Most tasks performed on the HMC (either locally or remotely) are logged by the HMC in the iqqylog.log file. These entries can be viewed by using the Console Events Log task, under Serviceability → Console Events Log or by using the lssvcevents command from the restricted shell. A log entry contains the time stamp, the user name, and the task being performed. When a user logs in to the HMC locally or from a remote client, entries are also recorded. For remote login, the client host name or IP address is also captured, as in the following example:

lssvcevents -t console

time=11/11/2015 09:52:55,"text=User hscroot has logged on from location 172.16.254.10 to session id 32. The user's maximum role is ""hmcsuperadmin""."
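
For audit reporting, records such as the one above can be parsed after they are copied off the HMC. The following Python sketch assumes that each record is a comma-separated list of name=value fields with CSV-style quoting, as in the sample, and that the time stamp is in month/day/year order. Both are assumptions inferred from this single example, and the function names are hypothetical.

```python
import csv
import re
from datetime import datetime

LOGIN_PATTERN = re.compile(
    r"User (?P<user>\S+) has logged on from location (?P<location>\S+) "
    r"to session id (?P<session>\d+)"
)

def parse_console_event(line: str) -> dict:
    """Split one lssvcevents record into its name=value fields."""
    row = next(csv.reader([line]))  # handles the quoted text field and "" escapes
    return dict(field.split("=", 1) for field in row)

def parse_login(line: str):
    """Return (timestamp, user, location, session id) for a login record, else None."""
    event = parse_console_event(line)
    match = LOGIN_PATTERN.search(event.get("text", ""))
    if match is None:
        return None
    # Assumption: the time stamp is month/day/year.
    when = datetime.strptime(event["time"], "%m/%d/%Y %H:%M:%S")
    return when, match["user"], match["location"], int(match["session"])

sample = (
    'time=11/11/2015 09:52:55,"text=User hscroot has logged on from location '
    '172.16.254.10 to session id 32. The user\'s maximum role is ""hmcsuperadmin""."'
)
print(parse_login(sample))
```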

Standard log entries from syslogd can also be seen on the HMC by viewing the /var/hsc/log/secure file. This file can be read by users with the hmcsuperadmin role. It is under logrotate control. A valid user can simply use the cat or tail command to view the file. A user with the hmcsuperadmin role can also use the scp command to securely copy the file to another system.

If you want to copy syslogd entries to a remote system, you may use the chhmc command to change the /etc/syslog.conf file on the HMC to specify a system to which to copy. For example, the following command line causes the syslog entries to be sent to the myremotesys.company.com host name:

chhmc -c syslog -s add -h myremotesys.company.com

The systems administrator must be sure that the syslogd daemon running on the target system is set up to receive messages from the network. On most Linux systems, this can be done by adding the -r option to SYSLOGD_OPTIONS in the /etc/sysconfig/syslog file.

In AIX, edit the /etc/syslog.conf file by uncommenting the appropriate lines at the bottom of the file, such as these:

*.debug /tmp/syslog.out rotate size 100k files 4

*.crit /dev/console

Then, as a systems administrator, you enter the following lines:

# touch /tmp/syslog.out

# refresh -s syslogd

8.3.5 Managing and understanding security vulnerabilities on the HMC

As stated in 8.5.5, “Updates” on page 548, HMC users can subscribe to email notification of corrective service at the Fix Central website:

http://www.ibm.com/support/fixcentral/

Whenever a vulnerability is discovered on the HMC, a bulletin describing how to obtain the fix will be sent to users. In most cases, because of the closed nature of the HMC and the presence of the restricted shell, some vulnerabilities found on non-HMC systems will not apply. Each time a new release of the HMC code is made available on the support website, a list of security fixes included in the release is also published.

8.3.6 Resource Monitoring and Control (RMC)

The RMC is based on IBM Reliable Scalable Cluster Technology (RSCT). It is installed and used on the HMC for establishing a trusted communication channel between the HMC and the partitions on the managed server. The following examples describe tasks performed through this channel:

•Dynamic allocation of hardware resources on the partitions

•Graceful shutdown of the AIX operating systems running on the partitions

•Sending hardware error log entries from the AIX partitions to the HMC to provide a single focal point for error collection

RMC uses port 657 for HMC-to-partition communications. RMC employs access control lists to authenticate communication between the partitions and the HMC. The authentication is established during configuration steps on the HMC. As a result, when transmitting messages over port 657, the HMC and the partition can be sure with whom they are communicating.

8.4 Problem determination

This section covers issues and problems that might be encountered during operation of the HMC itself. Although some mention is made of managed systems in the context of server and frame management, service applications, such as Electronic Service Agent or Service Focal Point, are not discussed.

8.4.1 Problem analysis

The HMC is composed of many subsystems and layers in its code stack. Every subsystem maintains a dynamic trace. The vital traces are stored in persistent files in the HMC file system. At periodic intervals, typically every hour, cron jobs run for those subsystems or applications that tend to generate heavier quantities of trace data. Each cron job checks the size of its trace file to see whether it has exceeded a fixed threshold. If the file exceeds this threshold, the trace file is backed up and a new trace file is started. Also, when the HMC is rebooted, the last running trace file for some subsystems is backed up; for others it is overwritten.

Typically up to eight backup copies are maintained at any given time. Although this might seem sufficient, the size of the trace files and number of backups maintained depend upon the HMC load. For example, in an environment where an HMC is managing two frames and 40 logical partitions, the amount of trace data generated can be voluminous over a short period. If IBM Support will be needed to diagnose an HMC problem, the trace files must be extracted from the HMC as soon as possible after the problem has been observed.
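
The rotation scheme described above, a periodic size check with a bounded number of backups, can be illustrated with a short Python sketch. It is not the HMC implementation. The file names, the threshold, and the function name are assumptions chosen for illustration. It shows why older trace data is lost quickly under load: each rotation discards the oldest backup.

```python
import os
import tempfile

def rotate_if_needed(trace_path: str, max_bytes: int, max_backups: int = 8) -> bool:
    """Back up trace_path when it exceeds max_bytes; keep at most max_backups copies."""
    if not os.path.exists(trace_path) or os.path.getsize(trace_path) <= max_bytes:
        return False
    # Shift existing backups up by one; the oldest backup is overwritten.
    for index in range(max_backups - 1, 0, -1):
        older = f"{trace_path}.{index}"
        if os.path.exists(older):
            os.replace(older, f"{trace_path}.{index + 1}")
    os.replace(trace_path, f"{trace_path}.1")
    open(trace_path, "w").close()  # start a fresh, empty trace file
    return True

# Demonstration in a temporary directory with a deliberately tiny threshold.
with tempfile.TemporaryDirectory() as workdir:
    trace = os.path.join(workdir, "subsystem.trace")
    for run in range(10):
        with open(trace, "a") as handle:
            handle.write("x" * 200)  # simulate trace output between checks
        rotate_if_needed(trace, max_bytes=100)
    print(sorted(os.listdir(workdir)))
```

After ten rotations only the eight most recent backups remain, which is the reason to extract trace files promptly after a problem is observed.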

The HMC provides a pedbg command to assist in this process. This command can be run only as user hscpe with the hmcpe task role. Consult the man page for the full list of options this command provides. For further instruction about using it, see Appendix C, “IBM product engineering debug data collection” on page 575.

Although pedbg should suffice for enabling trace collection, situations can arise where having support gain root-level access on the HMC will be helpful. The HMC provides the pesh command to escape temporarily from the restricted shell. This command may be run only by user hscpe with task role hmcpe. This command accepts one parameter, the HMC serial number, which can be obtained by issuing the lshmc -v command. You are prompted for a password, which you must obtain from IBM support.

This password is valid until the end of the calendar day on which it was issued. Hence, a password obtained at 11:00 PM will expire one hour later, at midnight. When accepted, the password will give full shell access.
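
The expiry rule is simple date arithmetic, as the following Python sketch shows. It assumes that the calendar day is measured on a single clock; the text does not state which time zone applies, so confirm that with IBM Support. The function names are hypothetical.

```python
from datetime import datetime, timedelta

def pesh_password_expiry(issued_at: datetime) -> datetime:
    """Return the midnight that ends the calendar day on which the password was issued."""
    start_of_day = issued_at.replace(hour=0, minute=0, second=0, microsecond=0)
    return start_of_day + timedelta(days=1)

def remaining_validity(issued_at: datetime) -> timedelta:
    """Return how long a password issued at issued_at stays valid."""
    return pesh_password_expiry(issued_at) - issued_at

issued = datetime(2015, 11, 11, 23, 0)  # 11:00 PM, as in the example above
print(remaining_validity(issued))       # 1:00:00
```

A practical consequence is to request the password early in the day on which the debugging session is planned.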

Note: Although the password will expire at the end of the calendar day, after you are logged in as user root, you may remain logged in indefinitely. However, staying logged in indefinitely is not recommended because it can be a security exposure. Be sure that this user ID is deleted after use and re-created, temporarily, as needed.

For instructions on how to gather analysis data for Live Partition Mobility (LPM), see Appendix D, “Live Partition Mobility support log collection” on page 581.

8.4.2 Problem logging and tracking

More detailed information about either information or error messages received during command execution can be obtained by using the showLog command. This command can be run in a Terminal in the HMC console by a user who has become root (to become root, follow the previous steps for obtaining a password to run in conjunction with the hscpe task role and the pesh command).

Any user, except users with the hmcviewer task role, can view system event information included in the console logs on the GUI by selecting Serviceability → Console Events Log. This task displays system events logged during HMC operation, enumerating HMC activity in response to user-initiated tasks and whether each command succeeded or failed. It does not display all entries in the HMC console logs, only a subset of them. This task can also be executed on the command line by using the lssvcevents command with the -t console flag.

8.4.3 Problem correction

As mentioned previously, all HMC tasks (user-initiated and otherwise) require interactions between various subsystems on the HMC. Failures in one or more subsystems might occur, and a useful tactic is to isolate the failures if possible. For example, when a task fails, it is usually a good idea to try another way to perform the same operation. Assume a task cannot be performed from the GUI: it was initiated but the GUI is not working. Typically, this means the panel is displayed but not available for use, especially after minimizing and maximizing the window. Check whether the task can be done through the command line. If neither way works, the back end is likely to be the culprit.

Consider some possible scenarios in HMC system management. A common source of concern is HMC performance. Performance can suffer if trace and log files fill the HMC file system. Disk space usage can be checked with the following command:

monhmc -r disk

If any file system is 100% full, issue the chhmcfs command to free space in the HMC file systems (see the man page for options).

The managed systems and frames can be in one of many states, as reported by the HMC. Among the states that cause confusion are these:

•No Connection

The HMC cannot build a valid connection to the service processor or BPC. The reason will be displayed as an error code on the GUI. If you believe that this state was reached in error, you can reset the network connection between the HMC and service processor by using one of the following procedures:

– Right-click on the managed system to open the pop-up menu (or select the managed system). Select Actions → Reset or Remove System Connection on the HMC Console.

You must have the hmcsuperadmin, hmcoperator, or hmcpe task role to perform this operation.

– Use the rmsysconn command (on the command line) with -o reset flag.

•Incomplete

The HMC is unable to gather all system information from the managed system or frame. In some cases this might be because of a network error causing a temporary disruption to HMC and service processor interactions, or managed system hardware configuration changes being performed for a redundant HMC. To verify, an attempt can be made to recover from this state by using either of the following procedures:

– Select the Rebuild System GUI task.

– Issue the chsysstate command with the -o rebuild -r sys flags on the command line.

Neither procedure can be performed by the hmcviewer task role. If the state does not change, try resetting the HMC-service processor connection (see the previous bullet, No Connection), then try rebooting the HMC if resetting does not help. If the problem still persists, gather trace files and logs for support.

•Recovery

The save area of the service processor, where partition profiles and some partition information are kept, might be corrupted, cleared, or out of sync with the cached copy that the HMC maintains in its file system. First, find out whether the managed system has been updated recently, because firmware updates can clear nonvolatile random access memory (NVRAM). If no system update has been performed recently, you can perform either of the following procedures:

– Restore the save area with the cached copy on the HMC; use the GUI or the CLI:

• Manage Partition Data - Restore task from the GUI

• chsysstate -o recover -r sys command from the CLI

– Clear all partition configuration information; use the GUI or the CLI:

• Manage Partition Data - Initialize task from the GUI

• rstprofdata -l 4 command from the CLI

Do not use this procedure unless you are willing to rebuild the partitions from scratch.

If neither approach works, gather trace files and logs.

A problem more severe than No Connection is the situation where no systems or frames appear where they had appeared before. Although many reasons exist for this, one common scenario is that a managed system or frame was removed from the HMC. This might have happened through the Remove Connection task or rmsysconn in the CLI.

When this system is then added back into the HMC’s management domain, the HMC (as DHCP server) will not redetect it. If you remove a managed system and have reason to believe that this HMC might manage it again in the future, run the mksysconn -o auto command to purge the HMC of its management history and allow it once again to provide IP addresses to the managed server.

Another observed problem is that the GUI does not reflect the true status or configuration of a managed system or frame. Use the CLI to see whether it gives up-to-date information. If it does, the GUI has either stopped receiving indication data, or no indications are being propagated to it. A Reload (F5) operation can refresh the GUI in this situation. If the command line is also incorrect, the HMC has stopped receiving event notifications from the managed system or frame. To recover from this situation, perform the Rebuild System task.

The service processor provides the HMC with the capability to set locks on the platform. The service processor does not interpret the locks, but rather leaves it up to the HMC functions to do that. These locks are used for synchronization of operations from one or two (dual) HMCs. In a dual HMC environment, situations can arise where both HMCs perform tasks against the same managed system and require the same lock.

When HMC 2 needs to perform a task that requires a lock that HMC 1 is currently holding, HMC 2 will wait and retry to acquire the lock. If it is unsuccessful after a few attempts, the operation will fail and the user will be notified accordingly. Although this blocked state is often mistaken for a “hang,” that is not the case. It is possible, however, for HMC 1 to acquire a lock and fail to release it. If this happens, HMC 2 can be used to disconnect HMC 1. When an HMC is disconnected, all locks owned by that HMC are reset. To do this, any hmcsuperadmin user can run the Disconnect Another HMC GUI task on HMC 2 against HMC 1. This can be done only from the GUI. No corresponding command-line version of this task exists.
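
The wait-and-retry behavior can be pictured with a generic bounded-retry loop. The following Python sketch uses an in-process threading.Lock as a stand-in for a platform lock. It illustrates the pattern only and is not how the HMC or the service processor implements locking. The retry count and delay are arbitrary assumptions.

```python
import threading
import time

def acquire_with_retries(lock: threading.Lock, attempts: int = 3,
                         wait_seconds: float = 0.01) -> bool:
    """Try to take the lock a limited number of times, pausing between tries."""
    for attempt in range(attempts):
        if lock.acquire(blocking=False):
            return True
        if attempt < attempts - 1:
            time.sleep(wait_seconds)
    return False

platform_lock = threading.Lock()

platform_lock.acquire()                     # "HMC 1" holds the lock
print(acquire_with_retries(platform_lock))  # "HMC 2" gives up after a few tries
platform_lock.release()                     # lock reset, for example after a disconnect
print(acquire_with_retries(platform_lock))  # "HMC 2" now acquires the lock
platform_lock.release()
```

A failed operation of this kind means that the lock is busy, not that the requester has hung. Only a lock that is never released calls for the disconnect procedure.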

8.5 Maintaining Licensed Internal Code

Make sure you keep track of new releases, updates and emergency fixes to HMC code. You can do this in one of two ways:

•Sign up for the technical support subscription service to receive emails when updates become available on the web.

•Monitor the Fix Central website, manually, on a regular basis:

http://www.ibm.com/support/fixcentral/

Read through the website carefully. Select the Power platform that applies to you. Many additional resources are on the site, such as links to extra technical information, hints and tips, and the latest command-line specifications, where you can find new commands that may have been added.

You can order recovery DVDs or download packages that contain the files needed to burn your own recovery DVD. The files used to create DVDs have an .iso file extension. The DVDs created from these packages are bootable. You can download updates to HMC code and also emergency fixes, and you can order DVDs containing the updates and fixes. The DVDs containing updates and fixes are not bootable.

8.5.1 Management Console Data backups

An important task is to maintain a current backup of the Management Console Data (see 5.3.6, “Backup Management Console Data” on page 321) to use in recovering the HMC after the loss of a disk drive. Whenever you go to a new version level of HMC code, or use a recovery DVD to update the HMC, be sure to create a new backup immediately following the installation. If you update HMC code between releases using the Corrective Service files that are downloadable from the web, and then create a new backup of the Management Console Data after the update, you can use this backup and the last-used recovery DVD to rebuild the HMC to the level in use when the disk was lost.

Another example where a Management Console Data backup can be useful is when replacing a service processor or BPC. Create a fresh backup before starting the replacement in order to preserve the DHCP lease file on the HMC that lists the starting service processor and BPC IP addresses. If things do not work after replacing the service processor or BPC, this backup can be used to restore the original information so that you can return to the original service processor and BPC. If the replacement is successful, a new IP address is assigned to the new component and the lease file is updated. At this point, create a new backup to capture the freshly updated DHCP lease file.

8.5.2 HMC code installation, upgrade, or update overview

When HMC systems leave manufacturing, they are preinstalled with the most recent level of code. However, there might be a time when reinstallation is needed. The HMC can be installed using DVD media, or installed over the network by using a server that accepts Preboot Execution Environment (PXE) requests.

8.5.3 Install and recovery

Installation is the simplest form of applying code on the HMC. It is used by manufacturing to install code prior to shipping the HMC. When the HMC is at the customer location, an installation should be needed only for disaster recovery or when the customer wants to reload the HMC from scratch. In disaster recovery, a systems administrator can use an appropriate recovery DVD and the Management Console Data backup to get the HMC back to the state it was in prior to the failure. Do not use an old Management Console Data backup after upgrading to a newer version or release of HMC code. Create a new Management Console Data backup instead, as mentioned previously.

8.5.4 Upgrade

Be sure you can distinguish between updating and upgrading a system. The terms are not synonymous. To upgrade is to bring the system to a higher version or release of HMC code. When the HMC version number is incremented, such as going from Version 7 to Version 8, the upgrade method must be used to apply the new version of HMC code. Prior to an upgrade, the systems administrator should perform a Save Upgrade Data task (see 5.3.8, “Save Upgrade Data” on page 323) to preserve configuration information on the HMC, such as network settings, user data, and partition profiles. This data is saved in a special location on the HMC's hard disk that is not erased during the upgrade process. When the upgrade process completes, the data is restored to the HMC's file system.

Perform a Save Upgrade Data task only when you are upgrading an HMC. Do not use it when performing service work on a Power server. If you are planning to service or replace a service processor on a Power server, create a Management Console Data backup first.

8.5.5 Updates

Between HMC releases, or between upgrades, interim fixes or cumulative service packs might need to be applied. These are types of updates. Interim fixes consist of security fixes that are considered critical enough to be released to customers immediately. Service packs are generally larger in content. Both can be installed on the HMC by using the Update the Hardware Management Console task (see 5.3.4, “Update the Hardware Management Console” on page 320), or by using the updhmc command on the HMC.

The HMC uses Version, Release, and Maintenance (VRM) nomenclature to describe operating system releases. The HMC version and release information can be viewed with the About function (see “Getting Started with Hardware Management Console (HMC) window” on page 296). From the command line, you can get the current code level by issuing the lshmc -V command.

Every version of HMC code is made available on bootable recovery media. Within a version, releases are available as either recovery DVD or downloadable corrective service files.

Corrective service is a cumulative maintenance release within a single version that customers can use to update the HMC from any previous releases within the same version. For example, a customer who is currently at Version 8 Release 1 or Version 8 Release 3 can update to Version 8 Release 4 using the same corrective service.

Corrective service is not provided when the version number is incremented, for example from Version 7 to Version 8. In that case, only recovery media can be used to perform the upgrade.

Corrective service is relatively easy to apply by using the Update the Hardware Management Console task on the HMC or by running the updhmc command. As successive corrective service updates are installed, the size of the Critical Console Data increases, because the Backup Management Console Data task backs up both data and binary changes on the HMC. Over time, the backup can become quite large. To shrink it, perform a Save Upgrade Data task and then update with recovery media. The Save Upgrade Data task saves only the needed user and configuration data. After the update, a new Management Console Data backup can be made, and it will be smaller.

Interim fixes or service packs are also treated as corrective service, and they are installed in the same manner as described previously. The difference is primarily the size and how they are shown to users. Customers will see a Program Temporary Fix (PTF) value associated with the interim fix or service pack when using a command such as lshmc -V, for example:

MH01453: Maintenance Package for V8R8.2 (11-14-2014)
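To check which fixes are installed without reading the whole listing, the PTF identifiers can be filtered out of the command output. The following is a minimal sketch, not an HMC feature: list_ptfs is a hypothetical helper, and it assumes that PTF lines start with an identifier of the form MH followed by digits and a colon, as in the sample line above.

```shell
# list_ptfs: read lshmc -V style output on stdin and print one PTF
# identifier per line. Assumes identifiers look like "MH01453:" at the
# start of a line (based on the sample shown in the text).
list_ptfs() {
    sed -n 's/^[[:space:]]*\(MH[0-9][0-9]*\):.*/\1/p'
}

# Demonstration using the sample line from the text.
printf '%s\n' 'MH01453: Maintenance Package for V8R8.2 (11-14-2014)' | list_ptfs
```

From a remote workstation with SSH access, the same filter could be applied to live output, for example ssh hscroot@hmc1 lshmc -V | list_ptfs, where hmc1 is one of the host names used in this chapter.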

8.5.6 HMC code update on multiple machines

Some clients have a large number of HMCs. Updating code on a large number of machines can be time-consuming, especially if manual intervention or physical access to the local HMC is needed. Fortunately, methods are available to overcome this problem.

Remote command

The HMC provides a rich set of commands. They are available locally and also remotely through SSH, so a remote workstation with SSH client software can execute commands on the HMC. The updhmc command is one of them. It allows interim fixes, service packs, and cumulative maintenance releases to be installed remotely on the HMC. The following example illustrates a scenario where an HMC code update is performed on multiple machines from a single remote workstation.

From the remote system installed with SSH, generate public key files with the ssh-keygen command using an empty passphrase, and deploy the public key file to all the HMCs. In Example 8-1, the HMC host names are hmc1 through hmc7.

Example 8-1 Update multiple machines simultaneously

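# Copy the update file to each HMC.
for i in 1 2 3 4 5 6 7
do
    scp hmc_update.zip hscroot@hmc$i:/home/hscroot
done

# Install the update on each HMC, remove the file, and reboot.
for i in 1 2 3 4 5 6 7
do
    ssh hscroot@hmc$i "updhmc -t disk -f /home/hscroot/hmc_update.zip -c -r"
done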

The first for loop in the example copies an interim fix, whose file name is hmc_update.zip, to seven HMCs. The second for loop runs the updhmc command on each of the same seven HMCs. When the command finishes, it removes the update file (-c) and reboots the HMC (-r).
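With many HMCs, it helps to know which hosts failed instead of letting the loop continue silently. The following sketch wraps the same scp and updhmc calls in a function that records failures and supports a dry run. The names run, update_hmcs, and DRY_RUN are hypothetical and introduced here only for illustration. The sketch also assumes that the exit status of ssh reflects the result of updhmc, which might not hold if the reboot ends the session first.

```shell
# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# update_hmcs FILE HOST...
# Copy FILE to each HMC, install it with updhmc, and report the hosts
# on which either step failed.
update_hmcs() (
    file=$1
    shift
    failed=""
    for host in "$@"; do
        run scp "$file" "hscroot@$host:/home/hscroot" &&
        run ssh "hscroot@$host" \
            "updhmc -t disk -f /home/hscroot/$(basename "$file") -c -r" ||
        failed="$failed $host"
    done
    [ -z "$failed" ] || { echo "update failed on:$failed" >&2; exit 1; }
)

# Dry run: print the commands for two HMCs without contacting them.
DRY_RUN=1
update_hmcs hmc_update.zip hmc1 hmc2
```

Setting DRY_RUN=0 (or leaving it unset) runs the commands for real. Review the dry-run output first, because the -r flag reboots each HMC.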

8.5.7 HMC code remotely install/upgrade

The traditional method many administrators use to upgrade their HMC is to use recovery media; the procedures for installing or upgrading the HMC are well-documented. Upgrades performed remotely are becoming more popular. This section illustrates an example of performing an HMC upgrade remotely.

HMC CLI commands used

During the remote upgrade process, you will use the following commands:

getupgfiles Retrieve network installation images.

chhmc Set up alternate disk boot method.

hmcshutdown Reboot the HMC.

updhmc Apply corrective service patch.

IBM FTP repository for HMC images

The getupgfiles and updhmc commands in the following example both use an IBM FTP server. If your HMC can reach the IBM FTP server used in this example, you can enter the commands exactly as shown. If not, you can use your own FTP server, SFTP server, or NFS server (see the man pages). The IBM FTP repository for HMC and other product updates is ftp.software.ibm.com. It has separate HMC directories for the various types of fixes, as follows:

Network install images: /software/server/hmc/network

HMC updates: /software/server/hmc/updates

Corrective service fixes: /software/server/hmc/fixes

Example command syntax used in remote upgrade

The following example assumes that your HMC is at V7R7.2 and you want to upgrade to V8R8.4. Here are the steps for the upgrade:

1. Prior to starting an upgrade, a good practice is to run the following commands:

chsvcevents -o closeall

chhmc -o f -d 0

hmcshutdown -t now -r

2. Save upgrade data to HMC hard disk:

saveupgdata -r disk

3. Download the network install images to HMC:

getupgfiles -h ftp.software.ibm.com -u anonymous --passwd ftp -d /software/server/hmc/network/v8840

Note: The getupgfiles operation will mount the /hmcdump file system, copy the install files into the directory, and then unmount the file system.

4. Set the HMC to boot from an alternate disk partition:

chhmc -c altdiskboot -s enable --mode upgrade

5. Reboot the HMC to begin the upgrade:

hmcshutdown -r -t now

Note: The HMC boots from the alternate disk partition and then starts processing the upgrade files, a process that takes some time. Most of the installation typically completes within one to two hours.

6. After the upgrade is completed, you might need to install the available corrective service fixes, which you can do from the command line as follows:

updhmc -t s -h ftp.software.ibm.com -u anonymous -p ftp -f /software/server/hmc/updates/HMC_Update_<Version>.iso -r

Where <Version> is the version number of the service pack.

Considerations when doing a remote network upgrade

Although the HMC CLI environment is restricted, some common scripting commands can be used to monitor the status of network image downloads. For example, the following loop can be run as hscroot:

while true ; do

date

ls -la /hmcdump

sleep 60

done

Typically the file system /hmcdump remains mounted until the getupgfiles command completely exits. You can use the commands to see the files collected in /hmcdump to ensure the sizes are correct.

Post upgrade verification

After the upgrade, use the lshmc -V command to verify the build level of your HMC.
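The verification can also be scripted from a remote workstation. The following is a rough sketch under stated assumptions: it assumes that the lshmc -V output contains a base_version= field, and the sample text, including its build level, is illustrative rather than captured from a real system. The helper name hmc_base_version is hypothetical. Check the output format on your own HMC before relying on it.

```shell
# hmc_base_version: read lshmc -V output on stdin and print the value
# of the base_version field (assumed to look like base_version=V8R8.4.0).
hmc_base_version() {
    sed -n 's/.*base_version=\([A-Za-z0-9.]*\).*/\1/p'
}

# Illustrative sample only; real output may differ by release.
sample='"version= Version: 8
 Release: 8.4.0
 Service Pack: 0
HMC Build level 20151112.1
","base_version=V8R8.4.0
"'

expected="V8R8.4.0"
level=$(printf '%s\n' "$sample" | hmc_base_version)
if [ "$level" = "$expected" ]; then
    echo "HMC is at the expected level: $level"
else
    echo "unexpected level: ${level:-unknown}" >&2
fi
```

Against a live system, the sample would be replaced by the output of ssh hscroot@hmc1 lshmc -V, where hmc1 is one of the host names used earlier in this chapter.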

8.5.8 Network installation, update, backup, and restore

Since HMC Version 5 Release 1.0, you can select the integrated network adapter in your HMC as a start-up or IPL device. This approach allows the HMC to contact a remote system that supports PXE requests to perform installation, upgrade, backup, or restore operations on the HMC. To perform a secure backup/restore operation over the network, the remote system must have an SSH server running. The PXE setup is explained in Appendix E, “Preboot Execution Environment” on page 587.

8.5.9 Tips for maintaining Licensed Internal Code

Be sure to keep track of new releases, updates, and emergency fixes to HMC code. You can do this in one of two ways:

•Sign up for the technical support subscription service to receive emails when updates become available on the web

•Monitor the web manually on a regular basis at the Fix Central website:

http://www.ibm.com/support/fixcentral/

Code and resources on the web

Read the website carefully and select the HMC version that applies to you. The Fix Central website has many resources, such as these examples:

•Links to additional technical information.

•Hints and tips.

•The latest command-line specification (command-line reference).

•Recovery DVDs to order.

•Download packages that contain ISO files needed to burn your own recovery media.

Note: This media is bootable.

•Download updates to HMC code and also emergency fixes, or order DVDs containing the updates and fixes.

Note: This media is not bootable.

Maintain backups

Maintain a current Management Console Data backup. If you use recovery media to update your Licensed Internal Code to a new release level, make a new backup after the upgrade process. A Management Console Data backup created at V8R8.1 will not work on a system that was upgraded to V8R8.4 using a recovery DVD. However, if you are trying to recover after losing a disk on a system that was updated using the Corrective Service files, you can use a Management Console Data backup (created at V8R8.4) with your V8R8.4 recovery DVD.

8.6 Maintaining system firmware

The naming convention for system firmware update files is as follows:

01WW_XXX_YYY_ZZZ

Where:

•WW is an identifier, consisting of two letters.

•XXX is the release level.

•YYY is the service pack level.

•ZZZ is the latest disruptive service pack level.
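The naming convention can be split into its fields with a few lines of shell. This is a minimal sketch: parse_fw_level is a hypothetical helper, not an HMC command, and the level name 01AB_810_050_040 is a made-up value that only follows the pattern above.

```shell
# parse_fw_level NAME
# Split a firmware level name of the form 01WW_XXX_YYY_ZZZ into fields.
parse_fw_level() (
    name=$1
    case $name in
        01??_*_*_*) ;;
        *) echo "unrecognized firmware level: $name" >&2; exit 1 ;;
    esac
    set -f          # no filename expansion while splitting
    IFS=_
    set -- $name    # split on underscores into $1..$4
    [ $# -eq 4 ] || { echo "unrecognized firmware level: $name" >&2; exit 1; }
    prefix=$1
    echo "id=${prefix#01} release=$2 service_pack=$3 last_disruptive=$4"
)

# Made-up level name used only for illustration.
parse_fw_level 01AB_810_050_040
```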

Upgrades from one release level (XXX) to another are always disruptive, meaning that you must restart your managed system (do an IPL again). Updates between service pack levels can often be applied concurrently, but you need to verify this for each update (see 8.6.1).

You might need to upgrade existing HMC code to support a new server that is running the latest system firmware. As soon as you upgrade the HMC for the new server, you should plan to upgrade the existing managed server to the new system firmware level. Upgrade the HMC code before upgrading system firmware on the existing server or attaching new servers.

8.6.1 Concurrent versus disruptive updates

An installation is disruptive if any of the following statements is true:

•The release levels (XXX) differ.

•The service pack level (YYY) and the last disruptive service pack level (ZZZ) of the service pack to be installed are the same.

•The service pack level (YYY) currently installed on the system is lower than the last disruptive service pack level (ZZZ) of the service pack to be installed.

An installation is concurrent if both of the following statements are true:

•The release level (XXX) is the same.

•The service pack level (YYY) currently installed on the system is the same as or greater than the last disruptive service pack level (ZZZ) of the service pack to be installed.
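The rules above can be expressed as a small decision function. The following sketch is one way to encode them, assuming that service pack levels are numeric. The name classify_fw_update and the level values in the usage lines are hypothetical.

```shell
# classify_fw_update CUR_XXX CUR_YYY NEW_XXX NEW_YYY NEW_ZZZ
# Print "disruptive" or "concurrent" for installing the new level over
# the currently installed one, following the rules listed in the text.
classify_fw_update() (
    cur_rel=$1 cur_sp=$2 new_rel=$3 new_sp=$4 new_dis=$5
    if [ "$cur_rel" != "$new_rel" ]; then
        echo disruptive     # release levels differ
    elif [ "$new_sp" -eq "$new_dis" ]; then
        echo disruptive     # new service pack is itself disruptive
    elif [ "$cur_sp" -lt "$new_dis" ]; then
        echo disruptive     # installed level is below the last disruptive level
    else
        echo concurrent
    fi
)

# Made-up levels: release 810, installed service pack 040,
# new service pack 060 whose last disruptive level is 040.
classify_fw_update 810 040 810 060 040
```

The function reports only what the naming convention implies. The documentation delivered with each firmware level remains the authority on whether an update is concurrent.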

8.6.2 Memory considerations for firmware upgrades

Firmware release level upgrades and service pack updates can consume additional system memory. Server firmware requires memory to support the logical partitions on the server. The amount of memory required by the server firmware varies according to several factors:

•Number of logical partitions

•Partition environments of the logical partitions

•Number of physical and virtual I/O devices used by the logical partitions

•Maximum memory values given to the logical partitions

Generally, you can estimate the amount of memory required by server firmware to be approximately 8% of the system installed memory. The actual amount required will generally be less than 8%. However, some server models require an absolute minimum amount of memory for server firmware, regardless of the previously mentioned considerations.
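As a quick planning aid, the 8% rule of thumb can be computed directly. This is a small sketch: the function name fw_memory_estimate_mb is hypothetical, and the result is only an approximate upper estimate that ignores any model-specific minimum.

```shell
# fw_memory_estimate_mb INSTALLED_MB
# Print roughly 8% of the installed memory in MB, rounded up.
fw_memory_estimate_mb() {
    echo $(( ($1 * 8 + 99) / 100 ))
}

# Example: a system with 64 GB (65536 MB) of installed memory.
fw_memory_estimate_mb 65536
```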
