THE FOLLOWING ITIL INTERMEDIATE EXAM OBJECTIVES ARE DISCUSSED IN THIS CHAPTER:
The ITIL service design publication covers the managerial and supervisory aspects of service design processes. It excludes the day-to-day operation of each process and the details of the process activities, methods, and techniques or its information management. More detailed process operation guidance is covered in the service capability courses. Each process is considered from the management perspective. That means that at the end of this chapter, you should understand those aspects that would be required to understand each process and its interfaces, oversee its implementation, and judge its effectiveness and efficiency.
ITIL states that capacity management is responsible for ensuring that the capacity of IT services and the IT infrastructure is able to meet agreed current and future capacity and performance needs in a cost-effective and timely manner. The capacity management process must therefore understand the likely changes in capacity requirements and ensure that the design and ongoing management of a service meet this demand. Sufficient capacity is a key warranty aspect of a service that needs to be delivered if the benefits of the service are to be realized.
Capacity management is considered throughout the lifecycle; as part of strategy, the likely capacity requirements for a new service are considered as part of the service evaluation to ensure that the service is meeting a real need. In design, the service is engineered to cope with that demand and to be flexible enough to be able to adjust to meet changing capacity requirements. Transition ensures that the service, when implemented, is delivering according to its specification. The operational phase of the lifecycle ensures that day-to-day adjustments that are necessary to meet changes in requirements are implemented. Finally, as part of continual service improvement, capacity-related issues are addressed and adjustments are made to ensure that the most cost-effective and reliable delivery of the service is achieved.
The purpose of the capacity management process is to understand the current and future capacity needs of a service and to ensure that the service and its supporting services are able to deliver to this level. The actual capacity requirements will have been agreed upon as part of service level management; capacity management must not only meet these, but also ensure that the future needs of the business, which may change over time, are met.
The objectives of capacity management are met by the development of a detailed plan that states the current business requirement, the expected future requirement, and the actions that will be taken to meet these requirements. This plan should be reviewed and updated at regular intervals (at least annually) to ensure that changes in business requirements are considered. Similarly, any requests to change the current configuration will be considered by capacity management to ensure that they are in line with expectations or, if not, that the capacity plan is amended to suit the changed requirement. Those responsible for capacity management will review any issues that arise and help resolve any incidents or problems that are the result of insufficient capacity. This helps ensure that the service meets its objectives.
An essential objective is to make sure capacity is increased in a timely manner so the business is not impacted.
As part of the ongoing management of capacity and its continual improvement, any proactive measures that may improve performance at a reasonable cost are identified and acted upon. Advice and guidance on capacity and performance-related issues are provided, and assistance is given to service operations with performance- and capacity-related incidents and problems.
The capacity management process has responsibility for ensuring sufficient capacity at all times, including both planning for short-term fluctuations, such as those caused by seasonal variations, and ensuring that the required capacity is there for longer-term business expansion. Changes in demand may sometimes actually be reductions in that demand, and this is also within the scope of the process. Capacity management should ensure that as demand for the service falls, the capacity provided for that service is also reduced to ensure that unnecessary expenditure is avoided.
The process includes all aspects of service provision and therefore may involve the technical, applications, and operations management functions. Other aspects of capacity, such as staff resources, are also considered.
As the example in the case study “Capacity Management” illustrates, an increase in capacity requirements may have repercussions across the infrastructure and on the IT staff resources required to manage it. Although staffing is a line management responsibility, the calculation of resource requirements in this area is also part of the overall capacity management process.
Capacity management also involves monitoring “patterns of business activity” to understand how well the infrastructure is meeting the demands upon it and making adjustments as required to ensure that the demand is met. Proactive improvements to capacity may also be implemented, where justified, and any incidents caused by capacity issues need to be investigated.
Capacity management may also recommend demand management techniques to smooth out excessive peaks in demand.
Capacity management provides value to the business by improving the performance and availability of IT services the business needs; it does so by helping to reduce capacity- and performance-related incidents and problems. The process will also ensure that the required capacity and performance are provided in the most cost-effective manner.
All processes should contribute in some way to the achievement of customer satisfaction, and capacity management does this by ensuring that all capacity- and performance-related service levels are met.
Proactive capacity management activities will support the efficient and effective design and transition of new or changed services. This will include the production of a forward-looking capacity plan based on a sound understanding of business needs and plans.
As with availability management, capacity management will have the opportunity to improve the ability of the business to follow an environmentally responsible strategy by using green technologies and techniques.
Capacity management is essentially a balancing act. It ensures that the capacity and performance of the IT services and systems match the evolving demands of the business in the most cost-effective and timely manner. This requires balancing the costs against the resources needed. Capacity management needs to ensure that the processing capacity that is purchased is cost justifiable in terms of business need. It ensures that the organization makes the most efficient use of those resources.
Capacity management is also about balancing supply against demand. It is important to ensure that the available supply of IT processing power matches the demands made on it by the business, both now and in the future. It may also be necessary to manage or influence the demand for a particular resource.
The policies for capacity management should reflect the need for capacity management to play a significant role across the service lifecycle.
It is important to ensure that capacity management is part of the consideration for all service level and operational level agreements, and of course any supporting contracts with suppliers. These agreements will capture the service requirements of the business, and capacity management should consider these for the current and future business needs.
We are not going to explore the process in detail, but you should make sure you are familiar with all the aspects of the process and the management requirements for each.
In Figure 14.1, you can see the full scope of the subprocesses, techniques, and activities for the capacity management process.
Capacity management is an extremely technical, complex, and demanding process, and in order to achieve results, it requires three supporting subprocesses: business capacity management, service capacity management, and component capacity management.
Business capacity management is focused on the current and future business requirements. Service capacity management is focused on the delivery of the existing services that support the business, and component capacity management is focused on the IT infrastructure that underpins service provision.
It is important to ensure that the tools used by capacity management conform to the organization’s management architecture and also integrate with other tools used for the management of IT systems and automating IT processes.
The monitoring and control activities within service operation should provide a basis for the tools to support and analyze information for capacity management. The IT operations management function and the technical management departments (such as network management and server management) may carry out the bulk of the day-to-day operational duties. They will participate in the capacity management process by providing it with performance information.
Like availability management, capacity management has both reactive and proactive activities. In Figure 14.2, you can see the activities relating to both reactive and proactive capacity management and the interaction between the subprocesses.
Capacity management should include the following proactive activities:
Capacity management should include the following reactive activities:
There are a number of ongoing activities that form part of the capacity management process. These activities provide the basic historical information and triggers necessary for all of the other activities and processes within capacity management. These can be seen in Figure 14.3.
Monitoring should be established on all the components and for each of the services. The data from the monitoring systems should be analyzed using expert systems to compare usage levels against thresholds. The results of the analysis should be included in reports and used to make recommendations for management of the systems. Control mechanisms should then be put in place to act on the recommendations.
There are many different approaches to managing capacity, including balancing services, balancing workloads, changing concurrency levels, and adding or removing resources. The information accumulated during these activities should be stored in the capacity management information system (CMIS).
This is a cyclic activity, and any changes should be monitored to make sure they deliver a positive benefit. These iterative activities are primarily performed as part of the service operation stage of the service lifecycle.
Let’s consider the triggers, inputs, and outputs for the capacity management process. Capacity management is a process that has many active connections throughout the organization and its processes. It is important that the triggers, inputs, outputs, and interfaces be clearly defined to avoid duplicated effort or gaps in workflow.
There are many triggers that will initiate capacity management activities:
A number of sources of information are relevant to the capacity management process:
The outputs of capacity management are used within the process itself as well as by many other processes and other parts of the organization. The information is often reproduced in an electronic format as visual real-time displays of performance. The outputs are as follows:
As we have already explained, capacity management has strong connections across the service lifecycle with a number of other processes. The key interfaces are as follows:
The CMIS is used to provide the relevant capacity and performance information to produce reports and support the capacity management process. The reports provide information to a number of IT and service management processes. These should include the reports described in the following sections.
There is likely to be a team of technical staff responsible for each component, and they should be in charge of their control and management. Reports must be produced to illustrate how components are performing and how much of their maximum capacity is being used.
Service-based reports will provide the basis of SLM and customer service reports. Reports and information must be produced to illustrate how the service and its constituent components are performing with respect to their overall service targets and constraints.
Exception reports can be used to show management and technical staff when the capacity and performance of a particular component or service becomes unacceptable. Thresholds can be set for any component, service, or measurement within the CMIS. An example threshold may be that processor utilization for a particular server has breached 70 percent for three consecutive hours or that the concurrent number of logged-in users exceeds the agreed limit.
In particular, exception reports are of interest to the SLM process in determining whether the targets in SLAs have been breached. Also, the incident and problem management processes may be able to use the exception reports in the resolution of incidents and problems. Excess capacity should also be identified. Unused capacity may represent an opportunity for cost savings.
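The threshold example above (processor utilization above 70 percent for three consecutive hours) can be sketched in a few lines of Python. This is a minimal illustration, not a feature of any particular monitoring tool: the function name `find_threshold_breaches`, the `ExceptionEvent` record, and the sample readings are all hypothetical, and hourly sampling is assumed.

```python
from dataclasses import dataclass


@dataclass
class ExceptionEvent:
    start_hour: int  # index of the first hourly sample in the breach run
    length: int      # number of consecutive samples above the threshold


def find_threshold_breaches(hourly_utilization, threshold=70.0, min_consecutive=3):
    """Return every run of at least min_consecutive samples above threshold."""
    events = []
    run_start = None
    for hour, value in enumerate(hourly_utilization):
        if value > threshold:
            if run_start is None:
                run_start = hour  # a new run of high readings begins here
        else:
            if run_start is not None and hour - run_start >= min_consecutive:
                events.append(ExceptionEvent(run_start, hour - run_start))
            run_start = None
    # Close a run that is still open when the data ends.
    if run_start is not None and len(hourly_utilization) - run_start >= min_consecutive:
        events.append(ExceptionEvent(run_start, len(hourly_utilization) - run_start))
    return events


# Hypothetical hourly CPU utilization readings (percent) for one server.
samples = [55, 72, 75, 71, 60, 80, 82, 50, 90, 91, 93, 95]
breaches = find_threshold_breaches(samples)
for event in breaches:
    print(f"CPU above 70% for {event.length} hours starting at hour {event.start_hour}")
```

The two-hour spike in the middle of the sample data is deliberately ignored, because it does not last long enough to meet the threshold definition. The same function could flag excess capacity by inverting the comparison, for example by reporting long runs below a low-utilization threshold.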
Part of the capacity management process is to predict future workloads and growth. To do this, future component and service capacity and performance must be forecast. This can be done in a variety of ways, depending on the techniques and the technology used. A simple example of a capacity forecast is a correlation between a business driver and component utilization. If the forecasts on future capacity requirements identify a requirement for increased resource, this requirement needs to be input into the capacity plan and included within the IT budget cycle.
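As a rough sketch of the simple forecast described above, the following Python fits a straight line between a business driver and component utilization using ordinary least squares. The driver (orders per day), the utilization figures, the 70 percent planning threshold, and the function name `fit_line` are illustrative assumptions; real forecasting would usually need more data and a check that a linear relationship actually holds.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: business driver versus measured CPU utilization.
orders_per_day = [1000, 2000, 3000, 4000]
cpu_percent = [20.0, 35.0, 50.0, 65.0]

slope, intercept = fit_line(orders_per_day, cpu_percent)

# Forecast utilization for a business volume taken from the business plan.
planned_orders = 6000
forecast = slope * planned_orders + intercept

PLANNING_THRESHOLD = 70.0  # assumed planning limit, in percent
needs_more_capacity = forecast > PLANNING_THRESHOLD
print(f"Forecast CPU at {planned_orders} orders/day: {forecast:.1f}%")
print("Add to capacity plan and budget cycle:", needs_more_capacity)
```

When the forecast exceeds the planning threshold, the result is the kind of requirement that would be entered into the capacity plan and raised in the IT budget cycle.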
Often capacity reports are consolidated and stored on an intranet site so that anyone can access and refer to them.
The following list includes some sample critical success factors for capacity management and some key performance indicators for each.
One of the major challenges facing capacity management is persuading the business to provide information on its strategic business plans. Without this information, the IT service provider will find it difficult to provide effective business capacity management. Where commercial or confidentiality reasons prevent this data from being shared, the task becomes even more challenging for the service provider.
Another challenge is the combination of all of the component capacity management data into an integrated set of information that can be analyzed in a consistent manner. This is particularly challenging when the information from the different technologies is provided by different tools in differing formats.
The amount of information produced by business capacity management, and especially service capacity management and component capacity management, is huge, and the analysis of this information is often difficult to achieve.
It is important that the people and the processes focus on the key resources and their usage, without ignoring other areas. For this to be done, appropriate thresholds must be used, and reliance must be placed on tools and technology to automatically manage the technology and provide warnings and alerts when things deviate significantly from the norm.
The following list includes some of the major risks associated with capacity management:
It is a fact that a service delivers value only when it is available for use. In addition to the activities carried out under the availability management process, there is a requirement for the IT service provider to ensure that the service is protected from catastrophic events that could prevent it from being delivered at all. Where these cannot be avoided, there is a requirement to have a plan to recover from any such disruption in a timescale and at a cost that meets the business requirement. Ensuring IT service continuity is an essential element of the warranty of the service.
It is important to understand that IT service continuity management (ITSCM) is responsible for the continuity of the IT services required by the business. The business should have a business continuity plan to ensure that any potential situations that would impact the ability of the business to function are identified and avoided. Where it is not possible to avoid such an event, the business continuity management process should have a plan, which is appropriate and affordable, to both minimize its impact and recover from it. Thus, ITSCM can be seen as one of a number of elements supporting a business continuity management (BCM) process, along with a human resources continuity plan, a financial management continuity plan, a building management continuity plan, and so on.
The purpose of the IT service continuity management process is to support the overall business continuity management (BCM) process. It is not a replacement for business continuity, even though many organizations could not survive without their IT service provider. It is important that this process understands the business continuity requirements. The service provider can then support these requirements by ensuring that, through managing the risks that could seriously affect IT services, the IT service provider can always provide the minimum agreed business continuity-related service levels.
To support and align with the BCM process, ITSCM uses formal risk assessment and management techniques to reduce risks to IT services to agreed acceptable levels. The service provider will plan and prepare for the recovery of IT services to meet these agreed levels.
A key objective of IT service continuity management is to produce and maintain a set of IT service continuity plans that support the overall business continuity plans of the organization. This requires completing regular business impact analysis exercises to ensure that all continuity plans are maintained in line with changing business impacts and requirements.
A further objective is to conduct regular risk assessment and management exercises to manage IT services within an agreed level of business risk. This should be completed in conjunction with the business and the availability management and information security management processes.
As with all the service design processes, this process has an objective to provide advice and guidance to all other areas of the business and IT on all continuity-related issues.
IT service continuity should also ensure that appropriate continuity mechanisms are put in place to meet or exceed the agreed business continuity targets. This will require the assessment of the impact of all changes on the IT service continuity plans and supporting methods and procedures.
Working with availability management, the process should ensure that cost-justifiable proactive measures to improve the availability of services are implemented.
Service continuity management should also negotiate and agree on contracts with suppliers for the provision of the necessary recovery capability to support all continuity plans in conjunction with the supplier management process.
When we consider the scope of IT service continuity management, it is important to understand that the process focuses on events that the business considers significant enough to be treated as a disaster. Less significant events will be dealt with as part of the incident management process.
Each organization will have its own understanding of what constitutes a disaster. The scope of IT service continuity management within an organization is determined by the organizational structure, culture, and strategic direction (both business and technology) in terms of the services provided and how these develop and change over time.
IT service continuity management first considers the IT assets and configurations that support the business processes. The process is not normally concerned with longer-term risks such as those from changes in business direction or other business-related alterations. Similarly, it does not usually cover minor technical faults (for example, noncritical disk failure) unless there is a possibility that the impact on the business could be major.
The IT service continuity management process includes the agreement of the scope of the ITSCM process and the policies adopted to support the business requirements. The process will also carry out business impact analysis to quantify the impact that the loss of IT service would have on the business.
It is important to establish the likelihood of potential threats taking place by carrying out risk assessment and management. This also includes taking measures to manage the identified threats where the cost can be justified. The approach to managing these threats will form the core of the ITSCM strategy and plans.
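One common way to reason about whether a risk reduction measure is cost justified is to compare its cost with the reduction in expected loss it brings. The Python sketch below illustrates that idea only; it is not a method prescribed by ITIL. The `Threat` class, its field names, and all figures are hypothetical, and in practice the impact values would come from the business impact analysis.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    annual_likelihood: float  # assumed probability of occurrence per year
    impact_cost: float        # assumed business loss per occurrence (from the BIA)
    mitigation_cost: float    # assumed annual cost of the risk reduction measure
    likelihood_after: float   # assumed likelihood once the measure is in place

    def expected_loss(self, likelihood):
        """Expected annual loss for a given likelihood of occurrence."""
        return likelihood * self.impact_cost

    def mitigation_justified(self):
        """True if the measure saves more expected loss than it costs."""
        saving = (self.expected_loss(self.annual_likelihood)
                  - self.expected_loss(self.likelihood_after))
        return saving > self.mitigation_cost


# Hypothetical threats with illustrative figures.
threats = [
    Threat("Data center power loss", 0.10, 500_000, 20_000, 0.02),
    Threat("Regional network outage", 0.01, 100_000, 5_000, 0.005),
]

decisions = {t.name: t.mitigation_justified() for t in threats}
for name, justified in decisions.items():
    print(f"{name}: measure cost justified = {justified}")
```

A calculation like this supports the discussion with the business but does not replace it, since some impacts (reputation, regulatory exposure) are hard to express as a single cost figure.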
Essential to the process is the production of an overall IT service continuity management strategy that must be integrated into the business continuity management strategy. This should be produced by using both risk assessment and management and business impact analysis. The strategy should include cost-justifiable risk reduction measures as well as selection of appropriate and comprehensive recovery options.
As part of the strategy, there should be the requirement to produce an IT service continuity plan, which should integrate with the business continuity plan. These plans should be tested and managed as part of the ongoing operation. This will require regular testing and maintenance to ensure that they are in alignment with business continuity management.
IT service continuity management is a vital part of the assurance and management of IT service provision for an organization because it supports the business continuity process. It can often be used to provide the justification for business continuity processes and plans by raising awareness of the impact of failures to the organization.
The process should be driven by business risk as identified by business continuity and ensure that the recovery arrangements for IT services are aligned to identified business impacts, risks, and needs.
IT service continuity management is a repeating, cyclic process. As the needs of the organization change, so will the requirements for continuity and recovery, so the process must be continually reviewed and the output verified for effectiveness. The process is shown in Figure 14.4.
Initiation. The process is structured in four stages. The first is initiation, where the policies and scope of the continuity requirement are established in alignment with the business continuity requirements. It is during this stage that, if the scope requires it, a project management approach will be adopted.
Requirements and Strategy. In the next stage, requirements and strategy, the activities of business impact analysis and risk assessment and management are carried out. This will allow the strategy for continuity to be developed.
Implementation. Implementation of the strategy requires the development of the IT service continuity plans, including the recovery plans and procedures. This stage is where the risk reduction measures are implemented and the initial testing of the plans is carried out. Once these are found to be successful in supporting the business requirements, the operational stage begins.
Ongoing Operation. During the operational stage, it will be important to ensure that there is adequate information delivered to the organization, through education, awareness, and training. The plans should be regularly reviewed and audited to ensure that they meet the ongoing requirements of the business. This will require an association with the change management process, and the plans and procedures for continuity should be subject to change procedures. Regular testing is part of this stage, and the results of testing will be fed back into the process.
Invocation. It is important to ensure that there is a clearly understood mechanism and definition of when to invoke the continuity plans. This is not a stage of the process, as such, but it is a vital part of the process, because the establishment of the trigger for implementing the continuity plan is very important.
We will now review the triggers, inputs, and outputs of IT service continuity management.
Many events may trigger IT service continuity management activity, including new or changed business needs, new or changed services, and new or changed targets within agreements, such as service level requirements, service level agreements, operational level agreements, and contracts.
Major incidents that require assessment for potential invocation of either business or IT continuity plans are another trigger for the process, as are periodic activities such as the business impact analysis and risk assessment activities; maintenance of continuity plans; and other reviewing, revising, or reporting activities.
Assessment of changes and attendance at change advisory board meetings should be a part of the process scope, because it is here that there will be opportunity to review and revise business and IT plans and strategies in light of altering business needs, which may trigger changes to the process output. This will include the review and revision of designs and strategies, both for the business and IT service provider.
Other triggers will include the recognition or notification of a change in the risk or impact of a business process or vital business function, an IT service, or a component. The results of testing the plans and lessons learned from previous continuity events will also provide triggers for the process.
There are many sources of input required by the ITSCM process:
The outputs from the ITSCM process are as follows:
IT service continuity should have interfaces to all other processes across the whole service lifecycle.
Important examples are as follows:
ITSCM needs to record all of the information necessary to maintain a comprehensive set of ITSCM plans. This information base should include the following items:
All the preceding information needs to be integrated and aligned with all BCM information and all the other information required by ITSCM. Interfaces to many other processes are required to ensure that this alignment is maintained.
The following list includes some sample critical success factors for ITSCM.
We’ll begin by looking at the key challenges for the process and then look at the risks.
A major challenge facing ITSCM is to provide appropriate plans when there is no BCM process. If there is no BCM process, then IT is likely to adopt the wrong continuity strategies and options and make incorrect assumptions about business criticality of business processes. Also, if BCM is absent, then the business may fail to identify inexpensive non-IT solutions and waste money on ineffective, expensive IT solutions.
In some organizations, the perception is that continuity is an IT responsibility, and the business assumes that IT will be responsible for disaster recovery and that IT services will continue to run under any circumstances.
The challenge, if there is a BCM process established, becomes one of alignment and integration. Following that, the challenge becomes one of keeping the ITSCM process and BCM process aligned by management and by controlling business and IT change. All documents and plans should be maintained under the strict control of change management and service asset and configuration management.
The following are among the major risks associated with ITSCM:
This chapter explored the next two processes in the service design stage, capacity management and IT service continuity management. It covered the purpose and objectives for each process in addition to the scope.
We looked at the value of the processes. Then we reviewed the policies for each process and the activities, methods, and techniques.
Last, we reviewed triggers, inputs, outputs, and interfaces for each process and the information management associated with it. We also considered the critical success factors and key performance indicators and the challenges and risks for the processes.
We examined how each of these processes supports the other and the importance of these processes to the business and the IT service provider.
Understand the purpose and objectives of capacity management and IT service continuity management. It is important for you to be able to explain the purpose and objectives of the capacity management and IT service continuity management processes.
Capacity management is concerned with the current and future capacity of services to the business.
IT service continuity management should ensure that the IT service continuity plans required to support the business continuity plans are delivered to meet the business needs.
Understand the iterative activities of capacity management. Capacity management has both proactive and reactive activities. These include monitoring, tuning, and analysis, which may be carried out as part of a proactive or reactive approach.
Understand the subprocesses of capacity management. Business capacity management is concerned with the business requirements and understanding business needs.
Service capacity management is concerned with the capacity of services to fulfil the needs of the business.
Component capacity management is concerned with the technical aspect of capacity management and the capacity of individual service components.
Explain and differentiate between the different stages of IT service continuity management. Initiation is the start of the process and the trigger received from business continuity management.
Requirements and strategy is where a clear understanding of the business requirements is developed and the continuity strategy is defined.
Implementation is where the decisions in the strategy are realized.
Ongoing operation is where the continuity plans are managed as part of the ongoing operation of the services.
Understand the critical success factors and key performance indicators for the processes. Measurement of the processes is an important part of understanding their success. You should be familiar with the CSFs and KPIs for both capacity management and IT service continuity management.
Review Questions
You can find the answers to the review questions in the appendix.
Which of the following are responsibilities of capacity management?
Capacity management includes three subprocesses. What are they?
Which of the following are responsibilities of IT service continuity management?
IT service continuity management carries out a BIA in conjunction with the business. What does BIA stand for?
Which of the following statements about IT service continuity management (ITSCM) is TRUE?
Match each subprocess (a, b, and c) to its definition (1, 2, and 3).
True or False? Capacity management has both reactive and proactive activities.
Which of these are KPIs relating to IT service continuity management?
Which of these statements is/are correct?
In which stages of the IT service continuity lifecycle does testing take place?