Chapter 14
Service Design Processes: Capacity Management and IT Service Continuity Management

THE FOLLOWING ITIL INTERMEDIATE EXAM OBJECTIVES ARE DISCUSSED IN THIS CHAPTER:

  • Capacity management and IT service continuity management are discussed in terms of
    • Purpose
    • Objectives
    • Scope
    • Value
    • Policies
    • Principles and basic concepts
    • Process activities, methods, and techniques
    • Triggers, inputs, outputs, and interfaces
    • Information management
    • Critical success factors and key performance indicators
    • Challenges
    • Risks

The ITIL service design publication covers the managerial and supervisory aspects of service design processes. It excludes the day-to-day operation of each process, the details of the process activities, methods, and techniques, and its information management. More detailed process operation guidance is covered in the service capability courses. Each process is considered from the management perspective. That means that at the end of this chapter, you should know enough about each process and its interfaces to oversee its implementation and judge its effectiveness and efficiency.

Capacity Management

ITIL states that capacity management is responsible for ensuring that the capacity of IT services and the IT infrastructure is able to meet agreed current and future capacity and performance needs in a cost-effective and timely manner. The capacity management process must therefore understand the likely changes in capacity requirements and ensure that the design and ongoing management of a service meet this demand. Sufficient capacity is a key warranty aspect of a service that needs to be delivered if the benefits of the service are to be realized.

Capacity management is considered throughout the lifecycle; as part of strategy, the likely capacity requirements for a new service are considered as part of the service evaluation to ensure that the service is meeting a real need. In design, the service is engineered to cope with that demand and to be flexible enough to be able to adjust to meet changing capacity requirements. Transition ensures that the service, when implemented, is delivering according to its specification. The operational phase of the lifecycle ensures that day-to-day adjustments that are necessary to meet changes in requirements are implemented. Finally, as part of continual service improvement, capacity-related issues are addressed and adjustments are made to ensure that the most cost-effective and reliable delivery of the service is achieved.

Purpose of Capacity Management

The purpose of the capacity management process is to understand the current and future capacity needs of a service and to ensure that the service and its supporting services are able to deliver to this level. The actual capacity requirements will have been agreed upon as part of service level management; capacity management must not only meet these, but also ensure that the future needs of the business, which may change over time, are met.

Objectives of Capacity Management

The objectives of capacity management are met by the development of a detailed plan that states the current business requirement, the expected future requirement, and the actions that will be taken to meet these requirements. This plan should be reviewed and updated at regular intervals (at least annually) to ensure that changes in business requirements are considered. Similarly, any requests to change the current configuration will be considered by capacity management to ensure that they are in line with expectations or, if not, that the capacity plan is amended to suit the changed requirement. Those responsible for capacity management will review any issues that arise and help resolve any incidents or problems that are the result of insufficient capacity. This helps ensure that the service meets its objectives.

An essential objective is to make sure capacity is increased in a timely manner so the business is not impacted.

As part of the ongoing management of capacity and its continual improvement, any proactive measures that may improve performance at a reasonable cost are identified and acted upon. Advice and guidance on capacity and performance-related issues are provided, and assistance is given to service operations with performance- and capacity-related incidents and problems.

Scope of Capacity Management

The capacity management process has responsibility for ensuring sufficient capacity at all times, including both planning for short-term fluctuations, such as those caused by seasonal variations, and ensuring that the required capacity is there for longer-term business expansion. Changes in demand may sometimes actually be reductions in that demand, and this is also within the scope of the process. As demand for a service falls, capacity management should reduce the capacity provided for that service so that unnecessary expenditure is avoided.

The process includes all aspects of service provision and therefore may involve the technical, applications, and operations management functions. Other aspects of capacity, such as staff resources, are also considered.

As the example in the case study “Capacity Management” illustrates, an increase in capacity requirements may have repercussions across the infrastructure and on the IT staff resources required to manage it. Although staffing is a line management responsibility, the calculation of resource requirements in this area is also part of the overall capacity management process.

Capacity management also involves monitoring “patterns of business activity” to understand how well the infrastructure is meeting the demands upon it and making adjustments as required to ensure that the demand is met. Proactive improvements to capacity may also be implemented, where justified, and any incidents caused by capacity issues need to be investigated.

Capacity management may also recommend demand management techniques to smooth out excessive peaks in demand.

Capacity Management Value to the Business

Capacity management provides value to the business by improving the performance and availability of IT services the business needs; it does so by helping to reduce capacity- and performance-related incidents and problems. The process will also ensure that the required capacity and performance are provided in the most cost-effective manner.

All processes should contribute in some way to the achievement of customer satisfaction, and capacity management does this by ensuring that all capacity- and performance-related service levels are met.

Proactive capacity management activities will support the efficient and effective design and transition of new or changed services. This will include the production of a forward-looking capacity plan based on a sound understanding of business needs and plans.

As with availability management, capacity management will have the opportunity to improve the ability of the business to follow an environmentally responsible strategy by using green technologies and techniques.

Capacity Management Policies

Capacity management is essentially a balancing act. It ensures that the capacity and performance of the IT services and systems match the evolving demands of the business in the most cost-effective and timely manner. This requires balancing the costs against the resources needed. Capacity management needs to ensure that the processing capacity that is purchased is cost justifiable in terms of business need. It ensures that the organization makes the most efficient use of those resources.

Capacity management is also about balancing supply against demand. It is important to ensure that the available supply of IT processing power matches the demands made on it by the business, both now and in the future. It may also be necessary to manage or influence the demand for a particular resource.

The policies for capacity management should reflect the need for capacity management to play a significant role across the service lifecycle.

It is important to ensure that capacity management is part of the consideration for all service level and operational level agreements, and of course any supporting contracts with suppliers. These agreements will capture the service requirements of the business, and capacity management should consider these for the current and future business needs.

Capacity Management Process Activities, Methods, and Techniques

We are not going to explore the process in detail, but you should make sure you are familiar with all the aspects of the process and the management requirements for each.

In Figure 14.1, you can see the full scope of the subprocesses, techniques, and activities for the capacity management process.

Diagram shows capacity management sub processes of business, service, and components lead to production of the capacity plan. CMIS includes capacity plan and storage of capacity management data.

Figure 14.1 Capacity management subprocesses

Copyright © AXELOS Limited 2010. All rights reserved. Material is reproduced under license from AXELOS.

Capacity management is an extremely technical, complex, and demanding process, and in order to achieve results, it requires three supporting subprocesses: business capacity management, service capacity management, and component capacity management.

Business capacity management focuses on the current and future business requirements. Service capacity management focuses on the delivery of the existing services that support the business, and component capacity management focuses on the IT infrastructure that underpins service provision.

It is important to ensure that the tools used by capacity management conform to the organization’s management architecture and also integrate with other tools used for the management of IT systems and automating IT processes.

The monitoring and control activities within service operation should provide a basis for the tools to support and analyze information for capacity management. The IT operations management function and the technical management departments (such as network management and server management) may carry out the bulk of the day-to-day operational duties. They will participate in the capacity management process by providing it with performance information.

Like availability management, capacity management has both reactive and proactive activities. In Figure 14.2, you can see the activities relating to both reactive and proactive capacity management and the interaction between the subprocesses.

Diagram shows the connection between business requirements, IT service design, business and service capacity management, capacity management tools, capacity and performance reports, forecasts, and capacity plan.

Figure 14.2 Capacity management overview with subprocesses

Copyright © AXELOS Limited 2010. All rights reserved. Material is reproduced under license from AXELOS.

Capacity management should include the following proactive activities:

  • Preempting performance issues by taking the necessary actions before the issues occur
  • Producing trends of the current component utilization and using them to estimate the future requirements and for planning upgrades and enhancements
  • Modeling and trending the predicted changes in IT services (including service retirements)
  • Ensuring that upgrades are budgeted, planned, and implemented before service level agreements and service targets are breached or performance issues occur
  • Actively seeking to improve service performance wherever the cost is justifiable
  • Producing and maintaining a capacity plan addressing future requirements and plans for meeting them
  • Tuning (also known as optimizing) the performance of services and components
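The trending activity in the list above can be illustrated with a short sketch. It fits a straight line to equally spaced utilization samples and estimates how many periods remain before the trend reaches a threshold. The sample figures, the 80 percent threshold, and the function names are assumptions made for illustration; ITIL does not prescribe a particular trending technique.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    if len(xs) < 2:
        raise ValueError("at least two samples are needed to fit a trend")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_threshold(utilization, threshold):
    """Estimate how many periods after the last sample the linear trend
    reaches the threshold. Returns None if the trend is flat or falling,
    and 0.0 if the trend line is already at or above the threshold."""
    periods = list(range(len(utilization)))
    slope, intercept = fit_line(periods, utilization)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - periods[-1])


# Hypothetical monthly average CPU utilization (percent) for one server.
monthly_cpu = [52.0, 54.0, 56.0, 58.0, 60.0, 62.0]
print(periods_until_threshold(monthly_cpu, threshold=80.0))
```

A straight-line trend is the simplest possible model. Seasonal workloads or step changes caused by new services would need a richer technique, which is why trending is normally combined with modeling and business input.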

Capacity management should include the following reactive activities:

  • Monitoring, measuring, reporting, and reviewing the current performance of both services and components
  • Responding to all capacity-related “threshold” events and instigating corrective action
  • Reacting to and assisting with specific performance issues

There are a number of ongoing activities that form part of the capacity management process. These activities provide the basic historical information and triggers necessary for all of the other activities and processes within capacity management. These can be seen in Figure 14.3.

Diagram shows cyclic activities such as analysis of reports, tuning, implementation, monitoring of CMIS, resource and service thresholds, followed by framing service and resource utilization exception reports.

Figure 14.3 Ongoing iterative activities of capacity management

Copyright © AXELOS Limited 2010. All rights reserved. Material is reproduced under license from AXELOS.

Monitoring should be established on all the components and for each of the services. The data from the monitoring systems should be analyzed using expert systems to compare usage levels against thresholds. The results of the analysis should be included in reports and used to make recommendations for management of the systems. Control mechanisms should then be put in place to act on the recommendations.

There are many different approaches to managing capacity, including balancing services, balancing workloads, changing concurrency levels, and adding or removing resources. The information accumulated during these activities should be stored in the capacity management information system (CMIS).

This is a cyclic activity, and any changes should be monitored to make sure they deliver a positive benefit. These iterative activities are primarily performed as part of the service operation stage of the service lifecycle.

Capacity Management Triggers, Inputs, and Outputs

Let’s consider the triggers, inputs, and outputs for the capacity management process. Capacity management is a process that has many active connections throughout the organization and its processes. It is important that the triggers, inputs, outputs, and interfaces be clearly defined to avoid duplicated effort or gaps in workflow.

Triggers

There are many triggers that will initiate capacity management activities:

  • New and changed services requiring additional capacity
  • Service breaches, capacity or performance events, and alerts, including threshold events
  • Exception reports
  • Periodic revision of current capacity and performance and the review of forecasts, reports, and plans
  • Periodic trending and modeling
  • Review and revision of business and IT plans and strategies
  • Review and revision of designs and strategies
  • Review and revision of service level agreements, operational level agreements, contracts, or any other agreements
  • Requests from service level management for assistance with capacity and/or performance targets and explanation of achievements

Inputs

A number of sources of information are relevant to the capacity management process:

  • Business information
  • Service and IT information
  • Component performance and capacity information
  • Service performance issue information
  • Service information
  • Financial information
  • Change information
  • Performance information
  • CMS
  • Workload information

Outputs

The outputs of capacity management are used within the process itself as well as by many other processes and other parts of the organization. The information is often reproduced in an electronic format as visual real-time displays of performance. The outputs are as follows:

  • The capacity management information system
  • The capacity plan
  • Service performance information and reports
  • Workload analysis and reports
  • Ad hoc capacity and performance reports
  • Forecasts and predictive reports
  • Thresholds, alerts, and events
  • Improvement actions

Capacity Management Interfaces

As we have already explained, capacity management has strong connections across the service lifecycle with a number of other processes. The key interfaces are as follows:

  • Availability management works with capacity management to determine the resources needed to ensure the required availability of services and components.
  • Capacity management assists service level management with determining capacity targets and with the investigation and resolution of breaches related to service and component capacity.
  • IT service continuity management is supported by capacity management through the assessment of business impact and risk, determining the capacity needed to support risk reduction measures and recovery options.
  • Capacity management provides assistance with incident and problem management for the resolution and correction of capacity-related incidents and problems.
  • By anticipating the demand for services based on user profiles and patterns of business activity, and by identifying the means to influence that demand, demand management provides strategic decision-making and critical related data on which capacity management can act.

Information Management and Capacity Management

The CMIS is used to provide the relevant capacity and performance information to produce reports and support the capacity management process. The reports provide information to a number of IT and service management processes. These should include the reports described in the following sections.

Component-Based Reports

There is likely to be a team of technical staff responsible for each component, and that team should be in charge of the component’s control and management. Reports must be produced to illustrate how components are performing and how much of their maximum capacity is being used.

Service-Based Reports

Service-based reports will provide the basis of SLM and customer service reports. Reports and information must be produced to illustrate how the service and its constituent components are performing with respect to their overall service targets and constraints.

Exception Reports

Exception reports can be used to show management and technical staff when the capacity or performance of a particular component or service becomes unacceptable. Thresholds can be set for any component, service, or measurement within the CMIS. An example threshold may be that processor utilization for a particular server has exceeded 70 percent for three consecutive hours or that the concurrent number of logged-in users exceeds the agreed limit.
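The processor utilization example can be sketched in a few lines of code. The function below scans hourly samples and reports every run in which utilization stays above a threshold for at least a minimum number of consecutive samples. The function name, the sample data, and the choice of one sample per hour are assumptions made for illustration, not part of any particular monitoring tool.

```python
def sustained_breaches(samples, threshold=70.0, min_consecutive=3):
    """Return a list of (start_index, length) tuples, one for each run of
    consecutive samples above the threshold that lasts at least
    min_consecutive samples."""
    runs = []
    run_start = None
    for index, value in enumerate(samples):
        if value > threshold:
            if run_start is None:
                run_start = index  # a new run above the threshold begins
        else:
            if run_start is not None and index - run_start >= min_consecutive:
                runs.append((run_start, index - run_start))
            run_start = None
    # Handle a run that is still in progress at the end of the data.
    if run_start is not None and len(samples) - run_start >= min_consecutive:
        runs.append((run_start, len(samples) - run_start))
    return runs


# Hypothetical hourly CPU utilization (percent) for a single server.
hourly_cpu = [55, 72, 75, 68, 71, 74, 78, 81, 60]
for start, length in sustained_breaches(hourly_cpu):
    print(f"Exception: above 70% for {length} hours starting at hour {start}")
```

Requiring several consecutive breaches, rather than alerting on a single sample, filters out short spikes that do not indicate a real capacity problem.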

In particular, exception reports are of interest to the SLM process in determining whether the targets in SLAs have been breached. Also, the incident and problem management processes may be able to use the exception reports in the resolution of incidents and problems. Excess capacity should also be identified. Unused capacity may represent an opportunity for cost savings.

Predictive and Forecast Reports

Part of the capacity management process is to predict future workloads and growth. To do this, future component and service capacity and performance must be forecast. This can be done in a variety of ways, depending on the techniques and the technology used. A simple example of a capacity forecast is a correlation between a business driver and component utilization. If the forecasts on future capacity requirements identify a requirement for increased resource, this requirement needs to be input into the capacity plan and included within the IT budget cycle.
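As a minimal sketch of the correlation mentioned above, the following code fits a straight line between a business driver and component utilization and then predicts utilization for a planned business volume. The driver (thousands of orders per day), the figures, and the function names are hypothetical, and a simple linear relationship is an assumption that should be checked against real data.

```python
def fit_driver_model(driver_values, utilization_values):
    """Least-squares fit of utilization against a business driver.
    Returns (slope, intercept)."""
    if len(driver_values) < 2:
        raise ValueError("at least two observations are needed")
    n = len(driver_values)
    mean_d = sum(driver_values) / n
    mean_u = sum(utilization_values) / n
    sdd = sum((d - mean_d) ** 2 for d in driver_values)
    sdu = sum((d - mean_d) * (u - mean_u)
              for d, u in zip(driver_values, utilization_values))
    slope = sdu / sdd
    return slope, mean_u - slope * mean_d


def predict_utilization(model, planned_driver_value):
    """Apply the fitted (slope, intercept) model to a planned volume."""
    slope, intercept = model
    return slope * planned_driver_value + intercept


# Hypothetical history: thousands of orders per day vs. CPU utilization (%).
orders_k = [10, 20, 30, 40]
cpu_pct = [25.0, 40.0, 55.0, 70.0]

model = fit_driver_model(orders_k, cpu_pct)
forecast = predict_utilization(model, planned_driver_value=50)
print(f"Forecast CPU utilization at 50k orders/day: {forecast:.1f}%")
```

If the forecast exceeds an agreed threshold, the additional resource requirement would be recorded in the capacity plan and fed into the IT budget cycle.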

Often capacity reports are consolidated and stored on an intranet site so that anyone can access and refer to them.

Measures, Metrics, and Critical Success Factors for Capacity Management

The following list includes some sample critical success factors for capacity management and some key performance indicators for each.

  • Critical success factor: “Accurate business forecasts.”
    • KPI: Production of workload forecasts on time
    • KPI: Accuracy (measured as a percentage) of forecasts of business trends
  • Critical success factor: “Knowledge of current and future technologies.”
    • KPI: Timely justification and implementation of new technology in line with business requirements (time, cost, and functionality)
    • KPI: Reduction in the use of old technology that causes SLA breaches through problems with support or performance
  • Critical success factor: “Ability to demonstrate cost effectiveness.”
    • KPI: Reduction in last-minute buying to address urgent performance issues
    • KPI: Reduction in the overcapacity of IT
  • Critical success factor: “Ability to plan and implement the appropriate IT capacity to match business needs.”
    • KPI: Reduction (measured as a percentage) in the number of incidents due to poor performance
    • KPI: Reduction (measured as a percentage) in lost business due to inadequate capacity
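Several of these KPIs are expressed as percentages. The sketch below shows one plausible way to calculate two of them: forecast accuracy and the reduction in incidents between two reporting periods. The formulas and function names are assumptions made for illustration; the way each KPI is measured should be agreed within your own organization.

```python
def forecast_accuracy_pct(forecast, actual):
    """Accuracy as 100 minus the absolute percentage error, floored at 0."""
    if actual == 0:
        raise ValueError("actual value must be nonzero")
    error_pct = abs(forecast - actual) / abs(actual) * 100.0
    return max(0.0, 100.0 - error_pct)


def reduction_pct(previous, current):
    """Percentage reduction from the previous period to the current one.
    A negative result means the figure has increased."""
    if previous == 0:
        raise ValueError("previous value must be nonzero")
    return (previous - current) / previous * 100.0


# Hypothetical figures: a workload forecast of 950 units against an actual
# of 1,000, and performance-related incidents falling from 40 to 30.
print(f"Forecast accuracy: {forecast_accuracy_pct(950, 1000):.1f}%")
print(f"Incident reduction: {reduction_pct(40, 30):.1f}%")
```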

Challenges for Capacity Management

One of the major challenges facing capacity management is persuading the business to provide information on its strategic business plans. Without this information, the IT service provider will find it difficult to provide effective business capacity management. Where commercial or confidentiality reasons prevent this data from being shared, the task becomes even more challenging for the service provider.

Another challenge is the combination of all of the component capacity management data into an integrated set of information that can be analyzed in a consistent manner. This is particularly challenging when the information from the different technologies is provided by different tools in differing formats.

The amount of information produced by business capacity management, and especially service capacity management and component capacity management, is huge, and the analysis of this information is often difficult to achieve.

It is important that the people and the processes focus on the key resources and their usage, without ignoring other areas. For this to be done, appropriate thresholds must be used, and reliance must be placed on tools and technology to automatically manage the technology and provide warnings and alerts when things deviate significantly from the norm.

Risks for Capacity Management

The following list includes some of the major risks associated with capacity management:

  • A lack of commitment from the business to the capacity management process.
  • A lack of appropriate information from the business on future plans and strategies.
  • A lack of senior management commitment to, or a lack of resources and/or budget for, the capacity management process.
  • Service capacity management and component capacity management performed in isolation because business capacity management is difficult or there is a lack of appropriate and accurate business information.
  • The processes become too bureaucratic or manually intensive.
  • The processes focus too much on the technology (component capacity management) and not enough on the services (service capacity management) and the business (business capacity management).
  • The reports and information provided are too technical and do not give the information required by or appropriate for the customers and the business.

IT Service Continuity Management

It is a fact that a service delivers value only when it is available for use. In addition to the activities carried out under the availability management process, there is a requirement for the IT service provider to ensure that the service is protected from catastrophic events that could prevent it from being delivered at all. Where these cannot be avoided, there is a requirement to have a plan to recover from any such disruption in a timescale and at a cost that meets the business requirement. Ensuring IT service continuity is an essential element of the warranty of the service.

It is important to understand that IT service continuity management (ITSCM) is responsible for the continuity of the IT services required by the business. The business should have a business continuity plan to ensure that any potential situations that would impact the ability of the business to function are identified and avoided. Where it is not possible to avoid such an event, the business continuity management process should have a plan, which is appropriate and affordable, to both minimize its impact and recover from it. Thus, ITSCM can be seen as one of a number of elements supporting a business continuity management (BCM) process, along with a human resources continuity plan, a financial management continuity plan, a building management continuity plan, and so on.

Purpose of IT Service Continuity Management

The purpose of the IT service continuity management process is to support the overall business continuity management (BCM) process. It is not a replacement for business continuity, even though many organizations could not survive without their IT service provider. It is important that this process understands the business continuity requirements. By managing the risks that could seriously affect IT services, the IT service provider can then ensure that it can always provide the minimum agreed business continuity-related service levels.

To support and align with the BCM process, ITSCM uses formal risk assessment and management techniques to reduce risks to IT services to agreed acceptable levels. The service provider will plan and prepare for the recovery of IT services to meet these agreed levels.

Objectives of IT Service Continuity Management

A key objective of IT service continuity management is to produce and maintain a set of IT service continuity plans that support the overall business continuity plans of the organization. This will require completing regular business impact analysis exercises to ensure that all continuity plans are maintained in line with changing business impacts and requirements.

A further objective is to conduct regular risk assessment and management exercises to manage IT services within an agreed level of business risk. This should be completed in conjunction with the business and the availability management and information security management processes.

As with all the service design processes, this process has an objective to provide advice and guidance to all other areas of the business and IT on all continuity-related issues.

IT service continuity should also ensure that appropriate continuity mechanisms are put in place to meet or exceed the agreed business continuity targets. This will require the assessment of the impact of all changes on the IT service continuity plans and supporting methods and procedures.

Working with availability management, the process should ensure that cost-justifiable proactive measures to improve the availability of services are implemented.

Service continuity management should also negotiate and agree on contracts with suppliers for the provision of the necessary recovery capability to support all continuity plans in conjunction with the supplier management process.

Scope of IT Service Continuity Management

When we consider the scope of IT service continuity management, it is important to understand that the process focuses on events that the business considers significant enough to be treated as a disaster. Less significant events will be dealt with as part of the incident management process.

Each organization will have its own understanding of what constitutes a disaster. The scope of IT service continuity management within an organization is determined by the organizational structure, culture, and strategic direction (both business and technology) in terms of the services provided and how these develop and change over time.

IT service continuity management first considers the IT assets and configurations that support the business processes. The process is not normally concerned with longer-term risks such as those from changes in business direction or other business-related alterations. Similarly, it does not usually cover minor technical faults (for example, noncritical disk failure) unless there is a possibility that the impact on the business could be major.

The IT service continuity management process includes the agreement of the scope of the ITSCM process and the policies adopted to support the business requirements. The process will also carry out business impact analysis to quantify the impact that the loss of IT service would have on the business.

It is important to establish the likelihood of potential threats taking place by carrying out risk assessment and management. This also includes taking measures to manage the identified threats where the cost can be justified. The approach to managing these threats will form the core of the ITSCM strategy and plans.
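One common way to compare threats during risk assessment is to score each one by likelihood and impact and then rank the results. The sketch below illustrates that idea. The 1-to-5 scales, the multiplication of the two scores, and the example threats are all assumptions made for illustration; your organization's risk assessment method may use different scales or a different formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str
    likelihood: int  # hypothetical scale: 1 (rare) to 5 (almost certain)
    impact: int      # hypothetical scale: 1 (minor) to 5 (severe)

    @property
    def score(self) -> int:
        """Simple risk score: likelihood multiplied by impact."""
        return self.likelihood * self.impact


def rank_threats(threats):
    """Return the threats ordered from highest to lowest risk score."""
    return sorted(threats, key=lambda threat: threat.score, reverse=True)


# Hypothetical threats to an IT service.
threats = [
    Threat("Data center power loss", likelihood=2, impact=5),
    Threat("Network link failure", likelihood=4, impact=3),
    Threat("Noncritical disk failure", likelihood=3, impact=1),
]

for threat in rank_threats(threats):
    print(f"{threat.score:>2}  {threat.name}")
```

A ranking like this only indicates where to look first. Whether a risk reduction measure is actually implemented still depends on whether its cost can be justified against the business impact.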

Essential to the process is the production of an overall IT service continuity management strategy that must be integrated into the business continuity management strategy. This should be produced by using both risk assessment and management and business impact analysis. The strategy should include cost-justifiable risk reduction measures as well as selection of appropriate and comprehensive recovery options.

As part of the strategy, there should be the requirement to produce an IT service continuity plan, which should integrate with the business continuity plan. These plans should be tested and managed as part of the ongoing operation. This will require regular testing and maintenance to ensure that they are in alignment with business continuity management.

IT Service Continuity Management Value to the Business

IT service continuity management is a vital part of the assurance and management of IT service provision for an organization because it supports the business continuity process. It can often be used to provide the justification for business continuity processes and plans by raising awareness of the impact of failures to the organization.

The process should be driven by business risk as identified by business continuity and ensure that the recovery arrangements for IT services are aligned to identified business impacts, risks, and needs.

IT Service Continuity Management Process, Methods, and Techniques

IT service continuity management is a repeating, cyclic process. As the needs of the organization change, so will the requirements for continuity and recovery, so the process must be continually reviewed and the output verified for effectiveness. The process is shown in Figure 14.4.

Diagram shows the interconnection between the business continuity management and IT service continuity management lifecycles and key activities during different stages of lifecycles.

Figure 14.4 Lifecycle of IT service continuity management

Copyright © AXELOS Limited 2010. All rights reserved. Material is reproduced under license from AXELOS.

Initiation The process is structured in four stages. The first is initiation, where the policies and scope of the continuity requirement are established in alignment with the business continuity requirements. It is during this stage that, if the scope requires it, a project management approach will be adopted.

Requirements and Strategy In the next stage, requirements and strategy, the activities of business impact analysis and risk assessment and management are carried out. This will allow the strategy for continuity to be developed.

Implementation Implementation of the strategy requires the development of the IT service continuity plans, including the recovery plans and procedures. This stage is where the risk reduction measures are implemented and the initial testing of the plans is carried out. Once these are found to be successful in supporting the business requirements, the operational stage begins.

Ongoing Operation During the operational stage, it will be important to ensure that there is adequate information delivered to the organization, through education, awareness, and training. The plans should be regularly reviewed and audited to ensure that they meet the ongoing requirements of the business. This will require an association with the change management process, and the plans and procedures for continuity should be subject to change procedures. Regular testing is part of this stage, and the results of testing will be fed back into the process.

Invocation It is important to ensure that there is a clearly understood mechanism for invoking the continuity plans and a clear definition of when to do so. Invocation is not a stage of the process as such, but it is a vital part of it, because everyone involved must know exactly what triggers the implementation of the continuity plan.
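
As an illustration only, documented invocation criteria can be written down as an explicit rule. The following Python sketch is not taken from ITIL. The names InvocationCriteria, max_tolerable_outage_hours, and should_invoke, and the figures used, are hypothetical, and the sketch assumes the organization has agreed on a maximum tolerable outage for each vital business function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvocationCriteria:
    """Hypothetical agreed trigger for one vital business function."""
    business_function: str
    max_tolerable_outage_hours: float  # assumed to be agreed with the business


def should_invoke(criteria: InvocationCriteria,
                  elapsed_outage_hours: float,
                  estimated_hours_to_restore: float) -> bool:
    """Return True when the expected total outage exceeds the agreed limit."""
    expected_total = elapsed_outage_hours + estimated_hours_to_restore
    return expected_total > criteria.max_tolerable_outage_hours


# Illustrative figures only.
payments = InvocationCriteria("payments", max_tolerable_outage_hours=4.0)
print(should_invoke(payments, elapsed_outage_hours=1.0, estimated_hours_to_restore=2.0))  # False
print(should_invoke(payments, elapsed_outage_hours=1.5, estimated_hours_to_restore=6.0))  # True
```

In practice the decision to invoke usually involves management judgment as well. The point of the sketch is only that the trigger is defined in advance rather than debated during a major incident.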

IT Service Continuity Management Triggers, Inputs, and Outputs

We will now review the triggers, inputs, and outputs of IT service continuity management.

Triggers

Many events may trigger IT service continuity management activity, including new or changed business needs, new or changed services, and new or changed targets within agreements, such as service level requirements, service level agreements, operational level agreements, and contracts.

Major incidents that require assessment for potential invocation of either business or IT continuity plans are another trigger for the process, as are periodic activities such as the business impact analysis and risk assessment activities; maintenance of continuity plans; and other reviewing, revising, or reporting activities.

Assessment of changes and attendance at change advisory board meetings should be a part of the process scope, because it is here that there will be opportunity to review and revise business and IT plans and strategies in light of altering business needs, which may trigger changes to the process output. This will include the review and revision of designs and strategies, both for the business and IT service provider.

Other triggers will include the recognition or notification of a change in the risk or impact of a business process or vital business function, an IT service, or a component. The results of testing the plans and lessons learned from previous continuity events will also provide triggers for the process.

Inputs

There are many sources of input required by the ITSCM process:

  • Business information from the organization’s business strategy, plans, and financial plans and information on its current and future requirements
  • IT information from the IT strategy and plans and current budgets
  • A business continuity strategy and a set of business continuity plans from all areas of the business
  • Service information from the SLM process, with details of the services from the service portfolio and the service catalog and service level targets within SLAs and SLRs
  • Financial information from financial management for IT services, the cost of service provision, and the cost of resources and components
  • Change information from the change management process, with a change schedule and an assessment of all changes for their impact on all ITSCM plans
  • A configuration management system (CMS) containing information on the relationships between the business, the services, the supporting services, and the technology
  • Business continuity management and availability management testing schedules
  • Capacity management information identifying the resources required to run the critical services in the event of a continuity event
  • IT service continuity plans and test reports from suppliers and partners, where appropriate

Outputs

The outputs from the ITSCM process are as follows:

  • A revised ITSCM policy and strategy
  • A set of ITSCM plans, including all crisis management plans, emergency response plans, and disaster recovery plans, together with a set of supporting plans and contracts with recovery service providers
  • Business impact analysis exercises and reports, in conjunction with business continuity management and the business
  • Risk assessment and management reviews and reports, in conjunction with the business, availability management, and information security management
  • An ITSCM testing schedule
  • ITSCM test scenarios
  • ITSCM test reports and reviews
  • Forecasts and predictive reports used by all areas to analyze, predict, and forecast particular business and IT scenarios and their potential solutions

IT Service Continuity Management Interfaces

IT service continuity management should have interfaces to all other processes across the whole service lifecycle.

Important examples are as follows:

  • Change management, because all changes need to be considered for their impact on the continuity plans. The plan itself must be under change management control.
  • Incident and problem management require clear, agreed-on, and documented criteria for the invocation of the ITSCM plans.
  • Availability management also undertakes risk assessment, so assessing risks and implementing risk responses should be closely coordinated with the availability management process to optimize risk mitigation.
  • Recovery requirements will be agreed and documented in the service level agreements. Different service levels that would be acceptable in a disaster situation could be agreed on and documented through the service level management process.
  • Capacity management should ensure that there are sufficient resources to enable recovery to replacement systems following a disaster.
  • Service asset and configuration management provides a valuable tool for the continuity process. The configuration management system documents the components that make up the infrastructure and the relationship between the components.
  • A very close relationship exists between ITSCM and information security management. A major security breach could be considered a disaster, so when the service provider is conducting business impact analysis and risk assessment, security will be a very important consideration.

Information Management

ITSCM needs to record all of the information necessary to maintain a comprehensive set of ITSCM plans. This information base should include the following items:

  • Information from the latest version of the BIA
  • Comprehensive information on risk within a risk register, including risk assessment and risk responses
  • The latest version of the BCM strategy and business continuity plans
  • Details relating to all completed tests and a schedule of all planned tests
  • Details of all ITSCM plans and their contents
  • Details of all other plans associated with ITSCM plans
  • Details of all existing recovery facilities, recovery suppliers and partners, recovery agreements and contracts, and spare and alternative equipment
  • Details of all backup and recovery processes, schedules, systems, and media and their respective locations

All the preceding information needs to be integrated and aligned with all BCM information and all the other information required by ITSCM. Interfaces to many other processes are required to ensure that this alignment is maintained.

IT Service Continuity Management Critical Success Factors and KPIs

The following list includes some sample critical success factors for ITSCM.

  • Critical success factor: “IT services are delivered and can be recovered to meet business objectives.”
    • KPI: Increase in success of regular audits of the ITSCM plans to ensure that, at all times, the agreed recovery requirements of the business can be achieved
    • KPI: Regular and comprehensive testing of ITSCM plans achieved consistently
    • KPI: Regular reviews, at least annual, of the business and IT continuity plans with the business areas
    • KPI: Overall reduction in the risk and impact of possible failure of IT services
  • Critical success factor: “Awareness throughout the organization of the business and IT service continuity plans.”
    • KPI: Increase in validated awareness of business impact, needs, and requirements throughout IT
    • KPI: Increase in successful test results, ensuring that all IT service areas and staff are prepared and able to respond to an invocation of the ITSCM plans
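
KPIs phrased as an "increase" imply comparing a measurement across review periods. As a minimal sketch, assuming the organization records how many continuity tests were run and how many passed in each period, the "increase in successful test results" KPI could be checked as follows. The function names and figures are hypothetical and are for illustration only.

```python
def success_rate(passed: int, total: int) -> float:
    """Fraction of continuity tests that passed; 0.0 when no tests ran."""
    return passed / total if total else 0.0


def kpi_increased(previous: tuple[int, int], current: tuple[int, int]) -> bool:
    """True when the current period's success rate exceeds the previous one's."""
    return success_rate(*current) > success_rate(*previous)


# Illustrative figures only: (tests passed, tests run) per review period.
last_year = (6, 10)
this_year = (9, 12)
print(f"{success_rate(*last_year):.0%} -> {success_rate(*this_year):.0%}")  # 60% -> 75%
print(kpi_increased(last_year, this_year))  # True
```

Comparing rates rather than raw counts keeps the KPI meaningful when the number of tests differs from one period to the next.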

IT Service Continuity Management Challenges and Risks

We’ll begin by looking at the key challenges for the process and then look at the risks.

Challenges

A major challenge facing ITSCM is to provide appropriate plans when there is no BCM process. If there is no BCM process, then IT is likely to adopt the wrong continuity strategies and options and make incorrect assumptions about business criticality of business processes. Also, if BCM is absent, then the business may fail to identify inexpensive non-IT solutions and waste money on ineffective, expensive IT solutions.

In some organizations, the perception is that continuity is an IT responsibility, and the business assumes that IT will be responsible for disaster recovery and that IT services will continue to run under any circumstances.

The challenge, if there is a BCM process established, becomes one of alignment and integration. Following that, the challenge becomes one of keeping the ITSCM process and BCM process aligned by management and by controlling business and IT change. All documents and plans should be maintained under the strict control of change management and service asset and configuration management.

Risks

The following are among the major risks associated with ITSCM:

  • Lack of a business continuity management process
  • Lack of commitment from the business to the ITSCM processes and procedures
  • Lack of appropriate information on future business plans and strategies
  • Lack of senior management commitment to the ITSCM process, or lack of resources and/or budget for it
  • The risk that the processes focus too much on the technology issues and not enough on the IT services and the needs and priorities of the business
  • The risk that the process is unlikely to succeed in its objectives if risk assessment and management are conducted in isolation and not in conjunction with availability management and information security management
  • ITSCM plans and information becoming out of date and losing alignment with the information and plans of the business and BCM

Summary

This chapter explored the next two processes in the service design stage, capacity management and IT service continuity management. It covered the purpose and objectives for each process in addition to the scope.

We looked at the value of the processes. Then we reviewed the policies for each process and the activities, methods, and techniques.

Last, we reviewed triggers, inputs, outputs, and interfaces for each process and the information management associated with it. We also considered the critical success factors and key performance indicators and the challenges and risks for the processes.

We examined how each of these processes supports the other and the importance of these processes to the business and the IT service provider.

Exam Essentials

Understand the purpose and objectives of capacity management and IT service continuity management. It is important for you to be able to explain the purpose and objectives of the capacity management and IT service continuity management processes.

Capacity management is concerned with the current and future capacity of services to the business.

IT service continuity management should ensure that IT services can be recovered as required to support the business continuity plans and meet the business needs.

Understand the iterative activities of capacity management. Capacity management has both proactive and reactive activities. These include monitoring, tuning, and analysis, which may be carried out as part of a proactive or reactive approach.

Understand the subprocesses of capacity management. Business capacity management is concerned with the business requirements and understanding business needs.

Service capacity management is concerned with the capacity of services to fulfill the needs of the business.

Component capacity management is concerned with the technical aspect of capacity management and the capacity of individual service components.

Explain and differentiate between the different stages of IT service continuity management. Initiation is the start of the process, triggered by business continuity management, where the policies and scope of the continuity requirement are established.

Requirements and strategy is where a clear understanding of the business requirements is established, through business impact analysis and risk assessment, and the continuity strategy is developed.

Implementation is where the decisions in the strategy are realized.

Ongoing operation is where the continuity plans are managed as part of the ongoing operation of the services.

Understand the critical success factors and key performance indicators for the processes. Measurement of the processes is an important part of understanding their success. You should be familiar with the CSFs and KPIs for both capacity management and IT service continuity management.

Review Questions

You can find the answers to the review questions in the appendix.

  1. Which of the following are responsibilities of capacity management?

    1. Negotiating capacity requirements to be included in the SLA
    2. Monitoring capacity
    3. Forecasting capacity requirements
    4. Dealing with capacity issues
      1. 2, 3, and 4
      2. 1 and 2 only
      3. All of the above
      4. 1, 2, and 4
  2. Capacity management includes three subprocesses. What are they?

    1. Service capacity, business capacity, component capacity
    2. System capacity, business capacity, component capacity
    3. Service capacity, business capacity, configuration capacity
    4. System capacity, business capacity, infrastructure capacity
  3. Which of the following are responsibilities of IT service continuity management?

    1. Ensuring that IT services can continue in the event of a disaster
    2. Carrying out risk assessments
    3. Ensuring that the business has contingency plans in place in case of a disaster
    4. Ensuring all IT staff know their role in the event of a disaster
      1. 2, 3, and 4
      2. 1, 2, and 4
      3. 1 and 2 only
      4. All of the above
  4. IT service continuity management carries out a BIA in conjunction with the business. What does BIA stand for?

    1. Business integrity appraisal
    2. Business information alternatives
    3. Benefit integration assessment
    4. Business impact analysis
  5. Which of the following statements about IT service continuity management (ITSCM) is TRUE?

    1. ITSCM defines the service that can be provided in the event of a major disruption. The business can then plan how it will use the service.
    2. ITSCM and business continuity management (BCM) have no impact on each other.
    3. BCM defines the level of IT service that will be required in the event of a major disruption. ITSCM is responsible for delivering this level of service.
    4. It is the responsibility of ITSCM to deliver a single continuity plan that will fit all situations.
  6. Match each subprocess (a, b, and c) to its definition (1, 2, and 3).

    a. Business capacity management
    b. Service capacity management
    c. Component capacity management
    1. View of the future plans and requirements of the organization
    2. View of the detailed information relating to the performance management of technical assets
    3. View of the service performance achieved in the operational environment
  7. True or False? Capacity management has both reactive and proactive activities.

    1. True
    2. False
  8. Which of these are KPIs relating to IT service continuity management?

    1. KPI: Regular and comprehensive testing of ITSCM plans achieved consistently
    2. KPI: Regular reviews undertaken, at least annually, of the business and IT continuity plans with the business areas
    3. KPI: Overall reduction in the risk and impact of possible failure of IT services
    4. KPI: Number of incidents that result in a major incident
      1. 1, 3, 4
      2. 2, 3, 4
      3. 1, 2, 3
      4. 1, 2, 4
  9. Which of these statements is/are correct?

    1. Risk management is a vital part of both capacity and service continuity management.
    2. Both capacity management and service continuity management are cyclic processes.
      1. Statement 1 only
      2. Statement 2 only
      3. Both statements
      4. Neither statement
  10. In which stages of the IT service continuity lifecycle does testing take place?

    1. Initiation and ongoing operation
    2. Initiation and implementation
    3. Implementation and ongoing operation
    4. Requirements and strategy and ongoing operation