Risk Management

Note

Project Manager, Whole Team

We make and meet long term commitments.

The following statement is nearly true:

Our team delivers a predictable amount of work every iteration. Because we had a velocity of 14 story points last week, we’ll deliver 14 story points this week, and next week, and the next. By combining our velocity with our release plan, we can commit to a specific release schedule!

Good XP teams do achieve a stable velocity. Unfortunately, velocity only reflects the issues the team normally faces. Life always has some additional curve balls to throw. Team members get sick and take vacations; hard drives crash, and although the backups worked, the restore doesn’t; stakeholders suddenly realize that the software you’ve been showing them for the last two months needs some major tweaks before it’s ready to use.

Despite these uncertainties, your stakeholders need schedule commitments that they can rely upon. Risk management allows you to make and meet these commitments.

A Generic Risk-Management Plan

Every project faces a set of common risks: turnover, new requirements, work disruption, and so forth. These risks act as a multiplier on your estimates, doubling or tripling the amount of time it takes to finish your work.

How much of a multiplier do these risks entail? It depends on your organization. In a perfect world, your organization would have a database of the type shown in Figure 8-6.[34] It would show the chance of completing a project within various multiples of its original estimate.

Figure 8-6. An example of historical project data

Because most organizations don’t have this information available, I’ve provided some generic risk multipliers instead. (See Table 8-2.) These multipliers show your chances of meeting various schedules. For example, in a “Risky” approach, you have a 10 percent chance of finishing according to your estimated schedule. Doubling your estimates gives you a 50 percent chance of on-time completion, and to be virtually certain of meeting your schedule, you have to quadruple your estimates.

Table 8-2. Generic risk multipliers

                 Process approach
Percent chance   Rigorous[a]   Risky[b]   Description
10%              x1            x1         Almost impossible (“ignore”)
50%              x1.4          x2         50-50 chance (“stretch goal”)
90%              x1.8          x4         Virtually certain (“commit”)

[a] These figures are based on DeMarco & Lister’s RISKOLOGY simulator, version 4a, available from http://www.systemsguild.com/riskology.html. I used the standard settings but turned off productivity variance, as velocity would automatically adjust for that risk.

[b] These figures are based on [Little].

If you use the XP practices—in particular, if you’re strict about being “done done” every iteration, your velocity is stable, and you fix all your bugs each iteration—then your risk is lowered. Use the risk multiplier in the “Rigorous” column. On the other hand, if you’re not strict about being “done done” every iteration, if your velocity is unstable, or if you postpone bugs and other work for future iterations, then use the risk multiplier in the “Risky” column.
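
As a minimal sketch of how the multipliers in Table 8-2 apply to an estimate, the following Python snippet scales a single estimate by each multiplier. The function name and the 10-iteration estimate are hypothetical, chosen only for illustration; the multipliers themselves come from the table.

```python
# Generic risk multipliers from Table 8-2, keyed by process approach and
# then by the percent chance of meeting the resulting schedule.
RISK_MULTIPLIERS = {
    "rigorous": {10: 1.0, 50: 1.4, 90: 1.8},
    "risky": {10: 1.0, 50: 2.0, 90: 4.0},
}


def risk_adjusted_estimates(estimate, approach):
    """Return {percent chance: adjusted estimate} for one process approach."""
    return {
        chance: estimate * multiplier
        for chance, multiplier in RISK_MULTIPLIERS[approach].items()
    }


# Hypothetical example: a release estimated at 10 iterations.
rigorous = risk_adjusted_estimates(10, "rigorous")
risky = risk_adjusted_estimates(10, "risky")
for chance in (10, 50, 90):
    print(
        f"{chance}% chance: rigorous {rigorous[chance]:.0f} iterations, "
        f"risky {risky[chance]:.0f} iterations"
    )
```

Both approaches start from the same estimate, but the schedule needed for a 90 percent commitment grows much faster in the “Risky” column.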

Note

These risk multipliers illustrate an important difference between risky and rigorous approaches. Both can get lucky and deliver according to their estimates. Risky approaches, however, take a lot longer when things go wrong. They require much more padding in order to make and meet commitments.

Although these numbers come from studies of hundreds of industry projects, those projects didn’t use XP. As a result, I’ve guessed somewhat at how accurately they apply to XP. However, unless your company has a database of prior projects to turn to, they are your best starting point.

Project-Specific Risks

Using the XP practices and applying risk multipliers will help contain the risks that are common to all projects. The generic risk multipliers include the normal risks of a flawed release plan, ordinary requirements growth, and employee turnover. In addition to these risks, you probably face some that are specific to your project. To manage these, create a risk census—that is, a list of the risks your project faces that focuses on your project’s unique risks.

[DeMarco & Lister 2003] suggest starting work on your census by brainstorming catastrophes. Gather the whole team and hand out index cards. Remind team members that during this exercise, negative thinking is not only OK, it’s necessary. Ask them to consider ways in which the project could fail. Write several questions on the board:[35]

  1. What about the project keeps you up at night?

  2. Imagine it’s a year after the project’s disastrous failure and you’re being interviewed about what went wrong. What happened?

  3. Imagine your best dreams for the project, then write down the opposite.

  4. How could the project fail without anyone being at fault?

  5. How could the project fail if it were the stakeholders’ faults? The customers’ faults? Testers? Programmers? Management? Your fault? Etc.

  6. How could the project succeed but leave one specific stakeholder unsatisfied or angry?

Write your answers on the cards, then read them aloud to inspire further thoughts. Some people may be more comfortable speaking out if a neutral facilitator reads the cards anonymously.

Once you have your list of catastrophes, brainstorm scenarios that could lead to those catastrophes. From those scenarios, imagine possible root causes. These root causes are your risks: the causes of scenarios that will lead to catastrophic results.

For example, if you’re creating an online application, one catastrophe might be “extended downtime.” A scenario leading to that catastrophe would be “excessively high demand,” and root causes include “denial of service attack” and “more popular than expected.”

After you’ve finished brainstorming risks, let the rest of the team return to their iteration while you consider the risks within a smaller group. (Include a cross-section of the team.) For each risk, determine:

  • Estimated probability—I prefer “high,” “medium,” and “low.”

  • Specific impact to project if it occurs—dollars lost, days delayed, and project cancellation are common possibilities.

You may be able to discard some risks as unimportant immediately. I ignore unlikely risks with low impact and all risks with negligible impact. Your generic risk multiplier accounts for those already.

For the remainder, decide whether you will avoid the risk by not taking the risky action; contain it by reserving extra time or money, as with the risk multiplier; or mitigate it by taking steps to reduce its impact. You can combine these actions. (You can also ignore the risk, but that’s irresponsible now that you’ve identified it as important.)

For the risks you decide to handle, determine transition indicators, mitigation and contingency activities, and your risk exposure:

  • Transition indicators tell you when the risk will come true. It’s human nature to downplay upcoming risks, so choose indicators that are objective rather than subjective. For example, if your risk is “unexpected popularity causes extended downtime,” then your transition indicator might be “server utilization trend shows upcoming utilization over 80 percent.”

  • Mitigation activities reduce the impact of the risk. Mitigation happens in advance, regardless of whether the risk comes to pass. Create stories for them and add them to your release plan. To continue the example, possible stories include “support horizontal scalability” and “prepare load balancer.”

  • Contingency activities also reduce the impact of the risk, but they are only necessary if the risk occurs. They often depend on mitigation activities that you perform in advance. For example, “purchase more bandwidth from ISP,” “install load balancer,” and “purchase and prepare additional frontend servers.”

  • Risk exposure reflects how much time or money you should set aside to contain the risk. To calculate this, first estimate the numerical probability of the risk and then multiply that by the impact. When considering your impact, remember that you will have already paid for mitigation activities, but contingency activities are part of the impact. For example, you might believe that downtime due to popularity is 35 percent likely, and the impact is three days of additional programmer time and $20,000 for bandwidth, colocation fees, and new equipment. Your total risk exposure is $7,000 and one day.
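
The risk exposure arithmetic in the last bullet can be sketched in a few lines of Python. The function name is my own; the probability and impact figures are the ones from the downtime example.

```python
def risk_exposure(probability, impact):
    """Exposure is the probability of the risk (0 to 1) times its impact."""
    return probability * impact


# Figures from the downtime example: 35 percent likely, with an impact of
# three days of programmer time and $20,000 in contingency costs.
probability = 0.35
days_exposure = risk_exposure(probability, 3)
dollars_exposure = risk_exposure(probability, 20_000)

print(f"Schedule exposure: {days_exposure:.2f} days")
print(f"Budget exposure: ${dollars_exposure:,.0f}")
```

The schedule exposure works out to 1.05 days, which the text rounds to one day.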

Some risks have a 100 percent chance of occurring. These are no longer risks—they are reality. Update your release plan to deal with them.

Other risks can kill your project if they occur. For example, a corporate reorganization might disband the team. Pass these risks on to your executive sponsor as assumptions or requirements. (“We assume that, in the event of a reorg, the team will remain intact and assigned to this project.”) Other than documenting that you did so, and perhaps scheduling some mitigation stories, there’s no need to manage them further.

For the remaining risks, update your release plan to address them. You will need stories for mitigation activities, and you may need stories to help you monitor transition indicators. For example, if your risk is “unexpected popularity overloads server capacity,” you might schedule the story “prepare additional servers in case of high demand” to mitigate the risk, and “server load trend report” to help you monitor the risk.

You also need to set aside time, and possibly money, for contingency activities. Don’t schedule any contingency stories yet—you don’t know if you’ll need them. Instead, add up your risk exposure and apply dollar exposure to the budget and day exposure to the schedule. Some risks will occur and others won’t, but on average, the impact will be equal to your risk exposure.
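
Here is one way you might total the exposure across a risk census, as a rough sketch. The `Risk` class and the second census entry are hypothetical and exist only to show the summation; the first entry uses the figures from the downtime example.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float     # estimated chance of occurring, 0 to 1
    impact_days: float     # schedule impact if the risk occurs
    impact_dollars: float  # budget impact if the risk occurs


def total_exposure(risks):
    """Sum probability-weighted impact over every risk in the census."""
    days = sum(r.probability * r.impact_days for r in risks)
    dollars = sum(r.probability * r.impact_dollars for r in risks)
    return days, dollars


census = [
    # From the text: downtime due to unexpected popularity.
    Risk("unexpected popularity causes extended downtime", 0.35, 3, 20_000),
    # Hypothetical entry, invented for illustration only.
    Risk("third-party service changes its interface", 0.20, 10, 0),
]

days, dollars = total_exposure(census)
print(f"Add {days:.2f} days to the schedule and ${dollars:,.0f} to the budget")
```

No single risk will cost exactly its exposure. The total is a reserve that, on average, covers the risks that do occur.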

How to Make a Release Commitment

With your risk exposure and risk multipliers, you can predict how many story points you can finish before your release date. Start with your timeboxed release date from your release plan. (Using scopeboxed release planning? See the “Predicting Release Dates” sidebar.) Figure out how many iterations remain until your release date and subtract your risk exposure. Multiply by your velocity to determine the number of points remaining in your schedule, then divide by each risk multiplier to calculate your chances of finishing various numbers of story points.

risk_adjusted_points_remaining = (iterations_remaining - risk_exposure) * velocity / risk_multiplier

For example, if you’re using a rigorous approach, your release is 12 iterations away, your velocity is 14 points, and your risk exposure is one iteration, you would calculate the range of possibilities as:

points remaining = (12 - 1) * 14 = 154 points
10 percent chance: 154 / 1 = 154 points
50 percent chance: 154 / 1.4 = 110 points
90 percent chance: 154 / 1.8 = 86 points

In other words, when it’s time to release, you’re 90 percent likely to have finished 86 more points of work, 50 percent likely to have finished 110 more points, and only 10 percent likely to have finished 154 more points.
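
The same calculation can be written as a short Python sketch. The function mirrors the formula above; the dictionary name is my own, and rounding to the nearest whole point follows the worked example.

```python
def risk_adjusted_points_remaining(
    iterations_remaining, risk_exposure, velocity, risk_multiplier
):
    """Points you can expect to finish at a given risk multiplier."""
    return (iterations_remaining - risk_exposure) * velocity / risk_multiplier


# "Rigorous" column of Table 8-2: percent chance -> risk multiplier.
RIGOROUS = {10: 1.0, 50: 1.4, 90: 1.8}

# Values from the example: 12 iterations away, velocity of 14 points,
# risk exposure of one iteration.
forecast = {
    chance: round(risk_adjusted_points_remaining(12, 1, 14, multiplier))
    for chance, multiplier in RIGOROUS.items()
}

for chance, points in forecast.items():
    print(f"{chance} percent chance: {points} points")
```

Rerun the calculation each iteration as your velocity, iterations remaining, and risk exposure change.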

You can show this visually with a burn-up chart, shown in Figure 8-7.[36] Every week, note how many story points you have completed, how many total points exist in your next release (completed plus remaining stories), and your range of risk-adjusted points remaining. Plot them on the burn-up chart as shown in the figure.

Figure 8-7. A burn-up chart

Use your burn-up chart and release plan to provide stakeholders with a list of features you’re committing to deliver on the release date. I commit to delivering features that are 90 percent likely to be finished, and I describe features between 50 and 90 percent likely as stretch goals. I don’t mention features that we’re less than 50 percent likely to complete.
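
To illustrate how the thresholds translate into a list of commitments, the sketch below walks a prioritized feature list and labels each feature. The feature names and point totals are hypothetical, and the sketch assumes features are delivered strictly in priority order. The 86- and 110-point thresholds are the 90 and 50 percent figures from the earlier example.

```python
def classify_features(features, commit_points, stretch_points):
    """Label features, given as (name, points) pairs in priority order.

    A feature is a commitment if the running total stays within the
    90 percent figure, a stretch goal if it stays within the 50 percent
    figure, and otherwise is left out of the conversation.
    """
    plan = []
    running_total = 0
    for name, points in features:
        running_total += points
        if running_total <= commit_points:
            status = "commit"
        elif running_total <= stretch_points:
            status = "stretch goal"
        else:
            status = "not mentioned"
        plan.append((name, status))
    return plan


# Hypothetical features and estimates, for illustration only.
features = [
    ("account management", 40),
    ("checkout", 40),
    ("reporting", 25),
    ("data export", 30),
]

plan = classify_features(features, commit_points=86, stretch_points=110)
for name, status in plan:
    print(f"{name}: {status}")
```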

Success over Schedule

Important

Success is more than a delivery date.

Important

Vision

The majority of this discussion of risk management has focused on managing the risk to your schedule commitments. However, your real goal should be to deliver an organizational success—to deliver software that provides substantial value to your organization, as guided by the success criteria in your project vision.

Taking too long can put that success at risk, so rapid delivery is important. Just don’t forget about the real goal. As you evaluate your risks, think about the risk to the success of the project, not just the risk to the schedule. Take advantage of adaptive release planning. Sometimes you’re better off taking an extra month to deliver a great result, rather than just a good one.

When Your Commitment Isn’t Good Enough

Someday, someone will ask you to commit to a schedule that your predictions show is impossible to achieve. The best way to make the schedule work is to reduce scope or extend the release date. If you can’t do so personally, ask your manager or product manager to help.

If you can’t change your plan, you may be able to improve your velocity (see “Estimating” later in this chapter). This is a long shot, so don’t put too much hope in it.

It’s also possible that your pessimistic predictions are the result of using a risky process. You can improve the quality of your predictions by being more rigorous in your approach. Make sure you’re “done done” every iteration and include enough slack to have a stable velocity. Doing so will decrease your velocity, but it will also decrease your risk and allow you to use risk multipliers from the “Rigorous” column, which may lead to a better overall schedule.

Important

Trust

Typically, though, your schedule is what it is. If you can’t change scope or the date, you can usually change little else. As time passes and your team builds trust with stakeholders, people may become more interested in options for reducing scope or extending the delivery date. In the meantime, don’t promise to take on more work than you can deliver. Piling on work will increase technical debt and hurt the schedule. Instead, state the team’s limitations clearly and unequivocally.

If that isn’t satisfactory, ask if there’s another project the team can work on that will yield more value. Don’t be confrontational, but don’t give in, either. As a leader, you have an obligation to the team—and to your organization—to tell the truth.

In some organizations, inflexible demands to “make it work” are questionable attempts to squeeze more productivity out of the team. Sadly, applying pressure to a development team tends to reduce quality without improving productivity. In this sort of organization, as the true nature of the schedule becomes more difficult to ignore, management tends to respond by laying on pressure and “strongly encouraging” overtime.

Look at the other projects in the organization. What happens when they are late? Does management respond rationally, or do they respond by increasing pressure and punishing team members?

If it’s the latter, decide now whether you value your job enough to put up with the eventual pressure. If not, start looking for other job opportunities immediately. (Once the pressure and mandatory overtime begin, you may not have enough time or energy for a job hunt.) If your organization is large enough, you may also be able to transfer to another team or division.

As a tool of last resort, if you’re ready to resign and you’re responsible for plans and schedules, it’s entirely professional to demand respect. One way to do so is to say, “Because you no longer trust my advice with regard to our schedule, I am unable to do the job you hired me to do. My only choice is to resign. Here is my resignation letter.”

This sometimes gets the message through when nothing else will work, but it’s a big stick to wield, and it could easily cause resentment. Be careful. “Over my dead body!” you say. “Here’s your noose,” says the organization.

Questions

What should we tell our stakeholders about risks?

It depends how much detail each stakeholder wants. For some, a commitment and stretch goal may be enough. Others may want more detailed information.

Be sure to share your risk census and burn-up chart with your executive sponsor and other executives. Formally transfer responsibility for the project assumptions (those risks that will kill the project if they come true). Your assumptions are the executives’ risks.

Your risk multipliers are too high. Can I use a lower multiplier?

Are your replacement multipliers based on objective data? Is that data representative of the project you’re about to do? If so, go ahead and use them. If not, be careful. You can lower the multiplier, but changing the planned schedule won’t change the actual result.

We’re using a scopeboxed plan, and our stakeholders want a single delivery date rather than a risk-based range. What should we do?

Your project manager or product manager might be able to help you convince the organization to accept a risk-based range of dates. Talk with them about the best way to present your case.

If you must have a single date, pick a single risk multiplier to apply. Which one you choose depends on your organization. A higher risk multiplier improves your chances of success but makes the schedule look worse. A lower risk multiplier makes the schedule look better but reduces your chances of meeting that schedule.

Many organizations have acclimated to slipping delivery dates. Managers in these organizations intuitively apply an informal risk multiplier in their head when they hear a date. In this sort of organization, applying a large risk multiplier might make the schedule seem ridiculously long.

Consider other projects in your organization. Do they usually come in on time? What happens if they don’t? Talk with your project manager and product manager about management and stakeholder expectations. These discussions should help you choose the correct risk multiplier for your organization. Remember, though, using a risk-based range of dates is a better option, and using a timeboxed schedule is better than using a scopeboxed schedule.
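
If you need to turn a fixed scope into a date, one possible approach is to invert the release commitment formula and solve for iterations instead of points. This is my own rearrangement under that assumption, not a formula taken from the “Predicting Release Dates” sidebar, and the 154 points of remaining scope are hypothetical.

```python
import math


def iterations_needed(points_remaining, velocity, risk_multiplier, risk_exposure):
    """Assumed inversion of the release commitment formula.

    Solves (iterations - risk_exposure) * velocity / multiplier = points
    for iterations, then rounds up to whole iterations.
    """
    iterations = points_remaining / velocity * risk_multiplier + risk_exposure
    return math.ceil(iterations)


# "Rigorous" column of Table 8-2: percent chance -> risk multiplier.
RIGOROUS = {10: 1.0, 50: 1.4, 90: 1.8}

# Hypothetical scope: 154 points remaining, velocity of 14 points,
# risk exposure of one iteration.
date_range = {
    chance: iterations_needed(154, 14, multiplier, 1)
    for chance, multiplier in RIGOROUS.items()
}

for chance, iterations in date_range.items():
    print(f"{chance} percent chance: done within {iterations} iterations")
```

Picking a single date means choosing one entry from this range. The full range is still the more honest answer.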

Results

With good risk management, you deliver on your commitments even in the face of disruptions. Stakeholders trust and rely on you, knowing that when they need something challenging yet valuable, they can count on you to deliver it.

Contraindications

Use risk management only for external commitments. Within the team, focus your efforts on achieving the unadjusted release plan as scheduled. Otherwise, your work is likely to expand to meet the deadline. Figure 8-8[37] shows how a culture of doubling estimates at one company prevented most projects from finishing early without reducing the percentage of projects that finished late.

Figure 8-8. Just double the estimate

Be careful of using risk management in an organization that brags that “failure is not an option.” You may face criticism for looking for risks. You may still be able to present your risk-derived schedule as a commitment or stretch goal, but publicizing your risk census may be a risk itself.

Alternatives

Risk management is primarily an organizational decision rather than an individual team decision. If your organization has already institutionalized risk management, they may mandate a different approach. Your only trouble may be integrating it into XP’s simultaneous phases; to do so, use this description as a starting point and consider asking your mentor (see “Find a Mentor” in Chapter 2) for advice specific to your situation.

Some organizations add a risk buffer to individual estimates rather than the overall project schedule. As Figure 8-8 illustrates, this tends to lead to waste.

Further Reading

Waltzing with Bears: Managing Risk on Software Projects [DeMarco & Lister 2003] provides more detail and considers risk management from an organizational perspective. My favorite quote comes from the back cover: “If there’s no risk on your next project, don’t do it.”



[34] Reprinted from [Little].

[35] Based on [DeMarco & Lister 2003] (p. 117)

[36] According to John Brewer (http://tech.groups.yahoo.com/group/extremeprogramming/message/81856), the burn-up chart was created by Phil Goodwin as a variant of Scrum’s burn-down charts. I’ve modified it further to include risk-based commitments.

[37] Reprinted from [Little].
