9      EVALUATING THE SYSTEM

One thing that should be clear by now is that what is ‘acceptable’ is a very hard concept to pin down. In the case of an IS it depends on how the system was built or acquired, who the stakeholders are and what their needs are. There is no ‘one size fits all’ answer to the question. What we need to understand and be able to apply is a process for determining whether a particular system is acceptable in a particular scenario, and that is what this chapter will focus on.

This chapter is about the final decision to accept or not accept a system. In practice, of course, the decision is not simple, nor does it have only two possible outcomes. There is a range of possible scenarios that we will explore, each relating to whether and to what extent the acceptance criteria have been demonstrably met.

Topics covered in this chapter

•  How do we decide whether or not to accept a system?

•  When the testing has to stop

•  The risk of release

•  Measuring the risk of release

•  Defining and evaluating emergency-release criteria

•  Decision process for evaluating UAT results

•  Test summary report conclusions

•  The final release decision

HOW DO WE DECIDE WHETHER OR NOT TO ACCEPT A SYSTEM?

The simple answer is that we have to decide whether or not the system has met its acceptance criteria, but that answer assumes we have a neat package of criteria that we can assess. If you have taken the advice in this book to heart and applied it to your project, you should have a set of criteria to work with; if not, you will need a practical and pragmatic decision process.

Figure 9.1 presents the logic of acceptance in its simplest form.

Figure 9.1 Process for deciding go/no go for release


The logic contains four decisions, of which three are key to acceptance.

Decision 1 is about whether or not the development was contracted to a third party or, equivalently from an acceptance standpoint, whether or not the system was acquired from a third-party supplier. In either case there will be some contractual criteria on which final payment is based. This is not the same thing as acceptance of the system but it is a necessary prerequisite.

Decision 2 assumes that there are contractual criteria and asks whether or not they have been met. If we find the contractual criteria have been met, the supplier must be paid and we have to work with what has been delivered. If not, we would expect to enter some kind of negotiation with the supplier that will result in a settlement that improves the situation from our point of view. The key factor here is the relationship of the contract criteria to the acceptance criteria set by the stakeholders. There is unlikely to be a perfect fit unless the system was custom-built to meet your exact requirements, in which case the contract criteria and the acceptance criteria will be the same. In most cases the contract criteria will relate to the computer system that is at the heart of the IS but not the IS itself. We will still need to ensure that the IS as a whole meets our business need and is acceptable to our user community. That will usually mean more testing and more work to achieve an acceptable release, and there will still be two more decisions to take.

Decision 3 asks whether the system achieves its acceptance criteria, which implies that the business benefits for which it was built or acquired can be achieved. For contracted systems, achievement of contract criteria leads on to this decision; for systems built in-house we arrive at this as the first acceptance decision. Remember that acceptance criteria are pragmatic and take into account both the business value of the system and the risk of releasing it. Achieving the acceptance criteria means that we now have a platform on which to build up the business value so we can take the step of releasing it. We need to bear in mind, however, that there may still be quite a lot of work to do to ensure the system as released is capable of growing business value to achieve the business benefits for which it was originally commissioned. Release of the system does not guarantee that the benefits will be achieved and a plan for ramping up the value of the system to the business will still be a priority.

Decision 4 is the fallback question and the one that needs to be exercised much more often than it is; it asks whether the risk of releasing the system is acceptable. All the information we have gathered will be needed to get a sound answer to this question, and all the stakeholders will need to be ready and willing to engage in a serious assessment of the possible outcomes if the system is released. This is an ideal opportunity to use some kind of scenario analysis to identify what could happen after release and assess how the system in its current state would cope, looking beyond the immediate boundary of the system to the business effects of any shortfall in performance or capability. Development of scenarios for this risk assessment would be a good opportunity for the users, developers, managers and sponsor(s) to work as a team. The work of scenario building will be valuable even if the system proves to meet all its acceptance criteria, but it will be absolutely priceless if it does not and will speed up the process of making the system fit for use.
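The four decisions above can be sketched as a simple function. This is only an illustration of the Figure 9.1 logic; the function and parameter names are assumptions made for this sketch, not notation from the text.

```python
# Illustrative sketch of the Figure 9.1 go/no-go logic.
# Parameter names are assumptions for illustration only.

def release_decision(third_party, contract_criteria_met,
                     acceptance_criteria_met, release_risk_acceptable):
    """Return a recommendation following the four decisions in Figure 9.1."""
    # Decisions 1 and 2: a contracted or acquired system must first
    # settle its contractual criteria before acceptance can proceed.
    if third_party and not contract_criteria_met:
        return "negotiate with supplier"
    # Decision 3: does the system achieve its acceptance criteria?
    if acceptance_criteria_met:
        return "release"
    # Decision 4: the fallback - is the residual risk of release acceptable?
    if release_risk_acceptable:
        return "release with improvement plan"
    return "defer release"
```

For example, a contracted system that has met its contract criteria but not its acceptance criteria, where the risk is judged acceptable, would come out as a release with an improvement plan attached.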

WHEN THE TESTING HAS TO STOP

We began with a clear idea about how much testing we would need to do and an equally clear idea about how we would determine whether the system was fit to release to its users. That framework enabled us to plan and execute structured testing and gather data about the status of testing and the state of the system. Once our plan has been completed and all the data gathered, we can stop testing and evaluate the results to decide whether or not to accept the system.

That is an ideal scenario, of course, and one that we cannot expect to be the reality we face at the end of our UAT project; we need to be ready for alternative scenarios. Among these might be:

•  The business can no longer afford to burn money on the project and needs to get the system into service.

•  The benefits of the system are time-critical and time is running out.

•  The testing is finding few, if any, problems and further testing is hard to justify.

These are realistic scenarios and the challenge for us is how to react to them. Our approach must enable us to evaluate the system at any stage in testing so that none of these scenarios catches us unprepared. That is one reason why routine reporting on the testing was recommended and, as part of that reporting, a regular update of progress towards the acceptance criteria so that we have a clear idea of where we are in relation to the release decision at every stage in testing.

Therefore the first and most important lesson is that evaluation of the system must be continuous and consistent. That is to say we must decide how we will evaluate the system and set up the processes of evaluation from the beginning so that we can, at any stage, produce a credible evaluation of the state of the system.

If we do have to stop testing before it is completed, for any reason, we will need to evaluate the risk of release and the business value of the system in its current state.

THE RISK OF RELEASE

All the work we did at the planning stage was leading up to this challenge. We put in place acceptance criteria to make the decision on fitness for release as objective as possible. We set targets for test coverage and levels of defects that provided us with clear visibility of the status of the system at each stage. If all our criteria have been met, we have no difficulty in making a decision to accept the system.

But what if the criteria have not been met when testing is complete? And what if the criteria have still not been met when testing has to stop? These are the most important questions we have to answer.

The answer, in a word, is risk. We have to decide what the level of risk is in releasing a system that is not ready according to our criteria. What is the probability that it will fail partially or completely, and what would be the consequences of failure? This is the reason we decided at the outset that risk-based testing would be a good approach. With risk-based testing we have confidence that every test reduces the risk by some amount and, however tiny the reduction of risk might be, it will be accumulating day by day and test by test. If we have tested using a risk-based approach, we can at least say that the risk of release is as low as we could make it on the day the decision has to be made.

But whether we have been using a risk-based approach or not does not change the need to be able to make a decision about the risk that remains in releasing the system at a given time and at a given stage in the testing.

We need to make a judgement based on the best data we have available and centred around acceptance criteria. The actual evaluation will depend on the criteria we set initially, but we can define a process that will work in every case.
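One simple way to frame that judgement is the classic probability-times-impact calculation, discounted by how much of each area has been tested. The scales, weights and field names below are illustrative assumptions for this sketch, not figures from the text.

```python
# Hedged sketch: estimate residual release risk per business area as
# probability-of-failure x consequence, weighted by untested scope.
# The 0-1 and 1-5 scales are illustrative assumptions.

def residual_risk(areas):
    """areas: list of dicts with 'probability' (0-1 chance of failure),
    'impact' (1-5 business consequence) and 'coverage' (0-1 fraction of
    planned tests run and passed). Returns a relative risk score."""
    total = 0.0
    for a in areas:
        # Untested scope leaves the full estimated risk in place;
        # tested-and-passed scope is assumed to have discharged it.
        total += a["probability"] * a["impact"] * (1 - a["coverage"])
    return total
```

A score like this has no absolute meaning, but tracked test by test it makes visible the day-by-day accumulation of risk reduction described above.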

MEASURING THE RISK OF RELEASE

Turning again to Figure 9.1 we can see that there are two ends to the risk spectrum. If we have prepared the way by carefully defining contractual or acceptance criteria at the start of the project and we have maintained control of all incidents, defects and other changes during the project, the final decision is very likely to be clear-cut and the risk of release is likely to be small. We could still have built the wrong system for the business purpose or diverged from our original intent in some way, but the monitoring of our progress will have shown up these problems at a relatively early stage. We may have a hard decision to make about whether to release – hard in the sense of being unpopular and uncomfortable – but we should have an easy decision in the sense that the correct or best (most appropriate) outcome will be easy to determine. The purpose of this whole process has been to make rational decisions possible and this will have been achieved, even if the outcome is not the one originally desired.

If, on the other hand, we have not defined contract or acceptance criteria at the outset or we have not monitored the development and testing process carefully so that we have a continuously updated record of the quality of the outputs and the status of the system in terms of defects and incidents at this stage, then we have a hard decision to make – this time hard in the sense that it will not be easy to reach a sound conclusion from the information available.

What follows will therefore consider the latter scenario. How do we proceed if we have no clear criteria for acceptance? What can we do to enable a rational decision in the face of incomplete and possibly conflicting information and with business and time pressures militating against spending time on gathering and assessing data? Remember that as UA testers we are not the decision makers, but we are closest to the system at this point and therefore best able to identify risk factors and report to decision makers in a way that is helpful, positive and constructive – always remembering that we will also be the last group to touch the system before release, so we will be remembered as being at least partially responsible for the outcome.

DEFINING AND EVALUATING EMERGENCY-RELEASE CRITERIA

If we have no positive acceptance criteria to identify what we want from the system, we have to fall back on some more defensive criteria to try to protect ourselves from trouble. There are three parameters that will be of some value to us in this decision:

1.  stability of the system;

2.  usability of the system;

3.  coverage of the testing.

Stability

Stability is a measure of a system’s ability to cope with change. We need the system to be stable enough for us to put in place an improvement plan that will almost certainly involve changes to the computer system at the heart of the IS. If the computer system cannot accommodate change, we will be unable to improve it to meet our original expectations.

We must at least ensure we understand how stable the system is so that we have confidence that we can make essential changes to improve its business value, performance, usability or other parameters as we need to.

Measuring stability

One way to assess stability is to look at what changes have been made during development, why they were made and what problems, if any, arose after changes were made. To do this we would need to review the incidents that have been raised, the defects that were identified and corrected, and the changes that were made as the system was built. The testing logs will confirm whether the planned changes were stable and the IM system will tell us whether changes made to correct defects resulted in a clean new release or the discovery of further defects.

If we have access to incident logs and change logs, we can identify the relationship between changes and defects and see whether changes tended to cause spikes in defect rates. This is a typical pointer to instability.
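The spike analysis described above can be sketched as a small script over the change and defect logs. The day-number representation, window and threshold here are illustrative assumptions; real logs would use dates and locally agreed thresholds.

```python
from collections import Counter

# Hedged sketch: flag changes followed by a spike in the defect rate.
# Dates are simplified to day numbers; window and threshold values
# are illustrative assumptions, not recommendations.

def defect_spikes(change_dates, defect_dates, window=7, threshold=2.0):
    """Return the changes followed, within `window` days, by more
    defects than `threshold` times the overall daily defect rate."""
    if not defect_dates:
        return []
    span = max(defect_dates) - min(defect_dates) + 1
    baseline = len(defect_dates) / span  # average defects per day
    counts = Counter(defect_dates)
    spikes = []
    for change_day in change_dates:
        after = sum(counts[d] for d in range(change_day, change_day + window))
        if after > threshold * baseline * window:
            spikes.append(change_day)
    return spikes
```

A change that appears in this list repeatedly across releases is exactly the pointer to instability the text describes.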

Usability

Usability is defined here in a very informal way as a measure of how well the users can operate the computer system at the heart of the IS. If users have difficulty in using the system effectively, improvements will be hard to make because users will be trying to interface with a system that is already problematical for them; change could make matters considerably worse.

Determining usability

What we are seeking here is an informal but reliable assessment of how well users will be able to manage to operate the system to deliver at least basic services while the improvement programme is being implemented. A good mechanism would be to define a core set of user interactions that enable key services, provide a broad range of user interactions and create a scenario that users can work through in a consistent way. A small but representative group of trained end-users can then work through the scenario and provide feedback on their experience. Each user should be timed as they work through the scenario and feedback should be via a standard questionnaire or an interview with structured questions. Defining the scenario and the feedback mechanism would be a good exercise for end-users, managers and sponsor(s) to tackle as a team. Users will need to be well briefed so that they are not intimidated by the exercise.

The results should provide a good guide to the system’s usability as perceived by its users and the exercise will have the additional benefit of engaging the end-user community in the improvement programme from the outset.
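The timing and questionnaire data from such an exercise can be summarised very simply. The session structure, the 1-to-5 questionnaire scale and the target-time idea below are illustrative assumptions about how the feedback might be recorded.

```python
# Hedged sketch: summarise a usability exercise in which each user
# works through the standard scenario. Each session is recorded as
# (time_in_minutes, questionnaire_score_1_to_5) - an assumed format.

def usability_summary(sessions, target_time):
    """Return mean completion time, mean questionnaire score and the
    fraction of users who finished within the target time."""
    times = [t for t, _ in sessions]
    scores = [s for _, s in sessions]
    return {
        "mean_time": sum(times) / len(times),
        "mean_score": sum(scores) / len(scores),
        "within_target": sum(1 for t in times if t <= target_time) / len(sessions),
    }
```

Even these three numbers, compared across user groups or across releases, give the informal but reliable assessment the text asks for.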

Coverage

We have already defined coverage of testing as a measure of how much of the system has been tested so far. The measures of stability and usability must be viewed against coverage because a system that is unstable or unusable when, say, only 25 per cent of the requirements have been tested is facing major rework. At the other end of the spectrum, a system that has been tested against 90 per cent of the requirements and is reasonably stable and usable represents a low risk as a platform for improvement.

We need to seriously consider the testing we have done throughout the life cycle and in UAT and decide whether it is enough on which to base any decision.

Determining test coverage

If we have not designed tests to achieve specific levels of test coverage, we can still make some attempt to identify what has been tested in the test cases that have been run. We first need to analyse some test cases to determine how they relate to requirements. This needs to be done by a testing specialist, a business analyst or a developer; it may require skills and insights we could not reasonably expect an end-user to have and we need this exercise done quickly. Once the sample analysis is completed we may have a simple way to determine requirements coverage for the whole UAT suite. We may find that requirements coverage is high, in which case the testing has been quite effective. If we find coverage is low, there should be some concern about possible undiscovered defects in important areas.

If the initial analysis cannot determine coverage, for example if tests have been constructed without reference to requirements, we can only draw the conclusion that requirements coverage is very low. This is, in itself, not necessarily a barrier to an improvement programme, but it points to potential problems ahead and it certainly identifies a gap that must be filled as part of the improvement programme – the achievement of systematic testing to provide coverage data for the future as changes are made.
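Once the analysis has mapped test cases to the requirements they exercise, the coverage figure itself is a simple set calculation. The data structures below are illustrative assumptions about how that mapping might be recorded.

```python
# Hedged sketch: requirements coverage from a case-to-requirements map.
# case_to_reqs maps each analysed test case to the set of requirement
# IDs it exercises - an assumed structure for illustration.

def requirements_coverage(case_to_reqs, all_requirements):
    """Return the fraction of requirements exercised by at least one
    analysed test case."""
    covered = set().union(*case_to_reqs.values()) if case_to_reqs else set()
    return len(covered & set(all_requirements)) / len(all_requirements)
```

A result near 1.0 supports the conclusion that testing has been quite effective; a low result flags the undiscovered-defect concern raised above.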

Determining whether emergency criteria have been met

Using all the data about incidents, defects, changes, usability and coverage that we can find, we should be able to draw some conclusions about risk of release. The data may not allow detailed conclusions but it would be unusual to find that this brief investigation did not lead to a conclusion that risk of release was relatively high or relatively low, and that may be enough for our purposes.

These three criteria are our emergency fallback. They are in no sense an alternative to more rigorous acceptance criteria because all they provide us with is a measure of the confidence we can have that the computer system is a suitable platform for making incremental improvements. This is a last-ditch effort to keep the system alive while we work out a plan to enhance it to the level we originally anticipated and paid for.
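Combining the three emergency criteria into a relatively-high or relatively-low conclusion can be sketched as follows. The 0-to-1 scales and the thresholds are illustrative assumptions only; each project would set its own.

```python
# Hedged sketch: combine the three emergency-release criteria into a
# rough relative conclusion. Scales (0-1) and thresholds are
# illustrative assumptions, not figures from the text.

def emergency_release_view(stability, usability, coverage):
    """Return a rough conclusion on the risk of release."""
    # A clearly unstable or unusable system, or largely untested scope,
    # is a poor platform for incremental improvement.
    if min(stability, usability) < 0.5 or coverage < 0.25:
        return "risk of release relatively high"
    if min(stability, usability, coverage) >= 0.75:
        return "risk of release relatively low"
    return "inconclusive - commentary needed for stakeholders"
```

The middle band is deliberate: as the text notes, the data may not allow detailed conclusions, and in that case the report should carry commentary rather than a bare verdict.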

DECISION PROCESS FOR EVALUATING UAT RESULTS

We now have a generic process for drawing conclusions about the outcome of UAT. The level of detail and the firmness of conclusions will depend on the extent and quality of testing done and the detail of records that have been kept, but the approach will be similar in each case.

Step 1 – contract acceptance

If, and only if, there was an external contract to build our system, or if a system was acquired with or without modification, there are certain to be criteria associated with acceptance of the system from the supplier(s). These criteria are likely to be similar in kind to those we define for our own acceptance decision and they should be measurable so that determination of acceptance is clear-cut and not open to debate. Step 1 is to evaluate these criteria (or that part of the criteria that relates to testing) and report the results to the relevant stakeholders with a recommendation whether to accept or not.

The process cannot proceed to the next step until this one is complete, so any discrepancies or negotiations need to be completed before we can move on to step 2.

If there was no third-party contractual involvement in development, the process moves on to step 2.

Step 2 – meeting acceptance criteria

Our own acceptance criteria were based on the extent to which business intent has been achieved. These, too, should be measurable and relatively straightforward to evaluate. We should have been tracking how far we are from achieving these criteria throughout UAT, so this final evaluation should not require major activity. It should yield a clear determination of whether or not the system meets its business intent and this should be reported to the relevant stakeholders with a recommendation. This is usually in the form of a UAT completion report that describes exactly what testing has been done for UAT and the results of that testing in relation to the acceptance criteria. Figure 9.2 is an outline of a UAT completion report.

In the event that acceptance criteria are not met, a decision by the stakeholders will require some input from the UAT team so we need to ensure that the UAT completion report incorporates an assessment of the performance of the system against each of the acceptance criteria, identifies how far the system is from achieving the criteria and assesses the implications of any gaps. Based on this analysis the report should suggest possible alternative courses of action for the stakeholders to consider.

Step 3 – assessing risk of release

Whenever acceptance criteria are not met we should carry out an assessment of the risk of releasing the system in its current state. The mechanism for this was outlined in step 2, but if data related to acceptance criteria are limited we can use the approach outlined in the section ‘Defining and evaluating emergency-release criteria’. Our conclusions in this case will typically be more tentative than those associated with a clear evaluation of acceptance criteria and, consequently, more commentary will normally be needed to enable the stakeholders to make their own evaluation of the possible outcomes and the risks associated with each of them so that they can make a release decision.

Figure 9.2 UAT completion report format


TEST SUMMARY REPORT CONCLUSIONS

The conclusions in the UAT completion report will need to be structured around the extent to which business intent has been achieved and the level of risk associated with releasing the system. We can identify a range of possible recommendations at this point.

Outcome 1 – release the system as it is

This is the most optimistic outcome. It implies that business benefits have been met and the risk of release is low. There may be some discrepancies in acceptance criteria but these are not significant enough to require any specific action.

Outcome 2 – defer release until key risk reduction measures are in place

Outcome 2 implies that business benefits have not been fully achieved and that risk of release is relatively high. Recommendations on risk reduction activities can also be matched by activities to enhance the business benefits achievable by the system. If, for example, the number and severity of defects not yet corrected are considered high, the recommendation may be to defer release while this situation is improved. Different levels of improvement will take different amounts of time and development resources, providing stakeholders with a spectrum of strategies to achieve more or less risk reduction in more or less time, using more or fewer resources. This allows other factors such as time or commercial pressures to be taken into account.

It would be prudent to measure stability and usability in this case to provide confidence that a risk reduction and improvement programme can be implemented without increasing the risk of a system failure.

Outcome 3 – release the system with additional support

Another option that can be considered is an immediate or early release but with additional resources committed to system support to offset the risk of early problems. This clearly depends on the nature of the expected problems and will be based on a risk analysis. Where the risk is related to possible user interface or performance problems that can be corrected within a reasonably short time frame, there may be an option to enhance the user support (for example by providing additional help resources and building FAQ lists from the test results to enable support staff to quickly diagnose problems and suggest effective ‘workarounds’).

As for outcome 2, there may be a spectrum of possible responses. Here, too, it would be prudent to measure stability and usability to provide confidence that a risk reduction and improvement programme can be implemented without increasing the risk of a system failure.

Outcome 4 – defer release and apply risk reduction and additional support

Outcome 4 is clearly a combination of outcomes 2 and 3. It becomes a serious alternative where there are risks associated with the system’s ability to achieve its main functions, necessitating a deferment while risk reduction is taking place, and there are also potential issues related to the user interface or performance. The time spent on risk reduction is likely to require significant development resources so the less-critical defects will not be corrected in the short term. The additional support therefore eases the problems when the system is released after a deferment.

In this situation it would be important to evaluate the emergency-release criteria to ensure that the system can be brought to an acceptable standard in a realistic time frame and at reasonable cost.

Outcome 5 – reject the system

Outcome 5 is a logical possibility although it is unlikely to be an outcome at this stage because the feedback from development and testing would most likely have identified serious problems before this stage is reached.

THE FINAL RELEASE DECISION

The final release decision is in the hands of stakeholders and not the UAT team, but it is important for the UAT team leader to be available for consultation and advice. Whatever release decision is taken, the UAT team’s results, data and experience are likely to be valuable in determining how to proceed to the next stage, whether that is an improvement programme or an immediate release of the system.

CHAPTER SUMMARY

This chapter has examined a range of possible scenarios at the end of UAT and for each of them it has identified an appropriate outcome and next steps.

After reading this chapter you should be able to answer the following questions:

•  How can I be sure the system is ready for use?

•  How do I know if the risk of releasing the software into service is manageable?

•  Who decides if the system can be accepted?

•  Who decides if the acceptance criteria have been met or not?

•  What happens if the acceptance criteria are not fully met?

•  What can we do to minimise risk if the acceptance criteria are partially met?

What have you learned?

Test your knowledge of Chapter 9 by answering the following questions. The correct answers can be found in Appendix B.

1. What are the three key decisions for acceptance?

A.  Was the build contracted to a third party?

B.  Is there an immovable deadline?

C.  Were there contractual criteria?

D.  Were the contractual criteria met?

E.  Does the system achieve the business benefits?

F.  Is the risk of releasing the system acceptable?

2. How can earlier UAT activities mitigate risk when testing has to finish early?

A.  Having taken a risk-based approach

B.  Having evaluated the UAT continuously

C.  Earlier UAT activities have no impact on mitigating risks during testing

D.  Both A and B are true

3. Which is the least likely outcome of the risk assessment?

A.  Release the system as it is

B.  Defer the release

C.  Release with extra support

D.  Do not release (reject the system)

Some questions to consider (our responses are in Appendix B)

1.  The test manager wants to set up a meeting to discuss the release towards the end of UAT. Who should they invite and why?

2.  There are a number of critical defects still outstanding. What does this mean in terms of the risk of release and to the release decision?
