Chapter 6. Defect Removal Effectiveness

The concept of defect removal effectiveness and its measurement are central to software development. Defect removal is one of the top expenses in any software project, and it greatly affects schedules. Effective defect removal can shorten the development cycle time and improve product quality. To improve quality, productivity, cost, and schedule, it is important to use better defect prevention and removal technologies and to maximize their effectiveness. It is important for all projects and development organizations to measure the effectiveness of their defect removal processes.

In Chapter 4 we briefly touched on the metrics of defect removal effectiveness and fault containment. In this chapter we elaborate on the concept, its measurements, and its use in the phase-based defect removal model. After a brief literature review, we take a closer look at the defect injection and removal activities of the phases of a typical development process. Using a matrix approach to cross-tabulate defect data in terms of defect origin and phase of defect discovery (where found), we provide a detailed example of calculating the values of the overall defect removal effectiveness, the inspection effectiveness, and the test effectiveness, as well as the phase-specific defect removal effectiveness. We then establish the formulas for these indexes based on the defect origin/where found matrix. Next we elaborate on the role of defect removal effectiveness in quality planning with more examples. We discuss the cost effectiveness of phase defect removal and also the defect removal effectiveness levels in the context of the capability maturity model (CMM) before summarizing the chapter.

Before we begin, a point on terminology is in order. Some writers use the terms defect removal efficiency, error detection efficiency, fault containment, defect removal effectiveness, and the like. In this book we prefer the term effectiveness to efficiency. Efficiency implies an element of time, whereas effectiveness relates to the extent of impact; we think the latter is more appropriate. In the following sections we may sometimes use the two terms interchangeably, especially when we refer to the definitions and metrics of other writers.

Literature Review

In the 1960s and earlier, when software development was simply “code and test” and software projects were characterized by cost overruns and schedule delays, the only defect removal step was testing. In the 1970s, formal reviews and inspections were recognized as important to productivity and product quality, and thus were adopted by development projects. As a result, the value of defect removal as an element of the development process was strengthened. In his classic article on design and code inspections, Fagan (1976) touches on the concept of defect removal effectiveness. He defined error detection efficiency as:

$$\text{Error detection efficiency} = \frac{\text{errors found by an inspection}}{\text{total errors in the product before inspection}} \times 100\%$$

In an example Fagan cites of a COBOL application program, the total error detection efficiency for both design and code inspection was 82%. Such a degree of efficiency seemed outstanding. Specifically, the project found 38 defects per KNCSS (thousand noncommentary source statements) via design and code inspections, and 8 defects per KNCSS via unit testing and preparation for acceptance testing. No defects were found during acceptance testing or in actual usage in a six-month period; hence 38/(38 + 8) ≈ 82%. From this example we know that defects found in the field (actual usage of the software) were included in the denominator of Fagan’s calculation of defect removal efficiency.

Intriguingly, until the mid-1980s the concept of defect removal effectiveness and its measurement were seldom discussed in the literature with the attention its importance would merit (Jones, 1986). Not surprisingly, Jones’s definition, stated here, is very similar to Fagan’s:

$$\text{Defect removal efficiency} = \frac{\text{defects removed during a development phase}}{\text{defects latent in the product}} \times 100\%$$

Because the number of defects latent in the product at any given phase is not known, the denominator can only be approximated; it is usually estimated as the defects removed during the phase plus the defects found later.

In Jones’s definition, defects found in the field are included in the denominator of the formula.

IBM’s Federal Systems Division in Houston, Texas, developed mission-specific space shuttle flight software for the National Aeronautics and Space Administration (NASA) and was well known for its high product quality. The space shuttle is “fly-by-wire”; all the astronauts’ commands are sent from flight-deck controls to the computers, which then send out electronic commands to execute a given function. There are five computers onboard the shuttle. The Primary Avionics Software System (onboard software) is responsible for vehicle guidance, navigation, flight control, and numerous systems management and monitoring functions, and also provides the interface from the vehicle to crew and ground communications systems. The onboard software contains about 500,000 lines of source code. In addition, there are about 1.7 million lines of code for the ground software systems used to develop and configure the onboard system for shuttle missions (Kolkhorst and Macina, 1988).

IBM Houston won many quality awards from NASA and from the IBM Corporation for its outstanding quality in the space shuttle flight systems. For example, it received the first NASA Excellence Award for Quality and Productivity in 1987 (Ryan, 1987), and in 1989 it won the first Best Software Laboratory Award from the IBM Corporation. Its shuttle onboard software (PASS) has achieved defect-free quality since 1985, and the defect rate for the support systems was reduced to an extraordinarily low level. IBM Houston took several key approaches to improving its quality, one of which was a focus on rigorous and formal inspections. Indeed, in addition to design and code inspections, the IBM Houston software development process contained the phase of formal requirements analysis and inspection. The requirements, which are specified in precise terms and formulas, are much like the low-level design documents in commercial software. The rationale for the heavy focus on the front end of the process, of course, is to remove defects as early as possible in the software life cycle. Indeed, one of the four metrics IBM used to manage quality is the early detection percentage, which is actually inspection defect removal effectiveness. From Ryan (1987) and Kolkhorst and Macina (1988):

$$\text{Early detection percentage} = \frac{\text{number of major inspection errors}}{\text{total number of errors}} \times 100\%$$

where total number of errors is the sum of major inspection errors and valid discrepancy reports (discrepancy report is the mechanism for tracking test defects).

According to IBM Houston’s definitions, a major inspection error is any error found in a design or code inspection that would have resulted in a valid discrepancy report (DR) if the error had been incorporated into the software. Philosophical differences, errors in comments or documentation, and software maintenance issues are inspection errors that may be classified as minor and do not enter into this count. Valid DRs document that the code fails to meet the letter, intent, or operational purpose of the requirements. These DRs require a code fix, documented waiver, or user note to the customer. From the preceding formula it appears that the denominator does not include defects from the field, when the software is being used by customers. In this case, however, it is more a conceptual than a practical difference because the number of field defects for the shuttle software systems is so small.

IBM Houston’s data also substantiated a strong correlation between inspection defect removal effectiveness and product quality (Kolkhorst and Macina, 1988). For software releases from November 1982 to December 1986, the early detection percentages increased from about 50% to more than 85%. Correspondingly, the product defect rates decreased monotonically from 1984 to 1986 by about 70%. Figures 6.1 and 6.2 show the details.


From “Developing Error-Free Software,” IEEE AES Magazine: 25–31. Copyright © 1988 IEEE. Reprinted with permission.

Figure 6.1. Early Detection of Software Errors


From “Developing Error-Free Software,” IEEE AES Magazine: 25–31. Copyright © 1988 IEEE. Reprinted with permission.

Figure 6.2. Relative Improvement of Software Types

The effectiveness measure by Dunn (1987) differs little from Fagan’s and from Jones’s second definition. Dunn’s definition is:

$$E = \frac{N}{N + S} \times 100\%$$

where

E = Effectiveness of activity (development phase)

N = Number of faults (defects) found by activity (phase)

S = Number of faults (defects) found by subsequent activities (phases)

According to Dunn (1987), this metric can be tuned by selecting only defects present at the time of the activity and susceptible to detection by the activity.

Daskalantonakis (1992) describes the metrics used at Motorola for software development. Chapter 4 gives a brief summary of those metrics. Two of the metrics are in fact for defect removal effectiveness: total defect containment effectiveness (TDCE) and phase containment effectiveness (PCEi). For immediate reference, we restate the two metrics:

$$\mathrm{TDCE} = \frac{\text{number of prerelease defects}}{\text{number of prerelease defects} + \text{number of postrelease defects}} \times 100\%$$

$$\mathrm{PCE}_i = \frac{\text{number of phase } i \text{ errors}}{\text{number of phase } i \text{ errors} + \text{number of phase } i \text{ defects}} \times 100\%$$

where phase i errors are problems found during the development phase in which they were introduced, and phase i defects are problems found later than the development phase in which they were introduced.

The definitions and metrics of defect removal effectiveness just discussed differ little from one another. However, there are subtle differences that may cause confusion. Such differences are negligible if the calculation is for the overall effectiveness of the development process, or if there is only one phase of inspection. However, if there are separate phases of activities and inspections before code integration and testing, which is usually the case in large-scale development, the differences can be significant. The reason is that when the inspection of an early phase (e.g., high-level design inspection) takes place, the defects of later phases (e.g., coding defects) have not yet been injected into the product. Therefore, “defects present at removal operation” may be very different from (less than) “defects found plus defects found later,” or “N + S.” In this regard Dunn’s (1987) view on the fine-tuning of the metric is to the point. Also, Motorola’s PCEi could be quite different from the others. In the next section we take a closer look at this metric.

A Closer Look at Defect Removal Effectiveness

To define defect removal effectiveness clearly, we must first understand the activities in the development process that are related to defect injection and removal. Defects are injected into the product or into intermediate deliverables of the product (e.g., design documents) at various phases. It is wrong to assume that all software defects are injected at the beginning of development. Table 6.1 shows an example of the activities in which defects can be injected or removed for a development process.

For the development phases before testing, the development activities themselves are subject to defect injection, and the reviews or inspections at end-of-phase activities are the key vehicles for defect removal. For the testing phases, the testing itself is for defect removal. When the problems found by testing are fixed incorrectly, there is another chance to inject defects. In fact, even for the inspection steps, there are chances for bad fixes. Figure 6.3 describes the detailed mechanics of defect injection and removal at each step of the development process. From the figure, defect removal effectiveness for each development step, therefore, can be defined as:

$$\text{Defect removal effectiveness} = \frac{\text{defects removed (at the step)}}{\text{defects existing on step entry} + \text{defects injected during development}} \times 100\%$$

Figure 6.3. Defect Injection and Removal During One Process Step


Table 6.1. Activities Associated with Defect Injection and Removal

| Development Phase | Defect Injection | Defect Removal |
|---|---|---|
| Requirements | Requirements-gathering process and the development of programming functional specifications | Requirements analysis and review |
| High-level design | Design work | High-level design inspections |
| Low-level design | Design work | Low-level design inspections |
| Code implementation | Coding | Code inspections |
| Integration/build | Integration and build process | Build verification testing |
| Unit test | Bad fixes | Testing itself |
| Component test | Bad fixes | Testing itself |
| System test | Bad fixes | Testing itself |

This is the conceptual definition. Note that defects removed is equal to defects detected minus incorrect repairs. If an ideal data tracking system existed, all elements in Figure 6.3 could be tracked and analyzed. In reality, however, it is extremely difficult to reliably track incorrect repairs. Assuming the percentage of incorrect repairs or bad fixes is not high (based on my experience with the AS/400, about 2% of fixes during testing are bad fixes), defects removed can be approximated by defects detected. If the bad-fix percentage is high, and an estimate of it is available, one may want to adjust the effectiveness metric accordingly.

To derive an operational definition, we propose a matrix approach by cross-classifying defect data in terms of the development phase in which the defects are found (and removed) and the phases in which the defects are injected. This requires that for each defect found, its origin (the phase where it was introduced) be decided by the inspection group (for inspection defects) or by agreement between the tester and the developer (for testing defects). Let us look at the example in Figure 6.4.


Figure 6.4. Defect Data Cross-Tabulated by Where Found (Phase During Which Defect Was Found) and Defect Origin

Once the defect matrix is established, calculations of various effectiveness measures are straightforward. The matrix is triangular because the origin of a defect is always at or prior to the phase in which it is found. In this example there were no formal requirements inspections, so we are not able to assess the effectiveness of the requirements phase. Defects can, however, be injected in the requirements phase and found later in the development cycle; therefore, the requirements phase also appears in the matrix as one of the defect origins. The diagonal values for the testing phases represent the numbers of bad fixes. In this example all bad fixes are detected and fixed, this time correctly, within the same phase. In some cases, however, bad fixes may go undetected until subsequent phases.

Based on the conceptual definition given earlier, we calculate the various effectiveness metrics as follows.

  • High-Level Design Inspection Effectiveness; IE (I0)

    • Defects removed at I0: 730

    • Defects existing on step entry (escapes from requirements phase): 122

    • Defects injected in current phase: 859

      $$IE(I_0) = \frac{730}{122 + 859} \times 100\% = 74\%$$
  • Low-Level Design Inspection Effectiveness; IE (I1)

    • Defects removed at I1: 729

    • Defects existing on step entry (escapes from requirements phase and I0):

      $$(122 + 859) - 730 = 251$$
    • Defects injected in current phase: 939

      $$IE(I_1) = \frac{729}{251 + 939} \times 100\% = 61\%$$
  • Code Inspection Effectiveness; IE (I2)

  • Defects removed at I2: 1095

  • Defects existing on step entry (escapes from requirements phase, I0 and I1):

    $$(251 + 939) - 729 = 461$$
  • Defects injected in current phase: 1537

    $$IE(I_2) = \frac{1095}{461 + 1537} \times 100\% = 55\%$$
  • Unit Test Effectiveness; TE (UT)

  • Defects removed at UT: 332

  • Defects existing on step entry (escapes from all previous phases):

    $$(461 + 1537) - 1095 = 903$$
  • Defects injected in current phase (bad fixes): 2

    $$TE(UT) = \frac{332}{903 + 2} \times 100\% = 36.7\%$$

For the testing phases, the defect injection (bad fixes) is usually a small number. In such cases, effectiveness can be calculated by an alternative method (Dunn’s formula or Jones’s second formula, as discussed earlier). In cases with a high bad-fix rate, the original method should be used.

For example, for unit test the alternative method gives

$$TE(UT) = \frac{332}{332 + 387 + 110 + 81} \times 100\% = 36.5\%$$

Component Test Effectiveness; TE (CT)

Defects removed at CT: 387; defects existing on step entry: (903 + 2) − 332 = 573; defects injected (bad fixes): 4.

$$TE(CT) = \frac{387}{573 + 4} \times 100\% = 67\%$$

System Test Effectiveness; TE (ST)

Defects removed at ST: 110; defects existing on step entry: (573 + 4) − 387 = 190; defects injected (bad fixes): 1.

$$TE(ST) = \frac{110}{190 + 1} \times 100\% = 58\%$$

Overall Inspection Effectiveness; IE

$$IE = \frac{730 + 729 + 1095}{122 + 859 + 939 + 1537} \times 100\% = \frac{2554}{3457} \times 100\% = 74\%$$

or

$$IE = \frac{2554}{2554 + 332 + 387 + 110 + 81} \times 100\% = \frac{2554}{3464} \times 100\% = 74\%$$

Overall Test Effectiveness; TE

$$TE = \frac{332 + 387 + 110}{903 + 7} \times 100\% = \frac{829}{910} \times 100\% = 91\%$$

Overall Defect Removal Effectiveness of the Process; DRE

$$DRE = \frac{3464 - 81}{3464} \times 100\% = \frac{3383}{3464} \times 100\% = 97.7\%$$

To summarize, the values of defect removal effectiveness from this example are as follows:

  • I0: 74%

  • I1: 61%

  • I2: 55%

  • Overall Inspection Defect Removal Effectiveness: 74%

  • UT: 36%

  • CT: 67%

  • ST: 58%

  • Overall Test Defect Removal Effectiveness: 91%

  • Overall Defect Removal Effectiveness of the Process: 97.7%

From the matrix of Figure 6.4 it is easy to understand that the PCEi used by Motorola is somewhat different from phase defect removal effectiveness. PCEi refers to the ability of the phase inspection to remove defects introduced during a particular phase, whereas phase defect removal effectiveness as discussed here refers to the overall ability of the phase inspection to remove defects that were present at that time. The latter includes the defects introduced at that particular phase as well as defects that escaped from previous phases. Therefore, the phase containment effectiveness (PCE) values will be higher than the defect removal effectiveness values based on the same data. The PCEi values of our example are as follows.

  • I0: 681/859 = 79%

  • I1: 681/939 = 73%

  • I2: 941/1537 = 61%

  • UT: 2/2 = 100%

  • CT: 4/4 = 100%

  • ST: 1/1 = 100%

Assume further that the data in Figure 6.4 are the defect data for a new project with 100,000 lines of source code (100 KLOC). Then we can calculate a few more interesting metrics such as the product defect rate, the phase defect removal rates, phase defect injection rates, the percent distribution of defect injection by phase, and phase-to-phase defect escapes. For instance, the product defect rate is 81/100 KLOC = 0.81 defects per KLOC in the field (for four years of customer usage). The phase defect removal and injection rates are shown in Table 6.2.

Having gone through the numerical example, we can now formally state the operational definition of defect removal effectiveness. The definition requires information on all defect data (including field defects) in terms both of defect origin and of the stage at which each defect is found and removed. The definition is based on the defect origin/where found matrix.

Let j = 1, 2, . . . , k denote the phases of software life cycle.

Let i = 1, 2, . . . , k denote the inspection or testing types associated with the life-cycle phases including the maintenance phase (phase k).

Table 6.2. Phase Defect Removal and Injection Rates from Figure 6.4

| Phase | Defects/KLOC (Removal) | Defect Injection per KLOC | Total Defect Injection (%) |
|---|---|---|---|
| Requirements | | 1.2 | 3.5 |
| High-level design | 7.3 | 8.6 | 24.9 |
| Low-level design | 7.3 | 9.4 | 27.2 |
| Code | 11.0 | 15.4 | 44.5 |
| Unit test | 3.3 | | |
| Component test | 3.9 | | |
| System test | 1.1 | | |
| Total | 33.9 | 34.6 | 100.1 |

Then matrix M (Figure 6.5) is the defect origin/where found matrix. In the matrix, only cells $N_{ij}$ where $i \geq j$ (the cells in the lower left triangle) contain data. Cells on the diagonal ($N_{ij}$ where $i = j$) contain the numbers of defects that were injected and detected in the same phase; cells below the diagonal ($N_{ij}$ where $i > j$) contain the numbers of defects that originated in earlier development phases and were detected later. Cells above the diagonal are empty because it is not possible for an earlier development phase to detect defects that originate in a later phase. The row marginals ($N_{i\cdot}$) of the matrix are defects by removal activity; the column marginals ($N_{\cdot j}$) are defects by origin.


Figure 6.5. Defect Origin/Where Found Matrix—Matrix M

Phase defect removal effectiveness (PDREi) can be phase inspection effectiveness [IE(i)] or phase test effectiveness [TE(i)]

$$\mathrm{PDRE}_i = \frac{N_{i\cdot}}{\sum_{j=1}^{i} N_{\cdot j} - \sum_{m=1}^{i-1} N_{m\cdot}} \times 100\%$$

(the denominator is the number of defects existing on entry to phase i plus the defects injected during phase i)

Phase defect containment effectiveness (PDCEi)

$$\mathrm{PDCE}_i = \frac{N_{ii}}{N_{\cdot i}} \times 100\%$$

Overall inspection effectiveness (IE)

$$\mathrm{IE} = \frac{\sum_{i=1}^{I} N_{i\cdot}}{\sum_{j=1}^{I} N_{\cdot j}} \times 100\%$$

where I is the number of inspection phases.

Overall test effectiveness (TE)

$$\mathrm{TE} = \frac{\sum_{i=I+1}^{k-1} N_{i\cdot}}{\sum_{j=1}^{k-1} N_{\cdot j} - \sum_{i=1}^{I} N_{i\cdot}} \times 100\%$$

where I + 1, I + 2, . . . , k − 1 are the testing phases.

Overall defect removal effectiveness (DRE) of the development process:

$$\mathrm{DRE} = \frac{\sum_{i=1}^{k-1} N_{i\cdot}}{\sum_{j=1}^{k-1} N_{\cdot j}} \times 100\%$$
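As a concrete illustration of these formulas, here is a minimal sketch in Python (not part of the original text) that computes the phase and overall metrics from a small, purely hypothetical defect origin/where found matrix; the counts, the phase layout, and the function names are invented for illustration.

```python
import numpy as np

# Hypothetical defect origin/where-found matrix (NOT the Figure 6.4 data).
# Rows = removal activities in phase order, last row = field (maintenance);
# columns = defect origin phases. Only the lower triangle holds counts.
M = np.array([
    [40,  0, 0, 0],   # design inspection: 40 design-origin defects
    [10, 60, 0, 0],   # code inspection: 10 design escapes + 60 code-origin
    [ 8, 25, 1, 0],   # test: escapes from design and code, plus 1 bad fix
    [ 2,  5, 0, 0],   # field: defects that escaped the whole process
])
I = 2  # number of inspection phases (design, code)
k = M.shape[0]

def pdre(M, i):
    """Phase defect removal effectiveness: removed / (present at entry + injected)."""
    injected_so_far = M[:, : i + 1].sum()  # column marginals N.j for j <= i
    removed_before = M[:i, :].sum()        # row marginals N_m. for m < i
    return M[i, :].sum() / (injected_so_far - removed_before)

def pdce(M, i):
    """Phase defect containment effectiveness: N_ii / N_.i."""
    return M[i, i] / M[:, i].sum()

ie = M[:I, :].sum() / M[:, :I].sum()                              # overall inspection effectiveness
te = M[I:k - 1, :].sum() / (M[:, :k - 1].sum() - M[:I, :].sum())  # overall test effectiveness
dre = M[:k - 1, :].sum() / M[:, :k - 1].sum()                     # overall process DRE

print([round(pdre(M, i), 2) for i in range(k - 1)])  # [0.67, 0.64, 0.83]
print(round(ie, 2), round(te, 2), round(dre, 2))     # 0.73 0.83 0.95
```

Note that only the row marginals, column marginals, and diagonal of the matrix enter these computations, which is why the formulas can be evaluated even when some off-diagonal cell splits are unknown.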

Defect Removal Effectiveness and Quality Planning

Phase defect removal effectiveness and related metrics associated with effectiveness analyses (such as defect removal and defect injection rates) are useful for quality planning and quality management. These measurements clearly indicate which phase of the development process we should focus on for improvement (e.g., unit testing in our example in Figure 6.4). Effectiveness analyses can be done for the entire project as well as for local areas, such as at the component level and specific departments in an organization, and the control chart technique can be used to enforce consistent improvement across the board (e.g., Figure 5.14 in Chapter 5). Longitudinal release-to-release monitoring of these metrics can give a good feel for the process capability of the development organization. In addition, experiences from previous releases provide the basis for phase-specific target setting and for quality planning.

Phase-Based Defect Removal Model

The phase-based defect removal model (DRM) summarizes the relationships among three metrics—defect injection, defect removal, and effectiveness. The DRM takes a set of error-injection rates and a set of phase-effectiveness rates as input, then models the defect removal pattern step by step. It takes a simplified view of Figure 6.3 and works like this:

Defect removal at a phase = (defects escaped from the previous phase + defects injected during the phase) × phase removal effectiveness

Defects at exit of a phase = (defects escaped from the previous phase + defects injected during the phase) × (1 − phase removal effectiveness)

For example, the metrics derived from data in Figure 6.4 can be modeled step by step as shown in Table 6.3.

Now if we are planning for the quality of a new release, we can modify the values of the parameters based on the set of improvement actions that we are going to take. If we plan to improve the effectiveness of I2 and unit tests by 5%, how much can we expect to gain in the final product quality? What are the new targets for defect rates for each phase (before the development team exits the phase)? If we invest in a defect prevention process and in an intensive program of technical education and plan to reduce the error injection rate by 10%, how much could we gain? Approximate answers to questions like these could be obtained through the DRM, given that the DRM is developed from the organization’s experience with similar development processes.

Be aware that the DRM is a quality management tool, not a device for software reliability estimation. Unlike the parametric models that we will discuss in later chapters, the DRM cannot reliably estimate the product quality level. It cannot do so because the error injection rates may vary from case to case even for the same development team. The rationale behind this model is that if one can ensure that the defect removal pattern by phase is similar to one’s experience, one can reasonably expect that the quality of the current project will be similar.

Table 6.3. Example of Phase-Based Defect Removal Model

| Phase | (A) Defects Escaped from Previous Phase (per KLOC) | (B) Defect Injection (per KLOC) | Subtotal (A + B) | Removal Effectiveness | Defect Removal (per KLOC) | Defects at Exit of Phase (per KLOC) |
|---|---|---|---|---|---|---|
| Requirements | | 1.2 | 1.2 | | | 1.2 |
| High-level design | 1.2 | 8.6 | 9.8 | × 74% | = 7.3 | 2.5 |
| Low-level design | 2.5 | 9.4 | 11.9 | × 61% | = 7.3 | 4.6 |
| Code | 4.6 | 15.4 | 20.0 | × 55% | = 11.0 | 9.0 |
| Unit test | 9.0 | | 9.0 | × 36% | = 3.2 | 5.8 |
| Component test | 5.8 | | 5.8 | × 67% | = 3.9 | 1.9 |
| System test | 1.9 | | 1.9 | × 58% | = 1.1 | 0.8 |
| Field | 0.8 | | | | | |
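The stepwise logic of Table 6.3 is simple enough to script. The following is a minimal sketch in Python (not from the original text) that reproduces the table from the injection rates and phase effectiveness values of the Figure 6.4 example; the function name, the data layout, and the output format are illustrative assumptions.

```python
# Phase name, defect injection (defects/KLOC), and removal effectiveness,
# taken from the Figure 6.4 example; zero injection for the test phases
# ignores the (small) bad-fix rates.
PHASES = [
    ("Requirements",      1.2, 0.00),  # no requirements inspection here
    ("High-level design", 8.6, 0.74),
    ("Low-level design",  9.4, 0.61),
    ("Code",             15.4, 0.55),
    ("Unit test",         0.0, 0.36),
    ("Component test",    0.0, 0.67),
    ("System test",       0.0, 0.58),
]

def run_drm(phases):
    escaped = 0.0  # defects/KLOC escaping from the previous phase (A)
    for name, injected, effectiveness in phases:
        subtotal = escaped + injected       # (A) + (B)
        removed = subtotal * effectiveness  # defect removal this phase
        escaped = subtotal - removed        # defects at exit of phase
        print(f"{name:17s} in={subtotal:5.1f}  removed={removed:4.1f}  out={escaped:4.1f}")
    return escaped

field_rate = run_drm(PHASES)
print(f"Estimated field defect rate: {field_rate:.1f} defects/KLOC")  # ~0.8
```

Rerunning the sketch with modified effectiveness or injection values answers what-if questions like those posed earlier, for example, raising the I2 and unit test effectiveness by five percentage points.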

Some Characteristics of a Special Case Two-Phase Model

Remus and Zilles (1979) elaborate the mathematical relationships of defect removal effectiveness, the number of defects found during the front end of the development process (before the code is integrated), the number found during testing, and the number remaining when the product is ready to ship to customers. They derived some interesting characteristics of the defect removal model in a special case:

  1. There are only two phases of defect removal.

  2. The defect removal effectiveness for the two phases is the same.

The percentage of bad fixes is one of the parameters in the Remus and Zilles model; the derivation involves more than twenty formulas. Here we take a simplified approach without taking bad fixes into account. Interestingly, despite taking a different approach, we arrive at the same conclusion as Remus and Zilles did.

Assume there are two broad phases of defect removal activities:

  1. The activities handled directly by the development team (design reviews, code inspections, and unit tests), which for large software projects take place before the code is integrated into the system library.

  2. The formal machine tests after code integration.

Further assume that the defect removal effectiveness of the two broad phases is the same. Define:

MP = major problems found during reviews/inspections and unit testing (phase 1); these are the problems that, if not fixed, would result in testing defects or defects in the field.

PTR = problem tracking reports after code integration: errors found during formal machine tests.

μ = MP/PTR, μ > 1 (the higher the value of μ, the more effective the front end).

Q = number of defects in the released software: defects found in the field (customer usage).

TD = total defects for the life of the software = MP + PTR + Q.

By the definition of effectiveness, letting E1 and E2 denote the effectiveness of the first and second phases:

Equation 6.1. 

$$E_1 = \frac{MP}{TD}$$

Equation 6.2. 

$$E_2 = \frac{PTR}{TD - MP}$$

By the assumption that the two phases have the same effectiveness:

$$\frac{MP}{TD} = \frac{PTR}{TD - MP}$$

Thus,

Equation 6.3. 

$$TD = \frac{MP^2}{MP - PTR}$$

Then,

$$Q = TD - (MP + PTR) = \frac{MP^2}{MP - PTR} - (MP + PTR) = \frac{PTR^2}{MP - PTR}$$

Therefore,

Equation 6.4. 

$$Q = \frac{PTR^2}{MP - PTR} = \frac{PTR}{\mu - 1} \qquad (\text{since } MP = \mu\,PTR)$$

By the same token, it can be shown that:

Equation 6.5. 

$$Q = \frac{MP}{\mu(\mu - 1)}$$

Furthermore, from the definition of μ:

$$TD = MP + PTR + Q = \mu\,PTR + PTR + \frac{PTR}{\mu - 1} = \frac{\mu^2\,PTR}{\mu - 1}$$

Therefore,

Equation 6.6. 

$$Q = \frac{TD}{\mu^2}, \quad \text{that is,} \quad \mu = \sqrt{TD/Q}$$

Equations (6.4) through (6.6) can be useful for quality planning. The equations can be applied to absolute numbers as well as to normalized rates (e.g., defects per KLOC). Given the number of MP and μ, or PTR and μ, one can estimate the number of defects that remained in the product by Equations (6.4) and (6.5). Also, assuming we use the lifetime defect rate (TD) of a predecessor product to approximate the TD of the product being developed, given a target product quality level to shoot for, Equation (6.6) can determine the value of μ that we need to achieve in order to reach the target. Choosing a specific value of μ determines how much focus a project should have on front-end defect removal. Once the μ target is set, the team can determine the defect removal techniques to use (e.g., formal inspection, function verification by owner, team verifications, rigorous unit testing, etc.). For example, if we use the data from the example of Figure 6.4 (TD = 34.6 defects/KLOC, Q = 0.81 defects/KLOC for life of customer use), then the value of μ should be:

$$\mu = \sqrt{34.6 / 0.81} = \sqrt{42.7} \approx 6.5$$

This means that if the effectiveness is the same for the two phases, then the number of defects to be removed by the first phase must be at least 6.5 times the number to be removed by testing in order to achieve the quality target. Note that the equations described in this section are valid only under the assumptions stated. They cannot be generalized. Although Equations (6.4) and (6.5) can be used to estimate product quality, this special case DRM is still not a projection model. The equal effectiveness assumption cannot be verified until the product defect rate Q is known or estimated via an independent method. If this assumption is violated, the results will not be valid.
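As an illustration of how Equations (6.4) through (6.6) can support planning, here is a small sketch in Python; the helper names are hypothetical, and the numbers are the example values from this section.

```python
import math

def q_from_ptr(ptr, mu):
    """Equation 6.4: Q = PTR / (mu - 1)."""
    return ptr / (mu - 1)

def q_from_mp(mp, mu):
    """Equation 6.5: Q = MP / (mu * (mu - 1))."""
    return mp / (mu * (mu - 1))

def mu_for_target(td, q_target):
    """Equation 6.6 rearranged: mu = sqrt(TD / Q)."""
    return math.sqrt(td / q_target)

# Example values: TD = 34.6 defects/KLOC (lifetime), target Q = 0.81 defects/KLOC.
mu = mu_for_target(34.6, 0.81)
ptr = 34.6 * (mu - 1) / mu**2  # implied testing defect rate, from TD = mu^2 PTR / (mu - 1)
print(f"required mu: {mu:.1f}")                           # ~6.5
print(f"implied PTR: {ptr:.1f} defects/KLOC")             # ~4.5
print(f"check Q via Eq. 6.4: {q_from_ptr(ptr, mu):.2f}")  # ~0.81
```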

Cost Effectiveness of Phase Defect Removal

In addition to the defect removal effectiveness by phase per se, the cost of defect removal must be considered for efficient quality planning. Defect removal at earlier development phases is generally less expensive. The closer the defects are found relative to where and when they are injected, the less the removal and rework effort. Fagan (1976) contends that rework done at the I0, I1, and I2 inspection levels can be 10 to 100 times less expensive than if it were done in the last half of the process (the formal testing phases after code integration). According to Freedman and Weinberg (1982, 1984), in large systems, reviews can reduce the number of errors that reach the testing phases by a factor of 10, and such reductions cut testing costs, including review costs, by 50% to 80%. Remus (1983) studied the cost of defect removal during the three major life-cycle phases of design and code inspection, testing, and customer use (maintenance phase) based on data from IBM’s Santa Teresa (California) Laboratory. He found the cost ratio for the three phases to be 1 to 20 to 82.

Based on sample data from IBM Rochester, we found the defect removal cost ratio for the three phases for the AS/400 similar to Remus’s, at 1 to 13 to 92. A caution: these numbers should not be interpreted simplistically, because defects that escape to the later testing phases and to the field are more difficult to find. When we invest in and improve the front end of the development process to prevent these more difficult defects from escaping to the testing phases and to the field, the ratios may decrease. Nonetheless, as long as the marginal cost of additional front-end defect removal remains less than that of testing and field maintenance, additional investment in the front end is warranted.

Our sample study also revealed interesting but understandable findings. The cost of defect removal is slightly higher for I0 inspection than for I1 and I2 (Figure 6.6). The main reason is that external interfaces are affected and more personnel are involved in I0 inspection meetings. The cost of creating and answering a problem trouble report during testing (i.e., the problem determination cost) is correlated with the testing phase, defect origin, and defect severity (1 being the most severe and 4 the least) (Figure 6.7).

Cost of Defect Removal by Inspection Phase

Figure 6.6. Cost of Defect Removal by Inspection Phase

Cost of Creating and Answering a Problem Trouble Report by Several Variables

Figure 6.7. Cost of Creating and Answering a Problem Trouble Report by Several Variables

In his work on software inspection, Gilb (1993, 1999) conducted thorough analyses with ample data. The findings corroborate those discussed here and support the general argument that software inspection not only improves the quality of the product but also benefits the economics of the project and the organization.

Although front-end defect removal activities in the form of reviews, walk-throughs, and inspections are less expensive than testing, in general practice, these methods are not rigorous enough. Fagan’s inspection method is a combination of a formal review, an inspection, and a walkthrough. It consists of five steps:

  1. overview (for communications and education)

  2. preparation (for education)

  3. inspection (to find errors and to walk through every line of code)

  4. rework (to fix errors), and

  5. follow-up (to ensure all fixes are applied correctly)

Such a combination has made Fagan’s method somewhat more formal and therefore more effective than earlier methods. The Active Design Reviews method, introduced by Parnas and Weiss (1985), represents an important advance. The approach involves conducting several brief reviews rather than one large review, thereby avoiding many of the difficulties of conventional reviews. Knight and Myers (1991) proposed the phased inspection method to improve the rigor of the process. It consists of a series of coordinated partial inspections called phases (the term phase is thus used differently here). Each phase is designed to achieve a desirable property in the product (e.g., portability, reusability, or maintainability), and the responsibilities of each inspector are specified and tracked.

Knight and Myers defined two types of phases. The first type, referred to as a single-inspector phase, is a rigidly formatted process driven by a list of unambiguous checks, for example, internal documentation, source code layout, source code readability, programming practices, and local semantics. The second type of phase, designed to check for those properties of the software that cannot be captured in a precise yes or no statement (such as functionality and freedom from defects), is called the multi-inspector phase. Multiple personnel conduct independent examinations and then compare findings to reach reconciliation. To facilitate and enforce the process, the phased inspection method also involves the use of an online computer tool. The tool contains navigation facilities for displaying the work product, documentation display facilities, facilities for the inspector to record comments, and facilities to enforce the inspection process.

Advances such as the preceding offer organizations much promise for improving the front-end defect removal effectiveness. Beyond reviews and inspections, one can even adopt formal methods such as the Cleanroom functional verification (as discussed in Chapter 2).

Defect Removal Effectiveness and Process Maturity Level

Based on a special study commissioned by the Department of Defense, Jones (Software Productivity Research, 1994; Jones, 2000) estimates the defect removal effectiveness for organizations at different levels of the development process capability maturity model (CMM):

  • Level 1: 85%

  • Level 2: 89%

  • Level 3: 91%

  • Level 4: 93%

  • Level 5: 95%

These values can be used as comparison baselines for organizations to evaluate their relative capability with regard to this important parameter.

In a discussion on quantitative process management (a process area for Capability Maturity Model Integration, CMMI, level 4) and process capability baselines, Curtis (2002) shows the estimated baselines for defect removal effectiveness by phase of defect insertion (or defect origin in our terminology). The cumulative percentages of defects removed up through acceptance test (the last phase before the product is shipped) by phase of insertion, for CMMI level 4, are shown in Table 6.4. Based on historical and recent data from three software engineering organizations at General Dynamics Decision Systems, Diaz and King (2002) report the phase containment effectiveness by CMM level as follows:

  • Level 2: 25.5%

  • Level 3: 41.5%

  • Level 4: 62.3%

  • Level 5: 87.3%

Table 6.4. Cumulative Percentages of Defects Removed by Phase for CMMI Level 4

| Phase Inserted | Cumulative % of Defects Removed Through Acceptance Test |
|---|---|
| Requirements | 94% |
| Top-level design | 95% |
| Detailed design | 96% |
| Code and unit test | 94% |
| Integration test | 75% |
| System test | 70% |
| Acceptance test | 70% |

It is not clear how many key phases there are in the development process for these projects, or the extent of variation in containment effectiveness across phases. It appears that these statistics represent the average effectiveness of peer reviews and testing for a number of projects at each maturity level. Therefore, these statistics can perhaps be roughly interpreted as overall inspection effectiveness or overall test effectiveness.

According to Jones (2000), in general, most forms of testing are less than 30% efficient. The cumulative efficiency of a sequence of test stages, however, can top 80%.

These findings demonstrate a certain level of consistency with one another and with the example in Figure 6.4. The Figure 6.4 example is based on a real-life project. No process maturity assessment was conducted for the project, but the process was mature and quantitatively managed. Based on the key process practices and the excellent field quality results, the project would likely be at level 4 or level 5 of a process maturity scale.

More empirical studies and findings on this subject will surely produce useful knowledge. For example, test effectiveness and inspection effectiveness by process maturity, characteristics of distributions at each maturity level, and variations across the type of software are all areas for which reliable benchmark baselines are needed.

Summary

Effective defect removal during the development process is central to the success of a software project. Despite the variations in terms of terminology and operational definitions (error detection efficiency, removal efficiency, early detection percentage, phase defect removal effectiveness, phase defect containment effectiveness, etc.), the importance of the concept of defect removal effectiveness and its measurement is well recognized. Literature and industry examples substantiate the hypothesis that effective front-end defect removal leads to improved quality of the end product. The relative cost of front-end defect removal is much lower than the cost of formal testing at the back end and the maintenance phase when the product is in the field.

To measure phase defect removal effectiveness, it is best to use the matrix approach in which the defect data are cross-tabulated in terms of defect origin and the phase in which the defects are found. Such an approach permits the estimation of phase defect injection and phase defect removal. In general, the shorter the time between defect origin and defect discovery, the more effective and the less expensive the development process will be. The special case of the two-phase defect removal model even provides a link between the relative effectiveness of front-end defect removal and the estimated outcome of the quality of the product.

Based on recent studies, defect removal effectiveness by the level of process maturity has been assessed and comparison baselines have been established. Organizations with established data on their defect removal effectiveness can make comparisons to the baselines and assess their maturity level with regard to this important parameter.

In quality planning it is important that, in addition to the final quality goals, factors such as the defect model, the phase defect removal targets, the process and specific methods used, the possible effectiveness of the methods, and so forth be examined. Inclusion of these factors in early planning facilitates achievement of the software’s quality goals.

It should be noted that defect removal effectiveness and defect removal models are useful quality planning and management tools. However, they are not equipped for quality or reliability projections; they are not predictive models. In the next several chapters, we discuss the parametric models that were developed to perform such tasks.

References
