Chapter 1

A Brief History of Software Testing

Modern testing tools are becoming increasingly advanced and user-friendly. The following describes how the activity of software testing has evolved, and continues to evolve, over time. This sets the perspective on where automated testing tools are heading.

Software testing is the activity of running a series of dynamic executions of a software program after the source code has been developed. It is performed to uncover and correct as many potential errors as possible before delivery to the customer. As pointed out earlier, software testing is still an “art.” It can be considered a risk management technique; as a quality assurance activity, it represents the last line of defense for correcting deviations that stem from errors in the specification, design, or code.

Throughout the history of software development, there have been many definitions of and advances in software testing. Figure 1.1 graphically illustrates this evolution. In the 1950s, software testing was defined as “what programmers did to find bugs in their programs.” In the early 1960s, the definition of testing underwent a revision. Consideration was given to exhaustive testing of the software in terms of the possible paths through the code, or total enumeration of the possible input data variations. It was noted that it was impossible to completely test an application because (1) the domain of program inputs is too large, (2) there are too many possible input paths, and (3) design and specification issues are difficult to test. Because of the foregoing points, exhaustive testing was discounted and found to be theoretically impossible.

As software development matured through the 1960s and 1970s, the activity of software development came to be referred to as “computer science.” In the early 1970s, software testing was defined as “what is done to demonstrate correctness of a program” or as “the process of establishing confidence that a program or system does what it is supposed to do.” A short-lived computer science technique proposed during the specification, design, and implementation of a software system was software verification through “correctness proof.” Although this concept was theoretically promising, in practice it proved too time consuming and, by itself, insufficient. For simple programs it was easy to show that the software “works” and to prove that it will work in theory. However, because most software was not verified this way, a large number of defects remained to be discovered during actual use. It was soon concluded that “proof of correctness” was an inefficient method of software testing. Even today, however, there is still a need for correctness demonstrations, such as acceptance testing, as described in various sections of this book.

Figure 1.1   History of software testing.

In the late 1970s it was stated that testing is a process of executing a program with the intent of finding an error, not proving that it works. The new definition emphasized that a good test case is one that has a high probability of finding an as-yet-undiscovered error. A successful test is one that uncovers an as-yet-undiscovered error. This approach was the exact opposite of that followed up to this point.

The foregoing two definitions of testing (prove that it works versus prove that it does not work) present a “testing paradox” with two underlying and contradictory objectives:

  1. To give confidence that the product is working well

  2. To uncover errors in the software product before its delivery to the customer (or the next stage of development)

If the first objective is to prove that a program works, it was determined that “we shall subconsciously be steered toward this goal; that is, we shall tend to select test data that have a low probability of causing the program to fail.”

If the second objective is to uncover errors in the software product, how can there be confidence that the product is working well, inasmuch as it was just proved that it is, in fact, not working? Today it is widely accepted by good testers that the second objective is more productive than the first, for if one accepts the first objective, the tester will subconsciously ignore defects while trying to prove that the program works.

The following good testing principles were proposed:

  1. A necessary part of a test case is a definition of the expected output or result.

  2. Programmers should avoid attempting to test their own programs.

  3. A programming organization should not test its own programs.

  4. Thoroughly inspect the results of each test.

  5. Test cases must be written for invalid and unexpected, as well as valid and expected, input conditions (a short sketch following this list illustrates principles 1 and 5).

  6. Examining a program to see if it does not do what it is supposed to do is only half the battle. The other half is seeing whether the program does what it is not supposed to do.

  7. Avoid throwaway test cases unless the program is truly a throwaway program.

  8. Do not plan a testing effort under the tacit assumption that no errors will be found.

  9. The probability of the existence of more errors in a section of a program is proportional to the number of errors already found in that section.
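
A minimal sketch (in Python, with a hypothetical parse_age routine invented here as the code under test) of how principles 1 and 5 look in practice: every test case states its expected result up front, and invalid or unexpected input is exercised as deliberately as valid input.

```python
import unittest

def parse_age(text):
    """Hypothetical function under test: converts a text field to an age in years."""
    value = int(text)          # raises ValueError for non-numeric input
    if not 0 <= value <= 150:  # reject out-of-range ages
        raise ValueError("age out of range: %d" % value)
    return value

class TestParseAge(unittest.TestCase):
    # Principle 1: the test case defines its expected output explicitly.
    def test_valid_and_expected_input(self):
        self.assertEqual(parse_age("42"), 42)

    # Principle 5: invalid and unexpected inputs are tested as deliberately
    # as valid ones, with the expected (error) outcome defined up front.
    def test_invalid_input_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_age("forty-two")
        with self.assertRaises(ValueError):
            parse_age("-5")

if __name__ == "__main__":
    unittest.main()
```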

The 1980s saw the definition of testing extended to include defect prevention. Designing tests is one of the most effective bug-prevention techniques known. It was suggested that a testing methodology was required; specifically, that testing must include reviews throughout the entire software development life cycle and that it should be a managed process. Also promoted was the importance of testing not just the program but the requirements, design, code, and the tests themselves.

“Testing” traditionally (up until the early 1980s) referred to what was done to a system once working code was delivered (now often referred to as system testing); today, however, testing is “greater testing,” in which a tester should be involved in almost every aspect of the software development life cycle. Once code is delivered to testing, it can be tested and checked, but if anything is wrong, the previous development phases have to be investigated. Whether an error was caused by a design ambiguity or a programmer oversight, it is simpler to find such problems as soon as they occur rather than to wait until a working product is produced. Studies have shown that about 50 percent of bugs are created at the requirements (what do we want the software to do?) or design stages, and these can have a compounding effect and create more bugs during coding. The earlier a bug or issue is found in the life cycle, the cheaper (often exponentially so) it is to fix. Rather than test a program and look for bugs in it, requirements and designs can be rigorously reviewed. Unfortunately, even today, many software development organizations still treat software testing as a back-end activity.

In the mid-1980s, automated testing tools emerged to automate the manual testing effort, improving testing efficiency and the quality of the target application. It was anticipated that the computer could perform more tests of a program than a human could perform manually, and do so more reliably. These tools were initially fairly primitive and did not have advanced scripting language facilities (see the section, “Evolution of Automated Testing Tools,” later in this chapter for more details).

In the early 1990s the power of early test design was recognized. Testing was redefined to be “planning, design, building, maintaining, and executing tests and test environments.” This was a quality assurance perspective of testing that assumed that good testing is a managed process, a total life-cycle concern with testability.

Also in the early 1990s, more advanced capture/replay testing tools offered rich scripting languages and reporting facilities. Test management tools helped manage all the artifacts, from requirements and test designs to test scripts and test defects. Commercially available performance-testing tools arrived as well; these stress- and load-tested the target system to determine its breaking points, which in turn facilitated capacity planning.

Although the concept of a test as a process throughout the entire software development life cycle has persisted, in the mid-1990s, with the popularity of the Internet, software was often developed without a specific testing standard model, making it much more difficult to test. Just as documents could be reviewed without specifically defining each expected result of each step of the review, so could tests be performed without explicitly defining everything that had to be tested in advance. Testing approaches to this problem are known as “agile testing.” The testing techniques include exploratory testing, rapid testing, and risk-based testing.

In the early 2000s Mercury Interactive (now owned by Hewlett-Packard [HP]) introduced an even broader definition of testing with the concept of business technology optimization (BTO). BTO aligns IT strategy and execution with business goals. It helps govern the priorities, people, and processes of IT. The basic approach is to measure and maximize value across the IT service delivery life cycle to ensure that applications meet quality, performance, and availability goals. An interactive digital cockpit revealed vital business availability information in real time to help IT and business executives prioritize IT operations and maximize business results. It provided end-to-end visibility into business availability by presenting key business process indicators in real time, as well as their mapping to the underlying IT infrastructure.

Historical Software Testing and Development Parallels

In some ways, software testing and automated testing tools are following a path similar to that of traditional development. The following briefly traces the evolution of software development and shows how deviations from prior best practices are also being observed in the software testing process.

The first computers were developed in the 1950s, and FORTRAN emerged as the first widely used high-level programming language. In the late 1960s, the concept of “structured programming” held that any program can be written using three simple constructs: simple sequence, if-then-else, and do-while statements. There were other prerequisites, such as the program being a “proper program,” whereby there must exist exactly one entry point and one exit point. The focus was on the process of creating programs.
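
As an illustration, the following minimal Python sketch (a hypothetical sum_of_positives routine, invented here rather than drawn from the literature) builds a complete routine from only the three constructs, with a single entry point and a single exit point.

```python
def sum_of_positives(values):
    """Structured-programming sketch: one entry point, one exit point,
    built only from sequence, selection (if-then-else), and iteration (while)."""
    total = 0                      # simple sequence
    index = 0
    while index < len(values):     # iteration (do-while style loop)
        if values[index] > 0:      # selection (if-then-else)
            total += values[index]
        else:
            pass                   # zero or negative values are skipped
        index += 1
    return total                   # the single exit point

print(sum_of_positives([3, -1, 4]))  # prints 7
```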

In the 1970s the development community focused on design techniques. They realized that structured programming was not enough to ensure quality: a program must be designed before it can be coded. Structured design and composite design techniques such as Yourdon’s, Myers’, and Constantine’s flourished and were accepted as best practice. The focus still had a process orientation.

The philosophy of structured design was partitioning and organizing the pieces of a system. By partitioning is meant the division of the problem into smaller subproblems, so that each subproblem eventually corresponds to a piece of the system. Highly interrelated parts of the problem should be in the same piece of the system; that is, things that belong together should go together. Unrelated parts of the problem should reside in unrelated pieces of the system; that is, things that have nothing to do with one another do not belong together.

In the 1980s, it was determined that structured programming and software design techniques were still not enough: the requirements for the programs must first be established if the right system is to be delivered to the customer. The focus was on quality, which occurs when the customer receives exactly what he or she wanted in the first place.

Many requirement techniques emerged, such as data flow diagrams (DFDs). An important part of a DFD is a store, a representation of where the application data will be stored. The concept of a store motivated practitioners to develop a logical-view representation of the data. Previously the focus had been on the physical view of data in terms of the database. The concept of a data model was then created: a simplified description of a real-world system in terms of its data, that is, a logical view of data. The components of this approach included entities, relationships, cardinality, referential integrity, and normalization. This also created a chicken-and-egg controversy as to which should come first: the process or the data. Prior to the logical representation of data, the focus was on the processes that interfaced with databases. Proponents of the logical view of data initially insisted that the data should be the first analysis focus, followed by the process. With time, it was agreed that both the process and the data must be considered jointly in defining the requirements of a system.
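
The following minimal Python sketch (hypothetical Customer and Order entities, invented purely for illustration) shows the flavor of these data-model components: two entities, a one-to-many relationship expressed through a foreign key, and a referential-integrity check.

```python
from dataclasses import dataclass

@dataclass
class Customer:                 # entity
    customer_id: int
    name: str

@dataclass
class Order:                    # entity; customer_id expresses the relationship
    order_id: int
    customer_id: int            # "foreign key": one customer, many orders (1:N cardinality)
    amount: float

customers = {1: Customer(1, "Acme Corp")}
orders = {}

def add_order(order):
    """Referential integrity: an order may only reference an existing customer."""
    if order.customer_id not in customers:
        raise ValueError("unknown customer %d" % order.customer_id)
    orders[order.order_id] = order

add_order(Order(100, 1, 250.0))       # accepted
# add_order(Order(101, 2, 99.0))      # would raise ValueError: customer 2 does not exist
print(orders)
```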

In the mid-1980s, the concept of information engineering was introduced. It was a new discipline that led the world into the information age. With this approach, there is more interest in understanding how information can be stored and represented, how information can be transmitted through networks in multimedia forms, and how information can be processed for various services and applications. Analytical problem-solving techniques, with the help of mathematics and other related theories, were applied to the engineering design problems. Information engineering stressed the importance of taking an enterprise view of application development rather than a specific application. By modeling the entire enterprise in terms of processes, data, risks, critical success factors, and other dimensions, it was proposed that management would be able to manage the enterprise in a more efficient manner.

During this same time frame, fourth-generation computers embraced microprocessor chip technology, and secondary storage advanced at a fantastic rate, with storage devices holding tremendous amounts of data. Software development techniques had vastly improved, and 4GLs made the development process much easier and faster. Unfortunately, the emphasis on quick turnaround of applications led to a retreat from fundamental development techniques in order to “get the code out” as quickly as possible. This reduced the emphasis on requirements and design, a trend that still persists today in many software development organizations.

Extreme Programming

Extreme programming (XP) is an example of such a trend. XP is an unorthodox approach to software development, and it has been argued that it has no design aspects. The extreme programming methodology proposes a radical departure from commonly accepted software development processes. There are really two XP rules: (1) Do a Little Design and (2) No Requirements, Just User Stories. Extreme programming disciples insist that “there really are no rules, just suggestions. XP methodology calls for small units of design, from ten minutes to half an hour, done periodically from one day between sessions to a full week between sessions. Effectively, nothing gets designed until it is time to program it.”

Although most people in the software development business understandably consider requirements documentation to be vital, XP recommends the creation of as little documentation as possible. No up-front requirement documentation is created in XP, and very little is created in the software development process.

With XP, the developer comes up with test scenarios before she does anything else. The basic premise behind test-first design is that the test class is written before the real class; thus, the end purpose of the real class is not merely to fulfill a requirement, but to pass all the tests that are in the test class. The problem with this approach is that independent testing is still needed to find out things about the product that the developer did not think about or was not able to discover during her own testing.
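
A minimal sketch of the test-first idea, in Python with a hypothetical ShoppingCart class invented for illustration: the test class is written first, and the real class is then written only to make those tests pass.

```python
import unittest

# Step 1 (written first): the test class defines what the real class must do.
class TestShoppingCart(unittest.TestCase):
    def test_total_of_added_items(self):
        cart = ShoppingCart()
        cart.add("book", 12.50)
        cart.add("pen", 2.00)
        self.assertEqual(cart.total(), 14.50)

# Step 2 (written second): just enough code to make the test pass.
class ShoppingCart:
    def __init__(self):
        self._items = []

    def add(self, name, price):
        self._items.append((name, price))

    def total(self):
        return sum(price for _, price in self._items)

if __name__ == "__main__":
    unittest.main()
```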

Evolution of Automated Testing Tools

Test automation started in the mid-1980s with the emergence of automated capture/replay tools. A capture/replay tool enables testers to record interaction scenarios. Such tools record every keystroke, mouse movement, and response that was sent to the screen during the scenario. Later, the tester may replay the recorded scenarios, and the capture/replay tool automatically notes any discrepancies from the expected results. Such tools improved testing efficiency and productivity by reducing manual testing efforts.

Figure 1.2   Motivation for test automation. (From “Why Automate,” Linda Hayes, Worksoft, Inc. white paper, 2002, www.worksoft.com. With permission.)

The cost justification for test automation is simple and can be expressed in a single figure (Figure 1.2). As this figure suggests, over time the number of functional features for a particular application increases owing to changes and improvements to the business operations that use the software. Unfortunately, the number of people and the amount of time invested in testing each new release either remain flat or may even decline. As a result, the test functional coverage steadily decreases, which increases the risk of failure, translating to potential business losses.

For example, if the development organization adds application enhancements equal to 10 percent of the existing code, the test effort is now 110 percent as great as it was before. Because no organization budgets more time and resources for testing than it does for development, it is effectively impossible for testers to keep up.

This is why applications that have been in production for years often experience failures. When test resources and time cannot keep pace, decisions must be made to omit the testing of some functional features. Typically, the newest features are targeted because the oldest ones are assumed to still work. However, because changes in one area often have an unintended impact on other areas, this assumption may not be true. Ironically, the greatest risk is in the existing features, not the new ones, for the simple reason that they are already being used.

Test automation is the only way to resolve this dilemma. By continually adding new tests for new features to a library of automated tests for existing features, the test library can track the application functionality.

The cost of failure is also on the rise. Whereas in past decades software was primarily found in back-office applications, today software is a competitive weapon that differentiates many companies from their competitors and forms the backbone of critical operations. Examples abound of undetected software errors causing losses in the tens or hundreds of millions, even billions, of dollars. Exacerbating the increasing risk are decreasing cycle times: product cycles have compressed from years into months, weeks, or even days. In these tight time frames, it is virtually impossible to achieve acceptable functional test coverage with manual testing.

Capture/replay automated tools have undergone a series of staged improvements. The evolutionary improvements are described in the following sections.

Static Capture/Replay Tools (without Scripting Language)

With these early tools, tests were performed manually and the inputs and outputs were captured in the background. During subsequent automated playback, the script repeated the same sequence of actions to apply the inputs and compare the actual responses to the captured results. Differences were reported as errors. The GUI menus, radio buttons, list boxes, and text were stored in the script, so there was little flexibility to accommodate changes to the GUI. The scripts resulting from this method contained hard-coded values that had to change if anything at all changed in the application. The costs associated with maintaining such scripts were astronomical and unacceptable. These scripts were often unreliable even if the application had not changed, and frequently failed on replay (pop-up windows, messages, and other “surprises” that did not happen when the test was recorded could occur). If the tester made an error entering data, the test had to be rerecorded. If the application changed, the test had to be rerecorded.
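
The following Python sketch (hypothetical actions and a simulated screen, not any real tool’s recording format) illustrates why such scripts were fragile: every input and expected response is hard-coded, so the smallest change to the application or its data invalidates the recording.

```python
# A hypothetical "recording": every input and expected screen response is
# hard-coded, so any change to the GUI or the data breaks the script.
recorded_steps = [
    ("type", "username_field", "jsmith"),
    ("type", "password_field", "secret99"),
    ("click", "login_button", None),
    ("expect_text", "welcome_banner", "Welcome, J. Smith"),
]

def replay(steps, screen):
    """Replays the fixed sequence and flags any deviation as an error."""
    for action, target, value in steps:
        if action == "expect_text":
            actual = screen.get(target, "")
            if actual != value:
                print("MISMATCH at %s: expected %r, got %r" % (target, value, actual))
        else:
            print("replaying %s on %s" % (action, target))

# Simulated screen state standing in for the application under test.
replay(recorded_steps, {"welcome_banner": "Welcome, John Smith"})  # reports a mismatch
```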

Static Capture/Replay Tools (with Scripting Language)

The next generation of automated testing tools introduced scripting languages. Now the test script was a program. Scripting languages were needed to handle conditions, exceptions, and the increased complexity of software. Automated script development, to be effective, had to be subject to the same rules and standards that were applied to software development. Making effective use of any automated test tool required at least one trained, technical person—in other words, a programmer.
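
The sketch below (plain Python standing in for a vendor scripting language, with a hypothetical click_login action and pop-up behavior) shows the kind of logic a scripting language made possible: handling an unexpected pop-up and retrying instead of simply failing on replay.

```python
# Sketch of what a scripted (rather than purely recorded) test adds:
# conditions, exception handling, and retries around the replayed steps.
class UnexpectedPopup(Exception):
    pass

def click_login(screen):
    """Hypothetical stand-in for a tool's 'click' action against the application."""
    if screen.get("popup"):
        raise UnexpectedPopup(screen["popup"])
    return screen.get("next_screen", "home")

def run_login_test(screen, max_attempts=2):
    for attempt in range(1, max_attempts + 1):
        try:
            result = click_login(screen)
            print("login reached screen:", result)
            return True
        except UnexpectedPopup as popup:          # handle the "surprise" instead of failing
            print("attempt %d: dismissing popup %r" % (attempt, str(popup)))
            screen.pop("popup", None)             # dismiss and retry
    return False

run_login_test({"popup": "Password expires in 3 days", "next_screen": "home"})
```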

Variable Capture/Replay Tools

The next generation of automated testing tools added variable test data to be used in conjunction with the capture/replay features. The difference between static and variable capture/replay is that in the former the inputs and outputs are fixed, whereas in the latter they are variable. This is accomplished by performing the testing manually and then replacing the captured inputs and expected outputs with variables whose corresponding values are stored in data files external to the script. Variable capture/replay is available in most testing tools that use a scripting language with variable data capability. Variable capture/replay and extended methodologies reduce the risk of not performing regression testing on existing features, improving the productivity of the testing process.
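
A minimal data-driven sketch in Python, with a hypothetical order_total function standing in for the application under test: the inputs and expected outputs live in an external data source (shown here as an in-memory CSV for brevity; in practice it would be a file on disk), and one script is replayed once per data row.

```python
import csv
import io

# External test data: the same recorded script is driven by many input rows.
test_data_csv = io.StringIO(
    "quantity,unit_price,expected_total\n"
    "1,10.00,10.00\n"
    "3,2.50,7.50\n"
    "0,5.00,0.00\n"
)

def order_total(quantity, unit_price):
    """Hypothetical function standing in for the application under test."""
    return round(quantity * unit_price, 2)

for row in csv.DictReader(test_data_csv):
    expected = float(row["expected_total"])
    actual = order_total(int(row["quantity"]), float(row["unit_price"]))
    status = "PASS" if actual == expected else "FAIL"
    print(status, row["quantity"], row["unit_price"], "->", actual)
```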

The problem with variable capture/replay tools, however, is that they still require a scripting language that needs to be programmed. But just as development programming techniques improved, new scripting techniques emerged.

The following are four popular techniques:

  1. Data-driven: The data-driven approach uses input and output values that are read from data files (such as CSV files, Excel files, text files, etc.) to drive the tests.

     This approach to testing with variable data re-emphasizes the criticality of addressing both process and data, as discussed in the “Historical Software Testing and Development Parallels” section. It is necessary to focus on both the test scripts and the test automation data, i.e., development data modeling. Unfortunately, the creation of automated test data is often a challenge. The creation of test data from the requirements (if they exist) is a manual and “intuitive” process. Tools such as Smartwave Technologies’ “Smart Test,” a test data generator, address this problem by scientifically generating intelligent test data that can be imported into automated testing tools as variable data (see Chapter 34, “Software Testing Trends,” for more details).

  2. Modular: The modular approach requires the creation of small, independent automation scripts and functions that represent modules, sections, and functions of the application under test.

  3. Keyword: The keyword-driven approach is one in which the different screens, functions, and business components are specified as keywords in a data table. The test data and the actions to be performed are scripted with the test automation tool (a minimal keyword-driven sketch follows this list).

  4. Hybrid: The hybrid is a combination of all of the foregoing techniques, integrating their strengths and trying to mitigate their weaknesses. It is defined by the core data engine, the generic component functions, and the function libraries. Whereas the function libraries provide generic routines useful even outside the context of a keyword-driven framework, the core engine and component functions are highly dependent on the existence of all three elements.

(See the section, “Test Automation Framework,” in Chapter 28 for more details of each technique.)
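
To make the keyword-driven idea concrete, here is a minimal Python sketch (hypothetical keywords and a trivial in-memory “application,” invented for illustration): a data table of keywords and their data drives a small engine that dispatches each keyword to an implementation function.

```python
# Keyword-driven sketch: each row of the table names a business-level keyword
# plus its data; a small engine dispatches keywords to implementation functions.
keyword_table = [
    ("open_account", {"name": "J. Smith"}),
    ("deposit",      {"amount": 100.0}),
    ("withdraw",     {"amount": 30.0}),
    ("check_balance", {"expected": 70.0}),
]

balance = {"value": 0.0}   # trivial stand-in for the application under test

def open_account(name):
    balance["value"] = 0.0

def deposit(amount):
    balance["value"] += amount

def withdraw(amount):
    balance["value"] -= amount

def check_balance(expected):
    assert balance["value"] == expected, "expected %s, got %s" % (expected, balance["value"])

keywords = {"open_account": open_account, "deposit": deposit,
            "withdraw": withdraw, "check_balance": check_balance}

for keyword, data in keyword_table:
    keywords[keyword](**data)   # the engine: look up the keyword and run it
print("keyword-driven scenario passed")
```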
