Chapter 8. The Future of Software Testing

“The best way to predict the future is to invent it.”

Alan Kay

Welcome to the Future

Modern software testing is a far cry from the discipline as practiced in the past. It has evolved and changed a great deal since the middle part of the last century, when the first programs were being written.

In the early days of computer programming, the same people who wrote software were the ones who tested it. Programs were small and by today’s standards very simple. They were algorithms, physics problems really, that were often completely specified (as most mathematical algorithms tend to be), meant only to be run on a single computer in a very controlled environment and used by people who knew what they were doing. Such control of complexity, the operational environment, and usage patterns is a far cry from modern software that must run on nearly any computer and be used by nearly any user and that solves problems much more diverse than the physics problems and military applications that dominated the birth of computing. Today’s software truly requires professional test engineers.

But somewhere between the dawn of computing and the time of this writing, software testing took a fateful turn. There was a point at which the need for software applications outpaced the ability of trained programmers to produce them. There simply were not enough coders. One of the many solutions to this problem was to separate the roles of developer and tester. To free up the time of those trained to code, the problem of testing that code was put into the hands of a new class of IT professionals, the software testers.

This was not a partition of the existing developer community into dev and test; that would not have served to increase the number of developers. Instead, software testing became more of a clerical role, the reasoning being that because they didn't have to program, testers did not need to be as technical.

Obviously, there were exceptions, and places like IBM, DEC, and the new player on the block Microsoft hired programming talent for testing positions, particularly within groups that produced low-level applications such as compilers, operating systems, and device drivers. But outside these islands of ISVs, the tradition of hiring nontechnical people for testing roles became pervasive.

Modern testers still come from the nontechnical (or at least non–computer science) ranks, but there is a great deal of training and on-the-job mentoring available now, and this trend is slowly reversing. In my opinion, however, the slow evolution of our discipline is not enough to keep pace with the major advances in computing and software development. Applications are getting much more complicated. Platforms are becoming much more capable, and their complexity is mushrooming. The applications of the future that will be built on these new platforms and that were discussed in Chapter 1, “The Case for Software Quality,” will require a level of testing sophistication that we do not currently possess. How will we as a test community rise to the challenges of the future and be in a position to test these applications and help make them as reliable and trustworthy as they need to be? In a world where everything runs on computers and everyone depends on software, are we confident that the way we test now is good enough?

I am not. In fact, in many ways the applications of the future are just not testable given our current toolset and knowledge base. The rate of failure of nearly all of our current software systems makes it hard to argue that what we have now in testing is enough for today's applications, much less tomorrow's. To have some degree of optimism in the quality of tomorrow's software, we are going to need some new tools and techniques to help us get there.

That’s what this chapter is about: painting a picture of the future of software testing that will actually work to deliver the highly reliable applications of tomorrow. These are a collection of ideas and visions about what software testing ought to be.

The Heads-Up Display for Testers

The tester sits in her office, papers full of printed diagrams, prose, and technical information scattered haphazardly across her workspace, a full dozen folders open on her desktop containing links to specifications, documentation, and bug databases. With one eye, she watches her email and instant message clients waiting for answers from her developers about bug fixes. With the other eye, she watches the application under test for symptoms of failure and her test tools for progress indicators. Her mind is a hodgepodge of test cases, bug data, spec data...so much information that she’s overwhelmed with it and yet not enough information to do her job properly.

Contrast the tester’s predicament with that of the computer (or console) gamer. Gamers have no need for workspace. That area can be used for empty soda cans and chip wrappers; every piece of information they require about their video game is provided by the video game itself. Unlike the tester who must sit and wonder what the application is doing behind the interface, gamers can see their entire world laid out in front of them. Thanks to their heads-up display, information seeps into their consciousness automatically.

Consider the wildly popular World of Warcraft online game and its heads-up display. In the upper-right corner of the screen, a mini map of the world pinpoints the hero’s exact location. Across the entire bottom of the screen, every tool, spell, weapon, capability, and trick the hero possesses is laid out for easy access. In the upper left, information about objects in the world, adversaries, and opportunities stream into view as the hero moves about the world. This information, called the “heads-up display,” overlays the users’ view of the game world, making them more effective without reducing their ability to play and enjoy it.

This parallels software testing nicely, I think, and suggests a compelling vision for our future—a future in which information from the application and the documents and files that define it is seamlessly displayed in a configurable skin that overlays the application under test: the tester's heads-up display, or THUD for short.

Imagine a heads-up display that allows the tester to hover the cursor over a UI control and have a window into the source code materialize. Don't want to see the source? That's fine because you can also view the code churn (fixes and modifications to the source code) information and bug fix history, and view all the prior test cases and test values for that control and any other information relevant to the tester. It is a wealth of information at your fingertips, presented in a noninvasive manner to be consumed by the tester as needed. Useful information painted on the canvas of the application under test.

Information is important enough that the THUD allows any number of overlays. Imagine a game world where targeting information is grafted over adversaries so that they are easier to shoot. The tester can see architectural layer dependencies overlaid on top of the UI while the application is running, quickly spotting hotspots and understanding the interaction between the inputs she applies and architectural and environmental dependencies. You could experience the interaction between a web app and its remote data store, much like Master Chief of the Halo Xbox series conquers a level. You could watch inputs trickle through two levels of stored procedures, or see the interaction between an API and its host file system through a visual experience that mirrors the game world experience of today.

From this knowledge will come much better guidance than what we have now. As we play our video game called software testing, we will know when bugs get fixed and what components, controls, APIs, and so forth are affected. Our heads-up display will tell us. As we test, we will be reminded of prior inputs we applied and past testing results. We’ll be reminded of which inputs are likely to find problems and which ones already have been part of some automated suite or even a prior unit test. The THUD will be a constant companion to the manual tester and a font of knowledge about the application under test, its structure, its assumptions, its architecture, its bugs, and entire test history.

The existence of the THUD will be to testers what the HUD is to gamers now. You simply don’t play a video game with the HUD turned off. Without the information the HUD provides, you’d never manage to navigate the complex and dangerous world of the game. HUD-less, you could have but one strategy: Try everything you can think of and hope for the best. That pretty much sums up what a lot of THUD-less software testing is today. Without the right information, displayed when and how we need it, what else are we supposed to do?

In the future, the experience of the software tester will be unrecognizable compared to what it is today. Manual testing will become much more like playing a video game.

“Testipedia”

The THUD and the technology it will enable will make testing much more repeatable, reusable, and generalizable. Testers will be able to record manual test cases and have them automatically converted to automated test cases. This will make it easier for testers across different teams, organizations, or even companies to share test experiences and test assets. The development of resources to access and share these assets is the obvious next step. I call these resources Testipedia, after their obvious inspiration, Wikipedia.

Wikipedia is one of the most novel and useful sites on the Internet and often the very top result of an Internet search. It’s based on the idea that all the information about every concept or entity that exists is in the head of some human being. What if we got all those human beings to build an encyclopedia that exposed that knowledge to everyone else? Well, that happened, and the result is www.wikipedia.org.

Now, let's apply the Wikipedia concept to testing. I conjecture that every function you can test has already been tested somewhere and at some time by some tester who has come before you. Need to test a username and password entry dialog? It's been tested before. Need to test a web form that displays the contents of a shopping cart? It's been tested before. Indeed, every input field, function, behavior, procedure, API, or feature has been tested before. And if not the exact feature, something so close that whatever testing applied to that prior product also applies to yours to some greater or lesser extent. What we need is a Testipedia that encapsulates all this testing knowledge and surfaces the test cases in a form usable by any software tester.

Two main things have to happen for Testipedia to become a reality. First, tests have to be reusable so that a test that runs on one tester’s machine can be transported to another tester’s machine and execute without modification. Second, test cases have to be generalized so that they apply to more than just a single application. Let’s talk about these in order and discuss what will have to happen to make this a reality.

Test Case Reuse

Here’s the scenario: One tester writes a set of test cases and automates them so that she can run them over and over again. They are good test cases, so you decide to run them, as well. However, when you do run them, you find they won’t work on your machine. Your tester friend used automation APIs that you don’t have installed on your computer and scripting libraries that you don’t have either. The problem with porting test cases is that they are too specific to their environment.

In the future, we will solve this problem with a concept I call environment-carrying tests. Test cases of the future will be written in such a way that they will encapsulate their environment needs within the test case using virtualization.1 Test cases will be written within virtual capsules that embed all the necessary environmental dependencies so that the test case can run on whatever machine you need it to run on.

1 Virtualization plays a significant role in my future testing vision, as you will see later in this chapter.
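A minimal sketch of what an environment-carrying test might look like follows. The structure is entirely hypothetical—the class, its fields, and the image and package names are invented for illustration—but it captures the key idea: the test case travels with a manifest of its environment needs, which a runner could use to provision a matching virtual machine before executing the steps.

```python
from dataclasses import dataclass, field

@dataclass
class EnvironmentCarryingTest:
    """A test case that declares everything it needs to run, so a
    harness can provision a matching virtual environment for it."""
    name: str
    os_image: str                               # VM or container image name
    packages: list = field(default_factory=list)
    steps: list = field(default_factory=list)   # callables, run in order

    def manifest(self):
        """The environment capsule a runner would use for provisioning."""
        return {"image": self.os_image, "packages": sorted(self.packages)}

    def run(self, context):
        """Execute the steps; in a real system this would happen inside
        the provisioned environment, not in the local process."""
        return all(step(context) for step in self.steps)

# Hypothetical usage: the test carries its environment description along.
test = EnvironmentCarryingTest(
    name="login-roundtrip",
    os_image="winserver-2008-sp1",
    packages=["script-runtime-2.0", "ui-automation-lib"],
    steps=[lambda ctx: ctx.get("login_succeeded", False)],
)
print(test.manifest())
print(test.run({"login_succeeded": True}))  # True
```

The point of the design is that the manifest, not the host machine, is the source of truth about dependencies, which is what makes the test portable.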

The scope of technological advances we need for this to happen is fairly modest. However, the Achilles' heel of reuse has never been technological so much as economic. The real work required to reuse software artifacts has always fallen on the consumer of the reused artifact and not on its producer. What we need is an incentive for testers to write reusable test cases. So, what if we create a Testipedia that stored test cases and paid the contributing tester, or their organization, for contributions? What is a test case worth? A dollar? Ten dollars? More? Clearly they have value, and a database full of them would have enough value that a business could be created to host the database and resell test cases on an as-needed basis. The more worthy a test case, the higher its value, and testers would be incentivized to contribute.

Reusable test cases will have enough intrinsic value that a market for test case converters would likely emerge so that entire libraries of tests could be provided as a service or licensed as a product.

But this is only part of the solution. Having test cases that can be run in any environment is helpful, but we still need test cases that apply to the application we want to test.

Test Atoms and Test Molecules

Microsoft, like other companies I have worked for, is really a bunch of smaller companies all working under the same corporate structure. SQL Server, Exchange, Live, Windows Mobile...there are a lot of testers writing a lot of test cases for a lot of applications. Too bad these test cases are so hard to transfer from application to application. Testers working on SQL Server, for example, don't find it easy to consume test cases from, say, Exchange even though both products are large server applications.

The reason for this is that we write test cases that are specifically tied to a single application. This shouldn’t come as any big surprise given that we’ve never expected test cases to have any value outside our immediate team. But the picture I’ve painted of the future requires exactly that. And if you accept the argument that test cases have value outside their immediate project, there will be financial incentive to realize that value.

Instead of writing a test case for an application, we could move down a level and write them for features instead. Any number of web applications implement a shopping cart, so test cases written for such a feature should be applicable to any number of applications. The same can be said of many common features such as connecting to a network, making SQL queries to a database, username and password authentication, and so forth. Feature-level test cases are far more reusable and transferable than application-specific test cases.

The more focused we make the scope of the test cases we write, the more general they become. Features are more focused than applications, functions and objects are more focused than features, controls and data types are more focused than functions, and so forth. At a low enough level, we have what I like to call “atomic” test cases. A test atom is a test case that exists at the lowest possible level of abstraction. Perhaps you’d write a set of test cases that simply submits alphanumeric input into a text box control. It does one thing only and doesn’t try to be anything more. You may then replicate this test atom and modify it for different purposes. For example, if the alphanumeric string in question is intended to be a username, a new test atom that encoded the structure of valid usernames would be created. Over time, Testipedia entries for thousands (and hopefully orders of magnitude more) of such test atoms would be collected.
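As a sketch of the atom idea, consider the alphanumeric text box example above. The functions below are hypothetical—the "control" is modeled as a simple callable that accepts or rejects a string, and the username rule (start with a letter, 3–12 characters) is invented for illustration—but they show how an atom does one thing only, and how replicating and modifying it yields a new, more specialized atom.

```python
import random
import string

def alphanumeric_atom(control, length=8, seed=None):
    """A test atom: do one thing only -- submit an alphanumeric string
    to a text-box control and report whether it was accepted."""
    rng = random.Random(seed)
    value = "".join(rng.choices(string.ascii_letters + string.digits, k=length))
    return control(value)

def username_atom(control, seed=None):
    """The same atom, replicated and modified to encode the structure of
    a valid username (hypothetical rule: starts with a letter, 3-12 chars)."""
    rng = random.Random(seed)
    length = rng.randint(3, 12)
    first = rng.choice(string.ascii_letters)
    rest = "".join(rng.choices(string.ascii_letters + string.digits, k=length - 1))
    return control(first + rest)

# A stand-in control that accepts any nonempty alphanumeric string.
accepts_alnum = lambda s: s.isalnum() and len(s) > 0
print(alphanumeric_atom(accepts_alnum, seed=1))  # True
print(username_atom(accepts_alnum, seed=1))      # True
```

Because each atom is self-contained and parameterized only by the control it targets, thousands of them could be catalogued and applied to any application exposing a matching control.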

Test atoms will be combined into test molecules, as well. Two alphanumeric string atoms might be combined into a test molecule that tests a username and password dialog box. I can see cases where many independent test authors would build such molecules and then over time the best such molecule would win out on Testipedia, and yet the alternatives would still be available. With the proper incentives, test case authors would build any number of molecules that could then be borrowed, leased, or purchased for reuse by application vendors that implement similar functionality.

An extremely valuable extension of this idea is to write atoms and molecules in such a way that they will understand whether they apply to an application. Imagine highlighting and then dragging a series of 10,000 tests onto an application and having the tests themselves figure out whether they apply to the application and then running themselves over and over within different environments and configurations. At some point, there would exist enough test atoms and molecules that the need to write new, custom tests would be minimal.
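A rough sketch of a self-applying molecule follows. Everything here is hypothetical—the application "surface" is modeled as a simple mapping from control names to validator callables, and the control names are invented—but it illustrates the two behaviors described above: composing atoms into a molecule, and having the molecule decide for itself whether it applies to a given application.

```python
def make_login_molecule(atoms):
    """A test molecule assembled from two atoms. It carries an applies()
    check so it can be dropped onto an arbitrary application and decide
    for itself whether to run (control names are hypothetical)."""
    required = {"username_field", "password_field"}

    def applies(app_surface):
        # The molecule applies only if the app exposes both controls.
        return required <= set(app_surface)

    def run(app_surface):
        if not applies(app_surface):
            return None  # not applicable: skip silently
        controls = ["username_field", "password_field"]
        return all(atom(app_surface[name]) for atom, name in zip(atoms, controls))

    return applies, run

# Hypothetical application surfaces: one with a login dialog, one without.
login_app = {"username_field": str.isalnum, "password_field": lambda s: len(s) >= 1}
report_app = {"date_field": str.isdigit}

applies, run = make_login_molecule([lambda c: c("alice1"), lambda c: c("s3cret")])
print(applies(login_app), applies(report_app))  # True False
print(run(login_app))                           # True
```

Dragging 10,000 such tests onto an application would then amount to calling each molecule's applies() check and running only the ones that answer yes.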

Virtualization of Test Assets

Environment-carrying test cases are only the tip of the iceberg when it comes to the use of virtual machines in the future of software testing. One of the major complexities a tester faces, as discussed in Chapter 3, “Exploratory Testing in the Small,” is the availability of actual customer environments in which to run her test cases. If we are shipping our application to run on consumer machines, how do we anticipate how those machines will be configured? What other applications will be running on them? How do we get the machines in our test labs configured in a similar manner so that our tests are as realistic as possible?

At Microsoft, we have an amazing tool called "Watson" that gives insight into what user machines look like and how our applications fail in actual user situations. Watson is built in to applications that run on user machines. It detects catastrophic failures and allows the user to choose whether to package up information about the failure and send it to Microsoft where it can be diagnosed and fixed. (Fixes are delivered through Windows Update or other similar update services by other vendors.)

Virtualization technology could be used in the future to improve this process and to contribute customer environments to Testipedia for reuse. Imagine that instead of just sending the bug report to Microsoft or some other vendor, the entire machine (minus personal information obviously) is virtualized and sent to the vendor over the Internet. Developers could debug a problem from the actual point of failure on the user’s machine. Debugging of field failures would be reduced dramatically, and over time a cache of virtual machines representing thousands of real customer environments would be stored. With the right infrastructure, these VMs could be made available as part of a virtual test lab. A trade in buying, selling, and leasing test labs would supplement Testipedia and relegate test lab design and construction to the dustbin of twentieth-century testing history.

Visualization

The availability of reusable test labs and test cases will make the job of future software testers much more of a design activity than the low-level test case construction activity that it is today. Testers will be collecting prebuilt artifacts and using sophisticated lab management solutions to organize and execute their tests. There exists the distinct possibility that without the proper oversight and insight, all this testing could miss the mark by a wide margin.

This points to the need for better software visualization so that testers can monitor testing progress and ensure that the tests are doing the job that needs to be done.

But how does one visualize software? What indeed does software look like? Software is not a physical product, like an automobile, that we can see, touch, and analyze. If an automobile is missing a bumper, it’s easy to see the defect, and everyone associated with building the car can agree on the fix and when the fix is complete. With software, however, this scenario is not quite that easy. Missing and broken parts aren’t so readily distinguished from working parts. Tests can execute without yielding much information about whether the software passed or failed the test or how new tests contribute to the overall knowledge pool about application quality. Clearly, visualization tools that expose important software properties and allow useful images of the software, both at rest and in use, to assist testers of the future would fill an important gap in our testing capability.

Visualizations of software can be based on actual software assets or properties of those assets. For example, inputs, internally stored data, dependencies, and source code are all concrete software assets that can be rendered in ways useful for testers to view. Source code can be visualized textually in a code editor or pictorially as a flow graph. For example, Figure 8.1 is a screen shot of a testing tool used in Microsoft’s Xbox and PC games test teams to help a tester visualize code paths.

Figure 8.1. A UI visualization tool from the Microsoft Games Test Org.

image

Instead of seeing paths as code artifacts, the visual displays the flow in terms of which screen in the game follows other screens.2 It’s a sequence of UI elements that allows testers to look ahead and understand how their inputs (in this case, the input is steering the car through Project Gotham Racing II) take them through different paths of the software. This visual can be used to drive better coverage and can help testers select inputs that take them to more interesting and functionally rich places in the game.

2 This particular image is taken from the testing of Project Gotham Racing II and is used with permission of the Games Test Org at Microsoft.

Visualizations can also be based on properties of the application, like churned (changed) code, coverage, complexity, and so forth. When the visualizations are based on information used to guide testing decisions, that’s when they are most useful. For example, if you wanted to test, say, the components that are the most complex, how would you decide how to choose the right components?

During the testing of Windows Vista, a visualization tool was constructed to expose such properties in a way that is consumable by software testers. The screen snap in Figure 8.2 is one example of this.

Figure 8.2. A reliability analysis tool used to visualize complexity.

image

Figure 8.2 is a visualization of the components of Vista and their relative complexity.3 Note that each labeled square is a component grouped according to the feature set to which the component belongs. The size of each square represents one of the numerous complexity measures the Windows group has defined, and the color represents a different complexity measure. Thus, bigger and darker squares represent more complex features. If our goal were to focus testing on complexity, this visual would be an excellent starting point.

3 This tool was developed by Brendan Murphy of Microsoft Research Cambridge and is used with permission.
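The mechanics behind such a heat map can be sketched simply. The code below is illustrative only—the metric names (cyclomatic complexity, coupling) are placeholders, not the Windows group's actual measures—but it shows the core mapping: one complexity measure drives each square's area, another drives its shade, and cells are grouped by feature set.

```python
def heat_map_cells(components):
    """Turn per-component metrics into heat-map cells: square area from
    one complexity measure, shade from another, grouped by feature set.
    Metric names are placeholders for whatever measures a team defines."""
    max_coupling = max(c["coupling"] for c in components) or 1
    cells = []
    for c in components:
        cells.append({
            "feature": c["feature"],
            "name": c["name"],
            "area": c["cyclomatic"],                # bigger square = more complex
            "shade": c["coupling"] / max_coupling,  # darker = more coupled
        })
    # Group cells by feature set, most complex first within each group.
    cells.sort(key=lambda c: (c["feature"], -c["area"]))
    return cells

# Hypothetical component data.
components = [
    {"feature": "shell", "name": "explorer", "cyclomatic": 120, "coupling": 40},
    {"feature": "net",   "name": "tcpip",    "cyclomatic": 300, "coupling": 80},
    {"feature": "shell", "name": "taskbar",  "cyclomatic": 45,  "coupling": 10},
]
for cell in heat_map_cells(components):
    print(cell["feature"], cell["name"], cell["area"], round(cell["shade"], 2))
```

A tester focusing on complexity would simply start at the biggest, darkest cells the computation surfaces and work outward.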

Good visualizations will require support from the runtime environment in which the target software is operating. Interactions between the system under test and its external resources, such as the files, libraries, and APIs, can no longer be something that happens privately and invisibly and that only complicated tools like debuggers can reveal. We need the equivalent of X-ray machines and MRIs that bring the inner workings of software to life in three dimensions. Imagine a process similar to medical dye injection where an input is injected into a working system and its path through the system is traced visually. As the dyed input travels through the application’s components, its impact on external resources can be monitored and experiments conducted to isolate bugs, identify performance bottlenecks, and compare various implementation options. It will be this level of transparency, scrutiny, instrumentation, and analysis that will enable us to achieve the level of sophistication in software that is currently enjoyed in medical science. Software is both complicated enough and important enough that such an investment is needed.

Clearly, some visualizations will prove to be more useful than others, and in the future we will be able to pick and choose which ones we need and when. As we gain experience using visualizations, best practice will emerge to provide guidance on how to optimize their usage.

Testing in the Future

So what does the future hold for the software tester? How will the THUD-like tools and Testipedia work with virtualization and visualization tools to remake the future of the software tester? Here’s how I envision it.

Imagine a software development organization. Perhaps it is a start-up developing an application meant to run on GPS-enabled cellular phones or the IT department in a giant financial institution building a new line of business app. Perhaps it is even Microsoft itself building a cloud-based application interacting with a small client-side managed application. Any of these vendors will build their application and then contract with a test designer. For either the start-up or the IT shop, this is likely to be an external consultant, whereas Microsoft may use full-time employees for this task.

In either case, the test designer will analyze the application’s testing requirements and interfaces and document the testing needs. She will then satisfy those needs by identifying the test assets she needs to fulfill the testing requirements. She may lease or purchase those assets from commercial vendors or from a free and open source like Testipedia.4

4 Reusable test assets, both virtualized operational environments and test cases/test plans, are likely to have good commercial value. I can envision cases where multiple vendors provide commercial off-the-shelf-test assets coexisting with a Testipedia that provides a bazaar-like open source market. Which model prevails is not part of my prediction; I’m only predicting their general availability, not the acquisition and profit model.

The result will be any number of virtualized operational environments; I imagine tens of thousands or more, and millions of applicable test cases and their variants. These environments would then be deployed on a cloud-based virtual test lab, and the test cases would be executed in parallel. In a matter of a few hours, the amount of testing would exceed centuries of person-effort. Coverage of the application's source code, dependent libraries and resources, UI, and so forth would be measured to quality levels far beyond what our current technology achieves and would likely encompass every conceivable use case. All this will be managed by visualization, measurement, and management tools that will provide automated bug reporting and build management so that little human monitoring will be necessary.

This future will be possible only after some years or decades of developing and collecting reusable test assets. What it will eventually mean for software testers is that they will no longer be consumed by the low-level tasks of writing test cases and executing them. They will move up several layers of abstraction to the point where they will be designing test plans and picking and choosing among relevant existing test cases and automation frameworks.

For the start-up developing cell phone apps, they would be able to acquire virtual environments representing every conceivable cell phone their customers would use. LOB app developers would be able to simulate user environments with every conceivable configuration and with thousands of potentially conflicting applications installed. Microsoft would be able to create test environments for its cloud-based application that would meet or exceed the complexity and diversity of the real production environment.

Test atoms and test molecules numbering in the hundreds of millions would, over time, be gathered and submitted individually and in groups. These tests will scour the application looking for every place where they apply, and then execute automatically, compiling their own results into the larger test monitoring system so that the human test designer can tweak how the automation is working and measure its progress. Over hours of actual time, centuries of testing will occur, and as bugs are fixed, the tests will rerun themselves at the exact moment the application is available.

By the time the application is released, every conceivable test case will have been run against every conceivable operational environment. Every input field will have been pummeled with legal and illegal input numbering in the millions. Every feature will be overtested, and every potential feature conflict or application compatibility issue will have been checked and double-checked. All possible outputs that the application can generate will be generated numerous times, and its state space will be well covered. Test suites for security, performance, privacy, and so forth will have run on every build and intermediate build. Gaps in test coverage will be identified automatically and new tests acquired to fill these gaps.

And all this is before the application ships. After it ships, testing will continue.

Post-Release Testing

Even with the centuries of testing we will be able to achieve, testing can never really be complete. If we're not done testing when the product releases, why should we stop? Test code should ship with the binary, and it should survive release and continue doing its job without the testers being present to provide ongoing testing and diagnostics.

Part of this future is already here. The Watson technology mentioned earlier (the famous “send/don’t send” error reporting for Windows apps) that ships in-process allows the capture of faults when they occur in the field. The next logical step is to do something about them.

Watson captures a fault and snaps an image of relevant debug info. Then some poor sap at the other end of the pipe gets to wade through all that data and figure out a way to fix it via Windows update. This was revolutionary in 2004, still is actually. In two to five years, it will be old school.

What if that “poor sap” could run additional tests and take advantage of the testing infrastructure that existed before the software was released? What if that poor sap could deploy a fix and run a regression suite in the actual environment in which the failure occurred? What if that poor sap could deploy a production fix and tell the application to regress itself?

He would no longer be a poor sap, that’s for sure.

To accomplish this, it will be necessary for an application to remember its prior testing and carry along that memory wherever it goes. And that means that the ability to test itself will be a fundamental feature of software of the future. Our job will be to figure out how to take our testing magic and embed it into the application itself. The coolest software feature of the future could very well be placed there by software testers!

Conclusion

The world of the software tester is full of information. Information flows from the application we are testing, the platform and environment the application runs in, and from the development history of the application itself. The way testers consume and leverage this information will ultimately determine how well applications are tested and subsequently the quality levels of the applications that make up the software ecosystem. The only successful future for software testers is one where we master this information and use it in the ways previously prescribed; failing to do so will mean the same low levels of quality our industry has historically achieved.

Any number of industries have successfully harnessed such a large amount of information, and we should look to those industries for inspiration. The one I think represents the best model for software testing is the video gaming industry, where the sheer amount and complexity of information is just as overwhelming as the body of information software testers must handle. However, gamers can handle this information simply and elegantly through their collection of tricks, tips, cheats, and the almighty heads-up display. It really boils down to information at the gamer’s fingertips and a shared set of strategies and guidance available to all so that new gamers can rapidly become experts through the experiences of old gamers.

In fact, the gaming world has been so successful that their processes, tools, and techniques have built incredibly engaging products that have created their own complex economies and changed society in some very fundamental ways. Surely, if they can take information and guidance that far, so can software testers!

The processes, techniques, and guiding principles of the gamer are a compelling model for the software tester to mimic. In the future, information should flow seamlessly and conveniently from the application under test, its environment, and its history of use directly to the tester in the most simple-to-consume form: a heads-up display for testers. Testers will be as equipped as gamers to navigate a complex world of inputs, outputs, data, and computation.

Testing stands to gain a substantial amount in terms of productivity, precision, and completeness. And, who knows, if testing is like playing a video game, it might just be more fun as well.

Exercises

1. Name five things you’d want on a tester’s heads-up display.

2. Name five things you'd want on an exploratory tester's heads-up display.

3. If you had to write a business plan for Testipedia, how would you argue its utility?

a. Write a summary of how Testipedia could be used as a free community asset.

b. Write a summary of how Testipedia could be used to make a profit.

4. In Chapter 3, testing in the small was broken into input, state, code paths, user data, and environment issues. Besides environment, which of these issues is the best possible candidate for using virtualization? Explain how virtualization could be used to support better testing of this issue.

5. Figure 8.2 is often called a “heat map” and shows two aspects of the application under test: size and complexity. Name two additional properties of software that would be useful to visualize in this manner. Explain how these properties might be used to guide a tester.

6. Could the infrastructure for post-release testing be misused by a hacker? Give a case that might be a problem. Can you name another concern a user might have with a post-release testing infrastructure? How might these problems be overcome?

7. Have a class/team/group discussion about whether human testers might one day become obsolete. What parts of the future of testing still require a human tester to be present? What would have to happen for software companies to no longer be required to employ large numbers of in-house testers?
