Exploratory Testing

Audience: Testers

By Elisabeth Hendrickson

We discover surprises and untested conditions.

XP teams have no separate QA department. There’s no independent group of people responsible for assessing and ensuring the quality of the final release. Instead, the whole team—customers, programmers, and testers—is responsible for the outcome. On traditional teams, the QA group is often rewarded for finding bugs. On XP teams, there’s no incentive program for finding or removing bugs. The goal in XP isn’t to find and remove bugs; the goal is not to write bugs in the first place. In a well-functioning team, bugs are a rarity—only a handful per month.

Does that mean testers have no place on an XP team? No! Good testers have the ability to look at software from a new perspective, to find surprises, gaps, and holes. It takes time for the team to learn which mistakes to avoid. By providing essential information about what the team overlooks, testers enable the team to improve their work habits and achieve their goal of producing zero bugs.

Note

Beware of misinterpreting the testers’ role as one of process improvement and enforcement. Don’t set up quality gates or monitor adherence to the process. The whole team is responsible for process improvement. Testers are respected peers in this process, providing information that allows the whole team to benefit.

One particularly effective way of finding surprises, gaps, and holes is exploratory testing: a style of testing in which you learn about the software while simultaneously designing and executing tests, using feedback from the previous test to inform the next. Exploratory testing enables you to discover emergent behavior, unexpected side effects, holes in the implementation (and thus in the automated test coverage), and risks related to quality attributes that cross story boundaries such as security, performance, and reliability. It’s the perfect complement to XP’s raft of automated testing techniques.

Exploratory testing predates XP. [Kaner] coined the term in the book Testing Computer Software, although the practice of exploratory testing certainly preceded the book, probably by decades. Since the book came out, people such as Cem Kaner, James and Jonathan Bach, James Lyndsay, Jonathan Kohl, and Elisabeth Hendrickson have extended the concept of exploratory testing into a discipline.

About Exploratory Testing

Philosophically, exploratory testing is similar to test-driven development and incremental design: rather than designing a huge suite of tests up-front, you design a single test in your head, execute it against the software, and see what happens. The result of each test leads you to design the next, allowing you to pursue directions that you wouldn’t have anticipated if you had attempted to design all the tests up-front. Each test is a little experiment that investigates the capabilities and limitations of the emerging software.

Exploratory testing can be done manually or with the assistance of automation. Its defining characteristic is not how we drive the software but rather the tight feedback loop between test design, test execution, and results interpretation.

Exploratory testing works best when the software is ready to be explored—that is, when stories are “done done.” You don’t have to test stories in the same iteration in which they’re finished. It’s nice to have that sort of rapid feedback, but some stories won’t be “done done” until the last day of the iteration. That’s OK—remember, you’re not using exploratory testing to guarantee quality; you’re using it to provide information about how well the team’s process guarantees quality.

Note

This switch in mindset takes a little while to get used to. Your exploratory testing is not a means of evaluating the software through exhaustive testing. Instead, you’re acting as a technical investigator—checking weak points to help the team discover ways to prevent bugs. You can also use exploratory testing to provide the team with other useful data, such as information about the software’s performance characteristics.

Exploratory testers use the following four tools to explore the software.

Tool #1: Charters

Some people claim that exploratory testing is simply haphazard poking at the software under test. This is no more true than the idea that American explorers Lewis and Clark mapped the Northwest by haphazardly tromping about in the woods. Before they set out into the wilderness, Lewis and Clark knew where they were going and why. President Thomas Jefferson had given them a charter:[55]

The Object of your mission is to explore the Missouri river & such principal stream of it as by [its] course and communication with the waters of the Pacific ocean, whether the Columbia, Oregon, Colorado or any other river may offer the most direct & practicable water communication across this continent for the purpose of commerce.

Similarly, before beginning an exploratory session, a tester should have some idea of what to explore in the system and what kinds of things to look for. This charter helps keep the session focused.

Note

An exploratory session typically lasts one to two hours.

The charter for a given exploratory session might come from a just-completed story (e.g., “Explore the Coupon feature”). It might relate to the interaction among a collection of stories (e.g., “Explore interactions between the Coupon feature and the Bulk Discount feature”). It might involve a quality attribute, such as stability, reliability, or performance (“Look for evidence that the Coupon feature impacts performance”). Generate charters by working with the team to understand what information would be the most useful to move the project forward.

Charters are the testing equivalent of a story. Like stories, they work best when written down. They may be as informal as a line on a whiteboard or a card, but a written charter provides a touchstone the tester can refer to in order to ensure she’s still on track.

Tool #2: Observation

Automated tests verify only the behavior that programmers write them to verify, but humans can notice subtle clues that suggest all is not right. Exploratory testers are continuously alert for anything out of the ordinary: an editable form field that should be read-only, a hard drive that spins up when the software shouldn’t be doing any disk access, or a value in a report that looks out of place.

Such observations lead exploratory testers to ask more questions, run more tests, and explore in new directions. For example, a tester may notice that a web-based system uses parameters in the URL. “Aha!” thinks the tester. “I can easily edit the URL.” Where the URL says http://stage.example.com/edit?id=42, the tester substitutes http://stage.example.com/edit?id=9999999.
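
A tester who spots an observation like this can follow up with a quick throwaway script. Here’s a minimal sketch in Python; the staging URL echoes the example above, and the probe values are illustrative assumptions, not a definitive list:

    import requests

    BASE_URL = "http://stage.example.com/edit"

    # Valid, boundary, and hostile IDs -- extend as your observations suggest.
    probes = [42, 0, -1, 9999999, "abc"]

    for probe in probes:
        response = requests.get(BASE_URL, params={"id": probe}, timeout=5)
        # Anything surprising -- a 500, a stack trace, another user's
        # record -- deserves a note and a follow-up test.
        print(probe, response.status_code, len(response.text))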

Tool #3: Notetaking

While exploring, it’s all too easy to forget where you’ve been and where you’re going. Exploratory testers keep a notepad beside them as they explore and periodically take notes on the actions they take. You can also use screen recorders such as Camtasia to keep track of what you do. After all, it’s quite frustrating to discover a bug only to find that you have no idea what you did to cause the software to react that way. Notes and recordings tell you what you were doing not just at the time you encountered surprising behavior but also in the minutes or hours before.

Be especially careful to keep notes about anything that deserves further investigation. If you can’t follow a promising path of testing in the current session, note it so you remember to explore that area in more detail later. Each of these notes is an opportunity for further exploratory testing.

Tool #4: Heuristics

Remember that exploratory testing involves simultaneously designing and executing tests. Some test design techniques are well known, such as boundary testing. If a field is supposed to accept numbers from 0 to 100, you’ll probably try valid values such as 0, 100, and something in the middle, and invalid values such as –1 and 101. Even if you had never heard the term boundary testing, you’d probably consider trying such tests.
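
To see how a boundary heuristic translates into concrete checks, here’s a minimal sketch using Python and pytest. The validate_percentage function is a hypothetical stand-in for whatever code actually handles the field:

    import pytest

    def validate_percentage(value):
        # Hypothetical stand-in for the code under test: accepts 0-100 inclusive.
        return 0 <= value <= 100

    @pytest.mark.parametrize("value, expected", [
        (-1, False),    # just below the lower boundary
        (0, True),      # the lower boundary itself
        (50, True),     # something in the middle
        (100, True),    # the upper boundary itself
        (101, False),   # just above the upper boundary
    ])
    def test_percentage_boundaries(value, expected):
        assert validate_percentage(value) == expected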

A heuristic is a guide: a technique that aids in your explorations. Boundary testing is an example of a test heuristic. Experienced testers use a variety of heuristics, gleaned from an understanding of how software systems work as well as from experience and intuition about what causes software to break. You can improve your exploratory efforts by creating and maintaining a catalog of heuristics that are worth remembering when exploring your software. Of course, that same catalog can also be a welcome reference for the programmers as they implement the software.

Some of your heuristics will be specific to your domain. For example, if you work with networking software, you inevitably work with IP addresses, so you probably test how your software handles invalid IP addresses (“999.999.999.999”), special IP addresses (“127.0.0.1”), and IPv6-style addresses (“::1/128”). Other heuristics apply to nearly any software project. The following are a few to get you started.

None, Some, All

For example, if your users can have permissions, try creating users with no permissions, some permissions, and all permissions. One system I tested treated users with no permissions as administrators: granting a user no permissions resulted in that user having root access.
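
As a sketch of how this heuristic might look as a check, here’s a small test against a hypothetical permission model; the User class and the permission names are assumptions for illustration only:

    ALL_PERMISSIONS = {"read", "write", "delete", "admin"}

    class User:
        """Hypothetical stand-in for the system's user model."""
        def __init__(self, permissions):
            self.permissions = set(permissions)

        def can(self, action):
            return action in self.permissions

    def test_none_some_all_permissions():
        nobody = User(set())              # none
        reader = User({"read"})           # some
        root = User(ALL_PERMISSIONS)      # all
        # Guards against the bug described above: "none" behaving like "all".
        assert not any(nobody.can(a) for a in ALL_PERMISSIONS)
        assert reader.can("read") and not reader.can("delete")
        assert all(root.can(a) for a in ALL_PERMISSIONS)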

Goldilocks: too big, too small, just right

Boundary tests on a numeric field are one example of Goldilocks tests. Another example might be uploading an image file: try uploading a 3 MB picture (too big), a file with nothing in it (too small), and a file of comfortable size (50 KB).
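
A sketch of the upload example, again with pytest; upload_image and the 2 MB limit are illustrative assumptions, not a real API:

    import pytest

    MAX_UPLOAD_BYTES = 2 * 1024 * 1024  # assumed limit, for illustration

    def upload_image(data):
        # Hypothetical stand-in for the upload handler under test.
        if not (0 < len(data) <= MAX_UPLOAD_BYTES):
            raise ValueError("bad file size")

    @pytest.mark.parametrize("size, should_pass", [
        (3 * 1024 * 1024, False),   # too big
        (0, False),                 # too small
        (50 * 1024, True),          # just right
    ])
    def test_goldilocks_upload(size, should_pass):
        data = b"\x00" * size
        if should_pass:
            upload_image(data)
        else:
            with pytest.raises(ValueError):
                upload_image(data)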

Position: beginning, middle, end

Position can apply to an edit control: edit at the beginning of a line, middle of a line, end of a line. It can apply to the location of data in a file being parsed. It can apply to an item selected from a list, or where you’re inserting an item into a list.

Count: zero, one, many

For example, you could create invoices with zero line items, one line item, and many line items. Or you might perform a search for zero items, one item, or many pages of items. Many systems have a small typographical error related to plurals—that is, they report “1 items found” instead of “1 item found.”

Similarly, count can apply in numerous situations: a count of dependent data records (a customer with zero addresses, one address, many addresses, or a group with zero members, one member, many members), a count of events (zero transactions, one transaction, many simultaneous transactions), or anything else that can be counted.
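
The pluralization bug mentioned above makes a tidy zero/one/many check. Here’s a sketch; format_result_count is a hypothetical stand-in for the report formatter:

    import pytest

    def format_result_count(count):
        # Hypothetical stand-in for the code that renders search results.
        noun = "item" if count == 1 else "items"
        return f"{count} {noun} found"

    @pytest.mark.parametrize("count, expected", [
        (0, "0 items found"),
        (1, "1 item found"),        # the classic "1 items found" bug
        (250, "250 items found"),
    ])
    def test_zero_one_many(count, expected):
        assert format_result_count(count) == expected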

CRUD: create, read, update, delete

For each type of entity, and for each data field, try to create it, read it, update it, and delete it. Now try to CRUD it while violating system constraints, such as permissions. Try CRUD in combination with the Goldilocks, Position, Count, and Select heuristics. For example, delete the last line item on an invoice, then read it; update the first line item; delete a customer with zero, one, or many invoices; and so on.
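
Here’s a minimal sketch of CRUD combined with the Position heuristic, against a hypothetical in-memory invoice; every name in it is an assumption for illustration:

    class Invoice:
        """Hypothetical stand-in for the system's invoice model."""
        def __init__(self):
            self.line_items = []

        def add(self, item):
            self.line_items.append(item)

        def update(self, index, item):
            self.line_items[index] = item

        def delete(self, index):
            del self.line_items[index]

    def test_crud_with_position():
        invoice = Invoice()                       # create
        invoice.add("widget")
        invoice.add("gadget")
        invoice.update(0, "gizmo")                # update the first line item
        invoice.delete(-1)                        # delete the last line item
        assert invoice.line_items == ["gizmo"]    # read it back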

Command Injection

Wherever your software accepts text from an external source (such as a UI or a web services interface), ensure it never interprets the incoming text as a command—whether SQL, JavaScript, or a shell command. For example, a single quote in a text field will sometimes raise SQL exceptions; entering the word tester's will cause some applications to respond with an SQL error.
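
To probe for this, throw classic injection strings at every text input, by hand or by script. A sketch, assuming a hypothetical search form at a staging URL:

    import requests

    INJECTION_PROBES = [
        "tester's",                     # a lone apostrophe trips naive SQL
        "'; DROP TABLE users; --",      # classic SQL injection
        "<script>alert(1)</script>",    # script injection
        "$(rm -rf /)",                  # shell metacharacters
    ]

    for probe in INJECTION_PROBES:
        response = requests.post("http://stage.example.com/search",
                                 data={"q": probe}, timeout=5)
        # A 500, a raw SQL error message, or an echoed <script> tag
        # all deserve a note and further exploration.
        print(repr(probe), response.status_code)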

Data Type Attacks

For each type of input, try values that push common boundaries or that violate integrity constraints. Some examples: in date fields, try February 29 and 30. Also try dates more than 100 years ago. Try invalid times, like 13:75. For numbers, try the huge (4,294,967,297 = 2^32 + 1) and the tiny (0.0000000000000001). Try scientific notation (1E-16). Try negative numbers, particularly where negative numbers should not be allowed, as with purchase price. For strings, try entering long strings (more than 5,000 characters), leaving fields blank, and entering a single space. Try various characters including common field delimiters (` | / , ; : & < > ^ * ? Tab). The list could go on, but you get the idea.
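
You might keep a catalog like this one alongside your heuristics notes and feed it into whatever input mechanism the system exposes. The values below simply transcribe the list above; the date 2023-02-29 assumes a non-leap year:

    ATTACK_VALUES = {
        "date": ["2023-02-29", "2023-02-30",     # impossible dates
                 "1899-12-31"],                  # more than 100 years ago
        "time": ["13:75"],                       # invalid time
        "number": ["4294967297",                 # 2^32 + 1
                   "0.0000000000000001",         # tiny
                   "1E-16",                      # scientific notation
                   "-19.99"],                    # negative purchase price
        "string": ["x" * 5001,                   # long string
                   "",                           # blank
                   " ",                          # single space
                   "` | / , ; : & < > ^ * ?",    # field delimiters
                   "\t"],                        # tab
    }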

An Example

“Let’s decide on a charter for this session,” Jill said, settling into the seat next to Michael. Jill was one of the team’s testers; Michael was a programmer. “What should we focus on?”

“You know us,” Michael replied. “Our code doesn’t have any bugs!” He grinned, knowing what was coming.

“Oooh, you’ll pay for that one.” Jill smiled. “Although I have to say our quality is better than any other team I’ve worked with. How many bugs did we have last month?”

“It’s been a bit high, actually.” Michael counted on his fingers. “There was that installer issue... and the networking problem....” He paused. “Four.”

“Let’s see how that holds up. What’s new in the code?”

“Some fairly routine stuff,” Michael said. “One thing that’s brand-new is our support for multilingual input. But there’s nothing to find there—we tested it thoroughly.”

“I love a challenge!” Jill said with a wicked grin. “Besides, character-set issues are a rich source of bugs. Let’s take a look at that.”

“Sure.” Michael wrote a card for the session: Explore Internationalization. He clipped the card to the monitor so it would remind them to keep on track.

Jill took the keyboard first. “Let’s start with the basics. That will help us know what’s going wrong if we see anything strange later on.” As she navigated to a data entry screen, Michael wrote the date and time at the top of a new page in his notepad and prepared to take notes.

Jill opened up a character viewer and pasted together a string of gibberish: German, Hebrew, Greek, Japanese, and Arabic characters. “Let’s use the CRUD heuristic to make sure this stuff gets into the database and comes back properly.” Moving quickly, she saved the text, opened up a database viewer to make sure it was present in the database, then closed the application and reloaded it. Everything was intact. Next, she edited the string, saved it again, and repeated her check. Finally, she deleted the entry. “Looks good.”

“Wait a minute,” Michael said. “I just had an idea. We’re not supposed to allow the user to save blank strings. What if we use a Unicode space rather than a regular space?”

“I’ll try it.” Jill went back to work. Everything she tried was successfully blocked. “It all looks good so far, but I have a special trick.” She grinned as she typed U+FEFF into her character viewer. “This character used to mean ‘zero-width no-break space.’ Now it’s just a byte-order mark. Either way, it’s the nastiest character I can throw at your input routines.” She pressed Save.

Nothing happened. “Score one for the good guys,” Jill murmured. “The data input widgets look fairly solid. I know from past experience that you’ve abstracted those widgets so that if one is working, they’re probably all working.” Michael nodded, and she added, “I might want to double-check a few of those later, but I think our time is better spent elsewhere.”

“OK, what should we look at next?” Michael asked.

“A few things come to mind for us to check: data comparison, sorting, and translation to other formats. Unicode diacritical marks can be encoded in several ways, so two strings that are technically identical might not be encoded using the same bytes. Sorting is problematic because Unicode characters aren’t sorted in the same order that they’re represented in binary...”

“...and format translation is nasty because of all the different code page and character-set issues out there,” Michael finished. He wrote the three ideas—data comparison, sorting, and translation—on his notepad.

“You said it,” Jill agreed. “How about starting with the nasty one?”

“Sounds good.” Michael reached for the keyboard. “My turn to type,” he said, smiling. He handed Jill the notepad so she could continue the notetaking. “Now, where could the code page issues be lurking...?”
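
The data-comparison pitfall Jill mentions is easy to demonstrate. This snippet uses Python’s standard unicodedata module; it isn’t part of the session above, just an illustration of the underlying issue:

    import unicodedata

    composed = "\u00e9"        # "é" as one precomposed character
    decomposed = "e\u0301"     # "e" plus a combining acute accent

    print(composed == decomposed)                    # False: different bytes
    print(unicodedata.normalize("NFC", composed) ==
          unicodedata.normalize("NFC", decomposed))  # True after normalization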

When You Find Bugs


Exploratory testing provides feedback about the software and also about the effectiveness of the team’s process. When it reveals a bug, it indicates that the team may not be working as effectively as it could. To remedy the problem, fix both your software and your process, as described in “No Bugs” in Chapter 7.

If you’ve already used the feedback from exploratory testing to improve both the software and the process, but you’re consistently finding a lot of bugs, it means the process is still broken. Don’t give up: look for root causes and keep plugging away. It’s often a simple case of trying to do too much in too little time.

When the bugs are rampant, you may be tempted to add a QA department to catch the bugs. This may provide a temporary fix, but it’s the first step down a slippery slope. Here is an example:

Imagine a team that has been struggling with velocity because stories just never seem to be “done done,” as is often the case when customers find bugs in “done” stories. The programmers are frustrated that they can’t seem to prevent or detect the problems that result in their customers rejecting stories.

“I give up,” says Wilma to her partner, Betty. “This story is taking too long, and there are too many other stories in the queue. Let’s ask Jeff to test this and tell us where all the problems are. He’s good at finding bugs.”

In fact, Jeff is great at finding bugs. The next morning he delivers a stack of sticky notes to Wilma and Betty that detail a bunch of bugs. They fix the bugs and deliver the story. The customer accepts it, smiling for the first time in weeks. “That was great!” says Wilma. “We finally finished something!”

On the next story, Betty turns to Wilma and says, “You know, Jeff was so good at finding bugs in the last story....”

“Yeah,” Wilma agrees. “Even though there are some more error conditions we could think through, let’s see what Jeff has to say.”


The pattern continues. The more that programmers rely on testers to find bugs for them, the fewer bugs they find themselves. Testers find more and more bugs, but in the end, quality gets worse.[56] The key to having no bugs is not to get better testers, but for the team to take responsibility for producing bug-free software—before testers try it. Instead of relying on testers to get the bugs out, use the information that exploratory testing provides to improve your process.

Questions

Should testers pair with other members of the team?

This is up to your team. Pairing with programmers and customers helps break down some of the natural barriers between testers and other team members, and it helps information flow better throughout the team. On the other hand, programmers may not always have time to pair with testers. Find a good balance.

Won’t the burden of exploratory testing keep getting bigger over the course of the project?

It shouldn’t. Sometimes teams use exploratory testing as a form of manual regression testing; with each iteration, they explore the new features, and the existing features, and the interactions, and so on. They put so much on the list of things to explore that the time needed for exploratory testing during the iteration becomes unmanageable.

The flaw in this approach is using exploratory testing as a means of regression testing. Use test-driven development to create a comprehensive, automated regression test suite. Focus your exploratory testing on new features (and their interactions with existing features), particularly those features that do things differently from previous features.

Just as you timebox your releases and work on just the most important features, you can timebox your explorations and test just the most important charters.

How can we get better at exploratory testing?

Exploratory testing involves a set of skills that can be learned. To improve, you should:

Practice

You can test more than the software your team is developing. Whenever you use software for your own purposes, like word processing or email, try testing it. Also consider applying your testing skills to open source projects.

Get feedback

Find out what surprises other people discovered, and ask yourself why you didn’t see them. Sometimes it’s because there were more test conditions you might have tried, so you can add new tricks to your heuristics toolbox. Other times it’s because you didn’t understand the significance of what you were seeing; you saw the same behavior but didn’t recognize it as a bug.

Share tips and techniques

You can share ideas within your team, and you can reach out to the broader community. Online discussion forums are a good place to start. Other options include round table–style meetings. For example, James Bach hosts WHET, the Workshop on Heuristics and Exploratory Techniques, and James Lyndsay hosts LEWT, the London Exploratory Workshop in Testing. Both are gathering places where testers share stories and experiences.

Results

When you use exploratory testing, you discover information about both the software and the process used to create that software. You sometimes discover missing test cases, incomplete or incorrect understanding of the story, and conversations that should have happened but didn’t. Each surprise gives you an opportunity to improve both the software and your development practices. As a team, you use this information to improve your process and reduce the number of bugs found in the future.

Contraindications

Don’t attempt to use exploratory testing as a regression testing strategy. Regression tests should be automated.

Only do exploratory testing when it is likely to uncover new information and you are in a position to act on that information. If, for example, there is already a list of known bugs for a given story, additional exploratory testing will waste time rediscovering known issues. Fix the bugs first.

Alternatives

You can use exploratory testing as a mechanism for bug hunting when working with software, particularly legacy software, that you suspect to be buggy. However, beware of relying on exploratory testing or any other testing approach to ensure all the bugs are caught. They won’t be.

Some teams don’t do any manual or end-to-end testing with XP. These teams are using another mechanism to confirm that they don’t produce bugs—presumably, they’re relying on user feedback. This is OK if you actually don’t produce bugs, which is certainly possible, but it’s better to confirm that before giving software to users. Still, you might get away with it if the software isn’t mission-critical and you have forgiving users.

Other teams use testers in a more traditional role, relying on them to find bugs rather than committing to deliver bug-free code. In my experience, these teams have lower quality and higher bug rates, probably as a result of the “Better Testing, Worse Quality” dynamic.

Further Reading

“General Functionality and Stability Test Procedure for Microsoft Windows Logo, Desktop Applications Edition” [Bach 1999] is the first widely published reference on how to do exploratory testing. The WinLogo certification assures customers that software behaves itself on Windows. But how do you assess the capabilities and limitations of an arbitrary desktop application in a consistent, systematic way? Microsoft turned to James Bach, already well known for his work on exploratory testing. The resulting test procedure was made available to independent software vendors (ISVs) and used by certifiers such as Veritest. It’s online at http://www.testingcraft.com/bach-exploratory-procedure.pdf.

“Session-Based Test Management” [Bach 2000] is the first published article that explains in detail how to use session-based test management, which James Bach and his brother Jonathan devised, and which I discussed as Charters and Sessions earlier in this chapter. One of the challenges of exploratory testing is how to manage the process: stay focused, track progress, and ensure the effort continues to yield value. This is the solution. The article is online at http://www.satisfice.com/articles/sbtm.pdf.

“Did I Remember To” [Hunter] is a great list of heuristics, framed as things to remember to test. The list is particularly great for Windows applications, but it includes ideas that are applicable to other technologies as well. It’s online at http://blogs.msdn.com/micahel/articles/175571.aspx.

“Rigorous Exploratory Testing” [Hendrickson] sets out to debunk the myth that exploratory testing is just banging on keyboards, and discusses how exploratory testing can be rigorous without being formal. The article is online at http://www.testobsessed.com/2006/04/19/rigorous-exploratory-testing/.

“User Profiles and Exploratory Testing” [Kohl 2005a] is a nice reminder that different users experience software in different ways, and it provides guidance on characterizing user behavior to support exploratory testing. It’s online at http://www.kohl.ca/blog/archives/000104.html.

“Exploratory Testing on Agile Teams” [Kohl 2005b] presents a case study of using automation-assisted exploratory testing to isolate a defect on an Agile project—which challenges the belief that exploratory testing is a purely manual activity. It complements this chapter nicely. Read it online at http://www.informit.com/articles/article.asp?p=405514&rl=1.

“A Survey of Exploratory Testing” [Marick] provides a nice collection of published work related to exploratory testing, including [Bach 1999]’s “General Functionality and Stability Test Procedure for Microsoft Windows Logo.” Online at http://www.testingcraft.com/exploratory.html.

“Exploring Exploratory Testing” [Tinkham & Kaner] is an in-depth discussion of how exploratory testers do what they do, including strategies such as questioning and using heuristics. Based on work funded in part by an NSF grant, this paper elaborates on exploratory testing practices missing from the then-current draft of the IEEE’s Software Engineering Body of Knowledge (SWEBOK). Available online at http://www.testingeducation.org/a/explore.pdf.



[56] I wrote about this effect in “Better Testing, Worse Quality,” at http://testobsessed.com/wordpress/wp-content/uploads/2006/12/btwq.pdf. Page past the end of the slideshow to find it.
