AoAD2 Practice: No Bugs

This is a pre-release excerpt of The Art of Agile Development, Second Edition, to be published by O’Reilly in 2021. Visit the Second Edition home page for information about the open development process, additional excerpts, and more.

Your feedback is appreciated! To share your thoughts, join the AoAD2 open review mailing list.

This excerpt is copyright 2007, 2020, 2021 by James Shore and Shane Warden. Although you are welcome to share this link, do not distribute or republish the content without James Shore’s express written permission.

No Bugs

Audience
Whole Team

We release with confidence.

Let’s cook up a bug pie. We’ll start with a nice, error-prone language. How about C? We’ll season it with a dash of assembly.

Next, mix in even more bugs with concurrent programming. Deadlocks and race conditions are extra tasty!

Now we need a really difficult problem domain. How about a real-time embedded system?

Take a coach interested in trying Agile, assemble a team of novices, shake well, and bake for three years. This is how it turns out:

The GMS team delivered this product after three years of development, having encountered a total of 51 defects during that time. The open bug list never had more than two items at a time. Productivity was measured at almost three times the level for comparable embedded software teams. The first field test units were delivered after approximately six months into development. After that point, the software team supported the other engineering disciplines while continuing to do software enhancements. [Van Schooenderwoert 2006]

Source: “Embedded Agile Project by the Numbers with Newbies”

These folks had everything stacked against them—except their coach and her approach to software development. If they can do it, so can you.

Is “No Bugs” Realistic?

If you’re on a team with a bug count in the hundreds or thousands, the idea of “no bugs” probably sounds ridiculous. I’ll admit: “no bugs” is an ideal. It’s a way of thinking about bugs, and a goal to strive for, not something your team will ever achieve. There will always be some bugs. (Or defects; I use “bug” and “defect” interchangeably.)

But you can get closer to the “no bugs” ideal than you might think. The team in the introduction averaged 1½ bugs per month in a very difficult domain. Over three years, they generated 51 defects and delivered 21 to their customer. According to their coach’s analysis of Capers Jones’ data, an average team would have generated 1,035 defects and delivered 207 to their customer. That’s a 95% reduction in generated defects and a 90% reduction in delivered defects.

We don’t have to rely on self-reported data. QSM Associates is a well-regarded company that performs independent audits of software development teams. In an early analysis of a company practicing a variant of Extreme Programming (XP), they reported an average reduction from 2,270 defects to 381 defects, an 83% decrease. Furthermore, the XP teams delivered 24% faster with 39% fewer staff. [Mah 2006]
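The percentages quoted above are easy to check. The following is a minimal sketch that recomputes them from the defect counts given in the text; the helper name `percent_reduction` is just an illustrative choice, and the 36-month figure assumes the three-year project ran for exactly three years.

```python
def percent_reduction(baseline: int, actual: int) -> float:
    """Return the percentage by which `actual` falls below `baseline`."""
    return (baseline - actual) / baseline * 100


# Defect counts quoted in the text above.
generated = percent_reduction(1035, 51)  # average team vs. GMS team, defects generated
delivered = percent_reduction(207, 21)   # average team vs. GMS team, defects delivered
qsm_xp = percent_reduction(2270, 381)    # QSM Associates' early XP analysis

# Assumption: "three years" is treated as exactly 36 months.
bugs_per_month = 51 / 36

print(f"generated defects: {generated:.1f}% reduction")  # 95.1%
print(f"delivered defects: {delivered:.1f}% reduction")  # 89.9%
print(f"QSM XP analysis:   {qsm_xp:.1f}% reduction")     # 83.2%
print(f"bugs per month:    {bugs_per_month:.1f}")        # 1.4
```

The results round to the 95%, 90%, and 83% figures in the text, and the monthly rate is close to the quoted 1½ bugs per month.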

More recent case studies confirmed those findings. QSM found 11% defect reduction and 58% schedule reduction on a Scrum team; 75% defect reduction and 53% schedule reduction on an XP team; and 75% defect reduction and 30% schedule reduction in a multi-team analysis of thousands of developers. [Mah 2018]

How do you achieve these results? It’s a matter of building quality in, rather than testing defects out, as “Key Idea: Build Quality In” on p.XX explains. Eliminate errors at their source rather than finding and fixing them after the fact. The following sections describe how.

Prevent Programmer Errors

Programmer errors occur when a programmer knows what to program, but makes a mistake. It could be an incorrect algorithm, a typo, or some other mistake made while translating ideas to code.

Allies
Test-Driven Development
Energized Work
Pair Programming
Mob Programming

Test-driven development is your defect-elimination workhorse. Not only does it ensure that you program what you intended to, it gives you a comprehensive regression suite you can use to detect future errors.

To enhance the benefits of test-driven development, support energized work, and use pairing or mobbing to bring multiple perspectives to bear on every line of code.

Prevent Design Errors

Design errors create breeding grounds for bugs. According to Barry Boehm, 20% of the modules in a program are typically responsible for 80% of the errors. [Boehm 1987] It’s an old statistic, but it matches my experience with modern software, too.
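If you want to see whether your own codebase follows a similar pattern, you can tally bugs by module. The sketch below is only an illustration: the bug log, the module names, and the `hotspot_modules` helper are all hypothetical, and it assumes each bug can be attributed to a single module.

```python
from collections import Counter

# Hypothetical data: one entry per bug, naming the module where it was fixed.
bug_log = ["billing", "billing", "auth", "billing", "reports", "billing",
           "auth", "billing", "billing", "export"]

# Hypothetical list of every module in the program, buggy or not.
all_modules = ["billing", "auth", "reports", "export", "search",
               "profile", "email", "admin", "cart", "audit"]


def hotspot_modules(bug_modules, modules, share=0.8):
    """Return the most bug-prone modules that together account for at least
    `share` of all bugs, plus the fraction of all modules they represent."""
    counts = Counter(bug_modules)
    total = sum(counts.values())
    covered = 0
    hotspots = []
    for module, count in counts.most_common():
        if covered >= share * total:
            break
        hotspots.append(module)
        covered += count
    return hotspots, len(hotspots) / len(modules)


hotspots, fraction = hotspot_modules(bug_log, all_modules)
print(hotspots, fraction)
```

With this made-up data, two of the ten modules account for eight of the ten bugs. Modules that show up in a list like this are good candidates for design improvement.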

Those errors aren’t necessarily the result of outright mistakes. A module will often start out well, but get crufty as the software grows.

Allies
Collective Code Ownership
Pair Programming
Mob Programming
Simple Design
Incremental Design
Reflective Design
Slack

The techniques for preventing programmer errors will also benefit the design, but additional practices also help. Collective code ownership, evolutionary design—consisting of simple design, incremental design, and reflective design—and slack are your main tools for preventing design errors. Collective ownership and the design practices help keep your design clean, and slack gives you time to do so.

Prevent Requirements Errors

Requirements errors occur when a programmer creates code that does exactly what they intended it to do, but their intention was wrong. Perhaps they misunderstood what they were supposed to do, or perhaps nobody knew what needed to be done. Either way, the code works, but it doesn’t do the right thing.

Allies
Whole Team
Purpose
Context
Team Room
Ubiquitous Language
Customer Examples
Incremental Requirements
Stakeholder Demos
Stories
Done Done

A cross-functional, whole team is essential for preventing requirements errors. Your team needs to include people with the skills to understand and generate the software’s requirements. Sometimes they make up requirements out of whole cloth, but more often they supplement their work with lots of stakeholder feedback. Understanding the team’s purpose and context is vital to this process.

A shared team room is also important. When programmers have a question about requirements, they need to be able to turn their head and ask. Use a ubiquitous language to help programmers and on-site customers understand each other, and supplement your conversations with customer examples.

Confirm that the software does what it needs to do with frequent customer reviews and stakeholder demos. Perform those reviews incrementally, as soon as programmers have something to show, so misunderstandings and refinements are discovered early, in time to be corrected. Use stories to focus the team on customers’ perspective.

Finally, don’t consider a story “done done” until on-site customers agree it’s done.

Prevent Systemic Errors

If everyone does their job perfectly, these practices yield software with no defects. Unfortunately, perfection is impossible. Your team is sure to have blind spots: subtle areas where they make mistakes, but they don’t know it. These blind spots lead to repeated, systemic errors. They’re “systemic” because they’re a consequence of your entire development system: your team, its process, the tools you use, the environment you work in, and more.

Your development system is different from your software system. Your software system is the software you’re building. Your development system is the way you build it, including everything from your tools to your organizational structure.

Escaped defects are a clear signal of problems in paradise. Although defects are inevitable—TDD alone has programmers correcting mistakes every few minutes—most of them are short-lived. Defects found by end-users have “escaped.” Every escaped defect indicates a need to improve your development system.

Ally
Blind Spot Discovery

Of course, you don’t want your end-users to be your beta testers. (Not usually, anyway.) That’s where blind spot discovery comes in. It’s a variety of techniques, such as chaos engineering and exploratory testing, for finding flaws in your approach. I discuss them in the next practice.

Some teams use these techniques to check the quality of their software system: they’ll code a story, search for bugs, fix them, and repeat.

But to build quality in, treat your blind spots as a clue about how to improve your development system, not just your software system. The same goes for escaped defects. They’re all clues about what to improve.

Allies
Incident Analysis
Done Done

Incident analysis helps you decipher those clues. It’s not just for production incidents; it’s for all bugs. What’s a bug? Anything your team considered “done done” which, in retrospect, was not actually done. This applies to well-meaning mistakes, too: if everybody thinks a particular new feature is a great idea, and it turns out to enrage your customers, it deserves just as much analysis as a production outage.

How to Fix a Bug

When you find a bug, start by resolving the immediate problem. If it’s the result of a recent deployment, roll back the deployment and get your code back to a known-good state. (See “Resolving Deployment Failures” on p.XX.)

Ally
Test-Driven Development

Next, write an automated test that demonstrates the bug. It could be a unit test or narrow integration test, depending on the type of defect you’ve found. In some cases, you’ll need to start with a broad integration test. Once you’ve narrowed down the source of the problem, replace the broad test with a narrow test to keep your tests fast and reliable.

Once you have a good test, fix the bug. Get a green bar.
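As a rough illustration of the test-then-fix sequence described above, here is a minimal sketch using Python’s `unittest`. The `average` function and its bug are hypothetical: assume the original version divided by the list length unconditionally and crashed on an empty list.

```python
import unittest


def average(values):
    """Return the arithmetic mean of `values`, or 0.0 for an empty list.

    Hypothetical bug: the original version had no empty-list check and
    raised ZeroDivisionError when `values` was empty.
    """
    if not values:  # the fix, added only after the new test failed
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_list_returns_zero(self):
        # Step 1: written while the bug was still present, and seen to fail.
        self.assertEqual(average([]), 0.0)

    def test_existing_behavior_unchanged(self):
        # Guards against the fix breaking the normal case.
        self.assertEqual(average([2, 4, 6]), 4.0)


# Run the tests programmatically so the example works in any script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageRegressionTest)
result = unittest.TextTestRunner().run(suite)
```

The new test stays in the suite after the fix, so the same bug can’t quietly return.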

Ally
Incident Analysis

Don’t congratulate yourself yet; you’ve fixed the bug, but the underlying problem still exists. If you don’t change your approach, something similar will just happen again. Look at the design of your code. Can it be improved? What about your development system? Incident analysis, whether performed in a formal session or informally in your own thoughts, will help you decide what to do.

Fix Bugs Immediately

Each defect is the result of a flaw in your code, and probably in your design and development system, that’s likely to breed more mistakes. Improve quality and productivity by fixing defects right away.

Allies
Collective Code Ownership
Team Room

Fixing bugs quickly requires the whole team to participate. Programmers, use collective code ownership so anyone can fix a buggy module. Customers and testers, personally bring each new bug to the attention of a programmer and help them reproduce it. These actions are easiest when the team shares a team room.

In practice, it’s not possible to fix every bug right away. You may be in the middle of something else when you learn about a bug. When this happens to me, I ask my navigator to write the problem on our to-do list. We come back to it 10 to 20 minutes later, when we reach a good stopping point.

Ally
Slack

Some bugs are too big to fix while you’re in the middle of another task. For these, I write the bug on a story card and announce it to the team. We collectively decide if we have enough slack to fix the bug and still meet our other commitments. If we do, we create tasks for the story and people volunteer for them as normal. (Sometimes your only task will be “fix the bug.” I use the story card as its own task card when this happens.)

If there isn’t enough slack to fix the bug, decide as a team whether it’s important enough to fix for your next release. If it is, schedule it as your first priority for the next iteration (or next story slot, if you’re using continuous flow). If it isn’t, add it to your visual plan in the appropriate release. If it’s never going to be important enough to fix, discard it.
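The decisions in this section can be summarized as a small decision function. This is only a sketch of the flow described above. The `Action` names and the yes/no inputs are illustrative assumptions, and in practice the team makes these judgments in conversation.

```python
from enum import Enum


class Action(Enum):
    FIX_NOW = "fix it right away"
    TODO_LIST = "put it on the to-do list and fix it at the next stopping point"
    STORY_WITH_SLACK = "write a story card and fix it using the available slack"
    NEXT_ITERATION = "schedule it first in the next iteration or story slot"
    LATER_RELEASE = "add it to the visual plan in the appropriate release"
    DISCARD = "discard it"


def triage(mid_task: bool, too_big: bool, enough_slack: bool,
           needed_next_release: bool, ever_worth_fixing: bool) -> Action:
    """Map the team's answers to the action described in the text."""
    if not too_big:
        # Small bugs get fixed now, or at the next good stopping point.
        return Action.TODO_LIST if mid_task else Action.FIX_NOW
    if enough_slack:
        return Action.STORY_WITH_SLACK
    if needed_next_release:
        return Action.NEXT_ITERATION
    if ever_worth_fixing:
        return Action.LATER_RELEASE
    return Action.DISCARD


# Example: a large bug, no slack left, but important for the next release.
decision = triage(mid_task=True, too_big=True, enough_slack=False,
                  needed_next_release=True, ever_worth_fixing=True)
print(decision.value)
```

Every path ends in a decision, so no bug is left sitting in an untracked pile.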

Testers’ Role

Because fluent Delivering teams build quality in, rather than testing defects out, people with testing skills “shift left.” Instead of focusing their skills on the completed product, they focus on helping the team build a quality product from the beginning.

In my experience, some testers are business-oriented: they’re very interested in getting business requirements right. They work with customers to uncover all the nit-picky details customers would otherwise miss. They’ll often prompt customers to think about edge cases during requirements discussions.

Allies
Stories
Blind Spot Discovery

Other testers are more technically oriented. They’re interested in test automation and nonfunctional requirements. These testers act as technical investigators for the team. They create the testbeds that look at issues such as scalability, reliability, and performance. They review logs to understand how the software system works in production. Through these efforts, they help the team understand the behavior of their software and decide when to devote more effort to operations, security, and “nonfunctional” stories.

Testers also help the team identify blind spots. Although anybody on the team can use blind spot discovery techniques, people with testing skills tend to be particularly good at it.

‘Tude

I encourage an attitude among my teams... a bit of eliteness, even snobbiness: “Bugs don’t happen here.”

If you do everything I’ve described, bugs should be a rarity. Your next step is to treat them that way. Rather than shrugging your shoulders when a bug occurs—“Oh yeah, another bug, that’s what happens in software”—be shocked and dismayed. Bugs aren’t something to be tolerated; they’re a sign of underlying problems to be solved.

Allies
Pair Programming
Mob Programming
Team Room
Collective Code Ownership

Ultimately, “no bugs” is about establishing a culture of excellence. When you learn about a bug, fix it right away, then figure out how to prevent that type of bug from happening again.

You won’t be able to get there overnight. All the practices I’ve described take discipline and rigor. They’re not necessarily difficult, but they break down if people are sloppy or don’t care about their work. A culture of “no bugs” helps the team maintain the discipline required, as do pairing or mobbing, a team room, and collective ownership.

You’ll get there eventually. Agile teams can and do achieve nearly zero bugs. You can too.

Questions

How do we prevent security defects and other challenging bugs?

Ally
Blind Spot Discovery

Threat analysis (see “Threat Analysis” on p.XX) can help you think of security flaws in advance. That said, you can only prevent bugs you think to prevent. Security, concurrency, and other difficult problem domains may introduce defects you never considered. That’s why blind spot discovery is also important.

How should we track our bugs?

Allies
Stories
Incremental Requirements

You shouldn’t need a bug database or issue tracker for new bugs, assuming your team isn’t generating a lot of bugs. (If they are, focus on solving that problem first.) If a bug is too big to fix right away, turn it into a story, and track its details in the same way you handle other requirements details.

How long should we work on a bug before we turn it into a story?

Ally
Slack

It depends on how much slack you have. Early in an iteration, when there’s still a lot of slack, I might spend half a day on a defect before turning it into a story. Later, when there’s less slack, I might only spend ten minutes on it.

We have a lot of legacy code. How can we adopt a “no bugs” policy without going mad?

It will take time. Start by going through your bug database and identifying the ones you want to fix in the current release. Schedule at least one to be fixed every week, with a bias towards fixing them sooner rather than later.

Ally
Incident Analysis

Every week or two, randomly choose one of the bugs from that period to get a full incident analysis. This will allow you to gradually improve your development system and prevent bugs in the future.

Prerequisites

Ally
Management

“No Bugs” is about a culture of excellence. It can only come from within the team. Managers, don’t ask your teams to report defect counts, and don’t reward or punish them based on the number of defects they have. You’ll just drive the bugs underground, and that will make quality worse, not better. I discuss this further in “Accountability” on p.XX.

This practice depends on a huge number of Agile practices—essentially, every Focusing and Delivering practice in this book. Until your team reaches fluency in those practices, don’t expect dramatic reductions in defects.

Conversely, if you have made the investments described in the “Invest in Agility” chapter and you’re using the Focusing and Delivering practices in this book, more than a few new bugs per month may indicate a problem with your approach. You’ll need time to learn the practices and refine your process, of course, but if you don’t see an improvement in your bug rates within a few months, ask a mentor for help.

Indicators

When your team has a culture of “no bugs”:

  • Your team is confident in the quality of their software.

  • You’re comfortable releasing to production without a manual testing phase.

  • Stakeholders, customers, and users rarely encounter unpleasant surprises.

  • Your team spends their time producing great software instead of fighting fires.

Alternatives and Experiments

One of the revolutionary ideas of Agile is that low-defect software can be cheaper to produce than high-defect software. This is made possible by building quality in. To experiment further, look at the parts of your process that check quality at the end, and think of ways to build that quality in from the beginning.

You can also reduce bugs by using more and higher-quality testing, including inspection and static code analysis, to find and fix a higher percentage of bugs. However, this tends to be slow. You’ll need time to inspect code, test manually, and review static code analysis reports. It will slow you down and make releases more difficult.
