AoA Correction: Test-Driven Development

The book went to press a few days ago, which means it will be on store shelves in the first week of November. Those of you who ordered it directly from O'Reilly might get it a little sooner, but I can't promise that. (If you haven't ordered it yet, there's a 35% discount available if you use offer code 'JAVAGL' at the O'Reilly store.)

Now that the book is on its way, I have a correction to make. (What, already? Yep... and it won't be the last.) In Chapter 7, in the section describing how to eliminate bugs, Shane and I say, "Start with test-driven development (TDD), which is a proven technique for reducing the number of defects you generate [Janzen & Saiedian]". Later, in Chapter 9, when describing how to use test-driven development, we say, "Research shows that TDD substantially reduces the incidence of defects [Janzen & Saiedian]."

These statements mischaracterize Janzen & Saiedian's conclusions. They actually conclude:

Further research must be done to determine and understand TDD's effects... there have only been a small number of studies conducted [on TDD's effect on defect density], and those on only small samples. One industry study with more than 10 participants involved a small application that took only one day to complete. The results were suspect because the control group wrote a minimal number of tests.

The few academic studies that have examined defect density produced inconsistent results. The largest study reported a 54 percent reduction in defect density with beginning programmers. Two other reasonably large studies with advanced programmers did not provide any significant reduction in defect density. One study hinted at better designs.

How did this happen? In my initial research, I focused on the article's summary of industry research, which does support our statements. Later, I read the article more carefully and made a note to clarify the reference, but I overlooked it in the bustle of preparing the book for print.

Here's the article's summary of TDD research in industry:

These studies showed that programmers using TDD produced code that passed 18 percent to 50 percent more external test cases than code produced by corresponding control groups. The studies also reported less time spent debugging code developed with TDD. Further, they reported that applying TDD had an impact that ranged from minimal to a 16 percent decrease in programmer productivity--which shows that applying TDD sometimes took longer. In the case that took 16 percent more time, researchers noted that the control group wrote far fewer tests than the TDD group.

Study       | Type                  | Number of companies | Number of programmers | Quality effects                 | Productivity effects
George      | Controlled experiment | 3                   | 24                    | TDD passed 18% more tests       | TDD took 16% longer
Maximilien  | Case study            | 1                   | 9                     | 50% reduction in defect density | Minimal impact
Williams    | Case study            | 1                   | 9                     | 40% reduction in defect density | No change

And here's its summary of academic research. I found it interesting that the article implied a preference for studies that looked at TDD without the other XP practices: "Several academic studies have examined XP as a whole, but a few focused on TDD." I couldn't tell from the article whether the studies examined TDD in isolation or in the context of XP. I can see why researchers would want to study one practice at a time, but part of what makes TDD successful is supporting practices like pair programming, collective code ownership, merciless refactoring, and incremental/evolutionary design.

Two of the five studies reported significant improvement in software quality and programmer productivity. One reported a correlation between the number of tests written and productivity. In this particular study, students using test-first methods wrote more tests and were significantly more productive. The remaining two studies reported no significant improvement in either defect density or productivity.

Controlled experiment | Number of programmers | Quality effects             | Productivity effects
Kaufmann              | 8                     | Improved information flow   | 50% improvement
Edwards               | 59                    | 54% fewer defects           | n/a
Erdogmus              | 35                    | No change                   | Improved productivity
Müller                | 19                    | No change, but better reuse | No change
Pancur                | 38                    | No change                   | No change

I believe that TDD is a powerful tool that many teams have used to reduce the number of defects they generate. This has been my personal experience and the experience of my colleagues, and it is supported by the industry studies quoted by Janzen & Saiedian. However, Janzen & Saiedian do not say that TDD is a proven technique for reducing defects. We regret the error.
