Bjorn Freeman-Benson: Three Challenges of Distributed Teams

19 Feb, 2019

Distributed teams. Are they the solution to our staffing problems, or an expensive fantasy? I had the opportunity to sit down with Bjorn Freeman-Benson earlier this month to explore these questions.

Bjorn is an experienced software leader. Highlights of his career include heading up engineering at New Relic as it grew from 3 developers to 330; growing InVision from 60 developers to 350 as its CTO; working on VisualAge Smalltalk at OTI (Object Technology International); and coordinating the open-source Eclipse developers at the Eclipse Organization.

Throughout his career, Bjorn has worked with many different types of teams, from fully co-located to fully distributed. New Relic’s developers all worked together, on several floors of an office building in downtown Portland. InVision was fully distributed, with everyone working remote. OTI had small offices all over the world. And the Eclipse Organization was semi-distributed, with contributions coming from a mix of on-site employees and remote volunteers.

When I heard Bjorn’s story, I knew I had to convince him to give me an interview. Two weeks ago, we sat down at a coffee shop in northeast Portland.

Bjorn described three challenges for distributed teams: the n+1 management problem, the junior people problem, and the friction of communication problem.

The N+1 Management Problem

“For an engineering manager role, you have to hire people with director-level skills.”

“In a distributed organization, managers must be n+1 good,” Bjorn said. In other words, for an engineering manager role, you have to hire people with director-level skills.

This is an astonishing claim. To understand it, you need to understand the difference between a manager’s skillset and a director’s skillset.

Manager Skills vs. Director Skills

To be a good manager, Bjorn says, you have to learn how to interact with people. In a co-located team, you can see the minute-to-minute behavior of a team just by walking around. This gives you... not “control,” exactly, but an understanding of the issues people are facing and what they need. It allows you to intervene without micromanaging.

As an example, Bjorn pointed to a woman sitting behind us at the coffee shop. “See that woman there?” he said. “I can tell she isn’t very engaged with her work because she keeps checking her phone.” If he were her manager, he could coach her about staying engaged.

To be a good director, on the other hand, you have to learn indirect management. You don’t have the same insight into people’s work that you do as a manager. Instead, you have to infer what’s happening based on the results you see.

Managing a Distributed Team

These indirect management skills are also needed in distributed teams. In a distributed team, you can’t see people working, so you have to infer what help they need from their output and questions.

Interpersonal interactions are also harder to manage in a distributed team. “If I piss you off but don’t realize it in a local office,” Bjorn said, “later on I’ll notice you glaring at me. You don’t usually do that. So we fix it.”

In a distributed team, “The chance of pissing you off is greater because we’re communicating via video. Then I don’t see you again, and the next thing I’ve heard is that you resigned.”

So instead, managers of distributed teams need to practice a remote version of management by walking around. You need to touch base with everyone on the team frequently, but you don’t want to appear to be micromanaging. Bjorn suggested ideas such as a virtual lunch with the team every day, calling up and chatting about football or some other shared interest, and project demo days every week. For the latter, Bjorn said he made a point of never providing negative feedback: the purpose of the demo was to give the team a chance to show off and reinforce interpersonal connections, not critique the team’s work.

The Junior People Problem

“It’s basically impossible to hire junior people in a distributed company.”

“It’s basically impossible to hire junior people in a distributed company,” Bjorn said. “I watched it work at [co-located] New Relic but not at [distributed] InVision.”

Bjorn didn’t mean that it was physically impossible to hire junior developers, of course. He meant that hiring junior people doesn’t pay off.

Bjorn explained it this way: When a developer is just starting, they have a lot of questions. They don’t know how to do the work. But at the same time, they’re at the bottom of the pecking order. They’re hesitant to interrupt and don’t want to be seen as a burden.

At New Relic, which had developers sitting next to each other, juniors could see what people were doing and ask questions at convenient times, when they knew they weren’t interrupting.

With InVision’s 100% remote teams, though, junior developers couldn’t see what other developers were doing. They were afraid to interrupt. “They froze and got stuck,” Bjorn said, “until much later when they were explicitly asked what they were doing.”

“Typically, we get much more from juniors than we pay them,” Bjorn said. That’s because junior developers aren’t paid much and they develop their skills quickly. At InVision, though, juniors “didn’t develop their skills and didn’t get much work done, so hiring juniors wasn’t a good use of money.”

Lack of junior developers is a problem in companies with more than fifty engineers, Bjorn says. “There’s some grunt work you can’t get [senior] people to do. They don’t want to do it.” You need people at all skill levels to take care of the work that exists at all levels.

The Friction of Communication Problem

“We had to overstaff to get the same amount of creativity.”

Due to communication friction, “we got much less creativity out of our [distributed teams] at InVision,” Bjorn said. “We had to overstaff to get the same amount of creativity. In other words, if communication was 80% as effective, we’d have to hire 30% more people.” Communication friction didn’t necessarily affect productivity, but it did affect creativity. And in InVision’s startup environment, creativity was key.

In a distributed team, Bjorn says, two things keep occurring. First, the cost of establishing communication is much higher, even when the technology works perfectly.

  1. A: B, are you there?
  2. B: Yes I am
  3. A: Let’s chat
  4. B: OK
  5. A: (sets up video call)
  6. A: (waits)
  7. B: (joins call)

In comparison, in a co-located team, there’s no friction.

  1. A: (turns head) B, I have a question.

Theoretically, you could use always-on video links, but “people don’t want to do that. It’s uncomfortable.”

Second, in practice, video call technology doesn’t allow simultaneous conversation. Only one microphone comes through at a time, and latency means you end up talking over each other.

This means that you can’t participate in the conversation like you would in person. In person, you’ll say “uh huh” and make small interjections that help guide the flow of the conversation into a proper collaboration. A video call, on the other hand, ends up being a series of short lectures.

This is especially bad for women, Bjorn says, who tend not to interrupt. It makes it harder to create a diverse team.

The end result is that you get much less creativity out of a distributed team. You have to overstaff to get the same amount of creativity you would get from a co-located team.

The Consequences of Distributed Work

“These purported benefits of fully distributed teams turn out to not be true at scale.”

Common wisdom is that distributed work makes some kinds of communication more difficult, but that those downsides are compensated for by decreased costs and greater hiring flexibility. But for Bjorn, it didn’t work out that way.

Expected Benefits           Actual Result
More choice in hiring       Less choice in hiring
Lower costs                 Higher costs
More work out of people     More work, but less creativity

For small organizations, Bjorn says, distributed teams do work effectively. Everyone knows each other and you don’t need multiple levels of indirection.

But beyond about 25 people, Bjorn says the problems he mentioned come into play. “These purported benefits of fully distributed teams turn out to not be true at scale.”

Less Choice in Hiring

Because of the N+1 management problem and the junior people problem, it’s actually harder to hire for distributed teams than co-located teams, Bjorn says. In addition, the people you really want to hire already live in one of the tech centers. “You don’t find a Kubernetes expert in Iowa,” Bjorn said. “You just don’t.”

Higher Costs

According to Bjorn, the ROI curve for developers looks like a bowl: high ROI for junior and senior developers, and closer to break-even for mid-career developers. That’s because junior developers are inexpensive and senior developers raise the capabilities of the people around them.

But in a distributed team, the ROI of junior developers is negative, so you end up hiring less cost-effective mid-career developers in their place. Additionally, Bjorn found that he had to hire more developers in order to get the amount of creativity he needed.

More Work, but Less Creativity

Because they didn’t have to commute, Bjorn did find that he got more hours out of people in his distributed teams. But, he said, “The key thing in any business I’ve worked in is the creative output. [In a distributed team,] you get less of it because of friction. You may even get more units of work, but Jira tickets don’t pay the bills.”

What works?

“My current belief is that there’s really only one structure that works.”

“My current belief is that there’s really only one [distributed] structure that works,” Bjorn said. That structure is the “Pod Remote” approach used by OTI.

OTI had a lot of small offices all over the world. Each office had about ten people who worked co-located. Each was built around a superstar developer, who recruited a mix of junior and senior people locally and was supported by the larger organization.

Each office specialized in a particular type of project. For example, “if you wanted to work on databases, you had to move to Sydney to work with Jeff’s team.”

This model allowed OTI to hire great people wherever they lived. Because teams were co-located, advanced management skills weren’t needed. In fact, one of these offices could be a star’s first management job. OTI still needed N+1 managers to run the distributed organization, but “you need that skill anyway from directors on up.”

The Pod Remote approach to distributed teams is the best of both worlds, Bjorn says. You can hire great people around the world, but still get the advantages of co-located work. “As I look to doing another startup,” Bjorn said, “it’s going to be Pod Remote.”

Don't Measure Unit Test Code Coverage

01 Feb, 2019

If you're using test-driven development, don't measure unit test code coverage. It's worse than a useless statistic; it will actively lead you astray.

What should you do instead? That depends on what you want to accomplish.

To improve code and test practices

If you're trying to improve your team's coding and testing practices, perform root-cause analysis1 of escaped defects, then improve your design and process to prevent that sort of defect from happening again.

1As Michael Bolton points out, it should really be root-causes analysis.

If waiting for defects to escape is too risky for you, have experienced QA testers conduct exploratory testing and conduct root-cause analysis on the results. Either way, the idea here is to analyze your defects to learn what to improve. Code coverage won't tell you.

To improve programmer code quality

If you're trying to improve programmers' code quality, teach testing skills, speed up the test loop, refactor more, use evolutionary design, and try pairing or mobbing.

Teaching testing skills and speeding up the test loop makes it easier for programmers to write worthwhile tests. Test coverage doesn't; it encourages them to write worthless tests to make the number go up.

Refactoring more and using evolutionary design makes your design simpler and easier to understand. This reduces design-related defects.

Pairing and mobbing enhance the self-discipline on your team. Everybody feels lazy once in a while, but when you're pairing (or mobbing), it's much less likely that everybody involved will be lazy at the same time. It also makes your code higher quality and easier to understand, because working together allows programmers to see the weaknesses in each other's code and come up with more elegant solutions.

To improve test discipline

Some people use code coverage metrics as a way of enforcing the habits they want. Unfortunately, habits can't be enforced, only nurtured. I'm reminded of a place I worked where managers wanted good code commit logs. They configured their tool to enforce a comment on every commit. The most common comment? "a." They changed the tool to enforce multiple-word comments on every commit. Now the most common comment was "a a a."

Enforcement doesn't change minds. Instead, use coaching and discipline-enhancing practices such as pairing or mobbing.

To add tests to legacy code

To build up tests in legacy code, don't worry about overall progress. The issue with legacy code is that, without tests, it's hard to change safely. The overall coverage isn't what matters; what matters is whether you're safe to change the code you're working on now.

So instead, nurture a habit of adding tests as part of working on any code. Whenever a bug is fixed, add a test first. Whenever a class is updated, retrofit tests to it first. Very quickly, the 20% of the code your team works on most often will have tests. The other 80% can wait.

To improve requirements code quality

If you're trying to improve how well your code meets customer needs, involve customer representatives early in the process, like "before the end of the Sprint" early. They won't always tell you what you're missing right away, but the sooner and more often you give them the chance to do so, the more likely you are to learn what you need to know.

To improve non-functional quality

If you're trying to improve "non-functional" qualities such as reliability or performance, use a mix of real-world monitoring, fail-fast code, and specialized testbeds. Non-functional attributes emerge from the system as a whole, so even a codebase with 100% coverage can have problems.

Here's the thing about TDD

The definition of TDD is that you don't write code without a failing test, and you do so in a tight loop that covers one branch at a time. So if you're doing TDD, any code you want to cover is ipso facto covered. If you're still getting defects, something else is wrong.

If people don't know how to do TDD properly, code coverage metrics won't help. If they don't want to cover their code, code coverage metrics won't help. If something else is wrong, you got it, code coverage metrics won't help. They're a distraction at best, and a metric to be gamed at worst. Figure out what you really want to improve and focus directly on that instead.

PS: Only a Sith deals in absolutes. Ryan Norris has a great story on Twitter about how code coverage helped his team turn around a legacy codebase. Martin Fowler has written about how occasional code coverage reviews are a useful sanity check.

(Thanks to everyone who participated in the lively Twitter debate about this idea.)

FluencyByDesign Game is a Bootleg Copy of Agile Fluency Game

30 Jan, 2019

Picture of The Agile Fluency Game box set, with all its component parts laid out on a table. The credits say, 'A game by James Shore and Arlo Belshee. Art by Eric Wahlquist.'

The real game.

Picture of The FluencyByDesign Simulation box set. The credits say, 'A game by James Shore and Arlo Belshee with Steve Holyer and Diana Larsen Bonnie Aumann Adam Light. Art by Eric Wahlquist.'

The bootleg copy.

I was shocked and disappointed to learn yesterday that Steve Holyer has made an unauthorized copy of The Agile Fluency* Game called "The FluencyByDesign* Simulation." I created this game with Arlo Belshee in 2012 and spent five years playtesting and refining it. Steve Holyer replaced our name, trademark, and branding with his own and sold copies of his version for thousands of euros.

*"Agile Fluency" is a trademark of James Shore and Diana Larsen. "FluencyByDesign" is a trademark of Steve Holyer and Associates.

The Agile Fluency Project, which I co-founded with Diana Larsen, is the only authorized publisher of the Agile Fluency Game. We at the Project are taking the appropriate legal steps. In the meantime, if you see copies of the bootleg version, please let the owner know they have an illegitimate version. We are replacing unauthorized copies at our expense. Write to info@agilefluency.org for details.

Please note, although Steve Holyer added several other people's names to the credits on his copy, they had nothing to do with Steve's bootleg. Please don't harass them.

The Details

As much as I would like to leave it there, I know some of you will find such blatant copyright infringement hard to believe and will want more details.

The short version is that Steve Holyer was a contributor to the Agile Fluency Project in 2016 and 2017. During that time, he had access to all our files. He must have kept a copy of them after he left the Project, because sometime in 2018 he used our game assets to publish his own version. Why he did this, and how he thought he could do it without getting caught, remains a mystery to me.

Here's the timeline of events:

    Picture of a table full of index cards and post-it notes. It looks like a confusing mess.

    The original design process.

  • Summer 2012. Arlo Belshee and I create the game for an Agile 2012 conference presentation called "This one goes to 121."

  • 8 August 2012. Diana Larsen and I write about our work on the Agile Fluency Model. Martin Fowler publishes it for us. Steve Holyer is one of several reviewers.

  • 2013-2016. I playtest the game at conferences and with clients, refining the game through six versions. I share copies with a few trusted colleagues for them to use with their own clients.

  • April 2015. Diana Larsen, Adam Light, and I host the first Agile Fluency Gathering at Diana's house. It's an intimate gathering with 15-20 attendees. We share Diana and Adam's work on what would become the Agile Fluency Diagnostic and ask the participants where we should go next with the Agile Fluency Model. Steve is one of the attendees.

  • Picture of three game cards. They show dense text with three places in the middle for putting game tokens.

    v0.4 cards that Steve and Diana used.

  • May 2015. Diana and Steve host the "Agile Fluency Immersion" workshop in Germany. With my permission, Diana incorporates the game into their workshop materials.

  • 15 July 2015. Adam Light, Diana Larsen, and I found Agile Fluency Project LLC to respond to people's interest in the model.

  • Picture of game rules. They use the 'Agile Fluency Game' title and reference the Agile Fluency trademark. The copyright notice says, 'Copyright 2012-2015 by James Shore and Arlo Belshee.'

    Rules from v0.5 I sent Steve to playtest.

  • 26 Oct 2015. Steve sends me an email asking permission to use the game in his workshops. He writes, "Since [Diana and I] had such a great result, I'd like to confirm with you that it's OK to keep using the game in Agile Fluency workshops that Diana and I run or that Diana and I run separately. I'd also like to use it occasionally with clients not directly related to the Agile Fluency Model. I understand that you aren't ready to release the game, but I am hoping that you feel comfortable with me using it occasionally."

  • 27 Oct 2015. Arlo and I give Steve permission to use the game in his workshops. I send him v0.5 of the game. This version is functionally identical to today's published version. I ask for Steve's playtest feedback.

  • 11 Dec 2015. Steve provides playtest feedback, including a few minor suggestions that I incorporate, such as numbering the cards and clarifying which cards the PM gets. This, and similarly minor feedback in 2016, is the extent of Steve's creative contribution to the game.

  • 31 Dec 2015. Based on Steve's history with Diana and the Agile Fluency Model, Adam Light sends Steve an email saying he'd like to talk about "joining forces to move the Agile Fluency Project forward." He agrees to join the core team on 6 Jan 2016, although the relationship isn't formalized and Steve doesn't join the Project full time.

  • Jan 2016 - Apr 2017. Steve is a member of the Agile Fluency Project's core team which consists of Diana, Adam, Steve, myself, and another contributor who isn't part of this story. He is involved with a variety of marketing, sales, and planning efforts and has full, trusted access to everything we do.

  • Mid 2016. Arlo and I give the Agile Fluency Project permission to publish and sell the game. Steve takes point on figuring out printing and publication and settles on a print-on-demand publisher.

  • July 2016. I rebuild the game assets in InDesign, creating v0.6 of the game. Steve uses them to print a prototype box set that we show off at Agile 2016 at the end of July. To the best of my knowledge, this is the end of Steve's involvement with producing the game.

  • Picture of The Agile Fluency Game box set.

    The published game.

  • Feb - Mar 2017. Adam Light, acting on behalf of the Agile Fluency Project, hires Eric Wahlquist to produce production-grade art for the game. Eric is an experienced graphic designer and game designer. I work closely with him to produce the final game assets, which we call v7, and I organize everything with the printer. This is the version we still sell today. Steve is not involved and his prototype is not used.

  • April 2017. The Agile Fluency Project's core team meets in Portland to formalize the relationship between our contributors (including Steve) and the official Agile Fluency Project LLC entity. Despite several months of advance negotiations and apparent consensus, Steve ultimately declines to either agree or suggest changes. When the other participants insist on action one way or the other, Steve chooses to leave the Project instead.

  • April 2017 - Jan 2019. Although Steve is no longer part of the Agile Fluency Project's core team, he has favored status and continues to work with us on subjects that interest him. Steve is the first member of our Agile Fluency Game reseller program, which allows him to buy games from us at a significant discount and resell them to participants in his workshops. He helps plan the 2017 Agile Fluency Gathering and he presents the Agile Fluency Game workshop at the 2018 Agile Fluency Gathering.

  • 28 Jan 2019. We learn that Steve has copied and sold his own version of the Agile Fluency Game, in clear violation of copyright law and our reseller agreement with him. We cut all ties and refer the matter to our lawyer.

And that brings us to today. As you can see, although Steve playtested the game and helped print a prototype copy of the game, nothing gives him the right to copy my and Arlo's work and sell it as his own.

Large-Scale Agile: Where Do You Want Your Complexity?

19 Jan, 2019

One of the pernicious problems in large-scale software development is cross-team coordination. Most large-scale Agile methods focus on product and portfolio coordination, but there's a harder problem out there: coordinating the engineering work.

Poor engineering coordination leads to major problems: bugs, delays, production outages. Cross-team inefficiencies creep in, leading to Ron Jeffries' description: "a hundred-person project is a ten-person project, with overhead." One of my large-scale clients had teams that were taking three months to deliver five days of work—all due to cross-team coordination challenges.

How do you prevent these problems? One of the key ideas is to isolate your teams: to carefully arrange responsibilities so they don't need to coordinate so much. But even then, as Michael Feathers says, there's a law of conservation of complexity in software. We can move the complexity around, but we can't eliminate it entirely.

So where should your complexity live?

Monolith: Design Complexity

In the beginning, applications were monoliths. A monolith is a single program that runs in a single process, perhaps replicated across multiple computers, maybe talking to a back-end database or two. Different teams can be assigned to work on different parts of the program. Nice and simple.

Too simple. A monolith encourages bad habits. If two parts of the program need to communicate, sometimes the easiest way is to create a global variable or a singleton. If you need a piece of data, sometimes the easiest way is to duplicate the SQL query. These shortcuts introduce coupling that makes the code harder to understand and change.

To be successful with a monolith, you need careful design to make sure different parts of the program are isolated. The more teams you have, the more difficult this discipline is to maintain. Monoliths don't provide any guardrails to prevent well-meaning programmers from crossing the line.

Microservices: Ops Complexity

At first glance, microservices seem like the perfect solution to the promiscuous chaos of the monolith. Each microservice is a small, self-contained program. Each is developed completely independently, with its own repository, database, and tooling. Services are deployed and run separately, so it's literally impossible for one team to inappropriately touch the implementation details of another.

I see it so often, I’ve given it a name: “Angry Ops Syndrome.”

Unfortunately, microservices move the coordination burden from development to operations. Now, instead of deploying and monitoring a single application, ops has to deploy and monitor dozens or even hundreds of services. Applications often can't be tested on developers' machines, requiring additional ops effort to create test environments. And when a service's behavior changes, other services that are affected by those changes have to be carefully orchestrated to ensure that they're deployed in the right order.

The microservice ops burden is often underestimated. I see it so often, I've given it a name: "Angry Ops Syndrome." As dev grows, the ops burden multiplies, but ops hiring doesn't keep pace. Problems pile up and a firefighting mentality takes over, leaving no time for systemic improvements. Sleepless nights, bitterness, and burnout result, leading to infighting between ops and dev.

Microservices also introduce complexity for programmers. Because of the stringent isolation, it's difficult to refactor across service boundaries. This can result in janky cross-service division of responsibilities. Programmers also need to be careful about dealing with network latency and errors. It's easy to forget, leading to production failures and more work for ops.

To be successful with microservices, you need well-staffed ops, a culture of dev-ops collaboration, and tooling that allows you to coordinate all your disparate services. Because refactoring across service boundaries is so difficult, Martin Fowler also suggests you start with a monolith so you can get your division of responsibilities correct.

Nanoservices: Versioning Complexity

What's a nanoservice? It's just like a microservice, but without all the networking complexity and overhead. In other words, it's a library.

Joking aside, the key innovation in microservices isn't the network layer, which introduces all that undesired complexity, but the idea of small, purpose-specific databases1. Instead of microservices, you can create libraries that each connect to their own database, just like a microservice. (Or even a separate set of tables in a single master database. The key here is separate, though.) And each library can have its own repository and tooling.

1Thanks to Matteo Vaccari for this insight.

Using a library is just a matter of installing it and making a function call. No network or deployment issues to worry about.
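
To make the idea concrete, here's a minimal sketch in JavaScript. The users-library module and its db helper are hypothetical, not from any particular framework; the point is that the library owns its own tables and exposes plain function calls to consumers.

// JavaScript sketch: a "nanoservice" is a library that owns its own data.
// (Hypothetical module; the db helper connects only to this library's tables.)

// users-library/index.js
const db = require("./db");

async function findUser(id) {
  // Only this library queries the users tables.
  return await db.query("SELECT * FROM users WHERE id = ?", [id]);
}

async function createUser(name) {
  return await db.query("INSERT INTO users (name) VALUES (?)", [name]);
}

module.exports = { findUser, createUser };

// A consumer just installs the library and makes a function call -- no network hop:
//   const users = require("users-library");
//   const user = await users.findUser(42);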

But now you have versioning hell. When you update your library, how do you make sure everybody installs the new version? It's particularly bad when you want to update your database schema. You either have to maintain multiple versions of your database or wait for 100% adoption before deploying the updates.

To be successful with libraries, you need a way to ensure that new versions are consumed quickly and the software that uses those versions is deployed promptly. Without it, your ability to make changes will stall.

Monorepo: Tooling Complexity

Maybe total isolation isn't the answer. Some large companies, including Google and Facebook, take a hybrid approach. They keep all their code in a single repository, like a monolith, but they divide the repository into multiple projects that can be built and deployed independently. When you update your project, you can directly make changes to its consumers, but the people who are affected get to sign off on your changes.

The problem with this approach is that it requires custom tooling. Google built their own version control system. Facebook patched Mercurial. Microsoft built a virtual filesystem for git.

You also need a way of building software and resolving cross-team dependencies. Google's solution is called Blaze, and they've open-sourced a version of it called Bazel. Additional tooling is needed for cross-team ownership and sign-offs.

These tools are starting to enter the mainstream, but they aren't there yet. Until then, to be successful with a monorepo, you need to devote extra time to tooling.

Which Way is Best?

I'm partial to the monorepo approach. Of all the options, it seems to have the best ability to actually reduce coordination costs (via tooling) rather than just sliding the costs around to different parts of the organization. If I were starting from scratch, I would start with a monorepo and scale my tooling support for it along with the rest of the organization. In a large organization, a percentage of development teams should be devoted to enabling other developers, and one of them can be responsible for monorepo tooling.

But you probably don't have the luxury of starting from scratch. In that case, it's a matter of choosing your poison. What is your organization best at? If you have a great ops team and embedded devops, microservices could work very well for you. On the other hand, if you aren't great at ops, but your programmers are great at keeping their designs clean, a monolith could work well.

As always, engineering is a matter of trade-offs. Which one is best? Wherever you want your complexity to live.

(Thanks to Michael Feathers for reviewing an early draft of this essay.)

My Best Essays

11 Jan, 2019

I've been writing about agile software development for nearly 20 years, and most of it is available on this blog. I took some time recently to refresh my list of best essays. The ten most popular are below. You can find the full list here.

Top Ten
The Agile Fluency™ Model: A Brief Guide to Success with Agile - 6 Mar 2018

A model for getting the most out of agile ideas. (Coauthored with Diana Larsen; hosted at martinfowler.com.)

Dependency Injection Demystified - 22 Mar, 2006

A 25-dollar term for a 5-cent concept.

The Art of Agile Development (Book) - 2008

The Agile how-to guide. (Coauthored with Shane Warden.)

Continuous Integration on a Dollar a Day - 27 Feb, 2006

An easier, cheaper (and better) way to do continuous integration.

Testing Without Mocks: A Pattern Language - 27 Apr, 2018

How to use test-driven development without traditional test doubles.

Red-Green-Refactor - 30 Nov, 2005

Test-driven development in a nutshell.

Cargo Cult Agile - 14 May, 2008

Following the rituals of agile development without understanding the underlying ideas.

The Decline and Fall of Agile - 14 Nov, 2008

It's human nature to only do the stuff that's familiar and fun, and that's what has happened with Agile.

Use Risk Management to Make Solid Commitments - 8 Oct, 2008

How to use risk multipliers and risk-adjusted burn-up charts to make solid commitments to your executives.

The Problems With Acceptance Testing - 27 Feb, 2010

Why I don't use acceptance testing tools.

Colophon

10 Jan, 2019

Every so often, I describe what I do to make this site a reality. For the four of you who like to watch me gaze at my navel, I have good news: today is one of those days. For the rest of you, I'm sorry. It's one of those days. You can move on. I won't be offended.

Okay, here goes.

Production

I use a MacBook to compose all of my essays. Years ago, my Windows laptop died and I decided to see if life was better on the other side. I have come to like Mac OS X, after some initial frustrations, and now I have a hard time imagining going back.

I write the body of my entries in hand-crafted HTML using Webstorm. I also use OmniGraffle when I need to make pretty diagrams. Mmm... lickable.

Everything required to make the site go is stored locally, on my MacBook. I deploy the site to a local copy of Apache for testing, then to my web host when I'm ready to go live. I use a Ruby Rakefile to build the site, and then rsync the whole mess to the server. The exact same site goes on the server as on the local test site; all that changes is the rsync destination.

I am absolutely paranoid about backups and revision control. There's decades of work here. That's why all of my essays are written in a text editor and stored on my laptop. Most blogging sites save your work in the cloud. Easy, but not fault tolerant. I've been writing for nearly two decades and plan to continue writing for several more. My stuff has to be easily scripted, version-controlled, backed up, and trivial to deploy to another web host when (not if) my host goes under... which it already has. Twice.

The source for my site (including rakefiles) is versioned by Subversion. (For new codebases, I use git, but there's no reason to switch. It could be worse... the code used to be in CVS.) The repository is local, on my laptop. The laptop is backed up several times over: first, the entire site is on the web host, of course. The computer itself is backed up to two redundant drives with Time Machine every hour. I also make a bootable whole-disk backup using SuperDuper!. Every quarter, I rotate one of the backup drives to an off-site location.

Design

The site was originally designed in 2002 and it shows. Boy, does it show. But it's functional and there always seems to be some other priority getting in the way of a redesign.

The biggest challenge for me in originally making the site was my complete lack of web design skills. (Software engineering, yes. Graphic design, no.) Sitepoint got me started and A List Apart carried me through. I borrowed liberally from Jeffrey Zeldman in creating the side menu, as permitted by comments in his CSS.

For the color scheme, Jason Beaird's "Color for Coders" article helped me get started and Visibone's Color Laboratory allowed me to pick web-safe colors. (I have no idea if we're supposed to limit ourselves to 216 "web-safe" colors any more, but it seemed like a safe bet. I cheated a bit for quotes, though.)

I also took advantage of some royalty-free icons. The RSS feed icon came from Feed Icons; the Twitter icon is resized from an icon I got from Productivedreams.com; the print icon came from graphicPUSH; the star icon came from 1ClipArt; and the spinner came from Andrew Davidson.

Finally, lots of trial and error got me to the result I have today. W3Schools' CSS Reference was invaluable for making it work, but these days MDN is a better resource. Eric Meyer's "Going to Print" article provided the finishing touch for my print stylesheet by showing me how to automatically insert URLs after printed links.

Hosting

The site is hosted by NearlyFreeSpeech and runs on the Apache web server. I use mod_rewrite to make sure public-facing URLs are implementation-independent and to ensure that you can link to a page forever (for sufficiently small values of forever). Domain Discover makes sure that my domains (jamesshore.com and titanium-it.com) point to the right place. Google handles my analytics. Sorry about that.

The site is rendered by the ultra-minimalistic Blosxom. Back in the day, Blosxom was the only blogging software I could find that actually allowed me to store entries locally, in files that I can back up and put in version control, rather than on a database on the server. (Nowadays, of course, static site generators are a dime a dozen.) Blosxom runs on Perl 5. Blosxom does almost nothing by itself, so I use these plug-ins to help out: blok, meta, default_flavour, prefs, breadcrumbs, directorybrowse, entries_index_tagged, interpolate_fancy, menu, plain_text, postheadprefoot, readme, static_file, urltranslate, and wordcount.

I run my whole site with Blosxom, not just the blog portion. Getting this working was quite a headache and I had to make some custom mods to several of the plug-ins as well as Blosxom itself. It's worked for over 15 years, but the volume of content is causing performance problems, and Blosxom's idiosyncrasies are standing in the way of a redesign. It's past time for me to move on. I have a very nice CMS I wrote to manage my Let’s Code JavaScript site, and one of these days I'll migrate to that. That's probably when I'll make the badly-needed visual refresh, too.

Colophonem adidi.

Testing Without Mocks: A Pattern Language

27 Apr, 2018

For example code demonstrating these ideas, see my example on GitHub.

When programmers use test-driven development (TDD), the code they test interacts with other parts of the system that aren't being tested. To test those interactions, and to prevent the other code from interfering with their tests, programmers often use mock objects or other test doubles. However, this approach requires additional integration tests to ensure the system works as a whole, and it can make structural refactorings difficult.

This pattern language1 describes a way of testing object-oriented code without using mocks. It avoids the downsides of mock-based testing, but it has tradeoffs of its own.

1The structure of this article was inspired by Ward Cunningham's CHECKS Pattern Language of Information Integrity, which is a model of clarity and usefulness.

Contents:

Goals

  • No broad tests required. The test suite consists entirely of "narrow" tests that are focused on specific concepts. Although broad integration tests can be added as a safety net, their failure indicates a gap in the main test suite.

  • Easy refactoring. Object interactions are considered implementation to be encapsulated, not behavior to be tested. Although the consequences of object interactions are tested, the specific method calls aren't. This allows structural refactorings to be made without breaking tests.

  • Readable tests. Tests follow a straightforward "arrange, act, assert" structure. They describe the externally-visible behavior of the unit under test, not its implementation. They can act as documentation for the unit under test.

  • No magic. Tools that automatically remove busywork, such as dependency-injection frameworks and auto-mocking frameworks, are not required.

  • Fast and deterministic. The test suite only executes "slow" code, such as network calls or file system requests, when that behavior is explicitly part of the unit under test. Such tests are organized so they produce the same results on every test run.

Tradeoffs

  • Test-specific production code. Some code needed for the tests is written as tested production code, particularly for infrastructure classes. It requires extra time to write and adds noise to class APIs.

  • Hand-written stub code. Some third-party infrastructure code has to be mimicked with hand-written stub code. It can't be auto-generated and takes extra time to write.

  • Sociable tests. Although tests are written to focus on specific concepts, the units under test execute code in their dependencies. (Jay Fields coined the term "sociable tests" for this behavior.) This can result in multiple tests failing when a bug is introduced.

  • Not a silver bullet. Code must be written with careful thought to design. Design mistakes are inevitable and this necessitates continuous attention to design and refactoring.

Architectural Patterns

Testing without mocks requires careful attention to the dependencies in your codebase. These patterns help establish the ground rules.

Overlapping Sociable Tests

Our goal is to create a test suite consisting entirely of "narrow" tests, with no need for "broad" end-to-end tests. But most narrow tests don't test that the system is wired together correctly. Therefore:

When testing the interactions between an object and its dependencies, inject real dependency instances (not test doubles) into the unit under test. Don't test the dependencies' behavior itself, but do test that the unit under test uses the dependencies correctly.

This will create a strong linked chain of tests. Each test will overlap with dependencies' tests and dependents' tests. The test suite as a whole should cover your entire application in a fine overlapping mesh, giving you the coverage of broad tests without the need to write them.
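
As a minimal sketch, a sociable test injects the real dependency and checks that the unit under test uses it correctly. (This reuses the Address class from the Parameterless Instantiation example below; the InventoryReport class and its methods are hypothetical.)

// JavaScript sketch: inject a real Address instance, not a test double.
it("includes the address in the report header", function() {
  const address = Address.createTestInstance();       // real dependency
  const report = new InventoryReport([ address ]);    // hypothetical unit under test

  // Address's own behavior is covered by Address's tests; here we only check
  // that InventoryReport uses its dependency correctly.
  assert.equal(report.renderHeader(), "Inventory Report for " + address.renderAsOneLine());
});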

To avoid constructing the entire dependency chain, use Zero-Impact Instantiation and Parameterless Instantiation. To isolate your unit under test, use Collaborator-Based Isolation and Nullable Infrastructure. To test code that depends on infrastructure, use Configurable Responses, Send State, Send Events, and Behavior Simulation.

A-Frame Architecture

Because we aren't using mocks to isolate our dependencies, it's easiest to test code that doesn't depend on infrastructure (external systems such as databases, file systems, and services). However, a typical layered architecture puts infrastructure at the bottom of the dependency chain:

   Application/UI
         |
         V
       Logic
         |
         V
   Infrastructure

Therefore:

Structure your application so that infrastructure and logic are peers under the application layer, with no dependencies between Infrastructure and Logic. Coordinate between them at the application layer with a Logic Sandwich or Traffic Cop.

       Application/UI
       /            \            
      V              V
   Infrastructure   Logic

Build the bottom two layers using Infrastructure Patterns and Logic Patterns.

Although A-Frame Architecture is a nice way to simplify application dependencies, it's optional. You can test code that mixes infrastructure and logic using Infrastructure Wrappers and Nullable Infrastructure.

To build a new application using A-Frame Architecture, Grow Evolutionary Seeds. To convert an existing layered architecture, Climb the Ladder.

Logic Sandwich

When using an A-Frame Architecture, the infrastructure and logic layers can't communicate with each other. But the logic layer needs to read and write data controlled by the infrastructure layer. Therefore:

Implement the top-level code as a "logic sandwich," where data is read by the infrastructure layer, then processed by the logic layer, then written by the infrastructure layer. Repeat as needed. Each piece can then be tested independently.

let input = infrastructure.readData();
let output = logic.processInput(input);
infrastructure.writeData(output);

This simple algorithm can handle sophisticated needs if put into a loop with a stateful logic layer.
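
As a rough sketch (the Logic class and its isDone() method are assumptions for illustration, not part of the pattern), the looped version might look like this:

// Sketch: Logic Sandwich in a loop with a stateful logic layer
const logic = new Logic();                    // hypothetical stateful logic object
while (!logic.isDone()) {                     // hypothetical termination check
  const input = infrastructure.readData();
  const output = logic.processInput(input);
  infrastructure.writeData(output);
}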

For applications with complicated infrastructure, use a Traffic Cop instead.

Traffic Cop

The Logic Sandwich boils infrastructure down into simple infrastructure.readData() and infrastructure.writeData() abstractions. Applications with complex infrastructure may not be a good fit for this approach. Therefore:

Instead of asking the infrastructure for its data, use the Observer pattern to send events from the infrastructure layer to the application layer. For each event, implement a Logic Sandwich. In some cases, your application code might need a bit of logic of its own.

infrastructure.on("login", (token) => {  // infrastructure layer
  let loginInfo = LoginInfo.createFromToken(token);  // logic layer
  if (loginInfo.isValid) {  // application logic
    let userData = subscriberService.lookUpUser(loginInfo.userId);  // infrastructure layer
    let user = new User(userData);  // logic layer
    infrastructure.createSession(user.sessionData);  // infrastructure layer
  }
});
infrastructure.on("event2", (data) => {
  let output = logic.processEvent2(data);
  infrastructure.writeData2(output);
});
//...etc...

Be careful not to let your Traffic Cop turn into a God Class. If it gets complicated, better infrastructure abstractions might help. Sometimes taking a less "pure" approach and moving some Logic code into the Infrastructure layer can simplify the overall design. In other cases, splitting the application layer into multiple classes, each with its own Logic Sandwich or simple Traffic Cop, can help.

Grow Evolutionary Seeds

One popular design technique is outside-in design, in which an application is programmed by starting with the externally-visible behavior of the application, then working your way in to the details.

This is typically done by writing a broad integration test to describe the externally-visible behavior, then using narrow unit tests to define the details. But we want to avoid broad tests. Therefore:

Use evolutionary design to grow your application from a single file. Choose a simple end-to-end behavior as a starting point and test-drive a single class to implement a trivial version of that behavior. Hardcode one value that would normally come from the Infrastructure layer, don't implement any significant logic, and return the result to your tests rather than displaying in a UI. This class forms the seed of your Application layer.

// JavaScript example: simplest possible Application seed

// Test code
it("renders user name", function() {
  const app = new MyApplication();
  assert.equal("Hello, Sarah", app.render());
});

// Production code
class MyApplication {
  render() {
    return "Hello, Sarah";
  }
}

Next, implement a barebones Infrastructure Wrapper for the one infrastructure value you hardcoded. Code just enough infrastructure to provide one real result to your application layer class. Don't worry about making it robust or reliable yet. This Infrastructure Wrapper class forms the seed of your Infrastructure layer.

Before integrating your new Infrastructure class into your Application layer class, implement Nullable Infrastructure. Then modify your application layer class to use the infrastructure wrapper, injecting the Null version in your tests.

// JavaScript example: Application + read from infrastructure

// Test code
it("renders user name", function() {
  const usernameService = UsernameService.createNull({ username: "my_username" });
  const app = new MyApplication({ usernameService });
  assert.equal("Hello, my_username", app.render());
});

// Production code
class MyApplication {
  // usernameService parameter is optional
  constructor({ usernameService = UsernameService.create() } = {}) {
    this._usernameService = usernameService;
  }
    
  async render() {
    const username = await this._usernameService.getUsername();
    return `Hello, ${username}`;
  }
}

Next, do the same for your UI. Choose one simple output mechanism that your application will use (such as rendering to the console, the DOM, or responding to a network request) and implement a barebones Infrastructure Wrapper for it. Add support for Nullable Infrastructure and modify your application layer tests and code to use it.

// JavaScript example: Application + read/write to infrastructure

// Test code
it("renders user name", function() {
  const usernameService = UsernameService.createNull({ username: "my_username" });
  const uiService = UiService.createNull();
  const app = new MyApplication({ usernameService, uiService });
  
  await app.render();
  assert.equal("Hello, my_username", uiService.getLastRender());
});

// Production code
class MyApplication {
  constructor({ 
    usernameService = UsernameService.create(),
    uiService = UiService.create(),
  } = {}) {
    this._usernameService = usernameService;
    this._uiService = uiService;
  }
    
  async render() {
    const username = await this._usernameService.getUsername();
    await this._uiService.render(`Hello, ${username}`);
  }
}

Now your application tests serve the same purpose as broad end-to-end tests: they document and test the externally-visible behavior of the application. Because they inject Null application dependencies, they're narrow tests, not broad tests, and they don't communicate with external systems. That makes them fast and reliable. They're also Overlapping Sociable Tests, so they provide the same safety net that broad tests do.

At this point, you have the beginnings of a walking skeleton: an application that works end-to-end, but is far from complete. You can evolve that skeleton to support more features. Choose some aspect of your code that's obviously incomplete and test-drive a slightly better solution. Repeat forever.

// JavaScript example: Application + read/write to infrastructure
// + respond to UI events

// Test code
it("renders user name", function() {
  const usernameService = UsernameService.createNull({ username: "my_username" });
  const uiService = UiService.createNull();
  const app = new MyApplication({ usernameService, uiService });
  app.start();
  
  uiService.simulateRequest("greeting");
  assert.equal("Hello, my_username", uiService.getLastRender());
});

// Production code
class MyApplication {
  constructor({
    usernameService = UsernameService.create(),
    uiService = UiService.create(),
  } = {}) {
    this._usernameService = usernameService;
    this._uiService = uiService;
  }
    
  async start() {
    this._uiService.on("greeting", async () => {
      const username = await this._usernameService.getUsername();
      await this._uiService.render(`Hello, ${username}`);
    });
  }
}

At some point, probably fairly early, your Application layer class will start feeling messy. When it does, look for a concept that can be factored into its own class. This forms the seed of your Logic layer. As your application continues to grow, continue refactoring so that class collaborations are easy to understand and responsibilities are clearly defined.

To convert existing code to an A-Frame Architecture, Climb the Ladder instead.

Climb the Ladder

Most pre-existing code you encounter will be designed with a layered architecture, where Logic code has Infrastructure dependencies. Some of this code will be difficult to test or resist refactoring. Therefore:

Refactor problem code into a miniature A-Frame Architecture. Start at the lowest levels of your Logic layer and choose a single method that depends on one clearly-defined piece of infrastructure. If the Infrastructure code is intertwingled with the Logic code, disintertwingle it by factoring out an Infrastructure Wrapper.

When the Infrastructure code has been separated from the rest of the code, the method will act similarly to an Application layer class: it will have a mix of logic and calls to infrastructure. Make this code easier to refactor by rewriting its tests to use Nullable Infrastructure dependencies instead of mocks. Then factor all the logic code into methods with no infrastructure dependencies.

At this point, your original method will have nothing left but a small Logic Sandwich: a call or two to the infrastructure class and a call to the new logic method. Now eliminate the original method by inlining it to its callers. This will cause the logic sandwich to climb one step up your dependency chain.
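
As a rough sketch (with hypothetical names), the intermediate state looks like this just before the inlining step:

// JavaScript sketch (hypothetical names): the original method, reduced to a
// small Logic Sandwich after the infrastructure and logic have been factored out.
class TaxReporter {
  constructor(taxFile = TaxFile.create()) {
    this._taxFile = taxFile;                         // Infrastructure Wrapper
  }

  async reportTaxes() {
    const taxData = await this._taxFile.read();      // infrastructure
    const totals = this.calculateTotals(taxData);    // pure logic, newly factored out
    await this._taxFile.write(totals);               // infrastructure
  }

  calculateTotals(taxData) {
    // Pure computation -- no infrastructure dependencies, easy to test.
    return taxData.reduce((sum, item) => sum + item.tax, 0);
  }
}

// Inlining reportTaxes() into its callers moves the read/calculate/write
// sandwich one step up the dependency chain.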

Repeat until the class no longer has any dependencies on infrastructure. At that point, review its design and refactor as desired to better fit the Logic Patterns and your application's needs. Continue with the next class.

Climbing the Ladder takes a lot of time and effort, so do it gradually, as part of your normal work, rather than all at once. Focus your efforts on code where testing without mocks will have noticeable benefit. Don't waste time refactoring code that's already easy to maintain, regardless of whether it uses mocks.

When building a new system from scratch, Grow Evolutionary Seeds instead.

Zero-Impact Instantiation

Overlapping Sociable Tests instantiate their dependencies, which in turn instantiate their dependencies, and so forth. If instantiating this web of dependencies takes too long or causes side effects, the tests could be slow, difficult to set up, or fail unpredictably. Therefore:

Don't do significant work in constructors. Don't connect to external systems, start services, or perform long calculations. For code that needs to connect to an external system or start a service, provide a connect() or start() method. For code that needs to perform a long calculation, consider lazy initialization. (But even complex calculations aren't likely to be a problem, so profile before optimizing.)
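
A minimal sketch of the idea (the SubscriberClient class and openSocket() helper are hypothetical):

// JavaScript sketch: a zero-impact constructor with an explicit connect() method.
class SubscriberClient {
  constructor(host = "subscribers.example.com") {
    // No network calls, no services started -- cheap to instantiate in any test.
    this._host = host;
    this._connection = null;
  }

  async connect() {
    // The slow, side-effecting work happens only when explicitly requested.
    this._connection = await openSocket(this._host);   // hypothetical helper
  }
}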

Signature Shielding

As you refactor your application, method signatures will change. If your code is well-designed, this won't be a problem for production code, because most methods will only be used in a few places. But tests can have many duplicated method and constructor calls. When you change those methods or constructors, you'll have a lot of busywork to update the tests. Therefore:

If a file has a lot of tests that call a specific method, provide a proxy function for that method. Similarly, if it has a lot of tests that instantiate a class, provide a factory method. Program the proxies and factories so their parameters are all optional. That way you can add additional parameters in the future without breaking existing tests.

// JavaScript code with named, optional parameters

// Example test
it("uses hosted page for authentication", function() {
  const client = createClient({   // Use the factory function
    host: "my.host", 
    clientId: "my_client_id"
  });
  const url = getLoginUrl({   // Use the proxy function
    client, 
    callbackUrl: "my_callback_url"
  });
  assert.equal(url, "https://my.host/authorize?response_type=code&client_id=my_client_id&callback_url=my_callback_url");
});

// Example factory function
function createClient({
  host = "irrelevant_host",
  clientId = "irrelevant_id",
  clientSecret = "irrelevant_secret",
  connection = "irrelevant_connection"
} = {}) {
  return new LoginClient(host, clientId, clientSecret, connection);
}

// Example proxy function
function getLoginUrl({
  client,
  username = "irrelevant_username",
  callbackUrl = "irrelevant_url"
} = {}) {
  return client.getLoginUrl(username, callbackUrl);
}

Logic Patterns

When using A-Frame Architecture, the application's Logic layer has no infrastructure dependencies. It represents pure computation, so it's fast and deterministic. To qualify for the Logic layer, code can't talk to a database, communicate across a network, or touch the file system.2 Neither can its tests or dependencies. Any code that breaks these rules belongs in the Application layer or Infrastructure layer instead. Code that modifies global state can be put in the Logic layer, but it should be avoided, because then you can't parallelize your tests.

Pure computation is easy to test. The following patterns make it even easier.

2This list inspired by Michael Feathers' unit testing rules.

Easily-Visible Behavior

Logic layer computation can only be tested if the results of the computation are visible to tests. Therefore:

Prefer pure functions where possible. Pure functions' return values are determined only by their input parameters.

When pure functions aren't possible, prefer immutable objects. The state of immutable objects is determined when the object is constructed, and never changes afterwards.

For methods that change object state, provide a way for the change in state to be observed, either with a getter method or an event.

In all cases, avoid writing code that explicitly depends on (or changes) the state of dependencies more than one level deep. That makes test setup difficult, and it's a sign of poor design anyway. Instead, design dependencies so they completely encapsulate their next-level-down dependencies.
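
A minimal sketch of the three preferences, in order:

// JavaScript sketch: making Logic-layer behavior visible to tests.

// 1. A pure function: the return value depends only on the parameters.
function addSalesTax(amount, rate) {
  return amount * (1 + rate);
}

// 2. An immutable object: state is fixed at construction and never changes.
class Money {
  constructor(amount) { this._amount = amount; }
  plus(other) { return new Money(this._amount + other._amount); }
  toNumber() { return this._amount; }
}

// 3. A stateful object: the state change is observable through a getter.
class ShoppingCart {
  constructor() { this._items = []; }
  addItem(item) { this._items.push(item); }
  getItems() { return [ ...this._items ]; }   // lets tests observe addItem()'s effect
}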

Testable Libraries

Third-party code doesn't always have Easily-Visible Behavior. It also tends to introduce breaking API changes with new releases, or simply stop being maintained. Therefore:

Wrap third-party code in code that you control. Ensure your application's use of the third-party code is mediated through your wrapper. Write your wrapper's API to match the needs of your application, not the third-party code, and add methods as needed to provide Easily-Visible Behavior. (This will typically involve writing getter methods to expose deeply-buried state.) When the third-party code introduces a breaking change, or needs to be replaced, modify the wrapper so no other code is affected.

Frameworks and libraries with sprawling APIs are more difficult to wrap, so prefer libraries that have a narrowly-defined purpose and a simple API.

If the third-party code interfaces with an external system, use an Infrastructure Wrapper instead.
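
As a sketch of the idea, here's a wrapper around the npm "uuid" package. The Uuid class name is invented for this example; the point is that the rest of the application depends on the wrapper, never on the package directly.

// Example Testable Library wrapper (a sketch; the Uuid class is hypothetical)

const { v4: uuidv4 } = require("uuid");

class Uuid {
  static create() {
    return new Uuid(uuidv4());
  }

  constructor(value) {
    this._value = value;
  }

  // Easily-Visible Behavior: expose the state the application actually needs
  toString() { return this._value; }
  equals(other) { return this._value === other.toString(); }
}

// If "uuid" introduces a breaking change or is abandoned, only this file changes.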

Parameterless Instantiation

Multi-level dependency chains are difficult to set up in tests. Dependency injection (DI) frameworks work around the problem, but we're avoiding magic like DI frameworks. Therefore:

Ensure that all Logic classes can be constructed without providing any parameters (and without using a DI framework). In practice, this means that most objects instantiate their dependencies in their constructor by default, although they may also accept them as optional parameters.

For some classes, a parameterless constructor won't make any sense. For example, an immutable "Address" class would be constructed with its street, city, and so forth. For these sorts of classes, provide a test-only factory method. The factory method should provide overridable defaults for mandatory parameters.

The factory method is easiest to maintain if it's located in the production code next to the real constructors. It should be marked as test-specific and should be simple enough to not need tests of its own.

// Example JavaScript code using named, optional parameters

class Address {
  // Production constructor
  constructor(street, city, state, country, postalCode) {
    this._street = street;
    this._city = city;
    //...etc...
  }
  
  // Test-specific factory
  static createTestInstance({
    street = "Address test street",
    city = "Address test city",
    state = State.createTestInstance(),
    country = Country.createTestInstance(),
    postalCode = PostalCode.createTestInstance()    
  } = {}) {
    return new Address(street, city, state, country, postalCode);
  }
}

Collaborator-Based Isolation

Overlapping Sociable Tests ensure that any changes to the semantics of a unit's dependencies will cause that unit's tests to break, no matter how far down the dependency chain they may be. On the one hand, this is nice, because we'll learn when we accidentally break something. On the other hand, this could make feature changes terribly expensive. We don't want a change in the rendering of addresses to break hundreds of unrelated reports' tests. Therefore:

Call dependencies' methods to help define test expectations. For example, if you're testing an InventoryReport that includes an address in its header, don't hardcode "123 Main St." as your expectation for the report header test. Instead, call Address.renderAsOneLine() as part of defining your test expectation.

// JavaScript example

// Example test
it("includes the address in the header when reporting on one address", function() {
  // Instantiate the unit under test and its dependency
  const address = Address.createTestInstance();
  const report = createReport({ addresses: [ address ] });
  
  // Define the expected result using the dependency
  const expected = "Inventory Report for " + address.renderAsOneLine();
  
  // Run the production code and make the assertion
  assert.equal(report.renderHeader(), expected);
});

// Example production code
class InventoryReport {
  constructor(inventory, addresses) {
    this._inventory = inventory;
    this._addresses = addresses;
  }
  
  renderHeader() {
    let result = "Inventory Report";
    if (this._addresses.length === 1) {
      result += " for " + this._address[0].renderAsOneLine();
    }
    return result;
  }  
}

This provides the best of both worlds: Overlapping Sociable Tests ensure that your application is wired together correctly and Collaborator-Based Isolation allows you to change features without modifying a lot of tests.

Infrastructure Patterns

The Infrastructure layer contains code for communicating with the outside world. Although it may contain some logic, that logic should be focused on making infrastructure easier to work with. Everything else belongs in the Application and Logic layers.

Infrastructure code is unreliable and difficult to test because of its dependencies on external systems. The following patterns work around those problems.

Infrastructure Wrappers

In the Logic layer, you can design your code to avoid complex, global state. In the Infrastructure layer, your code deals with nothing else. Testing infrastructure code that depends on other infrastructure code is particularly difficult. Therefore:

Keep your infrastructure dependencies simple and straightforward. For each external system--service, database, file system, or even environment variables--create one wrapper class that's responsible for interfacing with that system. Design your wrappers to provide a crisp, clean view of the messy outside world, in whatever format is most useful to the Logic and Application layers.

Avoid creating complex webs of dependencies. In some cases, high-level Infrastructure classes may depend on generic, low-level classes. For example, a LoginClient might depend on RestClient. In other cases, high-level infrastructure classes might unify multiple low-level classes, such as a DataStore class that depends on a RelationalDb class and a NoSqlDb class. Other than these sorts of simple one-way dependency chains, design your Infrastructure classes to stand alone.

Test your Infrastructure Wrappers with Focused Integration Tests and Paranoic Telemetry. Enable them to be used in other tests by creating Nullable Infrastructure.
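
For example, here's a minimal sketch of a wrapper for environment variables. The Environment class name and the PORT variable are assumptions for this example.

// Example Infrastructure Wrapper for environment variables (a sketch)

class Environment {
  static create() {
    return new Environment(process.env);
  }

  constructor(env) {
    this._env = env;
  }

  // Present a crisp, clean view of the outside world: parse and validate here
  // so the Logic and Application layers never deal with raw strings
  port() {
    const raw = this._env.PORT;
    if (raw === undefined) throw new Error("PORT environment variable is not set");
    const port = Number(raw);
    if (Number.isNaN(port)) throw new Error(`PORT must be a number, but was "${raw}"`);
    return port;
  }
}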

Focused Integration Tests

Ultimately, Infrastructure code talks over a network, interacts with a file system, or involves some other communication with an external system. Its correctness depends on communicating properly. Therefore:

Test your external communication for real. For file system code, check that it reads and writes real files. For databases and services, access a real database or service. Make sure that your test systems use the same configuration as your production environment. Otherwise your code will fail in production when it encounters subtle incompatibilities.

Run your focused integration tests against test systems that are reserved exclusively for one machine's use. It's best if they run locally on your development machine, and are started and stopped by your tests or build script. If you share test systems with other developers, you'll experience unpredictable test failures when multiple people run the tests at the same time.

You won't be able to get a local test system for every external system your application uses. When you can't, use a Spy Server instead.

Some high-level Infrastructure classes will use lower-level classes to do the real work, such as a LoginClient class that uses a RestClient class to make the network call. They can Fake It Once You Make It.
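
Here's a sketch of what a Focused Integration Test for file-system code might look like. The FileStore wrapper and its writeAsync()/readAsync() methods are hypothetical; the test talks to the real file system, but only inside a temporary directory it creates and deletes itself.

// Example Focused Integration Test for a hypothetical FileStore wrapper

const assert = require("assert");
const fs = require("fs/promises");
const os = require("os");
const path = require("path");

describe("FileStore", function() {
  let tempDir;

  beforeEach(async function() {
    // A throwaway directory reserved for this test run
    tempDir = await fs.mkdtemp(path.join(os.tmpdir(), "filestore-test-"));
  });

  afterEach(async function() {
    await fs.rm(tempDir, { recursive: true, force: true });
  });

  it("round-trips data through a real file", async function() {
    const store = FileStore.create(tempDir);            // hypothetical wrapper
    await store.writeAsync("greeting.txt", "hello");
    assert.equal(await store.readAsync("greeting.txt"), "hello");

    // Check against the real file system, not just the wrapper's own code
    const onDisk = await fs.readFile(path.join(tempDir, "greeting.txt"), "utf8");
    assert.equal(onDisk, "hello");
  });
});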

Spy Server

Some external systems are too unreliable, expensive, or difficult to use for Focused Integration Tests. It's one thing to run dozens of tests against your local file system every few minutes; quite another to do that to your credit card gateway. Therefore:

Create a test server that you can run locally. Program it to record requests and respond with pre-configured results. Make it very simple and generic. For example, all REST-based services should be tested by the same HTTPS Spy Server.

To test against the Spy Server, start by making a real call to your external system. Record the call and its results and paste them into your test code (or save them to a file). In your test, check that the expected request was made to the Spy Server and the response was processed correctly.

External systems can change out from under you. In the case of cloud-based services, it can happen with no warning. A Spy Server won't be able to detect those changes. To protect yourself, implement Paranoic Telemetry.

// Example Node.js LoginClient tests.

// Start, stop, and reset the Spy Server
const testServer = new HttpsTestServer();
before(async function() {
  await testServer.startAsync();
});
after(async function() {
  await testServer.stopAsync();
});
beforeEach(function() {
  testServer.reset();
});

// The test
it("gets user details", async function() {
  // Instantiate unit under test (uses Signature Shielding)
  const client = createNetworkedClient({
    managementApiToken: "my_management_api_token",
    connection: "my_auth0_connection",
  });
  
  // Set up Spy Server response
  testServer.setResponse({   
    status: 200,
    body: JSON.stringify([{
      user_id: "the_user_id",
      email_verified: false,
    }]),
  });

  // Call the production code
  const result = await client.getUserInfoAsync("a_user_email");
  
  // Assert that the correct HTTP request was made
  assert.deepEqual(testServer.getRequests(), [{
    method: "GET",
    url: "/api/v2/users?" +
      "fields=user_id%2Cemail_verified&" +
      "q=identities.connection%3A%22my_auth0_connection%22%20AND%20email%3A%22a_user_email%22&" +
      "search_engine=v2",
    body: "",
    headers: {
      host: testServer.host(),
      authorization: "Bearer my_management_api_token",
    },
  }], "request");
  
  // Assert that the response was processed properly
  assert.deepEqual(result, {
    userId: "the_user_id",
    emailVerified: false
  }, "result");
});

This is a complete example of a real-world Node.js HTTPS Spy Server. You can use this code in your own projects:

// Copyright 2018 Titanium I.T. LLC. All rights reserved. MIT License.
"use strict";

//** An HTTPS spy server for use by focused integration tests

const https = require("https");
const promisify = require("util").promisify;

const SELF_SIGNED_LOCALHOST_CERT_FOR_TESTING_ONLY =
  "-----BEGIN CERTIFICATE-----\n" +
  // TODO
  "-----END CERTIFICATE-----";

const CERT_PRIVATE_KEY_FOR_TESTING_ONLY =
  "-----BEGIN RSA PRIVATE KEY-----\n" +
  // TODO
  "-----END RSA PRIVATE KEY-----";

module.exports = class HttpsTestServer {
  constructor() {
    this._hostname = "localhost";
    this._port = 5030;
    this.reset();
  }

  reset() {
    this._forceRequestError = false;
    this._requests = [];
    this._responses = [];
  }

  hostname() { return this._hostname; }
  port() { return this._port; }
  host() { return this._hostname + ":" + this._port; }
  certificate() { return SELF_SIGNED_LOCALHOST_CERT_FOR_TESTING_ONLY; }

  getRequests() { return this._requests; }

  async startAsync() {
    const options = {
      cert: SELF_SIGNED_LOCALHOST_CERT_FOR_TESTING_ONLY,
      key: CERT_PRIVATE_KEY_FOR_TESTING_ONLY,
      secureProtocol: "TLSv1_method"
    };
    this._server = https.createServer(options);
    this._server.on("request", handleRequest.bind(null, this));

    await promisify(this._server.listen.bind(this._server))(this._port);
  }

  async stopAsync() {
    await promisify(this._server.close.bind(this._server))();
  }

  setResponses(responses) { this._responses = responses; }
  setResponse(response) { this._responses = [ response ]; }

  forceErrorDuringRequest() {
    this._forceRequestError = true;
  }
};

function handleRequest(self, request, response) {
  let responseInfo = self._responses.shift();
  if (responseInfo === undefined) responseInfo = { status: 503, body: "No response defined in HttpsTestServer" };

  const requestInfo = {
    method: request.method,
    url: request.url,
    headers: Object.assign({}, request.headers),
    body: ""
  };
  delete requestInfo.headers.connection;
  self._requests.push(requestInfo);

  if (self._forceRequestError) request.destroy();

  request.on("data", function(data) {
    requestInfo.body += data;
  });
  request.on("end", function() {
    response.statusCode = responseInfo.status;
    response.setHeader("Date", "harness_date_header");
    response.end(responseInfo.body);
  });
}

Paranoic Telemetry

External systems are unreliable. The only thing that's certain is their eventual failure. File systems lose data and become unwritable. Services return error codes, suddenly change their specifications, and refuse to terminate connections. Therefore:

Instrument the snot out of your infrastructure code. Assume that everything will break eventually. Test that every failure case either logs an error and sends an alert, or throws an exception that ultimately logs an error and sends an alert. Remember to test your code's ability to handle requests that hang, too.

All these failure cases are expensive to support and maintain. Whenever possible, use Testable Libraries rather than external services.

(An alternative to Paranoic Telemetry is Contract Tests, but they're not paranoid enough to catch changes that happen between test runs.)
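
As one example of the kind of test this calls for, here's a sketch that reuses the Spy Server's forceErrorDuringRequest() method from the example above to check that a connection failure surfaces as an error rather than being swallowed. The getUserInfoAsync() call follows the earlier LoginClient examples.

// Example Paranoic Telemetry test (a sketch reusing the Spy Server above)

it("surfaces an error when the connection dies mid-request", async function() {
  const client = createNetworkedClient();
  testServer.forceErrorDuringRequest();   // Spy Server destroys the socket

  let error;
  try {
    await client.getUserInfoAsync("any_user_email");
  }
  catch (err) {
    error = err;
  }

  // The failure must be visible to callers, which log an error and send an alert
  assert.notEqual(error, undefined, "expected request failure to be surfaced");
});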

Nullable Infrastructure

Focused Integration Tests are slow and difficult to set up. Although they're useful for ensuring that infrastructure code works in practice, they're overkill for code that depends on that infrastructure code. Therefore:

Program each infrastructure class with a factory method, such as "createNull()," that disables communication with the external system. Instances should behave normally in every other respect. This is similar to how Null Objects work. For example, calling LoginClient.createNull().getUserInfo("...") would return a default response without actually talking to the third-party login service.

The createNull() factory is production code and should be test-driven accordingly. Ensure that it doesn't have any mandatory parameters. (Nullable Infrastructure is the Infrastructure layer equivalent of Parameterless Instantiation.)
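
From a test's point of view, using a null instance might look like the sketch below. The default response shown is invented for this example; getUserInfoAsync() follows the earlier LoginClient examples.

// Example of using Nullable Infrastructure (a sketch; the default response is invented)

it("behaves normally, minus the network", async function() {
  const client = LoginClient.createNull();   // no parameters required

  // Parses and returns results just like the real thing--it only skips the
  // actual call to the third-party login service
  const userInfo = await client.getUserInfoAsync("any_user_email");
  assert.deepEqual(userInfo, { userId: "null_user_id", emailVerified: false });
});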

To implement Nullable Infrastructure cleanly, use an Embedded Stub. To test code that has infrastructure dependencies, use Configurable Responses, Send State, Send Events, and Behavior Simulation.

Embedded Stub

In order for Nullable Infrastructure to be useful to tests, null instances need to disable the external system while running everything else normally. The obvious approach is to use a flag and a bunch of "if" statements, but that's a recipe for spaghetti. Therefore:

Stub out the third-party library that performs external communication rather than changing your infrastructure code. In the stub, implement the bare minimum needed to make your infrastructure code run. Ensure you don't overbuild the stub by test-driving it through your infrastructure code's public interface. Put the stub in the same file as your infrastructure code so it's easy to remember and update when your infrastructure code changes.

// Example Node.js wrapper for Socket.IO, a WebSockets library.
// Note how minimalistic the stub code is.

// Import the real Socket.IO library and Node's EventEmitter
const EventEmitter = require("events");
const io = require("socket.io");

// Infrastructure Wrapper
class RealTimeServer extends EventEmitter {
  
  // Instantiate normal wrapper
  static create() {
    return new RealTimeServer(io);
  }

  // Instantiate Null wrapper
  static createNull() {
    return new RealTimeServer(nullIo);
  }  

  // Shared initialization code
  constructor(io) {
    super();
    this._io = io;
    //...
  }

  // Normal infrastructure code goes here.
  // It's unaware of which version of Socket.IO is used.
}

// Null Socket.IO implementation is programmed here
function nullIo() {
  return new NullIoServer();
}

class NullIoServer {
  on() {}
  emit() {}
  close(done) { return done(); }
}

class NullSocket {
  constructor(id) { this.id = id; }
  get isNull() { return true; }
  emit() {}
  get broadcast() { return { emit() {} }; }
}

Fake It Once You Make It

Some high-level infrastructure classes depend on low-level infrastructure classes to talk to the outside world. For example, a LoginClient class might use a RestClient class to perform its network calls. The high-level code is typically more concerned with parsing and processing responses than the low-level communication details. However, there will still be some communication details that need to be tested. Therefore:

Use a mix of Focused Integration Tests and Nullable Infrastructure in your high-level infrastructure classes. For tests that check if external communication is done properly, use a Focused Integration Test (and possibly a Spy Server). For parsing and processing tests, use simpler and faster Nullable Infrastructure dependencies.

This Node.js JavaScript example demonstrates two tests of a LoginClient. The LoginClient depends on a RestClient. Note how the network request test uses a Spy Server and the error handling test uses a Null RestClient.

// Example Node.js tests for high-level LoginClient
// that depends on low-level RestClient.

describe("authentication", function() {
  // Network communication uses a Focused Integration Test and a Spy Server
  it("performs network request", async function() {
    // Instantiate the unit under test
    const client = createNetworkedClient({
      clientId: "my_auth0_id",
      clientSecret: "my_auth0_secret",
      managementApiToken: "my_management_api_token",
      connection: "my_auth0_connection",
    });
    
    // Set up Spy Server response
    testServer.setResponse({
      status: 200,
      body: JSON.stringify({
        id_token: createIdToken({
          email: "irrelevant_email_address",
          email_verified: false
        }),
      }),
    });

    // Call the production code
    await client.validateLoginAsync("login_code", "my_callback_url");
    
    // Assert that the correct HTTP request was made
    assert.deepEqual(testServer.getRequests(), [{
      method: "POST",
      url: "/oauth/token",
      body: JSON.stringify({
        client_id: "my_auth0_id",
        client_secret: "my_auth0_secret",
        code: "login_code",
        redirect_uri: "my_callback_url",
        grant_type: "authorization_code"
      }),
      headers: {
        host: testServer.host(),
        authorization: "Bearer my_management_api_token",
        "content-type": "application/json; charset=utf-8",
        "content-length": "148",
      },
    }]);
  });

  // Processing test uses Nullable Infrastructure RestClient dependency
  it("fails with error when HTTP status isn't 'okay'", async function() {
    // Instantiate unit under test with dependency configured to provide desired response
    const response = { status: 500, body: "auth0_response" };
    const client = createNulledClient(response);

    // Assert that the correct error was generated    
    await assert.exceptionAsync(
      () => client.validateLoginAsync("irrelevant_code", "irrelevant_callback_url"),  // call production code
      expectedError(response, "Unexpected status code from Auth0.")  // expected error
    );
  });
});

// Factory for Focused Integration Tests
function createNetworkedClient({
  hostname = testServer.hostname(),
  port = testServer.port(),
  clientId = "irrelevant_id",
  clientSecret = "irrelevant_secret",
  managementApiToken = "irrelevant_token",
  connection = "irrelevant_connection",
} = {}) {
  if (port === null) port = undefined;
  return LoginClient.create({
    hostname,
    port,
    certificate: testServer.certificate(),
    clientId,
    clientSecret,
    managementApiToken,
    connection
  });
}

// Factory for Nullable Infrastructure tests
function createNulledClient(responses) {
  return LoginClient.create({
    restClient: HttpsRestClient.createNull(responses),
    hostname: "irrelevant_hostname",
    clientId: "irrelevant_id",
    clientSecret: "irrelevant_secret",
    managementApiToken: "irrelevant_token",
    connection: "irrelevant_connection",
  });
}

Configurable Responses

Application and high-level infrastructure tests need a way of configuring the data returned by their infrastructure dependencies. Therefore:

Allow infrastructure methods' responses to be configured with an optional "responses" parameter to the Nullable Infrastructure's createNull() factory. Pass it through to the class's Embedded Stub and test-drive the stub accordingly.

When your infrastructure class has multiple methods that can return data, give each one its own createNull() parameter. Use named and optional parameters so they can be added and removed without breaking existing tests.

If a method needs to provide multiple different responses, pass them as an array. However, this may be a sign that your Infrastructure layer is too complicated.

// Example Node.js tests for Application layer code
// that reads data using LoginClient dependency

it("logs successful login", async function() {
  // Configure login client dependency
  const loginClient = LoginClient.createNull({
    validateLogin: {    // configure the validateLogin response
      email: "my_authenticated_email",
      emailVerified: true,
    },
  });
  const logCapture = LogService.createNull();
  
  // Run production code
  await performLogin({ loginClient, logCapture });  // Signature Shielding
  
  // Check results
  assert.deepEqual(logCapture.logs, [ "Login: my_authenticated_email" ]);
});

To test code that uses infrastructure to send data, use Send State or Send Events. To test code that responds to infrastructure events, use Behavior Simulation.

Send State

Application and high-level infrastructure code use their infrastructure dependencies to send data to external systems. They need a way of checking that the data was sent. Therefore:

For infrastructure methods that send data, and provide no way to observe that the data was sent, store the last sent value in a variable. Make that data available via a method call.

// Example Send State implementation in JavaScript

class LoginClient {
  constructor() {
    //...
    this._lastSentVerificationEmail = null;
  }
  sendVerificationEmail(emailAddress) {
    //...
    this._lastSentVerificationEmail = emailAddress;
  }
  getLastSentVerificationEmail() {
    return this._lastSentVerificationEmail;
  }
  //...
}
// Example Node.js tests for Application layer code
// that sends data using LoginClient dependency

it("sends verification email", async function() {
  const loginClient = LoginClient.createNull();
  const emailPage = createPage({ loginClient });
  
  await emailPage.simulatePostAsync();
  assert.deepEqual(loginClient.getLastSentVerificationEmail(), "my_email");
});

If you need more than one send result, or you can't store the sent data, use Send Events instead. To test code that uses infrastructure to get data, use Configurable Responses. To test code that responds to infrastructure events, use Behavior Simulation.

Send Events

When you test code that uses infrastructure dependencies to send large blobs of data, or sends data multiple times in a row, Send State will consume too much memory. Therefore:

Rather than storing the sent data in a variable, use the Observer pattern to emit an event when your infrastructure code sends data. Include the data as part of the event payload. When tests need to make assertions about the data that was sent, they can listen for the events.

Send Events require complicated test setup. To make your tests easier to read, create a helper function in your tests that listens for send events and stores their data in an array. Doing this in production could cause a memory leak, but it's not a problem in your tests because the memory will be freed when the test ends.

This JavaScript example involves Application layer code for a real-time web application. When a browser connects, the server should send it all the messages the server had previously received. This test uses Send Events to check that the server sends those messages when a new browser connects.

// Example Node.js tests for Application layer code that sends
// multiple pieces of data using RealTimeServer dependency

it("replays all previous messages when client connects", function() {
  const network = createRealTimeServer();  // the infrastructure dependency
  const app = createApp({ network });  // the application code under test
  
  // Set up the test preconditions
  const message1 = new DrawMessage(1, 10, 100, 1000);
  const message2 = new DrawMessage(2, 20, 200, 2000);
  const message3 = new DrawMessage(3, 30, 300, 3000);
  network.connectNullBrowser(IRRELEVANT_ID);  // Behavior Simulation
  network.simulateBrowserMessage(IRRELEVANT_ID, message1);  // more Behavior Simulation
  network.simulateBrowserMessage(IRRELEVANT_ID, message2);
  network.simulateBrowserMessage(IRRELEVANT_ID, message3);

  // Listen for Send Events
  const sentMessages = trackSentMessages(network);
  
  // Run production code
  network.connectNullBrowser("connecting client");

  // Check that the correct messages were sent
  assert.deepEqual(sentMessages, [ message1, message2, message3 ]);
});

// Helper function for listening to Send Events
function trackSentMessages(network) {
  const sentMessages = [];
  network.on(RealTimeServer.EVENT.SEND_MESSAGE, (message) => {
    sentMessages.push(message);
  });
  return sentMessages;
}

To test code that uses infrastructure to get data, use Configurable Responses. To test code that responds to infrastructure events, use Behavior Simulation.

Behavior Simulation

Some external systems will push data to you rather than waiting for you to ask for it. Your application and high-level infrastructure code need a way to test what happens when their infrastructure dependencies generate those events. Therefore:

Add methods to your infrastructure code that simulate receiving an event from an external system. Share as much code as possible with the code that handles real external events, while remaining convenient for tests to use.

// Example Node.js Behavior Simulation implementation

const EventEmitter = require("events");

class RealTimeNetwork extends EventEmitter {
  // Real Socket.IO event code
  _listenForBrowserMessages(socket) {
    socket.on("message", (payload) => {
      const message = Message.fromPayload(payload);
      this._handleBrowserMessage(socket.id, message);
    });
  }
  
  // Simulated Socket.IO event code
  simulateBrowserMessage(clientId, message) {
    this._handleBrowserMessage(clientId, message);
  }
  
  // Shared message-processing logic
  _handleBrowserMessage(clientId, message) {
    this.emit(RealTimeNetwork.EVENT.RECEIVE_MESSAGE, { clientId, message });
  }
  
  // Another example from the same class...
  
  // Real Socket.IO event code
  _listenForBrowserConnection(ioServer) {
    ioServer.on("connection", (socket) => {
      this._connectBrowser(socket);
    });
  }
  
  // Simulated Socket.IO event code
  connectNullBrowser(browserId) {
    this._connectBrowser(new NullSocket(browserId));
  }
  
  // Shared connection logic
  _connectBrowser(socket) {
    const id = socket.id;
    this._socketIoConnections[id] = socket;
    this.emit(RealTimeNetwork.EVENT.BROWSER_CONNECT, id);
  }
  
  //...
}
// Example Node.js tests for Application layer code
// that responds to events from RealTimeNetwork dependency

it("broadcasts messages from one browser to all others", function() {
  // Setup
  const network = createRealTimeNetwork();  // the infrastructure dependency
  const app = createApp({ network });  // the application code under test 
  const browserId = "browser id";
  const message = new PointerMessage(100, 200);
  network.connectNullBrowser(browserId);

  // Trigger event that runs code under test
  network.simulateBrowserMessage(browserId, message);
  
  // Check that code under test broadcasted the message
  assert.deepEqual(network.getLastSentMessage(), message);
});

To test code that uses infrastructure to get data, use Configurable Responses. To test code that uses infrastructure to send data, use Send State or Send Events.

Conclusion

These patterns are an effective way of writing code that can be tested without test doubles, DI frameworks, or end-to-end tests.

Agile Fluency Model Updated

06 Mar, 2018

Six years ago, Diana Larsen and I created the Agile Fluency™ Model, a way of describing how agile teams tend to grow over time.

The Agile Fluency Model, showing a path starting with 'Pre-Agile', followed by a team culture shift, then the 'Focusing' zone. The path continues with a team skills shift that leads to the 'Delivering' zone. Next, an organizational structure shift leads to the 'Optimizing' zone. Finally, an organizational culture shift leads to the 'Strengthening' zone. After that, the path fades out as it continues on to zones yet undiscovered.

In the last six years, the model has gone on to be more influential than Diana and I ever expected. People and companies are using it around the world, often without even telling us. In that time, we've also learned new things about the model and what fluent agility looks like.

Today I'm happy to announce that we've released a new, updated article about the Agile Fluency Model. It's a substantial revision with much more detail about the benefits, proficiencies, and investments needed for different kinds of agile development.

We've also launched the Agile Fluency Diagnostic, a way to help teams develop new capabilities. It's a facilitated self-assessment conducted by experienced agile coaches. We have a list of licensed facilitators and you can also become licensed to conduct the Diagnostic yourself.

Many thanks to Martin Fowler, who published our original article and encouraged us to release this updated version. Check it out!

A Nifty Workshop Technique

05 Apr, 2017

It's hard to be completely original. But I have a little trick for workshops that I've never seen others do, and participants love it.

It's pretty simple: instead of passing out slide booklets, provide nice notebooks and pass out stickers. Specifically, something like Moleskine Cahiers and 3-1/3" x 4" printable labels.

Closeup of a workshop participant writing on a notebook page, with a sticker on the other page

I love passing out notebooks because they give participants the opportunity to actively listen by taking notes. (And, in my experience, most do.) Providing notebooks at the start of a workshop reinforces the message that participants need to take responsibility for their own learning. And, notebooks are just physically nicer and more cozy than slide packets... even the good ones.

The main problem with notebooks is that they force participants to copy down material. By printing important concepts on stickers, participants can literally cut and paste a reference directly into their notes. It's the best of both worlds.

There is a downside to this technique: rather than just printing out your slides, your stickers have to be custom-designed references. It's more work, but I find that it also results in better materials. Worth it.

People who've been to my workshops keep asking me if they can steal the technique. I asked them to wait until I documented my one original workshop idea. Now I have. If you use this idea, I'd appreciate credit. Other than that, share and enjoy. :-)

Picture of a table at the Agile Fluency Game workshop showing participants writing in their notebooks

Final Details for Agile Fluency Coaching Workshop

21 Mar, 2017

Our Agile Fluency™ Game coaching workshop is coming up fast! Signups close on March 28th. Don't wait!

We've been hard at work finalizing everything for the workshop. We hired Eric Wahlquist to do the graphic design and he did a great job.

Diana Larsen and I have also finalized the agenda for the workshop. It's so much more than just the game. The workshop is really a series of mini-workshops that you can use to coach your teams. Check 'em out:

  1. The Agile Fluency Game: Discover interrelationships between practices and explore the tradeoffs between learning and delivery
  2. Your Path through the Agile Fluency Model: Understand fluency zone tradeoffs and choose your teams' targets
  3. Zone Zoom: Understand how practices enable different kinds of fluency
  4. Trading Cards: Explore tradeoffs between practices
  5. Up for Adoption: See how practices depend on each other and which ones your teams could adopt
  6. Fluency Timeline: Understand the effort and time required for various practices
  7. Perfect Your Agile Adoption: Decide which practices are best for your teams and how to adopt them

These are all hands-on, experiential workshops that you'll learn how to conduct with your own teams. I think they're fantastic. You can sign up here.

The Agile Fluency Game: Now Available!

01 Mar, 2017

Five years ago, Arlo Belshee and I created a game about agile adoption. The ideas in that game influenced the Agile Fluency™ Model, Arlo's Agile Engineering Fluency map, and the Agile Fluency Diagnostic. We always intended to publish the game more widely, but the time and money required to do a professional publishing job was just too much.

Until now.

I am very proud to announce that, in collaboration with the Agile Fluency Project, the game I've spent the last five years play-testing and revising is finally available! I'm conducting a special workshop with Diana Larsen that's packed full of useful exercises to improve your Agile coaching and training. Every participant will get a professionally-produced box set of the game to take home.

Every time we've run the Agile Fluency Game, players have asked to get their own copy. Now it's finally available.

Sign up and learn more here.

Agile and Predictability

29 Sep, 2014

Over on the AgilePDX mailing list, there's an interesting conversation on making predictions with Agile. It started off with Michael Kelly asking if Agile can help with predictability. Here's my response:

It's entirely possible to make predictions with Agile. They're just as good as predictions made with other methods, and with XP practices, they can be much better. Agile leaders talk about embracing change because that has more potential value than making predictions.

Software is inherently unpredictable. So is the weather. Forecasts (predictions) are possible in both situations, given sufficient rigor. How your team approaches predictions depends on what level of fluency they have.

One-star teams adapt their plans and work in terms of business results. However, they don't have rigorous engineering practices, which means their predictions have wide error bars, on par with typical non-Agile teams (for 90% reliability, need 4x estimate*). They believe predictions are impossible in Agile.

Two-star teams use rigorous engineering practices such as test-driven development, continuous integration, and the other good stuff in XP. They can make predictions with reasonable precision (for 90% reliability, need 1.8x estimate*). They can and do provide reliable predictions.

Three- and four-star teams conduct experiments and change direction depending on market opportunities. They can make predictions just as well as two-star teams can, but estimating and predicting has a cost, and those predictions often have no real value in the market. They often choose not to incur the waste of making predictions.

So if a company were to talk to me about improving predictability, I would talk to them about what sort of fluency they wanted to achieve, why, and the investments they need to make to get there. For some organizations, *3 fluency isn't desired. It's too big of a cultural shift. In those cases, a *2 team is a great fit, and can provide the predictability the organization wants.

I describe the "how to" of making predictions with Agile in "Use Risk Management to Make Solid Commitments".

*The error-bar numbers are approximate and depend on the team. See the "Use Risk Management" essay for an explanation of where they come from.

How Does TDD Affect Design?

17 May, 2014

(This essay was originally posted to the Let's Code JavaScript blog.)

I've heard people say TDD automatically creates good designs. More recently, I've heard David Hansson say it creates design damage. Who's right?

Neither. TDD doesn't create design. You do.

TDD Can Lead to Better Design

There are a few ways I've seen TDD lead to better design:

  1. A good test suite allows you to refactor, which allows you to improve your design over time.

  2. The TDD cycle is very detail-oriented and requires you to make some design decisions when writing tests, rather than when writing production code. I find this helps me think about design issues more deeply.

  3. TDD makes some design problems more obvious.

None of these force you to create better design, but if you're working hard to create good design, I find that these things make it easier to get there.

TDD Can Lead to Worse Design

There are a few ways I've seen TDD lead to worse design:

  1. TDD works best with fast tests and rapid feedback. In search of speed, some people use mocks in a way that locks their production code in place. Ironically, this makes refactoring very difficult, which prevents designs from being improved.

  2. Also in search of speed, some people make very elaborate dependency-injection tools and structures, as well as unnecessary interfaces or classes, just so they can mock out dependencies for testing. This leads to overly complex, hard to understand code.

  3. TDD activates people's desire to get a "high score" by having a lot of tests. This can push people to write worthless or silly tests, or use multiple tests where a single test would do.

None of these are required by TDD, but they're still common. The first two are obvious solutions to the sorts of design problems TDD exposes.

They're also very poor solutions, and you can (and should) choose not to do these things. It is possible to create fast tests and rapid feedback without these mistakes, and you can see us take that approach in the screencast.

So Do Your Job

TDD doesn't create good design. You do. TDD can help expose design smells. You have to pay attention and fix them. TDD can push you toward facile solutions. You have to be careful not to make your design worse just to make testing better.

So pay attention. Think about design. And if TDD is pushing you in a direction that makes your code worse, stop. Take a walk. Talk to a colleague. And look for a better way.

The Lament of the Agile Practitioner

08 May, 2014

I got involved with Extreme Programming in 2000. Loved it. Best thing since sliced bread, yadda yadda. I was completely spoiled for other kinds of work.

So when that contract ended, I went looking for other opportunities to do XP. But guess what? In 2001, there weren't any. So I started teaching people how to do it. Bam! I'm a consultant.

Several lean years later (I don't mean Lean, I mean ramen), I'm figuring out this consulting thing. I've got a network, I've got a business entity, people actually call me, and oh, oh, and I make a real damn difference.

Then Agile starts getting really popular. Certification starts picking up. Scrum's the new hotness, XP's too "unrealistic." I start noticing some of my friends in the biz are dropping out, going back to start companies or lead teams or something real. But I stick with it. I'm thinking, "Sure, there's some bottom feeders creeping in, but Agile's still based on a core of people who really care about doing good work. Besides, if we all leave, what will keep Agile on track?"

It gets worse. Now I'm noticing that there are certain clients that simply won't be successful. I can tell in a phone screen. And it's not Scrum's fault, or certification, or anything. It's the clients. They want easy. I start getting picky, turning them down, refusing to do lucrative but ineffective short-term training.

Beck writes XP Explained, 2nd edition. People talk about Agile "crossing the chasm." I start working on the 2nd edition XP Pocket Guide with chromatic and it turns into The Art of Agile Development. We try to write it for the early majority--the pragmatics, not the innovators and early adopters that were originally attracted to Agile and are now moving on to other things. It's a big success, still is.

It gets worse. The slapdash implementations of Agile now outnumber the good ones by a huge margin. You can find two-day Scrum training everywhere. Everybody wants to get in on the certification money train. Why? Clients won't send people to anything else. The remaining idealists are either fleeing, founding new brands, or becoming Certified Scrum Trainers.

I write The Decline and Fall of Agile. Martin Fowler writes Flaccid Scrum. I write Stumbling through Mediocrity. At conferences, we early adopters console each other by saying, "The name 'Agile' will go away, but that's just because practices like TDD will just be 'the way you do software.'" I start looking very seriously for other opportunities.

That was six years ago.

...

Believe it or not, things haven't really gotten worse since then. Actually, they've gotten a bit better. See, 2-5 years is about how long a not-really-Agile Agile team can survive before things shudder to a complete halt. But not-quite-Agile was Actually. So. Much. Better. (I know! Who could believe it?) than what these terribly dysfunctional organizations were doing before that they're interested in making Agile work. So they're finally investing in learning how to do Agile well. Those shallow training sessions and certifications I decried? They opened the door.

And so here we are, 2014. People are complaining about the state of Agile, saying it's dying. I disagree. I see these "Agile is Dying" threads as a good thing. Because they mean that the word is getting out about Agile-in-name-only. Because every time this comes up, you have a horde of commenters saying "Yeah! Agile sucks!" But... BUT... there's also a few people who say, "No, you don't understand, I've seen Agile work, and it was glorious." That's amazing. Truly. I've come to believe that no movement survives contact with the masses. After 20 years, to still have people who get it? Who are benefiting? Whose lives are being changed?

That means we have a shot.

And as for me... I found that opportunity, so I get to be even more picky about where I consult. But I continue to fight the good fight. Diana and I produced the Agile Fluency™ model, a way of understanding and talking about the investments needed, and we're launching the Agile Fluency Project later this month. We've already released the model, permissive license, for everyone to use. Use it.

Because Agile has no definition, just a manifesto. It is what the community says it is. It always has been. Speak up.

Discuss this essay on the Let's Code JavaScript blog.

Object Playground: The Definitive Guide to Object-Oriented JavaScript

27 Aug, 2013

Let's Code: Test-Driven JavaScript, my screencast series on rigorous, professional JavaScript development, recently celebrated its one year anniversary. There's over 130 episodes online now, covering topics ranging from test-driven development (of course!), to cross-browser automation, to software design and abstraction. It's incredibly in-depth and I'm very proud of it.

To celebrate the one-year anniversary, we've released Object Playground, a free video and visualizer for understanding object-oriented programming. The visualizer is very cool: it runs actual JavaScript code and graphs the object relationships created by that code. There are several preset examples and you can also type in arbitrary code of your own.

Example visualization

Object Playground in action

Understanding how objects and inheritance work in JavaScript is a challenge even for experienced developers, so I've supplemented the tool with an in-depth video illustrating how it all fits together. The feedback has been overwhelmingly positive. Check it out.