Testing Without Mocks: A Pattern Language

27 Apr, 2018

For example code demonstrating these ideas, see my example on GitHub.

When programmers use test-driven development (TDD), the code they test interacts with other parts of the system that aren't being tested. To test those interactions, and to prevent the other code from interfering with their tests, programmers often use mock objects or other test doubles. However, this approach requires additional integration tests to ensure the system works as a whole, and it can make structural refactorings difficult.

This pattern language[1] describes a way of testing object-oriented code without using mocks. It avoids the downsides of mock-based testing, but it has tradeoffs of its own.

[1] The structure of this article was inspired by Ward Cunningham's CHECKS Pattern Language of Information Integrity, which is a model of clarity and usefulness.

Goals

  • No broad tests required. The test suite consists entirely of "narrow" tests that are focused on specific concepts. Although broad integration tests can be added as a safety net, their failure indicates a gap in the main test suite.

  • Easy refactoring. Object interactions are considered implementation to be encapsulated, not behavior to be tested. Although the consequences of object interactions are tested, the specific method calls aren't. This allows structural refactorings to be made without breaking tests.

  • Readable tests. Tests follow a straightforward "arrange, act, assert" structure. They describe the externally-visible behavior of the unit under test, not its implementation. They can act as documentation for the unit under test.

  • No magic. Tools that automatically remove busywork, such as dependency-injection frameworks and auto-mocking frameworks, are not required.

  • Fast and deterministic. The test suite only executes "slow" code, such as network calls or file system requests, when that behavior is explicitly part of the unit under test. Such tests are organized so they produce the same results on every test run.

Tradeoffs

  • Test-specific production code. Some code needed for the tests is written as tested production code, particularly for infrastructure classes. It requires extra time to write and adds noise to class APIs.

  • Hand-written stub code. Some third-party infrastructure code has to be mimicked with hand-written stub code. It can't be auto-generated and takes extra time to write.

  • Sociable tests. Although tests are written to focus on specific concepts, the units under test execute code in their dependencies. (Jay Fields coined the term "sociable tests" for this behavior.) This can result in multiple tests failing when a bug is introduced.

  • Not a silver bullet. Code must be written with careful thought to design. Design mistakes are inevitable, so this approach requires continuous attention to design and refactoring.

Architectural Patterns

Testing without mocks requires careful attention to the dependencies in your codebase. These patterns help establish the ground rules.

Overlapping Sociable Tests

Our goal is to create a test suite consisting entirely of "narrow" tests, with no need for "broad" end-to-end tests. But most narrow tests don't test that the system is wired together correctly. Therefore:

When testing the interactions between an object and its dependencies, inject real dependency instances (not test doubles) into the unit under test. Don't test the dependencies' behavior itself, but do test that the unit under test uses the dependencies correctly.

This will create a strong linked chain of tests. Each test will overlap with dependencies' tests and dependents' tests. The test suite as a whole should cover your entire application in a fine overlapping mesh, giving you the coverage of broad tests without the need to write them.
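
For example, here's a minimal sketch of a sociable test that injects a real dependency rather than a mock. The Greeter and NameFormatter classes are hypothetical, not from the example project:

// Hypothetical JavaScript sketch: a sociable test with a real dependency

// Production code
class NameFormatter {
  format(name) {
    return name.trim();
  }
}

class Greeter {
  constructor(formatter = new NameFormatter()) {  // real dependency by default
    this._formatter = formatter;
  }
  greet(name) {
    return "Hello, " + this._formatter.format(name);  // exercises the real NameFormatter
  }
}

// Test code: no test double; this test overlaps with NameFormatter's own tests
it("greets using the real formatter", function() {
  const greeter = new Greeter(new NameFormatter());
  assert.equal(greeter.greet("  Sarah  "), "Hello, Sarah");
});

If NameFormatter's behavior changes, this test breaks too, and that overlap is exactly what replaces broad tests.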

To avoid constructing the entire dependency chain, use Zero-Impact Instantiation and Parameterless Instantiation. To isolate your unit under test, use Collaborator-Based Isolation and Nullable Infrastructure. To test code that depends on infrastructure, use Configurable Responses, Send State, Send Events, and Behavior Simulation.

A-Frame Architecture

Because we aren't using mocks to isolate our dependencies, it's easiest to test code that doesn't depend on infrastructure (external systems such as databases, file systems, and services). However, a typical layered architecture puts infrastructure at the bottom of the dependency chain:

   Application/UI
         |
         V
       Logic
         |
         V
   Infrastructure
Therefore:

Structure your application so that infrastructure and logic are peers under the application layer, with no dependencies between Infrastructure and Logic. Coordinate between them at the application layer with a Logic Sandwich or Traffic Cop.

       Application/UI
       /            \            
      V              V
   Infrastructure   Logic

Build the bottom two layers using Infrastructure Patterns and Logic Patterns.

Although A-Frame Architecture is a nice way to simplify application dependencies, it's optional. You can test code that mixes infrastructure and logic using Infrastructure Wrappers and Nullable Infrastructure.

To build a new application using A-Frame Architecture, Grow Evolutionary Seeds. To convert an existing layered architecture, Climb the Ladder.

Logic Sandwich

When using an A-Frame Architecture, the infrastructure and logic layers can't communicate with each other. But the logic layer needs to read and write data controlled by the infrastructure layer. Therefore:

Implement the top-level code as a "logic sandwich," where data is read by the infrastructure layer, then processed by the logic layer, then written by the infrastructure layer. Repeat as needed. Each piece can then be tested independently.

let input = infrastructure.readData();
let output = logic.processInput(input);
infrastructure.writeData(output);

This simple algorithm can handle sophisticated needs if put into a loop with a stateful logic layer.
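
As a rough sketch, the looping variant might look like this. The isDone() method is an assumption; the real shape depends on your logic layer:

// Hypothetical sketch: Logic Sandwich in a loop with a stateful logic layer
while (!logic.isDone()) {   // isDone() is an assumed method on the stateful logic layer
  const input = infrastructure.readData();     // infrastructure layer
  const output = logic.processInput(input);    // stateful logic layer
  infrastructure.writeData(output);            // infrastructure layer
}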

For applications with complicated infrastructure, use a Traffic Cop instead.

Traffic Cop

The Logic Sandwich boils infrastructure down into simple infrastructure.readData() and infrastructure.writeData() abstractions. Applications with complex infrastructure may not be a good fit for this approach. Therefore:

Instead of asking the infrastructure for its data, use the Observer pattern to send events from the infrastructure layer to the application layer. For each event, implement a Logic Sandwich. In some cases, your application code might need a bit of logic of its own.

infrastructure.on("login", (token) => {  // infrastructure layer
  let loginInfo = LoginInfo.createFromToken(token);  // logic layer
  if (loginInfo.isValid) {  // application logic
    let userData = subscriberService.lookUpUser(loginInfo.userId);  // infrastructure layer
    let user = new User(userData);  // logic layer
    infrastructure.createSession(user.sessionData);  // infrastructure layer
  }
});
infrastructure.on("event2", (data) => {
  let output = logic.processEvent2(data);
  infrastructure.writeData2(output);
});
//...etc...

Be careful not to let your Traffic Cop turn into a God Class. If it gets complicated, better infrastructure abstractions might help. Sometimes taking a less "pure" approach and moving some Logic code into the Infrastructure layer can simplify the overall design. In other cases, splitting the application layer into multiple classes, each with its own Logic Sandwich or simple Traffic Cop, can help.

Grow Evolutionary Seeds

One popular design technique is outside-in design, in which you program an application by starting with its externally-visible behavior, then working your way inward to the details.

This is typically done by writing a broad integration test to describe the externally-visible behavior, then using narrow unit tests to define the details. But we want to avoid broad tests. Therefore:

Use evolutionary design to grow your application from a single file. Choose a simple end-to-end behavior as a starting point and test-drive a single class to implement a trivial version of that behavior. Hardcode one value that would normally come from the Infrastructure layer, don't implement any significant logic, and return the result to your tests rather than displaying it in a UI. This class forms the seed of your Application layer.

// JavaScript example: simplest possible Application seed

// Test code
it("renders user name", function() {
  const app = new MyApplication();
  assert.equal("Hello, Sarah", app.render());
});

// Production code
class MyApplication {
  render() {
    return "Hello, Sarah";
  }
}

Next, implement a barebones Infrastructure Wrapper for the one infrastructure value you hardcoded. Code just enough infrastructure to provide one real result to your application layer class. Don't worry about making it robust or reliable yet. This Infrastructure Wrapper class forms the seed of your Infrastructure layer.

Before integrating your new Infrastructure class into your Application layer class, implement Nullable Infrastructure. Then modify your application layer class to use the infrastructure wrapper, injecting the Null version in your tests.

// JavaScript example: Application + read from infrastructure

// Test code
it("renders user name", function() {
  const usernameService = UsernameService.createNull({ username: "my_username" });
  const app = new MyApplication({ usernameService });
  assert.equal("Hello, my_username", app.render());
});

// Production code
class MyApplication {
  // usernameService parameter is optional
  constructor({ usernameService = UsernameService.create() } = {}) {
    this._usernameService = usernameService;
  }
    
  async render() {
    const username = await this._usernameService.getUsername();
    return `Hello, ${username}`;
  }
}

Next, do the same for your UI. Choose one simple output mechanism that your application will use (such as rendering to the console, the DOM, or responding to a network request) and implement a barebones Infrastructure Wrapper for it. Add support for Nullable Infrastructure and modify your application layer tests and code to use it.

// JavaScript example: Application + read/write to infrastructure

// Test code
it("renders user name", function() {
  const usernameService = UsernameService.createNull({ username: "my_username" });
  const uiService = UiService.createNull();
  const app = new MyApplication({ usernameService, uiService });
  
  app.render();
  assert.equal("Hello, my_username", uiService.getLastRender());
});

// Production code
class MyApplication {
  constructor({ 
    usernameService = UsernameService.create(),
    uiService = UiService.create(),
  } = {}) {
    this._usernameService = usernameService;
    this._uiService = uiService;
  }
    
  async render() {
    const username = await this._usernameService.getUsername();
    await this._uiService.render(`Hello, ${username}`);
  }
}

Now your application tests serve the same purpose as broad end-to-end tests: they document and test the externally-visible behavior of the application. Because they inject Null application dependencies, they're narrow tests, not broad tests, and they don't communicate with external systems. That makes them fast and reliable. They're also Overlapping Sociable Tests, so they provide the same safety net that broad tests do.

At this point, you have the beginnings of a walking skeleton: an application that works end-to-end, but is far from complete. You can evolve that skeleton to support more features. Choose some aspect of your code that's obviously incomplete and test-drive a slightly better solution. Repeat forever.

// JavaScript example: Application + read/write to infrastructure
// + respond to UI events

// Test code
it("renders user name", function() {
  const usernameService = UsernameService.createNull({ username: "my_username" });
  const uiService = UiService.createNull();
  const app = new MyApplication({ usernameService, uiService });
  app.start();
  
  uiService.simulateRequest("greeting");
  assert.equal("Hello, my_username", uiService.getLastRender());
});

// Production code
class MyApplication {
  constructor({ 
    usernameService = UsernameService.create(),
    uiService = UiService.create(),
  } = {}) {
    this._usernameService = usernameService;
    this._uiService = uiService;
  }
    
  async start() {
    this._uiService.on("greeting", async () => {
      const username = await this._usernameService.getUsername();
      await this._uiService.render(`Hello, ${username}`);
    });
  }
}

At some point, probably fairly early, your Application layer class will start feeling messy. When it does, look for a concept that can be factored into its own class. This forms the seed of your Logic layer. As your application continues to grow, continue refactoring so that class collaborations are easy to understand and responsibilities are clearly defined.

To convert existing code to an A-Frame Architecture, Climb the Ladder instead.

Climb the Ladder

Most pre-existing code you encounter will be designed with a layered architecture, where Logic code has Infrastructure dependencies. Some of this code will be difficult to test or resist refactoring. Therefore:

Refactor problem code into a miniature A-Frame Architecture. Start at the lowest levels of your Logic layer and choose a single method that depends on one clearly-defined piece of infrastructure. If the Infrastructure code is intertwingled with the Logic code, disintertwingle it by factoring out an Infrastructure Wrapper.

When the Infrastructure code has been separated from the rest of the code, the method will act similarly to an Application layer class: it will have a mix of logic and calls to infrastructure. Make this code easier to refactor by rewriting its tests to use Nullable Infrastructure dependencies instead of mocks. Then factor all the logic code into methods with no infrastructure dependencies.

At this point, your original method will have nothing left but a small Logic Sandwich: a call or two to the infrastructure class and a call to the new logic method. Now eliminate the original method by inlining it to its callers. This will cause the logic sandwich to climb one step up your dependency chain.
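
To illustrate, here's a hypothetical sketch (none of these names come from a real codebase) of what's left after one rung of the ladder: a small Logic Sandwich that callers can now inline:

// Hypothetical sketch: the remnant method after one Climb the Ladder step
class InvoiceReminder {
  constructor({ invoiceStore, emailClient }) {
    this._invoiceStore = invoiceStore;   // Infrastructure Wrapper
    this._emailClient = emailClient;     // Infrastructure Wrapper
  }
  
  // All that remains is a small Logic Sandwich, ready to be inlined into callers.
  async sendOverdueNotices() {
    const invoices = await this._invoiceStore.readOverdue();   // infrastructure
    const notices = this._createNotices(invoices);             // pure logic, tested separately
    await this._emailClient.sendAll(notices);                  // infrastructure
  }
  
  _createNotices(invoices) {
    return invoices.map((invoice) => `Invoice ${invoice.id} is overdue`);
  }
}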

Repeat until the class no longer has any dependencies on infrastructure. At that point, review its design and refactor as desired to better fit the Logic Patterns and your application's needs. Continue with the next class.

Climbing the Ladder takes a lot of time and effort, so do it gradually, as part of your normal work, rather than all at once. Focus your efforts on code where testing without mocks will have noticeable benefit. Don't waste time refactoring code that's already easy to maintain, regardless of whether it uses mocks.

When building a new system from scratch, Grow Evolutionary Seeds instead.

Zero-Impact Instantiation

Overlapping Sociable Tests instantiate their dependencies, which in turn instantiate their dependencies, and so forth. If instantiating this web of dependencies takes too long or causes side effects, the tests could be slow, difficult to set up, or fail unpredictably. Therefore:

Don't do significant work in constructors. Don't connect to external systems, start services, or perform long calculations. For code that needs to connect to an external system or start a service, provide a connect() or start() method. For code that needs to perform a long calculation, consider lazy initialization. (But even complex calculations aren't likely to be a problem, so profile before optimizing.)
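
A minimal sketch of the idea; the class and the openConnection() helper are assumptions, not real APIs:

// Hypothetical sketch: cheap construction, with a separate connect() method
class MessageQueueClient {
  constructor(host = "localhost") {
    // The constructor only stores configuration; it doesn't touch the network.
    this._host = host;
    this._connection = null;
  }
  
  async connect() {
    // The expensive work happens here, not in the constructor.
    this._connection = await openConnection(this._host);   // openConnection() is an assumed helper
  }
}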

Signature Shielding

As you refactor your application, method signatures will change. If your code is well-designed, this won't be a problem for production code, because most methods will only be used in a few places. But tests can have many duplicated method and constructor calls. When you change those methods or constructors, you'll have a lot of busywork to update the tests. Therefore:

If a file has a lot of tests that call a specific method, provide a proxy function for that method. Similarly, if it has a lot of tests that instantiate a class, provide a factory method. Program the proxies and factories so their parameters are all optional. That way you can add additional parameters in the future without breaking existing tests.

// JavaScript code with named, optional parameters

// Example test
it("uses hosted page for authentication", function() {
  const client = createClient({   // Use the factory function
    host: "my.host", 
    clientId: "my_client_id"
  });
  const url = getLoginUrl({   // Use the proxy function
    client, 
    callbackUrl: "my_callback_url"
  });
  assert.equal(url, "https://my.host/authorize?response_type=code&client_id=my_client_id&callback_url=my_callback_url");
});

// Example factory function
function createClient({
  host = "irrelevant_host",
  clientId = "irrelevant_id",
  clientSecret = "irrelevant_secret",
  connection = "irrelevant_connection"
} = {}) {
  return new LoginClient(host, clientId, clientSecret, connection);
}

// Example proxy function
function getLoginUrl({
  client,
  username = "irrelevant_username",
  callbackUrl = "irrelevant_url"
} = {}) {
  return client.getLoginUrl(username, callbackUrl);
}

Logic Patterns

When using A-Frame Architecture, the application's Logic layer has no infrastructure dependencies. It represents pure computation, so it's fast and deterministic. To qualify for the Logic layer, code can't talk to a database, communicate across a network, or touch the file system.[2] Neither can its tests or dependencies. Any code that breaks these rules belongs in the Application layer or Infrastructure layer instead. Code that modifies global state can be put in the Logic layer, but it should be avoided, because then you can't parallelize your tests.

Pure computation is easy to test. The following patterns make it even easier.

[2] This list was inspired by Michael Feathers' unit testing rules.

Easily-Visible Behavior

Logic layer computation can only be tested if the results of the computation are visible to tests. Therefore:

Prefer pure functions where possible. Pure functions' return values are determined only by their input parameters.

When pure functions aren't possible, prefer immutable objects. The state of immutable objects is determined when the object is constructed, and never changes afterwards.

For methods that change object state, provide a way for the change in state to be observed, either with a getter method or an event.

In all cases, avoid writing code that explicitly depends on (or changes) the state of dependencies more than one level deep. That makes test setup difficult, and it's a sign of poor design anyway. Instead, design dependencies so they completely encapsulate their next-level-down dependencies.
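
The following sketch (all names hypothetical) shows the three options in order of preference: a pure function, an immutable object, and mutable state exposed through a getter:

// Hypothetical JavaScript sketch of Easily-Visible Behavior

// 1. Pure function: the return value depends only on the parameters
function addSalesTax(amount, rate) {
  return amount * (1 + rate);
}

// 2. Immutable object: state is fixed at construction time
class Money {
  constructor(amount, currency) {
    this._amount = amount;
    this._currency = currency;
    Object.freeze(this);
  }
  plus(other) {
    return new Money(this._amount + other._amount, this._currency);  // returns a new object
  }
}

// 3. Mutable state exposed with a getter so tests can observe the change
class ShoppingCart {
  constructor() {
    this._items = [];
  }
  addItem(item) {
    this._items.push(item);
  }
  getItems() {
    return [ ...this._items ];   // defensive copy
  }
}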

Testable Libraries

Third-party code doesn't always have Easily-Visible Behavior. It also tends to introduce breaking API changes with new releases, or simply stop being maintained. Therefore:

Wrap third-party code in code that you control. Ensure your application's use of the third-party code is mediated through your wrapper. Write your wrapper's API to match the needs of your application, not the third-party code, and add methods as needed to provide Easily-Visible Behavior. (This will typically involve writing getter methods to expose deeply-buried state.) When the third-party code introduces a breaking change, or needs to be replaced, modify the wrapper so no other code is affected.

Frameworks and libraries with sprawling APIs are more difficult to wrap, so prefer libraries that have a narrowly-defined purpose and a simple API.
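
As a rough sketch, a wrapper around a hypothetical third-party date library might look like this. The library name and its diff() and dayOfWeek() calls are assumptions, not a real API:

// Hypothetical sketch: wrapping a third-party date library

const thirdPartyDates = require("some-date-library");   // assumed library name

class DateWrapper {
  // The API matches our application's needs, not the library's.
  static daysBetween(startDate, endDate) {
    return thirdPartyDates.diff(startDate, endDate, "days");   // assumed library call
  }
  
  // Getter added to provide Easily-Visible Behavior.
  static isWeekend(date) {
    const dayOfWeek = thirdPartyDates.dayOfWeek(date);   // assumed library call
    return dayOfWeek === "saturday" || dayOfWeek === "sunday";
  }
}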

If the third-party code interfaces with an external system, use an Infrastructure Wrapper instead.

Parameterless Instantiation

Multi-level dependency chains are difficult to set up in tests. Dependency injection (DI) frameworks work around the problem, but we're avoiding magic like DI frameworks. Therefore:

Ensure that all Logic classes can be constructed without providing any parameters (and without using a DI framework). In practice, this means that most objects instantiate their dependencies in their constructor by default, although they may also accept them as optional parameters.

For some classes, a parameterless constructor won't make any sense. For example, an immutable "Address" class would be constructed with its street, city, and so forth. For these sorts of classes, provide a test-only factory method. The factory method should provide overridable defaults for mandatory parameters.

The factory method is easiest to maintain if it's located in the production code next to the real constructors. It should be marked as test-specific and should be simple enough to not need tests of its own.

// Example JavaScript code using named, optional parameters

class Address {
  // Production constructor
  constructor(street, city, state, country, postalCode) {
    this._street = street;
    this._city = city;
    //...etc...
  }
  
  // Test-specific factory
  static createTestInstance({
    street = "Address test street",
    city = "Address test city",
    state = State.createTestInstance(),
    country = Country.createTestInstance(),
    postalCode = PostalCode.createTestInstance()    
  } = {}) {
    return new Address(street, city, state, country, postalCode);
  }
}

Collaborator-Based Isolation

Overlapping Sociable Tests ensure that any changes to the semantics of a unit's dependencies will cause that unit's tests to break, no matter how far down the dependency chain they may be. On the one hand, this is nice, because we'll learn when we accidentally break something. On the other hand, this could make feature changes terribly expensive. We don't want a change in the rendering of addresses to break hundreds of unrelated reports' tests. Therefore:

Call dependencies' methods to help define test expectations. For example, if you're testing an InventoryReport that includes an address in its header, don't hardcode "123 Main St." as your expectation for the report header test. Instead, call Address.renderAsOneLine() as part of defining your test expectation.

// JavaScript example

// Example test
it("includes the address in the header when reporting on one address", function() {
  // Instantiate the unit under test and its dependency
  const address = Address.createTestInstance();
  const report = createReport({ addresses: [ address ] });
  
  // Define the expected result using the dependency
  const expected = "Inventory Report for " + address.renderAsOneLine();
  
  // Run the production code and make the assertion
  assert.equal(report.renderHeader(), expected);
});

// Example production code
class InventoryReport {
  constructor(inventory, addresses) {
    this._inventory = inventory;
    this._addresses = addresses;
  }
  
  renderHeader() {
    let result = "Inventory Report";
    if (this._addresses.length === 1) {
      result += " for " + this._address[0].renderAsOneLine();
    }
    return result;
  }  
}

This provides the best of both worlds: Overlapping Sociable Tests ensure that your application is wired together correctly and Collaborator-Based Isolation allows you to change features without modifying a lot of tests.

Infrastructure Patterns

The Infrastructure layer contains code for communicating with the outside world. Although it may contain some logic, that logic should be focused on making infrastructure easier to work with. Everything else belongs in the Application and Logic layers.

Infrastructure code is unreliable and difficult to test because of its dependencies on external systems. The following patterns work around those problems.

Infrastructure Wrappers

In the Logic layer, you can design your code to avoid complex, global state. In the Infrastructure layer, your code deals with nothing else. Testing infrastructure code that depends on other infrastructure code is particularly difficult. Therefore:

Keep your infrastructure dependencies simple and straightforward. For each external system--service, database, file system, or even environment variables--create one wrapper class that's responsible for interfacing with that system. Design your wrappers to provide a crisp, clean view of the messy outside world, in whatever format is most useful to the Logic and Application layers.

Avoid creating complex webs of dependencies. In some cases, high-level Infrastructure classes may depend on generic, low-level classes. For example, a LoginClient might depend on RestClient. In other cases, high-level infrastructure classes might unify multiple low-level classes, such as a DataStore class that depends on a RelationalDb class and a NoSqlDb class. Other than these sorts of simple one-way dependency chains, design your Infrastructure classes to stand alone.
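
For example, here's a small sketch (names assumed) of an Infrastructure Wrapper that gives the rest of the application a crisp view of environment variables:

// Hypothetical sketch: Infrastructure Wrapper for environment variables

class EnvironmentConfig {
  static create() {
    return new EnvironmentConfig(process.env);
  }
  
  constructor(env) {
    this._env = env;
  }
  
  // Present a clean view of the messy outside world.
  databaseUrl() {
    const url = this._env.DATABASE_URL;
    if (url === undefined) throw new Error("DATABASE_URL environment variable not set");
    return url;
  }
}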

Test your Infrastructure Wrappers with Focused Integration Tests and Paranoic Telemetry. Enable them to be used in other tests by creating Nullable Infrastructure.

Focused Integration Tests

Ultimately, Infrastructure code talks over a network, interacts with a file system, or involves some other communication with an external system. Its correctness depends on communicating properly. Therefore:

Test your external communication for real. For file system code, check that it reads and writes real files. For databases and services, access a real database or service. Make sure that your test systems use the same configuration as your production environment. Otherwise your code will fail in production when it encounters subtle incompatibilities.
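
For example, a focused integration test for a hypothetical FileWrapper class might read and write a real temporary file and confirm the bytes actually hit the disk. The writeText() and readText() methods are assumptions:

// Hypothetical sketch: focused integration test for a file-system wrapper

const fs = require("fs");
const os = require("os");
const path = require("path");

it("reads and writes real files", function() {
  const filename = path.join(os.tmpdir(), "focused_integration_test.txt");
  const fileWrapper = FileWrapper.create();   // assumed Infrastructure Wrapper

  fileWrapper.writeText(filename, "hello");
  assert.equal(fileWrapper.readText(filename), "hello");
  assert.equal(fs.readFileSync(filename, "utf8"), "hello");   // confirm it really touched the disk

  fs.unlinkSync(filename);   // clean up the real file
});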

Run your focused integration tests against test systems that are reserved exclusively for one machine's use. It's best if they run locally on your development machine, and are started and stopped by your tests or build script. If you share test systems with other developers, you'll experience unpredictable test failures when multiple people run the tests at the same time.

You won't be able to get a local test system for every external system your application uses. When you can't, use a Spy Server instead.

Some high-level Infrastructure classes will use lower-level classes to do the real work, such as a LoginClient class that uses a RestClient class to make the network call. They can Fake It Once You Make It.

Spy Server

Some external systems are too unreliable, expensive, or difficult to use for Focused Integration Tests. It's one thing to run dozens of tests against your local file system every few minutes; quite another to do that to your credit card gateway. Therefore:

Create a test server that you can run locally. Program it to record requests and respond with pre-configured results. Make it very simple and generic. For example, all REST-based services should be tested by the same HTTPS Spy Server.

To test against the Spy Server, start by making a real call to your external system. Record the call and its results and paste them into your test code (or save them to a file). In your test, check that the expected request was made to the Spy Server and the response was processed correctly.

External systems can change out from under you. In the case of cloud-based services, it can happen with no warning. A Spy Server won't be able to detect those changes. To protect yourself, implement Paranoic Telemetry.

// Example Node.js LoginClient tests.

// Start, stop, and reset the Spy Server
const testServer = new HttpsTestServer();
before(async function() {
  await testServer.startAsync();
});
after(async function() {
  await testServer.stopAsync();
});
beforeEach(function() {
  testServer.reset();
});

// The test
it("gets user details", async function() {
  // Instantiate unit under test (uses Signature Shielding)
  const client = createNetworkedClient({
    managementApiToken: "my_management_api_token",
    connection: "my_auth0_connection",
  });
  
  // Set up Spy Server response
  testServer.setResponse({   
    status: 200,
    body: JSON.stringify([{
      user_id: "the_user_id",
      email_verified: false,
    }]),
  });

  // Call the production code
  const result = await client.getUserInfoAsync("a_user_email");
  
  // Assert that the correct HTTP request was made
  assert.deepEqual(testServer.getRequests(), [{
    method: "GET",
    url: "/api/v2/users?" +
      "fields=user_id%2Cemail_verified&" +
      "q=identities.connection%3A%22my_auth0_connection%22%20AND%20email%3A%22a_user_email%22&" +
      "search_engine=v2",
    body: "",
    headers: {
      host: testServer.host(),
      authorization: "Bearer my_management_api_token",
    },
  }], "request");
  
  // Assert that the response was processed properly
  assert.deepEqual(result, {
    userId: "the_user_id",
    emailVerified: false
  }, "result");
});

This is a complete example of a real-world Node.js HTTPS Spy Server. You can use this code in your own projects:

// Copyright 2018 Titanium I.T. LLC. All rights reserved. MIT License.
"use strict";

//** An HTTPS spy server for use by focused integration tests

const https = require("https");
const promisify = require("util").promisify;

const SELF_SIGNED_LOCALHOST_CERT_FOR_TESTING_ONLY =
  "-----BEGIN CERTIFICATE-----\n" +
  // TODO
  "-----END CERTIFICATE-----";

const CERT_PRIVATE_KEY_FOR_TESTING_ONLY =
  "-----BEGIN RSA PRIVATE KEY-----\n" +
  // TODO
  "-----END RSA PRIVATE KEY-----";

module.exports = class HttpsTestServer {
  constructor() {
    this._hostname = "localhost";
    this._port = 5030;
    this.reset();
  }

  reset() {
    this._forceRequestError = false;
    this._requests = [];
    this._responses = [];
  }

  hostname() { return this._hostname; }
  port() { return this._port; }
  host() { return this._hostname + ":" + this._port; }
  certificate() { return SELF_SIGNED_LOCALHOST_CERT_FOR_TESTING_ONLY; }

  getRequests() { return this._requests; }

  async startAsync() {
    const options = {
      cert: SELF_SIGNED_LOCALHOST_CERT_FOR_TESTING_ONLY,
      key: CERT_PRIVATE_KEY_FOR_TESTING_ONLY,
      secureProtocol: "TLSv1_method"
    };
    this._server = https.createServer(options);
    this._server.on("request", handleRequest.bind(null, this));

    await promisify(this._server.listen.bind(this._server))(this._port);
  }

  async stopAsync() {
    await promisify(this._server.close.bind(this._server))();
  }

  setResponses(responses) { this._responses = responses; }
  setResponse(response) { this._responses = [ response ]; }

  forceErrorDuringRequest() {
    this._forceRequestError = true;
  }
};

function handleRequest(self, request, response) {
  let responseInfo = self._responses.shift();
  if (responseInfo === undefined) responseInfo = { status: 503, body: "No response defined in HttpsTestServer" };

  const requestInfo = {
    method: request.method,
    url: request.url,
    headers: Object.assign({}, request.headers),
    body: ""
  };
  delete requestInfo.headers.connection;
  self._requests.push(requestInfo);

  if (self._forceRequestError) request.destroy();

  request.on("data", function(data) {
    requestInfo.body += data;
  });
  request.on("end", function() {
    response.statusCode = responseInfo.status;
    response.setHeader("Date", "harness_date_header");
    response.end(responseInfo.body);
  });
}

Paranoic Telemetry

External systems are unreliable. The only thing that's certain is their eventual failure. File systems lose data and become unwritable. Services return error codes, suddenly change their specifications, and refuse to terminate connections. Therefore:

Instrument the snot out of your infrastructure code. Assume that everything will break eventually. Test that every failure case either logs an error and sends an alert, or throws an exception that ultimately logs an error and sends an alert. Remember to test your code's ability to handle requests that hang, too.
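
For example, the HttpsTestServer shown earlier can force a failure mid-request. A sketch of one such failure-case test might look like this; the expected error message is an assumption, so match it to your client's actual behavior:

// Hypothetical sketch: testing a failure case with the Spy Server

it("throws a useful error when the connection fails mid-request", async function() {
  const client = createNetworkedClient();   // Signature Shielding factory from the earlier examples
  testServer.forceErrorDuringRequest();     // Spy Server aborts the request

  await assert.exceptionAsync(
    () => client.getUserInfoAsync("a_user_email"),   // call production code
    /request failed/i                                // assumed error; match your client's real message
  );
});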

All these failure cases are expensive to support and maintain. Whenever possible, use Testable Libraries rather than external services.

(An alternative to Paranoic Telemetry is Contract Tests, but they're not paranoid enough to catch changes that happen between test runs.)

Nullable Infrastructure

Focused Integration Tests are slow and difficult to set up. Although they're useful for ensuring that infrastructure code works in practice, they're overkill for code that depends on that infrastructure code. Therefore:

Program each infrastructure class with a factory method, such as "createNull()," that disables communication with the external system. Instances should behave normally in every other respect. This is similar to how Null Objects work. For example, calling LoginClient.createNull().getUserInfo("...") would return a default response without actually talking to the third-party login service.

The createNull() factory is production code and should be test-driven accordingly. Ensure that it doesn't have any mandatory parameters. (Nullable Infrastructure is the Infrastructure layer equivalent of Parameterless Instantiation.)
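
Here's a sketch of the kind of test that drives the createNull() factory, assuming the LoginClient from the earlier examples. The default response values are assumptions:

// Hypothetical sketch: test-driving the createNull() factory

it("provides a default response without talking to the login service", async function() {
  const client = LoginClient.createNull();   // no mandatory parameters

  const userInfo = await client.getUserInfoAsync("any_email");

  assert.deepEqual(userInfo, {   // assumed default response shape
    userId: "null_user_id",
    emailVerified: false,
  });
});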

To implement Nullable Infrastructure cleanly, use an Embedded Stub. To test code that has infrastructure dependencies, use Configurable Responses, Send State, Send Events, and Behavior Simulation.

Embedded Stub

In order for Nullable Infrastructure to be useful to tests, null instances need to disable the external system while running everything else normally. The obvious approach is to use a flag and a bunch of "if" statements, but that's a recipe for spaghetti. Therefore:

Stub out the third-party library that performs external communication rather than changing your infrastructure code. In the stub, implement the bare minimum needed to make your infrastructure code run. Ensure you don't overbuild the stub by test-driving it through your infrastructure code's public interface. Put the stub in the same file as your infrastructure code so it's easy to remember and update when your infrastructure code changes.

// Example Node.js wrapper for Socket.IO, a WebSockets library.
// Note how minimalistic the stub code is.

// Import real Socket.IO library and Node's EventEmitter
const io = require("socket.io");
const EventEmitter = require("events");

// Infrastructure Wrapper
class RealTimeServer extends EventEmitter {
  
  // Instantiate normal wrapper
  static create() {
    return new RealTimeServer(io);
  }

  // Instantiate Null wrapper
  static createNull() {
    return new RealTimeServer(nullIo);
  }  

  // Shared initialization code
  constructor(io) {
    super();
    this._io = io;
    //...
  }

  // Normal infrastructure code goes here.
  // It's unaware of which version of Socket.IO is used.
}

// Null Socket.IO implementation is programmed here
function nullIo() {
  return new NullIoServer();
}

class NullIoServer {
  on() {}
  emit() {}
  close(done) { return done(); }
}

class NullSocket {
  constructor(id) { this.id = id; }
  get isNull() { return true; }
  emit() {}
  get broadcast() { return { emit() {} }; }
}

Fake It Once You Make It

Some high-level infrastructure classes depend on low-level infrastructure classes to talk to the outside world. For example, a LoginClient class might use a RestClient class to perform its network calls. The high-level code is typically more concerned with parsing and processing responses than the low-level communication details. However, there will still be some communication details that need to be tested. Therefore:

Use a mix of Focused Integration Tests and Nullable Infrastructure in your high-level infrastructure classes. For tests that check if external communication is done properly, use a Focused Integration Test (and possibly a Spy Server). For parsing and processing tests, use simpler and faster Nullable Infrastructure dependencies.

This Node.js JavaScript example demonstrates two tests of a LoginClient. The LoginClient depends on a RestClient. Note how the network request test uses a Spy Server and the error handling test uses a Null RestClient.

// Example Node.js tests for high-level LoginClient
// that depends on low-level RestClient.

describe("authentication", function() {
  // Network communication uses a Focused Integration Test and a Spy Server
  it("performs network request", async function() {
    // Instantiate the unit under test
    const client = createNetworkedClient({
      clientId: "my_auth0_id",
      clientSecret: "my_auth0_secret",
      managementApiToken: "my_management_api_token",
      connection: "my_auth0_connection",
    });
    
    // Set up Spy Server response
    testServer.setResponse({
      status: 200,
      body: JSON.stringify({
        id_token: createIdToken({
          email: "irrelevant_email_address",
          email_verified: false
        }),
      }),
    });

    // Call the production code
    await client.validateLoginAsync("login_code", "my_callback_url");
    
    // Assert that the correct HTTP request was made
    assert.deepEqual(testServer.getRequests(), [{
      method: "POST",
      url: "/oauth/token",
      body: JSON.stringify({
        client_id: "my_auth0_id",
        client_secret: "my_auth0_secret",
        code: "login_code",
        redirect_uri: "my_callback_url",
        grant_type: "authorization_code"
      }),
      headers: {
        host: testServer.host(),
        authorization: "Bearer my_management_api_token",
        "content-type": "application/json; charset=utf-8",
        "content-length": "148",
      },
    }]);
  });

  // Processing test uses Nullable Infrastructure RestClient dependency
  it("fails with error when HTTP status isn't 'okay'", async function() {
    // Instantiate unit under test with dependency configured to provide desired response
    const response = { status: 500, body: "auth0_response" };
    const client = createNulledClient(response);

    // Assert that the correct error was generated    
    await assert.exceptionAsync(
      () => validateLoginAsync(client),  // call production code
      expectedError(response, "Unexpected status code from Auth0.")  // expected error
    );
  });
});

// Factory for Focused Integration Tests
function createNetworkedClient({
  hostname = testServer.hostname(),
  port = testServer.port(),
  clientId = "irrelevant_id",
  clientSecret = "irrelevant_secret",
  managementApiToken = "irrelevant_token",
  connection = "irrelevant_connection",
} = {}) {
  if (port === null) port = undefined;
  return LoginClient.create({
    hostname,
    port,
    certificate: testServer.certificate(),
    clientId,
    clientSecret,
    managementApiToken,
    connection
  });
}

// Factory for Nullable Infrastructure tests
function createNulledClient(responses) {
  return LoginClient.create({
    restClient: HttpsRestClient.createNull(responses),
    hostname: "irrelevant_hostname",
    clientId: "irrelevant_id",
    clientSecret: "irrelevant_secret",
    managementApiToken: "irrelevant_token",
    connection: "irrelevant_connection",
  });
}

Configurable Responses

Application and high-level infrastructure tests need a way of configuring the data returned by their infrastructure dependencies. Therefore:

Allow infrastructure methods' responses to be configured with an optional "responses" parameter to the Nullable Infrastructure's createNull() factory. Pass it through to the class's Embedded Stub and test-drive the stub accordingly.

When your infrastructure class has multiple methods that can return data, give each one its own createNull() parameter. Use named and optional parameters so they can be added and removed without breaking existing tests.

If a method needs to provide multiple different responses, pass them as an array. However, this may be a sign that your Infrastructure layer is too complicated.

// Example Node.js tests for Application layer code
// that reads data using LoginClient dependency

it("logs successful login", async function() {
  // Configure login client dependency
  const loginClient = LoginClient.createNull({
    validateLogin: {    // configure the validateLogin response
      email: "my_authenticated_email",
      emailVerified: true,
    },
  });
  const logCapture = LogService.createNull();
  
  // Run production code
  await performLogin({ loginClient, logCapture });  // Signature Shielding
  
  // Check results
  assert.deepEqual(logCapture.logs, [ "Login: my_authenticated_email" ]);
});

To test code that uses infrastructure to send data, use Send State or Send Events. To test code that responds to infrastructure events, use Behavior Simulation.

Send State

Application and high-level infrastructure code use their infrastructure dependencies to send data to external systems. They need a way of checking that the data was sent. Therefore:

For infrastructure methods that send data, and provide no way to observe that the data was sent, store the last sent value in a variable. Make that data available via a method call.

// Example Send State implementation in JavaScript

class LoginClient {
  constructor() {
    //...
    this._lastSentVerificationEmail = null;
  }
  sendVerificationEmail(emailAddress) {
    //...
    this._lastSentVerificationEmail = emailAddress;
  }
  getLastSentVerificationEmail() {
    return this._lastSentVerificationEmail;
  }
  //...
}

// Example Node.js tests for Application layer code
// that sends data using LoginClient dependency

it("sends verification email", async function() {
  const loginClient = LoginClient.createNull();
  const emailPage = createPage({ loginClient });
  
  await emailPage.simulatePostAsync();
  assert.deepEqual(loginClient.getLastSentVerificationEmail(), "my_email");
});

If you need more than one send result, or you can't store the sent data, use Send Events instead. To test code that uses infrastructure to get data, use Configurable Responses. To test code that responds to infrastructure events, use Behavior Simulation.

Send Events

When you test code that uses infrastructure dependencies to send large blobs of data, or sends data multiple times in a row, Send State will consume too much memory. Therefore:

Rather than storing the sent data in a variable, use the Observer pattern to emit an event when your infrastructure code sends data. Include the data as part of the event payload. When tests need to make assertions about the data that was sent, they can listen for the events.

Send Events require complicated test setup. To make your tests easier to read, create a helper function in your tests that listens for send events and stores their data in an array. Doing this in production could cause a memory leak, but it's not a problem in your tests because the memory will be freed when the test ends.

This JavaScript example involves Application layer code for a real-time web application. When a browser connects, the server should send it all the messages the server had previously received. This test uses Send Events to check that the server sends those messages when a new browser connects.

// Example Node.js tests for Application layer code that sends
// multiple pieces of data using RealTimeServer dependency

it("replays all previous messages when client connects", function() {
  const network = createRealTimeServer();  // the infrastructure dependency
  const app = createApp({ network });  // the application code under test
  
  // Set up the test preconditions
  const message1 = new DrawMessage(1, 10, 100, 1000);
  const message2 = new DrawMessage(2, 20, 200, 2000);
  const message3 = new DrawMessage(3, 30, 300, 3000);
  network.connectNullBrowser(IRRELEVANT_ID);  // Behavior Simulation
  network.simulateBrowserMessage(IRRELEVANT_ID, message1);  // more Behavior Simulation
  network.simulateBrowserMessage(IRRELEVANT_ID, message2);
  network.simulateBrowserMessage(IRRELEVANT_ID, message3);

  // Listen for Send Events
  const sentMessages = trackSentMessages(network);
  
  // Run production code
  network.connectNullBrowser("connecting client");

  // Check that the correct messages were sent
  assert.deepEqual(sentMessages, [ message1, message2, message3 ]);
});

// Helper function for listening to Send Events
function trackSentMessages(network) {
  const sentMessages = [];
  network.on(RealTimeServer.EVENT.SEND_MESSAGE, (message) => {
    sentMessages.push(message);
  });
  return sentMessages;
}

To test code that uses infrastructure to get data, use Configurable Responses. To test code that responds to infrastructure events, use Behavior Simulation.

Behavior Simulation

Some external systems will push data to you rather than waiting for you to ask for it. Your application and high-level infrastructure code need a way to test what happens when their infrastructure dependencies generate those events. Therefore:

Add methods to your infrastructure code that simulate receiving an event from an external system. Share as much code as possible with the code that handles real external events, while remaining convenient for tests to use.

// Example Node.js Behavior Simulation implementation

class RealTimeNetwork extends EventEmitter {
  // Real Socket.IO event code
  _listenForBrowserMessages(socket) {
    socket.on("message", (payload) => {
      const message = Message.fromPayload(payload);
      this._handleBrowserMessage(socket.id, message);
    });
  }
  
  // Simulated Socket.IO event code
  simulateBrowserMessage(clientId, message) {
    this._handleBrowserMessage(clientId, message);
  }
  
  // Shared message-processing logic
  _handleBrowserMessage(clientId, message) {
    this.emit(RealTimeNetwork.EVENT.RECEIVE_MESSAGE, { clientId, message });
  }
  
  // Another example from the same class...
  
  // Real Socket.IO event code
  _listenForBrowserConnection(ioServer) {
    ioServer.on("connection", (socket) => {
      this._connectBrowser(socket);
    });
  }
  
  // Simulated Socket.IO event code
  connectNullBrowser(browserId) {
    this._connectBrowser(new NullSocket(browserId));
  }
  
  // Shared connection logic
  _connectBrowser(socket) {
    const id = socket.id;
    this._socketIoConnections[id] = socket;
    this.emit(RealTimeNetwork.EVENT.BROWSER_CONNECT, id);
  }
  
  //...
}

// Example Node.js tests for Application layer code
// that responds to events from RealTimeNetwork dependency

it("broadcasts messages from one browser to all others", function() {
  // Setup
  const network = createRealTimeNetwork();  // the infrastructure dependency
  const app = createApp({ network });  // the application code under test 
  const browserId = "browser id";
  const message = new PointerMessage(100, 200);
  network.connectNullBrowser(browserId);

  // Trigger event that runs code under test
  network.simulateBrowserMessage(browserId, message);
  
  // Check that code under test broadcasted the message
  assert.deepEqual(network.getLastSentMessage(), message);
});

To test code that uses infrastructure to get data, use Configurable Responses. To test code that uses infrastructure to send data, use Send State or Send Events.

Conclusion

These patterns are an effective way of writing code that can be tested without test doubles, DI frameworks, or end-to-end tests.

Agile Fluency Model Updated

06 Mar, 2018

Six years ago, Diana Larsen and I created the Agile Fluency™ Model, a way of describing how agile teams tend to grow over time.

The Agile Fluency Model, showing a path starting with 'Pre-Agile', followed by a team culture shift, then the 'Focusing' zone. The path continues with a team skills shift that leads to the 'Delivering' zone. Next, an organizational structure shift leads to the 'Optimizing' zone. Finally, an organizational culture shift leads to the 'Strengthening' zone. After that, the path fades out as it continues on to zones yet undiscovered.

In the last six years, the model has gone on to be more influential than Diana and I ever expected. People and companies are using it around the world, often without even telling us. In that time, we've also learned new things about the model and what fluent agility looks like.

Today I'm happy to announce that we've released a new, updated article about the Agile Fluency Model. It's a substantial revision with much more detail about the benefits, proficiencies, and investments needed for different kinds of agile development.

We've also launched the Agile Fluency Diagnostic, a way to help teams develop new capabilities. It's a facilitated self-assessment conducted by experienced agile coaches. We have a list of licensed facilitators and you can also become licensed to conduct the Diagnostic yourself.

Many thanks to Martin Fowler, who published our original article and encouraged us to release this updated version. Check it out!

A Nifty Workshop Technique

05 Apr, 2017

It's hard to be completely original. But I have a little trick for workshops that I've never seen others do, and participants love it.

It's pretty simple: instead of passing out slide booklets, provide nice notebooks and pass out stickers. Specifically, something like Moleskine Cahiers and 3-1/3" x 4" printable labels.

Closeup of a workshop participant writing on a notebook page, with a sticker on the other page

I love passing out notebooks because they give participants the opportunity to actively listen by taking notes. (And, in my experience, most do.) Providing notebooks at the start of a workshop reinforces the message that participants need to take responsibility for their own learning. And, notebooks are just physically nicer and more cozy than slide packets... even the good ones.

The main problem with notebooks is that they force participants to copy down material. By printing important concepts on stickers, participants can literally cut and paste a reference directly into their notes. It's the best of both worlds.

There is a downside to this technique: rather than just printing out your slides, your stickers have to be custom-designed references. It's more work, but I find that it also results in better materials. Worth it.

People who've been to my workshops keep asking me if they can steal the technique. I asked them to wait until I documented my one original workshop idea. Now I have. If you use this idea, I'd appreciate credit. Other than that, share and enjoy. :-)

Picture of a table at the Agile Fluency Game workshop showing participants writing in their notebooks

Final Details for Agile Fluency Coaching Workshop

21 Mar, 2017

Our Agile Fluency™ Game coaching workshop is coming up fast! Signups close on March 28th. Don't wait!

We've been hard at work finalizing everything for the workshop. We hired Eric Wahlquist to do the graphic design and he did a great job.

Diana Larsen and I have also finalized the agenda for the workshop. It's so much more than just the game. The workshop is really a series of mini-workshops that you can use to coach your teams. Check 'em out:

  1. The Agile Fluency Game: Discover interrelationships between practices and explore the tradeoffs between learning and delivery
  2. Your Path through the Agile Fluency Model: Understand fluency zone tradeoffs and choose your teams' targets
  3. Zone Zoom: Understand how practices enable different kinds of fluency
  4. Trading Cards: Explore tradeoffs between practices
  5. Up for Adoption: See how practices depend on each other and which ones your teams could adopt
  6. Fluency Timeline: Understand the effort and time required for various practices
  7. Perfect Your Agile Adoption: Decide which practices are best for your teams and how to adopt them

These are all hands-on, experiential workshops that you'll learn how to conduct with your own teams. I think they're fantastic. You can sign up here.

The Agile Fluency Game: Now Available!

01 Mar, 2017

Five years ago, Arlo Belshee and I created a game about agile adoption. The ideas in that game influenced the Agile Fluency™ Model, Arlo's Agile Engineering Fluency map, and the Agile Fluency Diagnostic. We always intended to publish the game more widely, but the time and money required to do a professional publishing job was just too much.

Until now.

I am very proud to announce that, in collaboration with the Agile Fluency Project, the game I've spent the last five years play-testing and revising is finally available! I'm conducting a special workshop with Diana Larsen that's packed full of useful exercises to improve your Agile coaching and training. Every participant will get a professionally-produced box set of the game to take home.

Every time we've run the Agile Fluency Game, players have asked to get their own copy. Now it's finally available.

Sign up and learn more here.

Agile and Predictability

29 Sep, 2014

Over on the AgilePDX mailing list, there's an interesting conversation on making predictions with Agile. It started off with Michael Kelly asking if Agile can help with predictability. Here's my response:

It's entirely possible to make predictions with Agile. They're just as good as predictions made with other methods, and with XP practices, they can be much better. Agile leaders talk about embracing change because that has more potential value than making predictions.

Software is inherently unpredictable. So is the weather. Forecasts (predictions) are possible in both situations, given sufficient rigor. How your team approaches predictions depends on what level of fluency they have.

One-star teams adapt their plans and work in terms of business results. However, they don't have rigorous engineering practices, which means their predictions have wide error bars, on par with typical non-Agile teams (for 90% reliability, need 4x estimate*). They believe predictions are impossible in Agile.

Two-star teams use rigorous engineering practices such as test-driven development, continuous integration, and the other good stuff in XP. They can make predictions with reasonable precision (for 90% reliability, need 1.8x estimate*). They can and do provide reliable predictions.

Three- and four-star teams conduct experiments and change direction depending on market opportunities. They can make predictions just as well as two-star teams can, but estimating and predicting has a cost, and those predictions often have no real value in the market. They often choose not to incur the waste of making predictions.

So if a company were to talk to me about improving predictability, I would talk to them about what sort of fluency they wanted to achieve, why, and the investments they need to make to get there. For some organizations, *3 fluency isn't desired. It's too big of a cultural shift. In those cases, a *2 team is a great fit, and can provide the predictability the organization wants.

I describe the "how to" of making predictions with Agile in "Use Risk Management to Make Solid Commitments".

*The error-bar numbers are approximate and depend on the team. See the "Use Risk Management" essay for an explanation of where they come from.

How Does TDD Affect Design?

17 May, 2014

(This essay was originally posted to the Let's Code JavaScript blog.)

I've heard people say TDD automatically creates good designs. More recently, I've heard David Hansson say it creates design damage. Who's right?

Neither. TDD doesn't create design. You do.

TDD Can Lead to Better Design

There are a few ways I've seen TDD lead to better design:

  1. A good test suite allows you to refactor, which allows you to improve your design over time.

  2. The TDD cycle is very detail-oriented and requires you to make some design decisions when writing tests, rather than when writing production code. I find this helps me think about design issues more deeply.

  3. TDD makes some design problems more obvious.

None of these force you to create better design, but if you're working hard to create good design, I find that these things make it easier to get there.

TDD Can Lead to Worse Design

There are a few ways I've seen TDD lead to worse design:

  1. TDD works best with fast tests and rapid feedback. In search of speed, some people use mocks in a way that locks their production code in place. Ironically, this makes refactoring very difficult, which prevents designs from being improved.

  2. Also in search of speed, some people make very elaborate dependency-injection tools and structures, as well as unnecessary interfaces or classes, just so they can mock out dependencies for testing. This leads to overly complex, hard to understand code.

  3. TDD activates people's desire to get a "high score" by having a lot of tests. This can push people to write worthless or silly tests, or use multiple tests where a single test would do.

None of these are required by TDD, but they're still common. The first two are obvious solutions to the sorts of design problems TDD exposes.

They're also very poor solutions, and you can (and should) choose not to do these things. It is possible to create fast tests and rapid feedback without these mistakes, and you can see us take that approach in the screencast.

So Do Your Job

TDD doesn't create good design. You do. TDD can help expose design smells. You have to pay attention and fix them. TDD can push you toward facile solutions. You have to be careful not to make your design worse just to make testing better.

So pay attention. Think about design. And if TDD is pushing you in a direction that makes your code worse, stop. Take a walk. Talk to a colleague. And look for a better way.

The Lament of the Agile Practitioner

08 May, 2014

I got involved with Extreme Programming in 2000. Loved it. Best thing since sliced bread, yadda yadda. I was completely spoiled for other kinds of work.

So when that contract ended, I went looking for other opportunities to do XP. But guess what? In 2001, there weren't any. So I started teaching people how to do it. Bam! I'm a consultant.

Several lean years later (I don't mean Lean, I mean ramen), I'm figuring out this consulting thing. I've got a network, I've got a business entity, people actually call me, and oh, oh, and I make a real damn difference.

Then Agile starts getting really popular. Certification starts picking up. Scrum's the new hotness, XP's too "unrealistic." I start noticing some of my friends in the biz are dropping out, going back to start companies or lead teams or something real. But I stick with it. I'm thinking, "Sure, there's some bottom feeders creeping in, but Agile's still based on a core of people who really care about doing good work. Besides, if we all leave, what will keep Agile on track?"

It gets worse. Now I'm noticing that there are certain clients that simply won't be successful. I can tell in a phone screen. And it's not Scrum's fault, or certification, or anything. It's the clients. They want easy. I start getting picky, turning them down, refusing to do lucrative but ineffective short-term training.

Beck writes XP Explained, 2nd edition. People talk about Agile "crossing the chasm." I start working on the 2nd edition XP Pocket Guide with chromatic and it turns into The Art of Agile Development. We try to write it for the early majority--the pragmatics, not the innovators and early adopters that were originally attracted to Agile and are now moving on to other things. It's a big success, still is.

It gets worse. The slapdash implementations of Agile now outnumber the good ones by a huge margin. You can find two-day Scrum training everywhere. Everybody wants to get in on the certification money train. Why? Clients won't send people to anything else. The remaining idealists are either fleeing, founding new brands, or becoming Certified Scrum Trainers.

I write The Decline and Fall of Agile. Martin Fowler writes Flaccid Scrum. I write Stumbling through Mediocrity. At conferences, we early adopters console each other by saying, "The name 'Agile' will go away, but that's just because practices like TDD will just be 'the way you do software.'" I start looking very seriously for other opportunities.

That was six years ago.

...

Believe it or not, things haven't really gotten worse since then. Actually, they've gotten a bit better. See, 2-5 years is about how long a not-really-Agile Agile team can survive before things shudder to a complete halt. But not-quite-Agile was Actually. So. Much. Better. (I know! Who could believe it?) than what these terribly dysfunctional organizations were doing before that they're now interested in making Agile work. So they're finally investing in learning how to do Agile well. Those shallow training sessions and certifications I decried? They opened the door.

And so here we are, 2014. People are complaining about the state of Agile, saying it's dying. I disagree. I see these "Agile is Dying" threads as a good thing. Because they mean that the word is getting out about Agile-in-name-only. Because every time this comes up, you have a horde of commenters saying "Yeah! Agile sucks!" But... BUT... there's also a few people who say, "No, you don't understand, I've seen Agile work, and it was glorious." That's amazing. Truly. I've come to believe that no movement survives contact with the masses. After 20 years, to still have people who get it? Who are benefiting? Whose lives are being changed?

That means we have a shot.

And as for me... I found that opportunity, so I get to be even more picky about where I consult. But I continue to fight the good fight. Diana and I produced the Agile Fluency™ model, a way of understanding and talking about the investments needed, and we're launching the Agile Fluency Project later this month. We've already released the model under a permissive license, for everyone to use. Use it.

Because Agile has no definition, just a manifesto. It is what the community says it is. It always has been. Speak up.

Discuss this essay on the Let's Code JavaScript blog.

Object Playground: The Definitive Guide to Object-Oriented JavaScript

27 Aug, 2013

Let's Code: Test-Driven JavaScript, my screencast series on rigorous, professional JavaScript development, recently celebrated its one year anniversary. There's over 130 episodes online now, covering topics ranging from test-driven development (of course!), to cross-browser automation, to software design and abstraction. It's incredibly in-depth and I'm very proud of it.

To celebrate the one-year anniversary, we've released Object Playground, a free video and visualizer for understanding object-oriented programming. The visualizer is very cool: it runs actual JavaScript code and graphs the object relationships created by that code. There are several preset examples and you can also type in arbitrary code of your own.

[Screenshot: Object Playground in action]

Understanding how objects and inheritance work in JavaScript is a challenge even for experienced developers, so I've supplemented the tool with an in-depth video illustrating how it all fits together. The feedback has been overwhelmingly positive. Check it out.
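
If you want something to paste into it, a little prototype-chain snippet along these lines (my own made-up example, not one of the presets) produces exactly the kind of object relationships the visualizer graphs:

    // rex ---> Dog.prototype ---> Animal.prototype ---> Object.prototype
    function Animal(name) {
        this.name = name;
    }
    Animal.prototype.speak = function() {
        return this.name + " makes a sound";
    };

    function Dog(name) {
        Animal.call(this, name);
    }
    Dog.prototype = Object.create(Animal.prototype);
    Dog.prototype.constructor = Dog;
    Dog.prototype.speak = function() {
        return this.name + " barks";
    };

    var rex = new Dog("Rex");
    console.log(rex.speak());   // "Rex barks"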

Estimation and Fluency

25 Feb, 2013

Martin Fowler recently asked me via email if I thought there might be a relationship between Agile Fluency and how teams approach estimation. This is my response:

I definitely see a relationship between fluency and estimation. I can't say it's clear cut or that I have real data on it, but this is my gut feel:

  1. "Focus on Value" teams tend to fall into two camps: either "estimation is bad Agile" or "we're terrible at estimating." These statements are the same boy dressed by different parents. One-star teams can't provide reliable estimates because their iterations are unpredictable, typically with stories drifting into later iterations (making velocity meaningless), and because they carry a lot of technical debt (so even if they took a rigorous approach to "done done" iterations, velocity would vary widely from iteration to iteration and their predictions' error bars would be too wide to be useful).

  2. "Deliver Value" teams tend to take a "we serve our customer" attitude. They're very good at delivering what the customer asks for (if not necessarily what he wants). Their velocity is predictable, so they can make quite accurate predictions about how long it will take to get their current backlog done. Variance primarily comes from changes to the backlog and difficulty discussing needs with customers (leading to changes down the road), but those are manageable with error bars. Some two-star teams retain the "estimation is bad Agile" philosophy, but any two-star team with a reasonably stable backlog should be capable of making useful predictions.

  3. "Optimize Value" teams are more concerned with meeting business needs than delivering a particular backlog. Although they can make predictions about when work will be done, especially if they've had a lot of practice at it during their two-star phase, they're more likely to focus on releasing the next thing as soon as possible by reducing scope and collaboratively creating clever shortcuts. (E.g., "I know you said you wanted a subscriber database, but we think we can meet our first goal faster if we just make REST calls to our credit card processor as needed. That has ramifications x, y, z; what do you think?"). They may make predictions to share with stakeholders, but those stakeholders are higher-level and more willing to accept "we're working on business goal X" rather than wanting a detailed timeline.

  4. I'm not sure how this would play out with "Optimize for System" teams. I imagine it would be the same as three-star fluency, but with a different emphasis.

Analysis of Hacker News Traffic Surge on Let's Code TDJS Sales

25 Feb, 2013

A few weeks ago, my new screencast series, Let's Code: Test-Driven JavaScript, got mentioned on Hacker News. Daniel Markham asked that I share the traffic and conversion numbers. I agreed, and it's been long enough for me to collect some numbers, so here we are.

To begin with, Let's Code TDJS is a bootstrapped venture. It's a screencast series focused on rigorous, professional JavaScript development. It costs money. $19.95/month for now, $24.95/month starting March 1st, with a seven-day free trial.

Let's Code TDJS isn't exactly a startup--I already have a business as an independent Agile consultant--but I'm working on the series full-time. I've effectively "quit my day job" to work on it. I'm doing this solo.

I launched the series last year on Kickstarter. Thanks in large part to a mention on Hacker News and then even more from Peter Cooper's JavaScript Weekly, the Kickstarter was a huge success. It raised about $40K and attracted 879 backers. That confirmed the viability of the screencast and also gave me runway to develop the show.

(By the way, launching on Kickstarter was fantastic. I mean, sure, it was nailbiting and hectic and scary, and running the KS campaign took *way* more work than I ever expected, but the results were fantastic. As a platform for confirming the viability of an idea while simultaneously giving you seed funding, I can't imagine anything better for the bootstrapper.)

Anyway, the Kickstarter got a reasonable amount of attention. I tossed up a signup page for people who missed it, and by the time I was ready to release the series to the public, I had a little over 1500 people on my mailing list.

I announced the series' availability on February 4th. First on Sunday via Twitter, and then on Monday morning (recipient's time) via the mailing list. That's about 4,200 Twitter followers and 1,500 mailing list recipients.

Before we get into the numbers, I should tell you that I don't use Google Analytics. I don't track visitors, uniques, pageviews, none of that. I'm not sure what I would do with it. What I do track is 1) number of sales and 2) conversions/retention. That's it.

So, I announced on the weekend of the fourth. There was a corresponding surge in sales. Here's how many new subscriptions I got each day, with Monday the 4th counting as "100x." (So if I got 100,000 subscriptions on Monday--which, since you don't see me posting from a solid gold throne, you can assume I didn't--then I would have gotten 48,000 on Sunday.)

  • Sunday: 48x
  • Monday: 100x
  • Tuesday: 33x
  • Wednesday: 25x
  • (Total: 206x.)

82% of these subscribers converted to paying customers. 15% cancelled the trial and 3% just didn't pay: although I collect credit card information at signup, those subscribers' credit cards didn't process successfully.

A week and a half later, just after midnight my time (PST) on Wednesday the 13th, the series was posted to Hacker News by Shirish Padalkar. It was on the front page for about a day, and was near the top of the front page for the critical morning hours of every major European and US time zone. It peaked at #7. That led to a shorter, sharper surge.

  • Wednesday: 140x
  • Thursday: 23x
  • (Total: 163x, or 79% of the email's surge.)

83% of these subscribers converted from the free trial. 14% cancelled and 4% didn't pay. The only real difference between the two surges was that a lot more of the HN subscribers cancelled at the last moment. About half actually cancelled after their credit card failed to process and they got a dunning email. It makes me wonder if HN tire-kickers are more inclined to "hack the system" by somehow putting in a credit card that can be authorized but cannot be charged. A debit card with insufficient funds would do the trick.

Another interesting data point is that "background" subscriptions--that is, the steady flow of subscriptions since February 2nd, other than traffic surges--average 10x per day. (That is, on an average day, I get one-tenth the subscriptions I got on Monday the 4th.) However, the conversion rate for those "background" subscriptions is 95%. I'm not sure why it's so much higher. Perhaps those subscriptions are the result of word-of-mouth recommendations? That would make sense, since I'm not advertising or doing any real traffic generation yet.

My conclusions:

  • For this service, a mention on HN is about equivalent to a mailing list of 1,200 interested potential customers and 3,330 Twitter followers. That's actually less than I would have expected.

  • The conversion behavior of HN'ers is about the same as the mailing list. I would have expected the mailing list to convert at a higher rate, since they've already expressed interest. But it's essentially the same.

  • Both surges led to significantly fewer conversions than word-of-mouth subscribers. Don't get me wrong--from my research, I'm led to believe that ~83% is excellent. But 95% is frikkin' amazing.

  • HN'ers are more likely to cancel a free trial at the last moment and use credit cards that authorize but cannot be charged.
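
If you're wondering where that "equivalent audience" figure comes from, it's just arithmetic on the surge totals above. A quick sketch (the absolute subscription counts are normalized, so only the ratios matter):

    // The HN surge (163x) as a fraction of the email/Twitter surge (206x).
    var ratio = 163 / 206;                    // about 0.79

    // Scale the audience that produced the first surge by that ratio.
    console.log(Math.round(ratio * 1500));    // 1187 -- roughly 1,200 mailing list subscribers
    console.log(Math.round(ratio * 4200));    // 3323 -- roughly the 3,330 quoted above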

Finally--thanks to everyone who subscribed! I hope this was interesting. You can discuss this post on Hacker News. I'm available to answer questions.

(If you're curious about the series, you can find a demo video here.)

Let's Code: Test-Driven JavaScript Now Available

11 Feb, 2013

I'm thrilled to announce that Let's Code: Test-Driven JavaScript is now open to the public!

I've been working on this project for over nine months. Over a thousand people have had early access, and reactions have been overwhelmingly positive. Viewers are saying things like "truly phenomenal training" (Josh Adam, via email), "highly recommended" (Theo Andersen, via Twitter), and "*the* goto reference" (anonymous, via viewer survey).

Up until last week, the show was only available to Kickstarter backers. Now I've opened it up to everyone. Come check it out! There's a demo video below.

About Let's Code: Test-Driven JavaScript

If you've programmed in JavaScript, you know that it's an... interesting... language. Don't get me wrong: I love JavaScript. I love its first-class functions, the intensive VM competition between browser makers, and how it makes the web come alive. It definitely has its good parts.

It also has some not-so-good parts. Whether it's browser DOMs, automatic semicolon insertion, or an object model with a split personality, everyone's had some part of JavaScript bite them in the ass at some point. That's why test-driven development is so important.
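
(If you haven't been bitten by automatic semicolon insertion yet, here's the classic example: a line break after return gets a semicolon inserted, so this function silently returns undefined instead of the object.)

    function makePoint() {
        return            // ASI inserts a semicolon right here...
        {
            x: 1,
            y: 2
        };
    }

    console.log(makePoint());   // ...so this prints "undefined", not the point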

Let's Code: Test-Driven JavaScript is a screencast series focused on rigorous, professional web development. That means test-driven development, of course, and also techniques such as build automation, continuous integration, refactoring, and evolutionary design. We support multiple browsers and platforms, including iOS, and we use Node.js on the server. The testing tools we're using include NodeUnit, Mocha, expect.js, and Testacular.

You can learn more on the Let's Code TDJS home page. If you'd like to subscribe, you can sign up here.

Come Play TDD With Me at CITCON

18 Sep, 2012

CITCON, the Continuous Integration and Testing Conference, is coming up this weekend in Portland, and I'm going to be there and recording episodes of Let's Play TDD! I'm looking for people to pair with during the conference.

If you're interested, check out the current source code and watch a few of the most recent videos. Then, at the conference, come to my "Let's Play TDD" session and volunteer to pair! It should be a lot of fun.

There are still a few slots open at the conference, so if you haven't registered, there's still time. I hope to see you there!

Acceptance Testing Revisited

08 Sep, 2012

I recently had the chance to reconsider my position on acceptance testing, thanks to a question from Michal Svoboda over on the discussion forums for my Let's Code: Test-Driven JavaScript screencast. I think this new answer adds some nuances I haven't mentioned before, so I'm reproducing it here:

I think "acceptance" is actually a nuanced problem that is fuzzy, social, and negotiable. Using tests to mediate this problem is a bad idea, in my opinion. I'd rather see "acceptance" be done through face-to-face conversations before, after, and during development of code, centering around whiteboard sketches (earlier) and manual demonstrations (later) rather than automated tests.

That said, we still need to test the behavior of the software. But this is a programmer concern, and can be done with programmer tests. Tools like Cucumber shift the burden of "the software being right" to the customer, which I feel is a mistake. It's our job as programmers to (a) work with the customer, on the customer's terms, so we build the right thing, and (b) make sure it actually works, and keeps working in the future. TDD helps us do it, and it's our responsibility, not the customer's.

I don't know if this is very clear. To rephrase: "acceptance" should be a conversation, and it's one that we should allow to grow and change as the customer sees the software and refines her understanding of what she wants. Testing is too limited, and too rigid. Asking customers to read and write acceptance tests is a poor use of their time, skill, and inclinations.
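
To make that concrete: the kind of behavior a Cucumber scenario might spell out ("orders over $100 ship free") can usually be covered by an ordinary programmer test. A made-up sketch:

    var assert = require("assert");
    var shipping = require("./shipping.js");   // hypothetical module

    describe("shipping cost", function() {
        it("is free for orders over $100", function() {
            assert.equal(shipping.costFor({ orderTotal: 120 }), 0);
        });

        it("is charged for smaller orders", function() {
            assert.equal(shipping.costFor({ orderTotal: 40 }), 5);
        });
    });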

I've written more about this elsewhere.

Lack of Fluency?

10 Aug, 2012

Dave Nicolette has written a thoughtful critique of my article with Diana Larsen, Your Path through Agile Fluency. I'm going to take a moment to respond to his points here.

Dave lays out his core criticism thusly:

The gist of the article appears to be that we can effect organizational improvement in a large company by driving change from the level of individual software development teams.

He spends the rest of his article elaborating on this point, and particularly making the point that most software is built in IT (rather than product) organizations, where software teams have little ability to drive change.

It's a well-made argument, and I agree with it. Organizational change does require top-down support, and software teams in IT do have little ability to drive bottom-up change.

Just one problem: I don't see why Dave presents this as a criticism. Our article isn't about using software teams to drive organizational change.

Our essay describes how teams progress through stages of Agile fluency. It's based on what we've seen in 13 years of applying Agile and observing others' Agile efforts. We've developed the fluency model over the last year and a half. Along the way, we've reviewed it with dozens of people in all sorts of roles--team members, managers, executives, and consultants. The feedback has been clear: the model reflects their experiences. I doubt it's perfect, but the fluency model reflects reality to the best of our ability.

This model isn't about how organizations grow. It's about how teams grow. (There's probably room for an article about organizational Agile fluency, but that's for another time.) And this is what we see:

1. Teams learning Agile first get good at focusing on the business perspective. User stories and the like. (That's not to say that every team using user stories is successfully focusing on the business perspective!) This requires some cultural changes at the team level.

2. If the team has been working on improving their development skills, they get good at this next. It takes longer than the first stage because there's a lot to learn. I'm talking TDD, refactoring, and so forth. Some teams don't bother.

3. Typically, by this point, the team has been feeling the pain of poor business involvement for a while. Sometimes, in organizations that support it, the team gets good at the business side of things next. It takes longer than the previous stage because "getting good at it" means involving people outside the team. Most organizations are set up with "business" and "software" in different parts of the org chart, and this "third star" of fluency typically (but not always) requires them to be merged into cross-functional teams.

We don't say how to make this happen, just that it's a prerequisite for three-star fluency, and that these sorts of changes to organizational structure are difficult and require spending organizational capital. It can be top-down or bottom-up. (Really, it has to be both.) We also say that it may not be worth making this investment. Two-star fluency could be enough.

4. Finally, in a few companies, the team's focus extends beyond its product/projects to helping to optimize the overall system. This requires an organizational culture that likes its teams to advise on whole-company issues. Again, we don't say how to achieve this, just that it's a prerequisite for four-star fluency, and that we've only seen it happen in small companies, and typically companies that have this as an organizational value from the beginning. We again emphasize that more stars aren't necessarily worth the investment.

So, to recap: Dave argues that the individual software teams in IT cannot drive bottom-up organizational change. I agree. Organizational change must occur if you want to achieve three- or four-star fluency, but our article doesn't describe how to do so. It just says it's necessary.