James Shore: A Light Introduction to Nullables

March 2, 2023

A few weeks ago, I released a massive update to my article, “Testing Without Mocks: A Pattern Language.” It’s 45 pages long if you print it. (Which you absolutely could. I have a fantastic print stylesheet.) Along with it, I released my new Nullables Hub, which has all sorts of resources related to that article.

But why bother? Why write 45 pages about testing, with or without mocks?

Because testing is a big deal. People who don’t have automated tests waste a huge amount of time manually checking their code, and they have a ton of bugs, too. But...

Most Automated Tests Suck

The problem is, people who do have automated tests also waste a huge amount of time. Most test suites are flaky and sloooooow. That’s because the easy, obvious way to write tests is to make end-to-end tests that are automated versions of manual tests.

Folks in the know use mocks and spies (I’ll say “mocks” for short) to write isolated unit tests. Now their tests are fast! And reliable! And that’s great!

Except that a lot of their tests are filled with detail about the interactions in the code. Structural refactorings become really hard. Sometimes, you look at a test, and realize: all it’s testing... is itself.

Not to mention that the popular way to use mocks is to use a mocking framework and... wow. Have you seen what those tests look like?

So we don’t want end-to-end tests, we don’t want mocks. What do we do?

The people really really in the know say, “Bad tests are a sign of bad design.” They’re right! They come up with things like Hexagonal Architecture and (my favorite) Gary Bernhardt’s Functional Core, Imperative Shell. It separates logic from infrastructure—external systems and state—so logic can be tested cleanly.

Totally fixes the problem.

For logic.

Anything with infrastructure dependencies... well... um... hey, look, a squirrel! (runs for hills)

Not to mention that approximately (checks notes) none of us are working in codebases with good separation of logic and infrastructure and approximately (checks notes again) none of us have permission to throw away our code and start over with a completely new architecture.

(And even if we did have permission, throwing away code and starting over is a Famously Poor Business Decision with Far Reaching Consequences.)

So we don’t want end-to-end tests, we don’t want mocks, we can’t start over from scratch... are we screwed? That’s it, the end, life sucks?

No.

Another Way

That’s why I wrote 45 pages. Because I’ve figured out another way. A way that doesn’t use end-to-end tests, doesn’t use mocks, doesn’t ignore infrastructure, doesn’t require a rewrite. It’s something you can start doing today, and it gives you the speed, reliability, and maintainability of unit tests with the power of end-to-end tests.

I call it “Testing with Nullables.” Or—because the Internet is a fickle place—“Testing Without Mocks.” Yeah, I’m an attention whore. Click me, baby.

Whatever you call it, it’s a set of patterns for combining narrow, sociable, state-based tests with a novel infrastructure technique called “Nullables.” Here’s what a high-level test using those patterns looks like:

it("reads command-line argument, transforms it with ROT-13, and writes result", () => {
  const { output } = run({ args: [ "my input" ] });
  assert.deepEqual(output.data, [ "zl vachg\n" ]);
});

// Test helper: runs the app against a Nulled CommandLine and returns what it wrote
function run({ args = [] } = {}) {
  const commandLine = CommandLine.createNull({ args });
  const output = commandLine.trackOutput();

  const app = new App(commandLine);
  app.run();

  return { output };
}

At first glance, Nullables seem like test doubles, but they’re actually production code with an “off” switch.

This is as good a point as any to remind you that nothing is perfect. End-to-end tests have tradeoffs, mocks have tradeoffs, Functional Core, Imperative Shell has tradeoffs... and Nullables have tradeoffs. All engineering is tradeoffs.

The trick is to find the combination of good + bad that is best for your situation.

Nullables have a pretty substantial tradeoff. Whether it’s a big deal or not is up to you. Having worked with these ideas for many years now, I think the tradeoffs are worth it. But you have to make that decision for yourself.

Here’s the tradeoff: Nullables are production code with an off switch.

Production code.

Even though the off switch may not be used in production.

The Core Idea

Okay, enough foreplay. Let’s talk about how this thing works. Again, you can see all the details in the article.

The fundamental idea is that we’re going to test everything—everything!—with narrow, sociable, state-based tests.

Narrow tests are like unit tests: they focus on a particular class, method or concept.
Sociable tests are tests that don’t isolate dependencies. The tests run everything in dependencies, although they don’t test them.
And state-based tests look at return values and state changes, not interactions.
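To make those three terms concrete, here's a minimal sketch. `Rot13` and `Greeter` are hypothetical classes invented for this illustration, and the test uses a plain `throw` instead of a test framework so it runs on its own.

```javascript
// Hypothetical classes for illustration; not taken from the article.
class Rot13 {
  static transform(text) {
    return text.replace(/[a-zA-Z]/g, (char) => {
      const base = char <= "Z" ? 65 : 97;   // char code of "A" or "a"
      return String.fromCharCode((char.charCodeAt(0) - base + 13) % 26 + base);
    });
  }
}

class Greeter {
  // Uses Rot13 directly; the test below does not replace or isolate it.
  encodeGreeting(name) {
    return Rot13.transform(`hello ${name}`);
  }
}

// Narrow: the test focuses on Greeter.
// Sociable: the real Rot13 runs as part of the test.
// State-based: it checks the return value, not which methods were called.
function testGreeter() {
  const result = new Greeter().encodeGreeting("world");
  if (result !== "uryyb jbeyq") throw new Error(`unexpected result: ${result}`);
}

testGreeter();
```

If somebody later inlines `Rot13` or splits it in two, this test doesn't care, as long as `encodeGreeting()` still returns the same thing.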

(There’s a ton of code examples in the article, btw, if you want them.)

This does raise some questions about how to manage dependencies. Another core idea is “Parameterless Instantiation.” Everything can be instantiated with a constructor, or factory method, that takes no arguments.

Instead, classes do the unthinkable: they instantiate their own dependencies. (Gasp!)

Encapsulation, baby.

(You can still take the dependencies as an optional parameter.)
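A rough sketch of what that can look like, assuming a made-up `SessionTimer` that depends on a made-up `Clock` (neither name comes from the article):

```javascript
// Hypothetical classes for illustration; not taken from the article.
class Clock {
  static create() {
    return new Clock();
  }

  now() {
    return Date.now();
  }
}

class SessionTimer {
  // Callers pass nothing; the factory instantiates the dependency itself.
  // The optional parameter stays available for the rare caller that needs it.
  static create(clock = Clock.create()) {
    return new SessionTimer(clock);
  }

  constructor(clock) {
    this._clock = clock;
    this._startedAt = clock.now();
  }

  elapsedMs() {
    return this._clock.now() - this._startedAt;
  }
}

const timer = SessionTimer.create();   // no arguments, no DI container
console.log(`elapsed: ${timer.elapsedMs()}ms`);
```

The caller never needs to know that `SessionTimer` uses a `Clock` at all, which is the encapsulation point.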

People ask: “but if we don’t use dependency injection frameworks...”

I interrupt: “Your code is simpler and easier to understand?” I’m kind of a dick.

They continue, glaring: “...doesn’t that mean our code is tightly coupled?”

And the answer is no, of course not. Your code was already tightly coupled! An interface with one production implementation is not “decoupled.” It’s just wordy. Verbose. Excessively file-system’d.

(The other answer is, sure, use your DI framework too. If you must.)

Anyway, that’s the fundamentals. Narrow, sociable, state-based tests that instantiate their own dependencies.

A-Frame Architecture

Next up: A-Frame Architecture! This is optional, but people really like it. It’s basically a formalized version of Functional Core, Imperative Shell (FCIS). I’m going to skip on ahead, but feel free to check out the article for details. Here’s the direct link to the A-Frame Architecture section.

Speaking of architecture, the big flaw with FCIS, as far as I’ve seen, is that it basically ignores infrastructure, and things that depend on infrastructure.

“I test it manually,” Gary Bernhardt says, in his very much worth watching video.

That’s a choice. I’m going to show you how to make a different one.

(Not trying to dunk on FCIS here. I like it. A-Frame Architecture has a lot in common with FCIS, but has more to say about infrastructure.)

Infrastructure

So right, infrastructure!

Code these days has a lot of infrastructure. And sometimes very little logic. I see a bunch of code that is really nothing more than a web page controller that turns around and hands off to a database and a bunch of back-end services, and maybe has a bit of logic to glue it all together. Very hard to test with the “just separate your logic out” philosophy. And so it often doesn’t get tested at all. We can do better.

There are two basic kinds of infrastructure code:

Code that interfaces directly with the outside world. Your HTTP clients, database wrappers, etc. I call this “low-level infrastructure.”
Code that depends on low-level infrastructure. Your Auth0 and Stripe clients, your controllers and application logic. I call this “high-level infrastructure” and “Application/UI code.”

Low-level infrastructure should be wrapped in a dedicated class. I call these things “Infrastructure Wrappers,” ’cause I’m boring and like obvious names, but they’re also called “Gateways” or “Adapters.”

Because it talks to the outside world, this code needs to be tested for real, against actual outside world stuff. Otherwise, how do you know it works? For that, you can use Narrow Integration Tests. They’re like unit tests, except they talk to a test server. Hopefully a dedicated one.

High-level infrastructure should also be wrapped up in an Infrastructure Wrapper, but it can just delegate to the low-level code. So it doesn’t need to be tested against a real service—you can just check that it sends the correct JSON or whatever, and that it parses the return JSON correctly.

And parses garbage correctly. And error values. And failed connections. And timeouts.

(fratboy impression) Woo! Microservices rock!

Paranoic Telemetry

At this point, people ask, “But what if the service changes its API? Don’t you need to test against a real service to know your code still works?”

To which, I respond, “What, you think the service is going to wait for you to run your tests before changing its API?”

(Yeah, still kind of a dick.)

You need to have runtime telemetry and write your code to fail safe (and not just fall over) when it receives unexpected values. I call this “Paranoic Telemetry.”

Sure, when you first write the high-level wrapper, you’ll make sure you understand the API so you can test it properly, maybe do some manual test runs to confirm what the docs say.

But then you gotta have Paranoic Telemetry. They are out to get you.

True story: I was at a conference once and somebody—I think it was Recurly, but it might have been Auth0—changed their API in a way that utterly borked my login process.

My code had telemetry and failsafes, though, and handled it fine. I got paged and took care of it when I got back from the conference. Paranoia FTW.
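Here's one way "fail safe, with telemetry" might look inside a high-level wrapper. This is a sketch under assumptions: the `AuthResponseParser` name, the `email` field, and the shape of the log entry are all invented for illustration and don't describe any real service's API.

```javascript
// Hypothetical wrapper logic; the field names and log format are assumptions.
class AuthResponseParser {
  // `log` is any function that records telemetry (for example, one that pages someone).
  constructor(log) {
    this._log = log;
  }

  parse(body) {
    let json;
    try {
      json = JSON.parse(body);
    } catch (err) {
      return this._failSafe("response was not valid JSON", body);
    }
    if (json === null || typeof json.email !== "string") {
      return this._failSafe("response had no 'email' string", body);
    }
    return { ok: true, email: json.email };
  }

  // Record what happened, then return a value callers can handle instead of throwing.
  _failSafe(message, body) {
    this._log({ level: "alert", message, body });
    return { ok: false };
  }
}

const alerts = [];
const parser = new AuthResponseParser((entry) => alerts.push(entry));

console.log(parser.parse('{"email":"user@example.com"}'));   // expected shape
console.log(parser.parse("<html>502 Bad Gateway</html>"));     // garbage: logged, not thrown
console.log(`alerts recorded: ${alerts.length}`);
```

The design choice is that unexpected input produces two things: a record somebody will see, and a result the calling code can degrade around.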

Application Code

Moving up the call chain: Application code is like high-level infrastructure. It delegates, probably to the high-level infrastructure, which turns around and delegates to low-level infrastructure.

That raises the question: how do you test things that eventually talk to the outside world? Without using mocks, stubs, or spies?

And that’s where Nullables come in.

(“Finally!” some of you say. “Won’t this guy ever shut up?” the rest of you say.)

Nullables

Nullables are production code that can be turned off.

Let’s take a simple example. You’ve got a low-level wrapper for stdout called—creatively—Stdout. If it’s Nullable, then you can either say Stdout.create(), in which case it works normally, or you can say Stdout.createNull(), in which case it works normally in every respect except that it doesn’t write to stdout.

“Working normally” isn’t such a big deal for Stdout, because there’s no real logic or behavior there, but it is a big deal for more complicated infrastructure wrappers. For example, an HttpClient wrapper that returns a function that allows you to cancel requests. The cancel function still works with the Nulled HttpClient, even though it doesn’t actually make requests.
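For the `Stdout` case, a minimal sketch of the two factories could look like this. The `create()` and `createNull()` names come from the text above; the internals, including the `NullProcessStdout` helper, are an assumption about one way to build it.

```javascript
class Stdout {
  // Normal instance: writes to the real process.stdout.
  static create() {
    return new Stdout(process.stdout);
  }

  // Nulled instance: same class, same code path, but nothing reaches the terminal.
  static createNull() {
    return new Stdout(new NullProcessStdout());
  }

  constructor(stdout) {
    this._stdout = stdout;
  }

  write(text) {
    this._stdout.write(text);
  }
}

// Hypothetical helper: has the one method Stdout uses and discards the text.
class NullProcessStdout {
  write() {}
}

Stdout.create().write("this line is printed\n");
Stdout.createNull().write("this line is not\n");
```

Both instances run the same `write()` method in `Stdout`; only the thing at the very bottom differs.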

Your low-level infrastructure is Nullable, the high-level infrastructure that uses it is Nullable, and the application layer is Nullable. It’s Nullables all the way down. (Except in your logic layer, if you’re lucky enough to have one, which is beautiful and pure and mostly non-existent for us Morlocks.)

And the thing about Nullables is that they run real code and work normally in every way except that they don’t actually write to stdout, or make HTTP calls, or whatever.

That’s kind of a big deal for your tests, because it means that, when somebody changes your HttpClient abstraction in a totally cool, awesome, smart way, and they break all your shit, your tests fail.

Let me repeat that: your tests actually fail.

They learn that they broke your shit, and they fix it.

I don’t know about you, but that’s worth a certain amount of ugly tradeoffs to me.

The Tradeoff

So buckle up, because I’m about to reveal the granddaddy of all tradeoffs: the magic that makes this work.

Nullables run real code because, way, way down at the bottom of your dependency chain, in the lowest of low-level infrastructure wrappers, they’re implemented with an Embedded Stub.

An Embedded Stub is production code that stubs out your third-party infrastructure library.

It’s not a stub of your code; it’s a stub of the standard library, or framework, or what have you.

For example, in Node.js, you use http.request() to make an HTTP request. The Embedded Stub in HttpClient stubs out the standard http library. The stub is used when HttpClient.createNull() is called, and the normal http library is used when HttpClient.create() is called.

As a result, all your code runs the same regardless of whether it’s Nulled or not.

You’re probably looking for an example right about now. I get it. The Embedded Stub pattern has a simple JavaScript example of stubbing out Math’s random number generator, and a more complex one of stubbing out Node’s http.

If you like Java and Spring Boot, and who doesn’t, the Thin Wrapper pattern has examples of stubbing out Random and RestTemplateWrapper. (Cheers to Ted M. Young for creating these examples with me in our livestream.)
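In the meantime, here's a rough sketch of the idea, loosely in the spirit of the random-number example. It is not the code from the pattern: `DieRoller` and `StubbedMath` are names invented for this illustration.

```javascript
class DieRoller {
  // Normal instance: uses the real Math object.
  static create() {
    return new DieRoller(Math);
  }

  // Nulled instance: uses the Embedded Stub instead of the real Math.
  static createNull() {
    return new DieRoller(new StubbedMath());
  }

  constructor(math) {
    this._math = math;
  }

  // This logic runs identically whether the instance is Nulled or not.
  roll() {
    return Math.floor(this._math.random() * 6) + 1;
  }
}

// Embedded Stub: lives in production code and stubs the third-party piece
// (Math.random), not DieRoller itself. It implements only what DieRoller calls.
class StubbedMath {
  random() {
    return 0;
  }
}

console.log(DieRoller.create().roll());       // some value from 1 to 6
console.log(DieRoller.createNull().roll());   // always the same value
```

Because the stub replaces only `random()`, the scaling and rounding in `roll()` are still exercised by every test that uses the Nulled version.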

The Rest

The rest of the patterns are all about how you make this work in practice.

We’ve got things like Configurable Responses, which is how you control which data your Nullables return.

And Output Tracking, which is a way of keeping track of what your infrastructure code sends to the outside world.
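Those two patterns are what the `CommandLine.createNull({ args })` and `trackOutput()` calls in the earlier test rely on. Here's a simplified sketch of how a wrapper might support them. The method names mirror that test, but the internals (including `StubbedProcess`, `args()`, and `writeOutput()`) are assumptions, and the article's real implementation may differ.

```javascript
class CommandLine {
  static create() {
    return new CommandLine(process);
  }

  // Configurable Responses: the test decides which arguments the Nullable reports.
  static createNull({ args = [] } = {}) {
    return new CommandLine(new StubbedProcess(args));
  }

  constructor(proc) {
    this._process = proc;
    this._trackers = [];
  }

  args() {
    return this._process.argv.slice(2);   // skip the node binary and script path
  }

  writeOutput(text) {
    this._process.stdout.write(text);
    this._trackers.forEach((tracker) => tracker.data.push(text));
  }

  // Output Tracking: returns an object whose `data` array records later writes.
  trackOutput() {
    const tracker = { data: [] };
    this._trackers.push(tracker);
    return tracker;
  }
}

// Hypothetical stub of Node's `process`, limited to what CommandLine uses.
class StubbedProcess {
  constructor(args) {
    this.argv = ["null_node", "null_script.js", ...args];
    this.stdout = { write() {} };
  }
}

const commandLine = CommandLine.createNull({ args: ["my input"] });
const output = commandLine.trackOutput();
commandLine.writeOutput(`received: ${commandLine.args()[0]}\n`);
console.log(output.data);
```

Tracking happens in the wrapper's own `writeOutput()`, so it works the same way for Nulled and normal instances.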

And Behavior Simulation, which is a way of simulating events that come from the outside world, such as a POST request to a web page controller.
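A very small sketch of that last idea, assuming a made-up `FormServer` wrapper. The class, its method names, and the response shape are invented for illustration, and the real network side is left out entirely.

```javascript
// Hypothetical wrapper; a full version would also start and stop an actual HTTP server.
class FormServer {
  static createNull() {
    return new FormServer();
  }

  constructor() {
    this._onPost = null;
  }

  // Application code registers the function that handles POSTed form data.
  onPost(handler) {
    this._onPost = handler;
  }

  // Behavior Simulation: tests call this to play the part of the outside world.
  simulatePost(body) {
    return this._handlePost(body);
  }

  // In a full wrapper, the real server's request listener would call this same
  // method, so simulated and real requests would share one code path.
  _handlePost(body) {
    if (this._onPost === null) return { status: 503, body: "no handler registered" };
    return this._onPost(body);
  }
}

const server = FormServer.createNull();
server.onPost((body) => ({ status: 200, body: `got ${body.text}` }));
console.log(server.simulatePost({ text: "hello" }));
```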

And there’s a whole section of patterns on how you work with legacy code.

One of the neat things about the pattern language is that it’s totally compatible with your existing code.

This was a surprise! It wasn’t part of my original design goals.

But it turns out that Nullables and all the other patterns, except the optional architecture patterns, can coexist side-by-side with your existing ~~sh—~~ lovingly handcrafted legacy code.

Like literally, even in the same test.

That means you can update a test to use Nullables by replacing exactly one mock and keeping everything else the same, run the tests, see them pass, and repeat.

That opens up some really nice opportunities for improving your codebase incrementally and gradually. And of course...

If it ain’t broke, don’t fix it.

Nullables and the rest of the patterns are a way of solving the problems I see with existing approaches to testing.

If you have slow and flaky tests...

If you have hard-to-read tests that you suspect are really only testing themselves...

If your code is hard to refactor...

...check them out.

And if you don’t have those problems, or they’re not bad enough to be worth the Embedded Stub, you don’t have to use them.

Engineering is tradeoffs.

So choose the tradeoffs that are right for you.

For more resources related to Nullables and the “Testing Without Mocks” patterns, see the Nullables Hub.