Agile Fluency eBook in Portuguese

One of my most enduring works is the Agile Fluency Model, which I created with Diana Larsen. Our original article, The Agile Fluency Model: A Brief Guide to Success with Agile, has been translated into multiple languages. And now... that includes Brazilian Portuguese!

Many thanks to Renato Barbieri for creating this translation for us. His book, Uma Breve História da Agilidade, tells the history of the Agile movement. I haven't read it yet—partly because I don’t know Portuguese—but I trust that it’s excellent.

Renato is donating all the royalties from the book to help victims of the current flooding in the south of Brazil. You can help him do so by buying his book here.

Free Self-Guided “Testing Without Mocks” Training

I’m thrilled to announce that my commercial “Testing Without Mocks” training course is now available for free!

My “Testing Without Mocks” resources—also known as “Nullables”—are consistently among the most popular material on this site. I used to offer an instructor-led course for it. But I’m too busy for that now, so I’ve released that same high-quality course in a self-guided format.

There’s just one caveat: the self-guided version of this course is offered without support. If you need tutoring or want a live, instructor-led course, contact me about paid options.

Other than that, it’s free for you to enjoy! Find it here.

A Useful Productivity Measure?

In my new role as VP of Engineering, there was one question I was dreading more than any other: “How are you measuring productivity?”

I can’t fault the question. I mean, sure, I’d rather it be phrased about how I’m improving productivity, rather than how I’m measuring it, but fair enough. I need to be accountable for engineering productivity. There are real problems in the org, I do need to fix them, and I need to demonstrate that I’m doing so.

Just one little problem: software productivity is famously unmeasurable. Martin Fowler: “Cannot Measure Productivity.” From 2003. 2003!

More recently, Kent Beck and Gergely Orosz tackled the same question. Kent concluded: “Measure developer productivity? Not possible.”[1]

[1] Kent and Gergely’s two-part article is excellent and worth reading. Part one. Part two. And a later followup.

So now what do I do? That’s it, I’m screwed, make up some bullshit metric and watch my soul die, McKinsey style? Fight with my CEO about his impossible request until he gives up and fires me?

Maybe not. I think I’ve found another way. It’s early, but it’s working for me so far. Will it work for you? Eeehhhhh... maybe. Probably not. But maybe.

My Solution

It started half a year ago, in September 2023. My CEO asked me how I was measuring productivity. I told him it wasn’t possible. He told me I was wrong. I took offense. It got heated.

After things cooled off, he invited me to his house to talk things over in person. (We’re a fully remote company, in different parts of the country, so face time takes some arranging.) I knew I couldn’t just blow off the request, so I decided to approach the question from the standpoint of accountability. How could I demonstrate to my CEO that I was being accountable to the org?

We met at the CEO’s house, along with the CTO and CPO (Chief Product Officer). I led them through an exercise: “Imagine we’ve built the best product engineering organization in the world. What does that look like?” We came up with six categories of ideas. Then I asked, “Which indicators will help us understand how we’re getting closer to these ideals?” We came up with indicators in each category. Blissfully, none of those categories were “productivity.” I don’t think anyone noticed. Bullet dodged.

I came away feeling fairly positive about the conversation. I discussed the results with my Engineering and Product peers, we refined, and I finally presented the first “Product Engineering Accountability Review” a few weeks ago. It went well! I used the indicators to support a qualitative discussion of what’s happening in Engineering, rather than just reporting numbers.

One problem: the CEO had a scheduling conflict and couldn’t come. So I don’t know what he would have thought. But at least the CTO and CPO liked it.

The Productivity OKR

Meanwhile, back in January, the leadership team had established that one of our company-wide OKRs[1] would be to define and improve productivity metrics for each department. I was to present mine to the full Leadership team at the end of April. Crap. Bullet un-dodged.

[1] “OKRs” are “Objectives and Key Results.” They’re a way of setting and tracking goals. Similar to Management by Objectives, about which Deming said: “Eliminate management by objective. Eliminate management by numbers, numerical goals. Substitute leadership.” But that’s a rant for another day.

In October, we had defined six aspects of being the greatest product engineering company in the world. One of them was “profitability.” Its indicators were the most related to outcomes. If I had to measure productivity—and I did—they were the ones to use.

We had three indicators for profitability: actual RoI, estimated RoI, and value-add capacity. The first was best, in theory. In practice, it might be impossible to measure. Before I explain why, I need to explain how we calculate RoI.

Product Bets

Every engineering organization I’ve ever seen has had more demand than capacity. Prioritizing those demands is crucial. And fraught. Lots and lots of opportunity for conflict.

We’re no exception. To help bring order to the chaos, the VP of Product and I have introduced the idea of “Product Bets.” Each major initiative needs a Product Bet Proposal. It's a short, one-page document that explains:

  • What we’re going to accomplish
  • The value it’s estimated to generate
  • The amount we’re willing to bet
  • The justification for the value
  • How we’ll measure the value

In order for a proposal to be accepted, a member of the Leadership team needs to sponsor it, take accountability for its success, and convince the other Leadership members that their proposal is more important than all the others.

In theory, anyway. I’ve tried variants of this idea before, and it’s never lasted. Turns out leadership teams like accountability more when it’s other people who have to be accountable.

But we’re trying. So far... it’s kind of working. Maybe. Too early to tell, honestly. I’ll write more after the verdict’s in.

A True Measure of Productivity

But if the product bet process does work... well. That would be cool. It would give us a true measure of productivity. We would know the value of a bet, and it’s easy to know how much we spend on a bet. Value produced over dollars spent. Boom. Productivity. Done.

Even better, the numbers are nice. Very nice. Each bet has a “maximum wager,” which determines how many engineer-days we’ll invest in the bet before giving up. Those wagers are based on one tenth of the expected value over five years. In other words, 10x return on investment.
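
Here’s a rough sketch of the wager arithmetic in TypeScript. Only the one-tenth rule comes from the text above. The function name, the cost per engineer-day, and the dollar figures are made up for illustration.

```typescript
// Hypothetical sketch of the "maximum wager" rule described above.
// Only the one-tenth ratio comes from the article; all names and
// dollar figures are invented for illustration.

const WAGER_DIVISOR = 10; // wager = one tenth of expected five-year value

interface MaximumWager {
  dollars: number;
  engineerDays: number;
}

function maximumWager(expectedFiveYearValue: number, costPerEngineerDay: number): MaximumWager {
  if (costPerEngineerDay <= 0) {
    throw new Error("Cost per engineer-day must be positive");
  }
  const dollars = expectedFiveYearValue / WAGER_DIVISOR;
  return {
    dollars,
    // Round down: the bet stops once the wager is used up.
    engineerDays: Math.floor(dollars / costPerEngineerDay),
  };
}

// Example: a bet expected to be worth $2,000,000 over five years,
// with an assumed cost of $1,000 per engineer-day.
const wager = maximumWager(2000000, 1000);
console.log(wager); // { dollars: 200000, engineerDays: 200 }
```

If the bet pays off as estimated, spending the full wager yields the 10x return: $2,000,000 of value for $200,000 of engineering time.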

10x return on investment is enough to make anybody take notice. But... measuring value is a problem. Sure, each bet has a section on how we’ll measure value, but can we really tease that out from sales team effort, customer success team effort, other feature changes, and changes in the market? Probably not.

It might not matter. We may not be able to measure the actual value, but every bet also has an estimated value attached. Combined with actual cost, that gives us a measure of estimated RoI. It may not be real RoI, but it’s good enough for understanding the productivity of the engineering team.
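
As a sketch, estimated RoI is just the bet’s estimated value divided by what was actually spent. The `ProductBet` shape and the numbers below are assumptions for illustration, not a real bet.

```typescript
// Hypothetical sketch: estimated RoI for a single product bet.
// The interface and the figures are invented for illustration.

interface ProductBet {
  title: string;
  estimatedValue: number; // estimated value from the bet proposal, in dollars
  actualCost: number;     // what Engineering actually spent, in dollars
}

function estimatedRoi(bet: ProductBet): number {
  if (bet.actualCost <= 0) {
    throw new Error(`Bet "${bet.title}" has no recorded cost, so RoI is undefined`);
  }
  return bet.estimatedValue / bet.actualCost;
}

// A bet estimated at $2,000,000 that ended up costing $250,000.
const exampleBet: ProductBet = { title: "example bet", estimatedValue: 2000000, actualCost: 250000 };
const roi = estimatedRoi(exampleBet);
console.log(roi); // 8
```

Note that the numerator is still an estimate. Swapping in measured value would give actual RoI, if that value could be measured.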

That’s our first two productivity measures: actual RoI and estimated RoI. Pretty good. Except that we don’t have any data.

A Better Measure of Productivity... For Now

The RoI metrics rely on us having product bets. But we don’t. Not yet. We’re still rolling them out. So, no matter how good the metrics might be, we can’t use them. No data.

There’s a third indicator in the “profitability” category we can use, though. It’s value-add capacity.

Like any engineering organization, we spend some percent of our time on fixing bugs, performing maintenance, and other things that are necessary but don’t add value from a customer or user perspective. The Japanese term for this is muda.

If we didn’t have any muda, spent all our time on value-add work, and achieved a 10x return on each investment, our productivity would be ten: $10 for every $1 in salary (or close enough). If we spent 80% of our time on value-add work, our productivity would be eight. Twenty percent, two.

In other words, in the absence of RoI measures, the percent of engineering time spent on value-add activities is a pretty good proxy for productivity.
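
The proxy arithmetic, sketched in TypeScript. The function names and the day counts are hypothetical, and the 10x figure is the assumed return per bet from above, not a measured result.

```typescript
// Hypothetical sketch of value-add capacity as a productivity proxy.
// The 10x return is an assumption carried over from the bet wagers.

const ASSUMED_RETURN_PER_DOLLAR = 10;

// Fraction of engineering time spent on value-add work (0 to 1).
function valueAddCapacity(valueAddDays: number, mudaDays: number): number {
  const totalDays = valueAddDays + mudaDays;
  if (totalDays <= 0) {
    throw new Error("No engineering time recorded");
  }
  return valueAddDays / totalDays;
}

// Dollars of value per dollar spent, if every bet returned the assumed amount.
function productivityProxy(capacity: number, assumedReturn: number = ASSUMED_RETURN_PER_DOLLAR): number {
  return capacity * assumedReturn;
}

// 80 value-add days and 20 muda days: capacity of 0.8, proxy of about 8.
const capacity = valueAddCapacity(80, 20);
const proxy = productivityProxy(capacity);
console.log(capacity, proxy);
```

Because the assumed return is a constant, the proxy moves only when capacity moves, which is why the percentage alone is enough to report.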

That’s the productivity number I reported to Leadership last week.

How It Was Received

It worked really well. The nice thing about reporting this number was that people were already frustrated with Engineering’s progress. They could see that we had capacity problems, but they didn’t know why. It was easy for them to assume that it was because people weren’t working hard, or didn’t know what they were doing.

I presented our metric as a single stacked bar chart. (Like a pie chart, but in a rectangle.) Muda on the bottom, value-add on the top. Then I expanded out the muda into another stacked bar chart, showing how much time was being spent across all of Engineering on deferred maintenance, bugs, on call, incident response, deployments, and so forth. Then expanded out again with more detail for the worst of those categories.

It completely changed the tenor of the conversation. Suddenly, the conversation shifted from, “how can we get the stuff we want sooner,” to “how can we decrease muda and spend more time on value-add work?” That’s exactly the conversation we need to be having.

Earlier in the week, the CEO told me that, next quarter, Leadership wants a briefing from me about how Engineering works. What my deliverables are, essentially. With the capacity measure, I have a good answer: my job is to double our value-add capacity over the next three years. Essentially, to double our output without increasing spending.

You know what? With my XP plans and the XP coaches I’ve hired, it’s totally doable. I think I’m being kind of conservative, actually.

A Fatal Flaw

So that’s my productivity measure: value-add capacity. The percentage of engineering time we spend on adding value for users and customers. Can you use it? Eehhhhh... maybe.

I can use value-add capacity—so far—because I’m aggressively stubborn about honest data. I refuse to skew our numbers or do worse work to make my department look good, and I’m keeping a close eye on what my teams are doing, too. This is important, because value-add capacity has a fatal flaw:

When a measure becomes a target, it ceases to be a good measure.

Goodhart’s Law

It’s ridiculously easy to cheat this metric. Even if you correctly categorize your muda—it’s very tempting to let edge cases slide—all you have to do is stop fixing bugs, defer some needed upgrades, ignore a security vulnerability... and poof! Happy numbers. At a horrible cost.

Actually, that’s the root of my org’s current capacity problems. They weren’t cheating a metric, but they were under pressure to deliver as much as possible. So they deferred a bunch of maintenance and took some questionable engineering shortcuts. Now they’re paying the price.

Unfortunately, you can get away with cheating this metric for a long time. Years, really. It’s not like you cut quality one month and then the truth comes out the next month. This is a metric that only works when people are scrupulously honest, including with themselves.

So, yeah, I’m not sure if this will work for you. It depends on how much ability you have to police things. The RoI indicators might not work for you either. They require product bets, or something similar, and that requires a lot of org changes. Even if they do work for me—jury’s out on that—they’re not something you can introduce overnight.

But, so far, value-add capacity is working for me, and I thought that might be interesting to you. Maybe spark some ideas. Just be cautious. Goodhart’s Law is a vengeful bastard. Remember, all this productivity metric stuff is a sideshow to what really matters:

Deliver valuable software. Do it often. And write it well.

Good luck.

A Software Engineering Career Ladder

I’ve been quiet lately, and that’s because I’ve joined OpenSesame as Vice President of Engineering. It’s been a fascinating opportunity to rebuild an engineering organization from the inside, and I’m loving every minute. We’re introducing a lot of cutting-edge software development practices, such as self-organizing vertically-scaled teams and Extreme Programming.

As you might expect, introducing these changes to an organization with [REDACTED] number of engineers has been challenging. (I’m not sure if I’m allowed to say how many engineers we have, so let’s just say “lots,” but not “tons.” Bigger than a breadbox, anyway. Enough that I don’t do any coding myself, and the managers that report to me don’t have time to do much either.)

What I’m really doing is changing the engineering culture at OpenSesame. Culture doesn’t change easily. It tends to snap back. True change involves changing hundreds of little day-to-day decisions. That’s hard, even when people want to make those changes, and full buy-in is hard to come by. I’ve hired several XP coaches to help, but even they’re stretched thin.

A Lever for Change

This is where the new career ladder comes in. OpenSesame had a pretty innovative approach to career development before I joined. It involved a spreadsheet where engineers would gather evidence of their skills. Each piece of evidence contributed towards an engineer’s promotion. It did a nice job of being objective (or at least, as objective as these things can be) and clear about expectations.

The new career ladder builds on the ideas of the previous spreadsheet to introduce the changes I want. Where the old spreadsheet focused on individual ownership and investigating new technologies, the new one emphasizes teamwork, peer leadership, and maintainable code. I’m hoping this will help direct people to new behaviors, which will in turn start to change the engineering culture.

The new spreadsheet also replaces the previous evidence-based approach with a simple manager-led evaluation of skills. This makes room for a lot more skills. Too many, possibly. It’s a very fine-grained approach. But I’m hoping that will help provide clarity to engineers and give them the opportunity to pick and choose which skills they want to work on first.

How It Works

Each title has certain skill requirements, which are grouped into skill sets. For example, “Software Engineer” requires these skill sets:

  • Basic Communication
  • Basic Leadership
  • Basic Product
  • Basic Implementation
  • Basic Design
  • Basic Operations

Each skill set includes several skills. For example, “Basic Design” includes these skills:

  • Decompose problem into tasks
  • Class abstraction
  • Mental model of your team’s codebase
  • Mental model of a complex dependency
  • Campsite rule
  • Fail fast
  • Paranoiac telemetry
  • Evaluate simple dependencies

(There’s a document that explains each skill in more detail.)

Managers evaluate each engineer’s skills by talking to team members and observing their work. Each skill is graded on this scale:

  • None. The engineer doesn’t have this skill.
  • Learning. The engineer is learning this skill.
  • Proficient. The engineer can succeed at the skill when they concentrate on it, but it isn’t second nature.
  • Fluent. The engineer uses the skill automatically, without special effort, whenever it’s appropriate.

When an employee is fluent at all the skills for a particular title (and all previous titles), they’re eligible for promotion to that title.

(We also offer step promotions, such as Software Engineer 1 to Software Engineer 2, which come when the engineer is proportionally far along their way to the next title.)
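
To make the title promotion rule concrete, here’s a small TypeScript sketch. The data shapes are assumptions, not the format of the actual spreadsheet, and the ladder is drastically shortened. Step promotions aren’t modeled.

```typescript
// Hypothetical sketch of the promotion rule: fluent in every skill for
// the target title and for all previous titles. Data shapes are assumed.

type SkillLevel = "None" | "Learning" | "Proficient" | "Fluent";

interface SkillSet {
  name: string;
  skills: string[];
}

interface JobTitle {
  name: string;
  skillSets: SkillSet[];
}

// A manager's evaluation of one engineer: skill name -> level.
// Skills that haven't been evaluated count as not fluent.
type Evaluation = { [skill: string]: SkillLevel | undefined };

function isFluentInTitle(title: JobTitle, evaluation: Evaluation): boolean {
  return title.skillSets.every((skillSet) =>
    skillSet.skills.every((skill) => evaluation[skill] === "Fluent"));
}

// `ladder` lists titles from most junior to most senior.
function isEligibleFor(targetTitle: string, ladder: JobTitle[], evaluation: Evaluation): boolean {
  const targetIndex = ladder.map((title) => title.name).indexOf(targetTitle);
  if (targetIndex === -1) {
    throw new Error(`Unknown title: ${targetTitle}`);
  }
  // Check the target title and every title before it.
  return ladder.slice(0, targetIndex + 1).every((title) => isFluentInTitle(title, evaluation));
}

// A two-title ladder using a few skill names from this article.
const exampleLadder: JobTitle[] = [
  { name: "Associate Software Engineer", skillSets: [{ name: "Professionalism", skills: ["Work ethic", "Grit"] }] },
  { name: "Software Engineer", skillSets: [{ name: "Basic Design", skills: ["Campsite rule", "Fail fast"] }] },
];

const exampleEvaluation: Evaluation = {
  "Work ethic": "Fluent",
  "Grit": "Fluent",
  "Campsite rule": "Fluent",
  "Fail fast": "Proficient",
};

console.log(isEligibleFor("Associate Software Engineer", exampleLadder, exampleEvaluation)); // true
console.log(isEligibleFor("Software Engineer", exampleLadder, exampleEvaluation)); // false: "Fail fast" isn't fluent yet
```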

Submitted for Your Approval

Why tell you all this? Because I want your feedback. We have an early draft that we’re starting to roll out to a handful of engineers. I’m sure there are opportunities for improvement. We’ve probably forgotten some skills, or set the bar too high in some areas, or too low.

So I’d love for you to take a look and share what you think. Maybe you’ll find some of the ideas useful for your own teams, too. You can find the spreadsheet and documentation here:

Please share your feedback in one of these places:

Full Career Ladder

Here’s the full list of titles and skills. You can find descriptions of each skill in the documentation.

Associate Software Engineers

Associate Software Engineer 1s are at the start of their career. They’re expected to understand the basics of software development, and be able to work in a professional setting, but they’re mostly working under the guidance of more experienced engineers.

  • Professionalism

    • Spoken and written English
    • Work ethic
    • Intrinsic motivation
    • Remote attendance
    • In-person attendance
    • Active participation
    • Respectful communication
    • Transparency
    • Team orientation
    • Follow the process
    • Grit
    • Absorb feedback
    • Growth mindset
    • OpenSesame Qualified[1]
  • Classroom Engineering

    • Object-oriented programming language
    • Pairing/teaming driver
    • Classroom-level debugging
    • Function and variable abstraction

[1] “OpenSesame Qualified” is our internal training program.

Software Engineers

Software Engineer 1s still have a lot to learn, but they’re able to contribute to the work of their team without explicit guidance. They’re beginning to demonstrate peer leadership skills and develop their abilities as generalizing specialists.

  • Basic Communication

    • Collective ownership
    • Defend a contrary stance
    • “Yes, and...”
    • Try it their way
    • Technical feedback
    • Active listening
    • As-built documentation
  • Basic Leadership

    • Basic facilitation
    • Team steward
    • Valuable increment steward
    • Scut work
  • Basic Product

    • Your team’s product
    • Your team’s customers and users
    • User story definition
  • Basic Implementation

    • Your team’s programming language
    • Your team’s codebase
    • Basic test-driven development
    • Sociable unit tests
    • Narrow integration tests
    • End-to-end tests
    • Manual validation
    • Spike solutions
    • Basic SQL
    • Pairing/teaming navigator
    • Basic algorithms
    • Basic performance optimization
    • Debugging your team’s components
    • Simple dependency integration
    • Unhappy path thinking
  • Basic Design

    • Decompose problem into tasks
    • Class abstraction
    • Mental model of your team’s codebase
    • Mental model of a complex dependency
    • Method and variable refactoring
    • Campsite rule
    • Fail fast
    • Paranoiac telemetry
    • Evaluate simple dependencies
  • Basic Operations

    • Source control
    • Your team’s release process
    • On-call responsibility
    • On-call triaging
    • Issue investigation
    • Your team’s cloud infrastructure
    • Code vulnerability awareness
    • Cloud vulnerability awareness

Senior Software Engineers

Despite the name, Senior Software Engineer 1s are still fairly early in their careers. However, they have enough experience to take a strong peer leadership role in their teams. They’ve developed broader generalist skills and deeper specialist skills.

  • Advanced Communication

    • Clear and concise speaking
    • Clear and concise writing
    • Technical diagramming
    • Explain mental model
    • Ensure everyone’s voice is heard
    • Coalition building
    • Interpersonal feedback
    • Runbook documentation
  • Advanced Leadership

    • Peer leadership
    • Comfort with ambiguity
    • Risk management
    • Intermediate facilitation
    • Mentoring and coaching
    • Critique the process
    • Circles and soup
  • Advanced Product

    • Ownership
    • Vertical slices
    • Cost/value optimization
  • Advanced Implementation

    • All of your team’s programming languages
    • All of your team’s codebases
    • Codebase specialty
    • Code performance optimization
    • Complex dependency integration
    • Retrofitting tests
    • Exploratory testing
  • Advanced Design

    • Codebase design
    • Simple design
    • Reflective design
    • Cross-class refactoring
    • Basic database design
    • Mental model of team dependencies
    • Evaluate complex dependencies
    • Simplify and remove dependencies
  • Advanced Operations

    • Observability
    • Basic build automation
    • Basic deployment automation
    • Incident leader
    • Incident communicator
    • Incident fixer
  • Senior SE Specialty

    • Choose one of the specialty skill sets listed below.

Technical Leads

Technical Leads are the backbone of a team. They combine deep expertise in several specialties with the ability to mentor and coach less experienced team members. They work closely with the team’s other technical leads to advise engineering managers on the capabilities and needs of the team. However, this remains a coding-centric role, and the majority of their time is spent as a player-coach working alongside other team members.

  • Team Leadership

    • Personal authority
    • Leaderful teams
    • Leadership specialty
    • Assess technical skills
    • Assess interpersonal skills
    • Assess product skills
    • Technical interview
    • Impediment removal
  • Interpersonal Leadership

    • Humility
    • Psychological safety
    • Calm the flames
    • Ignite the spark
  • Product Leadership

    • Options thinking
    • Status and forecasting
    • Progress and priorities
  • Design Leadership

    • Simple codebase architecture
    • Reflective codebase architecture
    • Risk-driven codebase architecture
    • Architectural refactoring
    • Published API design
  • Technical Lead Specialties

    • Choose three(?) additional specialty skill sets.

Staff Engineers

Staff Engineers make a difference to the performance of Engineering as a whole. They rove between teams, cross-pollinating information and ideas. They work hands-on with each team, acting as player-coaches, bringing a breadth and depth of expertise that people are happy to learn from.

These skill sets haven’t been defined yet.

Principal Engineers

This level hasn’t been defined yet.

Specialty Skill Sets

Starting at the Senior Software Engineer level, engineers choose specialty skill sets in addition to the foundational skill sets described above. We haven’t defined these skill sets yet, but here are some of the ones we’re considering:

  • Product
  • Distributed systems
  • Databases
  • Security
  • Extreme Programming
  • Developer Automation
  • Algorithms
  • Machine Learning
  • Front-End
  • iOS
  • Android

Feedback

Please share your thoughts!

Art of Agile Development in Korean

Book cover for the Korean translation of “The Art of Agile Development, Second Edition” by James Shore. The title reads, “[국내도서] 애자일 개발의 기술 2/e”. It’s translated by 김모세 and published by O’Reilly. Other than translated text, the cover is the same as the English edition, showing a water glass containing a goldfish and a small sapling with green leaves.

I’m pleased to announce that the Korean translation of The Art of Agile Development is now available! You can buy it here.

Many thanks to 김모세 for their hard work on this translation.

Art of Agile Development in India and Africa (English)

Book cover for the Indian edition of “The Art of Agile Development, Second Edition” by James Shore. It’s the same as the normal edition, showing a water glass containing a goldfish and a small sapling with green leaves, except that the publisher is listed as SPD as well as O’Reilly. There’s also a black badge labelled “Greyscale Edition” that reads, “For Sale in the Indian Subcontinent and Selected Countries Only (refer back cover).”

I’m pleased to announce that there’s a special edition of The Art of Agile Development available in the Indian subcontinent and Africa! (It’s in English.) You can buy it here.

Many thanks to Shroff Publishers & Distributors Pvt. Ltd. (SPD) for making this edition available.

AI Chronicles #7: Configurable Client

In this weekly livestream series, Ted M. Young and I build an AI-powered role-playing game using React, Spring Boot, and Nullables. And, of course, plenty of discussion about design, architecture, and effective programming practices.

Watch us live every Monday! For details, see the event page. For more episode recordings, see the episode archive.

In this episode...

We turn to parsing the response returned from the “say” API, which will be the OpenAI response to a message. To do that, we add the ability to configure the nullable HttpClient so it returns predetermined responses from our tests. We discover that using the HTTP library's Response object provides a default Content Type, which we don't want for our tests, and deal with the window vs. Global implementation of fetch().

After we get everything working, we add types to make the TypeScript type checker happy. With that done, we're ready for the next episode, where we'll return to the Spring Boot back-end and implement the “say” API endpoint.
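
For readers who haven’t seen the pattern, here’s a minimal TypeScript sketch of what a configurable Nullable HttpClient might look like. It isn’t the code from the episode: the class shape, the `createNull()` configuration format, the 503 default, and the example.invalid URL are all assumptions made for illustration. The stubbed response is hand-rolled rather than built from the platform’s Response object, so no default content type sneaks in, and configured responses are keyed by full URL.

```typescript
// Hypothetical sketch of a Nullable HttpClient with configurable responses.

interface ConfiguredResponse {
  status?: number;
  headers?: Record<string, string>;
  body?: string;
}

interface HttpResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Our own minimal description of the parts of fetch() this sketch relies on.
interface FetchResponseLike {
  status: number;
  headers: { forEach(callback: (value: string, name: string) => void): void };
  text(): Promise<string>;
}
type FetchLike = (url: string, init?: { method?: string }) => Promise<FetchResponseLike>;

const DEFAULT_NULL_STATUS = 503; // arbitrary default for unconfigured endpoints

// Stand-in for fetch() that never touches the network.
function createStubbedFetch(endpoints: Record<string, ConfiguredResponse>): FetchLike {
  return async (url) => {
    const configured: ConfiguredResponse = endpoints[url] ?? {};
    const headers = configured.headers ?? {};
    return {
      status: configured.status ?? DEFAULT_NULL_STATUS,
      headers: {
        forEach: (callback: (value: string, name: string) => void): void => {
          Object.keys(headers).forEach((name) => callback(headers[name], name));
        },
      },
      text: async () => configured.body ?? "",
    };
  };
}

class HttpClient {
  private readonly fetchFn: FetchLike;

  // Production instance: uses whichever global fetch() the platform provides.
  static create(): HttpClient {
    const realFetch = (globalThis as unknown as { fetch: FetchLike }).fetch;
    return new HttpClient((url, init) => realFetch(url, init));
  }

  // Nulled instance: same code path, but fetch() is stubbed out.
  static createNull(endpoints: Record<string, ConfiguredResponse> = {}): HttpClient {
    return new HttpClient(createStubbedFetch(endpoints));
  }

  private constructor(fetchFn: FetchLike) {
    this.fetchFn = fetchFn;
  }

  async request(url: string, method: string = "GET"): Promise<HttpResponse> {
    const response = await this.fetchFn(url, { method });
    const headers: Record<string, string> = {};
    response.headers.forEach((value, name) => {
      headers[name] = value;
    });
    return { status: response.status, headers, body: await response.text() };
  }
}

// Usage in a test: partially configured response, no network, no mocking library.
async function exampleConfiguredRequest(): Promise<HttpResponse> {
  const client = HttpClient.createNull({
    "https://api.example.invalid/say": { status: 200, body: '{"reply":"hello"}' },
  });
  return client.request("https://api.example.invalid/say", "POST");
}
```

Looking up fetch() on `globalThis` in `create()` is one way to avoid caring whether the code runs in a browser or in Node. The arrow function wrapper keeps fetch() from being called with the HttpClient instance as `this`.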

Contents

  • Ted spoke at the Kansas City Developer Conference (0:16)
  • Lack of Java-focused conferences in the USA (1:19)
  • Conferences in the USA vs. Europe/rest of world (2:10)
  • Trying to aim talks at the right level for the audience (4:38)
  • Ted's research to prepare for the AssertJ talk (6:05)
  • AssertJ assertions for Joda Money (7:23)
  • Joda Money project vs. JSR-354 Money & Currency (8:45)
  • Programming language cultures (10:34)
  • Checked exceptions and API design (12:20)
  • Language convention vs. enforced rules (14:02)
  • Why we wrap third-party libraries and objects (17:49)
  • Primitive Obsession (20:16)
  • Exploring and learning your tools (22:51)
  • Learning design from Martin Fowler's "Refactoring" book (23:25)
  • Small changes, small steps (24:39)
  • Loss of awareness of design? (25:32)
  • Reading books in a group (29:12)
  • Refactorings and their trade-offs (30:29)
  • James talks about CTO vs. VP Engineering (31:56)
  • Reviewing where we left off in the code (34:30)
  • Sidebar: forgetting what you were doing in a project (39:27)
  • Planning and doing one thing at a time (41:01)
  • Context-switching in a heavy pull-request environment (41:48)
  • Feedback loops and eXtreme Programming (42:55)
  • Testing parsing of responses in the BackEndClient (45:43)
  • Sidebar: who holds the state? (49:06)
  • Configuring the answer for the BackEndClient (50:05)
  • Test-driving HttpClient's default response (54:45)
  • Where is that text/plain content type coming from? (1:05:18)
  • Sidebar: differencing in test output and coding in VB, and QB (1:06:10)
  • Should our stubbed fetch() return content length? (1:09:06)
  • Configuring HttpClient's response for an endpoint (1:10:48)
  • Discovered need to specify full endpoint URL, not just path (1:14:58)
  • Test failed as expected, on to implementation (1:17:37)
  • Who has fetch()? Window vs. Global vs. globalThis (1:21:26)
  • Using Optional chaining and nullish coalescing (1:29:23)
  • Troubleshooting "headers.entries" (1:30:09)
  • Specifying content-type in configured response (1:36:30)
  • Generalize to allow partially configured response (1:42:10)
  • Sidebar on readability of "advanced" syntax in code (1:48:40)
  • Allowing multiple endpoints to be configured (1:52:13)
  • Avoiding real-world values in configuration tests (1:58:42)
  • Spiking some attempts at improving code (1:59:20)
  • Adding types to make TypeScript type checker happy (2:05:10)
  • Defining own type often easier than reusing library types (2:14:24)
  • Back to the BackEndClient failing test (2:16:12)
  • Refactor test code now that it passes (2:20:36)
  • Reviewing the test refactor (2:30:45)
  • BackEndClient is done: updated the plan and integrated (2:31:32)
  • Next time we'll start with the Spring Boot back-end endpoint (2:33:10)
  • Review our work (2:34:05)
  • Downside of sociable vs. isolated tests with mocks (2:35:02)
  • The Rubber Chicken (2:38:18)

Source code

Visit the episode archive for more.

AI Chronicles #6: Output Tracker

In this episode...

We continue working on our front end. After some conversation about working in small steps, we turn our attention to BackEndClient, our front-end wrapper for communication to the back-end server. We start out by writing a test to define the back-end API, then modify our HttpClient wrapper to track requests. By the end of the episode, we have the back-end requests tested and working.
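
Request tracking can be sketched roughly as follows. This is an illustration of the output tracking idea, not the episode’s code: the class and method names are assumptions, and the client below skips the actual fetch() call so the sketch runs without a network.

```typescript
// Hypothetical sketch of request tracking with an output listener.

type Listener<T> = (data: T) => void;

// Broadcasts emitted data to any trackers that are currently listening.
class OutputListener<T> {
  private readonly listeners: Listener<T>[] = [];

  trackOutput(): OutputTracker<T> {
    return new OutputTracker<T>(this);
  }

  emit(data: T): void {
    this.listeners.forEach((listener) => listener(data));
  }

  addListener(listener: Listener<T>): void {
    this.listeners.push(listener);
  }

  removeListener(listener: Listener<T>): void {
    const index = this.listeners.indexOf(listener);
    if (index !== -1) this.listeners.splice(index, 1);
  }
}

// Collects everything emitted between its creation and stop().
class OutputTracker<T> {
  private readonly output: T[] = [];
  private readonly source: OutputListener<T>;
  private readonly listener: Listener<T>;

  constructor(source: OutputListener<T>) {
    this.source = source;
    this.listener = (data) => {
      this.output.push(data);
    };
    this.source.addListener(this.listener);
  }

  get data(): T[] {
    return [...this.output]; // copy, so callers can't alter the record
  }

  stop(): void {
    this.source.removeListener(this.listener);
  }
}

interface TrackedRequest {
  url: string;
  method: string;
  body: string;
}

class TrackedHttpClient {
  private readonly requestListener = new OutputListener<TrackedRequest>();

  trackRequests(): OutputTracker<TrackedRequest> {
    return this.requestListener.trackOutput();
  }

  async post(url: string, body: string): Promise<void> {
    this.requestListener.emit({ url, method: "POST", body });
    // A real wrapper would call fetch() here; omitted in this sketch.
  }
}

// Usage in a test: assert on what was sent, without a mock or spy library.
async function exampleRequestTracking(): Promise<TrackedRequest[]> {
  const client = new TrackedHttpClient();
  const requests = client.trackRequests();
  await client.post("https://api.example.invalid/say", "hello");
  return requests.data;
}
```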

Contents

  • Program Note (0:12)
  • Multi-Step Refactorings (2:26)
  • Work in Small Steps (6:31)
  • Evaluating Complexity (11:11)
  • Collaborative Development (19:04)
  • Continuous Improvement (25:24)
  • Fixing the Typechecker (28:02)
  • Today’s Plan (31:11)
  • James Shore’s Housecleaning Tips (33:03)
  • Build the BackEndClient (35:59)
    • Sidebar: Delaying Good Code (38:48)
    • End sidebar (41:41)
  • Make HttpClient Nullable (51:02)
    • Sidebar: Tuple (58:48)
    • End sidebar (1:00:29)
  • Stubbing the fetch() Response (1:09:24)
    • Sidebar: In-Browser Testing (1:26:37)
  • Build OutputListener (1:28:00)
  • Request Tracking (1:55:04)
    • Sidebar: Lint Error (2:12:54)
    • End sidebar (2:16:50)
  • Back to the BackEndClient (2:33:51)
  • Debrief (2:44:23)

Source code

Visit the episode archive for more.

AI Chronicles #5: fetch() Wraps

In this episode...

It’s an eventful episode as we start off with a discussion of event sourcing, event-driven code, event storming, and more. Then we return to working on our fetch() wrapper. We factor our test-based prototype into a real production class, clean up the tests, and add TypeScript types.

Contents

  • Event Sourcing (0:22)
  • Event-Driven Code (11:11)
  • Event Storming (23:30)
  • The Original Sin of Software Scaling (27:23)
  • Refactoring Events (28:45)
  • Naming Conventions (32:47)
  • Java 21 (42:08)
  • Inappropriate Abstractions (44:25)
  • Let’s Do Some Coding (53:01)
  • Design the fetch() Wrapper (56:34)
  • Factor Out HttpClient (1:17:41)
  • Add TypeScript Types (1:23:40)
  • Node/TypeScript Incompatibility (1:39:52)
  • Clean Up the Tests (1:58:36)
  • Close SpyServer with Extreme Prejudice (2:05:06)
  • Back to Cleaning Up Tests (2:11:08)
  • Debrief (2:32:22)

Source code

Visit the episode archive for more.

AI Chronicles #4: fetch() Quest

In this episode...

It’s an “all rants, all the time” episode, at least for the first hour. We start out talking about the role of engineering leadership in a company. Then it’s a discussion of evolutionary design and the costs of change. Then teaching through pairing. Finally, we buckle down to work, and make solid progress on a prototype integration test for the front-end fetch() wrapper.

Contents

  • Engineering Leadership (0:13)
  • Where We Left Off (23:53)
  • What We’re Going To Do Today (28:05)
  • Evolutionary Design (37:42)
  • Teaching Through Pairing (53:16)
  • WTF: Wholesome Test Framework (57:36)
  • Create a Spy Server (1:01:48)
  • fetch() (1:16:25)
  • The Server Isn’t Closing (1:26:06)
  • Look At the fetch() Response (1:40:29)
  • Factor Out the Server (1:49:41)
  • Compilation Error (1:58:08)
  • SpyServer.lastRequest (2:16:23)
  • SpyServer.setResponse() (2:30:34)
  • Prepare for Production (2:41:29)
  • Debrief (2:47:50)

Source code

Visit the episode archive for more.

Last Chance to Sign Up for “Testing Without Mocks” Training

If you're interested in my Nullables testing technique, this is your last chance to sign up for my "Testing Without Mocks" course. Ticket sales close this Thursday morning at midnight GMT and I don't plan to offer it again until October at the earliest.

Learn more and sign up here.

AI Chronicles #3: Fail Faster

In this weekly livestream series, Ted M. Young and I build an AI-powered role-playing game using React, Spring Boot, and Nullables. And, of course, plenty of discussion about design, architecture, and effective programming practices.

Watch us live every Monday! For details, see the event page. For more episode recordings, see the episode archive.

In this episode...

In a coding-heavy episode, we wrap up our OpenAiClient wrapper. Along the way, a confusing test failure inspires us to make our code fail faster. Then, in the final half hour of the show, we implement a placeholder front-end web site and make plans to connect it to the back end.

Contents

  • Intrinsic vs. Extrinsic Motivation (0:14)
  • Coaching Teams (14:26)
  • OpenAiClient Recap (20:36)
  • Failing OpenAiClient Test (27:54)
  • Parse the OpenAI Response (38:55)
  • OpenAiResponseBody DTO (54:37)
  • OpenAI Documentation (1:04:43)
  • Self-Documenting OpenAI (1:10:27)
  • Back to Parsing the OpenAI Response (1:13:48)
  • A Confusing Test Failure (1:21:07)
  • Fail Faster (1:26:47)
  • Guard Clauses (1:54:38)
  • Manually Test OpenAiClient (2:00:47)
  • The DTO Testing Gap (2:14:35)
  • Our Next Story (2:19:43)
  • Front-End Walkthrough (2:25:47)
  • Placeholder Front-End (2:28:01)
  • Integrate (2:41:31)
  • Debrief (2:45:29)

Source code

Visit the episode archive for more.

AI Chronicles #2: Faster Builds

In this weekly livestream series, Ted M. Young and I build an AI-powered role-playing game using React, Spring Boot, and Nullables. And, of course, plenty of discussion about design, architecture, and effective programming practices.

Watch us live every Monday! For details, see the event page. For more episode recordings, see the episode archive.

In this episode...

It’s a two-fer! In the first half, we look at the work James did on speeding up the front-end build, including a questionable choice to use a custom test framework. We debug a problem with the incremental build and end up with a nice, speedy build.

In the second half, we continue working on the OpenAiClient wrapper. The code POSTs to the OpenAI service, but it doesn’t parse the responses. In order to implement that parsing, we modify JsonHttpClient to return POST responses and add the ability to configure those responses in our tests.

Contents

  • A Confession (0:11)
  • Buy vs. Build (12:42)
  • Incremental Compilation (24:47)
    • Sidebar: Why We Write Tests (45:26)
    • Sidebar: What Pairing is Good For (48:37)
    • End sidebar (51:08)
  • Clean Up the Build (54:10)
  • Failing the Build (58:32)
    • Sidebar: Using Booleans (1:05:50)
    • End sidebar (1:07:53)
  • Compilation Helper (1:22:08)
  • Bespoke Tooling (1:26:14)
  • Back to OpenAiClient (1:29:20)
    • Sidebar: TDD Reduces Stress (1:39:27)
    • End sidebar (1:40:23)
  • Reformatting and Merge Conflict (1:42:59)
  • Configuring the POST Response (1:52:28)
  • Refactor the ExampleDto (1:54:44)
  • Return Configured Values (2:02:35)
    • Sidebar: Repeating Yourself (2:03:27)
    • End sidebar (2:09:57)
  • Fine-Tuning the JsonHttpClient Tests (2:17:57)
    • Sidebar: Test Everything, or Just Enough? (2:24:56)
    • End sidebar (2:27:01)
    • Sidebar: When to Stop Pondering Design (2:33:31)
    • End sidebar (2:35:10)
  • Spring Complaints (2:36:03)
  • Frameworks vs. Libraries (2:40:50)
  • Microservices and Team Size (2:47:57)
  • Debrief (2:50:40)

Source code

Visit the episode archive for more.

The AI Chronicles #1

In this weekly livestream series, Ted M. Young and I build an AI-powered role-playing game using React, Spring Boot, and Nullables. And, of course, plenty of discussion about design, architecture, and effective programming practices.

Watch us live every Monday! For details, see the event page. For more episode recordings, see the episode archive.

In this episode...

Our new stream! We explain the goals of the project—to create an AI-powered role-playing game—then get to work. Our first task is to create a Nullable wrapper for the OpenAI service. The work goes smoothly, and by the end of the episode, we have an OpenAiClient that sends POST requests to the service.

Contents

  • About the Project (0:14)
  • What We’re Building (2:03)
  • Outside-In vs. Bottom-Up Design (14:17)
  • Structure of the Code (41:08)
  • Fake It Once You Make It (44:41)
  • Manual POST to OpenAI (47:47)
  • Start the OpenAiClient (1:01:24)
  • Import HttpClient (1:08:32)
  • Back to the OpenAIClient (1:21:19)
    • Sidebar: Configuration vs. Constants (1:23:04)
    • End sidebar (1:24:59)
  • Support HTTP headers (1:27:44)
    • Sidebar: Why Wrappers and Nullables? (1:43:32)
    • End sidebar (1:50:38)
    • Sidebar: Documenting APIs (2:03:50)
    • Sidebar: LLMs and Documentation (2:10:39)
    • End sidebar (2:13:07)
  • Tracking HTTP headers (2:17:02)
  • Finish the OpenAIClient POST (2:27:30)
    • Sidebar: New Hotness Syndrome (2:38:33)
    • Sidebar: TypeScript (2:41:07)
    • End sidebar (2:43:46)
  • Conclusion (2:46:44)

Source code

Visit the episode archive for more.

How Are Nullables Different From Mocks?

One of the most common questions I get about Nullables is, “How is that any different than a mock?” The short answer is that Nullables result in sociable, state-based tests, and mocks (and spies) result in solitary, interaction-based tests. This has two major benefits:

  1. Nullables catch bugs that mocks don’t.
  2. Nullables don’t break when you refactor.

Let’s dig deeper.

  1. Why They’re Different
  2. Nullables Catch More Bugs
  3. Nullables Don’t Break When You Refactor
  4. Conclusion

Why They’re Different

Imagine you have a class named HomePageController. It has a dependency, Rot13Client, that it uses to make calls to an external service. Rot13Client in turn depends on HttpClient to make the actual HTTP call to the service.

A class diagram for the example. HomePageController has an arrow pointing to Rot13Client, which has an arrow pointing to HttpClient. HttpClient has a jagged arrow pointing to Rot13Server.

Example design

A mock-based test of HomePageController will inject MockRot13Client in place of the real Rot13Client. It validates HomePageController by checking that the correct methods were called on the MockRot13Client.

The example design has been expanded with a test class pointing at HomePageController. The connection to Rot13Client has been x’d out and replaced with a connection to MockRot13Client. Rot13Client and all its dependencies are greyed out.

A mock-based test

This mock-based test is a “solitary, interaction-based test.” It’s solitary because the HomePageController is isolated from its real dependencies, and it’s interaction-based because the test checks how HomePageController interacts with its dependencies.
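To make that concrete, here’s a minimal sketch of what a solitary, interaction-based test could look like. Everything in it is an assumption for illustration: the article doesn’t show the real API, so the transform() and postText() methods, the Rot13 interface, and the hand-rolled mock are all hypothetical. It’s also synchronous, which a real HTTP client wouldn’t be.

```typescript
// Hypothetical interface; the real Rot13Client API isn't shown in this article.
interface Rot13 {
  transform(text: string): string;
}

class HomePageController {
  private readonly rot13Client: Rot13;

  constructor(rot13Client: Rot13) {
    this.rot13Client = rot13Client;
  }

  // Sends the submitted text to the ROT-13 service and returns the result.
  postText(text: string): string {
    return this.rot13Client.transform(text);
  }
}

// Hand-rolled mock: records every call and returns a canned answer.
// None of the real Rot13Client code runs.
class MockRot13Client implements Rot13 {
  readonly calls: string[] = [];

  transform(text: string): string {
    this.calls.push(text);
    return "canned response";
  }
}

// The "test": exercise the controller, then inspect the interaction.
function mockBasedTest(): string[] {
  const mock = new MockRot13Client();
  const controller = new HomePageController(mock);
  controller.postText("hello");
  return mock.calls;
}
```

Note what the test looks at: which method was called on the mock, and with what argument. It never looks at what the real Rot13Client would have done.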

In contrast, a Nullable-based test of HomePageController will inject a real Rot13Client. The Rot13Client will be “nulled”—it’s configured not to talk to external systems—but other than that, it’s the exact same code that runs in production. The test validates HomePageController by checking its state and return values.

The example design has been expanded with a test class pointing at HomePageController. There is no mock class; instead, HomePageController depends on Rot13Client, which depends on HttpClient. Each of these connections is marked “nulled.” The jagged connection between HttpClient and Rot13Service has been x’d out. Rot13Service is greyed out.

A Nullable-based test

This is a “sociable, state-based test.” It’s sociable because the HomePageController talks to its real dependencies, and they talk to their real dependencies, and so on, all the way to the edge of the system. It’s state-based because the test checks HomePageController’s state and return values, not its interactions.
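Here’s a rough sketch of the same idea in code. Treat it as an illustration under assumptions, not the actual implementation: the create() and createNull() factory names, the transform() and postText() methods, the URL, and the JSON shape are all hypothetical, and the code is synchronous to keep it short.

```typescript
type Send = (url: string, body: string) => string;

// Lowest-level wrapper. When nulled, an embedded stub replaces the network.
class HttpClient {
  private readonly send: Send;

  private constructor(send: Send) {
    this.send = send;
  }

  static create(): HttpClient {
    return new HttpClient(() => {
      throw new Error("real network call omitted from this sketch");
    });
  }

  static createNull(response: string): HttpClient {
    return new HttpClient(() => response);
  }

  post(url: string, body: string): string {
    return this.send(url, body);
  }
}

// Real production code. Nulling it just nulls the HttpClient underneath.
class Rot13Client {
  private readonly http: HttpClient;

  private constructor(http: HttpClient) {
    this.http = http;
  }

  static create(): Rot13Client {
    return new Rot13Client(HttpClient.create());
  }

  static createNull(response: string = "nulled response"): Rot13Client {
    const body = JSON.stringify({ transformed: response });
    return new Rot13Client(HttpClient.createNull(body));
  }

  transform(text: string): string {
    const raw = this.http.post("/rot13/transform", JSON.stringify({ text }));
    return JSON.parse(raw).transformed;
  }
}

class HomePageController {
  private readonly rot13Client: Rot13Client;

  constructor(rot13Client: Rot13Client) {
    this.rot13Client = rot13Client;
  }

  postText(text: string): string {
    return `<p>${this.rot13Client.transform(text)}</p>`;
  }
}

// The "test": run the controller against a nulled (but real) Rot13Client
// and check the value it returns.
function nullableBasedTest(): string {
  const controller = new HomePageController(Rot13Client.createNull("uryyb"));
  return controller.postText("hello");
}
```

The request building and response parsing in Rot13Client.transform() both run in this test. Only the call to the outside world is switched off.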

Nullables Catch More Bugs

Bugs tend to live in the boundaries. Imagine that someone intentionally changes the behavior of Rot13Client, not realizing that HomePageController relies on the old behavior. Now HomePageController doesn’t work properly. A well-meaning change to Rot13Client has introduced a bug in HomePageController.

Solitary tests, such as mock-based tests, can’t catch that bug. HomePageController’s tests don’t run the real Rot13Client, so they don’t see that the behavior is changed. The tests continue to pass, even though the code has a bug.

The “mock-based test” diagram has been annotated. It says, “A change here (Rot13Client) has an unexpected side effect here (HomePageController) and the mock (MockRot13Client) hides it. (Crying face emoji.)”

How mocks hide bugs

Sociable tests, including Nullable-based tests, do catch that bug. That’s because HomePageController’s tests run the real Rot13Client. When its behavior changes, so do the test results. The tests fail, revealing the bug.

The “Nullables-based test” diagram has been annotated. It says, “A change here (Rot13Client) has an unexpected side effect here (HomePageController) and it’s caught here (the test). (Celebration emoji.)”

How Nullables reveal bugs
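As a sketch of how this plays out, suppose the well-meaning change is that Rot13Client now rejects empty input instead of passing it through. That scenario is my own invention for illustration, as are the method names, the Rot13 interface, and the tiny passes() helper that stands in for a test runner.

```typescript
interface Rot13 {
  transform(text: string): string;
}

class Rot13Client implements Rot13 {
  private readonly send: (text: string) => string;

  private constructor(send: (text: string) => string) {
    this.send = send;
  }

  static createNull(response: string = "nulled response"): Rot13Client {
    return new Rot13Client(() => response);
  }

  transform(text: string): string {
    // The well-meaning change: empty input used to be passed through,
    // and now it's rejected.
    if (text === "") throw new Error("text must not be empty");
    return this.send(text);
  }
}

// The mock still behaves the old way, because nobody updated it.
class MockRot13Client implements Rot13 {
  transform(_text: string): string {
    return "canned response";
  }
}

class HomePageController {
  private readonly rot13: Rot13;

  constructor(rot13: Rot13) {
    this.rot13 = rot13;
  }

  // Relies on the OLD behavior: a blank form is sent straight through.
  postText(text: string): string {
    return this.rot13.transform(text);
  }
}

// Stand-in for a test runner: true if the test body doesn't throw.
function passes(test: () => void): boolean {
  try {
    test();
    return true;
  } catch {
    return false;
  }
}

// Same scenario (a blank form), two styles of test.
const mockTestPasses = passes(() => {
  new HomePageController(new MockRot13Client()).postText("");
});
const nullableTestPasses = passes(() => {
  new HomePageController(Rot13Client.createNull()).postText("");
});
```

Here mockTestPasses is true, so the bug stays hidden, and nullableTestPasses is false, so the bug is revealed. The only difference between the two tests is whether the real Rot13Client code runs.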

Nullables Don’t Break When You Refactor

Imagine that you need to change the Rot13Client API to support cancelling requests. You change its API, and when you do, you also update HomePageController to use the new API.

Interaction-based tests, such as mock-based tests, will break when you make this change. They’re expecting HomePageController to call the old API, and now it calls the new API.[1]

[1] Automated refactoring tools can prevent this problem, but not in every case.

The “mock-based test” diagram has been annotated. It says, “A design change here (Rot13Client) causes a failure here (the test) until the change is duplicated here (MockRot13Client).”

How mocks prevent refactoring

State-based tests, in contrast, won’t break when you refactor a dependency. The test checks the output of the HomePageController, not the methods it calls. As long as the code continues to return the correct value, the test will continue to pass.

The “mock-based test” diagram has been annotated. It says, “A design change here (Rot13Client) causes a failure here (the test) until the change is duplicated here (MockRot13Client).”

How Nullables support refactoring
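Here’s one way the cancellation change might look, sketched under assumptions. The startTransform() method, the TransformRequest handle, and the earlier transform() method it replaces are all hypothetical names, and the code is synchronous for brevity. The point is the test at the bottom: it only looks at what HomePageController returns, so it reads the same before and after the API change.

```typescript
// Hypothetical handle returned by the new, cancellable API.
class TransformRequest {
  private readonly response: string;
  private cancelled = false;

  constructor(response: string) {
    this.response = response;
  }

  cancel(): void {
    this.cancelled = true;
  }

  result(): string {
    if (this.cancelled) throw new Error("request was cancelled");
    return this.response;
  }
}

class Rot13Client {
  private readonly nulledResponse: string;

  private constructor(nulledResponse: string) {
    this.nulledResponse = nulledResponse;
  }

  static createNull(response: string = "nulled response"): Rot13Client {
    return new Rot13Client(response);
  }

  // Before the refactoring, this was `transform(text): string`.
  // A production client would start an HTTP request here (omitted).
  startTransform(_text: string): TransformRequest {
    return new TransformRequest(this.nulledResponse);
  }
}

class HomePageController {
  private readonly rot13Client: Rot13Client;

  constructor(rot13Client: Rot13Client) {
    this.rot13Client = rot13Client;
  }

  // Updated to use the new API. Its return value is unchanged.
  postText(text: string): string {
    const request = this.rot13Client.startTransform(text);
    return `<p>${request.result()}</p>`;
  }
}

// The state-based test mentions neither transform() nor startTransform(),
// so the API change doesn't touch it.
function stateBasedTest(): string {
  const controller = new HomePageController(Rot13Client.createNull("uryyb"));
  return controller.postText("hello");
}
```

A mock-based version of this test would have needed two updates: MockRot13Client would have to grow a startTransform() method, and the test’s expectations about which method was called would have to be rewritten.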

Conclusion

Although Nullables and mocks seem similar at first glance, they take opposite approaches to testing. Nullables are sociable and state-based; mocks are solitary and interaction-based. This allows Nullable-based tests to catch more bugs and support more refactorings.

For more resources related to Nullables and the “Testing Without Mocks” patterns, see the Nullables Hub.