Test-Driven Javascript

Last year, I founded a start-up with Arlo Belshee and Kim Wallmark. It didn't go anywhere, but one legacy of that project was some interesting solutions to the problem of using test-driven development (TDD) with Javascript code.

We weren't the first ones to solve this particular problem, but the question of how to test Javascript comes up often enough that I thought it would be interesting to describe our approach here.

Our techniques were far from perfect. We were new to Javascript and AJAX, solved just the problems we needed to and no more, and only worked on it for about a month. Our ideas are likely obvious to anyone who's solved this problem before, and I'm sure the code would be cleaner if we had worked on it longer. I doubt we came up with the best answers. So use these ideas as a starting point, not as gospel.

Our approach can be divided into three parts:

  • Run Your Tests in the Browser
  • Automate Cross-Browser Testing
  • Isolate the Client From the Server

Part I: Run Your Tests in the Browser

Early approaches to TDD'ing Javascript involved running command-line Javascript interpreters. That's nice for automation, but the majority of Javascript code interacts with the browser DOM. That's where the bugs are too. A tool could simulate the DOM, but every browser is slightly different and making a truly bug-for-bug compatible DOM simulator isn't really feasible.

Running tests directly in the browser avoids that can of worms. We used QUnit for the task; to make it work, you provide a boilerplate HTML page and your tests, then load that page in a browser to run them.
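
To give a flavor of the API, here's a minimal sketch of what a test looked like with the QUnit of that era. This isn't from our code base; test(), ok(), and equals() are the real QUnit functions, and the rest is invented for illustration.

// A minimal, invented example--not our actual tests. test(), ok(), and
// equals() are QUnit's assertion API of the time.
module("Examples");

test('ok() and equals() are the basic assertions', function() {
  ok(true, 'ok() passes when its argument is truthy');
  equals(2 + 2, 4, 'equals() compares an actual value to an expected one');
});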

QUnit wasn't actually that good: the UI was primitive and the API minimal at best. So I'm not recommending it. I think we chose it because it was the testing tool used by the excellent JQuery library's developers. If I were to choose a TDD tool today, I'd look at alternatives.

Still, QUnit worked well enough. The biggest problem was that it required us to duplicate the HTML of the page under test. It would have been better to operate directly against the page under test. I thought about seeing if an iframe would allow that to work, but never gave it a serious look. There might be security restrictions that prevent that approach.
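
For what it's worth, here's an untried sketch of how the iframe idea might have looked. It assumes the page under test is served from the same origin as the test page (otherwise the browser blocks access to the frame's document), and the URL and selector are made up.

// Untried sketch of the iframe idea, not working code from our project.
// stop()/start() are QUnit's old asynchronous-test API.
test('page under test loads inside an iframe', function() {
  stop();
  var frame = $('<iframe src="/page-under-test"></iframe>').appendTo('body');
  frame.load(function() {
    var pageDom = $(frame[0].contentWindow.document);
    ok(pageDom.find('#title').length > 0, 'page under test should contain a #title element');
    frame.remove();
    start();
  });
});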

Part II: Automate Cross-Browser Testing

We used Watir and FireWatir to automate our tests and perform cross-browser testing. I'm a big believer in fast, automated builds. Watir was perfect for our needs:

  • It runs a browser rather than simulating the DOM
  • It runs both IE (Watir) and Firefox (FireWatir) with the same API
  • It has a great API and plays well with Rake, my preferred build scripting language

Having an automated command-line build that ran all of our tests against multiple browsers allowed us to develop in Firefox, using the excellent Firebug plug-in, but still fully test our code on IE. We found some awesome1 incompatibilities between Firefox and IE this way. My favorite was when we discovered that the variable top (or something similar) worked fine in Firefox but was verboten in IE.

1"Awesome" as in "That was so awesome I must now gouge my brain out with a spatula! Yay!"

Crashes deserve special mention. I don't know if the problem was specific to QUnit or if this is just a Javascript problem, but it was entirely possible for the tests to fail--perhaps because of a missing semicolon--without us ever knowing about it. The QUnit test runner would just report fewer test runs. (The mind boggles.) We would usually notice that our test counts had dropped when running the tests manually while developing in Firefox, but not when the automated IE tests ran from the command line. We fixed the problem by comparing the total number of tests across the two browsers. If the counts didn't match, we knew there was a problem.

Here's our (Fire)Watir code for automated cross-browser testing. To use this, we called run_qunit from our rake build. (More about that in a previous essay.)

require 'watir'
require 'firewatir'

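# Run every QUnit test page in both IE and Firefox. test_page raises if a
# browser reports any failures; we raise here if the two browsers ran a
# different number of tests, which was our tell-tale sign that a script
# error had silently killed part of a test run.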
def run_qunit
   ieNum = run_on(Watir::IE.new, "IE")
   firefoxNum = run_on(FireWatir::Firefox.new, "Firefox")
  
   if(ieNum != firefoxNum)
     print "FAILED\n";
     raise "Test counts don't match (IE: " + ieNum.to_s + "; Firefox: " + firefoxNum.to_s + ")"
   end
end

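# Run every test page in the given browser and return the total number of
# tests executed, so run_qunit can compare the counts across browsers.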
def run_on(browser, name)
   printf "Testing %s... ", name
  
   numTests = test_page(browser, name, "home");
   numTests += test_page(browser, name, "pagewide");
   # etc
  
   print "ok\n"
   browser.close
   return numTests;
end

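# Load one QUnit test page and read the pass/fail counts out of the summary
# the QUnit runner writes into the page. Raise unless every test passed and
# at least one test actually ran.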
def test_page(browser, name, filename)
   browser.goto "http://localhost:8086/test/" + filename
  
   failures = Integer(browser.span(:class, "bad").text)
   numTests = Integer(browser.span(:class, "all").text)
  
   if (failures == 0 && numTests != 0)
     return numTests;
   else
     print "FAILED\n"
     raise name + " failed " + failures.to_s + " of " + numTests.to_s + " tests on " + filename
   end
end

I understand there are even better ways to do cross-browser Javascript testing now, but I haven't tried any of them. Again, use these ideas as a starting point for your own exploration.

Part III: Isolate the Client From the Server

Almost every Javascript test will end up triggering a call to the server if you're not careful. That's a problem, because it means you have to do expensive and difficult server-side test setup. Generally, you end up in the land of end-to-end tests, which are slow, brittle, and lead to a false sense of confidence. To prevent this problem, we isolated our client-side tests from the server-side code.

Our application was heavily AJAX-based. The meat of the program was just one web page that had a lot of Javascript and made a lot of calls to the server. In order to isolate our tests, we stubbed out the AJAX calls. We solved this one thanks to Arlo's code-foo and some tricky2 hacks. Using a function called check_for_ajax (and other related functions), we actually cancelled the JQuery-based AJAX call mid-stream and replaced it with our own handler.

2Some might say "nasty," but we ignore them. Pttthhbbbt.

These tools gave us the ability to stub out the server, but we also designed our code so that most tests didn't have to worry about it. Only the tests that were directly executing AJAX-related code needed to be isolated from the server, because the other tests didn't trigger server calls.
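
Most of our tests were ordinary QUnit tests with no stubbing at all. Here's a hedged, invented sketch of that kind of test--the function names aren't from our suite--wrapped in the ensure_no_ajax_happens helper shown further down just to prove the point:

// Invented example of a test that never goes near the server. select_token()
// and is_selected() are hypothetical client-side functions.
test('selecting a token is purely client-side', function() {
  ensure_no_ajax_happens(function() {
    var token = select_token(3, 4);
    ok(token.is_selected(), 'should select the token without calling the server');
  });
});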

Here's an example of a test using check_for_ajax. In this example, we were testing that a particular UI object (a "token") was marked as pending--which meant it would pulse using JQuery's animation capabilities--while the AJAX call was in progress:


test('token is marked "pending" while it is being sent to server', function() {
  var token;
  check_for_ajax(
    function(request) {
      // Runs the code under test
      token = battlemat.click(3, 4);
    },
    {
      before_send: function(request) {
        // Runs after AJAX call is made, but before (stubbed-out) HTTP call
        ok(token.is_pending(), 'should be pending while ajax call is in progress');
      },
      populate_xhr: function(xhr) {
        // The return value from the AJAX call
        xhr.status = 200;
      },
      after_response: function(request) {
        // Runs after the AJAX callbacks completed
        ok(!token.is_pending(), 'should not be pending after ajax call');
      }
    }
  );
});

Here's the code for check_for_ajax and the other functions we used for client-side isolation. Yes, we even had tests on our test code. It's one of the rare cases where I've done that--this code was so complicated and hard to write that we needed the tests just to get it to work!


function FakeXhr() {
  var self = this;
  self.headers = {};

  // standard xhr properties and methods
  self.status = undefined;   // typically set by a test's populate_xhr hook
  self.getResponseHeader = function(name) {
    return self.headers[name];
  };
}

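// Run function_that_performs_ajax(), intercept the jQuery AJAX request it
// makes via the global ajaxSend event, abort the real network call, and then
// invoke the request's success and complete callbacks by hand with a FakeXhr.
// The optional hooks (before_send, populate_xhr, after_response) let a test
// observe each stage.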
check_for_ajax = function(function_that_performs_ajax, hooks) {
  var send_happened = false;
  var request;
  var xhr = new FakeXhr();

  if(!hooks) { hooks = {}; }
  validate_hooks(hooks);

  $(document).ajaxSend(function(_, xhr_in, req) {
    send_happened = true;
    request = req;
    if (hooks.populate_xhr)
      hooks.populate_xhr(xhr);
    xhr_in.abort();
  });
  function_that_performs_ajax();
  $(document).unbind('ajaxSend');
  ok(send_happened, 'should call ajax');
  if (send_happened) {
    if(hooks.before_send)
      hooks.before_send(request);
    if(request.success)
      request.success("", "success");
    if(request.complete)
      request.complete(xhr, "success");
    if(hooks.after_response)
      hooks.after_response(request, xhr);
  }
};

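// Run f() with its AJAX call stubbed out, without installing any hooks.
// (check_for_ajax still asserts that f made an AJAX call.)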
function prevent_network_traffic(f) {
  check_for_ajax(f);
}

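// The inverse check: run the function and assert that it does NOT trigger
// any AJAX call, aborting the request if one slips through anyway.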
ensure_no_ajax_happens = function(function_that_performs_ajax) {
  var send_happened = false;

  $(document).ajaxSend(function(_, xhr_in, req) {
    send_happened = true;
    xhr_in.abort();
  });
  function_that_performs_ajax();
  $(document).unbind('ajaxSend');
  ok(!send_happened, 'should not call ajax');
};

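// Fail fast if a test passes a hook name check_for_ajax doesn't understand
// (usually a typo), rather than silently ignoring it.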
function validate_hooks(hooks) {
  var valid_hooks = ['before_send', 'populate_xhr', 'after_response'];
  for(var hook in hooks) {
    var found = false;
    for (var i = 0; i < valid_hooks.length; i++) {
      if(hook == valid_hooks[i]) found = true;
    }
    if (!found) ok(false, hook + ' is not a valid check_for_ajax hook');
  }
}

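// The tests for the AJAX-testing utilities themselves, run on document ready.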
$(function(){
  module("Testing AJAX-testing util");

  test('Works with a "complete" handler', function() {
    var was_called = false;
    check_for_ajax(
      function() {
        $.ajax({
          url: '/blah',
          type: 'POST',
          data: 'a=b',
          complete: function(xhr) {
            was_called = true;
          }
        });
      },
      {
        after_response: function() {
          ok(was_called, 'should have been called');
        }
      }
    );
  });

  test('We can set the response in a callback', function() {
    check_for_ajax(
      function() {
        $.post('/foo', '', function(req) {});
      },
      {
        populate_xhr: function(xhr) {
          xhr.headers['foo'] = 'bar';
        },
        after_response: function(response, xhr) {
          equals('bar', xhr.getResponseHeader('foo'), 'should have modified xhr');
        }
      }
    );
  });

  test('populate_xhr unbinds properly', function() {
    var num_calls = 0;
    check_for_ajax(
      function() {
        $.post('/foo', '', function() {});
      },
      {
        populate_xhr: function() {
          num_calls++;
        },
        after_response: function() {
          equals(1, num_calls, 'should be called the first time');
        }
      }
    );
    check_for_ajax(
      function() {
        $.post('/foo', '', function() {});
      },
      {
        after_response: function() {
          equals(1, num_calls, 'should not be called the second time');
        }
      }
    );
  });

  test("nested AJAX calls shouldn't interfere with each other", function() {
    check_for_ajax(
      function() {
        $.get('/foo');
      },
      {
        before_send: function(outer_req) {
          inner_ajax_happened = check_for_ajax(
            function() {
              $.get('/bar');
            },
            {
              before_send: function(req) {
                equals('/bar', req.url);
              },
              after_response: function(req) {
                equals('/bar', req.url);
              }
            }
          );
          equals('/foo', outer_req.url);
        },
        after_response: function(req) {
          equals('/foo', req.url);
        }
      }
    );
  });
});

Future: Problems We Didn't Solve

The biggest problem we didn't solve was visual: we had no automated way of checking that changes to our HTML didn't break the look of the app. CSS was a particular problem, because a small change to the CSS could break a completely different page from the one we were working on.

Visuals are hard because they change so frequently and they're so hard to test automatically. One idea I've kicked around is a pseudo-automated approach. The tests would automatically take screenshots of reference pages (but possibly not real pages, as they would change too often) and compare them to a known-good render. If nothing had changed, the test would pass. If the screen had changed, it would pop up a dialog showing the two screenshots and asking if the changes were okay.

Other problems to solve included testing pages directly rather than copying HTML into our test page, testing more browsers (including multiple versions of each browser), and stubbing out other server interactions in addition to AJAX, such as following links or posting forms.

At any rate, that's how we approached the problem of test-driven Javascript. None of it's rocket science. The real lesson here: if testing is important to you--and I hope it is--you can make it happen. Keep pushing. It will take a while to get working smoothly, but the lowered friction and increased productivity will be worth it.
