James Shore: Integrating with Buggy Systems

Integrating with Buggy Systems

February 8, 2005

For a time, I hosted the Portland Software Development Roundtable. For a few of these sessions, I took notes and published them here.

The materials that follow are very rough outlines of our discussions: you won't find polished articles here. These outlines don't necessarily reflect my point of view or the point of view of all of the attendees.

The question I have for the round table is a naughty one because it deals with a problem that you're not supposed to run into, so I wonder whether there are any books or general guidelines for such things. First, some introductions:

For the past year and a half I have been tasked with writing an abstraction layer for a content management product that is buggy, has a flawed architecture, and has poorly documented APIs. I am now starting attempt number 3. My company will use this abstraction layer to do custom system integrations with the content manager. Few sample programs exist for the platform, and when they are not instructive, I:

Scan the trouble tickets on the vendor's website for resolution descriptions that explain some esoteric feature or give a code snippet.

Regularly resort to reverse engineering the APIs to figure out what the heck is going on under the hood so that I can proceed.

Other problems include generally poor server performance, which encourages paranoia about speed and lots of premature optimization, and APIs that don't check their parameters and regularly send scary/bad data to the database. I'd also add that there's very little third-party software for this thing and the out-of-the-box tools are horrible--but that is likely to begin changing late this year. Oh my God, not having tools makes things so painful. There are no backup/restore tools, no good repository client tools, the admin tool is cruddy looking... The vendor knows this and is going on an acquisition spree, buying up companies they think can flesh out their client tools story--but all these kooky new client tools just muddy the waters, and many of them are just plain inappropriate for customers' needs.

The good stuff is:

The content manager vendor is mighty, so there's a lot of selling power there and we want to ride that wave.

The product is one that no integrators know, and if we develop a practice around it, we will be the go-to guys and win lots more work. (My company needs to focus on one technology anyway, so this could be good for us.)

I am encouraged by how good the content manager's data model is. It is flexible and powers some very large sites.

The next version of the software is secretly a total rewrite (the APIs will not change), which is a good thing because the content manager has many fundamental flaws that cannot be patched over. Hopefully it will be faster and more reliable.

So, now that introductions are out of the way--when your vendor's software is cruddy and you need to integrate with it and expose all of its features, what should be the design goal(s) for an abstraction layer over that system? I've always believed that when integrating, you need to pick the top shortcoming of the system you're integrating with and build your solution primarily to address that shortcoming. In this case, though, I'm not so sure that's going to be good enough. Now that I've gone through this exercise twice already and seen the underbelly of the beast enough times, I have some new thoughts about how I should tackle this.

I'd like to discuss the particulars of this problem in some more detail at your round table and then perhaps get some insight on how I should best go about approaching my abstraction layer for this troubled content manager.

Attendees:

Rodney Bell
Vivien Chung
Ron Ellis Gaut
Karen King
Brian Knowles
Diana Larsen
Kieth Lofstrom
Jeff Olfert
Jim Shore
Jim Tyhurst
Dave Woldrich

Notes from our discussion:

Start with a positive attitude

You have to work with the product
Write down your thoughts, with the bad ones off to the side
You have an "insurmountable opportunity"

An abstraction layer is the obvious solution

"There's no problem that can't be solved with enough layers of indirection"
Abstract to the API?
Abstract to the customer's needs?

A story about a very similar scenario.

We had an old, crufty content management system. We found the simplest way into the system, which was the COM interface. Unfortunately, it was buggy. We had a lot of memory leaks, among other things. We created a thin layer over the COM API (eventually, we used another API as well) that was an exact duplicate of the API, except that it took care of lifetime management itself. Then we put another layer on top to provide some more services, like data transformation. Then, finally, we put our real abstraction layer on top of that, which presented a nice API for our system to use.
We put unit tests on all sides of the underlying layers to make sure they worked properly.
We weren't worried about performance, we were worried about reliability.
In the future, we will be finding a way to autogenerate the thin layer.
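The layering in this story could be sketched roughly as follows. This is only an illustration, not the code from the story: the original used COM, while this sketch uses plain Python, and every name in it (VendorHandle, ThinLayer, TransformLayer, ContentRepository, and the "TITLE=..." record format) is a hypothetical stand-in.

```python
class VendorHandle:
    """Hypothetical stand-in for a vendor object that leaks unless released."""

    open_count = 0  # number of handles currently not released

    def __init__(self, doc_id):
        self.doc_id = doc_id
        VendorHandle.open_count += 1

    def read_raw(self):
        # Made-up raw record format, padded with junk whitespace.
        return f"  TITLE={self.doc_id.upper()}  "

    def release(self):
        VendorHandle.open_count -= 1


class ThinLayer:
    """Layer 1: mirrors the vendor call, but owns the handle's lifetime."""

    def read_raw(self, doc_id):
        handle = VendorHandle(doc_id)
        try:
            return handle.read_raw()
        finally:
            handle.release()  # released even if the read raises


class TransformLayer:
    """Layer 2: extra services such as data transformation."""

    def __init__(self, thin):
        self._thin = thin

    def read_fields(self, doc_id):
        raw = self._thin.read_raw(doc_id).strip()
        key, _, value = raw.partition("=")
        return {key.lower(): value}


class ContentRepository:
    """Layer 3: the nice API that the rest of the system uses."""

    def __init__(self, transform):
        self._transform = transform

    def title_of(self, doc_id):
        return self._transform.read_fields(doc_id)["title"]


repo = ContentRepository(TransformLayer(ThinLayer()))
title = repo.title_of("home")
```

Because each layer does one job, each can be unit tested on its own. For example, a test of the thin layer only needs to check that no handles remain open after a call.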

What about handling exceptions?

Throw low, catch high
Catch exceptions as soon as they can be fixed
Simplify exceptions (exception translation)
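One way these three points might fit together, as a rough sketch. Everything here is assumed for illustration: the error classes, the FlakyVendor stand-in, and the idea that code 503 means "transient, worth retrying" are not taken from any real product.

```python
class VendorError(Exception):
    """Hypothetical low-level error raised by the vendor API."""

    def __init__(self, code):
        super().__init__(f"vendor error code {code}")
        self.code = code


class ContentUnavailable(Exception):
    """The one simplified error the abstraction layer exposes."""


TRANSIENT = 503  # assumed code for a failure that a retry can fix


class FlakyVendor:
    """Stand-in vendor API that fails with the given codes, in order."""

    def __init__(self, failures):
        self._failures = list(failures)

    def fetch(self, doc_id):
        if self._failures:
            raise VendorError(self._failures.pop(0))  # throw low
        return f"content of {doc_id}"


def fetch_content(vendor, doc_id, retries=2):
    """Middle layer: fix what can be fixed here, translate the rest."""
    for attempt in range(retries + 1):
        try:
            return vendor.fetch(doc_id)
        except VendorError as err:
            if err.code == TRANSIENT and attempt < retries:
                continue  # fixable at this level, so handle it at this level
            # Exception translation: callers see one simple error type,
            # and the original error stays attached as __cause__.
            raise ContentUnavailable(f"could not load {doc_id!r}") from err


def render_page(vendor, doc_id):
    """Top layer: catch high, where a sensible fallback exists."""
    try:
        return fetch_content(vendor, doc_id)
    except ContentUnavailable:
        return "Sorry, this page is temporarily unavailable."
```

The middle layer is the only place that knows about vendor error codes, so the rest of the system has a single exception type to deal with.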

Unknown issues

How can I unit test when I don't know all the issues?
"Special log" to dump every bit of data when something unexpected occurs.
How do you reproduce and squash these special cases?
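A "special log" could look something like the sketch below, which wraps a vendor call and dumps the call name, arguments, and result or error whenever something unexpected happens. The logger name, the checked_call helper, and its is_expected parameter are all hypothetical. It is built on Python's standard logging module.

```python
import json
import logging

# Dedicated logger so the dumps can go to their own file (name is assumed).
special_log = logging.getLogger("vendor.special")


def checked_call(func, *args, is_expected=lambda result: True, **kwargs):
    """Call func; dump everything we know if it raises or returns junk."""
    record = {
        "call": func.__name__,
        "args": repr(args),
        "kwargs": repr(kwargs),
    }
    try:
        result = func(*args, **kwargs)
    except Exception as err:
        record["error"] = repr(err)
        # exc_info=True appends the full traceback to the log entry.
        special_log.error("%s", json.dumps(record), exc_info=True)
        raise
    if not is_expected(result):
        record["result"] = repr(result)
        special_log.warning("%s", json.dumps(record))
    return result
```

Each dumped record holds the inputs that triggered the surprise, so it can serve as the starting point for a unit test that reproduces the case before it gets squashed.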

That's a problem when the up-front design and schedule don't include these sorts of unexpected problems.