Agility Isn’t For Everyone

29 04 2016

My post of less than 24 hours ago, The Only Legitimate Measure of Agility, has drawn some interesting comments.  (I mean real, face-to-face, vocal audio comments, not website or social media comments.)

For example, one fellow said to me, “What about a project to develop a new passenger jet?  What with all the safety concerns and government regulations and mountains of approvals such a thing has to go through, in addition to the fact that there’s no use putting a passenger jet into production before it can fly, you might not be able to release for ten years. However, you can still bring practices like TDD and demos and retrospectives to bear on such a project.  Your 1/w formula needs some kind of a scaling factor for projects like that.”

But no.

TDD and demos and retrospectives are practices, not agility.  Agility is frequently releasing to paying customers so as to get fast feedback to quickly fold back into the product and keep it relevant so that the money keeps rolling in–or to identify it rapidly as a bad idea and abandon it.

And you can’t do that with a passenger jet.  You can’t.  There’s no way.  Can’t be done. (…but see update below.)

There are plenty of projects in the industry today that could be made agile, either easily or with a bit of skull sweat, if the companies weren’t so huge and sluggish and shot through with enterprise corporate politics and perversity.  But the development of a new passenger jet isn’t one of them.

Therefore, the development of a new passenger jet can’t be agile.  (Or, to be more precise, if it takes ten years to develop, it can be at most 0.19% agile.)  That’s not a condemnation of the company or the business team or the developers; it’s a simple statement of fact.  If you want to be agile, then find (or start) a project that’s not developing a new passenger jet.
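
For the arithmetic behind that figure: assuming the 1/w formula from the previous post counts w in weeks between releases, ten years is roughly 52 × 10 ≈ 520 weeks, so 1/520 ≈ 0.0019, or about 0.19%.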

(Of course, once you have the jet, you might well be able to mount agile efforts to enhance and improve it.)

But while those practices aren’t the same as agility, they still bear talking about.  Where did those practices come from?  They came from agile teams who were desperately searching for ways to sustain their agility in the face of a high-volume cascade of production problems such as their waterfall predecessors never dreamed about.  They were invented and survived because they work.  They can produce close-knit product teams that churn out high-quality, dependable code very quickly.

And any project, agile or not, can benefit from a product team like that.  Their practices are good practices, and (when used correctly) should be commended wherever they appear, and encouraged wherever they don’t.

Agility isn’t for everyone, but good practices are…or should be.

 

UPDATE: I just thought of a way the development of a new passenger jet might be agilified.

Manufacturers frequently (always?) accept orders for newly-designed airplanes years before they go into production, and such orders come with a significant amount of deposit money attached. This is real money from real customers.

Perhaps simulators could be devised, well before any aluminum was extruded from any furnaces anywhere, to demonstrate the anticipated experiences of the passengers and the crew and the mechanics and the support personnel and so on, such that the real code under development running in these simulators would give a reasonably faithful rendition of anticipated reality.

Releasing to these simulators, then, might qualify as releasing to a kind of production, since good experiences would lead to more orders with deposits, and changes to the simulated experiences would produce definite feedback from real customers with real money at real risk.  You could come up with a pretty tight feedback loop if you did something like that…and probably put a serious competitive hurtin’ on sluggish corporate government contractors like Boeing or Lockheed-Martin who will dismiss it as video-game nonsense.

Maybe a stupid thought, but…a thought, at least.

 





TDD: What About Code You Need But Have No Test For Yet?

26 07 2015

I spend a lot of time teaching TDD to other developers.  Here’s a situation I run into a lot.

Given an opening test of this sort:

@Test
public void shouldReturnChangeIfPaymentIsSufficient () {
    int price = 1999;
    int payment = 2000;

    int result = subject.conductTransaction (price, payment);

    assertEquals (1, result);
}

my mentee will begin supporting it like this:

public int conductTransaction (int price, int payment) {
    if (payment < price) {

and I’ll stop him immediately.

“Whoa,” I’ll observe, “we don’t have a test for that yet.”

“Huh?” he says.

“We never write a line of production code that’s not demanded into existence by a failing test,” I’ll say. “That if statement you’re writing will have two sides, and we only have a test for one of those sides. Before we have that other test, we can’t write any code that steps outside the tests we do have.”

So I’ll erase his if statement and proceed like this:

public int conductTransaction (int price, int payment) {
    return payment - price;
}

We run the test, the test passes, and now we can write another test:

@Test
public void shouldThrowExceptionIfPaymentIsInsufficient () {
    int price = 1999;
    int payment = 1998;

    try {
        subject.conductTransaction (price, payment);
        fail ();
    }
    catch (IllegalArgumentException e) {
        assertEquals ("Payment of $19.98 is insufficient to cover $19.99 charge");
    }
}

Now we have justification to put that if statement in the production code.

Frequently, though, my mentee will be unsatisfied with this. “What if we get distracted and forget to add that second test?” he’ll ask. “We’ll have code that passes all its tests, but that is still incorrect and will probably fail silently.”

Until recently, the only response I could come up with was, “Well, that’s discipline, isn’t it? Are you a professional developer, or are you a hobbyist who forgets things?”

But that’s not really acceptable, because I’ve been a developer for four and a half decades now, and while I started out doing a pretty good job of forgetting things, I’m getting better and better at it as more of my hair turns gray.

So I started teaching a different path. Instead of leaving out the if statement until you (hopefully) get around to putting in a test for it, put it in right at the beginning, but instrument it so that it complains if you use it in an untested way:

public int conductTransaction (int price, int payment) {
    if (payment < price) {
        throw new UnsupportedOperationException ("Test-drive me!");
    }
    return payment - price;
}

Now you can wait as long as you like to put in that missing test, and the production code will remember for you that it’s missing, and will throw you an exception if you exercise it in a way that doesn’t have a test.

That’s better.  Much better.

But there are still problems.

First, “throw new UnsupportedOperationException ("Test-drive me!");” takes longer to type than I’d like.

Second, unless you want to string-search your code for UnsupportedOperationExceptions and carefully separate the legitimate ones from the ones that are calling for tests (error-prone), there’s no easy way to make sure you’ve remembered to write all the tests you need.

So now I go a step further.

Somewhere in most projects, in the production tree, is at least one Utils class that’s essentially just an uninstantiable bag of class methods.  In that class (or in one I create, if that class doesn’t exist yet) I put a method like this:

public static void TEST_DRIVE_ME () {
    throw new UnsupportedOperationException ("Test-drive me!");
}

Now, whenever I put in an if statement that has only one branch tested, I put a TEST_DRIVE_ME() call in the other branch.  Whenever I have to create an empty method so that a test will compile, instead of having it return null or 0 or false, I put a TEST_DRIVE_ME() in it.

Of course, in Java you still have to return something from a non-void method, but it’s just compiler candy, because it’ll never execute after a TEST_DRIVE_ME(). Some languages are different; for example, in Scala you can have TEST_DRIVE_ME() return Nothing (a subtype of every type) instead of Unit, Scala’s equivalent of void, which makes things even easier.
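
To make the pattern concrete, here’s a rough sketch; the Cashier class and its second method are invented for illustration, and the only thing taken from above is the TEST_DRIVE_ME() idea itself:

// Utils.java -- the uninstantiable bag of class methods
public class Utils {
    private Utils () {}

    public static void TEST_DRIVE_ME () {
        throw new UnsupportedOperationException ("Test-drive me!");
    }
}

// Cashier.java -- hypothetical production class using the pattern
public class Cashier {
    // Only the sufficient-payment branch has a test so far; the other
    // branch complains loudly if anything exercises it untested.
    public int conductTransaction (int price, int payment) {
        if (payment < price) {
            Utils.TEST_DRIVE_ME ();
        }
        return payment - price;
    }

    // Stubbed into existence so a test elsewhere will compile;
    // no test has demanded real behavior from it yet.
    public int refundTransaction (int price) {
        Utils.TEST_DRIVE_ME ();
        return 0; // compiler candy: never reached
    }
}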

Are you about to complain that I’m putting test code in the production tree, and that that’s forbidden?

Okay, fine, but wait to complain just a little longer, until after the next paragraph.

The really cool part is that when you think you’re done with your application, and you’re ready to release it, you go back to that bag-of-methods Utils class and just delete the TEST_DRIVE_ME() method. Now the compiler will unerringly find all your untested code and complain about it–whereupon you can Ctrl-Z the TEST_DRIVE_ME() back in and go finish your application, and then delete it again when you’re really done.

See? No test code in the production tree!





Spluttering With Indignation

26 03 2015

I got so angry at the late Sun Microsystems today that I could barely find words.  I’ll tell you why in a minute.

So…I’ve been working on some spike code for a system to generate webservice mocks (actually combination stub/spies, if I have my terminology correct) to use in testing clients.

Here’s the idea.

If you have the source code for an existing web service, or if you’re developing a new web service, you use @MockableService and @MockableOperation annotations to mark the classes and methods that you want mocks for.  Then you run a command-line utility that searches through your web service code for those annotations and generates source code for a mock version of your web service that has all the marked services and operations.
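
As a sketch of what the marked-up source might look like (the two annotation names are the ones described here; the annotation definitions, the service class, and its operation are all invented for illustration):

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Guessed definitions; the real ones presumably ship with the generator utility.
@Retention (RetentionPolicy.RUNTIME) @Target (ElementType.TYPE)
@interface MockableService {}

@Retention (RetentionPolicy.RUNTIME) @Target (ElementType.METHOD)
@interface MockableOperation {}

@MockableService
class ExampleService {

    // The command-line utility finds this marker and generates a mock
    // version of the operation, with the X-Mockability plumbing attached.
    @MockableOperation
    String handlePost (String requestBodyJson) {
        return "{}"; // real service logic would go here
    }
}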

Consider, for example, one of these @MockableOperations that responds to an HTTP POST request.  The example I used for my spike was hanging ornaments on a Christmas tree.  The URL of the POST request designates a limb on a tree, and the body of the request describes in JSON the ornaments to be added to that limb.  The body of the response contains JSON describing the resulting total ornamentation of the tree. The generated mock version of this operation can accept four different kinds of POST requests:

  1. X-Mockability: clear
  2. X-Mockability: prepare
  3. X-Mockability: report
  4. [real]

“X-Mockability” is a custom HTTP header I invented just for the purpose of clandestine out-of-band communications between the automated test on the front end and the generated mock on the back end, such communications to be completely unbeknownst to the code under test.

If the incoming POST request has X-Mockability of “prepare” (type 2 above), the generated mock knows it’s from the setup portion of the test, and the body of the request contains (encapsulated first in Base64 and then in JSON) one or more HTTP responses that the mock should remember and then respond with when it receives real requests with no X-Mockability header (type 4 above) for that operation.

If the incoming POST request has X-Mockability of “report” (type 3 above), the generated mock knows it’s from the assert portion of the test, and it will send back an HTTP response whose body contains (again, encapsulated in Base64 and JSON) a list of all the real (type 4) HTTP requests that operation has recently received from the code under test.

If the incoming POST request has X-Mockability of “clear” (type 1 above), the generated mock will forget all about the client sending the request: it will throw away all pending prepared responses and all pending recorded requests.  In general, this is the first request a test will make.

And, of course, as has been mentioned, if there is no X-Mockability header at all, the generated mock knows that the request is genuine, having originated from code under test, and it responds with the prepared response that is next in line, or–and this is key for what follows–a 499 response (I made up that status code) saying, “Hey, I’m just a mock and I’m not prepared for that request!” if there is no prepared response.
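
Putting those four kinds together, one test run against the mocked POST operation goes roughly like this (the URL is invented for illustration; the header values and the made-up 499 status are the ones described above):

  1. POST /tree/limb/7 with “X-Mockability: clear”, so the mock forgets any prepared responses or recorded requests for this client.
  2. POST /tree/limb/7 with “X-Mockability: prepare”, whose body carries the canned responses (Base64 wrapped in JSON).
  3. POST /tree/limb/7 with no X-Mockability header at all, sent by the code under test; the mock answers with the next canned response, or with a 499 if nothing is queued.
  4. POST /tree/limb/7 with “X-Mockability: report”; the mock answers with a list of the real requests it recorded (again Base64 wrapped in JSON).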

Pretty cool, right?  Can you see the problem yet? No?  Don’t feel bad; I hadn’t by this point either.  Let me go a little further.

I wrote a sample client that looked at a pre-decorated Christmas tree (using GET on the tree limb and expecting exactly the same kind of response that comes from a POST) and decided whether the ornamentation was Good, TopHeavy, BottomHeavy, or Uneven.  Then I wrote an automated test for the client, meant to run on a mock service. The test used “X-Mockability: prepare” to set up an ornamentation response from the GET operation that the code under test would judge to be BottomHeavy; then it triggered the code under test; then it asserted that the code under test had indeed judged the ornamentation to be BottomHeavy.

How about now?  Do you see it?  I didn’t either.

When I ran the test, it failed with my made-up 499 status code: the generated mock thought it wasn’t prepared for the GET request. Well, that was weird.  Not too surprising, though: my generated mock has to deal with the facts that A) it may be called from a variety of different network clients, and it has to prepare for and record from each of them separately; B) the webserver in which it runs may decide to create several instances of the mock to use in thread pooling, but it still has to be able to keep track of everything; and C) each of the @MockableOperations has to be administered separately: you don’t want to send a GET and have the response you’d prepared for a POST come back, and you don’t want to send a GET to one URL and receive the response you’d prepared for a GET to a different URL.

That’s a fair amount of complexity, and I figured I’d gotten the keying logic wrong somewhere, so that my preparations were ending up somewhere that the real request couldn’t find them.

So I put in a raftload of logging statements–which I probably should have had in from the beginning: after all, it’s for testing, right?–and tried it again.

Turns out that when the real request came in, the generated mock really truly honestly wasn’t prepared for a GET: instead, it was prepared for a POST to that URL.

Huh?  A POST?  But I prepared it for a GET, not a POST.  Honest.

I went and looked at the code, and logged the request method several times, from the initial preparation call in the test right up to Apache’s HttpClient, which is the library I was using to contact the server.  The HTTP method was GET all the way.

Me, I was just getting confuseder and confuseder.  How about you?  Have you beaten me to the conclusion?

The problem is that while this system works just fine for POST and PUT requests, there’s a little issue with GET, HEAD, and DELETE requests.

What’s the issue?  Well, GET, HEAD, and DELETE aren’t supposed to have bodies–just headers.  It’s part of the HTTP standard.  So sending an “X-Mockability: prepare” version of any of these three kinds of requests, with a list of canned responses in the body, involves stepping outside the standard a bit.

If you try using curl to send a GET with a body, it’ll be very cross with you.  If you tell SoapUI that you’re preparing to send a GET, it’ll gray out the place where you put in the body data.  If you already have data in there, it’ll disappear.  So I figured it was fair to anticipate some recalcitrance from Apache HttpClient, but this was more than recalcitrance: somehow, somewhere, my GET was turning into a POST.

I did some packet sniffing.  Sure enough, the server was calling the POST operation because it was getting a POST request over the wire.  Everything about that POST request was exactly like the GET request I wanted to send except for the HTTP method itself.

I tried tracing into HttpClient, but there’s a lot of complexity in there, and TDD has pretty much destroyed my skill with a debugger.

I didn’t need all the capabilities of HttpClient anyway, so I tossed it out and tried a naked HttpURLConnection instead. It took me a few minutes to get everything reassembled and the unit tests passing, but once I did, the integration test displayed exactly the same behavior again: my HttpURLConnection.getRequestMethod () was showing GET right up until the call to HttpURLConnection.getOutputStream () (which includes a call to actually send the request to the server), but the packet sniffer showed POST instead of GET.

HttpURLConnection is a little easier to step into than HttpClient, so I stepped in, and finally I found it, in sun.net.www.protocol.http.HttpURLConnection.  Here is the offending code, in a private method called from getOutputStream ():

    private synchronized OutputStream getOutputStream0() throws IOException {
        // ...
        if(this.method.equals("GET")) {
            this.method = "POST";
        }
        // ...
    }

See that?

Sun Microsystems has decided that if it looks like you want to add a body to a GET request, you probably meant to say POST instead of GET, so it helpfully corrects you behind the scenes.

It HELPFULLY CORRECTS YOU BEHIND THE FRICKIN’ SCENES.

Now, if it doesn’t want you putting a body in a GET request, it could throw an exception.

Does it throw an exception?

No.

It could simply fail to send the request at all.

Does it simply fail to send the request at all?

No.

It could refuse to accept a body for the GET, but give you access to a lower level of operation so that you can put the body in less conveniently and take more direct responsibility for the consequences.

Does it do that?

Noooo.

You can’t even reasonably subclass HttpURLConnection, because the instance is constructed by URL.openConnection () through a complicated service-provider interface of some sort.

This sort of hubris fills me with such rage that I can barely speak.  Sun presumes to decide that it can correct me?!  Even Apple isn’t this arrogant.

So I pulled out HttpURLConnection and used Socket directly, and I’ve got it working, sort of: very inconvenient, but the tests are green.
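
For the curious, the Socket-level version boils down to writing the request bytes yourself, so nothing can rewrite the method behind your back. A minimal sketch, assuming a local server; the class name, host, port, URL, and body are all placeholders, and real code would also read and parse the response:

import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class RawGetWithBody {
    public static void main (String[] args) throws Exception {
        String body = "{\"responses\": [\"...Base64...\"]}"; // placeholder prepare payload
        byte[] bodyBytes = body.getBytes (StandardCharsets.UTF_8);

        try (Socket socket = new Socket ("localhost", 8080)) {
            String headers =
                "GET /tree/limb/42 HTTP/1.1\r\n" +
                "Host: localhost:8080\r\n" +
                "X-Mockability: prepare\r\n" +
                "Content-Type: application/json\r\n" +
                "Content-Length: " + bodyBytes.length + "\r\n" +
                "Connection: close\r\n" +
                "\r\n";
            OutputStream out = socket.getOutputStream ();
            out.write (headers.getBytes (StandardCharsets.UTF_8));
            out.write (bodyBytes); // the body a GET isn't supposed to have
            out.flush ();
            // ... read the response from socket.getInputStream () ...
        }
    }
}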

Unfortunately, Sun isn’t the only place we see practices like this.  JavaScript’s == operator frequently does things you didn’t ask for, and we’ve all experienced Apple’s and Android’s autocorrect mechanism putting words in our…thumbs…for us.

But at least JavaScript has a === operator that behaves, and you can either turn autocorrect off, or double-check your text or tweet to make sure it says what you want before you send it. Sun doesn’t consider it necessary to give you a choice; it simply pre-empts your decision on a whim.

I guess the lesson for me–and perhaps for you too, if you haven’t run into something like this already–is: don’t correct your clients.  Tell them they’re wrong, or refuse to do what they ask, or let them choose lower-level access and increased responsibility; but don’t assume you know better than they do and do something they didn’t ask you to do instead of what they did ask you to do.

They might be me, and I might know where you live.





This Test Was an Ordeal to Write

13 07 2012

I had an interesting time at work today, writing what should have been a simple nothing test.  I thought somebody might be interested in the story.

My client’s application has an entity called a Dashboard.  Before my pass through this particular section of the code, a Dashboard was simply a Name and a set of Modules.

For one reason or another, there is a business-equals operator for Dashboard that returns true if the two Dashboards it’s comparing have the same Name and the same set of Modules, false otherwise.  Simple, standard stuff.

Part of my change involved adding a preview/thumbnail Icon to Dashboard, so that the user could see, in a listing of Dashboards, approximately what the Dashboard looked like before examining it in detail.  I looked at the business-equals method and decided that the new Icon field was irrelevant to it, so I left it alone.

After completing my change, I submitted my code for review, and one of the reviewers flagged the business-equals method.  “Doesn’t check for equality of Icon,” he said.

I explained to him the point above–that Icon is irrelevant to business-equals–and he made a very good argument that I hadn’t considered.

“That makes sense,” he said, “but I didn’t notice it at first, and maybe I’m not unique.  What if somebody comes through this code someday and notices what I noticed, and in the spirit of leaving the code better than he found it, he goes ahead and puts Icon in the business-equals method?  Then we’ll start getting false negatives in production.”

I could see myself in just that role, so I understood exactly where he was coming from.

“Better put a comment in there,” he suggested, “explaining that Icon’s been left out for a reason.”

A comment? I thought to myself.  Comments suck.  I’ll write a test, that’s what I’ll do.

I’ll write a test that uses business equals to compare two Dashboards with exactly the same Name and Module set, but with different Icons, and assert that the two are declared equal.  That way, if somebody adds an Icon equality check later, that test will fail, and upon examining the failure he’ll understand that Icon was left out on purpose.  Maybe somebody has now determined that Icons are relevant to business equals, and the failing test will be deleted, but at least the change won’t be an accidental one that results in unexpected false negatives.

So I did write a test; one that looked a lot like this (C++ code, under the Qt framework).

QString name ("Name");
QPixmap p (5, 10); // tall
QPixmap q (10, 5); // wide
ModuleSet modules;
Dashboard a (name, p, modules);
Dashboard b (name, q, modules);

bool result = (a == b);

EXPECT_TRUE (result); // Google Test assertion

Elementary, right?

Well, not really.  When I ran the test, Qt segfaulted on the QPixmap constructor.  I made various minor changes and scoured Google, but nothing I tried worked: at least in the testing environment, I was unable to construct a QPixmap with dimensions.

I discovered, however, that I could construct a QPixmap with no constructor parameters just fine; that didn’t cause a segfault.  But I couldn’t make p and q both blank QPixmaps like that, because then they’d be the same and the test wouldn’t prove anything: they have to be different.

So I thought, well, why not just mock the QPixmaps?  I have Google Mock here; I’ll just write a mock with an operator==() that always returns false; then I don’t have to worry about creating real QPixmaps.

Problem is, QPixmap doesn’t have an operator==() to mock.  If that notional future developer is going to add code to compare QPixmaps, he’s going to have to do it by comparing properties manually.  Which properties?  Dunno: I’m not him.

Scratch mocking.

Well, I looked over the constructor list for QPixmap and discovered that I could create a QPixmap from a QImage, and I could create a QImage from a pair of dimensions, some binary image data, and a Format specifier (for example, bilevel, grayscale, color).  So I wrote code to create a couple of bilevel 1×1 QImages, one consisting of a single white pixel and the other consisting of a single black pixel, and used those QImages to construct a couple of unequal QPixmaps.

Cool, right?

Nope: the constructor for the first QImage segfaulted.

Cue another round of code shuffling and Googling: no joy.

Well, I’m not licked yet, I figured.  If you create .png files, and put them in a suitable place, and write an XML file describing them and their location, and refer to the XML file from the proper point in the build process, then you can create QPixmaps with strings that designate the handles of those .png files, and they’ll be loaded for you.  I had done this in a number of places already in the production code, so I knew it worked.

It was a long, arduous process, though, to set all that up.

But I’m no slacker, so I put it all together, corrected a few oversights, and ran the test.

Success!

Success?  Really?  I was suspicious.  I ran it in the debugger and stepped through that QPixmap creation.

Turns out all that machinery didn’t work; but instead of segfaulting or throwing an exception, the QPixmap constructors created blank QPixmaps–just like the parameterless version of the constructor would have–that were identical to each other; so once more, the test didn’t prove anything.

More Googling and spiking and whining to my neighbor.

This time, my neighbor discovered that there’s an object called QApplication that does a lot of global initialization in its constructor.  You create a QApplication object somewhere (doesn’t seem to matter where), and suddenly a bunch of things work that didn’t work before.

Okay, fine, so my neighbor wrote a little spike program that created a QPixmap without a QApplication.  Bam: segfault.  He modified the spike to create a QApplication before creating the QPixmap, and presto: worked just fine.

So that was my problem, I figured.  I put code to create a QApplication in the test method and ran the test.

Segfault on the constructor for QApplication.

No big deal: I moved the creation of the QApplication to the SetUp() method of the test class.

Segfault on the constructor for QApplication.

Darn.  I moved it to the initializer list for the test class’s constructor.

Segfault on the constructor for QApplication.

I moved it out of the file, into the main() function that ran the tests.

Segfault on the constructor for QApplication.

By now it was after lunch, and the day was on its way to being spent.

But I wasn’t ready to give up.  This is C++, I said to myself: it has to be testable.

The business equals doesn’t ever touch Icon at all, and any code that does touch Icon should cause trouble.  That gave me an idea.

So I created two Dashboards with blank (that is, equal) Icons; then I did this to each of them:

memset (&dashboard.Icon, 0xFFFFFFFF, sizeof (dashboard.Icon));

See that?  The area of memory that contains the Icon field of dashboard, I’m overwriting with all 1 bits.  Then I verified that calling just about any method on that corrupted Icon field would produce a segfault.

There’s my test, right?  Those corrupted Icons don’t bother me, since I’m not bothering them, but if anybody changes Dashboard::operator==() to access the Icon field, that test will segfault.  Segfault isn’t nearly as nice as a formal assertion failure, but it’s a whole heck of a lot better than false negatives in production.

Well, it turned out to be close to working.  Problem was, once I was done with those corrupted QPixmaps and terminated the test, their destructors were called, and–of course, since they were completely corrupt–the destructors segfaulted.

Okay, fine, so I put a couple of calls at the end of my test to an outboard method that did this:

QPixmap exemplar;
memcpy (&dashboard.Icon, &exemplar, sizeof (dashboard.Icon));

Write the guts of a real Icon back over all the 1 bits when we’re finished with the corrupted Icon, and the destructor works just fine.

My reviewer didn’t like it, though: he pointed out that I was restoring the QPixmap from a completely different object, and while that might work okay right now on my machine with this version of Qt and this compiler, modifying any of those variables might change that.

So, okay, fine, I modified things so that corrupting each dashboard copied its Icon contents to a buffer first, from which they were then restored just before calling the destructor.  Still works great.

So: this is obviously not the right way to write the test, because Qt shouldn’t be segfaulting in my face at every turn.  Something’s wrong.

But it works!  The code is covered, and if somebody reflexively adds Icon to Dashboard’s business equals, a test will show him that it doesn’t belong there.

I felt like Qt had been flipping me the bird all day, receding tauntingly before me as I pursued it; but finally I was able to grab it by the throat and beat it into bloody submission anyway.  (Does that sound a little hostile?  Great: it felt a little hostile when I did it.)  I’m ashamed of the test, but proud of the victory.





So I’ve Got This AVR Emulator Partway Done…

2 07 2012

I decided to go with writing the emulator, for several reasons:

  1. It puts off the point at which I’ll start having to get serious about hardware.  Yes, I want to get into hardware, really I do, but it scares me a little: screw up software and you get an exception or a segfault; screw up hardware and you get smoke, and have to replace parts and maybe recharge the fire extinguisher.
  2. It’s going to teach me an awful lot about the microcontroller I’ll be using, if I have to write software to simulate it.
  3. It’s an excuse to write code in Scala rather than in C.

Whoa–what was that last point?  Scala?  Why would I want to write a low-level bit-twiddler in Scala?  Why not write it in a low-level language like C or C++?

If that’s your question, go read some stuff about Scala.  I can’t imagine anything that I’d rather write in C than Scala.

That said, I should point out that Scala’s support for unsigned values is–well, nonexistent.  Everything’s signed in Scala, and treating 0x94E3 as a positive number can get a little hairy if you put it in a Short.  So I wrote an UnsignedShort class to deal with that, including a couple of implicits to convert back and forth from Int, and then also an UnsignedByte class that worked out also to have a bunch of status-flag stuff in it for addition, subtraction, and exclusive-or.  (Maybe that should be somewhere else, somehow.)

Addition, subtraction, and exclusive-or?  Why just those?  Why no others?

Well, because of the way I’m proceeding.

The most important thing about an emulator, of course, is that it emulates: that is, that it acts exactly as the real part does, at least down to some small epsilon that is unavoidable because it is not in fact hardware.

So the first thing I did was to write a very simple C program in the trashy Arduino IDE that did nothing but increment a well-known memory location (0x500, if I remember correctly) by 1. I used the IDE to compile and upload the file to my ATmega2560 board, which thereupon–I assume–executed it. (Hard to tell: all it did was increment a memory location.)

Then I located the Intel .HEX format file that the IDE had sent over the wire, copied it into my project, and wrote a test to load that file into my as-yet-nonexistent emulator, set location 0x0500 to 42, run the loaded code in the emulator, and check that location 0x0500 now contained 43.  Simple, right?

Well, that test has been failing for a couple of weeks straight, now.  My mode of operation has been to run the test, look at the message (“Illegal instruction 0x940C at 0x0000FD”), use the AVR instruction set manual to figure out what instruction 0x940C is (it’s a JMP instruction), and implement that instruction.  Then I run the test again, and it works for every instruction it understands and blows up at the first one it doesn’t.  So I implement that one, and so forth.

Along the way, of course, there’s opportunity for all sorts of emergent design and refactoring.

At the moment, for instance, I’m stuck on the CALL instruction.  I have the CALL instruction tested and working just fine, but the story test fails because the stack pointer is zero–so pushing something onto the stack decrements it into negative numbers, which is a problem.  Why is the stack pointer zero?  Well, because the AVR architecture doesn’t have a specific instruction to load the stack pointer with an initial value, but it does expose the SP register as a sequence of input/output ports.  Write a byte to the correct output port, and it’ll appear as part of the two-byte stack pointer.

So I have the OUT instruction implemented, but so far (or at least until this morning), all the available input and output ports were just abstract concepts, unless you specifically connected InputRecordings or OutputRecorders to them.  The instructions to initialize the stack pointer are there in the .HEX file, but the numbers they’re writing to the I/O ports that refer to the stack pointer are vanishing into the bit bucket.  That connection between the I/O ports and the stack pointer (and an earlier one between the I/O ports (specifically 0x3F) and the status register) is more than an abstract concept, so I’m working on the idea of a Peripheral, something with special functionality that can be listed in a configuration file and hooked up during initialization to provide services like that.

Eventually, though, I’ll get the simple add application running in the emulator, by implementing all the instructions it needs and thereby all the infrastructure elements they need.

Then I plan a period for refactoring and projectization (right now the only way to build the project is in Eclipse, and you’ll find yourself missing the ScalaTest jars if you try).

After that, I want to write a fairly simple framework in C, to run on the Arduino, that will communicate with running tests on my desktop machine over the USB cable.  This framework will allow me to do two things from my tests: submit a few bytes of code to be executed and execute them; and ask questions about registers and I/O ports and SRAM and flash, and get answers to them.

Once I get that working on the Arduino (it’s going to be interesting without being able to unit-test it), I’ll grab the .HEX file for it, load it into my emulator, and go back to implementing more instructions until it runs.

After that, I’ll be able to write comparison tests for all the instructions I’ve implemented that far: use an instruction in several ways on the Arduino and make sure it operates as expected, then connect to the emulator instead of the Arduino and run the same test again.  With judiciously chosen tests, that ought to ensure to whatever small value of epsilon I’m willing to drive it to, that the emulator matches the part.

Anyway, that’s what I’ve been doing with my spare time recently, in the absence of insufficiently-Agile people to pester.





Travails of a Newly-Embedded TDDer

17 06 2012

So, as someone who has done a lot of building and repair of things strictly electrical, but who has done practically nothing having to do with analog or digital electronic hardware, I’ve taken delivery of an Arduino microcontroller board and have been playing around with it.  I’ve got a fairly decent breadboard, a set of wire jumpers, and a mess of resistors, capacitors, and LEDs.   I’ve put together several meaningless toy “devices,” ranging from a program that just repeatedly blinked “Shave And a Hair-cut, Two Bits” on an on-board LED to one that sequenced green outboard LEDs until you pulled a particular pin low, at which point it would sequence red outboard LEDs until you let the pin go, at which point it would go back to sequencing the green LEDs.

But as a TDDer, I’m beginning to feel really, really crowded by the difficulty of test-driving embedded microcontroller code.  I just happened to find a bug in one of my programs that crept in because I was refactoring without tests; if I had not happened across it, there’s no reliable way I could have found it without tests.

So I’m thinking: If I could wish into existence whatever I wanted to support my Arduino development, what would I wish?

At first I thought, hey, this isn’t so hard.  Arduino provides me with a C library full of API calls like pinMode(), that sets a pin to input or output mode, digitalWrite(), that sets an output pin high or low, analogRead(), that reads a value from one of the ADC pins, and so on.  All I have to do is write an instrumented static-mock version of that library.  When I want to run tests, I can compile the microcontroller code for my development machine instead of for the microcontroller, link in my static mock instead of the Arduino library, and run tests with Google Test.  I can set up the static mock so that I can attach prerecorded “analog” signals (essentially sequences of floating-point numbers) to particular input pins, and attach “recorder” objects to particular output pins.  In the Assert portion of the tests, I can examine the recordings and make sure they recorded what should have happened.

But then I thought about timing.  Timing’s real important to microcontrollers.  For example, suppose I wanted to test-drive handling of the situation when an interrupt is received while a value is being written to EEPROM.  How am I going to arrange the input signal on the interrupt pin so that it changes state exactly at the right time during the run of the code under test?  Without access to the hardware on which that code is running, how do I even simulate an interrupt at all?

Now I’m thinking I need to write a complete software emulator of the microcontroller–keeping the idea of prerecording input signals and recording output signals during the test, and having asserts to compare them, but with debugger-like awareness of just what’s executing in the test: for example, you could have prerecorded signals where the numbers weren’t just numbers, but could be tagged with clock cycle counts or time delays, or even with something like “Fire this sample the third time execution passes address XXXX with memory location XXXX set to XXXX.”

There is at least one software emulator in existence for my microcontroller–the AVR ATmega2560–but it doesn’t offer the kind of control that a TDDer (this one, anyway) would want.

But I’m thinking there’s got to be an easier way.





How Do You Measure Position on a Fingerboard?

17 06 2012

My first idea was just to put long ribbon controllers (like laptop touchpads, but long and skinny and sensitive only to one dimension) under the strings.  But there are two problems with that.  First, nobody seems to make ribbon controllers any longer than about 100mm.  Second, my bass strings can, over time, wear grooves in even ultra-hard ebony fingerboards.  I’d expect plastic ribbon controllers to last maybe ten minutes.

My second idea was to put a little piezoelectric transducer on the bridge end of the string and have it fire periodic compression pulses down the string, listen for the reflection from the point where the string is held down, and compute position from the delay.  But there are three problems with that.  First, it’s going to be difficult to damp those pulses out at the bridge end; most likely, they’ll reflect off each end a couple of times before they disappear into the noise floor.  How am I going to tell the reflection of the pulse I just sent from the continuing echoes of its predecessors?  Second, holding a string against a fingerboard definitely stops transverse waves; but I have no reason for believing that it would cause a clean echo of a compression wave.  My instinct is that there’d be a very weak reflection from the fingered position followed by a nice sharp one from the other end of the string, so I’d have to find the weak, fuzzy one among all the sharp, clear ones.  Third, a compression pulse travels at about 20,000 feet per second down a steel string.  If I want, say, 1mm resolution in positioning, then even if the other two problems evaporate I’m going to have to be able to measure the delay with a granularity of 164 nanoseconds, which isn’t particularly realistic for the sort of microcontroller hardware one expects to find in a MIDI controller.
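
(For the record on that last number: 20,000 feet per second is about 6.1 kilometers per second, and 1mm ÷ 6.1km/s works out to roughly 164 nanoseconds.)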

My third idea was to make the fingerboard of some hard, conductive metal and the strings of resistance wire–that is, wire that has a significant resistance per unit length, unlike copper or aluminum.  Then you could use an analog-to-digital converter to measure the resistance between the fingerboard and the bridge end of each string, and you’d get either infinity–if the string wasn’t being fingered at all–or a value roughly proportional to the distance of the fingered point from the bridge.

There are a whole host of problems with this third idea.

Resistance is affected by more than where the string is fingered.  For example, a small contact patch (light pressure) will have more resistance than a large contact patch (heavy pressure).  A place on the string or fingerboard with more oxidation will have more resistance than a place with less oxidation.  But…stainless steel doesn’t oxidize (much), and neither does nichrome resistance wire, if you’re not using it for a heating element.

Nichrome resistance wire doesn’t have as much resistance as one might imagine.  In one diameter that’s roughly the size of a middling guitar string, its resistance is about one third of an ohm per foot.  I can crank my ADC’s reference voltage down as far as 1.0V, but if I run 1.0V through about eight inches of wire (what I expect to be left over at the end of the fingerboard once a string is fingered in the highest position), Ohm’s Law says I’ll get 4.5 amps of current, which will burn my poor little ADC to a smoking crisp, as well as frying the battery and even heating up the strings significantly.  But…I could limit the current to one milliamp and use an operational amplifier with a gain of 1000 or so to get a low-current variable voltage between 0 and 1V for the ADC.
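
(Working that arithmetic out: eight inches is about 0.67 feet; at a third of an ohm per foot that’s roughly 0.22 ohms, and 1.0V across 0.22Ω gives about 4.5 amps.)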

A/D converters don’t measure resistance, they measure voltage.  Sure, it’s easy to convert resistance into voltage; but it’s easy to convert all sorts of things into voltage, and much harder than you would think to avoid converting them–things like electromagnetic interference from elevators, power lines, air conditioners, and so on.  The signal arriving at the ADC is liable to be pretty noisy, especially given the required amplification.  But…I could add another string on the neck and put it somewhere where it couldn’t be played.  That string would be subject to all the same noise and interference as the one being played.  The only difference would be that it wouldn’t be being played; so you could use the differential mode of the op amp to subtract one from the other and cancel out the common stuff.

One idea I had to get rid of the amplifier, and all the noise it would amplify, would be to ditch the 16ga nichrome wire (about the size of a guitar string) and use 40ga instead: very skinny, with a significantly higher resistance per foot.  Then coat it with insulating enamel, like magnet wire, and wrap it around a central support wire made of steel or bronze, so that it looked very much like a real guitar string.  Then sand off the enamel from the outside of the string so that the nichrome can contact the steel fingerboard.  In this case, our nichrome wire is much thinner, with more resistance per foot, and much longer, because it’s wound in a helix instead of being straight.  I ran the numbers and discovered that I could expect about 10,000 ohms per foot for a string like that: almost perfect for the ADC.

Then I remembered that the resistance of the human body is about 10,000 ohms: the resistance of the system would be significantly modified if anyone touched it without insulating gloves.  So…ditch that idea.  (A 10,000-ohm parallel resistance will have no perceptible influence on the 1-ohm resistor you get from a three-foot 16ga nichrome wire.)

So there are problems with this third approach too, but all of them at least appear to be surmountable.  It might be worth a try.