Discovering what's independent

Designer's notes #16
Øyvind Teig, Trondheim, Norway

Functional "factoring out" to stop the stream of new errors

It's crystal clear. To tick off a shopping list on the go, you would naturally take one shop first, then another. And within a shop you would try to pick up the goods as you pass them, in the right sequence. You would not, for every fifth thing you pick up in a shop, run to another shop to poll it for that item they expected to arrive "soon". You would do that afterwards.

For us programmers this is not crystal clear - or, rather: what's not so obvious is when it's not smart to poll the other shop that way. Maybe it was not on the list what should happen if the item was not in stock when you asked. That need was an add-on. Or, worse: maybe it was only later that you discovered that running across the street to poll was not the best solution. "Why didn't I see that before?"

Having been around in the embedded software business for so many years, it's amazing to see how many times orthogonal issues are not initially discovered as what they are: independent of each other.

I'll give you an example: a bit-banging serial line controller. This means that all the outgoing start bits, data bits and stop bits are controlled by a state machine. It needs to run at quite precise and never-failing intervals, at the bit rate frequency. Not too hard.
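
As a minimal sketch of such an outgoing state machine (not the code we actually used), assume 8 data bits sent least significant bit first, one stop bit, and a line that idles high with a low start bit. All the names (`tx_t`, `tx_put`, `tx_tick`) are hypothetical. `tx_tick` is meant to be called once per bit period, for instance from a timer interrupt, and returns the level to drive on the line:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical transmitter states; framing is an assumption:
   1 start bit (low), 8 data bits LSB first, 1 stop bit (high). */
typedef enum { TX_IDLE, TX_START, TX_DATA, TX_STOP } tx_state_t;

typedef struct {
    tx_state_t state;
    uint8_t    shift;      /* data bits still to be sent */
    uint8_t    bits_left;  /* number of data bits still to be sent */
} tx_t;

void tx_init(tx_t *tx)
{
    tx->state = TX_IDLE;
    tx->shift = 0;
    tx->bits_left = 0;
}

/* Queue one byte. Returns false if a byte is already on its way out. */
bool tx_put(tx_t *tx, uint8_t byte)
{
    if (tx->state != TX_IDLE) {
        return false;
    }
    tx->shift = byte;
    tx->bits_left = 8;
    tx->state = TX_START;
    return true;
}

/* Call exactly once per bit period.
   Returns the level to drive: true = high, false = low. */
bool tx_tick(tx_t *tx)
{
    bool level = true;                  /* idle and stop bit are high */

    switch (tx->state) {
    case TX_IDLE:
        break;
    case TX_START:
        level = false;                  /* start bit */
        tx->state = TX_DATA;
        break;
    case TX_DATA:
        level = (tx->shift & 1u) != 0;  /* current least significant bit */
        tx->shift >>= 1;
        if (--tx->bits_left == 0) {
            tx->state = TX_STOP;
        }
        break;
    case TX_STOP:
        tx->state = TX_IDLE;            /* stop bit, level stays high */
        break;
    }
    return level;
}
```

The caller writes the returned level to the output pin first thing in each tick, so that the jitter on the line does not depend on which branch of the state machine was taken.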

However, if we also need to read incoming messages, we have to treat that in another, almost independent state machine, running at a higher rate. Nyquist tells us that we now need to sample at no less than 2 times the bit rate, so one usually oversamples by 4 or more. More difficult, yes, but still not impossible. Generations of embedded programmers have done it before us.
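
The receiving side could be sketched like this, again only as an illustration under assumptions: 4 times oversampling, the same assumed framing (low start bit, 8 data bits LSB first, high stop bit), and hypothetical names throughout. The idea is to find the first low sample, step half a bit into the start bit, and from then on pick one sample in the middle of each bit:

```c
#include <stdbool.h>
#include <stdint.h>

#define OVERSAMPLE 4u   /* assumed number of samples per bit period */

typedef enum { RX_IDLE, RX_START, RX_DATA, RX_STOP } rx_state_t;

typedef struct {
    rx_state_t state;
    uint8_t    phase;   /* samples left until the next mid-bit sample */
    uint8_t    shift;   /* byte being assembled */
    uint8_t    nbits;   /* data bits received so far */
} rx_t;

void rx_init(rx_t *rx)
{
    rx->state = RX_IDLE;
    rx->phase = 0;
    rx->shift = 0;
    rx->nbits = 0;
}

/* Call at OVERSAMPLE times the bit rate with the sampled line level
   (true = high). Returns true when a complete byte was stored in *out. */
bool rx_sample(rx_t *rx, bool level, uint8_t *out)
{
    switch (rx->state) {
    case RX_IDLE:
        if (!level) {                       /* possible start bit */
            rx->phase = OVERSAMPLE / 2u;    /* aim for its middle */
            rx->state = RX_START;
        }
        break;
    case RX_START:
        if (--rx->phase == 0) {
            if (level) {
                rx->state = RX_IDLE;        /* too short: treat as a glitch */
            } else {
                rx->phase = OVERSAMPLE;
                rx->shift = 0;
                rx->nbits = 0;
                rx->state = RX_DATA;
            }
        }
        break;
    case RX_DATA:
        if (--rx->phase == 0) {
            /* LSB arrives first: shift right, insert new bit at the top */
            rx->shift = (uint8_t)((rx->shift >> 1) | (level ? 0x80u : 0u));
            rx->phase = OVERSAMPLE;
            if (++rx->nbits == 8) {
                rx->state = RX_STOP;
            }
        }
        break;
    case RX_STOP:
        if (--rx->phase == 0) {
            rx->state = RX_IDLE;
            if (level) {                    /* valid stop bit */
                *out = rx->shift;
                return true;
            }
            /* low stop bit: framing error, byte silently dropped here */
        }
        break;
    }
    return false;
}
```

A real receiver would probably report the framing error instead of dropping it, and might vote over several samples per bit. This sketch takes only one sample per bit to keep the state machine visible.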

Some years ago we bit-banged lines with a field bus protocol called Mark/Space, which was (and still is) in use at American oil tank farms. It was implemented with relays as early as the fifties, we were told.

I found a newer patent description (*1) describing this better than I am able to:

"The mark/space pulse code is implemented on a four-wire field interface. Two wires, B+ and B-, provide power to the encoder/transmitter, while the other two wires, Mark and Space, carry tank level information. Each of these lines is normally held at +48 VDC and dropped to 0 VDC to indicate a mark (on the Mark line) or a space (on the Space line). The interface is idle when both lines are high, and the interface is in a fault state when both lines are low. Mark/space communications must conform to pulse timing constraints to ensure reliability and accuracy in data communications."

With this kind of spec one certainly needs control of these lines in software. It is in cases like this, and others where it's not possible to use a standard USART/UART, that bit-banging comes into the picture.

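The four line conditions named in that quote can be written down as a small, stateless decoder. This is only a sketch of the level decoding as the patent text describes it. The names are hypothetical, the inputs are assumed to be already level-shifted to logic levels, and the pulse timing constraints are not modelled at all:

```c
#include <stdbool.h>

/* The four conditions of the Mark and Space line pair. */
typedef enum { MS_IDLE, MS_MARK, MS_SPACE, MS_FAULT } ms_state_t;

/* Inputs are line levels: true = high (+48 VDC), false = low (0 VDC). */
ms_state_t ms_decode(bool mark_high, bool space_high)
{
    if (mark_high && space_high) {
        return MS_IDLE;                 /* both lines high */
    }
    if (!mark_high && !space_high) {
        return MS_FAULT;                /* both lines low */
    }
    /* exactly one line is low: that line names the symbol */
    return mark_high ? MS_SPACE : MS_MARK;
}
```
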
Most software engineers these days would use a microcontroller with its hardware timers - often called "output compare" and "input capture". They are made for this sort of thing. (They may even be the best solution.)

Often one would need at least three interrupts to use them for something real. (Three of them would not make maximum interrupt latency any easier to find, should one need to relate to it.) So, there is the nice bit-banged UART (asynchronous) or USART (synchronous) - all in software. And it is bi-directional, able to send and receive over the same two lines, one direction at a time.

Most embedded systems these days also have some kind of operating system or run-time system. Many of these notes are about this - but not this one.

Now to the problem: with the nice sw-USART, how do you detect that the communication line is shorted? This could be significant for the safety of the system.

Most often a low line looks the same whether it is the start bit - a logical "1" - or a short. Our sw-USART is easily able to detect a shorted line, defined, say, as the line staying low for 35 ms.

The sw-USART state machine or machines detect start, data and stop bits. It is also able to detect collisions, when other units try to send while it is sending (we're past Mark/Space now). The low and shorted line is monitored by some kind of count of low-line samples, intermingled with the state machines. This gets complicated. Shorted line, what's this, why didn't your sw-USART detect it? Sorry. Fix. Works again. Returning question: another case. New fix. But now we're there...

Then we discover that 1.) shorted line and 2.) communication - are two independent concepts. Instead of having the sw-USART find out itself that it should start to count low samples, and then count while-it-still-may-be a legal char arriving or on its way out, it now just polls a bit to see whether the shorted line is already detected, by some other mechanism. 

That other mechanism is now bit-wise oversampling, handled in a timer interrupt that runs all the time. It is stateless with respect to bits and bytes on the line.
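
A minimal sketch of such a factored-out detector follows. The 35 ms limit is the example figure from above. The 250 µs sampling period, the names, and the choice to clear the flag on the first high sample are all assumptions made for the illustration. The detector only counts consecutive low samples and knows nothing about start, data or stop bits:

```c
#include <stdbool.h>
#include <stdint.h>

#define SAMPLE_PERIOD_US  250u     /* assumed timer interrupt period */
#define SHORT_LIMIT_US    35000u   /* 35 ms continuously low = shorted */
#define SHORT_LIMIT_TICKS (SHORT_LIMIT_US / SAMPLE_PERIOD_US)

static volatile bool line_shorted = false;  /* the bit the sw-USART polls */
static uint16_t low_count = 0;              /* consecutive low samples */

/* Call from the always-running sampling timer interrupt with the
   current line level (true = high). */
void short_detect_sample(bool line_high)
{
    if (line_high) {
        low_count = 0;
        line_shorted = false;   /* assumption: first high sample clears it */
    } else if (low_count < SHORT_LIMIT_TICKS) {
        if (++low_count == SHORT_LIMIT_TICKS) {
            line_shorted = true;
        }
    }
}

/* The only thing the communication state machines need to know. */
bool line_is_shorted(void)
{
    return line_shorted;
}
```

The sw-USART state machines would then call `line_is_shorted()` where they used to do their own counting, for example before starting to send a character. They no longer need to ask whether a long low period might still be part of a legal character.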

After having "factored out" this problem we would still see that there was code lurking in the old state machine that displayed the fact that these things had been conceived as a cloud in the code. But now, with one facet less to juggle, it was so much easier to concentrate on the right thing and remove the artefacts.

PS. It is probably just as easy to design state machines "in a cloud" with a graphical design tool, as with drawings, descriptions and coding. To factor out is a cognitive, non-computational process.

Aside: A personal tip about using the built-in output compare & input capture hw. Consider not using it! Often you may do the same with only one interrupt source and much simpler sw logic, with a single, oversampling timer. Worst-case interrupt time is probably shorter; however, the processor may spend less time idling, and therefore perhaps burn more battery. Still, that special hw really just represents an extra layer. And, of course, the oversampled (or whatever you are making) solution is much more portable.
5.March.08 (initial)

Other publications at