On embedded code optimization

I wouldn't have noticed this at all, but while debugging the prototype, I took a look at the SPI waveforms on a scope. The code does bit-bang SPI. A CS pin is brought low, then for each bit you set a data pin to the appropriate value, then cycle a clock pin high, then low. After all of that's done, you bring the CS pin back high.

What I saw (and I don't have pictures, unfortunately), was that the time between the clock pulses was not even. Each pulse had a little less time between it than the pulse before. In essence, the first one took ~10 times longer than the last.

Here was the original code:

PORT_MAX &= ~BIT_MAX_CLK; // force CLK low to start
PORT_MAX &= ~BIT_MAX_CS;
for(int i = 15; i >= 0; i--) {
  if (value & (1 << i))
    PORT_MAX |= BIT_MAX_DO;
  else
    PORT_MAX &= ~BIT_MAX_DO;
  PORT_MAX |= BIT_MAX_CLK;
  PORT_MAX &= ~BIT_MAX_CLK;
}
PORT_MAX |= BIT_MAX_CS;

What's not immediately obvious from the C source is that in assembly, the code to do "(1 << i)" is itself a loop. So the first bit had 15 bit shift operations to do in the middle, then 14, then 13 and so on.

Oops.

How about this instead:

PORT_MAX &= ~BIT_MAX_CLK; // force CLK low to start
PORT_MAX &= ~BIT_MAX_CS;
for(unsigned int mask = 1 << 15; i != 0; mask >>= 1) {
  if (value & mask)
    PORT_MAX |= BIT_MAX_DO;
  else
    PORT_MAX &= ~BIT_MAX_DO;
  PORT_MAX |= BIT_MAX_CLK;
  PORT_MAX &= ~BIT_MAX_CLK;
}
PORT_MAX |= BIT_MAX_CS;

Much better. The for loop actually creates the mask as a side effect of counting the loop iterations. When the bit "falls out" of the end of the word, the loop is done.

I tried this with my GPSDO, where I need to bit-bang SPI for a 24 bit word. There, it's slightly more complicated. In that case your mask value is an unsigned long. If that's the only change you make, you get a compiler warning that you're left shifting too far for the word size. The problem is that (1 << 15) is an int expression - (1 << 23) will overflow. You have to say (1L << 23) instead.

More firmware developments

Tenth hacking

Discussions

Become a Hackaday.io Member