int32 multiplication = NOOOOOO!

A project log for drillPresseur - Drill-Press with Force-Feedback

OK, most drill-presses have "force-feedback" in the normal sense... so this is a bit ridiculous.

Eric Hertz, 12/07/2016 at 01:02 (7 comments)
#define MOTOR1_KP_SHIFT 0  //kP = 1
#define MOTOR2_KP_SHIFT 3  //kP = 8

void pwmify(motor_t *motor)
{
   //Assuming "power" is > 0 when moving toward desiredPos>0 from 0...
   int32_t power = (motor->desiredPos - motor->actualPos);

   //power *= (int32_t)(motor->kP); //!!!
   //This ONE multiplication takes about 100us!
   //Regardless of whether kP is 1 or 8
   //Despite kP being 'const'
   //And replacement reduces codesize by nearly 400B
   // (Probably due to _mult_i32() not being linked in?)
   //NOTE: The kP "multiplication" replacement
   //is now handled after the following

   //(Alternatively, I suppose, I could assume the desired/actual
   // difference should never be greater than an int8_t...?)

   if(power >= 0)
      motor->dir = 1;
   else
   {
      motor->dir = -1;
      //unsign it!
      power = -power;
   }

   uint32_t uPower = power;

   //The kP multiplication is now replaced with this:
   //NOTE: SIGNED-SHIFT (left-shifting a negative value) is undefined
   // per C's specs. So this needs to be AFTER the sign is removed.
   if(motor->num == 1)
      uPower <<= MOTOR1_KP_SHIFT;
   else
      uPower <<= MOTOR2_KP_SHIFT;

   //Saturate to the 8-bit PWM range
   if(uPower > 255)
      uPower = 255;

   motor->pwm = uPower;
}
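
For anyone who wants to try the shift-for-multiply idea off-target, here is a minimal sketch of the same steps (sign removal, power-of-two gain, saturation to 8 bits) as a standalone function. The name `scaled_pwm` and its parameters are hypothetical and not part of the project; it also clamps *before* shifting, which is my own variation so that a huge error can't shift bits off the top and wrap around. It assumes `shift` is less than 32.

```c
#include <stdint.h>

/* Hypothetical helper (not from the project): scale a position error by
 * 2^shift and saturate the magnitude to the 8-bit PWM range.
 * Assumes shift < 32. The direction (+1 or -1) is returned through *dir. */
static uint8_t scaled_pwm(int32_t error, uint8_t shift, int8_t *dir)
{
    uint32_t mag;

    if (error >= 0) {
        *dir = 1;
        mag = (uint32_t)error;
    } else {
        *dir = -1;
        /* Negate in unsigned arithmetic so INT32_MIN is handled too. */
        mag = 0u - (uint32_t)error;
    }

    /* (mag << shift) > 255 exactly when mag > (255 >> shift), so compare
     * first: nothing can be shifted out of the top and wrap around. */
    if (mag > (255u >> shift))
        return 255;

    return (uint8_t)(mag << shift);
}
```

With a shift of 3 (kP = 8), an error of 10 gives a PWM value of 80, and anything above 31 saturates at 255.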


Ted Yapo wrote 12/08/2016 at 20:17

Funny thing - on the XC8 PIC compiler - I dug around and found - guess what - the 32x32 multiply routines are written in C, not hand-optimized assembly code. I wonder if you are looking at a similar situation? I guess this makes sense because the PIC compiler targets so many different 8-bit targets, but still.

The upside is that I think I found somewhere I can optimize my code :-)
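
To give a feel for what a multiply routine written in plain C can look like, here is a generic shift-and-add sketch. It is only an illustration: it is not the actual XC8 source, nor whatever avr-gcc links in, and the name `mul32_shift_add` is made up. On an 8-bit core with no hardware multiplier, every 32-bit add and shift in the loop becomes several byte-wide instructions, and the loop can run up to 32 times.

```c
#include <stdint.h>

/* Illustrative shift-and-add multiply; NOT the real XC8 or avr-gcc
 * library routine. Returns the low 32 bits of a * b. */
static uint32_t mul32_shift_add(uint32_t a, uint32_t b)
{
    uint32_t result = 0;

    while (b != 0) {
        if (b & 1u)
            result += a;   /* add the shifted multiplicand for each set bit */
        a <<= 1;           /* a 4-byte shift on an 8-bit core */
        b >>= 1;           /* move on to the next multiplier bit */
    }
    return result;
}
```

When the multiplier is a known power of two, the whole loop collapses into a single shift, which is the substitution made in the log above.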

Eric Hertz wrote 12/09/2016 at 02:32

Interesting, indeed. I'll have to take a look at the output!

Yann Guidon / YGDES wrote 12/07/2016 at 02:36

Can you have motor->dir = 0 ?

Eric Hertz wrote 12/08/2016 at 14:08

Dun see why not...

I haven't looked at this guy's instruction-set recently, but as I recall AVRs have Branch if Equal (BREQ) as well as Branch if Greater or Equal (BRGE), so I think it should compile to the same number of instructions, using -1 instead of 0. But that's quite an assumption, on my part.

What'd you have in mind?


Re Profiling: I'm sure there's a fancy way to do something like that, but I'm just toggling an LED between different function-calls. I also haven't yet looked into the assembly-output... 

Is that what you meant by "profiling"?
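
For reference, the LED-toggle approach can be sketched roughly like this. The pin (PA0), the macro names, and the host fallback are assumptions for illustration, not the project's actual code. On the AVR the pin flips once before and once after the statement, so the pulse width on a scope is roughly that statement's run time; off-target, the fallback just counts the edges so the sketch still compiles.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
/* Assumed wiring: LED or scope probe on PA0, already set as an output
 * elsewhere (DDRA |= _BV(PA0)). */
#define PROFILE_TOGGLE() (PORTA ^= _BV(PA0))
#else
/* Host stand-in so the sketch builds off-target: count the edges. */
static volatile uint16_t profile_edges;
#define PROFILE_TOGGLE() (profile_edges++)
#endif

/* Hypothetical wrapper: one pin edge before the statement, one after. */
#define PROFILE(stmt) \
    do { PROFILE_TOGGLE(); stmt; PROFILE_TOGGLE(); } while (0)
```

Usage would be something like `PROFILE(pwmify(&motor1));`, where `motor1` stands in for whatever motor struct is being updated. The toggle itself costs a few cycles, so very short statements will read slightly long.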

Functionality-wise, I've been meaning to post an update... The feedback loop works as-expected, but I didn't expect a bunch of "real-world" effects that render this system a lot less-intuitive (usefulness-wise) than I'd hoped... So kinda back to the drawing-board.

Ted Yapo wrote 12/07/2016 at 01:09

400B for a mul32?  That's almost half your allotment :-)

But wait - what uC is this on?  I'm using mul32s on a mid-range PIC with the lousy free XC8, and they're not 400B.

Eric Hertz wrote 12/07/2016 at 01:28

Agree, this seems utterly shocking to me.

Consider GRBL, which uses *dozens* of such calculations in floating-point, no less.

This is an ATtiny861, I'm assuming it has no "mult" instruction... so maybe that's part of it.

OTOH, consider that I've used this same uC to do realtime 10-bit audio recording to an SD Card at 19.2KS/s, as I recall... Seems with all that overhead, *bit-banged* serial I/O, a bit-banged PS/2 keyboard, and an SPI LCD display... 

Shocking, to say the least, I'm getting only 17000 loops/second, after the mult-replacement, on this tiny project.

Yann Guidon / YGDES wrote 12/07/2016 at 02:38

Can you profile your code ?
