
xc32-gcc Optimizer Math Bug Fixed in v1.42

A project log for operation: Learn The MIPS (PIC32MX1xx/2xx/370)

Having been exclusive to a certain uC-line for over a decade, it's time to learn something new (and port commonCode!)... Enter MIPS

Eric Hertz, 06/20/2016 at 07:35

Alright! It's been 9 months since I submitted the bug, but it appears that the latest greatest xc32-gcc has Fixed It!

The ol' bug was described, experimented-with, and characterized in great-detail here: https://hackaday.io/project/6450-operation-learn-the-mips-pic32mx1xx2xx370/log/23899-optimizer-math-bug

Basically, what it boiled down to was that if you used -O1 (the highest optimization level available in the "free" xc32-gcc), then math errors would occur with uint8_t and int8_t. It seems that, on rare occasion (which I just happened to be lucky enough to encounter in my first project), the optimizer would treat an int8_t as though it were 32 bits wide and forget to sign-extend it into the remaining bits. It could've worked, if it had extended them correctly, but it didn't. So, e.g., I had something like:

int8_t direction = -1;
uint8_t power = 127;

int16_t signedPower = (int16_t)direction * (int16_t)power;

And, instead of getting -127, I was getting 32385. Yep, 32385 = 0xff * 127. And we all know that -1 is 0xff, when represented in an int8_t, right...?
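Here's a minimal sketch of that arithmetic in plain C. It assumes the faulty code path effectively zero-extended the int8_t instead of sign-extending it. That is only my reading of the symptom, not something confirmed from the compiler's output. Both function names are made up for illustration.

```c
#include <stdint.h>

/* What the C rules require: int8_t -1 sign-extends to -1 before the multiply. */
int16_t signed_power_expected(int8_t direction, uint8_t power)
{
    return (int16_t)((int16_t)direction * (int16_t)power);
}

/* Assumed model of the bug: the int8_t's raw bits (0xff) are widened
 * with zeros, so -1 is treated as 255 before the multiply. */
int16_t signed_power_zero_extended(int8_t direction, uint8_t power)
{
    return (int16_t)((uint8_t)direction * (int16_t)power);
}
```

With direction = -1 and power = 127, signed_power_expected() returns -127 and signed_power_zero_extended() returns 32385, which matches the two values above.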

Except, it didn't happen *all the time*... you'll have to look at the details of the link above to see *when* it happened, but it did happen, and when driving a motor, the difference between a power-level of -127 and 32385 could've caused quite a bit of finger-damage to the unwary.

Apparently (I tested it), this only occurred with the Linux version of xc32-gcc (v1.40)... I tried the exact same code under WinXP, and it worked fine.

So, I thought I got the ol' brush-off, because I never saw an update, and even looking through changelogs between v1.40 and v1.42, no mention of it... But I tried v1.42 anyways, and yep, it works.

Woot!

Apparently I inadvertently discovered another bug, as well... by having a bug within my own code... I used "\n" to end one line, and "\n\r" to end those thereafter... And, of course, I was outputting via serial-port, where a lone "\n" isn't enough to carriage-return back to column-zero as well...

In v1.40, it displayed as I'd intended... (despite my bug)

But in v1.42 it displays funky... "Almost as though" it's not returning to column-zero before starting the new-line...

Yep, I forgot a "\r" and they fixed not only the math-bug, but also the carriage-return "bug." Woot!


There may be more in a bit... I've yet to reimplement -O1 in a regular-ol' project to see how my code functions... logically, there should be one level of optimization greater than what it was before, so it might be a bit faster... OTOH, my workaround for the math-bug was to use -O0 and enable all the -f<options> I could find... so... we shall see.


Indeed, my "loop-count/second" has increased from around 32,000 to 80,000 by being able to use -O1 instead of all the -f<options> explicitly. And now my bit-banged UART works darn-near perfectly:

("Why would you use a bit-banged UART on a chip which has a built-in UART peripheral...?" that's another topic entirely...)

And here's a showing of the carriage-return bug-fix. The earlier lines shouldn't've been aligned as they are, I forgot "\r"... The later lines show "1:" shifted as it should be with merely a "\n", when sent via serial-port...

I had "\n" after the "loopNum" statement, but "\n\r" after all the other statements... The old version (v1.40) apparently automatically appended the "\r" (!?), but the new version (v1.42) doesn't, so it works as-expected per the coding, which had a bug in it.
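To see why the lone "\n" shifts the next line over, here's a rough sketch that tracks where the cursor ends up on a raw serial terminal. It assumes the terminal does no newline translation of its own, so "\n" only moves down a row and "\r" only returns to column zero. The cursor_t type, the function name, and the sample strings are hypothetical, just for illustration.

```c
/* Cursor position on an imagined raw terminal (no LF -> CR+LF translation). */
typedef struct {
    int row;
    int col;
} cursor_t;

/* Walk the string and report where the cursor lands afterwards. */
cursor_t raw_terminal_cursor(const char *s)
{
    cursor_t c = {0, 0};
    for (; *s != '\0'; ++s) {
        if (*s == '\n') {
            c.row++;        /* line feed: down one row, column unchanged */
        } else if (*s == '\r') {
            c.col = 0;      /* carriage return: back to column zero */
        } else {
            c.col++;        /* any other character advances the column */
        }
    }
    return c;
}
```

Feeding it "loopNum\n1:" leaves the cursor on row 1 at column 9 (the "1:" starts where "loopNum" left off), while "loopNum\n\r1:" leaves it on row 1 at column 2.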


Now, Microchip, I may have "lucked" into discovering the math-optimization bug, but it was actually a tremendous amount of effort, on my part, to determine how to reproduce the bug, how to present it in a way that didn't include the thousands upon thousands of lines of code that I discovered it in, etc... And you lucked into the fact that I happened to be willing to go to all that effort (hours, MANY MANY hours) to present it to you... submit it as a ticket, even work with your employee... when I'd already found a workaround for my own purposes using the "-f<options>" and disabling the optimizer altogether (via -O0)... That workaround worked fine for my needs; presenting it to you was a tremendous effort. Further, you happened, apparently, to "luck" into another bug-discovery, which I might've happened to reveal to yah...

So, yahknow, a bunch of free PIC32s and various other chips via the "free samples" service are nice, and the curiosity-board was pretty cool, but in reality, we're talking a sum-total of something like $50 I've gotten outta y'all. And if you only consider services you *don't* offer to *everyone* (again, free-samples), then only the curiosity-board, which was something like $20... And I mean, not to be rude, but upping my loop-count from 32k to 80k is pretty nice, but I could've accomplished the same (and more) by learning to move my code from FLASH to SRAM, if it was *that* important to me...

So, yahknow, I think I put, easily, a good 40+ hours into that project, and at even minimum-wage, I think you've got a shitton of schwag to send my way... I get it, the odds of my "lucking" into something like this again are pretty low, so maybe a telecommuting-job-offer at 20hr/wk is a bit much to ask... But, schwag, lots of schwag... surprise-schwag is even cooler... My cat could use a toy or two as well, and I'm always up for beer...

And as much as I'm growing to love the PIC32s, I'm still partial to AVRs when it comes to 8-bit, so now that you own 'em... maybe you could send some of those (or at least allow me free-samples of 'em, since Atmel never did despite my decade+ loyalty). I'm living in Brokeasheck these days... Things I could sell to pay the bills would be even better. My address is in the files I uploaded to y'all...!
