
atmega128 voodoo

A project log for Portable environmental monitor

A handheld, battery powered, sensor array unit for environmental monitoring focused mostly on air quality using a global infrastructure.

Radu Motisan, 07/15/2015 at 13:41


"Any improbable event which would create maximum confusion if it did occur, will occur."

While working on my 2015 Hackaday Prize project, the two development boards I was using, both based on the ATmega128, started to behave erratically. At first I blamed the parts of the code related to the ESP8266 or the ILI9341 modules, and I wasted a lot of time there. Finally I stripped the code down to a blinking LED, only to see that the original Voodoo issue was back. No 5V programmer would make it go away this time.
This almost made me switch to STM32F4 microcontrollers, but time was too short for that: there was too much code to port. So, back to AVRs, I purchased a few alternatives like the ATmega64 and various programmers (initially I had used a USBasp with avrdude under MacOS), hoping to find a working solution. I did not. This didn't stop me from rechecking everything over and over again. Between several inconsistent software runs, I noticed a code verification error, "first mismatch at byte 0x0100" and "verification error; content mismatch":
[Screenshot: avrdude verification error, first mismatch at byte 0x0100]
Tracking the issue down, I ended up on the avrdude website and its bug tracker, where bug #41561 described exactly that, but for the ATmega64. Apparently a change in avrdude 5.11 introduced memory tagging, as explained by Joerg Wunsch:

Before, all memories had been treated as a large block of bytes (N = size of that memory area on the chosen device), regardless of whether their contents actually came from an input file. Now, only those regions are touched that have corresponding bytes in the input file. (For paged memory areas, the term "region" here refers to the situation where at least one byte within a memory page has been mentioned in the input file.)
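To make the tagging idea concrete, here is a minimal Python sketch. It is an illustration only and not avrdude's actual code: the function name `tagged_pages` and the `(start_address, data)` segment format are hypothetical, and the page size is assumed to be 256 bytes (128 words), which matches the ATmega128.

```python
PAGE_SIZE = 256  # bytes; assumed flash page size (128 words, as on the ATmega128)


def tagged_pages(segments, page_size=PAGE_SIZE):
    """Return the sorted indices of the pages touched by the input file.

    `segments` is a list of (start_address, data_bytes) pairs, a simplified
    stand-in for the records of a hex file. A page counts as touched when
    at least one of its bytes appears in a segment.
    """
    pages = set()
    for start, data in segments:
        if not data:
            continue  # an empty record touches nothing
        first = start // page_size
        last = (start + len(data) - 1) // page_size
        pages.update(range(first, last + 1))
    return sorted(pages)


# Hypothetical input: 300 bytes starting at address 0, plus one byte at 0x1000.
segments = [(0x0000, bytes(300)), (0x1000, b"\x6d")]
print(tagged_pages(segments))  # only pages 0, 1 and 16 would be written
```

With this page size, byte 0x0100 from the verification error above is the first byte of page 1, since `0x0100 // PAGE_SIZE` is 1.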
I traced the ISP traffic with a logic analyzer, and decoded the data stream back into ISP commands. See the attachment for the full trace. The bug is that the "write memory page" command is issued twice:
Time 393.416 ms: MOSI Load program memory page, address 0x007f, low byte, value 0x6d
Time 393.910 ms: MOSI Load program memory page, address 0x007f, high byte, value 0x6d
Time 394.370 ms: MOSI Write program memory page, address 0x007f
Time 394.804 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 395.218 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 395.688 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 396.131 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 396.538 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 397.013 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 397.427 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 397.903 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 398.368 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 398.805 ms: MOSI Read program memory, address 0x007f, high byte, value 0x6d
Time 399.222 ms: MOSI Write program memory page, address 0x007f
Time 399.686 ms: MOSI Read program memory, address 0x007f, high byte, value 0x6d
Time 401.510 ms: MOSI Load program memory page, address 0x0080, low byte, value 0x6f
Time 402.139 ms: MOSI Load program memory page, address 0x0080, high byte, value 0x72
After filling the page buffer, the page is being programmed at time 394.370 ms. Then, USBasp polls the page for a response != 0xff, which indicates the end of the write operation (time 398.805 ms). However, just after this, it issues another "write page" command at 399.222 ms, but then proceeds to fill the page buffer again for the next page.
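The repeated command can also be spotted automatically. The following Python sketch scans decoded trace lines in the format shown above and reports any page that is written twice with no page load in between. The regular expression and the function name `find_repeated_page_writes` are assumptions made for this illustration; they are not part of any logic analyzer tool.

```python
import re

# Matches lines such as:
# "Time 394.370 ms: MOSI Write program memory page, address 0x007f"
LINE_RE = re.compile(
    r"Time\s+(?P<ms>\d+\.\d+) ms: MOSI (?P<cmd>[^,]+), address (?P<addr>0x[0-9a-fA-F]+)"
)


def find_repeated_page_writes(trace_lines):
    """Return (address, first_ms, second_ms) for each page that is written
    twice without a "Load program memory page" command in between."""
    repeats = []
    pending = {}  # address -> timestamp of the last page write
    for line in trace_lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a decoded command
        cmd, addr, ms = m["cmd"], int(m["addr"], 16), float(m["ms"])
        if cmd == "Load program memory page":
            pending.clear()  # new buffer contents, so a later write is legitimate
        elif cmd == "Write program memory page":
            if addr in pending:
                repeats.append((addr, pending[addr], ms))
            pending[addr] = ms
    return repeats


# A few lines taken from the trace above.
trace = [
    "Time 393.910 ms: MOSI Load program memory page, address 0x007f, high byte, value 0x6d",
    "Time 394.370 ms: MOSI Write program memory page, address 0x007f",
    "Time 398.805 ms: MOSI Read program memory, address 0x007f, high byte, value 0x6d",
    "Time 399.222 ms: MOSI Write program memory page, address 0x007f",
    "Time 401.510 ms: MOSI Load program memory page, address 0x0080, low byte, value 0x6f",
]
for addr, first, second in find_repeated_page_writes(trace):
    print(f"page write at 0x{addr:04x} repeated: {first} ms and {second} ms")
```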
Apparently, the old devices (ATmega64/128) respond to the second page write immediately with a poll value of "OK" (i.e., they return the correct value), yet they are still busy programming afterwards. In contrast, the newer devices (like ATmega1281) correctly respond again with 0xff for the second page write operation:
Time 391.417 ms: MOSI Load program memory page, address 0x007f, low byte, value 0x6d
Time 391.910 ms: MOSI Load program memory page, address 0x007f, high byte, value 0x6d
Time 392.371 ms: MOSI Write program memory page, address 0x007f
Time 392.806 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 393.218 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 393.689 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 394.130 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 394.539 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 395.014 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 395.428 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 395.903 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 396.369 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 396.806 ms: MOSI Read program memory, address 0x007f, high byte, value 0x6d
Time 397.222 ms: MOSI Write program memory page, address 0x007f
Time 397.687 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 398.130 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 398.539 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 399.013 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 399.431 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 399.903 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 400.368 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 400.805 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 401.218 ms: MOSI Read program memory, address 0x007f, high byte, value 0xff
Time 401.688 ms: MOSI Read program memory, address 0x007f, high byte, value 0x6d
Time 403.640 ms: MOSI Load program memory page, address 0x0080, low byte, value 0x6f
Time 404.155 ms: MOSI Load program memory page, address 0x0080, high byte, value 0x72
which explains why they can be programmed fine. But obviously, the second page write operation is completely unnecessary.
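The difference between the two device families can be pictured with a toy simulation. This Python sketch only models the behavior described in the report; it is not taken from any datasheet or from the USBasp firmware. The class `ToyDevice`, the 4.4 ms write time and the 0.45 ms command spacing are assumptions, chosen to roughly match the timestamps in the traces.

```python
BUSY = 0xFF  # value read back while a page write is still in progress


class ToyDevice:
    """Toy model of the polling behavior described in the bug report."""

    def __init__(self, reports_busy_on_rewrite, write_ms=4.4):
        self.reports_busy_on_rewrite = reports_busy_on_rewrite
        self.write_ms = write_ms  # assumed page write duration
        self.busy_until = 0.0
        self.hides_busy = False

    def write_page(self, now, repeated=False):
        # Every page write keeps the device busy for write_ms.
        self.busy_until = now + self.write_ms
        # Older parts (per the report) answer the poll after a repeated
        # write with the correct value although they are still busy.
        self.hides_busy = repeated and not self.reports_busy_on_rewrite

    def is_busy(self, now):
        return now < self.busy_until

    def poll(self, now, value):
        if self.is_busy(now) and not self.hides_busy:
            return BUSY
        return value


def program_page_twice(device, value=0x6D, step_ms=0.45):
    """Write a page, poll, write it again, poll again. Return True if the
    programmer would move on while the device is still busy."""
    now = 0.0
    for repeated in (False, True):
        device.write_page(now, repeated)
        now += step_ms
        while device.poll(now, value) == BUSY:
            now += step_ms
        now += step_ms  # time until the next command is issued
    return device.is_busy(now)


print(program_page_twice(ToyDevice(reports_busy_on_rewrite=False)))  # older part
print(program_page_twice(ToyDevice(reports_busy_on_rewrite=True)))   # newer part
```

In this model the older part lets the programmer start loading the next page while the second write is still running, whereas the newer part holds it back by answering 0xff until the write has finished.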
The difference ... is that AVRDUDE now works on a per-page basis throughout all programmers, rather than on the entire device memory. If I remove the USBASP_BLOCKFLAG_LAST (line 1330, function usbasp_spi_paged_write()), it seems to work as intended

And indeed it works! Personally, I opted for a version prior to 5.11 (CrossPack-AVR-20100115.dmg), as that was readily available for MacOS. As soon as I finish my work for the 2015 Hackaday Prize, I'll have the time to properly compile the latest code, which fixes the issue.
