MAX32660 Motion Co-Processor

Super-small, ultra-low-power 96 MHz Cortex M4F co-processor manages sensors and processes data allowing the MCU host attend to other things.

Similar projects worth following
Small breakout board using the 1.6 mm x 1.6 mm MAX32660 96 MHz Cortex M4F MCU as a co-processor to manage LSM6DSM+LIS2MDL+LPS22HB inertial sensors and provide accurate absolute orientation and altitude estimation to the host via simple I2C register reads.

We have been using EM Microelectronic's EM7180 motion co-processor for absolute orientation estimation with some success. The EM7180 embeds a 10 MHz ARC processor with single-precision floating point unit (FPU) optimized for fast fusion calculations using fusion algorithms developed by PNI Corporation. The EM7180 off-loads the management of the sensors and the computationally-intensive fusion calculations from the host MCU so that accurate absolute orientation estimation in the form of quaternions or Euler angles can be read from the EM7180 by the host via simple I2C register reads. This means even a poky 8 MHz Arduino Pro Mini can obtain <2-degree accurate heading data when using the EM7180 co-processor and a suitably accurate sensor suite like the MPU9250 or LSM6DSM+LIS2MDL.

This may seem trivial when low-cost, high-performance MCUs like the STM32L4, nRF52832, ESP32 and others abound. And when computationally-efficient, open-source fusion methods have been available for years making it easy for anyone to get quaternions (see here and here and here and here) with these host MCUs.

However, accurate sensor calibration and fusion methods are not trivial to implement, so the demonstrated performance (< 2 degree rms heading accuracy) of the EM7180 is quite attractive. Also, the idea of a co-processor (commonplace in SoC computers in the form of sensor hubs and GPUs, for example) is to off-load specialized sensor management and computationally-intensive data processing tasks from the host. This allows the host's full attention to be placed elsewhere, and/or allows a power-gulping host to sleep through sensor management tasks thereby conserving power. And for hosts with embedded connectivity like the nRF52 (BLE) and ESP32 (BLE and wifi), the radio stack usually commands top priority so off-loading computationally-intensive tasks when possible minimizes collisions.

In addition to sparing the host and obtaining superb heading accuracy, the advantages of using the EM7180 as motion co-processor include small size (1.6 mm x 1.6 mm WLCSP-16), auto gyro and magnetometer calibration, simple I2C serial output, and ultra-low-power usage. Quaternions can be updated at the rate of the gyro (up to 400 Hz guaranteed), and there is some flexibility in the form of generic user registers, fusion tuning parameters, and the ability to make use of RAM patches to allow some customization of the fusion algorithms (for example, we have added fusion of the accelerometer and barometer to provide a drift-corrected altitude estimation). There is a warm start capability that allows the EM7180 to start with the last session's calibration parameters upon subsequent power up.

There are some disadvantages to using the EM7180, of course. These include the fixed, "black-box" nature of PNI Corp.'s algorithms stored in ROM, the need to load the sensor-specific firmware into RAM on each power up, the very small amount of free RAM that limits customization, behaviors of the fusion algorithm (especially the dynamic magnetometer calibration) that cannot be adjusted or turned off. While the EM7180 is sensor agnostic, meaning it can use the input of almost any I2C accelerometer, gyro, and magnetometer to produce fused quaternions, the drivers for the sensors have to be created and compiled using a deprecated compiler. The ~24 kBytes of compiled firmware have to be stored on an EEPROM for loading into the EM7180 RAM on each power up, or loaded from the host. Lastly, the EM7180 is designed to manage I2C sensors so that devices with SPI or UART serial interfaces cannot be used directly and require a translator.

With the announcement of MAXIM Integrated's DARWIN family of MCUs, especially the MAX32660, we have an opportunity to design a motion co-processor that offers all of the advantages of the EM7180 with few, if any, of the disadvantages....

Read more »

OpenDocument Spreadsheet - 14.70 kB - 08/13/2018 at 19:59



MAX32660 Data Sheet

Adobe Portable Document Format - 597.46 kB - 08/08/2018 at 23:56


MAX32660 User Guide.pdf

MAX32660 User's Guide

Adobe Portable Document Format - 5.45 MB - 08/08/2018 at 23:54


EAGLE design files and gerbers for prototype design

x-zip-compressed - 100.89 kB - 08/08/2018 at 22:42


  • Final Hardware Design

    Kris Winer4 days ago 0 comments

    November 18, 2019

    As Greg mentioned in his log, we are making good progress on the motion co-processor firmware. We are also getting ready for pilot production of the hardware next month and are considering options for the 0.5 x 0.7 in. product board.

    The baseline is the same design we have been using for testing, more or less.

    Baseline MAX32660 motion co-processor design pcbs from OSH Park

    This design uses the 0.4-mm-pitch,  3 mm x 3 mm MAX32660GTG+ TQFN-24 package (without bootloader) and the LSM6DSM, LIS2MDL, and LPS22HB (10 DoF) sensor suite. We have dropped the 32.768 kHz crystal, since this is only needed if the MAX32660 RTC is required, which it is not in this application. We have also added a bypass capacitor to nRST that eliminates spurious device resets (good practice in any case). We have added 10K pulldowns to the formerly SWD port, one for an I2C address pin and one to put the MAX 32660 into boot mode, which returns these two pins to an SWD port for reprogramming.

    The advantage of this design is the inexpensive manufacturing using OSH Park design rules (for 4-layer pcbs). This makes it possible for anyone to easily customize this open-source design for their applications as well as reduce the per-board cost of manufacturing. The disadvantage is the "large" size of the MAX32660 package. This makes it necessary (for this board size) to drop one of the plated through holes, which will complicate mounting onto popular development boards like the Teensy or Dragonfly but should pose no problem for breadboard use.

    The alternative design uses the 0.35-mm-pitch, 1.6 mm x 1.6 mm MAX32660GWE+ WLP-16 "flip chip" package (without bootloader).

    Twenty-board panel from Sunking Circuits.

    All 16 of the pins are required for the design including the four internal pins; one is connected to GND and three are connected to SDAM, SWCLK, and host INT. I used EAGLE CAD to design this version and had to violate OSH Park design rules in order to place 3 mil vias onto these three internal pins:

    This would never work since the via annuli are too close together. This is an especially difficult part to design with since the pitch is 0.35 mm and the pads are 0.18 mm in diameter, both 10-15% smaller than most other WLCSPs I have dealt with, including the EM7180. Fortunately, I found a cooperative fab house (Sunking Circuits Electronic LTD) who took the design and modified it to conform to their via-in-pad process:

    Board layout of the pin pads after rework by Sunking.

    According to Sunking "there will be solder mask covering the three rings of the resin plugged vias, the yellow parts will be the exposed pads, their final diameter will be 0.18 mm with a tolerance of +/- 20%."

    The first batch had to be scrapped because of some kind of debris in the production process, but I just received the second "good" batch which passed their internal testing and look marvelous:

    Close up of the second design option.

    The cost was ~$16 per board for 20 of them delivered including the $200 NRE for setup on every new pcb design, high by OSH Park standards but not prohibitive. And certainly much cheaper than I could have had this work done locally in Silicon Valley. This means that if I produce 100 of these at Sunking the pcb cost would likely be about ~$5 per board instead of the ~$1 per board the baseline design might cost.

    So this extra cost and the added complexity of the production are definitely a disadvantage. The advantage is that we get back the missing PTH at the board corner, and maybe a cleaner design. Not sure it is worth it. But this was a useful learning experience in that there are many 0.4-mm-pitch WLSCP devices that I would like to be able to use that are now within range of my design abilities and pocketbook.

    The next challenge is assembling one or two of these boards and testing them for function.  We'll post some comparative testing results in a future log soon.

    I am sure we will go...

    Read more »

  • MAX32660 Motion Co-processor Firmware Development Challenges

    Greg Tomasch6 days ago 3 comments

    Well, it has been a long while since any log entries have been posted for this project... That is not because Kris and I haven't been working on it but because there have been some serious software development challenges to overcome.  It is one thing to try out some of the simple I2C loop-back examples in Maxim's software development kit (SDK) or blink an LED but it is a fundamentally harder task to make the MAX32660's peripherals perform to the level required by a high-speed asynchronous sensor fusion engine.

    I'm happy to report that the "show-stopper" software infrastructure problems preventing a robust, high-performance motion co-processor implementation on the MAX32660 have been solved. We have demonstrated reliable operation and excellent attitude estimation results from MAX32660-based parts. I plan to open a new project to document general motion co-processor calibration/characterization instrumentation and procedures soon. Results from the MAX32660 motion co-processor will figure prominently into this effort. The remainder of this log entry will be an overview of the firmware development challenges and what it took to overcome them.

    When I set out to write motion co-processor firmware for this micro-controller, I already had almost all of the necessary algorithmic and calibration "pieces" in-hand and successfully implemented on other micro-controllers. This should be the hard part, right? Not so fast... There are some other basic pieces of "data  plumbing" that are absolutely essential to make a motion co-processor work on any micro-controller:

    • An EEPROM emulator to store co-processor configuration information and sensor calibration data for retrieval at startup
    • An asynchronous I2C master bus; sensor data ready (DRDY)  interrupts initiate an I2C data read transaction from the appropriate sensor with a completion callback to the main sensor fusion loop
    • An asynchronous I2C slave bus; the host micro-controller (connected to the MAX32660) must be able to query the co-processor for data and receive a prompt response without significantly delaying the main sensor fusion loop
    • Rising-edge-interrupt-capable GPIO pins to handle sensor DRDY interrupts.

    After an initial assessment of Maxim's SDK, it became clear that only the rising-edge interrupt capability would be straightforward. I addressed the other three items in this order:

    1. EEPROM emulator
    2. Asynchronous I2C master bus
    3. Asynchronous I2C slave bus

    I should say that if any one of these capabilities could not be successfully implemented on the MAX32660, completion of the project would not be worthwhile. The final product would simply not be competitive. Before I proceed much further, I want to acknowledge the fact that I was very lucky to have an excellent teacher/mentor along the way. Thomas Roell is truly a master at micro-controller application programming interface (API) development and he shared his expertise with me generously. Without his inputs and help along the way it would have certainly been much, much harder to bring the project along this far.

    EEPROM Emulator

    An EEPROM capability is crucial for a motion co-processor. The basic proposition here is that we are using economical MEMS sensors as the inputs into the fusion/attitude estimation algorithm. No matter what, there is no algorithm that compensates for bad sensor data... So good sensor calibration is a prerequisite to achieve an accurate attitude estimate. Since MEMS sensors can significantly depart from ideal behavior, effective calibration may require more involved procedures performed under controlled conditions. The calibration procedures generate parametric corrections that are applied at run-time. If the sensor calibration procedure(s) are impractical to do each time at startup, the sensor correction parameters need to be available in non-volatile memory (NVM). 

    The basic idea behind an EEPROM emulator is that part of the micro-controller's flash memory is set up to mimic...

    Read more »

  • Some Power Measurements

    Kris Winer08/25/2018 at 01:15 5 comments

    August 24, 2018

    First some housekeeping. I learned from Maxim that there was a typo in their data sheet (since corrected) and that the MAX32660 roadmap only includes three device packages: the 1.6 mm x 1.6 mm WLCSP-16, the 4 mm x 4 mm TQFN-20, and the 3 mm x 3 mm TQFN-24 that we are using in our prototype development. No 1.6 mm x 1.6 mm TQFN-16 package :<  We will have to make do with the 3 mm x 3 mm version or bite the bullet and go for more expensive fab methods (or zGlue) and use the WLCSP for minimal size. For the moment we will stick with the 3 mm x 3 mm TQFN-24 IC since it is easy to assemble and has lots of pins for firmware development work.

    I have heard that a revision to this TQFN-20 package is due in a month or so that corrects/improves the silicon. Still trying to find out more details but the changes should include improvements in power usage. That is what I want to discuss now.

    I finally (with Maxim's help) figured out the low power sector of the MAX32660 and have started doing some preliminary power testing. The low power scheme is a bit different from the STM32L4. There are three core voltages that can be selected which are equivalent to setting the high-speed clock to 96 MHz (OVR= 1.1 V, default), 48 MHz (OVR = 1.0 V), or 24 MHz (OVR = 0.9 V). The actual voltage is an allowed range and lowering this range effectively selects a lower clock speed. Then, for each range the clock can be further divided by 2^N where N can range from 0 to 7. It sounds complicated but in fact it is straightforward to select amongst these limited choices and the effect on average power usage can be quite large. For example, with a simple blink program at 96MHz (OVR = 1.1 V) default the MAX32660 uses 11.1 mA (when the led is off but MCU active). At the lowest core voltage setting (OVR = 0.9 V, 24 MHz) the blink program uses 5.1 mA when using low power sleep mode between RTC alarm interrupts. Let me explain this...

    The way most programs we will be using work is to configure everything at the start of the program (like setup in Arduino) and then run a forever loop where all actions are interrupt driven (sensor data ready interrupts, timer interrupts, RTC alarm interrupts, etc). This is so when an interrupt is not being serviced, the MCU can sleep in a low-power state.

    So for the simple blink program, we set the RTC alarm to trigger an interrupt every two seconds. The interrupt wakes the MAX32660, it takes some action (turns the led on or turns the led off) and then goes back into SLEEP mode. The program uses 11.1 mA at 96 MHz without invoking SLEEP mode between interrupts and 9.6 mA when SLEEPing between interrupts. The average power drops to 5.1 mA when running at 24 MHz with SLEEP between interrupts. This demonstrates the benefit of the SLEEP scheme as well as selecting the clock speed appropriate to the task, and it just makes common sense to take advantage of these features as a general strategy.

    The wake up time from SLEEP mode is very fast (a few us). There is also a DEEPSLEEP mode with a somewhat longer wakeup time (~1 ms), and using this instead of SLEEP at 24 MHz (OVR = 0.9 V) drops the average power down to 175 uA. About 6 uA of this is just due to the sensors in power down mode on the custom board. This (~169 uA) sounds like an impressive power savings and it is much lower than 5.1 mA, but the data sheet says we should be at ~4 uA. So some more fine tuning is required like turning off peripherals that aren't needed (like SWD) and making sure all GPIOs not being specifically used are tri-stated, etc. to get the power as low as possible.

    Also, the pending silicon revision should reduce this 1 ms wake up time considerably. Still lots of work to do to get the power sector under control. The goal is 1 mA or less when running the motion co-processor in its normal mode, meaning accel and gyro at 208 Hz, mag at 100 Hz, and barometer at 10 Hz with quaternions at the rate of the gyro.

    ... Read more »

  • First Successful Programming of MAX32660

    Kris Winer08/14/2018 at 01:29 7 comments

    August 13, 2018

    As soon as we tried to program our custom MAX32660 motion co-processor board we knew we had a problem. The MAX32625 PICO DAP debugger which came with the MAX32660 evaluation kit and the MAX32625 PICO debugger we bought separately both interface with the MAX32660 target via a 2 x 5 SWD 0.05-inch connector, which of course, we didn't put on our tiny 0.7-inch x 0.5 inch breakout board; no room! We did expose SWDIO and SWDCLK as well as nRST on the breakout board edges. That is, we expected to have to use the SWD port to program our breakout. We just didn't think it would be so hard to do so.

    First we tried to use the edge pins on the MAX32625 debugger which are supposed to connect to SWD and nRST, etc but we couldn't make this work. Not sure why. Programming via the SWD connector did work on the evaluation board so we ordered some SWD cable breakout boards from Adafruit (faster than making our own), which allowed access to all of the SWD signals from the 0.05 inch cable. And...success! Simply connecting SWDIO, SWDCLK, GND, nRST, and VREF to 3V3 and we were able to program the custom board MAX32660 to toggle one of the GPIOs we exposed as WAKE on our breakout. The toggled GPIO then actuates an led and voila:

    Here we have GND_detect also connected but we subsequently learned this is not needed.

    Yes, it is a little clunky but for prototype hardware and firmware development this works just fine. In the end-user application, once programmed with firmware the breakout doesn't need to be programmed again. The breakout will communicate with the host MCU via I2C in use. So in reality for the final product we don't need SWD and nRST exposed to the user; and it is a shame to waste valuable board edge just for one-time programming. So we will likely expose the SWDIO/SWDCLK, and nRST along with GND and 3V3 as a 2 x 3 test pad pattern on the back of the board that can be programmed at the fab using a pogo pin connector.

    This was the last technical hurdle to overcome for proof of concept. Now we will continue our firmware development and, in a few days, test the board as a motion co-processor to an STM32L4 host MCU. Very exciting!

View all 4 project logs

Enjoy this project?



Daniel wrote 05/07/2019 at 17:33 point

Hi Kris.
Is this project still active? :)

  Are you sure? yes | no

Kris Winer wrote 05/07/2019 at 17:46 point

Yes. We have been working on the software side of things and have methods for calibration and sensor fusion that allow < 1 degree rms heading (typically 0.5 degree rms) accuracy routinely as well as magnetic anomaly detection/correction so the heading (i.e., absolute orientation) remains accurate even in the presence of external magnetic fields.

We are testing these now using a 2-stage goniometer and will update the project with results soon. But excellent fusion on the host MCU is working well.

We are still debating the best co-processor. We'd prefer to use the STM32L4 but this is rather larger than we would like. The MAX32660 is still a candidate as is the EM7180, and we might even bite the bullet and try to use the 1.6 mm x 1.6 mm WLCSP, although this would be an expensive pcb.

The hard part of the project is done. Realization as a motion co-processor still needs to be reduced to practice but I see no barriers to straightforward implementation. Just a question of the best vehicle for it.

  Are you sure? yes | no

Daniel wrote 05/08/2019 at 15:56 point

Wow, thanks for the reply!
It's really amazing work! :)

  Are you sure? yes | no

Kevin wrote 02/13/2019 at 00:25 point

How is it going? Did they get the quirks worked out of the power management?

  Are you sure? yes | no

Kris Winer wrote 05/07/2019 at 17:37 point

Sorry missed this. I haven't tried the latest silicon so I don't know yet.

  Are you sure? yes | no

John Stockton wrote 12/12/2018 at 02:08 point

Nicely done Kris.  Have you looked at some of the newer AK parts like the AK09917D since the original AK8963 looks like it is hard to get (maybe EOL)? 

  Are you sure? yes | no

Kris Winer wrote 12/12/2018 at 02:47 point

The LIS2MDL is superior to the AK line of Hall-sensor-based mags both in terms of lower power and lower jitter. In general, the current ST sensor suite (LSM6DSM + LIS2MDL + LPS22HB) is the best available of the "cell-phone" IMU sensors (in the few dollar price range) and is a good replacement for the MPU9250 + MS5637 barometer suite. The ST sensor suite offers low cost, very low power, superb stability and reproducibility and we have been able routinely to obtain ~1 degree heading accuracy using it. Good enough for me...

  Are you sure? yes | no

Kris Winer wrote 08/18/2018 at 15:35 point

Maxim has an Eclipse-based SDK for the MAX32660. But GCC-make also works as does mbed.

  Are you sure? yes | no

don.vukovic wrote 08/18/2018 at 15:15 point

What compiler/IDE do you use for this part ?

  Are you sure? yes | no

Martin Pålsson wrote 08/14/2018 at 19:52 point

Another great board Kris!

I see that you have kept the edge breakouts more or less the same as the USFS-MPU9250 (the essential pins, I2C gnd and 3v3). Seems like it will be a drop-in replacement, well done!

Question: will the axial alignment be the same as for the USFS-MPU9250? 

Best regards


  Are you sure? yes | no

Kris Winer wrote 08/14/2018 at 20:17 point

This is just a prototype so nothing is fixed as of yet. We kept the alignment the same as the EM7180 version of the board, that is the one with the same ST sensors ( But because the MPU9250 and LSM6DSM+LIS2MDL have different sensor axes alignment, no, the alignment here cannot be the same as the MPU9250. That being said, choice of True North is arbitrary and can be changed in firmware so we can choose to keep the same top of the board alignment of the Euler angles to True North.

  Are you sure? yes | no

Martin Pålsson wrote 08/17/2018 at 17:20 point

Thank you for your reply Kris. Very informative

  Are you sure? yes | no

Kris Winer wrote 08/11/2018 at 16:35 point

Thanks, fixed!

  Are you sure? yes | no

Winston wrote 08/11/2018 at 14:09 point

LIS6DSM should be LSM6DSM.

  Are you sure? yes | no

Similar Projects

Does this project spark your interest?

Become a member to follow this project and never miss any updates