I bought a FLIR Lepton 3.5 mounted on the Pure Engineering breakout after using a friend's thermal imaging camera to analyze heat generation on a PCB I designed. The Lepton 3.5 can output absolute temperature data (Radiometric) at 160x120 pixel resolution. I decided a thermal imaging camera would be a good tool to have and set out to build my own. I have started with a Teensy 3.2 based test platform but eventually want to turn a Beaglebone Black with 7" LCD cape into a full-featured, networked camera using the PRUs to handle the real-time video SPI feed from the Lepton. I hope the documentation in this project is helpful to others who might also want to play with these amazing devices.
The long-term goal is to create a capable thermal imaging camera using the Beaglebone Black and a 7" LCD cape as the platform matching some of the features of high end commercial products. However it soon became obvious that I'd need a simpler platform to learn how to use the Lepton module when I started reading the documentation and playing with the various demo codebases. It is a very capable device with a moderately complex interface, both firmware and hardware. Although the device has good default settings I found enabling some features wasn't well documented and the video SPI interface (VoSPI) challenging to implement due to its real-time constraints.
There are a lot of other great projects online to help get going with the FLIR sensors. Pure Engineering is to be commended for making these devices available to makers and provides a wealth of code examples, many designed to work with the previous Lepton models. Max Ritter's DIY Thermocam is probably the most mature and well known. He has done a great job and I pored over his code. Damien Walsh's Leptonic is also really well done and works with the Lepton 3.5 as well. Both Max and Damien were very gracious when I sent them various questions.
I decided to follow Max's lead and build a test platform using a Teensy 3.5 that I had (selected for the multiple SPI interfaces and copious RAM). Unfortunately after soldering the Teensy to a Sparkfun breakout board I stressed the processor BGA package and made the board unreliable. So I replaced it with a Teensy 3.2 hoping it would have enough resources to successfully interface to the Lepton. It does, barely, and the next project log describes the test platform hardware.
The Beaglebone boards have an additional capability, aside from the PRUs, that differentiates them from many other SBCs like the Raspberry Pi. They have a true power-management controller (PMIC) that generates all the various on-board DC voltages and supports dual input voltage sources (DC in and USB in) as well as a 3.7V LiPo or Li-Ion battery including a charger. In addition the PMIC, a TI TPS65217 designed specifically for the AM335x processor, can be configured to supply power to the AM335x built-in real-time clock while the system is powered down and sports a power button that interacts with the kernel for controlled shut down - features that are perfect for a portable device like a thermal imaging camera.
Or at least in theory. It turns out that a hardware bug in the first revision AM335x processor and decisions about how to wire the PMIC to the processor in the original Beaglebone Black made it so the "RTC-only" mode would never work, and under certain circumstances could actually damage the processor (a current leakage path through the processor to the USB serial interface). However based on a lot of reading of various specs and errata I ascertained that the pocketbeagle shouldn't be at risk so I was excited to see how it ran on battery power (the battery signals are helpfully broken out).
I connected an old camera battery to the pocketbeagle and added a 5V step-up boost converter to power a USB WiFi dongle attached to the pocketbeagle's host USB port. The PMIC wants a temperature sensor in the battery pack and the camera battery had the right kind (10K). Unfortunately the PMIC doesn't generate a 5V out when running on the battery so the boost converter was necessary. I noticed that the 3.3V output was only powered when the pocketbeagle was turned on so I used it as the source for the boost converter. Success! Or at least I thought at first. The pocketbeagle ran happily from the battery with a reasonable current draw (< 200 mA w/ WiFi). And it charged the battery too when plugged in via the USB Micro-B interface - at least while the system was running. The problem came when the system was shut down. Charging stopped. This is hardly ideal for a portable device: not only do we expect such devices to charge while off, but turning the system off makes more power available for battery charging.
The TI PMIC has the capability to charge the battery when its DC output supplies are disabled (system off) but clearly something was disabling that when the system was shut down. I dug into kernel source with help from a friend at my hackerspace and finally figured out the sequence of events that prevented charging when the system was shut down. It turns out that the device driver for the PMIC (tps65217.c) configures the PMIC to enter "off" state when powered down instead of "sleep" state which would let it charge the battery. A fuller explanation is documented in the pocketbeagle's google group.
I thought this was good news because the driver configures the PMIC during probe at boot time and the PMIC can be reconfigured via its I2C bus after the system boots. I tried it. Success again - kind of. The battery charged after the system shut down. But there was also a ~15 mA load on the battery from the switched-off system - enough of a load to discharge a battery pretty quickly if the system was disconnected from the charger. Unfortunately the pocketbeagle uses an Octavo system-in-a-package and I couldn't understand what could be causing the additional discharge. Fortunately Octavo replied to a question on their support forum. When the PMIC is configured to "sleep" mode enabling battery charging, it also enables one LDO regulator that is designed to feed the RTC input on the processor. Octavo also connected another power-rail (plus an enable to the 3.3V LDO) to that output per TI's spec and...
My post-holiday obsession continued until I am - finally - reading data from the Lepton using both PRUs in a Beaglebone Black. Given how - relatively - easy it is to use the PRUs I have to confess I pored over a lot of web postings before figuring it out. I'm not even the first person to document using the PRUs to read a Lepton camera. That honor, as near as I can tell, goes to Mikhail Zemlyanukha who used a PRU and a custom kernel driver to get image data on a 4.9 kernel system. Unfortunately as I found out, programming PRUs is an evolving paradigm and what worked in the past, including Mikhail's code and old methods such as the UIO interface, no longer works on current kernels. Finally I found Andrew Wright's tutorials and started reading TI's remote processor and rpmsg framework source and started to get code running on the PRUs.
Although it was a slog, I have been converted. I think the real-time possibilities offered by the PRUs in close cooperation with the Linux system are amazing. The PRU is my current favorite embedded micro-controller because it has easy access to an entire Linux system without the baggage of any OS - and on something as small and cheap as the pocketbeagle.
I was daunted trying to get Mikhail's kernel driver running on my 4.14 system but understood his use of one PRU to capture packets and send them upward to user-land. I also successfully built and ran the rpmsg "hello" demos. The rpmsg system is built on top of the kernel's virtio framework to allow user-land and kernel processes to talk to remote devices (e.g. embedded cores or co-processors). TI has adopted it as the "official" mechanism for their OMAP processors to talk to the on-board co-processors (including the PRUs and the power-management ARM core). It is probably used in every smart phone as I found Qualcomm's contributions to the source. The kernel's rpmsg driver makes a co-processor available as a simple character device file that user-land processes can read or write just like any other character device.
So I put together a simple program that got non-discard packets from the PRU and sent them to the kernel using rpmsg. The PRU bit-banged a simulated SPI interface at about 16 MHz, buffered one packet's worth of data (164 bytes) and then copied it to kernel space via rpmsg. My idea was to essentially replace the calls to the SPI driver in earlier programs with a call to the rpmsg driver to get SPI packets through it via the PRU. I did have the sense to try to filter out discard packets but still, BOOM. My BBB was brought to its knees by a message from the PRU about every 95 uSec (basic packet rate at ~16 MHz SPI + a very quick buffer copy - the PRUs are excellent at pushing data to main system memory). The system was 100% pegged, unresponsive and my application seemed to be getting about 1 out of 100 packets.
I didn't know it at the time but I was way overrunning the capability of the rpmsg facility and the kernel was bogging down trying to write an error message for each rpmsg from my PRU (that was overflowing its virtio queues) into several log files. I saw later the hundred megabytes of log files that had accumulated. The poor micro-SD card.
Clearly I had to reduce the frequency at which messages were sent to the kernel driver to deal with - and also increase the amount of data sent at a time. Quickly I found that the maximum rpmsg message size is 512 bytes, of which 16 bytes are used for message overhead. It took a long time - this stuff doesn't seem to be documented anywhere - to understand that the kernel could manage a maximum of 32 entries in a queue for messages for one rpmsg "channel". At least I had some parameters to work with. The Lepton is...
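The numbers quoted in this log can be worked through as a rough budget. This is only an arithmetic sketch using the figures above (16 MHz SPI, 164-byte packets, 512-byte messages with 16 bytes of overhead, 32 queue entries); the variable names are made up for illustration, and packing several packets into one message is an assumption about how the batching could be done, not a description of the final PRU code.

```python
# Back-of-the-envelope budget for the PRU -> kernel rpmsg path.
SPI_CLOCK_HZ = 16_000_000   # bit-banged SPI clock on the PRU
PACKET_BYTES = 164          # one VoSPI packet (ID + CRC + 80 pixels)
RPMSG_MAX_BYTES = 512       # maximum rpmsg message size
RPMSG_HEADER_BYTES = 16     # per-message overhead
RPMSG_QUEUE_DEPTH = 32      # messages the kernel queues per channel

# Time on the wire for one packet, ignoring the buffer copy.
packet_time_us = PACKET_BYTES * 8 / SPI_CLOCK_HZ * 1e6

# Message rate if every packet is sent on its own (~95 us apart).
messages_per_sec = 1e6 / 95

# Payload left after the header, and how many whole packets fit in it.
payload_bytes = RPMSG_MAX_BYTES - RPMSG_HEADER_BYTES
packets_per_message = payload_bytes // PACKET_BYTES

# Upper bound on packets that can be waiting in a full queue.
queued_packets = packets_per_message * RPMSG_QUEUE_DEPTH

print(f"wire time per packet  : {packet_time_us:.1f} us")
print(f"one-packet messages/s : {messages_per_sec:.0f}")
print(f"payload per message   : {payload_bytes} bytes")
print(f"packets per message   : {packets_per_message}")
print(f"packets in full queue : {queued_packets}")
```

Sending one packet per message means more than ten thousand messages a second, while three packets per message would cut that by a factor of three and still leave a queue that holds only about a segment and a half of data.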
The American Thanksgiving holiday finally gave me some time to attend to this project again. I even planned to use the Teensy version of the camera to look at my daughter's house to help her and her husband identify where it was losing heat but sadly I left it at home in a rush to get out the door for the holiday.
My end goal has been to make a wifi-enabled camera using the Beaglebone Black with one or both of its PRUs handling the Lepton VoSPI data stream to get the full 9 FPS out of the camera and be able to view it and access photos remotely. I looked into using the ESP32 like Damien Walsh did and although it is an incredibly capable embedded system, I ultimately decided that I want a full Linux environment to build the camera application on. Experiments with the Raspberry Pi show that it's hard for a user process to keep up with the VoSPI data flow - although as someone from my Makerspace pointed out, it's quite possible that applying the Linux real-time patches might make that feasible (something to be investigated at a future point). The PRU subsystem was the first solution that occurred to me but I also investigated writing a kernel driver or seeing if the v4l2 project supported the lepton. All of the solutions are intimidating to me because of the need to dive into some moderately complex low-level linux programming so it's been easy to put off getting started.
I have an old 7" 4D-systems Beaglebone Black touch LCD cape that I thought I'd use for the camera display but over time have decided it's too bulky and power hungry. I also have become interested in the Pocketbeagle board because of its small size and lower power consumption - characteristics better suited for a portable camera. This led to an investigation of using the Linux framebuffer driver and one of the ubiquitous ILI9341-based small LCDs using a SPI interface. Because I also want to support another camera that also requires an SPI interface (for near IR - or night vision) I finally made the commitment to using the PRUs for the Lepton (over a low-latency software solution) because I can dedicate the two built-in SPI interfaces on the Pocketbeagle to the LCD and Arducam near IR camera.
Making that commitment, and the time afforded by the holiday, finally got me moving on building a prototype.
The first prototype simply connects the LCD to SPI0.0 and the Lepton to SPI1.0. I am taking things one step at a time, first getting the Linux FB talking to the LCD and then getting some user-space code talking to the Lepton. About twelve hours of hacking later (most spent figuring out how to use the Device Tree system on the Beaglebone Black to configure IO and enable the frame buffer) yielded a hacked version of leptonic directly driving the frame buffer. It works pretty well. I'm not sure what the frame rate is but it's definitely higher than the 4.5 FPS I get on the Teensy 3.2 test platform. It also occasionally stutters and gets confused reading the VoSPI datastream resulting in a garbage display for a frame. The hacked code is pretty ugly - near midnight I was in full-on hack-hack-hack mode - so I'm not sure I'll post it (although I'm happy to share if you'd like a copy). There's also a version that can send packets over the ZMQ network interface. However I'm posting relevant parts of the /boot/uEnv.txt file below and a link to the Device Tree source file that finally worked to make my LCD a display (thanks Bruce!). I modified the dts file to fit my rotation (90°) and lowered the frame rate some (since my practical limit is 9 FPS).
The following uEnv.txt changes will make sense if you've messed with the Beaglebones much and maybe they'll help someone trying to get an ILI9341-based LCD to work as a display.
I finally got around to trying the lepton on a Pi Zero. Not a good outcome but an interesting result. The interrupt-driven version of leptonic failed miserably. It could not keep up with the VoSPI data. However Damien's original version did work partially. It could get data but constantly lost sync and frequently returned garbled data. I didn't look at the SPI bus on a scope to make sure there was nothing funny going on there but it doesn't seem the Pi Zero is well matched - at least for user-space programs - with the Lepton 3. No doubt a kernel driver would work but that's beyond me at the moment.
I have also been hacking around at the application level to make sure that the cross-platform application development tool I use (xojo) is up to the task and could be integrated with the zeromq messaging library since it makes working with sockets very simple. Much more success there. Here's a test app running on my Mac and also on a Beaglebone Black simultaneously getting a feed from the leptonic server on the Raspberry Pi 3. Getting the full 9 fps from the lepton. I think I'm ready to try to use the PRUs as a VoSPI engine.
Ultimately I'd like to create a solution that comprises a linux daemon communicating with the camera and a socket-based interface to a display application for local display as well as a web server for remote display. The Beaglebone Black solution will make use of the VSYNC signal to synchronize transfers. Before tackling the BBB's PRU coding I think it will be a good idea to get the daemon interface running. Since this architecture is like Damien Walsh's leptonic project, it made sense to play around with his code on a Raspberry Pi. His code uses a thread to constantly read the VoSPI interface and even on a Pi 3 sometimes has trouble remaining synchronized because of user process scheduling. I decided to port his C server to use VSYNC and a user-space interrupt handler to see if this might be more reliable.
Some testing showed that the read system call resulted in a fast SPI transfer so the main technical issue seemed to be implementing a fast user-space interrupt handler. I started with Gordon Henderson's wiringPi library because I had experience with it. The result was strange. Latency between the VSYNC interrupt and execution of the ISR routine was low only the first time the process was run after booting. Latency was too high for all subsequent runs and the routine could not get a segment's worth of data before the next interrupt. I tried several mitigation strategies such as renicing the process to the highest priority and binding it to CPU 0 (which seems to be on the front line of handling interrupts) but nothing changed the behavior. After too many hours of piddling around I decided to try PIGPIO which led to much better and repeatable results. I'm not sure what is happening beneath the hood but, at least for interrupts, this library gives great results. I can get a reliable stream of frames at or near the maximum 9 Hz on the Pi 3. Someday I'd like to see how it performs on a lower performance board like the Pi Zero.
I uploaded my ported version of the leptonic server to github in case anyone wants to see how the ISR was implemented. It can also enable AGC for better images.
Next up is to decide on a socket-based protocol for sending commands to the lepton (for example to tell it where to sample temperature in the image when AGC is running) and for sending complete frames to consumers such as the display application or web server.
Determining the accuracy of temperature readings made with the camera is a bit tricky. FLIR's documentation indicates it varies with ambient temperature, scene temperature and object emissivity. They seem to do their characterization at 25°C against a 35°C blackbody. They claim a typical accuracy of +/- 5°C or 5% and a range of accuracies up to +/- 8°C, depending on conditions. Ambient temperatures seem to make a large impact (the lepton measures its internal temperature but I'm not sure if external temperature affects its accuracy). My reading about this class of device also indicates that the emissivity of the object being measured impacts accuracy - although it's not clear whether, other than when enclosing the lepton behind a lens of some kind, adjustments need to be made to its default parameters.
Wanting to see how accurate my lepton is, I hacked lep_test6 into lep_test9 (in the github repository) adding the ability to change the emissivity in the RAD Flux Linear parameters setting as well as compare the output of the spot meter function (that can average a specified set of pixels in a specified location in the image) with output pixel data. The spot meter can be used to get temperature when AGC is enabled and the pixel data does not contain actual temperature values.
I then attempted to compare the temperature output from the Lepton with other temperature measuring devices for a variety of materials and object temperatures (being a hot summer day, I didn't get much opportunity to play with the ambient temperature). I would call this very amateur science...
Short summary: I found the device fairly accurate without having to change the default emissivity setting (with one exception) and the spot meter works as advertised. Emissivity investigation will have to wait for another day.
Longer Description + some data
The test setup included a 4-channel Dallas DS18B20 temperature probe I made years ago, an Agilent multimeter with a type K thermocouple and a home-made IR Thermometer based on the Melexis MLX90614 sensor. I attempted to use either one DS18B20 or the Agilent probe to capture the ambient temperature and then as many sensors as possible to also read the object temperature. Basic claimed accuracy (over the temperature ranges I used) of the DS18B20 sensors is +/- 0.5°C, for the Agilent sensor, 1% + 1°C and for the Melexis sensor, 0.5°C.
I held the Melexis sensor very close to the object since it has a very wide field of view and averages all the thermal energy in its scene. Here it is measuring the temperature of a "blackbody" (electrical tape) on the side of a vase of ice water.
Measuring temperature is hard... There is the basic accuracy of the sensor itself and then how it interacts with the environment. I saw quite a bit of variability for all sensors, especially the IR sensors. The DS18B20 sensors tracked each other very well. However I think that the lead temperature makes a big difference so when their plastic cases were touching an object, the temperatures recorded may have been wrong because the leads were at a slightly different temperature. The thermocouple varied based on how it touched the object (although it has a very fast settling time).
As can be seen from the data, the Lepton generally agreed with the other sensors. I expected worse performance, partly because of the device specs and partly because I expected the surface emissivity to play a bigger role.
Not included in the above table (a late add) was a measurement of a soldering iron set to 350°C. The Lepton read 181°C and it wasn't until the emissivity setting was lowered to 35% that the temp was close (353°C). I'm not sure why and would love to hear any thoughts...
Most demo code, including my lepton_test6 sketch, uses a simple linear transformation from the raw 14-bit count or 16-bit radiometric temperature data to 8-bit values that are then transformed to RGB values via a lookup table (colormap). To do this the maximum delta between all pixels in the frame is computed and then each pixel value is scaled as follows:
8-bit pixel value = (Pixel value - minimum pixel value) * 255 / maximum delta
However this simple algorithm fails with scenes that contain both hot and cold regions because it tends to map most pixel values to either the maximum or minimum values resulting in images with little contrast between the two temperature extremes.
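A minimal sketch of that min/max scaling, and of how one hot object squeezes the rest of the scene, is below. The pixel values are invented for illustration (Kelvin * 100, as in radiometric mode) and the function name is hypothetical; it is not the code from lepton_test6.

```python
def linear_scale(pixels):
    """Map pixel values to 0-255 using the frame's own minimum and delta."""
    lo = min(pixels)
    delta = max(pixels) - lo
    if delta == 0:
        # Flat frame: nothing to stretch, so avoid dividing by zero.
        return [0] * len(pixels)
    return [(p - lo) * 255 // delta for p in pixels]


# Invented example data in Kelvin * 100.
room = [29500, 29550, 29600, 29650]   # ~22 C background with small variations
scene = room + [27315, 62315]         # add ice water (0 C) and a hot iron (350 C)

print(linear_scale(room))    # background alone uses the full 0-255 range
print(linear_scale(scene))   # same background pixels collapse to a few levels
```

With the background alone, the four pixels spread over 0, 85, 170 and 255. Once the ice water and soldering iron are in the frame, the same four pixels land on 15, 16, 16 and 17, so almost all of the contrast in the room-temperature part of the image is lost.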
The Leptons have an AGC (Automatic Gain Control) function that can be switched on to try to ameliorate the shortcomings of a simple linear transformation for temperature data to be displayed. FLIR has an entire - slightly confusing - section (3.6) in the engineering data sheet describing their AGC implementation - actually two different forms of AGC they call histogram equalization (HEQ) and linear histogram stretch. They claim improvements over traditional histogram techniques with their HEQ algorithm and include many parameters to tune the algorithm but don't deeply describe it. They don't describe their linear histogram stretch algorithm.
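For readers unfamiliar with the idea, here is a sketch of classic textbook histogram equalization. This is explicitly not FLIR's HEQ algorithm, which the data sheet does not fully describe; it only illustrates why ranking pixels by their cumulative distribution keeps contrast in scenes with extreme hot and cold spots. The function name and sample values are invented.

```python
def equalize(pixels, levels=256):
    """Textbook histogram equalization: map each value through the scaled CDF."""
    # Histogram of the distinct pixel values.
    counts = {}
    for p in pixels:
        counts[p] = counts.get(p, 0) + 1

    # Cumulative count for each value, in ascending order.
    cdf = {}
    running = 0
    for value in sorted(counts):
        running += counts[value]
        cdf[value] = running

    cdf_min = min(cdf.values())
    span = len(pixels) - cdf_min
    if span == 0:
        # Every pixel has the same value.
        return [0] * len(pixels)
    return [(cdf[p] - cdf_min) * (levels - 1) // span for p in pixels]


# Invented data in Kelvin * 100: background, ice water and a soldering iron.
scene = [29500, 29550, 29600, 29650, 27315, 62315]
print(equalize(scene))
```

Because the output depends on each pixel's rank rather than its absolute value, the four background pixels here come out as 51, 102, 153 and 204 instead of being crushed into a handful of adjacent levels.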
Enabling AGC mode (and disabling radiometric calculations - something that took me a few weeks to figure out) generates 8-bit data out of the selected AGC algorithm. There are many parameters but they claim default values should produce good results. There is an additional, somewhat mysterious, AGC Calculation State Enable as well. The description states "This parameter controls the camera AGC calculations operations. If enabled, the current video histogram and AGC policy will be calculated for each input frame. If disabled, then no AGC calculations are performed and the current state of the ITT is preserved. For smooth AGC on/off operation, it is recommended to have this enabled."
Since reading the AGC section I have been interested in how well their implementation performs. I wrote a sketch, lepton_test8, that allows cycling between the original linear transformation and 4 variations of FLIR's AGC function. They are
"AGC disabled" - Radiometric mode enabled, my code linearly transforms 16-bit data to 8-bit data.
"AGC linear C" - AGC mode enabled, linear histogram stretch mode, AGC Calculation State enabled. Lepton outputs 8-bit data.
I then compared a scene with hot and cold components (soldering iron and ice-water with a room-temperature background) using the various modes and various color maps. I also did some informal testing with less dynamic scenes. The TL;DR summary is that the AGC modes generate distinctly better images in scenes with large dynamic range. They do only slightly better with more mono-temperature scenes. I couldn't see much difference between the linear histogram stretch and HEQ modes. I also couldn't see much difference with the AGC Calculation state enabled or disabled (perhaps I saw artifacts left behind when the camera panned).
Following are some pictures showing output with different modes. You can see the soldering iron and glass in the first image. The mode and current colormap is shown at the top of the camera's LCD display.
A big difference with the "Iron Black" and "Golden" colormaps.
The advantages of the built-in AGC modes were less pronounced when the temperature range of...
Leptons use an output-only slave SPI interface called VoSPI (Video SPI) to output pixel data with a maximum 20 MHz clock. It has the feeling of something that has evolved over time as FLIR added newer models. It definitely does not feel like something that would be designed from scratch. The Lepton's basic unit of data transfer is a 164- or 244-byte packet. The 164-byte packets contain 80 16-bit pixels (of which 8-, 14- or 16-bits of data may be valid for the Lepton 3.5 depending on its operating mode). The first 4 bytes are a 16-bit ID word and a 16-bit CRC word (that I have not, to-date, attempted to use). The 244-byte packets contain 80 24-bit (8-bit each for R, G and B) pixels and the ID and CRC words. The ID word carries a line number (0-59 or 0-62).
The 80x60 pixel Leptons (2 and 2.5) output 60 packets per frame (or 63 packets if telemetry data is also included with the pixel data). This turns out to be about 2 Mbits/second for a 100% dedicated interface.
The 160x120 pixel Leptons (3 and 3.5) modify the protocol a bit in order to carry 4x the data. They add a segment number to the ID word of packet 20. Segment numbers 1-4 indicate that the entire set of packets contains a valid segment. It definitely would have been easier (less buffering and easier processing) if the segment number were in the first packet. This family of devices requires a minimum of 8+ Mbits/second although I found that with the Teensy I had to use 16- or 20-MHz SPI clocks. I noticed most of the other demo programs also use much higher SPI clock rates.
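The data rates above can be checked with some quick arithmetic. This sketch assumes 164-byte packets, 60 packets per frame or segment (no telemetry) and the roughly 26.3 Hz internal frame rate quoted elsewhere in these logs; the names are made up for illustration.

```python
PACKET_BYTES = 164
PACKETS_PER_SEGMENT = 60     # 60 packets per frame (Lepton 2) or segment (Lepton 3)
INTERNAL_FRAME_HZ = 26.3     # approximate internal frame rate


def vospi_mbits_per_sec(segments_per_frame):
    """Minimum sustained VoSPI data rate in Mbits/second."""
    bits_per_frame = segments_per_frame * PACKETS_PER_SEGMENT * PACKET_BYTES * 8
    return bits_per_frame * INTERNAL_FRAME_HZ / 1e6


def segment_read_ms(spi_clock_hz):
    """Time to clock out one 60-packet segment at a given SPI clock."""
    return PACKETS_PER_SEGMENT * PACKET_BYTES * 8 / spi_clock_hz * 1e3


lepton2 = vospi_mbits_per_sec(1)                 # 80x60: one segment per frame
lepton3 = vospi_mbits_per_sec(4)                 # 160x120: four segments per frame
vsync_period_ms = 1e3 / (INTERNAL_FRAME_HZ * 4)  # time between Lepton 3 segments

print(f"Lepton 2/2.5 : {lepton2:.2f} Mbit/s")
print(f"Lepton 3/3.5 : {lepton3:.2f} Mbit/s")
print(f"VSYNC period : {vsync_period_ms:.2f} ms")
print(f"segment read : {segment_read_ms(16e6):.2f} ms at 16 MHz")
```

At the bare minimum clock the host would have to spend essentially the whole VSYNC period reading. A 16 MHz clock reads a segment in roughly half the period, which leaves time for discard packets and processing and may be one reason higher clock rates are so common in demo code.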
All Leptons have a hard requirement that the host must get the data out within three lines of when it is generated in the Lepton or it will lose synchronization and be unable to output valid data. All Leptons also output what are called "Discard packets" that are indicated by a specific bit-pattern in the ID Word. The host is to ignore these packets but keep reading for good data later.
In my testing I also found that until synchronized they may output nonsensical non-discard packets too.
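A small sketch of decoding the 4-byte packet header follows. The exact bit layout is not spelled out in this log, so the masks below are assumptions taken from my reading of FLIR's data sheet: the words are big-endian, a discard packet has 0xF in bits 11-8 of the ID word, the low 12 bits are the line number, and on line 20 of a Lepton 3/3.5 bits 14-12 hold the segment number. Check them against the data sheet before relying on them. The function and field names are hypothetical.

```python
import struct


def parse_header(packet):
    """Split the first 4 bytes of a VoSPI packet into its header fields."""
    id_word, crc = struct.unpack(">HH", packet[:4])
    return {
        # Assumed discard pattern: xFxx in the ID word.
        "discard": (id_word & 0x0F00) == 0x0F00,
        # Line (packet) number within the frame or segment.
        "line": id_word & 0x0FFF,
        # Segment number; only meaningful on line 20 of a Lepton 3/3.5.
        "segment": (id_word >> 12) & 0x7,
        "crc": crc,
    }


# Hand-made example packets: line 20 of segment 2, and a discard packet.
line20 = bytes([0x20, 0x14, 0xAB, 0xCD]) + bytes(160)
discard = bytes([0x0F, 0xFF, 0x00, 0x00]) + bytes(160)

print(parse_header(line20))
print(parse_header(discard))
```

A reader loop would drop any packet flagged as discard, treat a line number outside the expected range as a loss of sync, and only trust the segment field when the line number is 20.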
I wrote a quick Teensy sketch that enabled VSYNC and polled it until asserted. It then read packets until it saw a complete segment, or it saw invalid data (invalid line number) or a timeout was exceeded (maximum VSYNC period). The program could usually sync with the Lepton within a handful of VSYNC periods. The output from a typical run is shown below. Once synced you can see a new valid frame every twelve segments (or 1/3 the data frame rate). These are the frames that get displayed. Each line number (0-59) represents 160 bytes of pixel data.
It is interesting to see that alternating segments have discard packets. I don't have an explanation for this although I wonder if this has to do with the timing of my SPI bus (16 MHz) and the Lepton's internal processing rate.
Getting this code running meant I could get valid data out of the device.
Initially I connected the Lepton to a Raspberry Pi 3 and ran demos from the Group Gets repository. However many of those are written for the lower-resolution Lepton 2 family of modules. I hacked a couple of them with less-than-stellar results because of the much larger dataflow from the Lepton 3. I had more luck with Damien Walsh's leptonic but even it would occasionally lose sync on a lightly loaded Pi. This led to my decision to try to use the VSYNC output to make it easier to synchronize the VoSPI transfers to the Lepton's video engine.
FLIR has a reasonable set of default settings. For example the Lepton 3.5 is ready to output radiometric data (with absolute temperature values for each pixel) immediately after booting. However changing any of the default settings (e.g. enabling VSYNC) requires using the I2C interface to access its command interface (IDD). They provide a C++ library that is designed to compile on 32- or 64-bit Linux machines that I ended up porting to the Teensy Arduino environment. With this I was able to enable VSYNC and with the Teensy and Lepton on a proto board start to figure out how to reliably get video data out of it. It's a bit picky about the timing and data gets garbled if the host isn't able to keep up.
The Lepton 3 and 3.5 output data in 4 segments, each one quarter of a complete frame. The segment length may vary depending on whether the data is formatted as temperature or AGC-processed data, is 24-bit RGB colorized values, or includes additional telemetry information. Each segment consists of a set of 60 (or more) 164-byte or 244-byte packets, including packets specifically designed to be discarded while the Lepton prepares the valid data and any optional telemetry data. Although the Lepton's internal frame rate is about 26.3 Hz, because of government regulations it only outputs ~8.8 frames of video per second. However it generates data for all frames leading to a VSYNC rate of ~105.3 Hz (4 segments/frame). This means the host has to read, and process for validity, an entire segment after each VSYNC. It took me a while before I could manage this on the Teensy.
The host must resynchronize with the Lepton whenever it gets out of sync or it will receive only garbage data. This requires idling the VoSPI interface for at least 186 mSec. I found that it takes the Lepton a few VSYNC pulses to start outputting valid data. When the host is in sync then it will output 4 consecutive good segments on 4 consecutive VSYNC pulses every 12 VSYNC pulses. The other eight VSYNC pulses contain invalid segments (identified with a segment number of 0). Because my test fixture uses a single SPI interface to read data from the Lepton and then write frame buffer data to the LCD display the test sketches are only able to reach about half of the maximum frame rate (~4.4 frames/sec).
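As a rough sketch of how valid segments turn into a frame (not the code from the test sketches), the function below copies one segment into a flat 160x120 buffer. It assumes 60 packets of 80 pixels per segment with no telemetry, and that packets arrive in raster order so that two consecutive packets make up one 160-pixel row. The names are hypothetical.

```python
WIDTH, HEIGHT = 160, 120
PIXELS_PER_PACKET = 80
PACKETS_PER_SEGMENT = 60


def place_segment(frame, segment_number, packets):
    """Copy one segment (60 lists of 80 pixels) into a flat 160x120 frame.

    Returns False, leaving the frame untouched, for segment number 0,
    which marks an invalid segment.
    """
    if not 1 <= segment_number <= 4:
        return False
    if len(packets) != PACKETS_PER_SEGMENT:
        raise ValueError("expected 60 packets per segment")
    for line, pixels in enumerate(packets):
        if len(pixels) != PIXELS_PER_PACKET:
            raise ValueError("expected 80 pixels per packet")
        # Packets are stored back to back, so two packets fill one row.
        packet_index = (segment_number - 1) * PACKETS_PER_SEGMENT + line
        start = packet_index * PIXELS_PER_PACKET
        frame[start:start + PIXELS_PER_PACKET] = pixels
    return True


# Fill a frame from four fake segments whose pixels encode segment and line.
frame = [0] * (WIDTH * HEIGHT)
for seg in range(1, 5):
    fake = [[seg * 100 + line] * PIXELS_PER_PACKET
            for line in range(PACKETS_PER_SEGMENT)]
    place_segment(frame, seg, fake)

print(frame[0], frame[80], frame[30 * WIDTH], frame[-1])
```

With this layout segment 1 covers rows 0-29, segment 2 rows 30-59 and so on. A frame is complete once segments 1 through 4 have been placed on four consecutive VSYNC pulses.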
I put the ported IDD library and three test sketches into a github repository. The sketches also use the Adafruit ILI9341 and GFX libraries for the LCD module (although the code has its own routines to write pixel data to the display).
lep_test6 - This sketch takes the default 16-bit absolute temperature (Kelvin * 100) value of each pixel and linearly scales the data to 8-bit values that can be run through a color map look-up table for display. Because the data has absolute temperature values it is easy to display the temperature (currently without worrying about any real-world emissivity issues) of the center of the image. Normally the scaling maps the data from the minimum and maximum in the frame. However this sketch allows the user to select a few temperature ranges to scale the data in so the image does not change radically as different temperatures enter or exit the frame.
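The two ideas in this description, converting Kelvin * 100 to a temperature and scaling against a fixed range instead of the frame's own minimum and maximum, can be sketched as follows. This is an illustration with invented function names and an invented 0-50°C range, not the code from lep_test6.

```python
def to_celsius(raw):
    """Convert a radiometric pixel value (Kelvin * 100) to degrees Celsius."""
    return raw / 100.0 - 273.15


def scale_fixed(raw, lo_c, hi_c):
    """Map a pixel to 0-255 against a fixed Celsius range, clamping outliers."""
    t = to_celsius(raw)
    t = min(max(t, lo_c), hi_c)
    return round((t - lo_c) * 255 / (hi_c - lo_c))


# Invented sample pixels: 10 C, 25 C and a 350 C soldering iron.
for raw in (28315, 29815, 62315):
    print(f"{raw} -> {to_celsius(raw):.2f} C -> level {scale_fixed(raw, 0.0, 50.0)}")
```

Because the range is fixed, a hot object entering the frame simply saturates at 255 rather than rescaling every other pixel, which is what keeps the image from changing radically.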
The Lepton also has a more sophisticated AGC capability that is claimed to produce better...