What I’ve Actually Built
At the core of the current prototype is an STM32H7 microcontroller. It drives a 720×720 RGB666 display and connects over Wi-Fi using the nRF7002 companion chip, fully controlled by the MCU with no separate nRF application core involved. The STM32H7 has two cores: one handles networking, the other decoding. In the near future I will move decoding to a low-power FPGA and networking to a smaller, more lightweight MCU (see Roadmap).
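To make the two-core split concrete, here is a minimal sketch (my own illustration, not the actual firmware) of how the networking core could hand received packets to the decoding core through a ring buffer in shared RAM. The section name, sizes, and function names are placeholders; a real implementation would add cache maintenance and HSEM or interrupt signaling between the cores:

```c
#include <stdint.h>

#define RING_SLOTS 8
#define SLOT_BYTES 1536

typedef struct {
    volatile uint32_t head;           /* written only by the network core */
    volatile uint32_t tail;           /* written only by the decoder core */
    uint16_t len[RING_SLOTS];
    uint8_t  data[RING_SLOTS][SLOT_BYTES];
} pkt_ring_t;

/* Placed in a RAM region visible to both cores (placeholder section name). */
__attribute__((section(".shared_ram"))) pkt_ring_t g_ring;

/* Network core: enqueue one received packet; returns 0 if the ring is full. */
int ring_push(const uint8_t *pkt, uint16_t len)
{
    uint32_t h = g_ring.head;
    if (((h + 1U) % RING_SLOTS) == g_ring.tail)
        return 0;                     /* ring full: drop or retry later */
    if (len > SLOT_BYTES)
        len = SLOT_BYTES;             /* clamp oversize packets */
    for (uint16_t i = 0; i < len; i++)
        g_ring.data[h][i] = pkt[i];
    g_ring.len[h] = len;
    g_ring.head = (h + 1U) % RING_SLOTS;  /* publish only after the data is written */
    return 1;
}
```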
The device receives live GUI frames streamed from a Linux machine running a custom video encoder. Within a frame, the compression is close to JPEG, but instead of MJPEG I use delta-frame logic based on differences between consecutive frames, which keeps decoding simple enough for an MCU. The STM32H7 does support hardware-accelerated JPEG decoding, but I'm not using it yet; it's too cumbersome to integrate with my encoder setup, though not impossible.
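To illustrate the delta-frame idea, here is a minimal sketch (not the actual codec): compare the previous and current frame tile by tile and hand only the changed tiles to the intra coder. The tile size and emit_tile are hypothetical:

```c
#include <stdint.h>
#include <string.h>

#define FRAME_W 720
#define FRAME_H 720
#define TILE    16   /* hypothetical tile size; 720 divides evenly */

/* Emit one changed tile; in a real pipeline this would feed the
 * JPEG-like intra coder rather than copy raw pixels. */
extern void emit_tile(int tx, int ty, const uint8_t *rgb888);

/* Compare the current frame against the previous one tile by tile
 * and emit only the tiles whose pixels differ (the "delta"). */
void encode_delta(const uint8_t *prev, const uint8_t *cur)
{
    for (int ty = 0; ty < FRAME_H / TILE; ty++) {
        for (int tx = 0; tx < FRAME_W / TILE; tx++) {
            int changed = 0;
            for (int row = 0; row < TILE && !changed; row++) {
                size_t off = ((size_t)(ty * TILE + row) * FRAME_W
                              + (size_t)tx * TILE) * 3;  /* RGB888 offset */
                changed = memcmp(prev + off, cur + off, TILE * 3) != 0;
            }
            if (changed)
                emit_tile(tx, ty, cur);  /* static regions cost almost nothing */
        }
    }
}
```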
Bandwidth depends on GUI complexity, ranging from nearly nothing up to ~25 Mbit/s at the target 25 FPS. These numbers mean little on their own, so check the videos on YouTube (and the Research page for more background).
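To put the worst case in perspective: at 25 FPS, 25 Mbit/s works out to about 1 Mbit (~125 KB) per frame, while an uncompressed 720×720 RGB888 frame is roughly 1.5 MB, so even the heaviest GUI content compresses to under a tenth of the raw size.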
On the client side, decoding a 720×720 RGB888 frame takes between 40 and 150 ms depending on content. For worst-case testing, I transmit RGB888-compressed data even though the display is RGB666: RGB888 is maintained throughout the pipeline, and the LTDC simply ignores the two least-significant bits of each color channel when reading from the framebuffer. Since I plan to support more displays in the future, testing the worst case makes more sense for now.
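For a rough picture of the display side, this is how an LTDC layer is typically configured through the standard STM32 HAL to scan out RGB888; the handle name and framebuffer address are placeholders, not my actual code:

```c
#include "stm32h7xx_hal.h"

extern LTDC_HandleTypeDef hltdc;   /* assumed initialized elsewhere */

void config_rgb888_layer(void)
{
    LTDC_LayerCfgTypeDef layer = {0};

    layer.WindowX0        = 0;
    layer.WindowX1        = 720;
    layer.WindowY0        = 0;
    layer.WindowY1        = 720;
    layer.PixelFormat     = LTDC_PIXEL_FORMAT_RGB888;  /* 3 bytes per pixel in memory */
    layer.Alpha           = 255;
    layer.BlendingFactor1 = LTDC_BLENDING_FACTOR1_CA;
    layer.BlendingFactor2 = LTDC_BLENDING_FACTOR2_CA;
    layer.FBStartAdress   = 0xC0000000U;  /* framebuffer in FMC SDRAM (placeholder; sic: HAL spelling) */
    layer.ImageWidth      = 720;
    layer.ImageHeight     = 720;

    /* An RGB666 panel simply leaves the two LSBs of each channel
     * unconnected, so no pixel conversion is needed anywhere. */
    HAL_LTDC_ConfigLayer(&hltdc, &layer, 0);
}
```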
End-to-end latency, from host encoding to display decoding, is low enough for responsive use with simple GUIs. The Linux host handles all the heavy lifting; the device just receives rendered frames over the network. It is already usable for some cases, though I'm not fully happy with the result. See the Roadmap for how I plan to speed things up further.
The Linux server runs a software encoder that takes approximately 20 ms per frame on an Intel i7-4820K CPU. It's currently single-threaded and unoptimized, with no SIMD or GPU acceleration, so I'm confident I can drastically improve its speed.
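As a taste of the available headroom: the hot loop of a delta encoder is mostly byte comparison, which vectorizes trivially. A hypothetical SSE2 version of a 16-byte change test (illustrative only, not the actual encoder) might look like this:

```c
#include <emmintrin.h>  /* SSE2, available on any x86-64 Linux host */
#include <stdint.h>

/* Test whether a 16-byte span of two frames differs using one SSE2
 * compare instead of a byte loop. */
static int span_changed_sse2(const uint8_t *prev, const uint8_t *cur)
{
    __m128i a  = _mm_loadu_si128((const __m128i *)prev);
    __m128i b  = _mm_loadu_si128((const __m128i *)cur);
    __m128i eq = _mm_cmpeq_epi8(a, b);       /* 0xFF where bytes match */
    return _mm_movemask_epi8(eq) != 0xFFFF;  /* any mismatch => changed */
}
```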
Current Hardware Stack
- STM32H7 MCU
- nRF7002 Wi-Fi companion
- SDRAM framebuffer via FMC (memory map sketched below)
- 720×720 RGB666 display
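As a rough sketch of how the SDRAM is used (0xC0000000 is the usual STM32H7 FMC SDRAM bank address; treat the exact values as placeholders):

```c
#include <stdint.h>

/* Framebuffer placement in external SDRAM behind the FMC. The exact
 * mapping depends on the board, so these values are placeholders. */
#define SDRAM_BASE   0xC0000000U
#define FRAME_W      720
#define FRAME_H      720
#define BPP          3   /* RGB888 kept through the whole pipeline */

/* One 720x720 RGB888 frame is ~1.5 MB, far beyond on-chip SRAM,
 * which is why the framebuffer lives in external SDRAM at all. */
#define FRAME_BYTES  ((uint32_t)FRAME_W * FRAME_H * BPP)  /* 1,555,200 bytes */
static uint8_t *const framebuffer = (uint8_t *)SDRAM_BASE;
```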
Why I Built It This Way
I’ve always wanted a Linux phone or communicator. Projects like PinePhone and Librem 5 are inspiring—they bring Linux to mobile form factors and promote open hardware. But they aim to be full-featured mobile computers. That’s not my goal.
I want something more focused: minimal, purpose-built as a GUI terminal, and reusable in other projects. A screen into my Linux system, capable of displaying full desktop applications while keeping the embedded device lightweight and efficient.
As a developer, I need a system that fits how I already work. No mobile API sandboxes, no permission dialogs. Just Qt, Python, Java, or a Flask webserver—running on a proper Linux host. Most importantly, I want full control over peripherals: input devices, sensors, actuators, and more.
Why Not Just Use VNC, RDP, or X2Go?
I learned a lot—that’s what matters most to me. Beyond that, I wanted the flexibility of having my own embedded-focused codec I can adapt and modify at any time.
This isn’t just a protocol tweak. The entire pipeline—encoding, decoding, memory usage—is purpose-built for constrained environments.
Power Use and Performance Outlook
During active streaming, the device draws around 850 mW (MCU + Wi-Fi + SDRAM) and up to 1 W for the display at full brightness—totaling approximately 1.9 W.
With a standard 3.7 V, 3000 mAh lithium battery, that yields about 4–5 hours of continuous operation. Lower brightness or idle usage will extend runtime.
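Checking the math: 3.7 V × 3000 mAh ≈ 11.1 Wh, and 11.1 Wh ÷ 1.9 W ≈ 5.8 hours in the ideal case; assuming roughly 80% conversion efficiency (my assumption, not a measured figure), that drops to about 4.7 hours, in line with the 4–5 hour estimate.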
Planned improvements:
- Offloading the codec to an FPGA to reduce latency and power (see Roadmap)
- Support for multiple displays
The Modular Future
This project is not just about a display—it’s...