2 days ago •
After a truly terrible week of meatspace problems, I'm back.
I've been working on and off on a 3D printed case for the cluster. The case doesn't need to be beautiful, but it does need to protect the cluster and restrain the Zeros so they don't flop around on the backplane's USB headers.
One problem, though: the HPL runs from earlier show that the CPUs can reach over 70 °C under load in ambient air. In an enclosure they'll run much hotter even with vents for convective airflow, and PLA starts to soften around 65 °C...
Thus a fan is going into the case somehow.
Enjoy this pic of a test fit print while I figure this out:
10/04/2017 at 04:07 •
Although performance is not the goal here, I wasn't happy that the cluster wasn't benching anywhere near what I expected with HPL. After some head scratching, I found out that the math library that I used to build the benchmark (libatlas) is built for soft-float in the RPi repos.
I rebuilt HPL with the libopenblas math library, added the head node into the pool, and hooked up my bench supply so I can use its current meter.
Now the terrible cluster does a max of 1.281 GFLOPS and drew an average of 4.962 W over the run. That means it's only 72 million times slower than the fastest computer on the June '17 Top500, and at an efficiency of 0.258 GFLOPS/W it's 4.9 times more efficient than the least efficient computer on the same list.
(corrected with a slightly higher score after retesting 10/5/17)
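The two comparisons above are just a benchmark score divided by measured power, and a ratio against the Top500 #1 (Sunway TaihuLight, Rmax 93,014.6 TFLOPS on the June 2017 list). A quick sketch of that arithmetic, using the numbers from the run above:

```shell
#!/bin/sh
# Sanity-check the efficiency claim and the gap to the Top500 #1.
awk 'BEGIN {
    gflops = 1.281          # cluster HPL score
    watts  = 4.962          # average draw measured on the bench supply
    top1   = 93014600       # TaihuLight Rmax, converted to GFLOPS
    printf "%.3f GFLOPS/W\n", gflops / watts
    printf "%.1f million times slower\n", top1 / gflops / 1e6
}'
```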
10/02/2017 at 21:46 •
Following instructions found here, I ran the HPL benchmark on the four compute nodes of the cluster. On the first try, I'm getting 390 MFLOPS.
That's 300 million times slower than the fastest ranked supercomputer as of June 2017.
However, it's at worst only 40% less efficient* than the least power efficient computer on the same list, so that's nice I guess.
*Didn't measure average power during the run, but it does have a 5V 2A power supply and isn't hitting the limits. So I'm assuming 10W running full out.
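The footnote's estimate can be spelled out: the worst case is the supply's full 5 V × 2 A rating, and efficiency is the 0.390 GFLOPS score over that assumed draw. A sketch:

```shell
#!/bin/sh
# Worst-case power assumption: the 5 V supply running at its full 2 A rating.
awk 'BEGIN {
    watts = 5 * 2           # assumed worst-case draw, in watts
    eff   = 0.390 / watts   # 390 MFLOPS expressed as GFLOPS, per watt
    printf "%.4f GFLOPS/W at %d W assumed\n", eff, watts
}'
```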
09/28/2017 at 02:42 •
I spent the last couple of weeks trying to figure out the best way to deploy an OS to the nodes over USB. I spent a few late nights trying to roll my own minimal Raspbian with debootstrap, and a few more scratching my head over how to use USB boot and an initramfs to write the SD card on first boot, or to reformat and reinstall on demand. And I was hesitant to write anything to automate any of this, since I have my heart set on using Ansible for deployment and I'm still an Ansible noob. I was busy yak shaving, and I still didn't have a working way to copy an OS out to all the nodes over USB.
And then I remembered what I named this project.
So I went for the easy way out. Stick with the stock Raspbian Lite image. Modify the image so that it fails to boot from SD and falls back to USB boot. Use the mass storage mode in rpiboot to write the Raspbian SD image to each node. Use rpiboot to serve a different cmdline.txt boot file based on the USB hub port, so each node gets its own unique USB networking MAC address. And use shell scripts for image generation and SD writing instead of making this another Ansible lesson. None of this is how I'd imagined it working, but at least it's working now. Later I can move the process to Ansible and try different OS images and deployment schemes.
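The per-port cmdline.txt trick hinges on the g_ether gadget module accepting dev_addr/host_addr parameters, so each port's boot file can bake in its own MACs. A minimal sketch of the generation step — the node directory layout, the kernel arguments, and the locally administered MAC prefix are my own assumptions, not the exact script from this build:

```shell
#!/bin/sh
# Generate one cmdline.txt per hub port, each with unique g_ether MAC
# addresses. node$n/ is a hypothetical per-port directory for rpiboot
# to serve boot files from.
for n in 1 2 3 4; do
    mkdir -p "node$n"
    printf '%s g_ether.dev_addr=02:00:00:00:00:0%d g_ether.host_addr=02:00:00:00:01:0%d\n' \
        'console=serial0,115200 root=/dev/mmcblk0p2 rootwait modules-load=dwc2,g_ether' \
        "$n" "$n" > "node$n/cmdline.txt"
done
```

The 02:... prefix marks the addresses as locally administered, so they can't collide with any real vendor-assigned MAC.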
Three evenings later, the cluster can now write a Raspbian .img to all four nodes very slowly (30 minutes or so for all four), boot them very slowly with unique MACs and IPs (about 5 minutes for all to come up), and take remote SSH logins from the head node.
It's hard (and boring) to show any of this in action, so here's a session showing a network between the head node and all compute nodes at once:
See? The Terrible Cluster lives up to its name.
09/17/2017 at 05:37 •
I soldered the rest of the board. Realized I had ordered the wrong part for the USB power connector and bodged it on anyway. I'll update on functionality in a few days.
But for now, enjoy some photos of the assembled board and 5 Pi Zeros:
09/14/2017 at 05:59 •
The boards came in today, so naturally I had to get soldering despite being tired from working late last night.
I populated everything but the USB power jack and the four downstream node USB plugs. I didn't want to waste the connectors until I checked out the upstream end of the hub.
I first smoke-tested the board with my bench PSU and checked that the oscillator was doing its thing. Once satisfied that I didn't have any surprise dead shorts, I hooked it to my PC through a micro USB jack to USB-A plug cable going to an intermediate sacrificial hub. The hub enumerated under Linux, and I confirmed that it reports per-port power control.
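Checking for per-port power switching amounts to grepping the hub descriptor out of `lsusb -v`. A small helper along those lines — the function name is my own, and 04b4 is Cypress's USB vendor ID (check plain `lsusb` output for the actual IDs on the board):

```shell
#!/bin/sh
# hub_has_ppps: read `lsusb -v` output on stdin and succeed if the hub
# descriptor advertises per-port power switching.
hub_has_ppps() {
    grep -qi 'per-port power switching'
}

# On real hardware:
#   lsusb -v -d 04b4: | hub_has_ppps && echo 'per-port power control present'
```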
I also did a test fit of a Pi Zero W, and since I'm now confident it's not going to do any damage, I powered it up. The Pi's LED did its blinking thing as usual while Linux booted.
Stuff I learned:
- The hole dimensions for the USB vertical plug are too big to fit snugly. I had to "justify" the upstream plug with one edge of the board. I'll need to remember to align the other four the same way to keep the Zeros spaced uniformly.
- The Cypress hub IC shuts off the 12MHz oscillator when nothing interesting is happening, probably to save power. I didn't catch this detail in the datasheet. I spent a good half hour wondering why I'd see a fleeting 12MHz on the scope on power-on and nothing afterwards. I suspect it will stay on once I get downstream devices connected.
- My thermal reliefs aren't that relieving. I soldered in one of the electrolytics backwards, and it was a pain to desolder and clean the hole on the grounded leg. I should have used thinner spokes on the reliefs, and made sure that there were no additional ground traces hiding under the plane fill. It would also help if I replaced the solder sucker I broke a few years ago.
- Just because silkscreen art looks good in the KiCad 3D render doesn't mean it will look the same on a finished board. For grins I converted a photo of a picture one of my kids drew in school into a KiCad symbol. What was supposed to be a skull looks more like an upside-down diseased pear. Elecrow and other PCB manufacturers have a resolution limit on their silkscreen.
Next I need to get the rest of the connectors in, test downstream power control, make up a test cable for the downstream ports (more later), and maybe plug in some more Zeros. But that will have to wait for the weekend.
09/05/2017 at 16:04 •
Last week I submitted the Terrible Cluster backplane PCB to Elecrow for manufacturing, and this weekend received notification that it has shipped. Elecrow is nice enough to take a photo of the boards before shipping, and I can see that the plated slots for the microUSB plugs and the power jack were done correctly. The graphic on the back, a skull my son drew last year, doesn't look so hot. But I'm not losing any sleep over that.
I still haven't ordered parts. I have a BOM on DigiKey that's ready to go once I see the boards make it through customs. I'm guessing it's a couple of weeks before I can get down to soldering.
Meanwhile, let me catch you up.
A few weeks ago at work I had my first experience using Ansible. I needed to automatically build firmware for a legacy product that has very specific host requirements. At the suggestion of a coworker, I used Ansible and Vagrant to assemble VMs with the right build environment and see the build through to completion. While that was fun, it didn't give me an opportunity to use Ansible for parallel deployment. I had a pile of Raspberry Pi Zeros, an interest in HPC, and an itch to make another PCB. And so I decided to make a simple cluster.
I first did a mock-up using a Pi 3 and three Pi Zeros connected over USB device mode to gauge network performance. I lost the numbers as they were unimpressive, but with all Zeros doing bidirectional transfers, upstream performance was in the 10 Mb/s range and downstream was around 80 Mb/s. That's going to drag down any tightly coupled compute jobs, but since I plan to use it for playing with deployment and management, I think it will be fine.
I then looked at packaging. I ended up choosing a through-hole soldered USB micro vertical plug from Hirose rather than the "dock" SMT one used by other similar projects, so that I could fit the Pis closer together. The SMT pins on the dock connector live on the long edge of the connector, so hand soldering would be near impossible unless I spaced them 20 mm or so apart. Cost is comparable, and the through-hole part is narrower, which gives me more board space to work with. For some reason, this connector and others like it require a board that's under 1 mm thick. Fortunately Elecrow can do 0.8 mm boards. I will have to take board flex into account when I make a case for this, since 0.8 mm could flex too easily under the strain of the vertically mounted Pi Zeros.
I ended up laying out a board around the Cypress CY7C65632, a 4-port USB 2.0 high-speed hub controller. I chose this part for its on-chip 3.3 V regulator and the ability to configure operating modes without an external EEPROM. I added an AP2191DWG USB power controller and a big 220 µF cap to each downstream port so that the head node can control power for each port. Each power controller is driven by the hub, and the overcurrent signals are fed back to the hub to handle any unpleasantness. The AP2191 can handle 1.5 A per port, and I don't expect any of the nodes to go above 0.7 A.
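With the hub advertising per-port power switching, the head node can bounce an individual Zero from software. A sketch using the third-party uhubctl utility — the hub location 1-1 is a placeholder (run `uhubctl` with no arguments to find the real one), and this is my own illustration, not the script from this build:

```shell
#!/bin/sh
# power_cycle_port: turn a downstream port off, wait, and turn it back on.
# Relies on the third-party uhubctl utility and a hub that supports
# per-port power switching.
power_cycle_port() {
    loc=$1      # hub location, e.g. 1-1
    port=$2     # downstream port number, 1-4 on this backplane
    uhubctl -l "$loc" -p "$port" -a off
    sleep 2
    uhubctl -l "$loc" -p "$port" -a on
}

# usage: power_cycle_port 1-1 3
```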
I strayed pretty far from Cypress's design guide, using a two-layer board instead of four and putting the smaller decoupling caps around the periphery of the hub chip instead of on the back side. I think I did an OK job of keeping loops small for the power pins and maintaining USB differential impedance, so I'm hoping for the best.
- 3D printed enclosure with strain relief for the backplane
- Build and bringup v0.1 backplane
- Shell scripts for USB port status monitoring and power control
- Set up head node to USB boot the compute nodes
- IP assignment based on USB port