I know that Pi clusters have been done, redone and overdone so nothing new under the sun. YouTube is already full of videos that show projects with 2 to 100s of RPi doing various stuff in different ways. But for the #PEAC Pisano with End-Around Carry algorithm project, I need a big cluster. Like, 10K cores or so. I'm not sure I can get them all with Pis (ha ha ha in advance) but I can slowly build up something that could eventually get me there, one day. It's a re-purposable project following an evolutive path that can spin-off a few cool and eventually marketable designs.
Now, consider that a basic ping to a RPi takes almost 1ms but toggling a GPIO pin is about one microsecond. For trivially parallel algorithms, this does not matter much but not everything is simple in practice...
The SPI interface can move data with low latency at almost 30Mbps and the SMI port has even more power (that I still must learn to harness).
How could that help with a cluster ? This can reduce latency both physically as shown, and through a leaner program because dealing with IO pins takes some lines of inline C at most, no need to call external libraries or the OS. The synchronisation primitives are then implemented in external HW. Many topologies and organisations are possible and any one that I implement will not fit a particular problem/project so I simply put a FPGA.
Intended is the affordable ProASIC3 A3P250 PQ208 to provide 150 IO pins, with 8 internal reconfigurable FIFOs/memory blocks that you can then use as mailboxes, transfer queues, glue logic... The PQ208 package can accommodate higher density parts (up to A3P1000 and A3PE3000 if you are rich) if more memory, processing, logic etc. are needed. Or use another FPGA brand/family if you prefer.
4 RPis is a good, square quantity, first and mostly because cheap Ethernet hubs have 5 ports, and you need the 5th port to connect the cluster to something else. This also allocates 2 FIFOs per Pi in the FPGA, which must preserve pins for extensions (like, connecting to other FPGA in a token ring maybe).
Maybe I'll find a way to boot, load and configure the program through the interco system, as well as control/sequence the power up/down. It's not a KVM but it gets closer to the comfort that you'll find in half-decent Beowulf clusters (I learned a few things while working for a cluster company around 2002).
Maybe a simplified, cheaper, smaller version for the Pi Zero or Compute Module could be derived from this later but for now I want the comfort of Ethernet. The 40-pin GPIO header is the focus of this project.