For this project, I'm enlisting the ProASIC3 family because:
- I know it very well,
- It's suited to the task (no need for ultra-high-performance features, the speed is right)
- The price is OK (look them up on eBay: the A3P250 is around $15)
- The PQ208 package is reasonably easy to solder at home (so the end user could swap parts or hack it further)
- The SW is free (as in free beer), works on Linux and Windows, is not as crazy to install as others I've tried, and offers a choice of locks for the licence (it's not crazy constraining). Just make sure you get a compatible FlashPro JTAG probe, or a suitable equivalent.
- I have stock.
Now let's look at the product table from the official site:
The A3P125 is the smallest in QFP208 and is capable of minimal functions, though one detail matters. Not only does it offer just 133 GPIO, it also has only 2 I/O banks, read: only 2 independent voltage zones. The others have 4 banks whose voltages can vary independently, meaning: you can hotswap. Wouldn't it be nice if you could shut down, remove or add a node while the cluster is operating?
So the "minimum specification" for my potato cluster (youtube reference) is the A3P250PQG208. It has 3072 LUT3 gates (each serving either as logic or as a DFF), which is enough for normal interconnect management, and 8 small dual-port SRAM blocks that are easily configured as FIFOs (by enabling the dedicated circuit). I have easily pushed that type of chip above 60MHz with real designs, and synthesis around 100MHz is possible with some care. This is the frequency range where the Pi's GPIO pins can operate rather reliably, so it's a great match.
The file A3P_QFP208_pinout.txt shows the pinout differences between the various chip densities, so a single PCB can accommodate most of them. If I get far enough, the pin layout files will be public, of course.
From there on, if the A3P250 is too tight for you, you can look at the A3P400 (50% more resources) and the A3P600 (24 SRAM blocks and 7K LUT3) for when your routing protocols get crazy and you need more buffering. The A3P600 provides a depth of maybe 4 or 5 FIFOs, or 2KB, per Pi, which is getting overkill unless you have a lousy protocol.
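To put rough numbers on that buffering, here is a small Python sketch that splits a chip's SRAM blocks evenly between the Pis. Everything not in the text above is an assumption: 512 bytes of payload per block (a 4608-bit block used as 512 x 9), 12 blocks for the A3P400 (derived from "50% more resources") and a hypothetical 5 Pis per FPGA, a guess picked only because it lands in the "4 or 5 FIFOs, 2KB per Pi" ballpark.

```python
# Rough per-node buffer budget. Not a statement about the real design.
BYTES_PER_BLOCK = 512  # assumption: one 4608-bit SRAM block used as 512 x 9

# SRAM block counts quoted in the text; the A3P400 figure is derived
# from "50% more resources" than the A3P250 and is therefore an assumption.
SRAM_BLOCKS = {"A3P250": 8, "A3P400": 12, "A3P600": 24, "A3P1000": 32}


def buffer_budget(sram_blocks, nodes):
    """Split SRAM blocks evenly between nodes.

    Returns (blocks per node, bytes per node, leftover blocks).
    """
    blocks_per_node, spare = divmod(sram_blocks, nodes)
    return blocks_per_node, blocks_per_node * BYTES_PER_BLOCK, spare


if __name__ == "__main__":
    NODES = 5  # hypothetical number of Pis attached to one FPGA
    for name, blocks in SRAM_BLOCKS.items():
        per_node, nbytes, spare = buffer_budget(blocks, NODES)
        print(f"{name}: {per_node} block(s) = {nbytes} bytes per Pi, {spare} spare")
```

Change NODES to match your own cluster; the leftover blocks are what remains for shared buffers or anything else.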
The top of the line is the A3P1000 with its 32 SRAM blocks and 11K tiles. I don't know what you'd want to do with that, unless you want to integrate a softcore CPU and/or more sophisticated interfaces instead of a basic message-passing link between quads. At least you have the choice.
Beyond that you'll find the A3PE1500 and A3PE3000. They're just massive and expensive. I doubt anybody would use them, so I haven't checked their pinout compatibility. The A3PE1500, however, has 6 independent GPIO zones, which could add another benefit (better hotplug support), but at a high cost.
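The selection reasoning above boils down to "take the smallest pin-compatible part with enough SRAM blocks, and insist on 4 I/O banks if you want hotswap". A minimal Python sketch of that rule follows. The Device type and pick() helper are made up for illustration, the A3P400 block count is derived from "50% more resources", and the A3P125 block count is not given in the text, so treat both as assumptions to check against the product table.

```python
from typing import NamedTuple, Optional


class Device(NamedTuple):
    name: str
    sram_blocks: int
    io_banks: int


# Ordered from smallest to largest. Only the QFP208 parts discussed above.
DEVICES = [
    Device("A3P125", 8, 2),    # SRAM count assumed, not stated in the text
    Device("A3P250", 8, 4),
    Device("A3P400", 12, 4),   # assumed: 50% more than the A3P250
    Device("A3P600", 24, 4),
    Device("A3P1000", 32, 4),
]


def pick(min_sram_blocks: int, need_hotswap: bool = True) -> Optional[Device]:
    """Return the smallest device that fits, or None if none does.

    Hotswap is taken to require 4 independent I/O banks, as argued above.
    """
    for dev in DEVICES:
        if dev.sram_blocks < min_sram_blocks:
            continue
        if need_hotswap and dev.io_banks < 4:
            continue
        return dev
    return None


if __name__ == "__main__":
    for blocks in (8, 20, 40):
        dev = pick(blocks)
        print(blocks, "blocks ->", dev.name if dev else "nothing in QFP208")
```

The A3PE parts are left out on purpose, since their pinout compatibility has not been checked.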