ESP32 cluster system

The ESP32's 600 DMIPS of performance is enough to motivate me.

The ESP32 reaches up to 600 DMIPS with its dual-core 32-bit 240 MHz processor. This performance is comparable with a general-purpose CPU of ten years ago. Why not make an experimental cluster system with such a capable MCU?

This project will be open to all. I plan to share the board on Tindie, so please join and make something good!!

As written in the introduction, the ESP32 reaches a maximum of 600 DMIPS, which is comparable with a Pentium II of 20 years ago. From the viewpoint of cost, such old computers are nowadays available at a very low price (if they still work, though..), but in terms of weight, size, and power consumption the ESP32 is very handy, and we have the possibility of making a "super computer" by clustering. I would like to clarify that this is not a serious project; the purpose is "making something fun".

The most common and popular cluster systems are built from Raspberry Pis connected by wired Ethernet. But the ESP32 does not have native Ethernet (I know it is possible, and I made one with the LAN8720A), and a wireless link over WiFi or Bluetooth would have a big overhead and the connection may have reliability issues. Instead of an Ethernet or wireless connection through TCP/IP, this system has an interconnected GPIO bus, as shown in the above picture. At this moment the data structure and usage are not defined, but a 16-GPIO interconnection should be enough to exchange data.
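Since the bus data format is still undecided, the following is only a hypothetical sketch of how one 16-bit word could be mapped onto the 16 bus lines. The field layout (2-bit source, 2-bit destination, 12-bit payload) and all function names are assumptions for illustration, not part of the actual design.

```python
# Hypothetical 16-bit word layout for the 16-line GPIO bus (an assumption,
# not the project's format):
#   bits 15-14: source node (0-3)
#   bits 13-12: destination node (0-3)
#   bits 11-0 : payload
BUS_WIDTH = 16


def pack_word(src, dst, payload):
    """Combine the three fields into one 16-bit bus word."""
    if not (0 <= src < 4 and 0 <= dst < 4 and 0 <= payload < 4096):
        raise ValueError("field out of range")
    return (src << 14) | (dst << 12) | payload


def unpack_word(word):
    """Split a 16-bit bus word back into (src, dst, payload)."""
    return (word >> 14) & 0x3, (word >> 12) & 0x3, word & 0xFFF


def word_to_levels(word):
    """One logic level per bus line; line 0 carries the least significant bit."""
    return [(word >> i) & 1 for i in range(BUS_WIDTH)]


def levels_to_word(levels):
    """Rebuild the word from the levels read on the bus lines."""
    return sum(bit << i for i, bit in enumerate(levels))
```

With a layout like this, a single bus cycle could carry a small addressed message, and larger data would need several cycles plus some strobe or handshake line, which is not modeled here.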

The above picture shows the whole schematic of this experimental system, consisting of four ESP32s (8 cores in total).
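As a back-of-the-envelope check, the peak figures for the whole board can be worked out from the numbers quoted above. This is a rough sketch that assumes the 600 DMIPS per-chip peak scales linearly over the nodes, which ignores all communication overhead; the function name is only for illustration.

```python
# Peak figures for the 4-node board, from the numbers quoted in the text.
# Assumption: linear scaling with no communication overhead.
DMIPS_PER_CHIP = 600   # quoted peak for one dual-core ESP32 at 240 MHz
CORES_PER_CHIP = 2
CLOCK_MHZ = 240
NODES = 4


def peak_figures(nodes=NODES):
    """Return (total cores, total peak DMIPS, DMIPS per MHz per core)."""
    total_cores = nodes * CORES_PER_CHIP
    total_dmips = nodes * DMIPS_PER_CHIP
    dmips_per_mhz_per_core = DMIPS_PER_CHIP / CORES_PER_CHIP / CLOCK_MHZ
    return total_cores, total_dmips, dmips_per_mhz_per_core


cores, dmips, per_mhz = peak_figures()
print(cores, dmips, per_mhz)  # 8 2400 1.25
```

So the theoretical ceiling is 2400 DMIPS over 8 cores; the real figure will be lower as soon as the nodes have to exchange data.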

Each ESP-WROOM-32 has two switches connected to GPIO34 and 35 and LEDs on GPIO32 and 33. These GPIOs are function-limited by the MCU (GPIO34 and 35, for example, are input-only) and are excluded from the interconnection bus. The ESP32 has several "untouchable GPIOs" and those pins are also excluded. The thick blue lines are the GPIO bus.

To write firmware to the ESP32s, a circuit like the one above is required. Each ESP32 has a pin header for connecting this writer. Once the firmware is uploaded, we can detach the writer and let them work...

This project is not just at the idea stage; PCB fabrication is already ongoing. As we can see, "7805" is on the silkscreen, but this is just a placeholder and I will use a pin-compatible 3.3 V/2 A regulator.

The initial fabrication will be done by mid-January and the completed board will be released at $50. If you are interested, please click "Join this project's team" on the top page.

  • So,,,, struggling!

    kodera2t, 01/14/2018 at 08:56

  • Finalize the design

    kodera2t, 01/13/2018 at 09:18

    Recently I have not been able to spend enough time on this circuit, but I managed to add an "OLED console" for operation checks.

    The "tentative" final version has an on-board OLED display and a 128 kB (1 Mb) SPI SRAM. I need to write some test programs, and hopefully I will show something next week.. hopefully..

  • Blink linkage between 4-ESP32s

    kodera2t, 01/07/2018 at 02:15

    It is still far from a "real clustering system", but as the first step, LED blink linkage is confirmed.

    The four ESP32s are interconnected and any GPIO selection will work; this time GPIO 13, 14, 15, and 16 are selected for information linkage. The program is written in the Arduino IDE (for quickness and ease) as follows:

    Initial condition: GPIO 13, 14, 15, 16 are HIGH.
    U1: (GPIO 2: output, GPIO 13: input)
      starts blinking when GPIO 34 goes LOW and blinks 5 times
      after 5 blinks, stops and turns GPIO 14 LOW
    U2: (GPIO 14: input, GPIO 15: output)
      initial state: nothing
      waits for GPIO 14 to go LOW
      when GPIO 14 goes LOW, blinks 5 times
      after 5 blinks, stops and turns GPIO 15 LOW
    U3: (GPIO 15: input, GPIO 16: output)
      initial state: nothing
      waits for GPIO 15 to go LOW
      when GPIO 15 goes LOW, blinks 5 times
      after 5 blinks, stops and turns GPIO 16 LOW
    U4: (GPIO 16: input, GPIO 13: output)
      initial state: nothing
      waits for GPIO 16 to go LOW
      when GPIO 16 goes LOW, blinks 5 times
      after 5 blinks, stops
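The relay described in the listing can be checked on a PC before touching the hardware. The following is a minimal simulation sketch in Python, not the Arduino program itself; the table of nodes and the function name are assumptions made for illustration. It follows the listing as written: U1 is triggered by its switch on GPIO 34, and U4 stops without pulling GPIO 13 LOW.

```python
# Simulation of the blink relay: all lines start HIGH, each node blinks
# 5 times once its trigger line is LOW, then pulls its output line LOW.
BLINKS = 5

# (node, trigger line, line pulled LOW afterwards). U4 triggers nothing
# further, matching the listing.
CHAIN = [("U1", 34, 14), ("U2", 14, 15), ("U3", 15, 16), ("U4", 16, None)]


def run_chain():
    """Return the blink order and the final line levels (1 = HIGH, 0 = LOW)."""
    lines = {13: 1, 14: 1, 15: 1, 16: 1, 34: 1}
    log = []
    lines[34] = 0  # simulate pressing the switch on U1
    for node, pin_in, pin_out in CHAIN:
        if lines[pin_in] != 0:
            break  # this node never saw its trigger, so the relay stops here
        log.extend([node] * BLINKS)
        if pin_out is not None:
            lines[pin_out] = 0
    return log, lines


log, lines = run_chain()
print(log)
```

Each node appears five times in order in the log, which is the same sequence the LEDs should show on the board.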

    The actual operation can be seen in the following movie..

  • Cluster comes to the real world!

    kodera2t, 01/06/2018 at 01:31

    So the ESP32 cluster consisting of 8 cores is coming to the real world!

    For firmware uploading, just connecting a CP2102 board is enough!

    The 3.3 V regulator part is bypassed because we can get a good 3.3 V regulator!!

  • Latest schematic

    kodera2t, 01/03/2018 at 13:18

    This is the latest schematic. I still have a question for myself regarding data sharing between nodes. The circuit above can share information through the SPI SRAM and the interconnected bus. In this case, for example, if node 2 wants to write a result to the shared SRAM, it will follow a procedure like this:

    (1) Node 2 demands SRAM access from node 1

    (2) Node 1 allows node 2 access to the SRAM

    (3) Node 2 writes data to the SRAM

    But without the SRAM, and assuming all data is collected on node 1, then

    (1) Node 2 demands to send data to node 1

    (2) Node 1 acks node 2

    (3) Node 1 waits for the data, and node 2 sends it.

    Indeed, both will work, and the latter is simpler from the viewpoint of the circuit.

    This is enough!? 
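The first procedure (demand, permission, write) can be sketched as a small model to see what the controller has to track. This is only an illustration in Python under the assumption that node 1 keeps a single "current owner" of the SRAM; the class and method names are hypothetical and do not come from any firmware.

```python
class SharedSram:
    """Stand-in for the shared SPI SRAM; node 1 decides who may access it."""

    def __init__(self, size=128 * 1024):
        self.data = bytearray(size)
        self.owner = None  # node currently allowed to access, or None

    def request(self, node):
        # Steps (1) and (2): a node demands access; node 1 grants it if free.
        if self.owner is None:
            self.owner = node
            return True
        return False

    def write(self, node, addr, payload):
        # Step (3): only the node holding the grant may write.
        if self.owner != node:
            raise PermissionError(f"node {node} has no grant")
        if addr < 0 or addr + len(payload) > len(self.data):
            raise IndexError("write outside SRAM")
        self.data[addr:addr + len(payload)] = payload

    def release(self, node):
        # The owner gives the SRAM back so another node can be granted.
        if self.owner == node:
            self.owner = None


sram = SharedSram()
if sram.request(2):
    sram.write(2, 0, b"result")
    sram.release(2)
```

The second procedure (sending directly to node 1) needs the same request and ack steps, but the data ends up in node 1's own memory instead of a separate chip, which is why it needs less circuitry.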

  • Quick update: bus pull-up and "hot-line" separation

    kodera2t, 01/03/2018 at 00:57

    In the previous design, the "hot-lines" from nodes 2 to 4 to the controller (node 1) were part of the interconnected bus. But I am not confident that would work and do not know the side effects of sharing signal lines, so now the hot-lines (H1 to H4) are private lines between each node and the controller.

    In many cases, a bus connection requires pull-ups or pull-downs (or appropriate termination) for stable operation, so resistors for that (R17 to R24) are provided in the circuit. If these pull-ups are not needed, we can just remove them from the PCB...
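Why a pull-up matters on a shared line can be shown with a tiny logic model. This is a simplified digital sketch in Python (it ignores resistor values and analog behavior), and the function name is an assumption for illustration.

```python
HIGH_Z = None  # a pin configured as input (or a disabled buffer) drives nothing


def resolve_line(drivers, pull_up=True):
    """Resolve one shared line from a list of driver outputs (0, 1, or HIGH_Z).

    Returns 0 or 1, or None when the line floats (no driver and no pull-up).
    """
    active = [d for d in drivers if d is not HIGH_Z]
    if not active:
        # With no driver, the pull-up defines the level; without it the
        # line floats and a read could return anything.
        return 1 if pull_up else None
    if len(set(active)) > 1:
        raise RuntimeError("bus contention: drivers disagree")
    return active[0]


print(resolve_line([HIGH_Z, HIGH_Z, HIGH_Z, HIGH_Z]))         # 1
print(resolve_line([HIGH_Z, 0, HIGH_Z, HIGH_Z]))              # 0
print(resolve_line([HIGH_Z, HIGH_Z], pull_up=False))          # None
```

With the pull-ups fitted, an idle line reads HIGH, which fits a scheme where a node signals by pulling a line LOW.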

  • Shared SRAM implementation done in schematic

    kodera2t, 01/02/2018 at 11:59

    Now the shared SPI SRAM, with access-demand lines to the controller, is implemented in the circuit.

    Each node has four tristate buffers whose state is controlled by node 1 through GPIO13, 4, 16, and 17. Nodes 2 to 4 do not have access to the shared SPI SRAM without node 1's permission. Node 1 will schedule the order of access by some method (not yet determined) defined in firmware. The "hot-lines" can be GPIO15, 2, 5, and 18, or any unoccupied GPIO lines. Four GPIO lines are assigned to the four ESP32s, and the controller will issue "permission" for SPI SRAM access on request (or on the controller's own schedule). Because the GPIOs are interconnected between the four ESP32s, hot-line information generated by any node can be snooped by any node (not only the controller), and fake "hot-line information" can also be generated by any node. (I am not sure at this moment whether this is useful or harmful..)
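The buffer control can be sketched as a function that computes which level node 1 should put on each enable GPIO so that only one node reaches the SRAM. This is a hedged Python illustration: the text lists the enable GPIOs (13, 4, 16, 17) but not which node each one controls, so the mapping below is an assumption, as is the function name. The 74125 has active-low enables and the 74126 active-high ones, so the polarity is a parameter.

```python
# Assumed mapping of node number to the enable GPIO on node 1. The GPIO
# numbers come from the text; their order is a guess for illustration.
ENABLE_GPIO = {1: 13, 2: 4, 3: 16, 4: 17}


def enable_levels(granted_node, active_low=True):
    """Levels node 1 would drive so that only granted_node's buffers are on.

    active_low=True corresponds to a 74125-style enable, False to a 74126.
    Passing None as granted_node disables every buffer.
    """
    on, off = (0, 1) if active_low else (1, 0)
    return {gpio: (on if node == granted_node else off)
            for node, gpio in ENABLE_GPIO.items()}


print(enable_levels(2))                    # {13: 1, 4: 0, 16: 1, 17: 1}
print(enable_levels(2, active_low=False))  # {13: 0, 4: 1, 16: 0, 17: 0}
```

Because exactly one entry is "on" at a time, two nodes can never drive the SPI lines together as long as node 1 always goes through a function like this.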

  • "Hotline" toward controller

    kodera2t, 01/02/2018 at 00:32

    The previous configuration can certainly control the nodes' access to the SPI SRAM, but nodes 2 to 4 cannot demand RAM access from the controller. If, for example, some nodes finish their tasks earlier than the others, or several finish at exactly the same time, SRAM access will suffer congestion or processing delays (or corruption).

    To solve this issue, an additional "hot-line" for demanding access rights from the controller is added. The controller has an additional task, "memory access scheduling", but the issue can be solved, at least from the hardware viewpoint..
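One possible way for the controller to handle simultaneous hot-line requests is a round-robin arbiter. The scheduling method is not decided in this project, so the following Python sketch is only one hypothetical option; the class name and the dictionary-based representation of the hot-lines are assumptions.

```python
class RoundRobinArbiter:
    """Grants SRAM access to requesting nodes in rotating order."""

    def __init__(self, nodes=(2, 3, 4)):
        self.nodes = list(nodes)
        self.next_index = 0  # position where the next scan starts

    def grant(self, hotlines):
        """hotlines maps node -> True if that node is requesting access.

        Returns the granted node, or None when nobody is requesting.
        """
        n = len(self.nodes)
        for offset in range(n):
            i = (self.next_index + offset) % n
            node = self.nodes[i]
            if hotlines.get(node, False):
                # Start the next scan after this node so nobody is starved.
                self.next_index = (i + 1) % n
                return node
        return None


arbiter = RoundRobinArbiter()
everyone = {2: True, 3: True, 4: True}
print([arbiter.grant(everyone) for _ in range(4)])  # [2, 3, 4, 2]
```

Even when all nodes finish at exactly the same time, each one gets its turn in order, which avoids the congestion case described above at the cost of some waiting.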

  • Quick note about shared memory

    kodera2t, 01/01/2018 at 11:28

    The initial board will be a pure distributed system without shared memory. When the MCUs are physically separated, a shared memory system is not feasible, but on this board the MCUs are close enough to add shared memory. For fetching and storing data and instructions from each MCU, shared memory will be very useful, so I made a rough configuration for it.

    One of the MCUs will work as the system controller and permit/prohibit memory access. Indeed, this part does not need to be an ESP32, and another MCU could be used. The rest will work as "computation nodes": when memory access is permitted by the controller, they will fetch data, and the results will be stored in shared memory. The only additional parts for this implementation are four 74125s (or 74126s). The SPI clock of the ESP32 can reach 80 MHz (with esp-idf), and I hope the access speed will be acceptable for a 240 MHz system if memory access is not too frequent. (It is still slow, indeed, but something is far better than nothing..)

    This part is not implemented in the initial version of the board, but if it is feasible, I will add it in the next PCB batch..

    The current candidate for the SPI SRAM will be this one.

    The ESP-WROOM-32 already has about 520 kB of on-chip SRAM, and the ESP32-WROVER adds 4 MB of external RAM, so the shared memory of this board will work as "supplemental memory" and 1 Mb (128 kB) can be enough capacity (depending on the purpose, though). The maximum clock of this SPI SRAM is just 20 MHz, which may become the system bottleneck, so we should avoid frequent shared memory access (most of the work should be done in the main memory of the ESP32 modules).
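To get a feeling for the bottleneck, the ideal transfer time can be estimated from the SPI clock. This is a rough Python calculation that assumes single-bit SPI and ignores command and address overhead, so real transfers will be somewhat slower; the function name is for illustration only.

```python
SRAM_BITS = 1024 * 1024      # 1 Mbit part
SRAM_BYTES = SRAM_BITS // 8  # 131072 bytes = 128 kB


def transfer_time_ms(num_bytes, spi_clock_hz):
    """Ideal single-bit SPI transfer time, ignoring command/address overhead."""
    return num_bytes * 8 / spi_clock_hz * 1000


# Time to read or write the whole SRAM once.
print(round(transfer_time_ms(SRAM_BYTES, 20e6), 1))  # 52.4 (ms at 20 MHz)
print(round(transfer_time_ms(SRAM_BYTES, 80e6), 1))  # 13.1 (ms at 80 MHz)
```

At 20 MHz the raw rate is 2.5 MB/s, so moving the full 128 kB takes about 52 ms, which is a very long time for a 240 MHz core and supports keeping most work in each module's own memory.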

    If we want to avoid this slow access, differential access to multiple SRAMs should be considered.
