A 32 Bit Variable Length Instruction Set Core and Transputer Like Comms Network
I've now implemented two instructions, Insert and Extract.
The reason for this is that I am starting to look at some Floating Point code, with the intention of making it easier to work with the IEEE 754 standard.
In fact I have plans to extend those instructions but let's leave that for now.
The intention is to get some form of Single Precision Math up and running. If I can get that up and running then I will be on the path where I can start to do some crunchy compute on the Array.
Slow, incredibly SLOW but got to aim high :).
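To show why bit-field Insert and Extract help with IEEE 754 work, here is a minimal Python sketch that pulls a single precision word apart and puts it back together. The function names, argument order and semantics are assumptions made for illustration; they are not the actual Trinity instruction definitions.

```python
import struct

def extract(word, lsb, width):
    """Return the `width`-bit field of a 32-bit word starting at bit `lsb`."""
    return (word >> lsb) & ((1 << width) - 1)

def insert(word, field, lsb, width):
    """Return `word` with the `width`-bit field at bit `lsb` replaced by `field`."""
    mask = ((1 << width) - 1) << lsb
    return ((word & ~mask) | ((field << lsb) & mask)) & 0xFFFFFFFF

def float_to_bits(x):
    """Raw 32-bit pattern of a single precision value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(w):
    """Single precision value of a raw 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", w))[0]

one = float_to_bits(1.0)

# IEEE 754 single precision layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
sign = extract(one, 31, 1)
exponent = extract(one, 23, 8)
fraction = extract(one, 0, 23)

# Bumping the biased exponent by one doubles the value.
two = insert(one, exponent + 1, 23, 8)
print(hex(one), sign, exponent, fraction, bits_to_float(two))
```

With single-instruction field access like this, unpacking and repacking a float no longer needs a chain of shifts and masks.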
The smaller 2D Trinity Net is, by the looks of it, up and running. A couple of squeaks but it wasn't too bad.
Did notice a couple of possible improvements along the way so will target those next.
Getting this up and running means that I will be able to get an array into the larger FPGA board.
I would prefer to get this all tested out and secure before concentrating on the RS232 again. With that good to go I can start to write some code and get it to deploy to the array.
I've also bought a four line LCD display which will be used by the master to display information. This implies four machines but might be able to use two to a line.
Slow but sure.
I have now implemented the two-dimensional version of Trinity Net, Trinity Small Net, in an MPPA-type environment. It's using the Dual Core Small Net Complex, so for every node there are two Trinity cores. I have yet to check through to ensure there are no bugs introduced by the change. Note that I still have Trinity Net available.
I decided to try out running a program on a 49 node array and took a timing of how long it took to simulate 400 us. This came out as about 2 hours, which means I am still compute bound for a large array and resource bound as to the possibility of getting into hardware.
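To put the timing above in perspective, this short Python sketch works out the slowdown of simulation against real time. The only inputs are the two figures quoted above; the projection assumes, hypothetically, that simulation time scales linearly with simulated time.

```python
SIM_TIME_US = 400                        # simulated time for the 49 node run
WALL_TIME_US = 2 * 60 * 60 * 1_000_000   # about 2 hours of wall-clock time

# How many times slower the simulator is than the real hardware would be.
slowdown = WALL_TIME_US // SIM_TIME_US

def wall_hours_for(sim_us):
    """Estimated wall-clock hours to simulate `sim_us` microseconds.
    ASSUMPTION: cost grows linearly with simulated time."""
    return sim_us * slowdown / 3_600_000_000

print(slowdown)              # 18000000
print(wall_hours_for(1000))  # 5.0
```

So one simulated millisecond on the 49 node array would cost roughly five hours, which is why a smaller array makes sense for day-to-day testing.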
I am going to use a four by four array for testing in sim and then cut it back for FPGA testing for the large board.
As I think I mentioned, I've taken the 'register file' payload memories and converted them into SRAMs. This meant I could have a crack at seeing how large a single Trinity Net came out as. The result was not great: ~51k LUTs for a 3D node, which is about 43% of the present FPGA. Time to rationalise a few things.
I am not going to be able to get an FPGA large enough to go 3D but will be able to go 2D. With that in mind I was able to get down to 8 connections and the area came down to 9250 LUTs, which is a bit more feasible. I could happily get this into a single core system and have an array. Note that the internal interconnect has come down from ~17.7k LUTs to 1.7k LUTs. Also note this is still more connections than the original Transputer, which had four; we have eight.
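As a rough sanity check on those area figures, the Python sketch below derives the implied FPGA size and what share of it a 2D node takes. Everything is computed from the approximate numbers quoted above, so treat the results as ballpark values only.

```python
LUTS_3D_NODE = 51_000   # ~51k LUTs for a 3D node
FRACTION_3D = 0.43      # quoted as about 43% of the present FPGA
LUTS_2D_NODE = 9_250    # 2D node with 8 connections

# Implied total LUT count of the present FPGA (approximate).
fpga_luts = round(LUTS_3D_NODE / FRACTION_3D)

# Share of that FPGA taken by one 2D node.
share_2d = LUTS_2D_NODE / fpga_luts

# Reduction factor of the internal interconnect (~17.7k -> 1.7k LUTs).
interconnect_ratio = 17_700 / 1_700

print(fpga_luts, round(share_2d * 100, 1), round(interconnect_ratio, 1))
```

On these figures a 2D node costs a little under 8% of the device and the interconnect has shrunk roughly tenfold, before counting the cores themselves.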
Now that the MPPA RTL with Block RAM has been updated, it's time to start to see if it can be implemented in an FPGA.
This will be a stepped process.
First, a single core with only one instantiation. This will be targeted at the first FPGA board.
Next, repeat the same targeting the larger FPGA.
After this, two instantiations in X.
The next stage will be a 2x2 array. I suspect that will be the limit.
After that I suspect having to save up or sell the family into white slavery.
It'll be quiet but I think bear up !
I wanted to update the MPPA Complex with an updated Trinity Net.
One of the problems is that the MPPA and more specifically the Trinity Dual Noc Complex is comparatively old.
This means that I have to add in all the other updates that have happened since I last looked at it.
I now have the FPGA re-imaged with the dual core image along with the LCD Character Sender. I decided to put the Secondary Trinity Core into action along with the Primary Trinity Core, so I wrote a program that works out which core it is running on and then updates the display lines to say which core it is. Each core sends a single character to the display, pauses and then sends a different one to the same position. So what we have here is something that tests that the two cores can fetch instructions from the same memory and write to two slightly different areas of data memory. Then a third agent, the LCD Character Sender, reads that memory and displays it. The Primary core is sending hash and then percent; the Secondary is sending backslash and is then supposed to send forward slash. However, the Yen symbol is showing up. Possibly the display chip has that encoded for 0x5C. Too tired and late to work it out. Anyway, two cores running out of the same fabric are successfully being arbitrated.
Sadly I haven't got this on YouTube or Vimeo and I need to go to bed!
Now joined Vimeo, so here is the short video.
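On the Yen symbol in the dual core test: backslash is ASCII 0x5C, and HD44780-compatible character displays fitted with the common A00 (Japanese) character ROM draw a Yen sign at that code. Which controller and ROM this display uses is not stated, so the mapping in the Python sketch below is an assumption, not a diagnosis.

```python
# Characters each core writes in the dual core test.
CORE_CHARS = {
    "primary": ["#", "%"],
    "secondary": ["\\", "/"],
}

# ASSUMPTION: HD44780-compatible controller with the A00 character ROM,
# where code 0x5C is drawn as a Yen sign rather than a backslash.
A00_OVERRIDES = {0x5C: "Yen sign"}

def shown_on_lcd(ch):
    """Glyph the (assumed) A00 ROM would draw for an ASCII character."""
    return A00_OVERRIDES.get(ord(ch), ch)

for core, chars in CORE_CHARS.items():
    for ch in chars:
        print(f"{core:9} sends 0x{ord(ch):02X} -> displays {shown_on_lcd(ch)}")
```

If that assumption holds, the Yen sign is the backslash character being displayed, not the forward slash (0x2F), and the cores are sending the right bytes.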
I have now been able to get the LCD Char Sender to work after some debugging.
The system is told where in memory it can extract text from and then provides the control signals to the display.
Note that it can be anywhere in memory and does not need to be word aligned, so if I create some text in a program I just send the address of the start of the text and off it goes. As the system can also be set so that it keeps re-reading the text about 19 times a second, I can have a low level of 'animation'.
Also note that it sets the cursor to the bottom line after writing 16 characters and then writes a further 16 characters. The cursor is then sent to the top line again.
The infrastructure is non-blocking, so I can have the processor running and the LCD Character Sender running in the background. Only if the text is sitting in the same memory block that the processor is accessing will there be contention, and that is arbitrated.
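The point about the text not needing to be word aligned can be illustrated with a small Python model of byte reads from 32-bit memory. The little-endian byte order and the helper names are assumptions made for this sketch; the real LCD Char Sender is hardware and may order bytes differently.

```python
import struct

def read_byte(words, byte_addr):
    """Fetch one byte from a list of 32-bit words at any byte address.
    ASSUMPTION: little-endian byte order within each word."""
    word = words[byte_addr // 4]
    return (word >> (8 * (byte_addr % 4))) & 0xFF

def read_text(words, start, length=32):
    """Read `length` bytes starting at byte address `start` (any alignment)."""
    return bytes(read_byte(words, start + i) for i in range(length))

# Text that begins at byte address 3, i.e. in the middle of the first word.
memory = list(struct.unpack("<3I", b"xxxHELLO!xxx"))
print(read_text(memory, 3, 6))  # b'HELLO!'
```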
The first peripheral to send characters to the LCD from an area in memory has been completed. It's not in the complex yet but that will be the next step.
Once the LCD Char Sender is kicked off it gives a 50 ms wait then sets the Display up. It then reads in the display memory a character at a time.
Then it puts the cursor at the start of the top line and reads out the first 16 bytes.
It then sends the cursor to the start of the next line and proceeds to send the next 16 bytes.
Then if the 'go' signal is still active it waits for 50 ms and then reads the memory again in case there has been an update and displays it.
If the 'go' signal has been dropped then it waits for it to go again.
Along with other delays it cycles 19.something times a second which seems a reasonable refresh rate.
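The refresh sequence described above can be modelled in a few lines of Python. The cursor command bytes assume an HD44780-style controller (0x80 for the start of the top line, 0xC0 for the start of the second line), which the post does not confirm, and `refresh_frame` is a hypothetical name for illustration.

```python
LINE_LEN = 16

# ASSUMPTION: HD44780-style "set DDRAM address" commands for the two lines.
CURSOR_TOP = 0x80
CURSOR_BOTTOM = 0xC0

def refresh_frame(display_mem):
    """Return the (kind, byte) transfers for one pass over 32 bytes of text."""
    ops = [("cmd", CURSOR_TOP)]
    ops += [("data", b) for b in display_mem[:LINE_LEN]]
    ops.append(("cmd", CURSOR_BOTTOM))
    ops += [("data", b) for b in display_mem[LINE_LEN:2 * LINE_LEN]]
    return ops

frame = refresh_frame(b"Hello, Trinity! " + b"Dual core array ")
print(len(frame))  # 34 transfers: 2 cursor commands + 32 characters

# A rate between 19 and 20 Hz with a 50 ms wait bounds the other delays.
WAIT_MS = 50.0
max_other_ms = 1000.0 / 19 - WAIT_MS
print(round(max_other_ms, 2))  # 2.63
```

In other words, a rate of 19.something cycles a second means everything apart from the 50 ms wait takes less than about 2.6 ms per pass.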
The next task is to integrate it into the Dual Complex, run some tests and then synth it.
Assuming this goes ok I will look at getting the graphics screen up. This will be slightly more challenging but hey.
After that I must get back to the RS232!
I have decided to add two new peripherals to the Trinity Complexes.
The first is one that will drive a 2x16 LCD. It will do the initial set up of the display and then, on a timer, update the display continuously. This will be possible as the 'fabric' is designed to do this as a non-blocking matrix. It will do the carriage return for the second line. I just need to work out how to manipulate the cursor on the display. This will take up 32 bytes of memory.
The second will read out memory and display it on a 128x64 matrix LCD. This will mean that a 1kB memory will need to be used. Again this will be a timed update. It will be interesting to see how it goes.
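The two memory sizes follow directly from the display geometry, as this short Python sketch shows. The pixel-to-bit mapping at the end is a hypothetical row-major layout for illustration only; the actual graphics LCD may well organise its memory differently.

```python
def text_buffer_bytes(lines, columns):
    """One byte per character cell."""
    return lines * columns

def bitmap_buffer_bytes(width, height):
    """One bit per pixel, packed eight to a byte."""
    return width * height // 8

def pixel_location(x, y, width=128):
    """Return (byte index, bit index) for a pixel.
    ASSUMPTION: row-major packing, bit 0 is the leftmost pixel of each byte."""
    index = y * width + x
    return index // 8, index % 8

print(text_buffer_bytes(2, 16))      # 32
print(bitmap_buffer_bytes(128, 64))  # 1024
print(pixel_location(127, 63))       # (1023, 7)
```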
In the meantime I shall continue to investigate the Uart and see if I can get it to work as expected.