A 32 Bit Variable Length Instruction Set Core and Transputer Like Comms Network
A weekend of some success ! Barring a bug in the Branch Predict which I have yet to fix, I have managed to achieve this weekend's goal.
Yes, I have now sent text from Trinity to the Mac !
I had a look at reducing the amount of area for Trinity. I came up with a significant saving by reducing the number of levels ('pages') dedicated to the Exception Level. Bringing this down to four pages means that at the fourth nested level you need to push stuff into an Exception Stack. I could make this more automatic but I'm going to leave it for now.
This means that everything is back in play.
Back to getting the RS232 to work. A bit of pipecleaning. I have an old Amstrad NC100 (Noddy) which happens to have an RS232 port. Resurrecting old kit is fun but it appears to be working.
I bought what is advertised as a USB to RS232 conversion cable that can cope with a full +/- 12 V swing. I connected this, along with a Null Modem cable, to Noddy and the Mac. Using Zterm I managed to get comms up and running in both directions at 9600 Baud. The next steps are:
1. Send a series of characters to Noddy from the FPGA board at 9600 Baud using the Null Modem cable.
2. Write a routine that runs on an interrupt to take the incoming character and then display it on the LCD display.
3. Send characters to the FPGA board from Noddy at 9600 Baud again using the Null Modem cable.
4. Connect the Mac to the FPGA board using the USB RS232 cable and the Null Modem.
5. Expand the amount of Memory available to the Trinity Dual Complex.
6. Write some kind of Monitor to allow downloading of programs and uploading of results.
I'll see what I can do when I get some time.
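To give a feel for what step 1 means on the wire, here is a minimal sketch of how one character would be framed and how long each bit lasts at 9600 Baud. It assumes the common 8N1 format (one start bit, eight data bits sent LSB first, no parity, one stop bit); the entries above do not say which format is actually in use, and the function names are made up for illustration.

```python
BAUD = 9600  # line rate used in the tests above


def frame_8n1(byte):
    """Return the line levels for one character in 8N1 framing (assumed format).

    Start bit is 0, then 8 data bits least significant bit first, then stop bit 1.
    """
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def bit_time_us(baud=BAUD):
    """Duration of a single bit in microseconds."""
    return 1_000_000 / baud


def chars_per_second(baud=BAUD, bits_per_frame=10):
    """Best-case character rate with back-to-back frames."""
    return baud / bits_per_frame


frame = frame_8n1(ord("A"))  # 'A' is 0x41
print(frame)
print(round(bit_time_us(), 2), "us per bit")
print(chars_per_second(), "characters per second at best")
```

With these assumptions a transmitter has to hold each bit for roughly 104 us, and the line can carry at most 960 characters per second.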
There have been a few things happening since my last entry.
1. Movement of Call Return Address Register to R0.
2. An improvement of the Stack Push instruction by one clock cycle.
3. Finally overflowing the area on my FPGA, so that to keep two cores I had to lose the DMA block.
4. Getting 6.0 + 7.0 = 13.0 .
I've now been able to write some routines in Assembler to do Floating Point operations. I am sure a Professional Software Engineer might raise their eyebrows, and I need to go through the routines to clean them up, or at least comment them, but I now have a small group of routines to do floating point support operations.
Here is a screen shot. Single Precision IEEE 754 in software adding 6.0 to 7.0 getting 13.0.
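For anyone wanting to follow the 6.0 + 7.0 case by hand, here is a rough sketch of a single precision add done purely with integer operations. It is not the Trinity assembler routine; it is a simplified illustration that assumes both operands are positive normal numbers and truncates instead of rounding, with no handling of zero, subnormals, infinities or NaNs. The function names are made up for this example.

```python
import struct


def f32_bits(x):
    """Bit pattern of x as an IEEE 754 single precision value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b):
    """Float value of a 32-bit IEEE 754 single precision bit pattern."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def soft_add_positive(a_bits, b_bits):
    """Add two positive normal singles using integer operations only.

    Simplified: truncates the result, no special cases, no overflow check.
    """
    ea = (a_bits >> 23) & 0xFF            # biased exponents
    eb = (b_bits >> 23) & 0xFF
    ma = (a_bits & 0x7FFFFF) | 0x800000   # mantissas with hidden 1 restored
    mb = (b_bits & 0x7FFFFF) | 0x800000
    if ea < eb:                           # make a the operand with the larger exponent
        ea, eb, ma, mb = eb, ea, mb, ma
    mb >>= ea - eb                        # align the smaller operand
    m = ma + mb
    e = ea
    if m & 0x1000000:                     # carry out of bit 23: renormalise
        m >>= 1
        e += 1
    return (e << 23) | (m & 0x7FFFFF)     # drop the hidden 1 and repack


six, seven = f32_bits(6.0), f32_bits(7.0)
result = soft_add_positive(six, seven)
print(hex(six), hex(seven), hex(result), bits_f32(result))
```

Here 6.0 is 0x40C00000 and 7.0 is 0x40E00000. Both have the same exponent, so the mantissas add directly, the carry bumps the exponent by one, and the result repacks as 0x41500000, which is 13.0.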
I've now implemented the two instructions, Insert and Extract.
The reason for this is that I am starting to look at some Floating Point code, with the intention of making it easier to work with the IEEE 754 Standard.
In fact I have plans to extend those instructions but let's leave that for now.
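To show why bit field instructions help with IEEE 754 work, here is a small sketch of extract and insert operations applied to a single precision word. The operand order and semantics (bit position plus field width) are assumptions made for illustration; the actual encoding of the Trinity Insert and Extract instructions is not described here.

```python
MASK32 = 0xFFFFFFFF


def extract(word, pos, width):
    """Return the 'width'-bit field of 'word' starting at bit 'pos' (hypothetical semantics)."""
    return (word >> pos) & ((1 << width) - 1)


def insert(word, field, pos, width):
    """Return 'word' with the 'width'-bit field at bit 'pos' replaced by 'field'."""
    mask = ((1 << width) - 1) << pos
    return (word & ~mask & MASK32) | ((field << pos) & mask)


six = 0x40C00000                     # 6.0 in single precision
sign = extract(six, 31, 1)           # 1 bit sign
exponent = extract(six, 23, 8)       # 8 bit biased exponent
fraction = extract(six, 0, 23)       # 23 bit fraction
print(sign, exponent, hex(fraction))

# Bumping the exponent by one doubles the value: 6.0 becomes 12.0.
twelve = insert(six, exponent + 1, 23, 8)
print(hex(twelve))
```

Splitting a float into sign, exponent and fraction takes three extracts, and repacking a result takes inserts, instead of a chain of shifts and masks each time.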
The intention is to get some form of Single Precision Math up and running. If I can get that up and running then I will be on the path where I can start to do some crunchy compute on the Array.
Slow, incredibly SLOW but got to aim high :).
The smaller 2D Trinity Net by the looks of it is up and running. A couple of squeaks but it wasn't too bad.
Did notice a couple of possible improvements along the way so will target those next.
Getting this up and running means that I will be able to get an array into the larger FPGA board.
I would prefer to get this all tested out and secure before concentrating on the RS232 again. With that good to go I can start to write some code and get it to deploy to the array.
I've also bought a four line LCD display which will be used by the master to display information. This implies four machines but might be able to use two to a line.
Slow but sure.
I have now implemented the two dimensional version of Trinity Net, Trinity Small Net, into an MPPA type environment. It's using the Dual Core Small Net Complex so for every node there are two Trinity cores. I am still yet to check through to ensure there are no bugs introduced by the change. Note I still have Trinity Net available.
I decided to try out running a program on a 49 node array and took a timing of how long it took to simulate 400 us. This came out as about 2 hours, which means I am still compute bound for a large array and resource bound as to the possibility of getting into hardware.
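To put the simulation timing in perspective, here is a quick back-of-envelope calculation using only the figures quoted above (about 2 hours of wall clock for 400 us of simulated time). The variable names are just for illustration, and the extrapolation assumes the slowdown stays constant.

```python
sim_time_us = 400            # simulated time, from the run above
wall_time_s = 2 * 60 * 60    # about 2 hours of wall clock

# How many seconds of wall clock per second of simulated time.
slowdown = wall_time_s * 1_000_000 / sim_time_us

# Wall clock needed for 1 ms of simulated time at the same rate.
hours_per_sim_ms = (wall_time_s * 1000 / sim_time_us) / 3600

print(f"slowdown: {slowdown:,.0f}x")
print(f"hours of wall clock per simulated ms: {hours_per_sim_ms}")
```

That is roughly an 18 million to one slowdown, or about 5 hours for every simulated millisecond on the 49 node array.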
I am going to use a four by four array for testing in sim and then cut it back for FPGA testing for the large board.
As I think I mentioned I've taken the 'register file' payload memories and converted them into SRAMs. This meant I could have a crack at seeing how large a single Trinity Net came out as. The result was not great, ~51k Luts for a 3D node, which is about 43% of the present FPGA. Time to rationalise a few things.
I am not going to be able to get an FPGA large enough to go 3D but will be able to go 2D. With that in mind I was able to get down to 8 connections and the area came down to 9250 Luts, which is a bit more feasible. Could get this happily into a single core system and have an array. Note that the internal interconnect has come down from ~17.7 k Luts to 1.7 k Luts. Also note this is still more connections than the original Transputer, which had four; we have eight.
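The area figures above can be turned into a rough budget. This is only a sketch: it treats "~51k Luts is about 43% of the FPGA" as exact to estimate the device size, and it counts the Trinity Net area alone, ignoring the cores, memories and routing overhead, so the node count is an optimistic upper bound rather than a prediction.

```python
node_3d_luts = 51_000         # ~51k Luts for a 3D node
node_3d_fraction = 0.43       # about 43% of the present FPGA
node_2d_luts = 9_250          # 2D node with 8 connections
interconnect_before = 17_700  # internal interconnect, 3D
interconnect_after = 1_700    # internal interconnect, 2D

# Implied size of the present FPGA (estimate only).
fpga_luts = node_3d_luts / node_3d_fraction

node_saving = node_3d_luts / node_2d_luts
interconnect_saving = interconnect_before / interconnect_after

# Upper bound on 2D net nodes if nothing else used any area.
max_net_nodes = int(fpga_luts // node_2d_luts)

print(f"estimated FPGA size: {fpga_luts:,.0f} Luts")
print(f"2D node is {node_saving:.1f}x smaller, interconnect {interconnect_saving:.1f}x smaller")
print(f"2D net nodes that fit by area alone: {max_net_nodes}")
```

On these numbers the 2D node is about 5.5 times smaller than the 3D one, with most of the saving coming from the interconnect.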
Now that the MPPA RTL with Block RAM has been updated, it's time to start to see if it can be implemented in an FPGA.
This will be a stepped process.
First a single core with only one instantiation. This will be targeted at the first FPGA board.
Next to repeat the same targeting the larger FPGA.
After this two instantiations in X.
The next stage will be a 2x2 array. I suspect that will be the limit.
After that I suspect having to save up or sell the family into white slavery.
It'll be quiet but I think I'll bear up !
I wanted to update the MPPA Complex with an updated Trinity Net.
One of the problems is that the MPPA and more specifically the Trinity Dual Noc Complex is comparatively old.
This means that I have to add in all the other updates that have happened since I last looked at it.
Now have the FPGA re-imaged with the dual core image along with the LCD Character Sender. I decided to put the Secondary Trinity Core into action along with the Primary Trinity Core. So I wrote a program that works out which core it is and then updates the lines with what it is. They then send a single character to the display, pause and then send a different one to the same position. So what we have here is something that tests that the two cores can fetch instructions from the same memory and then write to two slightly different areas of data memory. Then a third agent, the LCD Character Sender, reads that memory and displays it. The Primary core is sending hash and then percent, the Secondary is sending backward slash and then is supposed to send forward slash. However it's sending the Yen symbol. Possibly the Display Chip has that encoded for 0x5C. Too tired and late to work it out. Anyway, two cores running out of the same fabric are being arbitrated successfully.
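The Yen puzzle can be checked against the character codes. The sketch below lists the ASCII code each core sends and what a display would show, assuming an HD44780-compatible controller with the Japanese character ROM, where code 0x5C is drawn as a Yen sign instead of a backslash. Which controller is really on this LCD is not stated above, so treat that as an assumption; the names used are made up for illustration.

```python
# Characters each core alternates between, as described above.
CORE_CHARS = {
    "primary": ("#", "%"),
    "secondary": ("\\", "/"),
}

# Assumption: HD44780-style Japanese ROM, which replaces the ASCII
# backslash at 0x5C with a Yen sign. Other codes used here match ASCII.
LCD_OVERRIDES = {0x5C: "Yen sign"}


def lcd_glyph(code):
    """Name of the glyph the assumed display shows for a character code."""
    return LCD_OVERRIDES.get(code, chr(code))


for core, chars in CORE_CHARS.items():
    for ch in chars:
        code = ord(ch)
        print(f"{core:9s} sends 0x{code:02X} -> display shows {lcd_glyph(code)}")
```

If that assumption holds, the backward slash (0x5C) is the character that comes out as the Yen symbol, while the forward slash (0x2F) should display normally.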
Sadly not got this on YouTube or Vimeo and I need to go to bed !
Now joined Vimeo, so here is the short video.