Now that we were able to get some basic ROM routines working, setting up the 68K in an initial state and testing RAM, verifying that all of the hardware is operational, we can move onto writing yet another protocol for handshaking between the data card and the CPU/Graphics card.
Some thought had to be paid attention here. It is not as simple as just writing a little protocol to read out the FIFO and write those values into RAM and execute. It is a bit more involved than that.
To understand why, we need to take a look at the system architecture again:
So if we take a look at this, we basically have two system busses here. the main system bus which is the VME bus, is also part of the main CPU card and contains the RAM for the whole system, and its ROM and EEPROM is part of the main address space and can be accessed by any bus mastering cards at any point.
But, the Graphics Card also has its own system bus. it has its own ROM, its own RAM, and VRAM.
However, the system bus for the GPU isnt the same system bus as the main bus. it has its own. Therefore, the Graphics Card can only access the main bus by a 128K assigned address block and a paging register.
This poses a problem because the operation is one-way. the main CPU cannot access the graphics card in any way. only the graphics card can reach out and do anything.
However, we have bi-directional interrupts between the graphics card and the main CPU. the main CPU can send an auto-vectored interrupt to the Graphics Card, but also the Graphics Card has an IVEC register and can send manually-vectored interrupts back to the CPU card. This allows the graphics card to target a specific vector over on the CPU and run that code block, and the CPU can send an interrupt back at the graphics card at any time.
Great for handshaking between the CPU and the GPU. Buuuuut there's a problem with this.
The data card.... the data card only has specific registers exposed to the bus as we talked about previously. the data card itself cannot master the bus or read/write to it on its own. it can only load up registers, and its up to another master on the system bus to read/write the card. it is peripheral only whereas the GPU can be a bus-master.
However... the issue is the data card interrupt only goes back to the CPU. the graphics card is unaware what is going on with the data card at any point, no way for the graphics card to know what the data card is doing unless it polls it.
So circling back around, this is the issue we need to keep in mind when designing a bootstrapping protocol for the ROM on the CPU and graphics cards.
Now, there are a couple different approaches we can take here. One, we could download the graphics card binary to the CPU's shared RAM space, and then tell the graphics card via interrupt to download its binary from main system RAM. In fact, this is how they did it originally.
Or, we can simply have the Graphics Card reach out directly on the system bus to the data card and it can handle the data card on its own which removes the burden of data handling from the CPU to get to the graphics card.
Problem with the second approach is once again the graphics card cannot easily see whats going on with the data card unless its constantly polling. Polling is a bad idea because it has to take over the bus to do it. this will impede main CPU operations and slow things down.
So we work around it by routing interrupt traffic. So we design a specific command in the CPU ROM protocol that's specifically designed to bootstrap the graphics card.
Instead, we load the data card FIFO with a chunk of binary data meant for the graphics card, then shoot an interrupt to the main CPU. the main CPU then sees the vector and the CPU simply passes the interrupt down to the graphics card. so the graphics card's bootloader knows the next chunk of data is ready. Clever.... So this is the method we will go with.
using this approach, we need to download the binary into the graphics card first before downloading the CPU binary. Reason being, once the main program on the CPU executes and takes over the ROM bootloader, it will sever the path between getting signals between the data card and graphics card unless i add code overhead. Lets avoid that.
Bootstrap the graphics card and check a flag to know if its running. Once it is, we move on to the CPU and download its program into RAM and execute.
So now, we have a plan.... Lets execute that plan.
this system thus far has been architecting one protocol after another. and this is no exception. I made up a preliminary document of how i think the protocol should work:
Now, the data card does have a manually vectored interrupt register, IVEC. So i could have setup specific vectors for each of this. But instead of using up all of my vectors and making this more complicated than it needed to be, I kept data card operations on one specific vector and thats it.
However, what i did do was use the FIFO for all data blocks. and then I used the CPU register for commands. I would then fire over an interrupt to the CPU, and the CPU would read this register to know what to do, and then read the FIFO if necessary. That was the plan.
A million hours of thinking, debugging, and time later we end up with something like this:
And then as I discussed before, my clever solution of just handing off a command/interrupt over to the GPU for processing:
Then, the Vector 64 interrupt handler for the data card:
So really, its that simple. it reads the 8-bit latch from the data card when it receives an interrupt to vector 64 from the data card, sets a flag for mainloop to catch and then handle what it just received.
The way the bootloader works is very simple. I made a CMD01 which sets up the transfer. that routine determines which card is being targeted. if it is targeted for the graphics card, it simply just hands off the interrupt generated by CMD01 over to the graphics card, and the ROM bootloader for that card takes over!
But, once we determine we are targeted for the CPU program, I then simply send a pointer to RAM to begin the transfer:
The pointer signifies the start of the binary block that the transfer will commence at.
PAY ATTENTION! the 68K is a Big Endian system, and everything else including the PC and raspberry pi are little endian. so you cant just generate an int to bytes and send that over unless you want things to explode, so you need an endian-aware conversion.
Once we are initialized with a CMD01, we start sending subsequent CMD02s. the CMD02 is a block transfer command. it has a block number, and number of bytes to read as well as the bytes. Each CMD02 will advance the pointer internally in the bootloader by X bytes. If you need to start over, you must send a new CMD01 and then start sending CMD02s again. the block number insures they are sent in order, and if there is an out-of-order condition it throws it out. or if the same block was received twice, it'll ignore it.
And like before, if it's not targeted for the CPU, it just hands off the interrupt.
Once all of the CMD02 blocks are sent and the entire binary has been uploaded into RAM at the pointer you specified, we need to execute this code.
So we created a CMD03. the CMD03 will verify the integrity of the program and jump to the address pointer specified.
Pretty straight forward, takes the entry point (reset vector) address you want to divert execution to. as well as the expected binary length, and expected checksum of the binary. It will verify the calculated checksum against the expected checksum and execute the program!
the bootloader will send a success ACK back to the data card just before it jumps to the reset vector.
There you have it. we now have a method of bootstrapping both 68Ks from the data card!
Now, we need to test this.... If you can tell by the additional commands in the mainloop, it didn't go well at first with lots of trial and error.