Managing complexity

With a system that handles large volumes of data and tries to keep that data in distinct channels, it becomes very easy for the complexity to get out of hand. I had started my design with the idea that every channel has certain kinds of data and each of those needs a separate register ... very quickly I ended up with a monstrosity that needed so many registers the board would have been huge.

Taking a step back, I've decided to simplify things a little, by making the design a bit more general, and using that generality to implement as much as possible. So, the current thinking is:

Each processor node has 16 channels. The channel corresponds to a set of registers that are used to store permanent data needed by the channel.
The registers are general purpose, and can be used to store different data for different kinds of channel: it could be a DMA address, or pointers to the start and end of a buffer in scratch memory, or simply scratch registers used for a data generation process (e.g. to produce a stream of pseudo-random numbers). The processor doesn't need to know. The number of registers is restricted in order to minimize size. This means that a single channel won't be able to both perform DMA and store the results in a buffer -- but it can perform DMA and pass the results to another channel, so you can achieve this result if you commit two channels to it. I think this is a reasonable compromise.
Each channel has multiple service routines associated with it that can be used in different circumstances: a source routine (that provides the data in the channel), a storage routine (that can store the data in memory) and a sink routine (that stores the channel data in its destination location).
Some macro-level routines are encoded as single instructions, e.g. DMA fetch & increment address, store in buffer, read buffer to output, DMA store & increment address, etc. This lets them be microcoded to execute as quickly as possible, and hopefully in a single cycle.
The processor will have a FIFO into which requests to activate service routines are placed, along with data for them (e.g. when a channel receives data from an external source, this is pushed into the FIFO).
Whenever no service routine is executing, an entry is pulled out of the FIFO and used to determine what to do next.
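
To make the scheme above a bit more concrete, here is a minimal software model of it in Python. It is only a sketch under stated assumptions, not the hardware design: the register count per channel (4), the routine names used as dictionary keys, the way registers are assigned, and the `memory` and `scratch` arrays are all made up for illustration. It shows the two-channel arrangement described above, with one channel doing DMA and handing each byte to a second channel that stores it in a buffer.

```python
from collections import deque

NUM_CHANNELS = 16        # from the design notes above
REGS_PER_CHANNEL = 4     # assumption: the notes only say the count is restricted


class Channel:
    def __init__(self):
        self.regs = [0] * REGS_PER_CHANNEL   # general-purpose, meaning set per channel
        self.routines = {}                   # e.g. "source", "storage", "sink"


class Node:
    def __init__(self):
        self.channels = [Channel() for _ in range(NUM_CHANNELS)]
        self.fifo = deque()                  # pending (channel, routine, data) requests

    def request(self, chan, routine, data=None):
        """Queue a request to activate one of a channel's service routines."""
        self.fifo.append((chan, routine, data))

    def step(self):
        """If a request is pending, pull it from the FIFO and run it to completion."""
        if not self.fifo:
            return False
        chan, routine, data = self.fifo.popleft()
        self.channels[chan].routines[routine](self, chan, data)
        return True

    def run(self):
        while self.step():
            pass


memory = bytearray(b"\x12\x34\x56")   # stand-in for DMA-visible memory (hypothetical)
scratch = bytearray(8)                # stand-in for scratch memory (hypothetical)


def dma_fetch_and_increment(node, chan, data):
    # Assumed register use: regs[0] = DMA address, regs[1] = destination channel.
    ch = node.channels[chan]
    byte = memory[ch.regs[0]]
    ch.regs[0] += 1
    node.request(ch.regs[1], "storage", byte)


def store_in_buffer(node, chan, data):
    # Assumed register use: regs[0] = write pointer, regs[1] = end of buffer.
    ch = node.channels[chan]
    if ch.regs[0] < ch.regs[1]:
        scratch[ch.regs[0]] = data
        ch.regs[0] += 1


node = Node()
node.channels[0].routines["source"] = dma_fetch_and_increment
node.channels[0].regs[1] = 1          # channel 0 passes its results to channel 1
node.channels[1].routines["storage"] = store_in_buffer
node.channels[1].regs[1] = len(scratch)

for _ in range(3):
    node.request(0, "source")
node.run()
```

After `node.run()`, the first three bytes of `scratch` hold the three bytes fetched from `memory`. Neither routine knows what the other channel's registers mean, which is the point of keeping the registers general purpose.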

My aim is to be able to pull a byte from DMA, extract two 4-bit fields from it, and push the two results to output ports, all in 4 cycles. That'll require some efficient implementation, but I hope it will be possible.
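
The data transformation in that target case can be sketched in a few lines of Python. This only models what happens to the byte, not the cycle timing; the function names, the use of lists as output ports, and the choice of which nibble goes to which port are assumptions for illustration.

```python
def split_nibbles(byte):
    """Return the (high, low) 4-bit fields of an 8-bit value."""
    return (byte >> 4) & 0xF, byte & 0xF


def fetch_split_push(memory, address, port_a, port_b):
    """Fetch one byte, split it into two 4-bit fields, push one to each port.

    Returns the incremented address, mirroring a fetch-and-increment step.
    """
    byte = memory[address]           # DMA fetch
    high, low = split_nibbles(byte)  # extract the two fields
    port_a.append(high)              # assumed: high nibble goes to the first port
    port_b.append(low)               # assumed: low nibble goes to the second port
    return address + 1


port_a, port_b = [], []
next_address = fetch_split_push(bytes([0xA5]), 0, port_a, port_b)
```

With the input byte `0xA5`, `port_a` receives `0xA` and `port_b` receives `0x5`.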

Inspiration from the design of the CDC6600 Peripheral Processor
