Tagged Control Stack

A project log for F-CPU

The Freedom CPU project has a log here too now :-)

yann-guidon-ygdesYann Guidon / YGDES 08/08/2023 at 03:250 Comments

Tagged registers (see 14. Tagged registers) might not be the best idea ever. OK. But my latest musings about the stacks (#A stack.) and the support from various languages made me realise that a different view is necessary.

The usual C stack is a merged Data + Return + frame stack with many problems. FORTH (among a few others) uses a split system with Data separated from Return. And some languages implement their own data stacks for large variables that must be accessed by the caller.

There are also other things to consider : scopes.

Scopes can be a (set of local) variable name(s) or an error handler (try/catch) : they have their place on the stack because they need to be rewound in inverse order. I doubt it's a good idea to create a separate stack for error handlers, they belong to the "return stack" IMHO, otherwise how do you know where to reset the stack pointer ?

So I have just had this really weird idea : tag items on the return stack with a type.

For example we would have the usual call/return type for function calls, as well as the handler type for nested functions. A "canary" type would also help detect an error, for example. But imagine that the MSBs of the pointer (for example, could also be LSB so normal return goes to aligned words...) that is pushed on the stack flags an error handler.

There are languages that would be happy with such a HW-based support, and I have no clue how to hack C to perform this trick. But this is a reasonable, useful, safe set of features that can be implemented in hardware, even at the risk of having the return instruction potentially taking more than one cycle to complete.


So the CPU must support two sets of instructions to share the return stack (which is separate and HW-handled) :

"error" acts like a "return" as seen before but the "handler" instruction is not a call, instead it gives an address to be pushed on the return stack, so the instruction format remains the same, and it does not affect the pipeline's flow.

Of course the compiler must also ensure that context for the error can be reached, the data stack is not affected by these instructions so extra care is required for the frame. But that's not something that hardware must try to optimise.

What do you think ?

Another use of the scope is to automatically free resources that have been allocated. Ideally room is allocated on the stack, but open handles or complex stuff must also be popped off from some sort of something. So maybe a copy of the data stack pointer can also be stored,  just like the "push bp" idiom on x86 function prologs, but automatic here.


A "canary" type might not be necessary if there is a stack limit register.

However another required type is the IPC : Inter Process Call, which must remember the caller's address, caller's thread ID and potentially some other metadata/masks/flags...


20230908 : I'm trying to make a census of the required codes. 3 bits should be OK.

000 is invalid/reserved.

001 is for catch/error handling. It is missing a "frame pointer" to recover local data though.

010 is for normal call/return

011 is used by for/loop.

100, 101 and 110 are for IPC/IPR (110 is the return address, as saved by CALL, 101 would be the ID of the calling thread and 100 could be a bitmask of the "masked" registers to prevent data leaks, or could be credentials...)

111 could be used for "chaining" when/if blocks are moved from/to main memory, some DMA would be required to maintain a linked list or remember the position, with some granularity. That would be critical to speed up swapping a context during a switch.

Let's see if this works...


Thank you FORTHers !