Before getting back to general architecture, let's talk about Super Mode, because that's pretty fundamental to the memory protection scheme.
This is going to be quite a long brain dump of where I've got to in thinking about this. I'll try to add some structure, and hopefully it won't be too hard to follow, but there may be way too much information for comfort here, and some things might not make sense! Feel free to ask if that's the case, maybe I missed something, or could explain my thinking better when less tired!
Paging vs Protection
Most of what I've described so far is more about memory paging than protection. It's true that pages that aren't mapped can't be accessed by a process, but the page table itself, and the PID register, and the system I/O, are all vulnerabilities that user processes shouldn't have direct access to, except in special cases where access is explicitly given and the risk of privilege escalation is accepted.
As I said before, transparent address translation is pretty fundamental, partly because of the level of protection it does automatically provide, and partly because the 6502 stack, and to some extent zero page, are at fixed locations in memory. It would be possible to segment up those bottom two pages and apply protection checks based on the PID for example, but it would limit the system to a very small number of PIDs and/or not much stack/zero page space per process. So transparent paging based on remapping address lines through a page table is obviously the way to go.
We do need some way to set up the page table though, and some way to create new user-mode processes, and a way to switch which process is running. User-mode processes themselves shouldn't have the ability to do this directly, as it's open to abuse - so they need to be able to execute some trusted code that can set these things up without leaving any loopholes to be exploited. That code is the supervisor.
Hardware difference between supervisor mode and user mode
What is it about the supervisor that means it does have access to set these things up, while user processes don't? It's an interesting question. I've considered three main answers for it.
- The supervisor code is running from ROM, i.e. code running at certain addresses is trusted
- There's a special hardware state that's activated on entering supervisor mode, and deactivated on leaving it (a "super flag" or "user flag", I've used both terms I think)
- The supervisor simply has mappings in its page table that other processes don't (but could) have
1. Supervisor code is in ROM; code at these addresses is trusted
Regarding the first one, it is extremely limiting from a security perspective. The big problem with it is that a user program could call into the supervisor code in ROM at any address, and it could be possible to find a particular address that could be called which, though it's not an intended entry point, would lead to some breach of security that the user process could exploit.
This can be mitigated by vetting the code to ensure no such vulnerabilities exist; but that's impractical for a large amount of code, and easy to get wrong.
Another issue is that it precludes choosing to delegate the authority. I don't consider that a showstopper, but it's a shame.
2. Hardware state to indicate Super vs User mode
The next possibility is some kind of hardware flag that indicates whether the system is currently executing in Super mode or User mode. There are various forms this could take, but for now imagine a simple 1-bit D flip flop (74HCT74) that's set or clear to indicate the mode status.
As it's a persistent flag, the key point to consider is when the transition takes place. In particular, transitioning into super mode must be carefully controlled - it must not be possible to transition into super mode and immediately execute arbitrary code.
I'll come back to this in a bit, but want to discuss option 3 first.
3. Supervisor has privileges because of page table content
The whole idea of the page table is to provide different processes with different views of the system, and especially to limit processes to accessing resources that they are entitled to access. So it's natural to wonder whether we could use it to also limit access to fundamental system configuration aspects like the content of the page table itself.
The basic idea here would be that, even in supervisor code, all bus accesses get mapped through the page table. Instead of plugging a decoder into the high bits of the CPU address bus, it would operate based on the output from the page table. Perhaps the high three bits encode 000 for a RAM access, 001 for a ROM access, 010 for accessing a VIA, 011 for reading/writing the page table, etc. It's the same as what we've always done on the CPU address bus, but applied after address translation.
It is appealing and I might pursue it one day, but there are two big problems to overcome if I do - firstly, something needs to set the page table up initially, otherwise it won't have the entries it needs to permit the supervisor to initialise it; and secondly, writing to the page table would require two accesses to the page table RAM within the same clock cycle - first, to read the translated address; and second, to write new data into the page table. The PT's input bits would need to be set differently on the second access, and that's tricky to arrange. Not impossible though.
How to enter Super Mode
I want to go back to option 2 now, as it seems to be the most likely choice for a first implementation - it's simple enough to be practical, but powerful enough to achieve most of the goals. So given that there is a persistent Super mode flag, let's think about when and how it would get set.
Using an interrupt
One common way to manage this in many architectures is to make it so that the only way to transition into super mode is through an interrupt. Interrupts are special because the CPU drops whatever it's doing and runs code at a specific address, and we can carefully control what that code is to make sure there's no way the user can set some bizarre state just before the interrupt that exploits a vulnerability. There's now only one place in the code that we have to fortify.
Interrupts usually come from hardware, and in these cases it is also natural that the supervisor gets to handle the interrupt. However, they can also come from software. ARM has the SWI (software interrupt) instruction; x86 has the INT instruction; and the 6502 has the BRK instruction.
BRK causes a regular interrupt sequence, but also stores a flag that you can use to detect that it was a BRK rather than a hardware interrupt; and the byte following the BRK instruction is skipped in execution, meaning you can store an operation code there if you want to let the user process choose between different functions the supervisor can provide. For example, function 0 might be "terminate process"; function 1 might be "give me more memory"; function 2 could be "sleep for a while". This methodology is very common in newer architectures and operating systems. Normal registers can also be used for additional inputs, and outputs.
One issue with doing this on the 6502 is a CPU bug which manifests if a hardware interrupt occurs at the same time as a BRK instruction is about to be executed. In this case the hardware interrupt wins, and the BRK is not properly executed - or viewed another way, the BRK is executed, but without setting the bit that tells the ISR that it was a BRK. It's possible to work around this, but kind of annoying. I think it was fixed on the WDC chips at least, but that means you need to not work around it on those... ick.
Another disadvantage to the BRK flow is that it's relatively slow, taking quite a few extra cycles.
But how can we detect interrupts and set the flag? On a WDC CPU that's actually pretty simple - it has a specific pin output called VPB that gets set whenever it's reading one of the vectors, which happens very late in the interrupt process, just before it starts executing the ISR code. For legacy NMOS CPUs it would be much harder to detect.
Based on execution address
An alternative to using BRK is to enable super mode when an opcode is about to be fetched from a specific ROM address (or multiple addresses). If user code calls these addresses, then the system transitions to super mode and can continue to run the ROM code; if the user code calls an unapproved address in ROM then super mode would not be activated, and the user process should be blocked from reading from ROM, and get terminated instead.
In terms of implementation, it's fairly simple to check in hardware for certain addresses or bit patterns on the address bus, simultaneous with "sync" being high which indicates that the CPU is starting an instruction, and use that to set the super flag. You mustn't do it when "sync" is low, as user code could simply try to read from these addresses to enter super mode, and it shouldn't be able to do that.
Assuming the VPB test above isn't also being used, it would be important to ensure that IRQ and NMI handlers go through an approved address, so that they also have the flag set - otherwise they'll just keep running in user mode. The vector addresses would also need to be readable from user processes, as when the interrupt is first picked up, the system will not yet have switched to super mode.
This method can let user code choose between different entry points for different supervisor functions, which could have its benefits, being more efficient than parameter-passing by other means.
Based on a specific opcode
Finally, I thought it may be possible to enter super mode when a certain opcode is detected on the data bus, with "sync" high. The instruction would need to be something benign, or we'd need some juggling on the bus to send a different opcode through to the CPU. Ultimately I don't really think this is worth considering deeply - it would add a lot of hacky complexity, and doesn't seem to have significant benefits.
How to leave Super mode
On the other hand, how could the supervisor return control to a user program? This is less of a minefield security-wise because the supervisor is in control from the start, but of course it needs to happen smoothly, with the user program - which may have been involuntarily pre-empted - able to continue from where it left off. It's also important that it does happen properly and cleanly, i.e. that there aren't cases where execution could return to user code without dropping the super flag first, or cases where the flag getting cleared at a bad time causes memory mapping issues.
Note that the supervisor won't necessarily want to pass control back to the same user process that was running already. Especially when pre-emptively task switching, it needs to hand control to a different process.
Returning to an interrupted process
Let's start with the case of returning to a user process which lost focus due to an interrupt - either BRK or a hardware interrupt - because whether or not interrupts are the sole way to enter Super mode (as discussed above), they will certainly need to be at least one valid way to enter Super mode.
When an interrupt occurs while a user process is running, the CPU will wait until the end of an instruction, and then push the two-byte address of the next instruction onto the stack, followed by a copy of the status register (possibly with B cleared, if it was a hardware interrupt), and it will then fetch the ISR address from one of the vectors at the top of the address space. It's important to note that the ISR address is fetched last - and it is the earliest time we'd possibly know that we're entering Super mode (i.e. if we're using VPB to detect this). This means that all the stack activity so far must happen on the process's own stack.
In order to return to this interrupted process, we can either pick these values off the stack ourselves, or use an RTI instruction to do it. Either way, when we're done, the CPU will start executing from where it left off, and we need to be back in User mode by that point.
One way to trigger the transition to User mode could be to just always do it whenever the CPU fetches an opcode from RAM rather than ROM - or specifically from outside of a defined range of addresses that contain only the supervisor's code, and that can only execute in Super mode. However, this only works if the ROM remains visible during User mode, to guarantee that user code is never executing from that address range.
I don't really want to keep the supervisor code present in the user process's address map though, so need another way to detect leaving Super mode. Another option, then, is to have reading an instruction from a specific address be the cue to leave Super mode. That instruction could be the RTI itself. So by placing the RTI at a known address, with hardware to detect "sync" being high while that address is on the bus, we could derive a signal to turn off the Super mode flip flop.
This is pretty much the mirror image of one of the techniques for detecting entry into Super mode above. The RTI instruction could be in ROM as usual, or it could be placed at a specific address in the user process's RAM space, e.g. $00 or $0100.
Finally, the brute force option is for the supervisor code to have a way (e.g. memory-mapped I/O) to disable the Super flag explicitly. This is pretty clumsy, and one potential issue with it is if disabling the Super flag affects the memory map, for example if it's not acceptable to execute code from ROM in user mode, then as soon as the flag's turned off, the rest of the "return" code can't run - unless it's copied into RAM or something like that. This code could be simply a single RTI instruction, which is just one byte to copy.
Returning to a non-interrupted process
The other case to consider is where the user process was not interrupted by an interrupt - it made a system call itself, under its own power. If the mechanism for this in BRK then in fact it still behaves like an interrupt and the above all applies. But if the mechanism is a simple JSR, then we can't follow up with an RTI for two reasons - firstly, RTI will pop the flags off the stack but they were never pushed; and secondly, RTI pops the actual return address from the stack while JSR pushes one less than the return address. Both of these can be fixed up of course as part of the entry code, so the RTI is still a good option so long as we take care to do that.
Otherwise we could consider an alternate return path that uses RTS instead of RTI. I think the general theory is exactly the same, so it's not a big deal to support this as well. Especially if we use the method where the RTI instruction is stored in RAM at $00 or $0100, it's really easy for us to put an RTS instruction there instead when we need to.
General considerations when returning control to user processes
Generally we want to restore all register values before returning. This is critical for interrupts, but there are exceptions for syscalls because they may actually need to return data in registers.
After control is passed to the supervisor and the super flag is set, we need to be careful with the memory map. If entering super mode fundamentally changes the memory map, then the user's stack will no longer be visible; yet the stack pointer will still try to point into it. Any attempts to save the A, X, Y, and SP registers will likely write to supervisor memory, not the process's own memory. It's not a problem per se but needs to be borne in mind, especially when returning again.
Let's explore what things look like if the memory map has indeed changed by this point.
The supervisor itself won't have any useful information in its own stack, so there's one less thing to worry about there. It's fine to reset the stack pointer to maximum and carry on from there. Or to not bother, and just keep pushing data wherever it's ended up temporarily, if that turns out to be useful.
It will be necessary to save the user process's A, X, Y, and SP registers somewhere safe, potentially for a while. Due to scheduling, we might return control to a different process, and have to come back later to resume this one. It makes sense to allocate an area of memory for storing each process's saved register states. In order to do this, we need to know which process is currently running, so we can index with it. In order to index, we also need the X or Y register free. So there will be some work to do before we can actually save the values.
To determine the PID, we could perhaps read it from an I/O address mapped to the PID register. But I don't really want to have to make that register bidirectional, due to the pinout of the '273. So let's say we just store a shadow copy of the active PID in super RAM somewhere, e.g. $40, so that it's there waiting for us when the process is suspended. We can store the Y register somewhere temporarily, e.g. $41, and then load the PID into Y from $40 to help us store the rest of the registers:
zp_pid = $40 zp_temp_y = $41 ... sty zp_temp_y ; save Y for now ldy zp_pid ; load PID into Y sta $200,y ; save A value stx $280,y ; save X value lda zp_temp_y ; restore saved Y value into A sta $300,y ; save original Y value tsx ; get the stack pointer into X stx $380,y ; save the old stack pointer
This is using super RAM from $200 to $400 to store saved register values, for up to 128 user processes.
Again, this is assuming that the memory map has already changed by this point, e.g. assuming that setting the Super flag forces the supervisor's stack and other low memory to be paged in. If that's not the case, then at this stage the user process's memory will still be visible. The options are actually simpler in this case - there's no need to explicitly store the registers in super RAM, we can store them in user RAM instead:
zp_save_a = $60 zp_save_x = $61 zp_save_y = $62 zp_save_sp = $63 ... sta zp_save_a stx zp_save_x sty zp_save_y tsx stx zp_save_sp
This is obviously much simpler, both for saving and restoring the registers. There's no need to use the stack here - it doesn't need to be re-entrant. The user process won't run again until we've already restored these.
So both of these are viable, which applies will depend upon exactly how the address translation responds to being in super mode.
Going back to syscalls that need to return data, though - it's fairly easy to see that we can still return values in A, X, and Y by simply updating the stored values - either in user RAM or super RAM - before returning.
This was a long and dense one, but maybe it's been interesting. It's certainly useful for me to brain dump this all out to a log. I can't vouch for the accuracy of sense of these plans yet, and I'm keeping quite a few options open at the moment, until there are clear reasons to go one way or another. Feel free to ask what I mean though if something doesn't make sense!
One conclusion that is standing out at the moment is that if I drop the requirement to also work on NMOS 6502s, and instead focus on WDC 65C02s, this would all be easier to do in a neat and tidy way - the IRQ/BRK bug is fixed in those, and they have a VPB signal. The reason I care about NMOS 6502s though, is that I have one and might want to use that one for this project. As such, and because I kind of swap back and forth sometimes, I prefer to design things that will work on both.
Tomorrow I'll post a simplified sketch for a first prototype, less ambitious than the more full-fat design I posted today, and I'll discuss a bit how the Super Flag might work in that design, as well as how the supervisor would manipulate the page table and access user process memory.