As there are a lot of potential moving parts, and whole subsystems that aren't directly related to what I'm trying to achieve, I wanted to be clear about how important various features are to me. This then guides my thinking when deciding what to design, what to prototype, and where to compromise.
This takes the form of a MoSCoW breakdown.
prototype0b-hardware.mp4: This is the new hardware I've patched into my existing computer. On the left is the page table RAM and a NAND IC; on the right is a modified write-only debug port being used as a PID register.
MPEG-4 Video - 39.58 MB - 10/01/2020 at 01:20
prototype0b-output.mp4: Video output from Prototype 0b. Two processes are running alongside the supervisor, with preemptive multitasking driven by wiring the 50Hz vsync signal into the CPU's NMI pin. RAM is split into eight pages; each process gets one for its stack and other storage, plus one of the four pages that cover video memory. The supervisor also has a page for its general storage and a page of video memory. Each process prints its register contents and increments X continually while running. The supervisor displays the total number of times it has run the scheduler.
MPEG-4 Video - 30.02 MB - 10/01/2020 at 01:19
blockdiagram-prototype0-2.png: Block diagram of Prototype 0 - extending the original architecture with paged memory and multiprocessing, but not protection yet
Portable Network Graphics (PNG) - 79.89 kB - 09/27/2020 at 15:26
blockdiagram-originalcomputer2.png: Block diagram of the original computer that I'm using as a base to extend and add protected memory
Portable Network Graphics (PNG) - 47.96 kB - 09/27/2020 at 15:23
Prototype 0b - close, but no cigar
George Foot • 10/01/2020 at 22:17
Shortly after my last log, I went ahead and wired up the 6502's NMI pin to a vsync-style pin in my video output circuit (actually marking the end of the visible region rather than the actual sync pulse). I added some basic NMI handler code - the same as the IRQ/BRK handler, but simpler because there's no need to detect BRK vs IRQ, and no need to read the BRK instruction's argument byte - and tried it out.
It seemed to work first time, which was satisfying but not a surprise: because the basic task switching was already done through software interrupts, it was pretty natural that it would also work from hardware interrupts.
Here's a video of this in action, and a screenshot for quick reference:
What it's doing
The 32K of RAM is divided into eight pages. Four of the pages correspond to the area of RAM that the video circuit displays on the monitor. The RAM is not cleared on startup, hence the weird random background pattern you see.
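To make the page arithmetic concrete, here is a minimal sketch of how an address maps onto one of the eight pages. It assumes the pages are equal in size and numbered in order of physical address, which this log does not state explicitly; `page_of` and `page_range` are hypothetical helper names, not part of the real system.

```python
# Sketch only: assumes eight equal, contiguous pages numbered by address.
RAM_SIZE = 32 * 1024                 # 32K of RAM, as described above
NUM_PAGES = 8
PAGE_SIZE = RAM_SIZE // NUM_PAGES    # 4096 bytes per page


def page_of(addr):
    """Return the page number (0-7) that contains a RAM address."""
    if not 0 <= addr < RAM_SIZE:
        raise ValueError(f"address {addr:#06x} is outside the 32K of RAM")
    return addr // PAGE_SIZE


def page_range(page):
    """Return the (first, last) address covered by a page."""
    if not 0 <= page < NUM_PAGES:
        raise ValueError(f"page {page} does not exist")
    start = page * PAGE_SIZE
    return start, start + PAGE_SIZE - 1


if __name__ == "__main__":
    for page in range(NUM_PAGES):
        first, last = page_range(page)
        print(f"page {page}: {first:#06x}-{last:#06x}")
```

Under these assumptions each page is 4K, so the page number is simply the top bits of the address.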
The supervisor takes page 7 (the bottom quarter of the screen) for its own video output, and displays a counter showing the number of context switches that have occurred. It also takes page 1 to use for its zero page, stack, and general memory.
Then it spawns two instances of a test program. Each gets a page of non-video memory (page 2 and page 3) for stack and general use, plus a page of video memory (pages 4 and 5). These processes just show their process ID, then sit in a tight loop printing their register contents onto the screen. The NMIs interrupt them and the supervisor schedules a new process to run, using a least-recently-used queue.
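The scheduling policy described above can be sketched as a simple queue: the interrupted process goes to the back, and the process that has waited longest runs next. This is an illustrative model in Python, not the supervisor's actual 6502 code; the `Scheduler` class and its method names are hypothetical.

```python
from collections import deque


class Scheduler:
    """Toy model of a least-recently-used run queue (names are hypothetical)."""

    def __init__(self):
        self.ready = deque()   # processes waiting to run, oldest first
        self.current = None    # PID currently running, if any
        self.switches = 0      # counter like the one the supervisor displays

    def spawn(self, pid):
        """Add a new process to the back of the queue."""
        self.ready.append(pid)

    def on_interrupt(self):
        """Called on each NMI: requeue the interrupted process, pick the next."""
        if self.current is not None:
            self.ready.append(self.current)
        self.current = self.ready.popleft() if self.ready else None
        self.switches += 1
        return self.current


if __name__ == "__main__":
    sched = Scheduler()
    sched.spawn(1)
    sched.spawn(2)
    for _ in range(4):
        print("now running PID", sched.on_interrupt())
```

With only two processes this degenerates into strict alternation, which matches what the video shows.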
So what's the problem?
This works fine in most cases, but because NMIs are non-maskable, I really need to ensure that all my code - including all the supervisor code - is OK to be interrupted. Easier said than done! Let's go through some of the issues and resolutions:
1. My NMI and IRQ handlers were not reentrant
As I was initially just writing a BRK handler, I didn't make it support being re-entered - the supervisor will never call BRK itself anyway. It's much simpler and more efficient to save the registers to fixed memory locations than to push them onto the stack. Still, with hardware interrupts a possibility, this code needed to use the stack instead.
As a technical detail, it's still the case that neither the NMI handler nor the BRK handler will actually get re-entered while they're already running. The supervisor indeed never issues a BRK, and the NMI is triggered at a low enough frequency by hardware that it will never trigger quickly enough to catch the supervisor before it's handed control back to a user process. However, the critical case that can occur is when a BRK is executing, and the NMI fires. That's the case that needs to be guarded against. So it would be sufficient to just make one or the other be reentrant.
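The difference between the two save strategies can be shown with a small simulation of the critical case, an NMI arriving while a BRK is being handled. This is a sketch in Python rather than the real handler code; the class names and register values are made up for illustration.

```python
class FixedSlotSaver:
    """Saves registers to one fixed location, like the original BRK handler."""

    def __init__(self):
        self.slot = None

    def enter(self, regs):
        self.slot = dict(regs)   # overwrites whatever was saved before

    def leave(self):
        return self.slot


class StackSaver:
    """Saves registers by pushing them, so nested entries do not collide."""

    def __init__(self):
        self.stack = []

    def enter(self, regs):
        self.stack.append(dict(regs))

    def leave(self):
        return self.stack.pop()


def user_state_survives(saver):
    """Simulate BRK entry, then an NMI before the BRK handler has finished."""
    user_regs = {"A": 1, "X": 2, "Y": 3}          # illustrative values
    brk_handler_regs = {"A": 0x40, "X": 0, "Y": 0}
    saver.enter(user_regs)          # BRK handler saves the user's registers
    saver.enter(brk_handler_regs)   # NMI fires and saves the handler's registers
    saver.leave()                   # NMI handler returns to the BRK handler
    return saver.leave() == user_regs   # BRK handler returns to the user


if __name__ == "__main__":
    print("fixed slot keeps user state:", user_state_survives(FixedSlotSaver()))
    print("stack keeps user state:", user_state_survives(StackSaver()))
```

The fixed-slot version loses the user's registers because the second save overwrites the first, which is the failure the stack-based handler avoids.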
2. NMI can occur while the supervisor itself is active
As it's possible for an NMI to occur while a BRK is being handled, the NMI handler needs to cope with interrupting the supervisor. The general behaviour needs to be different, because the supervisor is not currently activated like a normal user process. In fact, given a way to detect that the supervisor is running, the easiest thing to do is just exit the NMI as quickly as possible without doing any damage. The only purpose of the NMI is to forcefully interrupt user processes.
The difficulty is reliably detecting that it was the supervisor rather than a user process that was interrupted. At the hardware level, I didn't build in a way to read back from the PID register. I tried shadowing it in software, but this was futile because it's not possible to update the shadow and the actual PID simultaneously. There will always be a gap where they don't match, and an NMI in that gap is impossible to handle cleanly.
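The gap can be illustrated by listing every state an NMI could observe during a switch. The sketch below models the update as two separate steps and reports the states in which the shadow and the real register disagree. Using PID 0 for the supervisor is an assumption made for the example, and the function names are hypothetical.

```python
SUPERVISOR_PID = 0   # assumption for illustration only


def states_during_switch(old_pid, new_pid, shadow_first):
    """Yield (hardware_pid, shadow_pid) at each point an NMI could arrive."""
    hw, shadow = old_pid, old_pid
    yield hw, shadow                 # before the switch starts
    if shadow_first:
        shadow = new_pid             # step 1: update the software shadow
        yield hw, shadow
        hw = new_pid                 # step 2: write the PID register
    else:
        hw = new_pid                 # step 1: write the PID register
        yield hw, shadow
        shadow = new_pid             # step 2: update the software shadow
    yield hw, shadow                 # after the switch completes


def mismatch_windows(old_pid, new_pid, shadow_first):
    """Return the observable states where the shadow is wrong."""
    return [
        (hw, shadow)
        for hw, shadow in states_during_switch(old_pid, new_pid, shadow_first)
        if hw != shadow
    ]


if __name__ == "__main__":
    for shadow_first in (True, False):
        print(shadow_first, mismatch_windows(SUPERVISOR_PID, 2, shadow_first))
```

Whichever order is chosen, one intermediate state has the shadow disagreeing with the hardware, so an NMI landing there would misidentify who was interrupted.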
3. NMI pushes...
Prototyping Roadmap
George Foot • 09/27/2020 at 18:17
Last weekend I thought quite a way ahead, in terms of where the architecture needs to end up, and wrote up the basic requirements I have for this project, which you can find in the project's Details section.
There are still a lot of open questions about the designs I sketched out last time - in some respects they don't fully achieve what I wanted, but they're already getting complex and I don't want to overthink them at this stage. Instead I want to plan some steps along the way to getting to a point where I can make better decisions about those designs from experience based on earlier prototypes.
So here's the initial roadmap. I'm only planning the first few stages - later stages are subject to change as I go through:
- Base computer - start from a working base and extend from there
- Prototype 0a - adding paging and cooperative multitasking, but without expanding to more RAM, and without memory protection yet
- Prototype 0b - adding pre-emptive multitasking (unless I decide to defer that)
- Prototype 1 - adding simple memory protection so processes can't interfere with the page table
- Take stock and make a better plan for the next steps
When to enter/leave Super Mode?
George Foot • 09/22/2020 at 00:56
Before getting back to general architecture, let's talk about Super Mode, because that's pretty fundamental to the memory protection scheme.
This is going to be quite a long brain dump of where I've got to in thinking about this. I'll try to add some structure, and hopefully it won't be too hard to follow, but there may be way too much information for comfort here, and some things might not make sense! Feel free to ask if that's the case - maybe I missed something, or could have explained my thinking better when less tired!
Paging vs Protection
Most of what I've described so far is more about memory paging than protection. It's true that pages that aren't mapped can't be accessed by a process, but the page table itself, and the PID register, and the system I/O, are all vulnerabilities that user processes shouldn't have direct access to, except in special cases where access is explicitly given and the risk of privilege escalation is accepted.
As I said before, transparent address translation is pretty fundamental, partly because of the level of protection it does automatically provide, and partly because the 6502 stack, and to some extent zero page, are at fixed locations in memory. It would be possible to segment up those bottom two pages and apply protection checks based on the PID for example, but it would limit the system to a very small number of PIDs and/or not much stack/zero page space per process. So transparent paging based on remapping address lines through a page table is obviously the way to go.
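A minimal sketch of what transparent translation means in practice: the top bits of the logical address, together with the PID, select a page table entry, and that entry replaces the top bits to form the physical address. The 4K page size is borrowed from the Prototype 0b description and may not match the final design; the `PageTable` class, its methods, and the mappings are all hypothetical. Note that "page" here means a 4K translation page, not a 256-byte 6502 page.

```python
PAGE_BITS = 12                        # assumes 4K pages, as in Prototype 0b
OFFSET_MASK = (1 << PAGE_BITS) - 1


class PageTable:
    """Toy page table keyed by (pid, logical page); layout is an assumption."""

    def __init__(self):
        self.entries = {}

    def map(self, pid, logical_page, physical_page):
        self.entries[(pid, logical_page)] = physical_page

    def translate(self, pid, addr):
        """Translate a 16-bit logical address for the given process."""
        logical_page = addr >> PAGE_BITS
        offset = addr & OFFSET_MASK
        try:
            physical_page = self.entries[(pid, logical_page)]
        except KeyError:
            raise LookupError(
                f"PID {pid} has no mapping for logical page {logical_page}"
            ) from None
        return (physical_page << PAGE_BITS) | offset


if __name__ == "__main__":
    table = PageTable()
    table.map(pid=1, logical_page=0, physical_page=2)
    table.map(pid=2, logical_page=0, physical_page=3)
    # Both processes use the fixed 6502 stack address, but land in different RAM.
    for pid in (1, 2):
        print(pid, hex(table.translate(pid, 0x01FF)))
```

Because the 6502's zero page and stack both sit in logical page 0, giving each process its own mapping for that page is enough to give it a private stack.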
We do need some way to set up the page table though, and some way to create new user-mode processes, and a way to switch which process is running. User-mode processes themselves shouldn't have the ability to do this directly, as it's open to abuse - so they need to be able to execute some trusted code that can set these things up without leaving any loopholes to be exploited. That code is the supervisor.
Address decoding logic
George Foot • 09/21/2020 at 21:03
Regarding the decoder, it felt like a really good fit for the PLDs that Dawid Buchwald had been recommending recently, so I thought a bit about how that logic works. What does the supervisor need to be able to do? How can it set up the various control lines to achieve it? How many outputs does the decoder need? Here is the resulting logic table:
Requirements and architecture brainstorming
George Foot • 09/21/2020 at 20:32
Over the weekend I thought a bit about what I want to achieve, what matters most, what I'm not so interested in doing, and drew up a list of requirements to help me balance various design choices. It's in the description above. It will evolve over time, but it's still helpful to have a reference written down.
I also thought a bit about architecture. In order to support multiple processes on a 6502 you pretty much need address translation, otherwise they all have to share the same stack space at 0100-01FF. It's possible to do that (GeckOS does when running on the C64) and I think it's also possible to do it in a safe "protected" way, but it's not very elegant. My goal is to explore the benefits of hardware support anyway. Using memory paging and address translation is more in line with how larger real-world architectures work.
So I thought a bit about how to organise that, what addressing scheme to use, and drew some diagrams on paper. These are horrible to read, but they make sense to me and I wanted to share them anyway. When things are more concrete I'll make some clearer ones electronically.
The first is just brainstorming the addressing scheme:
Inspiration and existing projects
George Foot • 09/20/2020 at 18:36
I wanted to call out where some of the inspiration to do this came from, and also point out another project here on Hackaday that's touched on the topic.
Inspiration - GeckOS by André Fachat
The main inspiration for me was about a year ago when I saw the YouTube video that I've linked at the end of this log entry. It's a presentation by Glenn Holmer from a retro computing convention, in which he demonstrates a multitasking OS running on a 6502-based Commodore 64.
The OS in question is called GeckOS, and it's by André Fachat. Any time you search for things like this on a 6502 you're likely to end up on André's web site! He was trailblazing this kind of thing back in the 90s, on both the software and the hardware side.
My first thought when I saw this presentation was that there are some really simple things you could do in the hardware to make this work a lot better. And of course that's exactly what André did - it's not shown in this presentation, as the presenter is only using a C64, but André was building his own monster of a homebrew computer, with 6502s and other processors all slotting together, and at least at some stages he was using quite a versatile MMU.
André's system did have some elements of memory protection, including quite advanced features such as virtual memory (using page faults to page from storage), and read-only and no-execute pages. I'm not sure whether he implemented a privilege system though - from his designs it looks like untrusted code could break out of its sandbox.
I also wanted to point out this other project here on Hackaday that has some similar needs. It was in development a couple of years ago, though I'm not sure whether it still is: https://hackaday.io/project/98837-8-bit-portable-internet-enabled-computer
They are thinking along similar lines, in any case, and of course, referring back to André's excellent work as well.
If you're aware of any other interesting attempts at this, please do let me know. I'll probably still forge my own path, as this is all about the journey for me, but I'd still love to check them out!
Why make a protected memory environment on a 6502 breadboard computer?
George Foot • 09/20/2020 at 17:33
Over the years I've found my homebrew 6502 computer to be an incredibly valuable way to experiment with and understand computer hardware concepts. I've learned about clock signals, memory, buses, address decoding, and memory banking; I've built my own video and audio output circuits, and keyboards that connect straight to the bus; I've experimented with the 6522 VIA's timers and shift register, and wired peripherals up to that. I've also learned a lot about digital electronics, how and why things work, and what not to do!
A lot of what I've done has been driven by understanding how the BBC Micro that I used in the 80s worked - but I'm also interested in other concepts that were less relevant on early 80s microcomputers.
Protected memory systems had existed long before the 6502 was produced, but were only relevant on much larger systems, which had multiple concurrent users and where enforcing privacy was important. In later decades, running partially-trusted software became so common that computer security became relevant for home users as well - but for the general applications of a 6502, especially with the memory constraints of the age, it was never very relevant.
Nevertheless, even if these concepts aren't useful on small 8-bit computers, they're still interesting and educational to experiment with. The simple architecture and lack of advanced features built in to the 6502 itself mean that you have to create the more complex systems yourself - and this means you have the freedom to try doing it in ways that make sense to you, and discover the pros and cons yourself, without being forced to do it in the way a more advanced CPU requires you to.
I don't particularly expect to make a useful protected memory environment on my 6502; but I do hope to make something simple, elegant, and understandable, and learn a lot in the process!
I suggest two techniques to detect an interrupt sequence on NMOS 6502 which lacks the vector pull pin: 1. Three consecutive writes can only happen during an interrupt sequence (BRK, IRQ, or NMI). 2. A sync cycle followed by a read from the same address. It only happens during interrupt sequences. All other instructions cause a sync cycle followed by a read cycle from the incremented address. More discussion: http://forum.6502.org/viewtopic.php?f=4&t=3452
Very interesting, thanks for the reference! All three writes are stack writes too I guess. I had also considered detecting sync on the specific ROM address of the ISR. The user could JMP there but I couldn't think of a way to abuse that.
Yes, three consecutive stack writes. I thought about using sync on a specific address too but it would preclude user processes from using the full 64K address space. Not that it would be a serious limitation...
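The first technique from this thread can be sketched as a bus monitor that counts consecutive write cycles. The Python model below works on a recorded trace of read/write cycles rather than real hardware signals. The two sample traces follow the documented 6502 cycle sequences for JSR (two stack writes) and BRK (three stack writes); the function name is hypothetical.

```python
def detect_interrupt_sequences(cycles):
    """Return the indices of cycles that complete three consecutive writes.

    `cycles` is a sequence of 'R' (read) and 'W' (write) bus cycles.
    """
    hits = []
    run = 0
    for index, cycle in enumerate(cycles):
        run = run + 1 if cycle == "W" else 0
        if run == 3:
            hits.append(index)
    return hits


if __name__ == "__main__":
    jsr_trace = "RRRWWR"    # JSR: only two consecutive stack writes
    brk_trace = "RRWWWRR"   # BRK: pushes PCH, PCL and P in a row
    print("JSR:", detect_interrupt_sequences(jsr_trace))
    print("BRK:", detect_interrupt_sequences(brk_trace))
```

In hardware the same idea would amount to a small counter clocked by the CPU clock and reset whenever the R/W line shows a read.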
Wonderful. Wondering if 6502-age me would have been capable of getting my head around something like this then...
I think I was too young myself - I've come to regret not being born just slightly earlier. But then I was lucky to be born when I was and not later!
I also regret not being more interested in hardware in those days - I dabbled but was nowhere near as interested as I am now.
Is there a finished schematic and example code?