Yet another homebrew computer based on 65C02 CPU
To make the experience fit your profile, pick a username and tell us what interests you.
Last time I wrote about my experiments with common operational amplifier, but obviously, there was certain context to that, and I found the topic worthy of another post. Again, the inspiration came from this amazing video by George Foot, and this time I would like to tell you a story of building simple adjustable active load circuit, and how it allowed me to break Ohm's Law. Twice!
Let's start at the beginning: adjustable active load is a circuit/device that allows you to simulate certain (and configurable) load on your system. It is critical for testing any kind of power circuits, but the greatest value of it is in the learning opportunity. Also, please note: if you are to test any kind of commercial product design, you should probably buy professional grade device for several hundred dollars.
However, should you decide to build your own, you get to use different components and while troubleshooting any issues you come across, you get to understand them much better. It was fun, it was full of surprises and discoveries. Strongly recommended!
On basic level such a load circuit will consume certain amount of current and convert it into energy, probably heat. If the idea were to use constant, defined voltage and constant, defined current, you could just use power resistor - and that would be it. Want 500mA at 5V? Using Ohm's law you calculate the resistance as 10 Ohm. Power dissipation will be 2,5W so make sure your resistor can handle that.
Things get more complicated when you want your load adjustable (in terms of current passing through) and working with different voltages. This is where single resistor will not suffice. I used the this blog entry as inspiration, and the schematic was as follows:
Let's explain how this circuit is supposed to work:
Now, what happens if the non-inverting input is higher than the 150mV measured on OpAmp inverting input? It will increase output voltage delivered to Q1 MOSFET gate and as a result, the MOSFET will pass more current through. This will keep happening until voltage drop on R4 is equal to the non-inverting input of OpAmp. Beautiful usage of the feedback loop!
What is also great - none of the input parameters to OpAmp depends on the input voltage. R4 voltage drop is calculated against ground, and RV1 output is always measured against the 2,5V reference voltage provided by U1 chip. Lovely, isn't it?
Theory and practice - in practice
The beautiful simplicity of this circuit could be matched only by its utter and complete failure to work. I built it on breadboard, provided 5,35V power from standard 2A charger and started testing. Yeah, it would work pretty well almost halfway through, but at around 600mA it wouldn't get higher anymore. I replaced the MOSFET, tried different variants of R2 biasing resistor. 660mA was the limit and that was that. Sure, I could live with the 600mA limitation, but I wanted to understand where it came from, especially that on paper it looked as if it should be able to pull full...Read more »
I believe I have said it before, but here goes again: one of the things I don't like about all these EE tutorials out there is that most of them are written by people who are actually pretty experienced. They don't remember what was hard to grasp at the beginning and keep using terms that are not that clear for people like myself that are new to the field. The same goes for most books, every single datasheet (for a good reason, really), and majority of videos.
Then there is this split between analog and digital electronics. I've been working way more with the latter, and it all seemed so easy. Sure, there were terms like "input capacitance" or "output impedance" that didn't mean anything to me, but hey, as long as you connect these chips like LEGO pieces, it doesn't seem to matter.
Time went by, and this ignorance was like an itch - something you can forget if you try hard, or have too much of a good time, but it comes back whenever things get rough. As it turns out, there are other people out there having the same problem (pretty decent understanding of digital, but much less of analog electronics), and sometimes they provide excellent inspiration. This great video by George Foot reminded me how badly I need to work on my understanding of the simplest circuits. If you haven't seen it yet, please do, it is really amazing: simple, clear explanation of complex concepts, made by someone who still remembers the difficult beginnings.
I decided to build the circuit myself, trying to understand each part of it the best I can. Since OpAmp is critical part of the circuit, I started there and it was amazing journey so far. So, however it's not related to my DB6502 project, I decided to write about it, because it's definitely something interesting to share.
Fun with OpAmps
OpAmps are virtually everywhere. People that know diode polarity without checking the datasheet probably know everything about them and think that simple two rules explain it all. For everyone.
I tried watching several videos and reading multiple articles, but they all seemed rather convoluted. One way or another, I decided to give it a try and build some basic circuits myself. I will document all the mistakes I made here, because I want to illustrate the learning process and show how insignificant exercise as this one can help you build very strong understanding and intuition about basic rules about electric circuits. Let's get making then!
Simple comparator circuit
To follow along with my exercises, you will need the following:
It all starts simple - two voltage dividers, one of them being adjustable with the potentiometer:
With 5,35V power supply I'm using V2 is around 2,65V and V1 can alternate between 2,40V and 2,92V. I chose the values of the R1, R3 and RV1 resistors to make sure that V1 range is pretty small, around 500mV. After all, we are going to amplify that signal, right?
So, let's go ahead and test the output when turning the potentiometer. I use scope in slow "roll" mode to ensure that slow changes introduced that way are clearly visible on screen. Channel one (yellow) is connected to TP1 above and channel two (pink) to TP2.
As you can see, channel 1 oscillates just a little bit below and above the channel 2 - just as I wanted it to.
To use OpAmp as comparator we need to create something called open loop. In general, the way to use OpAmp is to feed back some of the output signal back into one of its inputs; this is called closed loop configuration, and it allows us to control the gain, or signal amplification factor. Sounds complicated? It did to me, so let's start without feedback loop, with something much simpler.
In the open loop configuration gain is virtually infinite, causing...Read more »
Unfortunately, recently I haven't been able to work on my project as much as I would like to, and the progress is much slower than I was used to. That being said, taking some time off can give you new perspective and lets you reconsider your assumptions, goals and plans. So, not all is lost...
At the moment I decided it's time for another PCB exercise - struggling with 14MHz experiments I kept asking myself whether the problems might be caused by poor connections on breadboard. I know, it seems far fetched and probably is not true, but still - the PCB version I'm using right now was supposed to be temporary and replaced down the line by next iteration while I sort out some of the design questions. I have, actually, so I should probably stick to the original plan.
Sure, making PCBs is not cheap and there is certain delay between order being placed and the board arriving, but given how slow my progress has been recently this is something I can live with. On the upside, I want to use this opportunity to test some new ideas, including some fixes to original design. Stay tuned, I should write about it soon.
For now - there was one issue I didn't want to keep open, and since I was about to make a PCB I needed to decide how to solve it. The issue was nothing new, it's something I mentioned previously: RDY pin on WDC65C02 is a bidirectional pin, so it requires careful handling to avoid damage to CPU.
As I wrote in the "Wait states explained" blog entry, the main issue with RDY pin on 65C02 is that it can work in both input and output modes. Most of the time you will be using only the input mode, supplying information to CPU about wait cycles (if that's not clear, please read the previous entry on the subject), and it's tempting to connect your wait states computation logic circuit output directly to the RDY pin. There is serious risk associated with this approach - if, for one reason or another, CPU executes WAI instruction, RDY pin will change mode to output and line will be pulled low (shorted to GND). At the same time your wait state circuit might be outputting high signal on the same line (shorting to VCC) and you will cause short between VCC and GND, resulting in high current being passed via the CPU. If you're lucky it will cause only high energy consumption, but if not, you might burn your CPU.
Sure, there are some standard approaches to the problem, and I will investigate them below. The thing is that the above section is not all. You also need to remember another thing: if you intend to use wait cycles it probably means you are planning to make your CPU run at higher frequency, giving you less time to spare for any of the solutions to work.
This is why I wanted to compare each of the approaches and discuss pros and cons of each. I hope it will help you choose the approach that is suitable to your build.
So, based on the problem statement above, the question I want to answer is: how do these approaches perform in real scenario given the following below constraints:
Now, the most proper way to do it would be to test it against the actual 65C02 CPU, and I might actually do it in future, but at the moment I needed much simpler setup. I just wanted to test what is the fastest, energy efficient way of delivering RDY signal to receiver and compare some of the ideas I saw on 6502.org forums.
As described in the paragraph above, this is what I needed: oscillating high/low CMOS signal exiting output of one gate being fed into input of another gate. This would resemble closely target situation where the producer of the signal is your wait state circuitry and the consumer...Read more »
This is the final part of the 14MHz series, but I'm sure it's not last entry about it. Sorry if it had been a bit stretched, and maybe too beginner-friendly, but I guess for all the experts out there it's all common knowledge. It's the beginners like myself that struggle with these things, so I'd rather write a bit more and make it more useful.
As I wrote in my first post on the subject, all the other issues are secondary, but the timing is the key in running 65C02 at full advertised speed. Bus translation is not very difficult, and documentation quality can be worked around with enough research (remember what that word meant before Google?), but both of these challenges are all the harder with tight timing of 14MHz.
Before we get to the point where I can talk about specifics, I would like to cover one more thing on the subject: what is the timing violation, and how can that affect your build. Again, sorry for going into such basic details, but it might not be obvious for everyone; it certainly wasn't obvious for me.
What happens if you violate chip timing?
We have all done that at some point, and what we know for sure is that it didn't cause the universe to implode. That's already good news, but in fact: where do all these timing restrictions come from and why? Well, our digital logic integrated circuits are not as digital as we would like them to be, nor are they logical. That part I'm sure of - integration and circuitry are still up for debate :)
What happens in a chip like a simple NAND gate is that whenever voltages change on input pins (which, by the way, is also not that very instant!), there is very long and complicated process where different components of the circuit start responding to changing input, and they all do it in very analogue and illogical way. Usually the dance of currents and voltages takes from several to several dozens of nanoseconds. Anything that happens in between is pretty much random, and as with anything random, you can never assume that your result is the proper, final one. It might, just as well, be just random value that resembles the final value closely enough.
What's even worse, this dance is not deterministic. It's not like the access will always take the same amount of time, because both internal and external conditions might change the duration of the process. This is why in datasheets you have pessimistic values for each operation, and while these are not very important at slow CPU speeds, the faster you go, the more it matters. Let's look at the NAND gate used in Ben's project:
Now, it's tempting to assume that the worst case possible scenario at room temperature should be around 15-18ns (taken from rows 4 and 7), but this assumption is valid only if you can guarantee that your operating voltage will not drop below 4.5V. Can you? Sure, we have decoupling caps for that purpose exactly, but still, keep that in mind, it might matter! If the voltage drops below 4.5V threshold, propagation delay will be longer and valid response will appear on output later. Will you notice? Not necessarily. You might be lucky to get response faster thanks to the random operation in the IC.
Still, these are pretty simple cases. When you consider more complex chips it gets even worse. More moving parts means much more unexpected behaviour. It's especially interesting in case of reading ROM memory, which usually will be the slowest part of your build (unless you connect LCD directly to the bus, that is). Let's consider simple example (assuming ROM starts at 0x8000):
LDA $2000 CMP $9000 BNE not_equal
As you can see, I'm reading RAM at address 0x2000 and comparing it against ROM value at 0x9000, jumping to
not_equal label when the values differ . How much can you violate the ROM timing for the code to work? Basically, how much can you push that read beyond ROM limits before it fails?
There are two things to consider here:
One interesting thing that Ben doesn't seem to elaborate on in his videos, is the interesting issue of CPU families and resulting chip (in)compatibility. I came across this issue when started using SC26C92 Dual UART chip, but only much later, when tried pushing 6502 to 14MHz limit I noticed some resulting issues.
Let's start with the beginning, though. If you followed Ben's project closely, you might have noticed important difference between IC interfaces. If not, you will notice shortly...
Conveniently, it's very easy to hook up 6522 chip to 6502 CPU bus. No wonder - these belong to the same "family" of CPU and peripherals, and they use the following signals to synchronise operation:
If you check the ACIA chip (6551), you will notice it has the same set of control signals (with fewer registers, but the idea is the same):
Now, if you look at the ROM/RAM chips, these are a bit different:
As you can see, some details are similar (like the low active Chip Select signal), but part of the interface is a bit different. Instead of single Read/Write signal, there are two separate lines: low active Output Enable and low active Write Enable. There is no PHI2 signal, and as a result, to prevent accidental writes, in Ben's video about RAM timing there is necessity to ensure that write operation is performed only during high clock phase.
If you haven't played with any other CPU of the era (I haven't at the time), you might just accept the solution and just move on without thinking too much about it. This is exactly what I did, and only after playing with higher frequencies (and, specifically, wait states) I had to revisit my understanding of the subject. But I'm getting ahead of myself...
Side note: all the issues I ran into when trying to connect to this chip are the reason I started this blog in the first place - I wanted this documented somewhere. Probably will need to write more details about the initialisation and such details. Some day, I guess...
When you read this specific chip documentation you will find it uses interface similar to the one used in ROM/RAM:
As you can see, there are standard A0..A3 register select lines, D0..D7 data bus, low active IRQ output line. The first important difference is the RESET signal, which is high active, but this translation is easy - single inverter or NAND gate will do. Chip Enable (other name for Chip Select) is predictably low active, and there are two signals to control read/write operation: low active RD (identical to low active OE) and low active WR.
Now, it might seem that connecting to this chip is pretty simple, and you should do it in a similar way Ben connected RAM:
This way we ensure that RESET signal is RESB inverted and RDN is low only when R/W is high (indicating read operation), while WRN is low only when clock is high.
Unfortunately, there is an issue here: early during clock cycle, while address lines are still being stabilised, you might get random access to the UART chip (your address decoder might react on the unstable address and accidentally pull UART CEN line low for just a couple nanoseconds). At the same time RDN might be low, resulting in read operation being executed.
Sure, the operation would not be valid - it would be at most 10ns long, which is way below the minimum pulse length, but this is actually not a good thing. It might cause issues with chip operation stability or worse.
How can anything be worse than the chip instability? Actually, as I have learned, certain operations can be executed,...Read more »
Last time I wrote about three problems that one has to solve when trying to run 65C02 CPU at 14MHz. Obviously, how severe each of these is depends on the details of your build, so each case will be different. Let's start with comparison of two main methods and see how they can help you out with faster clock speeds.
One of the first issues you can think of (when looking at faster clock speeds) is the access time requirement for each of the components. When I started this investigation, I noticed that DIP package ROM chips were rated for 150ns access time - way too slow for anything above 4MHz.
Now you might wonder - I wrote in the past about my first build running at 8MHz - and you would be right, this was a mystery to me as well, and I will explain it later on. For now let's assume that access time is half of your clock cycle, and at 4MHz that would translate to 125ns - close enough for 150ns EEPROM to work properly at room temperature and stable 5V power supply.
So yeah, what are the options here?
The fastest 32KB one I could find was SOIC/TSOP/PLCC package 70ns access time AT28HC256 chip. You can go a little faster with AT28HC64 - it's 55ns, but only 8KB. If you want to compare these, you have to consider additional address decoding logic that will add to that, resulting in similar access time.
Still, even at 55ns you can't go much faster than 10MHz (half of clock cycle is 50ns), so what can you do about that?
Clock stretching vs. wait states
There are two ways to address this issue: one is not running your clock faster than the peripheral can handle, and this is called clock stretching; the other one assumes clock rate stays the same, but each CPU operation takes more than one clock cycle to execute. Hopefully the following diagrams should clarify the difference.
Here is an example of clock stretching method:
We have two clocks: CLK1 running at 10MHz and CLK2 running at 4MHz. CLK is the actual clock fed to the CPU, and the switch always happens at low clock. Switch is triggered by address decoding logic circuit, specifically nRAM and nROM signals. Let's look at the sequence:
As you can see, clock switching is not a trivial task, and if you are looking for a great document on how to implement it correctly, you will find it here.
This is identical scenario, but implemented using wait states:
As you can see, there is just one clock here, CLK, running at 10MHz. I assumed that access time must be comparable to the clock stretching scenario, so the access time for ROM must be at least 250ns (full 4MHz cycle length), but please note: this assumption is actually silly, when you dig deeper into details. For now I just wanted to illustrate the mechanism.
So, the following things happen:
This update is long overdue - and apologies for that - but I have been really busy recently. Between family issues, failed parallel cooperative project, end-of-year workload increase and quite complex project challenge I found myself stuck, overwhelmed and demotivated to write.
Luckily things are looking much better now, and hopefully I can write more regularly now, because I do have quite a lot to share. It all started with simple challenge: run 65C02 (and the whole DB6502v2 build) at 14MHz. Simple idea, isn't it?
After all, this is the maximum CPU speed for which WDC 65C02 is rated for, according to official datasheet. Since I have already had it running at 8MHz in my first revision, it didn't seem like something very difficult to implement. Certainly not something impossible, but still, at the time of writing this words, I can't say that I have reached the goal fully. Sure, I did capture this nice screenshot (proving at least that it's partially possible):
As you can see, measured and reported CPU speed is 14MHz and indeed it was running with crystal oscillator at 14MHz at the time. Was it stable? Well, one would say that any Syystem reporting its status like that is not very stable... Obviously, it's not a typo in the source code, it's a serial interface glitch resulting in double write.
So yeah, I have tried, but haven't succeeded yet. There are some other issues to handle, but I will write more about them as I describe the journey, as there is plenty to talk about. It will probably take a couple of project logs to go through it all. And to be fair, I might not even be able to get it to run reliably at that speed...
You might be wondering where's the problem - you just plug in faster oscillator and that's that, right? Well, not exactly.
Famous last words.
Unfortunately, as we all learn when pushing the limits of our unconscious incompetence, it usually is harder than it seems. Sometimes it might seem like making one more step should not be harder than those previously made, but life can surprise us in all possible yet unexpected ways.
As I wrote in previous logs couple of times, this adventure of electronic discovery has been full of surprises and weird glitches that could have been perfectly well explained if only investigated closely enough. Sometimes these glitches are infuriating, making even the best theoretical circuits fail in unpredictable ways, and sometimes they seem like miracles that I took for granted.
Let's look at the three main problems that I have encountered during my journey.
When you look at datasheets of various components, you will notice that they have pretty strict limitations in their timing. 28C256 EEPROM in DIP package (used in Ben's build as well as mine) is rated for 150ns access time. I wrote about it some time ago, when I was surprised I got it to work at 8MHz - and right now I understand much better what happened back then, but the general idea is that when your clock speed increases more and more challenges emerge.
Let's consider address decoding for instance: when you are operating at 1MHz your full clock cycle takes 1000ns, and
It might seem like a really short time (30ns max vs. 1000ns cycle time), but when you consider that at 14MHz each clock cycle takes only 71ns, it suddenly becomes major concern. Each single nanosecond counts and matters.
So yeah, not only RAM access time is an issue here, everything can cause problem.
Funny thing is that even the "fast" SRAM used in Ben's build (62256) is not fast enough for anything above 9MHz with its access time of 55ns - and it took me a while to figure that one out as well...
To summarise: first main problem is how fast everything happens at 14MHz, and...Read more »
So, last time I wrote about things that scare me the most: some seemingly random glitches that obscure larger design problems. This is why whenever I see something off I get really anxious - I'm afraid this time it will be too hard to fix, and I got pretty terrified recently!
As usual, I want to share the story, partially because it makes for a nice cautionary tale, and partially because it was pretty interesting investigation that followed, with some magical twist to it.
First things first, to set the stage. Recently I made an amazing discovery, but I will cover this in a separate entry. Suffice to say I managed to solve one of the major pains with my first version of the board without any significant modifications to version 2.
As a result I could finally move forward with clock switching design I wrote about previously. After having included all the comments from Reddit, I moved on to hardware implementation: one 74AC74, one 74HC157 and a full-can crystal oscillator.
I was surprised to see how easy that was. With all the schematics prepared in advance and prototype build for my test fixture it took less than 15 minutes.
Booted up OS/1 and all seems fine, the whole machine started at 8MHz, ran just fine until I decided to enter debug mode where it seamlessly switched to 300KHz mode with bus logging and when needed I could single step down to half-clock-cycle precision. Lovely.
Another feature I included was real-time CPU clock frequency measurement, so the below output was captured in a single session, without restarting or powering down the computer:
+---------------------------+ | | | #### #### # # | | ## ## ## # ## | | # # ### # # # | | ## ## ## # # | | #### #### # ### | | | +---------------------------+ OS/1 version 0.3.5C (Alpha+C) Welcome to OS/1 shell for DB6502 computer Enter HELP to get list of possible commands OS/1>info OS/1 System Information System clock running at 0MHz ROM at address: 0x8000, used: 13420 out of 32768 bytes. System RAM at address: 0x0300, used: 1517 out of 3328 bytes. User RAM at address: 0x1000, used: 0 out of 28672 bytes. ROM code uses 9056 bytes. ROM data uses 4194 bytes. SYSCALLS table uses 164 bytes. VIA1 address: 0x0220 VIA2 address: 0x0240 Serial address: 0x0260 Serial driver: SC26C92 OS/1>info OS/1 System Information System clock running at 8MHz ROM at address: 0x8000, used: 13420 out of 32768 bytes. System RAM at address: 0x0300, used: 1517 out of 3328 bytes. User RAM at address: 0x1000, used: 0 out of 28672 bytes. ROM code uses 9056 bytes. ROM data uses 4194 bytes. SYSCALLS table uses 164 bytes. VIA1 address: 0x0220 VIA2 address: 0x0240 Serial address: 0x0260 Serial driver: SC26C92 OS/1>
First time the INFO command was invoked, the computer was running at 300KHz, hence the 0MHz reading. Before second invocation I switched clock in the supervisor session to 8MHz and it was detected properly as you can see above.
Lovely, isn't it?
It seems like more and more features from my dream DB6502 build are getting implemented nicely, I'm proud to report :)
So, obviously, I needed to test some more complex programs to see if the system is stable. I mean it's all very nice, but bare operating system doesn't make for a good testing software.
I loaded some simple programs, and they all worked just fine. Tried MicroChess, which is using CPU and memory extensively and this one also worked correctly, no glitches there.
Time for the most difficult one: Microsoft BASIC interpreter. It loaded just fine (well, almost, but that is different story I will cover another time), and I ran it in anticipation. It starts by asking user if this is Cold or Warm boot, and depending on the answer it starts memory size detection routine.
The "memory detection" is really simple mechanism: it starts from defined address and moves on, byte by byte, writing and reading 0x55/0xAA to each address....Read more »
So, I don't know about you guys, but for me the most scary part about designing any circuit whatsoever is that it might not work, but not all the time, just every now and then. Failure rare enough to be near impossible to capture, yet severe enough to make the device unusable.
Sure, you can test your design all you want, but honestly, how many reliable tests can you execute? What if the problem is related to one of the most heavily used parts of your circuit? That will be near impossible to troubleshoot.
So, I came with an idea for DB6502 v2 that would enable two modes of use: full-speed mode, without AVR supervisor attached (say 8MHz, will go faster next time), and slow-speed mode, with AVR analysing system busses. Obviously, AVR would be controlling the selector, and user can choose the mode on the fly, via the supervisor shell.
Implementation of such contraption is actually pretty simple - all you need is single 2:1 multiplexer:
So, depending on the signal fed to S pin on the mux, 6502 would be fed with 8MHz clock or the slow AVR variant. There is, however, a serious problem with this approach:
There are certain requirements in the 6502 CPU as to the length of the clock cycle. Both high and low phases of the clock need to be of certain duration. If the toggle happens in the middle of low or high phase (called PHI1 and PHI2 respectively), CPU state might get corrupted. Nothing tragic, but whatever software ran on the computer would no longer work as expected.
Probably most of the times you wouldn't even notice, because CPU would somehow recover or the data that was corrupted (like accumulator state) was not important (as it was going to be overwritten anyway in next cycle).
However, every now and then, the results would be catastrophic - execution would fail due to hard to pinpoint glitch.
The problem is that you need to find a way to ensure these things don't happen. Even if you know what to do (and probably some of you already know the solution to the issue at hand), the important questions is: how do you know the solution will work?
Well, I'm new to electronic engineering, but I'm no stranger to software development. What developers do to ensure their mission critical code runs correctly? They apply one of many proven techniques to ensure code correctness, and one of them is Test Driven Development where you start with writing tests that your software absolutely must pass. Your tests are not based on observations of the encountered or expected failures, your tests document the critical requirements. If your software must ensure safe plane landing you don't test altimeter reading, you test for collision, and first flight ends in flames :)
Basically, to consider TDD execution to be proper, you have to ensure to see the test fail the first time. If you wrote your code first and the test later - you are doing it wrong. If you wrote your test first, and then your code that works - you are doing it wrong. You have to see your test fail to know that the test itself works correctly. Only then, when the test finally passes you can consider the code correct.
So, how do we go about this approach with the problem at hand?
There is just one requirement here: clock cycle can't be shorter that half the maximum CPU frequency. So, if the maximum for modern 6502 is 14MHz, then neither of the clock phases can be shorter than 35ns (half of 71ns which is 1sec/14.000.000).
So, we need to generate special test fixture that will toggle clock selector in a way that ensures shorter than 35ns cycle. Then, we need to come up with a test that will catch these occurrences. And only then, when we prove we can see the test fail, we can go about finding a fix for the problem.
Let's start with the basics: we will need a clock, say 8MHz, that will generate the basic signal:
Build it on breadboard:
Measure to be sure:
Sorry,...Read more »
So, I've been playing with my DB6502 Proto Board for some time now, polishing the supervisor software recently, and it's pretty neat as it is now. You can flash ROM, you can obviously read its contents as well. You can run the 6502 using onboard AVR as clock source with speed ranging between 300KHz (system bus captured, no breakpoints yet) and 800KHz (system bus capture disabled). You can single step over single cycle or single instruction, so basic disassembler is already in place. Some screenshots:
The one above shows how onboard AVR is used to flash OS/1 system image to EEPROM.
This one - dump EEPROM operation and entering monitor shell.
You can single-step the cycles...
And whole instructions.
Finally, you can run fast to get enough performance to run OS/1 on the board:
I have also implemented major redesign to OS/1 serial interface architecture, it's using replaceable (at compile time) serial driver modules, and I created one for my next-gen DUART controller, so it works with three different chips now.
So yeah, I've been busy recently, and it all worked pretty fine, with one simple exception.
So, there are two reset circuits on the board, and the same design will be used in the final version. There is primary master reset circuit connected to DS1813 chip that resets everything on the board (with the exception of the UART->USB interfaces, see below). However, I wanted to have another, secondary circuit, used to reset only 6502 and its peripherals. The reason to do so is that you might want to use your AVR supervisor session over several 6502 resets. You want to keep your breakpoints for instance.
The solution is pretty simple: both reset signals are active-low, so the master reset is connected directly to DS1813 chip (that generates the signal on power-up and when reset button is pressed) and AVR and its peripherals. 6502, however, is connected to secondary circuit that is generated as an output of AND gate. The inputs are: master reset and signal originating from AVR.
This way we have two ways of resetting the 6502: by the master switch/power-on, or by command from AVR shell.
Now, this is pretty simple, right. Could anything possibly go wrong?
Well, I wouldn't be writing about it, if there wasn't.
So, most of the time it worked, I could reset the 6502 from AVR shell and it would just work. Sometimes, without any apparent reason I had to invoke the reset operation more than once for it to kick in. That was weird, especially that my code for sending the reset signal was following WDC datasheet that requires at least two full clock cycles. I had three.
Still, sometimes what happened was this:
As you can see, reset sequence was performed, but the CPU continued as if nothing happened. In those cases I just had to repeat it couple of times to kick in:
I was ignoring the issue for a while, because it was just a small annoyance, but at certain point I decided to look closer at it. And what I found was eye-opening.
What do you do in cases like this? Get your logic analyser and see what it records. Here is what was captured using my cheap Saleae Logic 8 clone:
A-ha! Three cycles, bus taken over (it's not really necessary though for reset operation), but RES line was not pulled low. I checked the terminal, and the puzzle got all the weirder:
RES line was not pulled low, but the reset operation worked? WHAT THE HELL?
Probably the cheap clone is crap. Weird, but whatever. Let's get the serious stuff: 16 channels, 200MHz. Proper gear.
What ensued was so strange I actually forgot to take screenshot of it, so what you will see below is my own recreation of the observed result. This is what I saw at much higher frequency logic analyser:
What got my attention here (but I wasn't able to replicate this afterwards),...Read more »
Become a member to follow this project and never miss any updates