A computer engineering journey through the Motorola 68k CPU family
To make the experience fit your profile, pick a username and tell us what interests you.
I’ve compiled Linux hundreds of times, but it’s almost always automated behind a build system of some sort. I don’t spend that much time configuring it once it’s working as expected and I’ve spent even less time parsing through the source code. The last few days have been eye-opening in more ways than one. Since the last log, I’ve managed to shave the Linux image file down to 1.5MB. It will now fit comfortably in the 2MB of RAM currently installed on Mackerel, but will it run?
The first thing to do was to hook up the CH376S USB module again. I haven’t used this since the first prototype build, but it’s essentially a USB-to-parallel adapter with support for FAT filesystems. The setup involves hooking it up to the CPU’s 8-bit data bus, connecting some control lines, and memory-mapping it like any other peripheral. There’s a command/response API for querying the file system and transferring data. My current implementation takes about 3 minutes to transfer the entire 1.5MB Linux image into RAM, which is terrible, but it’s a lot less terrible than sending it over a 9600 baud serial port. I added an option in the bootloader to automatically copy the image file into RAM and jump to the entry point.
At this point, Linux should be running, but you can’t tell. There’s no console activity, but the 68901’s LEDs do eventually show the pattern for “unhandled exception”, so something happened. Let’s figure out how far it went.
The Linux startup process varies depending on the CPU architecture. Fortunately, the m68k version is pretty straightforward. In the kernel source code, there’s a file called linux/arch/m68k/68000/head.S. This is the starting point, i.e. the assembly function called _start is defined here. This function will be put directly at the start of the compiled image. In this case it ends up at memory address 0x8000 and jumping to that address will kick off the boot process.
Since I have no idea if this code is actually running properly, I hacked in a few assembly statements to update the LEDs on the 68901 as the code progressed through _start. This at least proves that object code in the image file is executing and it gives me a place to start debugging. I haven’t actually implemented a serial driver or any platform-specific initialization code yet, but I’m just hoping to see something work. After the slow process of recompiling the kernel and booting from the USB drive, the LED lights show the expected pattern and then quickly change to the “unhandled exception” pattern again. Something happened!
Tracing through the code further, _start eventually calls jsr start_kernel. This is where the official jump to the Linux kernel happens. The start_kernel() function is defined in linux/init/main.c. Taking a similar approach to the assembly code, I added a few LED pattern commands in here, only this time written in C. Rinse and repeat, success! I know the CPU is at least getting to the start of the Linux kernel initialization code.
By now, the LED patterns are becoming cumbersome, but I’m still no closer to a working serial driver. Time to cheat! My serial output code for the 68901 is really simple, so I just copy-pasted the bare minimum code right into the main.c kernel source. This does not give Linux a way to communicate, but it acts like a more useful LED display, the old “printf debugging” routine. Sprinkling mfp_puts() messages all over the place and rebuilding gave me a better idea what was happening:
Booting from USB... ........................................................................................................................................................................................................................................................................................................................................................................... Image loaded at 0x8000 Booting image... lockdep_init() ...Read more »
If you Google around for "68000 linux", it won't be too long before you find some mention of uCLinux. uCLinux is a project that supports a handful of CPUs without a memory management unit which makes it perfect for the 68008 (and most of the early 68k family). Several other homebrew 68k projects have used this “distribution” with some success, but a lot of them are now approaching a decade old and uCLinux itself seems to have been abandoned around 2016. The releases are still hosted on SourceForge, with the newest one building on v4 of the Linux kernel.
So what does it actually take to compile uCLinux into something Mackerel can execute? Where does it even start? I’ve already been using a gcc-based cross-compiler for the m68k platform and that won’t change. However, I’m going to start over and rebuild the toolchain from an older version with the hope that tools from around 2016 will have better compatibility with code from the same era. I did this whole process in a Debian 9 virtual machine.
The first step is to build binutils and gcc as a cross-compiler (https://wiki.osdev.org/GCC_Cross-Compiler is a good reference for this). I settled on binutils v2.27 and gcc v5.5.0, both from around 2016 . Now I’m basically back to where I started but with older versions of my cross-compiler and tools.
In theory, there’s nothing stopping me from cloning the Linux kernel source, setting up a config manually, and building any version I like that targets the 68k. The problem is that the kernel is only one part of a full Linux system. I also need a standard library, applications, and a file system set up, not to mention drivers and startup code that will be specific to the Mackerel hardware. This is where uCLinux comes in. There are a few dozen preset configurations for known hardware. For example, Atari and Amiga have prebuilt configs that should build a full bootable system image. Let’s start with one of these to make sure the toolchain is working and to iron out any build issues before attempting to customize things further.
The uCLinux build process is pretty straight forward: run make menuconfig and select the vendor and board. Then run make. If all goes well, there’s an image file waiting at the other end. I chose the Arcturus uCsimm board for no particular reason other than the config seemed to be pretty barebones. The first dozen or so build attempts ended in failure for various reasons.
At first, uCLinux was attempting to use the wrong compiler. I updated the vendors/config/m68knommu/config.arch file to point to the correct prefix, i.e. m68k-elf-, the prefix for the toolchain I built previously.
Then the build failed due to some compiler errors in the startup code. I had to comment out most of the interrupt handling code in ints.c and remove the call to the config_BSP() function from the setup_no.c file. Obviously, this will need to be fixed at some point, but I want to see a successful build before I write a bunch of driver code.
After a lot of trial and error with different combinations of toolchains, uCLinux releases, and configs, I have a Linux image:
-rw-r--r-- 1 colin colin 2.4M Mar 3 12:49 image.bin -rwxr-xr-x 1 colin colin 204K Mar 3 12:49 linux.data -rwxr-xr-x 1 colin colin 1.2M Mar 3 12:49 linux.text -rw-r--r-- 1 colin colin 997K Mar 3 12:49 romfs.img
What does this image.bin file actually contain? According to the config, it should be a Linux kernel and a filesystem image containing busybox and some other basic Linux applications. Let’s confirm.
Looking at the Makefile for the Arcturus uCsimm configuration, there’s a make image step that runs after all the compiling is done. This step is stripping various sections from the linux ELF file and combining it together with the ROM filesystem into a single image.bin file.
The linux ELF is really just object code with some metadata wrapped around it. Using objdump...Read more »
After spending all afternoon soldering, Mackerel now has a full 2 megabytes of super fast SRAM. I decided to go with the ISSI IS61C5128AL RAM chips for this project. They're pretty similar in function to the more common AS6C4008 series from Alliance, but they are rated for a blistering 10ns access time, more than 5x faster than the Alliance RAM. They're also slightly cheaper. The downside is that they're only available in surface mount packages. By the time the adapter boards are factored in, the cost difference is negligible, but the speed is still really nice. No wait states here!
The SOJ-36 packages are not really that bad to solder by hand. With lots of liquid flux, I was able to get (in my humble opinion) really nice joints connecting them to the DIP breakout boards. These breakout boards won't really be necessary when I start designing PCBs, but they do make the ICs easy to reuse during prototyping.
I've also thought about designing a SOP-36 to DIP-32 adapter which would convert the footprint of these ISSI chips to match the more standard DIP-32 SRAM pinout, allowing either style of SRAM to be used in the same socket.
Here's the full RAM board with four chips installed for a total of 2048KB:
The back side is a little less attractive:
I'm not sure how well this wiring arrangement will hold up at higher clock speeds, but I was able to verify reads and writes to the entire 2MB range using a simple test program.
One thing I don't really like with this design is that all of the chip-select logic is off-board. Every SRAM chip has its chip-select line connected to a separate pin on the backplane. I've got plenty of pins to spare at the moment, but if this is made into a PCB, I would like to move the address decoding for each individual module on board, freeing up a bunch of backplane connections.
With the MFP serial port working reliably, the next requirement for a barebones Linux system is a constant timer to allow the kernel to do context switching and multitasking. It's fortunate, then, that the MFP also provides four independent timers which can trigger interrupts on the 68008 CPU. It can also be configured to trigger interrupts for other reasons (serial port data available, GPIO input levels changing, etc.), but for now, a constant timer will be a big step forward toward the main goal of booting Linux.
The MC68000 series of CPUs supports vectored interrupts. It’s worth taking a minute to understand what that means. It took me a few days to really internalize this idea. In most 8-bit CPUs from a generation before the 68k, interrupts are not vectored, but polled. For example, when a peripheral wants to interrupt the 6502 CPU, it asserts the IRQ pin. The CPU stops what it was doing, stores some internal state to the stack, and then calls the interrupt handling code. Since there’s only one interrupt pin, all of the peripherals have to share. So when the CPU is ready to handle the interrupt, it has to ask each peripheral if it was the cause of the interrupt and then choose the appropriate action to take when the source is found. This is simple to design on the hardware side since every peripheral can connect in parallel to the IRQ pin, but it means the interrupt handling code is more complicated and potentially slower.
Vectored interrupts, on the other hand, require more hardware, but they provide a way for the CPU to know exactly where each interrupt comes from without having to ask each potential source. On the 68k, this is done with an asynchronous interrupt system that prioritizes interrupts according to level.
Instead of one IRQ pin, there are three IPL inputs representing a binary number from 0 to 7. These are the interrupt levels where 0 means no interrupt and 7 is a non-maskable interrupt. The minimum interrupt level can be set in software and when the CPU detects an interrupt at or above its minimum level, it will start the interrupt handling process. As with polled interrupts, it will finish executing the current instruction, store some state to the stack, and then find the appropriate handler function to call. The main advantage, though, over polled interrupts is that instead of asking every peripheral, the CPU expects the location of the interrupt handler to be exposed on the CPU bus by the peripheral. In other words, when the CPU acknowledges that it’s ready to handle the interrupt, the peripheral that made the request tells the CPU which handler to call. This has to be set up ahead of time in code of course, but this one-time setup simplifies and speeds up interrupt handling for the lifetime of the program. It also means interrupt handlers can be changed programmatically, which gives a lot of flexibility to the code.
There are some details I left out (that will come up in the implementation section), but this is the description of vectored interrupts I wish I had when I started working on this. I hope it’s clear enough to be helpful to someone in the same position.
The first step to getting interrupts going is to bring in some additional control lines into play. The MFP has an IRQ output that must be mapped to one of the 7 interrupt levels on the CPU. Since it is currently the only source of interrupts, the IRQ pin is connected directly to IPL2. All three IPL pins have a pullup resistor, so this means the MFP will cause a level 4 interrupt. If more than three interrupt levels are ever needed, something like a 3-to-8 decoder would be necessary to use the whole range.
There are three pins on the 68008, FC0 through FC2, that, when all high, indicate the CPU is acknowledging an interrupt. This will be called the /IACK signal. It is used internally on the CPLD, but also connected to the MFP’s IACK pin. When the CPU is ready to handle...Read more »
In the last log I complained about the 68901 MFP chip and the issues I was having with its UART in particular (spoiler alert: it was my fault). The issue of general unreliability and random crashing is definitely one of the hardest types to debug, but it also indicates to me that it's probably not actually the fault of the MFP since it's not failing in a repeatable way.
There were two problems I was seeing:
I figured the best place to start was to make sure all of my CPLD logic was correct. Aside from the address decoding which is trivial to test, the CPLD is also generating a /BOOT signal to temporarily map the ROM to address 0x0000 for the first eight memory reads and the RAM to 0x0000 thereafter. It seemed unlikely that this was affecting the MFP, but let's confirm it's working as expected.
This logic is implemented in Verilog. I'm almost certain there's an easier way to do this, since a lot of designs get by with nothing but a shift register. Still wrapping my head around Verilog and programmable logic in general...
// Generate the BOOT signal for the first 8 memory accesses after reset reg BOOT = 1'b0; reg [3:0] bus_cycles = 0; reg got_cycle = 1'b0; always @(posedge CLK) begin if (~RST) begin bus_cycles = 0; BOOT <= 1'b0; end else begin if (~BOOT) begin if (~AS) begin if(~got_cycle) begin bus_cycles <= bus_cycles + 4'b1; got_cycle <= 1'b1; end end else begin got_cycle <= 1'b0; if (bus_cycles > 4'd8) BOOT <= 1'b1; end end end end
Overly complicated or not, the output seems to be in order. The boot signal (purple) goes high after the first 8 memory accesses following a reset. I'm not sure about the bumps in the rising edge of /AS (blue), but the signal is obviously good enough for the CPLD, so let's move on.
The MFP is currently the only peripheral in the system that produces its own DTACK response. The RAM and ROM are both fast enough that, until now, DTACK was tied to ground and the whole bus cycle was effectively synchronous. At low clock speeds, the MFP does actually seem to work with DTACK grounded, but since I'm having issues with it, this obviously needs to be implemented correctly.
The logic is simple: if the MFP is selected, the CPU's DTACK line should be controlled by the MFP DTACK response signal. If anything else is selected, i.e. RAM or ROM, just hold DTACK low.
In Verilog, this simplifies down to:
// Generate DTACK signal assign DTACK = ~MFPEN & DTACK_MFP;
This logic will get more complex when other devices are added or if wait states become necessary as clock speeds increase. For now, though, it's easy to validate.
This is also looking alright. Trace 3 is the MFP chip-select line, active low. When the MFP is enabled, DTACK (trace 1) is held high until the MFP pulls it down (trace 2). When the MFP is no longer selected, DTACK goes back to ground.
In addition to properly implementing DTACK, I "reinforced" the boards with more decoupling capacitors and I replaced the clock generation on the CPLD with a dedicated 2 MHz oscillator as close to the CPU clock pin as possible. I thought it would be nice to have a high frequency oscillator feed the CPLD and then generate a system clock from that, but the generated clock signal being output by the CPLD was only hitting around 3v peak-to-peak instead of the 5v I would expect. According to the datasheet, the 68008 will recognize anything over 2.4v as high, but I didn't want to take any chances and decided to feed the CPU clock line directly with a nice clean clock signal from its own oscillator. This also means the...Read more »
In the long run, I don't know if Mackerel will be a backplane system or a single-board computer, but while the project is still in the prototyping phase, having interchangeable component cards on a backplane does make development easier. The first build (shown in the previous log) repurposed a 40 pin backplane from an earlier project of mine, but I quickly ran out of pins and space on the PCBs for all of the ICs and interconnects I need. To give Mackerel some more room to grow, I designed a really simple 80-pin backplane and matching prototype PCBs.
Another bottleneck in the previous build was using the DIP-48 version of the 68008 CPU. Using the PLCC package gets you two more address pins, an upgrade from 1MB to 4MB of address space. I'm not sure of the minimum RAM requirements to run something like uCLinux, I'm sure it depends on the version and features. I've seen projects boot ancient versions of Linux with as little as 512KB, but I'd prefer to target something more recent. Linux has not gotten any smaller...
Anyway, PLCC packages are not the hardest surface mount chips to work with, but they are more difficult to use in prototypes, so I designed some PLCC-to-DIP adapters for the various sizes.
The first two (PLCC-44 an PLCC-52) fit nicely into a normal breadboard or perfboard. Technically so does the PLCC-68 version, but it takes up two lanes on a breadboard which makes it a little space inefficient. The MC68EC000 uses this package. That CPU has the option for an 8-bit bus and supports higher clock speeds and more address space. It might be a nice upgrade to the 68008 at some point.
I also ordered a PLCC-84 breakout, but this is proving to be less useful. The only ICs I currently have in that package are some non-functional EPM7128 CPLDs.
Speaking of CPLDs, I've swapped out the multiple GALs for a single EPM7064SLC44 CPLD. These run at 5V and can be programmed using Quartus 13.1 and a cheap USB blaster. This version of Quartus is still available for download on the Intel website and runs just fine on Windows 10. The USB blaster is a $5 clone from eBay. It also works fine.
I put together a breakout board for the JTAG header, a clock, and some I/O to make chip testing and programming faster. The PLCC sockets seem pretty fragile and they're expensive enough that I don't want to squander them, so moving around the whole DIP adapter seemed like an acceptable compromise.
As far as the actual programmable logic, I've used this CPLD to implement:
Basically, I've rebuilt the first prototype of Mackerel on the new 80-pin backplane and moved all the control logic onto the CPLD.
Here it is assembled on the backplane.
Here's a mostly accurate schematic of the current system. This ignores the ROM and RAM, but those are mapped as expected, address and data bus to the CPU with chip-select signals coming from the CPLD. Nothing about this design is finalized. A lot of the routing is just there for convenience. The system bus on the backplane headers is not well thought out yet. I'm still just adding signals as needed to get them connected between the component boards. There is plenty of room for consolidation once I've proven out the design.
The only I/O in the system is the MC68901 MFP board. In...Read more »
Mackerel is still a fairly new project, but I'm happy to say I'm making quick progress on getting a usable computer system running. Spending a few months building a 6502 system was a great foundation for moving up to a more advanced CPU. I've managed to avoid a lot of the pitfalls that derailed me at the beginning of that project.
I'm starting with the MC68008 CPU in DIP-48, so I'm currently limited to the 20-bit address space and 8-bit data bus. This is plenty to get up and running. I've assembled some boards by hand for the core components and made use of an early prototype 40-pin backplane from a previous project. Forty pins is really not going to be sufficient for this project for very long, but in the spirit of making the simplest thing first, it does just fine.
The CPU is running at its full rating of 10 MHz with (so far) good stability. I've paired it with a 512KB static RAM board and a 512KB ROM board. There's a MC68901 multi-function device to provide GPIO and timers. The MFP also gives me a serial port, but I've hard terrible luck getting it working reliably, so for the time being, I've bodged in a 65C51 ACIA which runs beautifully at 115200 baud.
On the software side, I'm using the GNU tools for C development and the Newlib C standard library. I started out using the m68k cross compiler that is available in the Debian repository, but I ended up building my own vesion of gcc from source. The nice thing about using standard tools is that software development looks pretty much the same as it would writing C code for any other system. objdump has, in particular, been extremely helpful for debugging.
I modified some of the bootloader code I wrote for Herring to work on Mackerel as well, so I've got a way to load new applications over the serial port, avoiding the need to constantly flash the ROM chip.
I'm also in the process of building a USB file storage interface using a CH376S breakout board. It is connected directly to the data bus of the system and provides an abstraction over the FAT file system on the USB drive. My immediate plan is to use this USB drive as a stage 2 boot device, i.e. boot from ROM, ROM code finds a kernel on the USB drive, copies it to RAM, and jumps to it. It might also prove useful as a filesystem for Linux in the future.
I've ignored a bunch of important things in this design: interrupts, /DTACK generation, proper address decoding, etc. It does run reliably though and I can write C code for it, so that's a great start.
I have some PLCC-52 adapter boards on order that will allow me to prototype with the PLCC version of the 68008. That will give me 2 more address lines for a total of 4MB of available address space and an additional interrupt line which should bring interrupt handling hardware more in line with the other 68k CPUs. I'd like to replace the current DIP 68008 with this PLCC version and add a bunch more static RAM to the system. I'll also need to build some interrupt handling and set up a system timer. With all of that in place, I can start to explore running something like uCLinux.
Become a member to follow this project and never miss any updates