
Moving Past Memory Issues

A project log for Mackerel 68k Computer

A computer engineering journey through the Motorola 68k CPU family

Colin Maykish 08/07/2022 at 19:41 (8 Comments)

After spending many hours building various combinations of kernel versions and configurations, I've finally got Mackerel progressing past the memory initialization errors. I tried six different kernels, multiple compiler versions, and every config setting I could find. Almost all of those build combinations produced the same result: either crashes in the allocator code or tons of page state errors on the console.

Eventually I got the idea that maybe the kernel and the config were fine and something else was wrong. I still wasn't convinced I could rule out hardware issues, but I added some basic memory tests to my bootloader code and couldn't find any issues. Additionally, Stephen Moody was able to boot the same kernel code on his 68k board and got stuck in roughly the same place.
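For illustration only (this is not the actual Mackerel bootloader code), a basic RAM check of this kind might look like the sketch below. The function name mem_test, the choice of patterns, and the 16-bit word width are all assumptions. The first pass writes fixed patterns to catch stuck data lines; the second writes a value derived from each word's index to catch address lines that alias to the same location.

```c
#include <stddef.h>
#include <stdint.h>

/* Value derived from a word's index, so aliased addresses read back wrong data. */
uint16_t addr_pattern(size_t i)
{
    return (uint16_t)(i ^ (i >> 16));
}

/*
 * Hypothetical RAM test over `words` 16-bit words starting at `base`.
 * Returns the index of the first failing word, or `words` if everything passed.
 */
size_t mem_test(volatile uint16_t *base, size_t words)
{
    static const uint16_t patterns[] = { 0x0000, 0xFFFF, 0xAAAA, 0x5555 };
    size_t p, i;

    /* Pass 1: fill the region with each fixed pattern, then read it all back. */
    for (p = 0; p < sizeof patterns / sizeof patterns[0]; p++) {
        for (i = 0; i < words; i++)
            base[i] = patterns[p];
        for (i = 0; i < words; i++)
            if (base[i] != patterns[p])
                return i;
    }

    /* Pass 2: write an index-derived value to every word, then verify. */
    for (i = 0; i < words; i++)
        base[i] = addr_pattern(i);
    for (i = 0; i < words; i++)
        if (base[i] != addr_pattern(i))
            return i;

    return words;
}
```

On real hardware the region under test must not overlap the code or stack running the test. A test like this passing does not prove the RAM is good under every access pattern, which is part of why hardware was hard to rule out completely.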

The only thing left to blame was the binary image itself. When the Linux kernel compiles, it produces a vmlinux file as the main output. Then, based on the platform, you need to turn this into a binary that the CPU can load into RAM (or ROM) and execute. This involves using objcopy to convert the vmlinux ELF file into a raw binary image.

uClinux provides a bunch of examples of this process as the final step in the Makefile. Ultimately, this was my problem. The board I selected as a template expected the kernel to run from ROM. Even though I disabled the ROM and adjusted the memory mapping in the kernel config to run from RAM, this final image generation step still produced a binary tuned for the Arcturus uCsimm, not the Mackerel.

The solution, then, was to simplify the image generation command:

m68k-elf-objcopy -O binary vmlinux images/image.bin

That's it... That was a few weeks' worth of debugging kernel code. I took for granted that because the image file was booting and doing something, it must have been fine, but the memory map I gave to the kernel and the placement of data in the executable were not in agreement.
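To make that kind of mismatch concrete: objcopy -O binary writes a raw memory dump that starts at the load address of the lowest section, so a byte linked at address A sits at offset (A minus that base) in image.bin. The sketch below works through the arithmetic using the .text and .data addresses from the boot log further down. The helper names and the idea of a bootloader copying the whole image to a single load address are assumptions for illustration, not a description of the uCsimm image layout.

```c
#include <stdint.h>

/*
 * Offset inside the flat image of the byte linked at link_addr,
 * given that the image starts at image_base (the lowest section address).
 */
uint32_t image_offset(uint32_t link_addr, uint32_t image_base)
{
    return link_addr - image_base;
}

/*
 * Address where that byte actually ends up if a (hypothetical) bootloader
 * copies the whole image to load_addr.
 */
uint32_t runtime_addr(uint32_t link_addr, uint32_t image_base, uint32_t load_addr)
{
    return load_addr + image_offset(link_addr, image_base);
}

/* Nonzero when the byte lands where the kernel was linked to find it. */
int placement_ok(uint32_t link_addr, uint32_t image_base, uint32_t load_addr)
{
    return runtime_addr(link_addr, image_base, load_addr) == link_addr;
}
```

With .text at 0x00008000 as the image base, .data at 0x000e0810 sits at offset 0x000d8810 in the file. If the image is loaded at 0x00008000, everything lands where the kernel expects it. If the image was laid out for a different arrangement, the code can still start running while reading its data from the wrong place.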

A frustrating experience, but a valuable one. Even though my kernel debugging wasn't directly productive, I learned a ton about Linux internals and that should help with the next steps: hardware timers, interrupts, serial drivers, and filesystems. That's pretty much the list of remaining tasks.

Anyway, I'm happy to say I've got my board booting to the infamous "Calibrating delay loop..." message which means I'm ready to implement a timer and interrupt logic.

5Linux version 3.10.108 (mackerel@4b9e0bcb9c18) (gcc version 4.9.2 (GCC) ) #28 Sun Aug 7 18:57:42 UTC 2022
6
Mackerel 68k support by Colin Maykish <crmaykish@gmail.com>
6

uClinux/MC68000
6Flat model support (C) 1998,1999 Kenneth Albanowski, D. Jeff Dionne
7On node 0 totalpages: 496
7free_area_init_node: node 0, pgdat 0011b84c, node_mem_map 00150100
7  DMA zone: 4 pages used for memmap
7  DMA zone: 0 pages reserved
7  DMA zone: 496 pages, LIFO batch:0
7pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
7pcpu-alloc: [0] 0 
Built 1 zonelists in Zone order, mobility grouping off.  Total pages: 492
5Kernel command line: 
6PID hash table entries: 16 (order: -6, 64 bytes)
6Dentry cache hash table entries: 1024 (order: 0, 4096 bytes)
6Inode-cache hash table entries: 1024 (order: 0, 4096 bytes)
5Sorting __ex_table...
trap_init()
6Memory: 528k/528k available (868k kernel code, 544k data, 44k init)
5Virtual kernel memory layout:
    vector  : 0x00000000 - 0x00000400   (   1 KiB)
    kmap    : 0x00000000 - 0xffffffff   (4095 MiB)
    vmalloc : 0x00000000 - 0xffffffff   (4095 MiB)
    lowmem  : 0x00008000 - 0x001f8000   (   1 MiB)
      .init : 0x0011d000 - 0x00128000   (  44 KiB)
      .text : 0x00008000 - 0x000e0810   ( 867 KiB)
      .data : 0x000e0810 - 0x0011c180   ( 239 KiB)
      .bss  : 0x00128000 - 0x0014d2a8   ( 149 KiB)
6SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=1, Nodes=8
6NR_IRQS:32
init_IRQ()
hw_timer_init()
6Calibrating delay loop...
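
The log stops here because delay-loop calibration waits for the kernel's tick counter (jiffies) to advance, and with no timer interrupt it never does. The sketch below is a bounded, host-side simulation of that wait, not kernel code. The simulated timer, TICK_PERIOD, and wait_for_tick are hypothetical names made up for this example.

```c
#include <stdbool.h>

/* Stand-in for the kernel tick counter; on real hardware a timer interrupt bumps it. */
volatile unsigned long jiffies;

/* Hypothetical simulated timer: "fires" once every TICK_PERIOD polls when enabled. */
#define TICK_PERIOD 1000UL

void poll_timer(bool timer_enabled, unsigned long poll)
{
    if (timer_enabled && poll % TICK_PERIOD == 0)
        jiffies++;
}

/*
 * Wait for jiffies to change, giving up after max_polls iterations.
 * Returns true if a tick was observed. The real calibration loop has no
 * such limit, so without a working timer it spins forever.
 */
bool wait_for_tick(bool timer_enabled, unsigned long max_polls)
{
    unsigned long start = jiffies;
    unsigned long poll;

    for (poll = 1; poll <= max_polls; poll++) {
        poll_timer(timer_enabled, poll);
        if (jiffies != start)
            return true;
    }
    return false;
}
```

Calling wait_for_tick(true, 5000) returns true once the simulated timer fires, while wait_for_tick(false, 5000) runs out of polls and returns false, which corresponds to the hang at this message.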

For reference, I'm using Linux v3.10 (without any uClinux libraries or code) compiled with gcc v4.9.2 and binutils v2.25, all built on Debian Jessie. There's a Dockerfile with the full environment in the main GitHub repo. I suspect that a newer kernel would also work now that I've solved the image issues, but I'm happy with this one for the moment.

Project repo: https://github.com/crmaykish/mackerel-68k

Linux kernel: https://github.com/crmaykish/mackerel-linux-3.10

Also, thanks to Stephen Moody for his debugging support. Check out his awesome 68000 project: https://hackaday.io/project/181472-y-ddraig-my-68000-computer.

Discussions

Stephen Moody wrote 08/28/2022 at 12:41

That's good that you have made progress. I did have a play around with the 4.x kernel a little without any luck, but haven't had much free time over the last few weeks to do much with it.


I'll have a look at the changes you've done and see if I can get the same code running on my computer as well. It will be interesting to see how far it gets once you have the timer interrupt working.

Colin Maykish wrote 08/29/2022 at 15:05

Since I figured out my problem was unrelated to the kernel, I've moved my focus back to 4.4. I haven't written up a log yet, but I did implement a timer and some other code to get past the delay loop. I got as far as trying to mount the proc filesystem before it hung. I'm pretty sure this is due to having no filesystem image at all in the binary.

Eventually I ran into some hardware stability problems I've been trying to solve. Hoping to get back to Linux debugging this week.

For reference: here is my kernel code with the timers and hardware init changes: https://github.com/crmaykish/mackerel-linux/compare/setup-herring

Stephen Moody wrote 08/31/2022 at 15:39

I'll have a look at the changes you've done there and see if I can get them up and running on my computer.

I don't know how Linux handles some of these generated file paths. I wonder if it needs a working filesystem to generate symlinks or something similar.

What problems are you having with the hardware?

Colin Maykish wrote 08/31/2022 at 16:14

If I'm reading the docs correctly, it looks like setting up the proc fs does require a real filesystem to be set up, at least before it can be mounted. (https://cateee.net/lkddb/web-lkddb/PROC_FS.html)

I'm not sure if I'm actually getting to the point where it's trying to mount it or if something else is wrong.

My hardware problems are mostly self-inflicted. My hand-wired CPU/CPLD board is getting out of control - too much rework. I started having random memory errors that I wasn't having before.

I'm going to work on simplifying the design and move some of the known working parts to a PCB in an effort to improve stability and isolate any issues to individual components.

Stephen Moody wrote 09/02/2022 at 15:05

I think the availability of cheaper PCBs from China is making me lazy these days. I don't bother hand wiring or prototyping anything like I used to. Most work-related projects are SMD anyway, so they're harder to prototype. I have a couple of designs in the works that are going straight to PCB; I just hope they work with minimal modifications.

Gravis wrote 08/27/2022 at 17:38

Congratulations on finding the issue.  I've gone down a rabbit hole many times just to find out that some peripheral command, flag, or option was ruining my day.

Colin Maykish wrote 08/29/2022 at 14:59

Thanks, the worst kind of problem is one where it works well enough to hide in plain sight.

Gravis wrote 08/29/2022 at 22:12

I hear that. Perhaps the thing that can make it worse is when it works... but only some of the time for reasons unknown.
