Targeting SDCC to the 8080

Writing a code generator for the 8080 microprocessor for the Small Device C Compiler (SDCC)

I am attempting to write a code generator backend for SDCC for the 8080 microprocessor, whose instruction set is a subset of the Z80's.

As I was building an 8085 SBC with a relatively large memory space compared to the microcontrollers I had previously used, I wanted a C compiler for it so that I could develop substantial standalone programs. The 8085, as explained in that project, is basically an 8080 with easier electrical interfacing and a couple of extra instructions, RIM and SIM. There are also some undocumented instructions. They may be added to the assembler table, but use by the compiler should be optional, for when you do have an 8085.

Casting around for existing free compilers led to a few candidates. For CP/M or DOS there are Dunfield C and Hitech C. But these require old development environments (though one could use a virtual machine) and support only old C standards. There may be other deficiencies due to the limitations of the hosting environment. As they are proprietary, they cannot be developed further.

The Amsterdam Compiler Kit (ACK) is currently being maintained by David Given and does work. The software is rather awkward to install as it has many components. The build process is terrible, though getting better through developer efforts. It also accepts an older standard for C, though there are efforts to bring that up to date. It is used by the 8080/8085 ports of FUZIX by Alan Cox. One advantage of ACK is that it also supports a couple of other languages like Pascal, if that is desired.

SDCC is the most modern of the candidates. It supports recent standards for C and is actively maintained. In fact it was a post by Alan Cox (I'm guessing also the lead of FUZIX) in the SDCC mailing list that got me thinking about hacking SDCC to generate 8080 code. If I could pull this off I could develop on Linux.

I turned all of this over in my mind in 2018. These questions came up:

  • Was I sane to do this? The Z80 code generator module gen.c is about 13,000 lines of code, although I wouldn't have to rewrite all of it from scratch. In fact, since the 8080 has some similarities to the Gameboy Z80, I could get a free ride on some of that code.
  • Will it be of any use? Very few people have only the 8080 or 8085 these days. If they develop for retro CPUs it's more likely to be the Z80 or its faster and more capable descendants for which SDCC works fine. Few people care about the 8080 anymore. In the worst case, just me.
  • What are the steps I should take?

As it turned out, I did some steps fairly quickly and had long hiatuses for others. It's not usable yet and there is a possibility it might never be. I agonised over whether I should post this as a project. In the end I've decided to present it as is. It:

  • Will probably remain an ongoing project forever even if I get it to work acceptably as some bugs may take a long time to surface or nobody will use it enough to tickle the bug
  • Will not be easy to install as you have to build from a git clone
  • May never be accepted in the mainstream SDCC as I may have done too much violence to the Z80 code generator module (but it might be acceptable if separated into another module, if anybody cares)
  • May not be of use to anybody, including myself
  • May be instructive for anybody wanting to augment the work or do something similar with another Instruction Set Architecture

Now that I have accepted the software condition (an analogue of the human condition) I shall present a series of logs. This is not happening in real-time. Some steps were completed months ago, and some are still in progress. I will also update the status in this description according to progress.

  • Validation

    Ken Yap, 10/14/2019 at 00:31

    This log may change as I understand better how it works.

    Now we come to the key step, which the backend must pass or it's useless: checking that the generated code is correct. SDCC comes with a formidable suite of regression tests. The procedure is simple: compile a test program, run it in a simulator (sz80), and output messages for failing tests. The test programs look like this made-up example:

    i = 2;
    if ((i + i) != 4)
      error("Failed add");

    Of course a whole range of code from the simple up to the complex like pointer access is tested. A test harness runs all the tests and summarises the results. It should be automated so that continuous integration can check that a code change hasn't made things worse.

    But how is error() implemented? We could make it equivalent to a printf(), but what if we aren't sure that we have a working printf()? The solution is to generate a trap instruction for the error() which is handled by the simulator and ensures that we always see output. Unless the program crashes, in which case the output won't match either.
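
    As a rough illustration of the idea, here is a sketch in which error() only records the failure, so the test does not depend on printf(). The counter, the saved message and the name test_add() are assumptions made for this example. The real suite reports through the simulator trap described above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the harness: the real suite reports
   failures through the simulator, not through these variables. */
static int failures = 0;
static const char *last_message = NULL;

/* Record a failure without relying on printf(). */
static void error(const char *message)
{
    failures++;
    last_message = message;
}

/* One test case in the style of the made-up example above. */
static void test_add(void)
{
    volatile int i = 2;   /* volatile so the addition is not folded away */

    if ((i + i) != 4)
        error("Failed add");
}
```

    A harness would call test_add() and then inspect failures. Zero means the test passed.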

    As an aside, there are compiler options to output the original source lines and iCode as comments in the assembler code so when a test fails one can look at the generated code and work out what were the incorrect assumptions.

    So TODO number 3 is to get the 8080 backend to pass the regression tests, and with various options. That sentence means a lot of iterated work!

    That's the end of the logs for the moment. I just have to collect some round tuits, roll up my sleeves and work on the TODOs.

  • Runtime

    Ken Yap, 10/09/2019 at 23:41

    A binary compiled from C needs some additional code to work. These are the runtime library routines. Typically they consist of utility routines such as character and string functions that many C programmers expect as part of the environment. Even in embedded environments a rudimentary stdio library may be provided that at the bottom calls a couple of routines, getchar() and putchar(), which must be provided by the developer for a particular embedded platform, say by talking to a serial interface.

    The library may also contain assist functions to do operations not supported by the processor, such as multiplication and division. In our case right shift routines fall into this category. Incidentally, have you ever wondered how the user is prevented from writing C functions and global variables that may override and interfere with these assist functions and variables? Originally in Unix this was done simply by prepending an underscore (_) to every external symbol. In other words, the C programmer can only create symbols in the object files starting with an underscore. Assist functions and variables do not start with an underscore so cannot be overridden. If the user writes in assembler to be linked with the objects from C, then there is no barrier. In that case the user is assumed to know what she is doing.

    Other things in the runtime library are routines written in assembler for speed, for example block copies and compares, accessed for example via bcopy() and bcmp().

    But the one thing the runtime library must contain is the startup module. Traditionally this was named crt0.o in Unix (C runtime zero). This module accepts control from the OS, or from the boot vector, and sets up the registers as necessary for the payload to run. The program counter is of course taken care of when the startup jumps to the payload. Other registers that must be set up include the stack pointer, any segment pointers, and interrupt vectors, this last for embedded environments. The C environment also stipulates that uninitialised variables must contain binary zero. These variables are stored in the BSS area, typically above the code and initialised variables (read-only in recent C standards), but below the heap. The stack typically extends downward from the top of RAM. So one of the jobs the startup must do is zero the BSS.

    Since embedded environments vary, there is no one size fits all crt0. Typically the compiler package provides a standard crt0, but the developer is expected to take the source and customise for the hardware configuration that will be used. In SDCC, like in traditional C compilers, it is possible to tell the linker to omit the provided crt0.o and then the developer ensures that the first file handed to the linker is her customised crt0.o.

    The runtime library contains a mix of C and assembler. Assembler is used where C cannot or efficiency is crucial.

    So TODO number 2 is to take the assembler routines that are written in Z80 code and produce equivalents in 8080 code for the 8080 runtime library. As an example, the Z80 startup will use the Z80 block assign opcodes to zero the BSS. The 8080 code should do this the longer way.
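
    To make "the longer way" concrete, here is a C model of zeroing a region one byte at a time, as a CPU without block opcodes has to. In a real startup this would be a handful of assembler instructions and the region bounds would come from the linker. The function name and its arguments are invented for this sketch.

```c
/* Zero the bytes in [start, end) one at a time, the way a loop of
   store, increment and compare would on a CPU with no block opcodes.
   The name and the pointer arguments are illustrative only. */
static void zero_bss(unsigned char *start, unsigned char *end)
{
    unsigned char *p = start;

    while (p != end) {
        *p = 0;     /* store a zero byte */
        p++;        /* advance to the next byte */
    }
}
```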

  • Right, shift!

    Ken Yap, 10/03/2019 at 20:51

    C supports bit operations on integers of various widths, including left and right shift. Unfortunately on the 8080, shifts are only supported on the accumulator and come in two varieties: shifts that use the carry bit and shifts that do not. The former are needed for two byte (int) and four byte (long) shifts to carry the leaving bit from one byte to enter the next.

    Left shift is the less troublesome operation and is already supported in the code generator for the general case which also works for the 8080. Some shifts can be turned into doubling operations as a single left shift is a doubling. Some shifts which are not powers of two can be more efficiently composed from two or more shifts instead of using a loop. Some shifts by multiples of 8 on multibyte data are just byte shifts.
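
    The constant-shift tricks in this paragraph can be modelled in C. This is only a sketch of the arithmetic, not of the code SDCC emits, and the function names are made up for the example.

```c
#include <stdint.h>

/* Shift left by one: the same as adding the value to itself. */
static uint16_t shl1(uint16_t x)
{
    return (uint16_t)(x + x);
}

/* Shift left by 8 on a two-byte value: the low byte simply moves
   into the high byte and the low byte becomes zero. */
static uint16_t shl8(uint16_t x)
{
    uint8_t lo = (uint8_t)x;

    return (uint16_t)((uint16_t)lo << 8);
}

/* Shift left by 9 composed from a byte move and one doubling,
   instead of a loop that runs nine times. */
static uint16_t shl9(uint16_t x)
{
    return shl1(shl8(x));
}
```

    For a two-byte operand, shl9(x) gives the same result as (uint16_t)(x << 9) while doing far less work than nine single-bit shifts.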

    Note that the above applies to shifts by constants. For a shift by a variable a loop needs to be generated. The case of shifting a constant by a constant should never reach the code generator, as the compiler will have eliminated it by constant folding. Such expressions do occur often in source code, more from macro expansion than from the programmer writing them directly.

    Right shift is harder because while you can add something to itself to double it, there is no corresponding opcode to halve something.

    I was stuck here for months agonising over how to generate the (possibly unrolled) loops to right shift. Eventually taking the lead from the Amsterdam Compiler Kit, I decided to fob right shifts off to library routines. Perhaps later I might inline some simple cases like right shift by one. Get something working first.

    A note here on efficiency: There are two aspects: code size and machine cycles. For code size, except in the simplest cases such as a shift by one on a byte, fewer instructions will be generated because a bunch of inline instructions is replaced by subroutine calls and returns. For machine cycles, there is the overhead of the call and return, but again, except for the simplest cases, the proportion of time spent in the call and return is not significant compared to the time doing the shifting as the shift gets larger. There is no stack variable access overhead as the library routines expect arguments in registers. (Reentrancy is not an issue as these routines are leaf routines.) Anyway as I wrote, get something general working first, then optimise as necessary.

    Six routines are needed, for signed and unsigned shifts for char, int, and long. In signed shifts the sign bit is preserved. In unsigned shifts there is no sign bit. Operands are expected in registers. These routines were tested separately, but the backend routines to plug them into the code stream have not been completed. That is TODO number 1. Unfortunately there is scant documentation on the backend routines I need to call to go from the iCode (the internal representation of the C code) to assembler code. I just have to work it out by trial and error. ☹️
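
    For illustration, here is a C model of what the two 16-bit routines compute, working a byte at a time and passing the leaving bit from the high byte to the low byte through a carry. The real routines are in assembler and take their operands in registers. The names below are hypothetical, and signed values are carried as bit patterns in uint16_t so that the sketch does not depend on how a C compiler shifts negative numbers.

```c
#include <stdint.h>

/* Shift one byte right by one place. carry_in enters at bit 7 and
   the bit that leaves at bit 0 is returned through carry_out. */
static uint8_t shr_byte(uint8_t b, unsigned carry_in, unsigned *carry_out)
{
    *carry_out = b & 1u;
    return (uint8_t)((b >> 1) | (carry_in ? 0x80u : 0u));
}

/* Unsigned 16-bit right shift: a zero enters at the top. */
static uint16_t shr_u16(uint16_t x, unsigned count)
{
    uint8_t hi = (uint8_t)(x >> 8);
    uint8_t lo = (uint8_t)x;
    unsigned carry;

    while (count-- != 0) {
        hi = shr_byte(hi, 0, &carry);
        lo = shr_byte(lo, carry, &carry);   /* bit leaving hi enters lo */
    }
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}

/* Signed 16-bit right shift: the sign bit is copied back in at the top. */
static uint16_t shr_s16(uint16_t x, unsigned count)
{
    uint8_t hi = (uint8_t)(x >> 8);
    uint8_t lo = (uint8_t)x;
    unsigned carry;

    while (count-- != 0) {
        hi = shr_byte(hi, hi >> 7, &carry);
        lo = shr_byte(lo, carry, &carry);
    }
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

    The char and long versions follow the same pattern with one or four bytes in the chain.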

  • Modifying the backend

    Ken Yap, 09/23/2019 at 22:25

    By this time you're getting impatient and wondering when I'm going to hack the backend, in particular the monster file gen.c.

    Did I study the file carefully and make surgical changes? Hahaha, I wish. What I actually did was run the compiler across the selection of test programs in the regression directory, note any illegal code produced and change the generator to fix, and repeat. (It turns out that the regression directory isn't the real test suite but nonetheless as good a place to get my hands dirty as any.)

    The changes fell into several categories:

    Relative jumps JR had to be changed to absolute jumps JP. And DJNZ had to be done the long way. Straightforward.

    Bit operations have to be done the long way by using the accumulator as an intermediate register. Fortunately the accumulator is regarded as volatile. Some bit tests, e.g. on the top bit, can be done by testing for negative instead of using AND.
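
    The equivalence behind the top-bit trick can be shown in C. This is a sketch with invented function names. It assumes the usual two's complement conversion from uint8_t to int8_t, which is what the target CPUs use.

```c
#include <stdint.h>

/* Test bit 7 with a mask, as an AND instruction would. */
static int top_bit_by_mask(uint8_t b)
{
    return (b & 0x80u) != 0;
}

/* Test bit 7 by looking at the sign: in two's complement the byte
   is negative exactly when its top bit is set. */
static int top_bit_by_sign(uint8_t b)
{
    return (int8_t)b < 0;
}
```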

    Moves and compares that used the block instructions have to be done the long way.

    Right shifts, as Alan Cox astutely noted years ago, have to be done the long way as the 8080 is terribly incapable in this area. All the remaining unhandled cases fell into this category. This deserves an entire log to itself, coming up.

  • Function parameter passing convention and reentrancy

    Ken Yap, 09/06/2019 at 14:52

    We normally don't think hard about how parameters are passed to functions (a procedure is simply a function that doesn't return a result, or returns void in C parlance); we just trust the magic. However this has a critical effect on how functions work, as well as the efficiency of the code.

    The canonical way of passing parameters is on the stack. Each invocation of a function receives its own copy of the parameters. Here's how it usually works in C:

    1. The function caller saves any registers that must remain unchanged if caller saves convention is in effect.
    2. The parameters are pushed last first onto the stack.
    3. The function is called, pushing the return address and any other information like flags on the stack.
    4. A frame pointer register is made to point to the first argument. This is optional, compilers often allow this to be omitted by a compilation directive. More further down.
    5. The function allocates space on the stack for locals.
    6. The function saves any registers that must remain unchanged if the callee saves convention is in effect.
    7. The body of the function runs.
    8. At the return statement the function restores any registers that were saved by the callee, adjusts the stack to remove the locals, then returns to the caller.
    9. The caller adjusts the stack pointer to get rid of the passed arguments.
    10. The caller restores any registers that were saved by the caller.

    Firstly note that only one of caller saves or callee saves needs to be implemented. There are pros and cons.

    In caller saves:

    • The caller knows which registers must be preserved.
    • There needs to be as many save and restore sequences as there are calls.

    In callee saves:

    • The callee has to figure out which registers get used in the function and must be preserved, even if the caller doesn't care about some of them.
    • There needs to be as many save and restore sequences as there are functions.

    SDCC implements caller saves by default for the Z80 backend.

    Incidentally why last first? C has variadic functions, of which printf may be the best known. By pushing the first parameter last, this puts it at a fixed position from the stack pointer. Some compilers have done it the other way, and then they have to do some contortions to handle printf.
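
    A small variadic function shows why the position of the first parameter matters. The callee only knows how many further arguments to fetch after it has read the first one, so it must be able to find that one without knowing the total. The function below is an invented example, not code from SDCC.

```c
#include <stdarg.h>

/* Sum 'count' int arguments. 'count' is the first parameter, so the
   callee can always find it no matter how many arguments follow. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    while (count-- > 0)
        total += va_arg(ap, int);   /* fetch the next argument */
    va_end(ap);
    return total;
}
```

    A call such as sum_ints(3, 1, 2, 3) and a call such as sum_ints(1, 42) both work with the same compiled function body.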

    Secondly, the frame pointer is actually redundant because the compiler knows at each point in the function the offset of the required parameter or local from the current value of the stack pointer. However the FP has value when using a debugger since the offset from the FP is always the same for a given parameter or local.

    In CPUs that are lacking in registers, the FP is usually omitted at the cost of more work in the code generator to get the right offset.
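
    The extra work amounts to keeping a running count of what has been pushed. Here is a sketch of that bookkeeping, assuming a stack that grows downward and a 2-byte return address as on the 8080. The struct and function names are hypothetical and do not come from SDCC.

```c
/* Hypothetical bookkeeping for a frame addressed from the stack
   pointer only. All sizes are in bytes. */
struct frame {
    int locals_size;   /* space reserved for locals on entry */
    int pushed;        /* bytes pushed since the locals were reserved */
};

/* Offset from the current SP of a local that sits 'local_offset'
   bytes above the bottom of the locals area. */
static int sp_offset_of_local(const struct frame *f, int local_offset)
{
    return f->pushed + local_offset;
}

/* Offset from the current SP of a parameter that sits 'param_offset'
   bytes above the return address (assumed to be 2 bytes). */
static int sp_offset_of_param(const struct frame *f, int param_offset)
{
    return f->pushed + f->locals_size + 2 + param_offset;
}
```

    With a frame pointer these offsets would be constants. Without one, every push or pop inside the function body changes the value of pushed and therefore every offset computed afterwards.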

    Finally there are some hybrid schemes that pass say the first few (say 1 or 2) arguments in registers and the rest on the stack.

    Whatever scheme is chosen for parameter passing above, all runtime libraries used for a given port must be compiled with the same scheme. This is in addition to any opcode differences. So no mixing of Z80 and I80 libraries.

    Note that we assume we can access a parameter or local at an offset from the stack pointer. Even this is tortuous for benighted CPUs like the 8080, as there is no indexed addressing mode. We have to emit instructions to calculate the effective address of the parameter, involving getting the SP into a register, then adding to it. The Z80 has the IX and IY registers for indexing, a great advance, but this is limited to offsets that fit in a signed byte. This suffices for the vast majority of parameter and local lists but greater offsets (say if you pass...

  • Are you game, boy?

    Ken Yap, 09/01/2019 at 12:50

    The one factor that makes the 8080 target even remotely tractable for me is the included port to the Gameboy Z80 by Philipp Klaus Krause. The GBZ80 is a cut down version of the Z80 that has more in common with the 8080 than the Z80. A good treatise can be found here. A quick summary: All the DD, ED, FD prefixed instructions are missing so no extended instructions, or IX/IY instructions. However the CB prefixed bit instructions are still there and this will cause most of the headaches later on. The I/O instructions are also absent but this doesn't affect the code generator, only library routines in assembler that use them.

    In the code there are various macros of the form IS_<port>, e.g. IS_GB. These expand to a test on the machine subtype. I added another constant SUB_I80, the corresponding macro IS_I80, and a new macro IS_GB_I80 which is IS_GB || IS_I80.

    So can we just s/IS_GB/IS_GB_I80/ throughout? Not so fast. As mentioned before the bit instructions are still present, and there are some features the GBZ80 has that the Z80 and 8080 don't. So these cannot use the test IS_GB_I80. In addition the GB uses DE then HL for function results instead of HL then DE, presumably as this is the convention for other GB software. The upshot is that every occurrence of IS_GB has to be inspected to see if it's relevant to the I80 also.
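
    A sketch of how such macros might look, and of why a blanket substitution fails. Only SUB_I80, IS_GB, IS_I80 and IS_GB_I80 are names from the text. The enum, the other two constants, the variable current_sub and the two helper functions are placeholders for this example, and SDCC's actual definitions may differ.

```c
/* Sub-type identifiers. Only SUB_I80 is named in the text; the
   other names are placeholders for this sketch. */
enum sub_type { SUB_Z80, SUB_GBZ80, SUB_I80 };

/* Placeholder for wherever the selected sub-type is stored. */
static enum sub_type current_sub = SUB_Z80;

#define IS_GB      (current_sub == SUB_GBZ80)
#define IS_I80     (current_sub == SUB_I80)
/* True for either of the two cut-down instruction sets. */
#define IS_GB_I80  (IS_GB || IS_I80)

/* IX and IY are missing on both the GBZ80 and the 8080, so the
   combined test is the right one here. */
static int has_index_registers(void)
{
    return !IS_GB_I80;
}

/* The CB-prefixed bit instructions exist on the GBZ80 but not on
   the 8080, so here the combined test would be wrong. */
static int has_cb_bit_instructions(void)
{
    return !IS_I80;
}
```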

    Furthermore the Z80 and GBZ80 have some instructions that the 8080 doesn't, like DJNZ, so code to do this the long way has to be added. These places cannot be found by searching for GB.

    There are other major differences which will be described in other logs.

  • Preparing the driver

    Ken Yap, 08/30/2019 at 12:14

    As aficionados of Unix know, the classic C compiler is actually a chain of programs that take C source and generate various files, in the longest case, to an executable, classically a.out format, but for a long time now, usually ELF format. Here is the chain in block diagram form, taken from Wikibooks:

    [Figure: block diagram of the compilation chain. By Agpires, own work, CC BY-SA 3.0.]

    SDCC is no different. However I only need to concern myself with the box labelled Compiler in the above diagram. For the 8080, as for the Z80, aslink generates Intel Hex format.

    SDCC calls each distinct processor target a model, which is passed to the compiler driver sdcc as an argument, for example:

    sdcc -mz80 hello.c -o hello.ihx

    All related models share a backend, for example all the 8051 targets use the same backend.  Since the 8080 is closely related to the Z80, this is the backend I'm modifying.

    Each variant processor is included in SDCC by including a pointer to a PORT structure. It's a large structure which describes many aspects of that port, including, but not limited to, the characteristics of the C implementation such as the sizes of various data types, the names of various sections in the generated assembler code, and the programs and libraries that are used in the chain.  Here is an excerpt from it:

    PORT i80_port =
      "Intel 8080",           /* Target name */
        NULL,                       /* model == target */
      {                             /* Assembler */
        "-plosgffwy",               /* Options with debug */
        "-plosgffw",                /* Options without debug */
        NULL                        /* no do_assemble function */
      {                             /* Linker */
        _z80LinkCmd,                //NULL,
        NULL,                       //LINKCMD,
        _crt,                       /* crt */
        _libs_i80,                  /* libs */
      {                             /* Peephole optimizer */
      /* Sizes: char, short, int, long, long long, near ptr, far ptr, gptr, func ptr, banked func ptr, bit, float */
      { 1, 2, 2, 4, 8, 2, 2, 2, 2, 2, 1, 4 },

    Fortunately I could copy a lot of the fields from the Z80 port. But some parameters, such as for optimisation, may need tweaking later.

    The structure refers to a mapping.i file which is used as a sort of macro expansion mechanism to generate many lines of code from a single line of pseudo-code. For example this entry:

    { "enterx", "add sp, #-%d" },

    generates an instruction to adjust the stack from a single mnemonic. 
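
    The mechanism can be pictured as a table lookup followed by printf-style formatting. The sketch below is not SDCC's implementation. Only the "enterx" entry comes from the text, and the struct, the table name and the function expand() are assumptions for the example.

```c
#include <stdio.h>
#include <string.h>

/* One entry pairs a pseudo-op name with a printf-style template. */
struct mapping {
    const char *name;
    const char *format;
};

/* Only "enterx" comes from the text; a real table has many more. */
static const struct mapping table[] = {
    { "enterx", "add sp, #-%d" },
};

/* Expand pseudo-op 'name' with one integer argument into 'out'.
   Returns 0 on success, -1 if the name is not in the table. */
static int expand(const char *name, int arg, char *out, size_t out_size)
{
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].name, name) == 0) {
            snprintf(out, out_size, table[i].format, arg);
            return 0;
        }
    }
    return -1;
}
```

    Keeping the templates in a table means a port can change the text of an instruction without touching the code that asks for it.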

    Also referred to are the peephole rules for optimising short sequences of assembler lines. For example:

    replace restart {
            ld      %1, %1
    } by {
            ; peephole 1 removed redundant load.
    } if notVolatile(%1)

    removes an unneeded assembler instruction. I took the Z80 rules and removed those which did not apply to the 8080. 

    Finally we need to spit out the .8080 directive to the assembler output at the beginning. This is done in the file SDCCglue.c:

    else if (TARGET_IS_I80)
        fprintf (asmFile, "\t.8080\n");

    All in all about half a dozen files were modified to prepare the framework to handle the 8080 backend. The real work is yet to begin.

  • Choosing the instruction mnemonics

    Ken Yap, 08/29/2019 at 02:00

    As people know, the Z80 chose to use a different set of mnemonics from the 8080. In my opinion the Z80 mnemonics are more logical. The 8080 mnemonics encoded the addressing mode in the instruction, so for example load immediate to accumulator was MVI A,nn but load immediate to HL was LXI H,nnnn. For the Z80 mnemonics these are LD A,#nn and LD HL,#nnnn. There are other irregularities.

    It wasn't hard to make a choice. The SDCC code generator for the Z80 generates the Z80 mnemonics, naturally. Even though the user normally doesn't look at the assembler output, I had no desire to throw 8080 mnemonics into the mix. So Z80 mnemonics it was.

    However I had to make the assembler detect whenever an instruction not in the 8080 subset was used, to catch code generation errors. So I hacked asz80 from asxxxx to accept an additional directive, .8080, which makes the assembler signal an error on such lines. I wasn't consistent in the method of detection. Some are caught at the syntax stage, by forbidding IX and IY, for example. The others are detected at the mnemonic decoding step, by forbidding EXX, for example. It isn't important, as long as an error is returned, as this is something that "should not happen" and should be reported to the developer.
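
    The two kinds of check can be sketched as follows. This is not the code in sdasz80, which does the work inside its parser and opcode tables. The function name, the short list of mnemonics and the assumption that names arrive in lower case are all made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* A few Z80 mnemonics with no 8080 equivalent. The list is
   deliberately incomplete. */
static const char *const z80_only[] = { "exx", "djnz", "jr", "ldir" };

/* Return 1 if the line should be rejected in 8080 mode, else 0.
   'operand' may be NULL for instructions without one. Names are
   assumed to be lower case already. */
static int rejected_in_8080_mode(const char *mnemonic, const char *operand)
{
    size_t i;

    /* Syntax-stage style check: index registers do not exist. */
    if (operand != NULL &&
        (strcmp(operand, "ix") == 0 || strcmp(operand, "iy") == 0))
        return 1;

    /* Decode-stage style check: the mnemonic itself is Z80 only. */
    for (i = 0; i < sizeof z80_only / sizeof z80_only[0]; i++) {
        if (strcmp(mnemonic, z80_only[i]) == 0)
            return 1;
    }
    return 0;
}
```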

    This was done in short order and you can find the hacked asz80, or rather sdasz80, since the SDCC developers have made several changes to the stock asxxxx distribution, in the GitHub repo.

  • 1
    Building from GitHub source

    The current code is published on GitHub and you can clone and build from there. You should have a Linux environment and all the necessary prerequisites, i.e. gcc and all sorts of development header files and libraries, and be competent with GNU autoconfigure tools.

EtchedPixels wrote 01/31/2020 at 11:53

One comment: 8085 code generation is very different to 8080. The 8085 added all the needed instructions for sane high level languages although they were then undocumented for some weird Intel reason, but are documented for many of the clones/copies and are used by everyone in the 8085 world.

In particular it adds

- 16bit right and left shift

- 16bit subtract

- DE = HL + imm8

- DE = SP + imm8

- (DE) = HL (16bit)

- HL = (DE)

as well as JK and JNK which can be used with 16bit inc/dec to avoid the complex test of 16bit = 0

8085 code thus tends to look like

LDSI variable (relative to SP)

LHLX   load 16bit variable into HL

Do stuff

(LDSI variable again if DE was eaten)

SHLX put it back

and pointer chasing becomes

LHLX - HL = (DE)

LDHI  offsetinstruct


.... (with XCHG instead on a 0 offset)

ACK unfortunately can't use this because the ACK compiler has a built in requirement for a frame pointer, which is totally incompatible with the design of the 8085 stack instructions.

Ken Yap wrote 01/31/2020 at 11:59

Thanks that's useful. I must get a round tuit some day.

agp.cooper wrote 10/19/2019 at 01:57

Now your timing for this is magic. I have been looking for a C (cross) compiler for the 8080 but found that SDCC did not support the 8080. I have found a number of other C compilers (DOS/CPM) for the 8080 so it was not a huge problem but the SDCC would be better as it is a more modern and perhaps a better compiler.

Regards AlanX

Jonathan wrote 10/14/2019 at 06:37

Reading this write up was already extremely informative (always wondered about the added underscore prepending function listings). Keep it up I'd say and please continue writing.
