Function parameter passing convention and reentrancy

A project log for Targeting SDCC to the 8080

Writing a code generator for the 8080 microprocessor for Small Device C Compiler (SDCC)

Ken Yap 09/06/2019 at 14:52

We normally don't think hard about how parameters are passed to functions; we just trust the magic. (A procedure is simply a function that doesn't return a result, or returns void in C parlance.) However, the passing convention has a critical effect on how functions work, as well as on the efficiency of the code.

The canonical way of passing parameters is on the stack. Each invocation of a function receives its own copy of the parameters. Here's how it usually works in C:

  1. The function caller saves any registers that must remain unchanged, if the caller saves convention is in effect.
  2. The parameters are pushed onto the stack, last parameter first.
  3. The function is called, pushing the return address and any other information like flags on the stack.
  4. A frame pointer register is made to point to the first argument. This is optional; compilers often allow it to be omitted by a compilation directive. More on this further down.
  5. The function allocates space on the stack for locals.
  6. The function saves any registers that must remain unchanged if the callee saves convention is in effect.
  7. The body of the function runs.
  8. At the return statement the function restores any registers that were saved by the callee, adjusts the stack to remove the locals, then returns to the caller.
  9. The caller adjusts the stack pointer to get rid of the passed arguments.
  10. The caller restores any registers that were saved by the caller.
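
To make steps 2, 3 and 9 concrete, here is a minimal sketch that models the stack as an array of 16-bit words. Everything in it (stack_mem, push, pop, call_sub, the dummy return address 0x1234) is a hypothetical stand-in of mine, not anything SDCC emits; it only shows who pushes what and who cleans up.

```c
#include <stdint.h>

/* Hypothetical model of a downward-growing stack of 16-bit words. */
enum { STACK_WORDS = 32 };
static uint16_t stack_mem[STACK_WORDS];
static int sp = STACK_WORDS;            /* index of the top-of-stack word */

void push(uint16_t v) { stack_mem[--sp] = v; }
uint16_t pop(void)    { return stack_mem[sp++]; }

/* Callee: its arguments sit at fixed offsets above the return address. */
uint16_t callee_sub(void)
{
    uint16_t a = stack_mem[sp + 1];     /* first argument, pushed last   */
    uint16_t b = stack_mem[sp + 2];     /* second argument, pushed first */
    return (uint16_t)(a - b);
}

/* Caller side of the convention. */
uint16_t call_sub(uint16_t a, uint16_t b)
{
    uint16_t result;

    push(b);                 /* step 2: last parameter first          */
    push(a);
    push(0x1234);            /* step 3: stand-in for the return address */
    result = callee_sub();
    (void)pop();             /* the return consumes the return address */
    sp += 2;                 /* step 9: caller discards the arguments  */
    return result;
}

/* Number of words currently on the model stack. */
int stack_depth(void) { return STACK_WORDS - sp; }
```

After call_sub returns, stack_depth() is back to where it started, which is the whole point of step 9.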

Firstly note that only one of caller saves or callee saves needs to be implemented. There are pros and cons.

In caller saves:

In callee saves:

SDCC implements caller saves by default for the Z80 backend.

Incidentally, why last first? C has variadic functions, of which printf may be the best known. Pushing the first parameter last puts it at a fixed offset from the stack pointer, so the callee can always find it (the format string, in printf's case) and work out from it how many more arguments follow. Some compilers have done it the other way, and then they have to do some contortions to handle printf.
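
A minimal sketch of why that fixed position matters, using a hypothetical sum_ints function of my own: the callee can always find its first parameter, and reads it to learn how many more follow.

```c
#include <stdarg.h>

/* Hypothetical variadic function: count says how many ints follow it. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);            /* start just past the first parameter */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);   /* step through the remaining ones */
    va_end(ap);
    return total;
}
```

printf works the same way, except that the first parameter is a format string rather than a count.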

Secondly, the frame pointer is actually redundant because the compiler knows at each point in the function the offset of the required parameter or local from the current value of the stack pointer. However the FP has value when using a debugger since the offset from the FP is always the same for a given parameter or local.

In CPUs that are lacking in registers, the FP is usually omitted at the cost of more work in the code generator to get the right offset.
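
The extra work amounts to bookkeeping. Here is a minimal sketch, assuming the code generator keeps a running count of the bytes pushed since function entry; sp_tracker and its helpers are hypothetical names of mine, not SDCC's actual data structures.

```c
/* Hypothetical bookkeeping kept while emitting one function body. */
typedef struct {
    int pushed_bytes;   /* bytes pushed since function entry */
} sp_tracker;

void track_push(sp_tracker *t, int bytes) { t->pushed_bytes += bytes; }
void track_pop(sp_tracker *t, int bytes)  { t->pushed_bytes -= bytes; }

/* entry_offset is where the variable sat relative to SP on entry.
   Every push since then has moved SP further away from it. */
int sp_offset_now(const sp_tracker *t, int entry_offset)
{
    return entry_offset + t->pushed_bytes;
}
```

For example, a parameter that sat at SP+4 on entry is found at SP+6 while a 2-byte argument for a nested call is on the stack, and at SP+4 again once the caller has popped it. With a frame pointer the offset would stay the same throughout.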

Finally, there are hybrid schemes that pass the first few (say 1 or 2) arguments in registers and the rest on the stack.

Whatever scheme is chosen for parameter passing, all runtime libraries used for a given port must be compiled with the same scheme. This is in addition to any opcode differences, so there is no mixing of Z80 and I80 libraries.

Note that we assume we can access a parameter or local at an offset from the stack pointer. Even this is tortuous for benighted CPUs like the 8080, as there is no indexed addressing mode. We have to emit instructions to calculate the effective address of the parameter, involving getting the SP into a register, then adding to it. The Z80 has the IX and IY registers for indexing, a great advance, but this is limited to offsets that fit in a signed byte. This suffices for the vast majority of parameter and local lists but greater offsets (say if you pass a large structure by value, or have large locals) have to be handled too.
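
As a rough sketch of the two cases, the C below checks whether an offset fits the Z80's signed-byte displacement, and writes out the kind of 8080 sequence needed to load one byte from SP+offset. The function names are hypothetical and the 8080 text is hand-written Intel-style assembler for illustration, not output from SDCC.

```c
#include <stdbool.h>
#include <stdio.h>

/* True if offset can be encoded as the d in a Z80 (ix+d) or (iy+d) operand. */
bool fits_z80_displacement(int offset)
{
    return offset >= -128 && offset <= 127;
}

/* Writes an illustrative 8080 sequence that loads the byte at SP+offset
   into A. Returns the length snprintf reports. */
int emit_8080_load(char *buf, size_t len, int offset)
{
    return snprintf(buf, len,
                    "lxi h, %d\n"   /* HL = offset            */
                    "dad sp\n"      /* HL = SP + offset       */
                    "mov a, m\n",   /* A  = byte at that address */
                    offset);
}
```

Note that the 8080 sequence ties up HL for every access, which is a large part of why stack variables are so costly on that CPU.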

Here is a C file:

#include <stdio.h>

extern int rand(int);

int foo(int i, int j)
{
        int     k;

        k = (j == rand(i));
        return k - 7;
}

void main()
{
        int     i, j;

        i = rand(2);
        j = foo(24, 42);
        printf("%d %d\n", i, j);
}

and the generated Z80 assembler, annotated with the step numbers from the list above:

;foo.c:5: int foo(int i, int j)
; ---------------------------------
; Function foo
; ---------------------------------
_foo::
;foo.c:9: k = (j == rand(i));
        pop     bc  ;bc = return address
        pop     hl  ;hl = parameter i
        push    hl  ;put i back
        push    bc  ;put the return address back
        push    hl  ;2. push i as the argument to rand
        call    _rand  ;3.
        pop     af  ;9.
        ld      c, l  ;bc = value returned by rand
        ld      b, h
        ld      iy, #4  ;access parameter j at SP+4
        add     iy, sp
        ld      a, 0 (iy)
        sub     a, c
        jr      NZ,00103$
        ld      a, 1 (iy)
        sub     a, b
        jr      NZ, 00103$
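        ld      a, #0x01  ;equal: a = 1
        .db     #0x20  ;opcode of jr nz, never taken here, so it swallows the xor below as its offset
00103$:
        xor     a, a  ;not equal: a = 0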
        ld      c, a
        ld      b, #0x00
;foo.c:10: return k - 7;
        ld      a, c
        add     a, #0xf9
        ld      c, a
        ld      a, b
        adc     a, #0xff
        ld      b, a
        ld      l, c
        ld      h, b
;foo.c:11: }
        ret
;foo.c:13: void main()
; ---------------------------------
; Function main
; ---------------------------------
_main::
;foo.c:17: i = rand(2);
        ld      hl, #0x0002
        push    hl  ;2.
        call    _rand  ;3.
;foo.c:18: j = foo(24, 42);
        ex      (sp),hl  ;9. and save i: swap rand's result with the stale argument on the stack
        ld      hl, #0x002a
        push    hl  ;2.
        ld      l, #0x18
        push    hl  ;2.
        call    _foo  ;3.
        pop     af  ;9.
        pop     af  ;9.
        pop     de  ;de = i, saved on the stack by the ex (sp),hl earlier
;foo.c:19: printf("%d %d\n", i, j);
        ld      bc, #___str_0+0
        push    hl  ;2.
        push    de  ;2.
        push    bc  ;2.
        call    _printf  ;3.
        ld      hl, #6  ;9.
        add     hl, sp
        ld      sp, hl
;foo.c:20: }
        ret
___str_0:
        .ascii "%d %d"
        .db 0x0a
        .db 0x00
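
The add a, #0xf9 and adc a, #0xff pair in foo is how k - 7 is computed on an 8-bit CPU: the compiler adds the 16-bit two's complement of 7, which is 0xfff9, one byte at a time, carrying from the low byte into the high byte. A small C model of that arithmetic (sub7_bytewise is a name I made up for illustration):

```c
#include <stdint.h>

/* Subtract 7 from a 16-bit value by adding 0xfff9 bytewise with carry. */
uint16_t sub7_bytewise(uint16_t k)
{
    unsigned lo = (k & 0xffu) + 0xf9u;            /* add a, #0xf9 */
    unsigned carry = lo >> 8;                     /* carry out of the low byte */
    unsigned hi = (k >> 8) + 0xffu + carry;       /* adc a, #0xff */

    return (uint16_t)(((hi & 0xffu) << 8) | (lo & 0xffu));
}
```

For k = 10 the low byte overflows to 3 with a carry, and the carry turns the high byte's 0x00 + 0xff back into 0x00, giving 3.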

Now you understand why compiler writers want to hug the CPU architect when they see a well-designed indexed addressing mode. Ask anyone who has enjoyed the 6809 instruction set after enduring the 6800 one.

However, for some CPUs like the 8051 this stack scheme is not available, or has to be used sparingly, because something is lacking, for example sufficient RAM space for the stack. In this case SDCC offers another way of passing parameters and allocating locals. The parameters and locals are actually static locations in RAM, one set for each function. The caller fills in the parameter locations, then calls the function. Given that the 8080 has poor stack variable handling, can we use this?
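
Written out by hand in C, the static scheme looks roughly like this. The function scale and its slot names are hypothetical; a real compiler does this behind the scenes, and this is only a sketch of the idea, not of SDCC's generated code.

```c
/* Static parameter and local slots for a hypothetical function scale(). */
static int scale_x;        /* parameter x      */
static int scale_factor;   /* parameter factor */
static int scale_tmp;      /* local variable   */

/* The function body reads and writes fixed addresses, no stack offsets. */
static int scale_body(void)
{
    scale_tmp = scale_x * scale_factor;
    return scale_tmp + 1;
}

/* What the caller's side amounts to: fill the slots, then call. */
int call_scale(int x, int factor)
{
    scale_x = x;
    scale_factor = factor;
    return scale_body();
}
```

Every access is to an absolute address, which even the 8080 handles comfortably.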

"Yes but what will you do about recursive calls?"

"Ok ok I promise not to write any eight queens or tower of Hanoi programs."

"What will you do about bsearch and qsort library routines?"

"Maybe we can make an exception for those and do it the hard way on the CPUs that can manage it, however grudgingly?" (This is actually what is done, with the special reentrant keyword, spelled __reentrant in current SDCC.) "Anything else?"

"Yes, you realise that if an interrupt routine calls a function that is also used in the main line it can trample over parameters and locals?"

"Ah, I guess we'll have to avoid that then."

Nonetheless, to begin with at least, I have set --stack-auto to false in the 8080 profile, and will turn it on later to see whether the code generator copes.
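
The recursion objection in the dialogue can be made concrete. The sketch below is a hand-written model of a factorial whose parameter n lives in a single static slot, next to an ordinary stack-based version. The names are hypothetical, and the model assumes n is read back after the recursive call returns.

```c
static int fact_n;                 /* the one static slot for parameter n */

static int fact_static_body(void)
{
    int sub;                       /* stands in for the returned value */

    if (fact_n <= 1)
        return 1;
    fact_n = fact_n - 1;           /* caller side: fill the slot for fact(n - 1) */
    sub = fact_static_body();
    return fact_n * sub;           /* n is read after the call: already clobbered */
}

/* Static-slot version: the recursive call overwrites the outer n. */
int fact_static(int n)
{
    fact_n = n;
    return fact_static_body();
}

/* Stack-based version: each invocation has its own copy of n. */
int fact_auto(int n)
{
    return n <= 1 ? 1 : n * fact_auto(n - 1);
}
```

Here fact_auto(4) gives 24, while fact_static(4) gives 1, because by the time any multiplication happens the slot holds the innermost value of n. An interrupt routine calling a function that the main line is in the middle of would trample the slots in the same way.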