
Original article OCR

A project log for NS 32000 cross-assembler in C

By Richard Rodman, and published in Dr. Dobb's Journal.

Keith, 02/11/2024 at 16:14

Dr. Dobb’s Journal, December 1986, page 48.

A table-driven assembler that can be modified for other processors.

Series 32000 Cross-Assembler

by Richard Rodman, 1923 Anderson Rd., Falls Church, VA 22043

The 32000 processor features generalized addressing modes available in almost all instructions.

The National Semiconductor series 32000 microprocessor line includes the 32-bit 32032 and the 16-bit 32016 (formerly called 16032) microprocessors. As part of a project to build a board using a 32032, I wrote an assembler in Software Toolworks’ C/80; adaptation to any other variant of C should be easy.

Although most people lump the 68000 and the 32016 together, these processors are radically different. The differences have been summed up as "the 68000 is PDP-11-like, whereas the 32000 is VAX-like." The 32000 includes bit-field, translate, procedure enter/return, and other high-level instructions in its instruction set.

Basic Program Design

This program works in a brute-force fashion, but it is easy to understand, modify, and debug. Each instruction’s binary equivalent is stored in a string, with xs where operands need to be inserted. A string matcher, match(), matches the opcodes against lines in the source file, keeping matches to wildcards in the buffer ambig_buffer. Each opcode has an option character, opopt, associated with it that controls special-case logic for some instructions. The data is output in Intel absolute hex format. Table 1, page 49, shows the definitions for the opopt characters and the instruction table format. Table 2, page 49, shows some examples of instruction formats defined using the structure in Table 1.
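The output stage can be pictured with a short sketch of an Intel hex data record. This is not the article's own routine: the name hex_record and its arguments are hypothetical, and it assumes plain 16-bit addresses with record type 00.

```c
#include <stdio.h>

/* Format one Intel hex data record (type 00) into out.
   out must hold at least 2*len + 12 characters.
   Returns the number of characters written, excluding the NUL.
   hex_record is a hypothetical name, not taken from the listing. */
int hex_record(char *out, unsigned addr, const unsigned char *data, int len)
{
    /* The checksum covers the length, both address bytes, the type
       byte (00 here, so it adds nothing), and every data byte. */
    unsigned sum = (unsigned)len + ((addr >> 8) & 0xff) + (addr & 0xff);
    int n = sprintf(out, ":%02X%04X00", (unsigned)len & 0xff, addr & 0xffff);
    int i;

    for (i = 0; i < len; i++) {
        n += sprintf(out + n, "%02X", (unsigned)data[i]);
        sum += data[i];
    }
    /* Two's complement of the low byte of the sum. */
    n += sprintf(out + n, "%02X", (0x100 - (sum & 0xff)) & 0xff);
    return n;
}
```

A file of such records conventionally ends with the end-of-file record `:00000001FF`.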

The 32000 processor, although allowing absolute addressing, features generalized addressing modes available in almost all instructions. Two’s-complement offsets can be used in three different sizes — 7, 14, or 30 bits long — as needed. Because these offsets could refer to areas not yet defined, and the length of the code varies with the offset, three passes are necessary. The first pass gets a coarse value of all symbols, the second pass then makes the variable offsets the right length and corrects the symbol values, and the third pass actually generates the code. After the first pass, the symbol table is sorted; then in the second and third passes, a binary search is used to find entries more quickly.
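To see why code length depends on the offset, here is a minimal sketch of choosing and encoding a displacement. The function names are hypothetical. The prefix bits (0 for one byte, 10 for two, 11 for four, most significant byte first) are my understanding of the Series 32000 encoding and should be checked against the National Semiconductor documentation.

```c
/* Bytes needed for a two's-complement displacement, using the
   7-, 14-, and 30-bit sizes named in the article. */
int disp_size(long d)
{
    if (d >= -64L && d <= 63L)
        return 1;               /* fits in 7 bits */
    if (d >= -8192L && d <= 8191L)
        return 2;               /* fits in 14 bits */
    return 4;                   /* assumed to fit in 30 bits */
}

/* Store the displacement in out, most significant byte first.
   ASSUMPTION: prefix bits 0, 10, and 11 mark the three sizes.
   Returns the number of bytes written. */
int encode_disp(long d, unsigned char *out)
{
    unsigned long u = (unsigned long)d;
    int n = disp_size(d);

    if (n == 1) {
        out[0] = (unsigned char)(u & 0x7f);
    } else if (n == 2) {
        out[0] = (unsigned char)(0x80 | ((u >> 8) & 0x3f));
        out[1] = (unsigned char)(u & 0xff);
    } else {
        out[0] = (unsigned char)(0xc0 | ((u >> 24) & 0x3f));
        out[1] = (unsigned char)((u >> 16) & 0xff);
        out[2] = (unsigned char)((u >> 8) & 0xff);
        out[3] = (unsigned char)(u & 0xff);
    }
    return n;
}
```

Because a forward reference may shrink from four bytes to one once its target is known, every symbol after it moves, which is why a pass that corrects symbol values has to come before the pass that emits code.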

Assembler Syntax

Table 3, below, shows the error messages produced by the assembler.

Future Enhancements

Unless I get some 32000 hardware to play with, it’s unlikely I'll work on this program further. If you'd like to work on it, however, some items on your list should be:

  1. Multivalue db/dw/dd and character-string constants.
  2. Global/external object format and linker.

    The 32000 instructions are already relocatable; any absolute values that would be present would presumably be entry points or I/O addresses. In fact, even the global/external isn’t really necessary because of the cxp/rxp instructions.

  3. Cseg/dseg pseudo-ops.

If you send your changes to me, I'll be happy to make them available to others. Anyone wanting a copy of the source code may send me $8 for materials and effort. Please specify 8-inch CP/M, 5¼-inch PC, 3½-inch Atari ST, or other (inquire).

For those lucky people who are in a position to make use of this program, why not let readers know what you're doing? Is the 32000 really the programmer's dream some say it is? And for those who are in a position to do so, how about some inexpensive 32000 hardware (a single-board computer, perhaps) so people can get a hands-on feel for what the processor can do?

Even if you don't have a 32000 processor to play with, you may be able to make use of routines from this program. The style exemplifies my belief that C should be written to be readable both by computers and by humans. Cryptic C is bad C.

DDJ


The listing for this article is presented in a machine-readable form — Soft-strips produced by Cauzin Systems. The strips begin on page 83. The text of the listing is available for downloading in the DDJ Electronic Edition on CompuServe. A disk with this listing and others is also available — see the ad on page 115. The text of the listing will be published next month.


#define MAXOP 149

/* The binary value should be a string of bits e.g. 0111xxxxx00b
The opcode opopt character is used to specify special operands, etc. */

/*
opopts used here for the 32000 are:

blank  nothing special 
a      gen 
b      gen short 
c      gen gen 
d      00000 short 
e      gen gen reg 
f      reglist save/enter 
g      reglist restore/exit 
h      00000 gen (sfsr) 
i      inss/exts 
j      movs/skps/cmps 
k      setcfg 
l      procreg, gen for lpr/spr 
m      index (operand order) 
n      ret/rett — postbyte 
o      movm 
p      cxp (disp after instruction) */ 

struct {
    char    *onam;       /* opcode name */
    int     oent;        /* operand count, negative if PC-relative */
    char    *obin;       /* opcode binary value, stored as a string */
    char    oopt;        /* opcode opopt char */
}

Table 1: Definitions of opopt characters


"bsr",      -1,   "02h", 
"save",     1,    "62h", 
"svc",      0,    "0e2h", 
"bne",      -1,   "1ah",
"addq?",    2,    "xxxxxxxxx00011iib", 
"sgt?",     1,    "xxxxx011001111iib", 
"jump",     1,    "xxxxx01001111111b", 
"jsr",      1,    "xxxxx11001111111b",
"addl",     2,    "xxxxxxxxxxxxx00000010111110b",
"mulf",     2,    "xxxxxxxxxx11000110111110b", 
"and?",     2,    "xxxxxxxxxx1010iib", 
"not?",     2,    "xxxxxxxxxx10011101001110b",

Table 2: Selected instruction formats from the opcode table
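The entries above suggest how a table name and its bit string might be processed. The sketch below is a hypothetical stand-in, not the listing's match() or ambig_buffer logic. It assumes that "?" in a name matches one length letter, that "ii" in the bit string receives the integer length code (b = 00, w = 01, d = 11), and that x positions are left as zero for the operand fields to fill in later. Hex entries such as "02h" are not handled.

```c
#include <string.h>

/* Match a source mnemonic against a table name. '?' matches any one
   character, which is saved in *wild. Returns 1 on a match. */
int match_op(const char *pat, const char *word, char *wild)
{
    for (; *pat != '\0'; pat++, word++) {
        if (*word == '\0')
            return 0;               /* source word is too short */
        if (*pat == '?')
            *wild = *word;
        else if (*pat != *word)
            return 0;
    }
    return *word == '\0';           /* no trailing characters allowed */
}

/* Integer length code; -1 if c is not a size letter. */
int size_code(char c)
{
    switch (c) {
    case 'b': return 0;
    case 'w': return 1;
    case 'd': return 3;
    default:  return -1;
    }
}

/* Convert a pattern such as "xxxxxxxxx00011iib" to a value.
   Returns the number of bits, or -1 on a malformed pattern. */
int pattern_bits(const char *pat, int code, unsigned long *value)
{
    size_t len = strlen(pat);
    size_t k;
    int i_seen = 0;
    unsigned long v = 0;

    if (len < 2 || pat[len - 1] != 'b')
        return -1;
    for (k = 0; k + 1 < len; k++) {
        v <<= 1;
        switch (pat[k]) {
        case '1': v |= 1; break;
        case '0': case 'x': break;  /* x stays 0 for operands */
        case 'i':
            if (i_seen > 1)
                return -1;          /* only two length bits expected */
            v |= (unsigned long)((code >> (1 - i_seen)) & 1);
            i_seen++;
            break;
        default:
            return -1;
        }
    }
    *value = v;
    return (int)(len - 1);
}
```

With these assumptions, "addqd" matches "addq?" with d as the wildcard, and the 16-bit pattern yields 000F hex before the operand fields are inserted.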

?    unknown item — syntax error 
x    unimplemented instruction (bad instruction database) 
l    no length modifier (bad instruction database) or expression too complex 
e    address extensions missing  
p    illegal register/pr/spr
[    brackets required 
v    syntax error in value
o    unknown arithmetic operator 
u    undefined symbol

Table 3: Error messages
