
BASIC Assembler

A project log for Model 100 Assembler

Development of an assembler for the Tandy Model 100 portable computer.

clintr, 08/15/2014 at 03:28

Can't find a way to add files to the project, so I'm just going to paste below - sorry.  First is the BASIC program, then a text file I wrote to document the code.  My apologies to anyone who tries to read this; I have no experience with BASIC.

Here is the BASIC program:

100 GOTO1000
200 PRINT"err near line",CL!
210 CLOSE1
220 END
1000 CLEAR2048:PS%=1:LM!=MAXRAM:HM!=HIMEM-1
1005 GOSUB2000
1006 INPUT"FILENAME";FN$
1007 PRINT"pass 1"
1010 OPENFN$FORINPUTAS1
1020 GOSUB10000
1030 CLOSE1
1040 IFLM!>HM!THENGOTO1120
1050 IFHM!>=MAXRAMTHENGOTO1120
1060 IFLM!>=HIMEMTHENGOTO1140
1120 PRINT"bounds error":PRINT"LO =",LM!:PRINT"HI =",HM!
1130 END
1140 PS%=2:CL!=1:BF%=0
1145 PRINT"pass 2"
1150 OPENFN$FORINPUTAS1
1160 GOSUB10000
1170 CLOSE1
1180 PRINT"success":PRINT"LO =",LM!:PRINT"HI =",HM!
1190 END
2000 A1$=".cma.47.cmc.63.daa.39.di.243.ei.251.hlt.118.nop.0.pchl.233.ral.23.rar.31.ret.201.rim.32.rlc.7.rrc.15.sim.48.sphl.249.stc.55.xchg.235.xthl.227"
2010 A2$=".adc.136.add.128.ana.160.cmp.184.ora.176.sbb.152.sub.144.xra.168"
2020 A3$=".dcr.5.inr.4"
2030 A4$=".dad.9.dcx.11.inx.3.ldax.10.pop.193.push.197.stax.2"
2040 A5$=".aci.206.adi.198.ani.230.cpi.254.in.219.ori.246.out.211.sbi.222.sui.214.xri.238"
2050 A6$=".call.205.jmp.195.lda.58.lhld.42.shld.34.sta.50"
2060 B2$=".a.7.b.0.c.1.d.2.e.3.h.4.l.5.m.6"
2070 B3$=".a.56.b.0.c.8.d.16.e.24.h.32.l.40.m.48"
2080 B4$=".b.0.d.16.h.32.sp.48.psw.48"
2090 B5$=".nz.0.z.8.nc.16.c.24.po.32.pe.40.p.48.m.56"
2100 LB$=""
2199 RETURN
2300 GOSUB2700
2320 IFNM!>255THENGOTO200
2330 BB!=NM!:GOTO2800
2400 GOSUB9000
2406 IFNL%<>0THENGOTO200
2410 CC$=LEFT$(TK$,1):GOSUB8100
2420 IFSC%=0THENGOTO2470
2430 NB$=TK$:GOSUB5000
2440 IFSC%=0THENGOTO200
2450 RETURN
2470 SS$=TK$:MP$=LB$:GOTO3000
2500 GOSUB2700
2570 GOSUB2600
2580 BB!=MD!:GOSUB2800
2590 BB!=DV!:GOTO2800
2600 IFNM!<=32767THENGOTO2620
2610 N!=NM!-32768:DV!=128:GOTO2630
2620 DV!=0:N!=NM!
2630 DV!=DV!+(N!\256):MD!=N!MOD256
2640 RETURN
2700 GOSUB2400
2720 IFSC%<>0THENRETURN
2730 IFPS%=2THENGOTO200
2740 TK$="0":NM!=0:RETURN
2800 IFPS%=1THENGOTO2850
2810 POKEPC!,BB!
2830 PC!=PC!+1:RETURN
2850 IFPC!<=HM!THENGOTO2870
2860 HM!=PC!
2870 IFPC!>=LM!THENGOTO2890
2880 LM!=PC!
2890 PC!=PC!+1:RETURN
3000 P%=INSTR(MP$,"."+SS$+".")
3010 IFP%<>0GOTO3040
3020 SC%=0:RETURN
3040 T$=MID$(MP$,P%+2+LEN(SS$))
3041 P%=INSTR(T$,"."):IF P%=0 GOTO 3050
3048 T$=MID$(T$,1,P%-1)
3050 NM!=VAL(T$):SC%=1:RETURN
3200 GOSUB9000
3210 CC$=LEFT$(TK$,1):GOSUB8000
3220 IFSC%=0THENGOTO200
3230 SS$=TK$:GOSUB3000
3240 IFSC%=0THENGOTO200
3250 OC!=OC!+NM!:RETURN
5000 N$=NB$:SC%=0:NM!=0
5010 K%=LEN(N$)-1
5020 IFK%<1THENRETURN
5030 A$=RIGHT$(N$,1)
5040 IFA$<>"h"THENRETURN
5050 N$=LEFT$(N$,K%)
5060 FORI%=1TOK%
5080 A$=MID$(N$,I%,1)
5090 X%=ASC(A$)
5100 IFX%>=48ANDX%<=57GOTO5130
5110 IFX%>=97ANDX%<=102GOTO5140
5120 RETURN
5130 X%=X%-48:GOTO5150
5140 X%=X%-87
5150 NM!=NM!*16
5160 NM!=NM!+X%
5170 NEXT
5180 SC%=1:RETURN
8000 A%=ASC(CC$):SC%=1
8040 IFA%>=97ANDA%<=122THENRETURN
8050 SC%=0:RETURN
8100 A%=ASC(CC$):SC%=1
8120 IFA%>=48ANDA%<=57THENRETURN
8130 SC%=0:RETURN
8200 GOSUB8000
8210 IFSC%=1THENRETURN
8220 GOTO8100
8300 SC%=1
8305 IFBF%<>0THENGOTO8345
8306 NL%=0
8310 IFEOF(1)THENGOTO200
8320 CB$=INPUT$(1,1)
8322 C%=ASC(CB$)
8324 IFC%<>10THENRETURN
8326 NL%=1:CL!=CL!+1:RETURN
8345 BF%=0:RETURN
8400 SC%=1:A%=ASC(CC$)
8420 IFA%<=32THENRETURN
8430 SC%=0:RETURN
9000 TK$=""
9010 GOSUB8300
9015 IFNL%<>0THENGOTO9430
9030 CC$=CB$
9040 GOSUB8400
9060 IFSC%<>0THENGOTO9010
9070 IFCB$<>";"THENGOTO9200
9100 GOSUB8300
9105 IFNL%<>0THENGOTO9430
9150 GOTO9100
9200 IFCB$<>","THENGOTO9220
9210 TK$=CB$:RETURN
9220 CC$=CB$
9222 GOSUB8200
9230 IFSC%=0THENGOTO200
9231 BF%=1
9240 GOSUB 8300
9245 IFNL%<>0THENGOTO9440
9250 CC$=CB$
9260 GOSUB8200
9270 IFSC%=0THENGOTO9300
9280 TK$=TK$+CB$
9290 GOTO9240
9300 IFCB$<>":"THENGOTO9340
9320 TK$=TK$+CB$
9330 RETURN
9340 IFCB$<>","ANDCB$<>";"THENRETURN
9350 BF%=1
9360 RETURN
9380 GOSUB8400
9390 IFSC%=0THENGOTO200
9430 TK$=CB$:RETURN
9440 BF%=1:RETURN
9700 GOSUB9000
9710 IFNL%<>0ORTK$<>","THENGOTO200
9720 RETURN
10000 CL!=1:BF%=0
10010 GOTO10050
10030 GOSUB9000
10040 IFNL%=0THENGOTO200
10050 GOSUB9000
10055 IFNL%<>0THENGOTO10050
10060 TT$=TK$:SS$=TT$:MP$=A1$:GOSUB3000
10070 IFSC%=0THENGOTO10100
10080 BB!=NM!:GOSUB2800
10090 GOTO10030
10100 SS$=TT$:MP$=A2$:GOSUB3000
10110 IFSC%=0THENGOTO10150
10120 OC!=NM!:MP$=B2$:GOSUB3200
10130 BB!=OC!:GOSUB2800
10140 GOTO10030
10150 SS$=TT$:MP$=A3$:GOSUB3000
10160 IFSC%=0THENGOTO10200
10170 OC!=NM!:MP$=B3$:GOSUB3200
10180 GOTO10130
10200 SS$=TT$:MP$=A4$:GOSUB3000
10210 IFSC%=0THENGOTO10250
10220 OC!=NM!:MP$=B4$:GOSUB3200
10230 GOTO10130
10250 SS$=TT$:MP$=A5$:GOSUB3000
10260 IFSC%=0THENGOTO10300
10270 BB!=NM!:GOSUB2800
10280 GOSUB2300
10290 GOTO10030
10300 SS$=TT$:MP$=A6$:GOSUB3000
10310 IFSC%=0THENGOTO10350
10320 BB!=NM!:GOSUB2800
10330 GOSUB2500
10340 GOTO10030
10350 IFTT$<>"lxi"THENGOTO10400
10360 OC!=1:MP$=B4$:GOSUB3200
10370 BB!=OC!:GOSUB2800
10375 GOSUB9700
10380 GOSUB2500
10390 GOTO10030
10400 IFTT$<>"mov"THENGOTO10450
10410 OC!=64:MP$=B3$:GOSUB3200
10415 GOSUB9700
10420 MP$=B2$:GOSUB3200
10430 GOTO10130
10450 IFTT$<>"mvi"THENGOTO10500
10460 OC!=6:MP$=B3$:GOSUB3200
10470 BB!=OC!:GOSUB2800
10475 GOSUB9700
10480 GOSUB2300
10490 GOTO10030
10500 IFTT$<>"rst"THENGOTO10550
10510 OC!=199:GOSUB2700
10520 OC!=OC!+8*NM!
10530 GOTO10130
10550 IFTT$<>"org"THENGOTO10595
10560 GOSUB2700
10570 PC!=NM!
10590 GOTO10030
10595 A$=RIGHT$(TT$,1):IFA$=":"THENGOTO11110
10600 A$=LEFT$(TT$,1):SS$=MID$(TT$,2):MP$=B5$
10610 IFA$<>"r"THENGOTO10650
10620 OC!=192:TK$=SS$:GOSUB3230
10630 GOTO10130
10650 IFA$<>"c"THENGOTO10700
10660 OC!=196:TK$=SS$:GOSUB3230
10670 BB!=OC!:GOSUB2800
10680 GOSUB2500
10690 GOTO10030
10700 IFA$<>"j"THENGOTO10750
10710 OC!=194:TK$=SS$:GOSUB3230
10720 GOTO10670
10750 IFTT$="end"THENRETURN
10760 IFA$<>"d"THENGOTO200
10770 B$=MID$(TT$,2)
10800 IFB$<>"s"THENGOTO10890
10810 GOSUB9000
10820 IFNL%<>0THENGOTO200
10830 CC$=LEFT$(TK$,1):GOSUB8100
10840 IFSC%=0THENGOTO200
10850 NB$=TK$:GOSUB5000
10860 IFSC%=0THENGOTO200
10870 PC!=PC!+NM!
10880 GOTO10030
10890 IFB$<>"b"THENGOTO10950
10900 GOSUB2300
10910 GOSUB9000
10920 IFNL%<>0THENGOTO10050
10930 IFTK$<>","THENGOTO200
10940 GOTO10900
10950 IFB$<>"w"THENGOTO200
10960 GOSUB2500
10970 GOSUB9000
10980 IFNL%<>0THENGOTO10050
10990 IFTK$<>","THENGOTO200
11000 GOTO10960
11110 IF PS%=2 THEN GOTO10050
11120 B%=LEN(TT$)-1:A$=LEFT$(TT$,B%):LB$="."+A$+"."+STR$(PC!)+LB$
11130 GOTO10050

Here is the documentation:

asm.ba contains BASIC code for an assembler for the Model 100.  This is a
very limited assembler, meant only to be used to create a better assembler
written in assembly.
Assembly reference: 8080/8085 ASSEMBLY LANGUAGE PROGRAMMING MANUAL
                    1977,1978 Intel Corporation
This assembler only understands lowercase.
It understands only the following:
 - usual 8085 opcodes
 - numbers in hexadecimal only
    - these must begin with a decimal digit and terminate with 'h', e.g.:
      0ab43h
 - the usual operands a b c d e h l m sp psw
 - labels
 - immediate operands may be labels or hex numbers only
 - assembler directives:
    - org
    - end -- everything after "end" in the file is ignored
          -- this is required
    - db  -- byte data as hex numbers only
    - dw  -- word data as hex numbers only
    - ds  -- the number of bytes may only be given as a hex number, not a label
 - comments -- from ';' to the end of the line
The assembly code is converted to machine code directly in the Model 100's RAM.
You could then use the BASIC keyword SAVEM to put the machine code in a file.
There are errors that this assembler won't catch; try not to make any.
*** Program Documentation ***
Before running the assembler:
- Your input file must start with an org directive.
- The assembler will only write to RAM from HIMEM to MAXRAM-1, so you need to
  make sure your program will fit in that space.  See documentation on the BASIC
  CLEAR keyword for help with this.
** 1000 main
The assembler makes two passes of the file.  The first pass makes sure that all
writes to RAM will be within the range from HIMEM to MAXRAM-1, and determines
the values of all the labels.  The second pass writes the program to RAM.
The main program starts at line 1000, and calls the subroutine at 10000 once
for each pass.
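To make the two-pass idea concrete, here is a rough Python sketch.  It is not
a translation of asm.ba: it only handles org, nop, jmp, labels and end, the
names run_pass, operand_value and assemble are made up for illustration, and
it skips the HIMEM/MAXRAM bounds check that the real program does between
passes.

```python
# Toy two-pass assembler (illustrative only). It handles just "org", "nop",
# "jmp", labels and "end"; hex parsing is simplified compared with asm.ba.
OPCODES = {"nop": (0, 0), "jmp": (195, 2)}  # mnemonic -> (opcode, operand bytes)


def operand_value(token, labels, pass_no):
    if token[0].isdigit():              # hex literal such as 0f000h
        return int(token[:-1], 16)
    if token in labels:
        return labels[token]
    if pass_no == 1:                    # forward reference: placeholder for now
        return 0
    raise ValueError("undefined label: " + token)


def run_pass(lines, labels, pass_no, memory):
    pc, lo, hi = 0, None, None
    for line in lines:
        tokens = line.split(";")[0].split()     # drop comment, split on spaces
        if tokens and tokens[0].endswith(":"):  # label definition
            if pass_no == 1:
                labels[tokens[0][:-1]] = pc
            tokens = tokens[1:]
        if not tokens:
            continue
        if tokens[0] == "end":
            break
        if tokens[0] == "org":
            pc = operand_value(tokens[1], labels, pass_no)
            continue
        opcode, operand_bytes = OPCODES[tokens[0]]
        out = [opcode]
        if operand_bytes == 2:
            value = operand_value(tokens[1], labels, pass_no)
            out += [value % 256, value // 256]  # low byte first
        for byte in out:
            if pass_no == 1:                    # only track the address range
                lo = pc if lo is None else min(lo, pc)
                hi = pc if hi is None else max(hi, pc)
            else:                               # pass 2: actually "poke"
                memory[pc] = byte
            pc += 1
    return lo, hi


def assemble(lines):
    labels, memory = {}, {}
    lo, hi = run_pass(lines, labels, 1, memory)
    run_pass(lines, labels, 2, memory)
    return labels, memory, lo, hi
```

The forward reference to a label is why two passes are needed: on the first
pass an unknown label just gets a placeholder value, and by the second pass
every label has an address.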
** Variables:
    FN$ input filename
    PC! next location in RAM to write ('program counter')
    PS% pass number
    LM! lowest RAM address changed
    HM! highest RAM address changed
    CL! current line in input file
    SC% used for boolean return values from subroutines
    NL% boolean indicating whether a newline was read from input file
    TK$, TT$ tokens read from input file
    OC! instruction opcode
** 10000 single pass
The subroutine at line 10000 handles one pass.  It is a big loop which reads
and assembles one instruction or directive at a time.  The following cases
are treated within the loop:
    line number     case
    10060           all instructions listed in A1$
    10100           all instructions listed in A2$
    10150           all instructions listed in A3$
    10200           all instructions listed in A4$
    10250           all instructions listed in A5$
    10300           all instructions listed in A6$
    10350           lxi instruction
    10400           mov instruction
    10450           mvi instruction
    10500           rst instruction
    10550           org assembler directive
    10595           labels
    10610           all conditional return instructions
    10650           all conditional call instructions
    10700           all conditional jump instructions
    10750           end assembler directive
    10800           ds  assembler directive
    10890           db  assembler directive
    10950           dw  assembler directive
** 2000 init
The subroutine at line 2000 sets up some string variables before the first pass.
** 9000 get token
The subroutine at line 9000 gets the next token from the input file.  It skips
whitespace and comments.
Recognized tokens are:
    - a newline
    - a comma
    - a string of alphanumeric characters followed immediately by a colon
    - a string of alphanumeric characters
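If the end of the file is reached while looking for a token, the assembler
stops with an error (line 8310 jumps to the error handler at line 200); this
is why the "end" directive is required.
Otherwise, if the token was a newline, NL%=1 upon return;
else TK$ contains the token found.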
** 5000 hex string to number
The subroutine at line 5000 attempts to read a hexadecimal number in NB$.  The
number must be in the range 0h to 0ffffh.  If successful, upon return SC%=1
and NM! contains the number read.  Otherwise, SC%=0.
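For readers who don't want to trace the BASIC, here is a small Python sketch
of the same digit-by-digit conversion.  The function name parse_hex is made up,
and returning None stands in for SC%=0.  Like the BASIC routine, it does not
itself check for a leading decimal digit (the callers do that with the
subroutine at 8100) and it does not enforce the 0ffffh limit.

```python
def parse_hex(token):
    """Return the value of a token like '0ab43h', or None if it is malformed."""
    if len(token) < 2 or not token.endswith("h"):
        return None                      # needs at least one digit plus 'h'
    value = 0
    for ch in token[:-1]:                # every character before the 'h'
        if "0" <= ch <= "9":
            digit = ord(ch) - 48         # ASCII '0' is 48
        elif "a" <= ch <= "f":
            digit = ord(ch) - 87         # ASCII 'a' is 97, and 97 - 87 = 10
        else:
            return None                  # uppercase or any other character
        value = value * 16 + digit
    return value
```

For example, parse_hex("0ffh") gives 255, while parse_hex("0FFh") gives None
because only lowercase is accepted.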
** 8000 isalpha
The subroutine at line 8000 expects a single-character string in CC$.  It checks
whether the character is a lowercase letter: if so it returns with SC%=1, else
it returns with SC%=0.
** 8100 isnum
The subroutine at line 8100 expects a single-character string in CC$.  It checks
whether the character is a decimal digit: if so it returns with SC%=1, else
it returns with SC%=0.
** 8200 isalphanum
The subroutine at line 8200 expects a single-character string in CC$.  It checks
whether the character is a lowercase letter or a decimal digit: if so it
returns with SC%=1, else it returns with SC%=0.
** 8300 get next char
BF%=0:
The subroutine at line 8300 reads the next character from the input file, which
must not yet be at eof.  The character is put in CB$ and, if the character is
newline, NL% is set to 1, else NL% is set to 0.  Also, whenever newline is
read, the variable CL! is incremented.
BF%=1:
The subroutine sets BF% back to 0, and returns whatever it returned the last
time it was called.  Setting BF% to 1 before calling basically "unreads" the
last character read from the input file.
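The "unread" trick is a one-character pushback buffer.  A minimal Python
sketch of the idea follows; the class and method names are made up, and the
attributes last, pushed_back and line play the roles of CB$, BF% and CL!.

```python
class CharReader:
    """Reads characters one at a time, with one character of pushback."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.last = None          # last character read (like CB$)
        self.pushed_back = False  # like BF%
        self.line = 1             # like CL!

    def next_char(self):
        if self.pushed_back:      # hand back the previous character again
            self.pushed_back = False
            return self.last
        if self.pos >= len(self.text):
            raise EOFError("read past end of input")
        self.last = self.text[self.pos]
        self.pos += 1
        if self.last == "\n":     # count lines only on a real read
            self.line += 1
        return self.last

    def unread(self):
        self.pushed_back = True
```

Because the line counter only moves on a real read, unreading a newline and
reading it again does not count the line twice.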
** 8400 iswhitespace
The subroutine at line 8400 expects a single-character string in CC$.  It checks
whether the character is whitespace: if so it returns with SC%=1, if
not it returns with SC%=0.  Here we consider any character with ASCII code
<= 32 (20h) to be whitespace.
** 2300 get imm1
The subroutine at line 2300 reads the next token from the input file, expecting
it to be a valid one-byte immediate operand (either a hex number or label, with
value in range 0h to 0ffh).  The value of this byte is written to RAM at the
current PC! location, and the PC! is incremented.
** 2500 get imm2
The subroutine at line 2500 reads the next token from the input file, expecting
it to be a valid two-byte immediate operand (either a hex number or label, with
value in range 0h to 0ffffh).  The value of this word is written to RAM at the
current PC! location, and the PC! is increased by 2.
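The word is written low byte first, as the 8085 expects.  The helper at line
2600 splits the word by first subtracting 32768 from values above 32767,
presumably to keep the operands of \ and MOD inside BASIC's integer range
(that motive is my assumption).  The Python sketch below mirrors those steps;
the function name is made up.

```python
def split_word_basic_style(nm):
    """Split a 16-bit value into (low byte, high byte) the way line 2600 does."""
    if nm <= 32767:
        dv, n = 0, nm
    else:
        dv, n = 128, nm - 32768   # account for the top bit separately
    dv += n // 256                # high byte (DV!)
    md = n % 256                  # low byte (MD!)
    return md, dv
```

The result is the same as (nm % 256, nm // 256) for every value from 0 to
65535; for 0f004h the low byte is 04h and the high byte is 0f0h.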
** 2700 get number
The subroutine at line 2700 reads the next token from the input file, expecting
it to be a valid immediate operand (either a hex number or label, with value in
range 0h to 0ffffh).  The token read is returned in TK$ and the value of the
operand is returned in NM!.  Note that if an undefined label is read, this is
allowed during the first pass but not the second pass.
** 2800 poke
The subroutine at line 2800 expects BB! to contain a number between 0h and 0ffh,
and PC! to contain the address in RAM where BB! should be written.  During the
first pass, this subroutine just keeps track of the range of RAM to be written
(in LM! and HM!).  During the second pass, this subroutine actually writes to
RAM.
    I needed a way to associate string "keys" with values.  I did this
    with 'MAP' strings in the form ".key1.val1.key2.val2... .keyn.valn".
    Every key must start with a lowercase letter, and the values must be
    decimal numbers.
** 3000 map lookup
The subroutine at line 3000 looks up the value associated with key SS$ in MAP
MP$.  If the key is found, the value is put in NM! and the subroutine returns
with SC%=1.  Otherwise it returns with SC%=0.
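Here is a Python sketch of the same lookup on a ".key.val.key.val" string.
The function name map_lookup is made up, and returning None stands in for
SC%=0.

```python
def map_lookup(map_string, key):
    """Look up key in a '.key1.val1.key2.val2' string; None if it is absent."""
    needle = "." + key + "."
    pos = map_string.find(needle)
    if pos < 0:
        return None
    rest = map_string[pos + len(needle):]   # text right after ".key."
    end = rest.find(".")
    if end >= 0:                            # not the last entry: cut at next dot
        rest = rest[:end]
    return int(rest)                        # int() tolerates a leading space


A4 = ".dad.9.dcx.11.inx.3.ldax.10.pop.193.push.197.stax.2"
print(map_lookup(A4, "push"))
```

Searching for ".key." with both dots is what keeps "in" from matching inside
"inr" or "inx".  Since values are decimal numbers and keys start with a
letter, a key can never be mistaken for a value.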
** 3200 add field into opcode
The subroutine at line 3200 gets the next token from the input file and uses
it as a key to look up in MAP MP$.  The value found is added into OC!.
** MAPS
A1$ contains opcodes for instructions which take no operands
A2$ contains opcodes which take one operand which is the name of a register,
    which is encoded into the opcode by adding the value in B2$ associated
    with the register into the opcode
A3$ contains opcodes which take one operand which is the name of a register,
    which is encoded into the opcode by adding the value in B3$ associated
    with the register into the opcode
A4$ contains opcodes which take one operand which is the name of a register pair,
    which is encoded into the opcode by adding the value in B4$ associated
    with the register pair into the opcode
A5$ contains opcodes which take a one-byte immediate operand
A6$ contains opcodes which take a two-byte immediate operand
B2$ see A2$
B3$ see A3$
B4$ see A4$
B5$ for rXX, jXX, cXX instructions (XX being the condition code), the value
    associated with the condition code in B5$ is added into the opcode,
    which sets bits 5, 4, 3 of the opcode
LB$ holds the values of the labels
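
As a worked example of how the maps combine, here is a Python sketch that
builds a few opcodes by addition.  The dictionaries hold the same numbers as
B2$, B3$ and B5$, and the base values 64, 192, 196 and 194 come from lines
10410, 10620, 10660 and 10710; the function names are made up.

```python
B2 = {"a": 7, "b": 0, "c": 1, "d": 2, "e": 3, "h": 4, "l": 5, "m": 6}
B3 = {"a": 56, "b": 0, "c": 8, "d": 16, "e": 24, "h": 32, "l": 40, "m": 48}
B5 = {"nz": 0, "z": 8, "nc": 16, "c": 24, "po": 32, "pe": 40, "p": 48, "m": 56}
CONDITIONAL_BASE = {"r": 192, "c": 196, "j": 194}  # return, call, jump


def mov_opcode(dest, src):
    """mov dest,src: base 64, destination field from B3, source field from B2."""
    return 64 + B3[dest] + B2[src]


def conditional_opcode(mnemonic):
    """rXX / cXX / jXX: first letter picks the base, the rest is the condition."""
    return CONDITIONAL_BASE[mnemonic[0]] + B5[mnemonic[1:]]


print(mov_opcode("a", "m"), conditional_opcode("jz"))
```

So "mov a,m" is 64 + 56 + 6 = 126 (7eh) and "jz" is 194 + 8 = 202 (0cah).
Mnemonics such as jmp, call, cmp and ret never reach the conditional case
because they are matched earlier in A1$, A2$ and A6$.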
