First Attempt at raw 8086/88 assembling...

A project log for Improbable AVR -> 8088 substitution for PC/XT

Probability this can work: 98%, working well: 50% A LOT of work, and utterly ridiculous.

esot.eric 01/12/2017 at 10:19 (7 Comments)

Here we go! First attempt at assembling raw 8088/86 code...

The point, here, is to create raw binary machine-code that can be directly loaded into the BIOS/ROM chip... NOT to create an executable that would run under an operating system like linux, or DOS.

(Though, as I understand, technically this output is identical to a .COM file, in DOS, or could also plausibly be written to a floppy disk via 'dd').

This needs: 'bin86' and maybe 'bcc' both of which are available packages in debian jessie, so probably most linux distros...

Note that you can't use the normal 'as' assembler (a.k.a. 'gas', the GNU assembler) that goes along with gcc (if you're using an x86)... because, by default, it will assemble 32-bit (or 64-bit) code, rather than 16-bit code. (See note at the bottom)


So, here's a minimal buildable example:

#  "Without the export line, ld86 will complain ld86: no start symbol."

export _main
_main:
   nop   ; -> 0x90 (XCHG AX?)

   xchg ax, ax ; Yep, this assembles identically with as86 (not with gnu as)

Note that, technically, the 8088/86 doesn't have a dedicated "NOP" instruction... It's implemented by the (single-byte) opcode associated with 'xchg ax, ax'.

I got a little carried away with my makefile. Hope it's not too complicated to grasp:

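#This requires bcc and its dependencies, including bin86, as86, etc.
#NOTE: the target names below (all, clean, hexdump, objdump) are a guess,
#      and recipe lines must begin with a tab.

target = test

asmTarget = $(target).a.out

ldTarget = $(target).bin

.PHONY: all clean hexdump objdump

#Assemble, then link; -d tells ld86 to delete the header from its output
all:
	as86 -o $(asmTarget) $(target).s
	ld86 -o $(ldTarget) -d $(asmTarget)

clean:
	rm -f $(asmTarget) $(ldTarget)

hexdump: all
	hexdump -C $(ldTarget)

#objdump86 only shows section-info, can't do disassembly :/
objdump: all
	objdump86 $(asmTarget)
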
Note that you can't *only* use as86 (without using ld86), because, like normal ol' 'as', it produces an object file with header-information, symbols, etc... meant to be linked into something that runs under an operating-system.

ld86 links that... or, really, in our case... unlinks all that header/symbol information, basically extracting a raw binary.

So, now, if you look at the output of hexdump (or objdump86) you'll see the file starts with 0x90 0x90, as expected.

I'm not yet sure why, but the file is actually four bytes... 0x90 0x90 0x00 0x00.

Guess we'll come back to that.
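
For a quick check without hexdump, the same inspection can be sketched in Python. The helper name starts_with_nops is made up for illustration, and the sample bytes are simply the four bytes reported above typed in by hand, not read from a real file.

```python
def starts_with_nops(image: bytes, count: int = 2) -> bool:
    """Return True if the first `count` bytes are all 0x90 (NOP / XCHG AX,AX)."""
    return len(image) >= count and image[:count] == b"\x90" * count


# The four bytes hexdump showed for the linked binary (typed in by hand here;
# for a real file, use open("test.bin", "rb").read() instead).
sample = bytes([0x90, 0x90, 0x00, 0x00])

# Print the size, the bytes in hex, and whether the file begins with two NOPs.
print(len(sample), sample.hex(" "), starts_with_nops(sample))
```

This only confirms the leading opcodes; it says nothing about where the two trailing zero bytes come from.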


So, my plan is to pop the original 8088 PC/XT-clone's ROM/BIOS chip, and insert a new one that contains nothing but a "jump" to one of the (many) other (empty) ROM sockets, where I'll write my own code in another chip.

Actually, what I think I'll do is copy the original ROM/BIOS and piggy-back another chip right atop the copy (keeping the original in a safe location). Then I'll put a SPDT switch between the /CS input from the socket and the two ROMs' /CS pins (and a couple pull-up resistors). That way I can easily choose whether I want to boot with the normal BIOS or whether I want to boot with my experimental code.

I guess I'll have to make sure that my secondary/experimental ROM chip does NOT start with 0x55 0xAA (i.e. the word 0xAA55), as that's an indicator to a normal BIOS that the chip contains a ROM expansion (for those times when I want to boot normally). Maybe the easiest/most-reliable way would just be to start it with 0x90 (nop), then have my custom code run thereafter.
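
A small sanity check along those lines, sketched in Python. The function name looks_like_option_rom is hypothetical, and the only thing tested is the two signature bytes (0x55 first, then 0xAA, which reads as the little-endian word 0xAA55); a real BIOS also looks at a length byte and a checksum, which this sketch ignores.

```python
# Expansion-ROM signature as it sits in the chip: byte 0 = 0x55, byte 1 = 0xAA.
OPTION_ROM_SIGNATURE = b"\x55\xAA"


def looks_like_option_rom(image: bytes) -> bool:
    """Return True if the image begins with the expansion-ROM signature bytes."""
    return image[:2] == OPTION_ROM_SIGNATURE


# An image that starts with a NOP cannot be mistaken for an expansion ROM.
experimental = bytes([0x90, 0x90, 0x00, 0x00])
print(looks_like_option_rom(experimental))
```

Running this over the binary before burning it is a cheap way to make sure the stock BIOS will leave the experimental chip alone.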


So, so far this doesn't take into account the jumping. Note that x86s start executing at address 0xFFFF0 after reset (CS:IP = FFFF:0000 on the 8088/86); they expect a "jump" from there to [most-likely] the beginning of the ROM/BIOS chip's memory-space, where the actual code will begin.

So, next I'll have to learn how to tell 'as86' that I want [some of] my code to be located at 0xffff0... and I suppose that means I need to make sure my EPROM is the right size to do-so, based on whatever address-space the original ROM occupied... and, obviously, my EPROM won't be 1MB, so... what location would I actually have to direct it to...? Maybe 64KB - 0x10 (offset 0xFFF0), if the chip is 64KB and sits at the very top of the address-space.
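
The arithmetic can be sketched in Python. This assumes the EPROM is mapped so that its last byte lands at physical address 0xFFFFF (no mirroring, no gaps); the function name and the ROM sizes in the loop are only examples, not the actual size of the chip in this machine.

```python
RESET_VECTOR = 0xFFFF0     # physical address the 8088/86 fetches from after reset
ADDRESS_SPACE = 0x100000   # 1 MB, i.e. 20 address lines


def reset_vector_offset(rom_size: int) -> int:
    """Offset of the reset vector inside a ROM whose last byte sits at 0xFFFFF."""
    if rom_size < 16 or rom_size > ADDRESS_SPACE:
        raise ValueError("ROM must cover at least the last 16 bytes of the 1 MB space")
    rom_base = ADDRESS_SPACE - rom_size   # physical address of the ROM's first byte
    return RESET_VECTOR - rom_base


# Example sizes only: the offset always ends up 16 bytes before the end of the chip.
for kb in (8, 32, 64):
    print(f"{kb:2d} KB ROM: reset vector at offset 0x{reset_vector_offset(kb * 1024):04X}")
```

Whatever the chip size, the answer is "size minus 16", since the reset vector is the last 16 bytes of the address-space.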

Or, probably easier, would be to compile the jump, just as I did above, and tell my EPROM programmer to load the resulting binary-file at that location. Not sure all EPROM programmers have that option, but mine does.
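
For programmers that can't load a file at an offset, an alternative is to build the whole image up front. A minimal Python sketch, assuming a hypothetical 8 KB EPROM mapped at 0xFE000-0xFFFFF (so its first byte is FE00:0000); the helper names far_jump and build_image are invented here. The only hard fact relied on is the encoding of the direct far jump: opcode 0xEA, then the 16-bit offset, then the 16-bit segment, both little-endian.

```python
import struct


def far_jump(segment: int, offset: int) -> bytes:
    """Encode 'JMP FAR segment:offset': 0xEA, 16-bit offset, 16-bit segment."""
    return struct.pack("<BHH", 0xEA, offset, segment)


def build_image(rom_size: int, code: bytes, jump: bytes) -> bytes:
    """Place `code` at offset 0 and `jump` 16 bytes before the end of the image."""
    jump_at = rom_size - 16
    if len(code) > jump_at or len(jump) > 16:
        raise ValueError("code or jump does not fit in the ROM image")
    image = bytearray(b"\xFF" * rom_size)   # 0xFF is what an erased EPROM reads as
    image[:len(code)] = code
    image[jump_at:jump_at + len(jump)] = jump
    return bytes(image)


# Hypothetical layout: 8 KB chip at the top of memory, code starts at FE00:0000.
ROM_SIZE = 8 * 1024
code = bytes([0x90, 0x90])                  # the two NOPs from the test program
image = build_image(ROM_SIZE, code, far_jump(0xFE00, 0x0000))

# Show the image size, the first two bytes, and the five jump bytes.
print(hex(len(image)), image[:2].hex(" "), image[ROM_SIZE - 16:ROM_SIZE - 11].hex(" "))
```

Writing the result out with open("rom.bin", "wb") would give a file the programmer can burn from offset 0; the segment and ROM size need adjusting to whatever the real socket decodes.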



Attempt with normal 'as', had the following:

xchg ax, ax
This wouldn't assemble... something about too many memory-operands (does it think 'ax' is a variable name?) So tried:
xchg %ax, %ax
Or something similar (was it 'ax' or 'AX'?)

Anyways, the final result was 0x90 0x66 0x90

0x66 isn't a documented opcode on the 8088/86...

Thought, maybe, since I was writing in assembly, I could go ahead with the normal 'as', and just limit my assembly-instructions to those I know are on the 8088/86, but I guess not.


Yann Guidon / YGDES wrote 01/12/2017 at 18:43 point

Oh my, I wish I didn't know so many of the dirty x86 idiosyncrasies but ...

0x66 is a data size prefix, introduced in the i386: before a "%ax", it means that the assembler expects the platform to run in 32-bit mode.

as does just that, as it is a compiler back-end for UNIX machines...

fix: use nasm.


esot.eric wrote 01/13/2017 at 03:01 point

@Shaos and @Yann Guidon / YGDES, Thanks for that info! I haven't heard of nasm before. 

Looks like it's not designed *explicitly* for 16-bit systems, so I assume there must be an argument to tell it to do-so. Have no fear, I'll 'man'-up on that soon :)

The fact it outputs *raw* binary looks like a nice feature! And disassembly! Woot!


Yann Guidon / YGDES wrote 01/13/2017 at 03:08 point

15 years ago, I wrote a huge program using nasm, hosted on a diskette, mixing real and protected mode in 16- and 32-bit modes...

My .COM and .EXE header macros seem to still be included in the package :-)


SHAOS wrote 01/13/2017 at 03:10 point

Yes, I use your macros, Yann :)

P.S. @esot.eric read this


esot.eric wrote 01/13/2017 at 07:07 point

@Yann Guidon / YGDES sounds cool... I'll keep my eyes out for 'em.

@Shaos good link, thanks! 

I hear nasm's big selling-point is its documentation, there's a lot of it!


SHAOS wrote 01/12/2017 at 18:15 point

yes, NASM is the best :)

and it's available for Linux as well:

> aptitude show nasm

Package: nasm                            
State: installed
Automatically installed: no
Version: 2.10.01-1
Priority: optional
Section: devel
Maintainer: Anibal Monsalve Salazar
Architecture: amd64
Uncompressed Size: 3,113 k
Depends: libc6 (>= 2.4), dpkg (>= 1.15.4) | install-info
Conflicts: nasm
Description: General-purpose x86 assembler
 Netwide Assembler.  NASM will currently output flat-form binary files, a.out, COFF and ELF Unix object files, and Microsoft 16-bit DOS and Win32 object files.
 Also included is NDISASM, a prototype x86 binary-file disassembler which uses the same instruction table as NASM.
 NASM is released under the GNU Lesser General Public License (LGPL).


Yann Guidon / YGDES wrote 01/12/2017 at 18:07 point

Use nasm.

I'm sure @Shaos will agree ;-)
