Adding 16 bit ISA support

A project log for Beckman DU 600 Reverse Engineering

Reverse engineering process for a 68332 based system

joe.zatarski, 12/22/2016 at 23:12

Today I will be adding 4 project logs, 3 of which are a bit overdue.

For the first, sometime back in June of last year, I began to look into the simple ISA implementation used to interface the Cirrus Logic CL-GD5429 SVGA chip to the 68332. My intention was to find out if the limited subset of the ISA signals was enough to add a simple PATA interface without much trouble.

What I discovered is that most of the ISA signals are created within U44, a GAL22V10. The pinout, as best as I could tell, is listed in the hardware description. Some of the pins are not used directly but still carry signals, which are fed back and reused internally in the GAL. These have been marked NC, and little to no attempt was made to discover what exactly they do or whether they are useful.

My memory is a bit fuzzy now, but I think a single transfer takes at least 4 cycles of the CPU clock before DSACK (more if the device requests additional wait states). This means the ISA bus implementation is effectively running at 1/4 the speed of the CPU clock, or about 4 MHz (given the CPU runs at approximately 16 MHz), for an absolute maximum of 4 MT/s.
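As a quick sanity check of that arithmetic, here is a minimal Python sketch. The 16 MHz clock and 4 cycles per transfer are the approximate figures recalled above, not measurements, and the function name is made up for illustration.

```python
def max_transfer_rate(cpu_hz, cycles_per_transfer, wait_states=0):
    """Upper bound on bus transfers per second.

    cycles_per_transfer is the minimum number of CPU clocks before DSACK;
    wait_states are extra clocks requested by the device (assumed constant).
    """
    return cpu_hz / (cycles_per_transfer + wait_states)


# Approximate figures from the text: ~16 MHz CPU, at least 4 clocks per transfer.
rate = max_transfer_rate(16_000_000, 4)
print(f"{rate / 1e6:.1f} MT/s")
```

Any wait states requested by the device only lower this figure, which is why 4 MT/s is a ceiling rather than a typical rate.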

However, during this endeavor, I noted that the video chip was only wired for 8 bit transfers! This not only cuts the potential performance in half, but also makes adding a PATA bus difficult, since 16 bit transfers have been a requirement of the PATA bus since the ATA-2 spec, if I'm not mistaken. (Of course, neither of these two things is actually an issue if you're using the board as it was intended, in a Beckman DU600. Graphics speed was certainly no concern, and board layout simplicity was probably a bigger concern.)

Maybe the order of these two things is actually reversed, but the result is the same: I set out to develop circuitry that would implement 16 bit ISA transfers, not only to improve graphics performance, but also to allow the possibility of adding a PATA bus at a later date.

I began reading the ISA specs, and refreshing my memory on the way CPU32 dynamic bus sizing worked.

The 68332 works on an asynchronous bus design: That is, the CPU reads or writes a memory location, and then waits for the device to acknowledge the transfer. This is a DSACK, or Data and Size ACKnowledge. The 68332, having a 16 bit external bus, allows 16 bit or 8 bit transfers per cycle. Whether the transfer was 8 or 16 bits is determined by the device being accessed. There are two DSACK signals, DSACK0 and DSACK1. By asserting one or the other DSACK line, the device can acknowledge an 8 bit, or a 16 bit transfer. Additionally, there are two SIZ output signals from the 68332, SIZ0 and SIZ1. Because the 68K line is 32 bits internally, the 68332 considers transfers up to 32 bits as a single transfer, even though they take two physical transfers on the 16 bit bus. The SIZ outputs indicate the number of bytes left to be transferred, 1, 2, 3, or 4 (0 means 4). The only way to get 3 is after a single byte transfer on a 4 byte operand transfer.
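To make the SIZ encoding and the bus sizing sequence concrete, here is a small Python model. It is only a sketch of the behaviour described above, assuming an aligned operand and a port that always acknowledges the same width; the function names are hypothetical.

```python
def siz_to_bytes(siz1, siz0):
    """Decode SIZ1:SIZ0 into the number of bytes left to transfer (00 means 4)."""
    n = (siz1 << 1) | siz0
    return 4 if n == 0 else n


def bus_cycles(operand_bytes, port_bytes):
    """Bytes-remaining count signalled on each bus cycle of one operand transfer.

    port_bytes is 1 for a port that always gives an 8 bit DSACK,
    2 for a port that always gives a 16 bit DSACK.
    """
    remaining = operand_bytes
    cycles = []
    while remaining > 0:
        cycles.append(remaining)
        remaining -= min(port_bytes, remaining)
    return cycles


print(bus_cycles(4, 2))  # long word to a 16 bit port: two cycles
print(bus_cycles(4, 1))  # long word to an 8 bit port: four cycles, including a 3
```

The second call shows where the 3-byte case comes from: it only appears after the first byte of a 4 byte operand has been acknowledged as an 8 bit transfer.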

During a 32 bit write, the 68K will always place the upper word on the bus first. During a 16 bit write, the 68K just places the entire word on the bus. Lastly, during an 8 bit write, the 68K actually duplicates the 8 bit data on both halves of the bus. This is done so that 16 bit devices can use the half of the bus they prefer (depending on ADDR0) but 8 bit devices will always have the data available on the upper half of the bus. During a 24 bit write (a special case of a partially completed 32 bit write), the 68K will duplicate the upper byte across the data bus, similar to the 8 bit write.
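The write cases above can be summarised in a few lines of Python. This models only what the paragraph describes (it is not taken from the CPU32 manual's tables), and `write_lanes` is a name invented for the example.

```python
def write_lanes(value, size_bytes):
    """Return (D15-D8, D7-D0) as driven on the first bus cycle of a write.

    size_bytes is the number of bytes still to be transferred (1 to 4).
    """
    data = value.to_bytes(size_bytes, "big")
    if size_bytes in (1, 3):
        # Byte write, or 3 bytes remaining: the upper byte is duplicated
        # on both halves of the bus.
        return data[0], data[0]
    # Word write, or first cycle of a long word: the upper word goes out whole.
    return data[0], data[1]


print([hex(b) for b in write_lanes(0x11223344, 4)])  # upper word first
print([hex(b) for b in write_lanes(0xAB, 1)])        # byte duplicated
```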

During a 32 bit or 16 bit read, the 68K will always latch an entire word from the data bus, but depending on DSACK, may only use the upper byte if the transfer acknowledged was only 8 bit. During an 8 bit read, or a 24 bit read (again, special case of partially completed 32 bit transfer) the 68K will only use one byte of the data from the bus. If the transfer was 16 bits, the 68K will use the upper or lower byte depending on ADDR0. If the transfer was 8 bits, the 68K will always use the data from the upper byte of the bus.
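The byte selection on reads can be sketched the same way. Again this is just a model of the rules in the paragraph above, with a hypothetical function name.

```python
def read_byte(bus_word, port_bytes, addr0):
    """Pick the byte the CPU keeps from a 16 bit bus word on a single-byte read.

    port_bytes: 1 if the device acknowledged an 8 bit transfer, 2 for 16 bit.
    addr0: the ADDR0 line (0 = even address, 1 = odd address).
    """
    upper = (bus_word >> 8) & 0xFF
    lower = bus_word & 0xFF
    if port_bytes == 1:
        # 8 bit ports always present their data on the upper half of the bus.
        return upper
    # 16 bit ports: even addresses are on the upper byte, odd on the lower.
    return lower if addr0 else upper


print(hex(read_byte(0x1234, 2, 1)))  # 16 bit port, odd address: lower byte
```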

The intention of this system is that certain devices are inherently 16 or 8 bit, and will always produce the appropriate DSACK for their bit width. 16-bit devices will handle 8 bit transfers based on SIZ0 and ADDR0. However, there is nothing preventing the device from producing the conjugate of the dynamic bus sizing. That is, acting as an 8 bit device when the transfer is an 8 bit transfer, and acting as a 16 bit device when the transfer is larger. This is somewhat how the 16 bit ISA signals were implemented in my circuitry.

The ISA bus works as a synchronous bus, in contrast to the 68K bus. A synchronous bus normally completes a bus cycle in a specific number of cycles. Instead of the device acknowledging the transfer when it is done, the device will request more time when it's not done. This is accomplished by the IOCHRDY line (which is also used for memory, despite the name).

When a device supports a 16 bit transfer, it uses the /IOCS16 and/or /MEMCS16 lines to indicate this to the host as soon as it has decoded its memory or I/O address. On the other hand, when the host is capable of transferring data on the extra 8 data lines (D8-D15), it asserts the SBHE signal (system bus high enable, active low). Depending on the A0 line, this either indicates an 8 bit transfer capable of using the upper data bits, or a 16 bit transfer. If the device is capable of a 16 bit transfer, and the host is attempting to transfer using the upper 8 data lines, then the transfer *will* occur using the upper data lines (and a 16 bit transfer will occur, if A0 is low).
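Here is one way to write that decision down in Python. Active-low signals are passed as 0 when asserted. The final fallback case (an 8 bit transfer on the lower data lines whenever SBHE or the CS16 line is not asserted) is my assumption about the remaining combinations, and the function name is invented.

```python
def isa_transfer(sbhe_n, a0, cs16_n):
    """Classify an ISA bus cycle from the host and device signals.

    sbhe_n: SBHE, active low (0 = host can use the upper data lines).
    a0:     address line A0.
    cs16_n: /IOCS16 or /MEMCS16 as appropriate, active low (0 = 16 bit device).
    """
    host_upper = sbhe_n == 0
    device_16 = cs16_n == 0
    if host_upper and device_16:
        return "16 bit" if a0 == 0 else "8 bit on upper lines"
    # Assumption: everything else falls back to the lower 8 data lines.
    return "8 bit on lower lines"


print(isa_transfer(sbhe_n=0, a0=0, cs16_n=0))
print(isa_transfer(sbhe_n=0, a0=1, cs16_n=0))
print(isa_transfer(sbhe_n=1, a0=0, cs16_n=0))
```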

For my design, I use the SIZ0 line for SBHE. SIZ0 is high whenever there is a single byte to be transferred, so this makes all 8 bit transfers use the lower half of the data bus. Additionally, I perform an 8 bit DSACK any time SIZ0 is high. I perform a 16 bit DSACK any time SIZ0 is low and the device supports a 16 bit transfer (IOCS16 or MEMCS16 is asserted as appropriate).
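As a truth-table style sketch of the rules above, in Python rather than gates: this is not a transcription of the actual circuit, and the case where SIZ0 is low but the device does not assert a CS16 line is assumed here to fall back to an 8 bit DSACK.

```python
def new_dsack(siz0, cs16_asserted):
    """Return (sbhe_n, dsack_width_bits) for one bus cycle.

    siz0:          the CPU's SIZ0 output (1 or 0).
    cs16_asserted: True if the device asserts IOCS16 or MEMCS16 as appropriate.
    """
    sbhe_n = siz0  # SIZ0 drives SBHE directly; SBHE is active low.
    if siz0 == 0 and cs16_asserted:
        width = 16
    else:
        # SIZ0 high always gets an 8 bit DSACK; the SIZ0-low,
        # no-CS16 case is an assumption.
        width = 8
    return sbhe_n, width


for siz0 in (0, 1):
    for cs16 in (False, True):
        print(siz0, cs16, new_dsack(siz0, cs16))
```

With SIZ0 high, SBHE is negated and the device sees an 8 bit cycle on the lower data lines; with SIZ0 low, SBHE is asserted and a 16 bit capable device gets a full 16 bit cycle.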

The schematic for the new DSACK circuitry:

This circuitry was implemented entirely using unused gates already existing on the board. Some trace cuts were necessary to use these gates since their inputs were usually tied to ground. This resulted in a mess of 30 AWG Kynar wire-wrap wire:

Additionally, the newly created IOCS16 and MEMCS16 lines had to be connected to the VGA controller, as well as SIZ0 to SBHE. With the chip being a 132 pin QFP with .65mm pitch pins, the 30 AWG Kynar was really useful. The other 8 data lines also had to be connected, taken from connector JA1. The data lines, since we're interfacing a big endian CPU to a little endian device, had to be byte-swapped. That is, D8-D15 mapped to D0-D7, and vice-versa. This does create endianness problems if 16 bit numbers are worked with, but it corrects memory ordering issues. This way, memory address 0 maps to 0, and memory address 1 maps to 1, etc.
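A short Python illustration of that trade-off, using an arbitrary example value. `swap_lanes` is a made-up name for what the wiring does.

```python
def swap_lanes(word):
    """Swap the two byte lanes of a 16 bit word (D8-D15 <-> D0-D7)."""
    return ((word & 0xFF) << 8) | (word >> 8)


cpu_word = 0x1234                 # big endian CPU: 0x12 at address 0, 0x34 at address 1
isa_word = swap_lanes(cpu_word)   # what the little endian device sees on its data lines

cpu_bytes = cpu_word.to_bytes(2, "big")     # bytes in address order, CPU view
isa_bytes = isa_word.to_bytes(2, "little")  # bytes in address order, device view

print(hex(isa_word))           # the 16 bit value is byte-swapped
print(cpu_bytes == isa_bytes)  # but each byte lands at the same address
```

So byte-wise memory contents line up on both sides, while any 16 bit quantity the device interprets as a number has to be swapped in software.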