Some background...
The main purpose of my Intel HEX project was to:
- Check how difficult it is to create non-CPU controllers using microcode
- Improve my microcode compiler based on a real project
- Give me a tool for priming the memory of FPGA-based computers at runtime when the CPU is not working or not written yet, avoiding slow recompiles
Turns out, with some breadboards, wiring, and bringing the Intel HEX component buses out of the FPGA into the physical world, the same components can be reused as a handy custom testing tool.
This tester can check:
- Address, data, some control (R/W) buses
- (EP)ROM/RAM if populated (for expected content or read / write / re-read)
- Address decode logic (e.g. mapping of ROM/RAM, repeats due to partial decode etc.)
Note that boards / computers tested this way do not need to be populated fully with ICs or other components, so it is possible to check "dead" or "not yet alive" boards too.
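As a rough illustration of the partial-decode check in the list above, a host-side script could look for repeats in a dump read back from the board. This is a minimal Python sketch, not part of the project: `mirror_period` is a hypothetical helper, and it assumes the dump has already been converted from Intel HEX into a flat `bytes` object whose length is a power of two.

```python
def mirror_period(dump: bytes, min_block: int = 256) -> int:
    """Return the smallest power-of-two block size at which `dump` repeats.

    If the result equals len(dump), no repeat was found. A smaller result
    suggests that the address lines above that block size are not decoded.
    """
    size = len(dump)
    block = min_block
    while block < size:
        first = dump[:block]
        # The dump mirrors at this block size if every block equals the first.
        if all(dump[i:i + block] == first for i in range(block, size, block)):
            return block
        block *= 2
    return size
```

Uniform content (for example an erased EPROM reading all 0xFF) also looks like a repeat, so the result is only a hint and is most useful after writing a known non-repeating pattern.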
There are two possibilities:
- CPU is not on the board, "insert" the wires into CPU socket (poor man's in-circuit emulator)
- CPU is on the board, but there is bus access that allows DMA takeover (e.g. BUSREQ/BUSACK in Z80-type buses)
What cannot be readily tested using this approach:
- Non-memory oriented bus signals (e.g. clocks, interrupts etc.)
- I/O (memory mapped) - there can be indication that it "kinda works" as those spaces will be read as whatever default the I/O device returns, and in case of write in some simple case (e.g. write latch connected to some LEDs or relays) it may work.
- I/O (separate) - with extra signal I/O space can be "forced" to appear as memory space and then it can be figured out if the data returned makes sense or not (example given below)
Hardware
For this project I used a cool little 8085-based single board computer (8085 Minimax) described and graciously provided to me by Ken Yap (thanks again!). I was in the process of soldering the board together, and decided to add a verification step before plugging in my vintage Soviet CPU, to see whether it would have any chance of working.
The hardware setup is a simple but somewhat messy affair:
A few notes:
- 8085 IO/M signal (purple wire) is simply switchable between IO (high) and M(emory) (low) using a micro-DIP on the FPGA board. The HEX I/O has no idea about it, it just sees IO space as another 64k address map. Of course that map has 256 repetitions, because typically in 8080-family systems the upper 8 address lines are not decoded for I/O (Z80 introduced 16-bit IO to some degree, and then HD64180 has a full 64k IO address map).
- 8085 has multiplexed lower address A7..A0 with data bus D7..D0, giving the AD7..AD0 bus. Typically an 8-bit latch (like 74x573) enabled by ALE captures the low address early in the memory access cycle. To simplify, I left this IC unpopulated on the board and connected the low memory address wires (gray) directly to its output socket pins.
- Minimax 8085 of course takes +5V DC - I am sourcing its modest consumption (esp. because the power hungry NMOS CPU is not there!) with a 3.3V to 5V step-up regulator.
- Connecting nominal 3.3V (FPGA) to nominal 5.0V (SBC) is a "circuit crime". But I could get away with it in this case as the modern RAM / ROM used has max "0" voltage and min "1" voltage signals that are within margins.
- 4 additional PMOD signals are used for UART - this is how the Intel HEX files are uploaded/downloaded during runtime.
This is how these connections look in VHDL (top file of the project):
--PMOD interface
JA1: inout std_logic; -- Connected to USB2UART
JA2: inout std_logic; -- Connected to USB2UART
JA3: inout std_logic; -- Connected to USB2UART
JA4: inout std_logic; -- Connected to USB2UART
JB1: out std_logic; -- GRAY 74F573.19 A0
JB2: out std_logic; -- GRAY 74F573.18 A1
JB3: out std_logic; -- GRAY 74F573.17 A2
JB4: out std_logic; -- GRAY 74F573.16 A3
JB7: out std_logic; -- GRAY 74F573.15 A4
JB8: out std_logic; -- GRAY 74F573.14 A5
JB9: out std_logic; -- GRAY 74F573.13 A6
JB10: out std_logic; -- GRAY 74F573.12 A7
JC1: out std_logic; -- WHITE 8085.21 A8
JC2: out std_logic; -- WHITE 8085.22 A9
JC3: out std_logic; -- WHITE 8085.23 A10
JC4: out std_logic; -- WHITE 8085.24 A11
JC7: out std_logic; -- WHITE 8085.25 A12
JC8: out std_logic; -- WHITE 8085.26 A13
JC9: out std_logic; -- WHITE 8085.27 A14
JC10: out std_logic; -- WHITE 8085.28 A15
JD1: out std_logic; -- PURPLE 8085.30 IO/M (low for memory access)
-- breadboard signal connections
BB1: inout std_logic; -- BLUE 8085.12 AD0
BB2: inout std_logic; -- BLUE 8085.13 AD1
BB3: inout std_logic; -- BLUE 8085.14 AD2
BB4: inout std_logic; -- BLUE 8085.15 AD3
BB5: inout std_logic; -- BLUE 8085.16 AD4
BB6: inout std_logic; -- BLUE 8085.17 AD5
BB7: inout std_logic; -- BLUE 8085.18 AD6
BB8: inout std_logic; -- BLUE 8085.19 AD7
BB9: out std_logic; -- ORANGE 8085.31 nWR
BB10: out std_logic; -- YELLOW 8085.32 nRD
Software
There are 3 different toolchains and languages coming together in this project:
- Intel HEX file - there are innumerable systems and tools producing and consuming this old but still useful format, most notably all the assemblers, cross-assemblers etc. in the 8080/Z80 ecosystem on one side, and a bunch of programmers, emulators etc. on the other. Any of them can be used in combination with this project, as long as it is "old enough" (producing just a 64k address space and only 00 and 01 type records)
- VHDL (using Xilinx ISE 14.7) - to compile it all together and generate a .bit file, standard FPGA toolchain flow. A few notes:
  - No Xilinx-specific "IP" components are used; in other words, the "vanilla" VHDL could be recompiled for other FPGA vendors
  - A lot of the VHDL code is auto-generated by the microcode compiler as boilerplate to include and/or copy from to create the microcoded controllers
- Microcode - this is a custom "language" I "invented" to simplify creating templatized controllers. There are 3 such microcoded controllers in this project:
  - HEX2MEM - accepts a stream of ASCII characters, interprets them as a valid Intel HEX file stream, and generates memory write signals
  - MEM2HEX - generates memory read signals; the bytes read are assembled into an Intel HEX file output ASCII stream
  - TTY_Screen - accepts an ASCII character stream (including some special characters such as CR, LF, CLS, HOME) and writes into a memory organized as MAXCOL columns and MAXROWS rows (in this project 80*60 for a text based VGA, but could for example be a 16*4 LCD etc.)
The software components are best explained by going through the 4 supported modes of operation:
---------------------------------------------------------------------------------------------
-- SW7 SW6  Mode           TTY (VGA)        UART TX        7seg LED
---------------------------------------------------------------------------------------------
--  0   0   sel_hexout     -                Generated HEX  mem2hex debug port (or bus if nWait = 0)
--  0   1   sel_hexin      Microcode trace  Echo UART RX   hex2mem debug port (or bus if nWait = 0)
--  1   0   sel_loopback0  Echo UART RX     Echo UART RX   Baudrate (decimal)
--  1   1   sel_loopback1  Echo UART RX     Echo UART RX   UART mode
---------------------------------------------------------------------------------------------
Mode 3 - display UART mode and loopback test
To have confidence in a test circuit, it is useful for the test circuit to test itself :-) This mode:
- Displays the UART mode in somewhat cryptic way in 6-digit 7-seg LED
UART is two separate circuits (SER2PAR and PAR2SER) that I reuse in many projects. They support a variety of 8-bit per character transmit and receive frames. The terminal program on the host should be set to the same frame format (8-N-1 in this case).
Anvyl board switches 2..0 select the mode as visible in the image below.
(note 700ms delay per line - this is to allow time for the trace of HEX2MEM microcode to display before processing next incoming character, more about this below)
- Echoes the characters coming from UART input (RX) back to output and also to TTY2VGA. This allows testing those components too.
To simplify the top-level object, the TTY, video RAM, chargen RAM and VGA controller are wrapped up in one component called TTY2VGA:
- VGA mode: 640*480, 25MHz pixel clock
- Text mode: 80*60 characters, 8*8 pixels
- Video RAM: 4k, dual port. The port connected to TTY controller is read/write (read is used when scroll-up is needed, this is of course a bad design as a top-row register pointer could allow using write on that side) and the port connected to VGA controller is read only. Both ports take x, y (row/col) and internally a hardware "multiplication" is done to find out the character code address (A = Y*MAXCOL+X)
- Character generator ROM: 128 characters (0-127 ASCII, 128-255 are simply inverted in the circuit), total of 1k
- Hardware cursor in two modes (underscore and block)
- Color is hard-coded but could easily be extended by addition of 4k color RAM, similar to video RAM
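The address calculation and the character inversion described in the list above can be sketched in a few lines of Python. This is only an illustration, not the VHDL from the project: `vram_address`, `glyph_lookup` and the `top_row` parameter are hypothetical names, and `top_row` models the top-row register pointer that the text mentions as the better design, not what TTY2VGA actually does.

```python
from typing import Tuple

MAXCOL = 80   # columns, as in this project
MAXROWS = 60  # rows, as in this project

def vram_address(x: int, y: int, top_row: int = 0) -> int:
    """Character address for column x, row y: A = Y*MAXCOL + X.

    top_row is the hypothetical top-row pointer: scrolling up by one line
    becomes top_row = (top_row + 1) % MAXROWS instead of copying memory.
    """
    if not (0 <= x < MAXCOL and 0 <= y < MAXROWS):
        raise ValueError("cursor outside the text screen")
    physical_row = (y + top_row) % MAXROWS
    return physical_row * MAXCOL + x

def glyph_lookup(code: int) -> Tuple[int, bool]:
    """Map an 8-bit character code to (glyph index 0..127, inverted flag)."""
    return code & 0x7F, bool(code & 0x80)
```

With a top-row pointer, the TTY side would only ever need to write (clearing the new bottom line), which is why the read port could be dropped.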
Mode 2 - display baudrate and loopback test
Another test mode which:
- Displays real baudrate (BCD format) on the LEDs
- Also provides character loop-back just like mode 3
Anvyl switches SW5..3 select the baudrate from 600 (000) to 57600 (111). Note that the number displayed is not exactly the standard rate, because the frequency is actually measured on the board. First, the 50MHz board clock is divided by two prescalers: one produces freq4096, which can be further divided by powers of 2 down to 1 Hz, and the other uses the selected divide value to produce the baudrate_x8 frequency:
prescale: process(CLK, baudrate_x8, freq4096, switch_uart_rate)
begin
if (rising_edge(CLK)) then
if (prescale_baud = 0) then
baudrate_x8 <= not baudrate_x8;
prescale_baud <= prescale_value(to_integer(unsigned(switch_uart_rate)));
else
prescale_baud <= prescale_baud - 1;
end if;
if (prescale_power = 0) then
freq4096 <= not freq4096;
prescale_power <= (clk_board / (2 * 4096));
else
prescale_power <= prescale_power - 1;
end if;
end if;
end process;
Eventually these two are used to feed into a counter that counts in BCD (more precisely, it has a 32-bit adder inside that can add in BCD or binary):
counter: freqcounter Port map (
reset => RESET,
clk => freq_2048(11),
freq => baudrate_x1,
bcd => '1',
add => X"00000001",
cin => '1',
cout => open,
value => baudrate_debug
);
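To make the "adds in BCD" idea concrete, here is a small Python model of an 8-digit packed BCD adder. It is a sketch under assumptions: the real freqcounter internals are not shown here, so `bcd_add` is a hypothetical name and the digit-by-digit carry scheme is simply the textbook way of doing it.

```python
def bcd_add(a: int, b: int, carry_in: int = 0):
    """Add two 32-bit packed BCD values (8 digits); return (sum, carry_out)."""
    result = 0
    carry = carry_in
    for digit in range(8):
        shift = 4 * digit
        s = ((a >> shift) & 0xF) + ((b >> shift) & 0xF) + carry
        # A digit sum above 9 wraps around and carries into the next digit.
        carry = 1 if s > 9 else 0
        if carry:
            s -= 10
        result |= s << shift
    return result, carry

# Counting one more pulse while the display shows 57599:
print(hex(bcd_add(0x00057599, 0x00000001)[0]))
```

The benefit of counting in BCD is that the value can go straight to the 7-segment digits with no binary-to-decimal conversion.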
The counter assumes that the "clk" signal has a 50% duty cycle, as it has 2 counters which work on opposite sides of the clk level. Counts accumulated on the "high" side are displayed on the "low" side and vice versa, with the net result that the count ("freq" signal) is refreshed every 1s. Because 50MHz cannot be divided by an integer to create the exact baudrate, the rates are off by less than 1%, which is of course well within timing tolerances.
This way the crucial UART frequency generation, LED debug display etc. are tested.
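The "less than 1%" claim can be checked with a quick calculation. The Python sketch below makes several assumptions that are not in the project source: the eight rates selected by SW5..3 (only 600 and 57600 are named), that baudrate_x8 runs at 8 times the baudrate as its name suggests, and that the divider is the nearest integer. The real prescale_value table may differ.

```python
CLK_BOARD = 50_000_000  # Hz, FPGA board clock from the text

# Assumed rate table for SW5..3 = 000..111; the text only names the two ends.
BAUDRATES = [600, 1200, 2400, 4800, 9600, 19200, 38400, 57600]

def actual_baudrate(nominal: int, clk: int = CLK_BOARD) -> float:
    """Baudrate produced by the nearest integer divider of clk.

    baudrate_x8 toggles once per divider period, so one full cycle of the
    x8 clock takes 2 * divider board clocks, and one bit takes 8 of those.
    """
    divider = max(1, round(clk / (2 * 8 * nominal)))
    return clk / (2 * 8 * divider)

def error_percent(nominal: int) -> float:
    """Relative deviation of the generated rate from the nominal one."""
    return 100.0 * (actual_baudrate(nominal) - nominal) / nominal

for rate in BAUDRATES:
    print(f"{rate:>6} -> {actual_baudrate(rate):9.2f} ({error_percent(rate):+.3f}%)")
```

Under these assumptions the worst case is at the fast end, where the divider is small and rounding it to an integer costs the most.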
Mode 1 - Accept ASCII stream in Intel HEX format, and write to memory
The key component here is Hex2Mem, which I intend to document better on its own page. Until I get around to it, here are a few explanations. Refer to the slightly modified VHDL and microcode.
Basic operation is as follows:
- Wait for ASCII character
- If there is one, branch to location that processes it (the ASCII code can be thought of as an "instruction")
- If invalid, output error, go to step 1
- If valid, process it based on which character it is and what is expected at that point (for example ":" can come only once at the beginning of a line; spaces or tabs can appear anywhere and are ignored, unless they fall between hex digits that should not be split, e.g. data bytes)
- Each two digits are written into one internal byte memory location (there is a small 64-byte buffer)
- As a byte is written into internal RAM, the checksum is updated
- The number of bytes received is checked with expected record length, for error check
- Final byte received is the checksum. Added to accumulated checksum it should result in 0x00 in the LSB of the checksum register
- If checksum is correct, the data bytes are written in a burst to external RAM bus. This means RAM will not be thrashed by bad checksum record
- Either CR and/or LF indicates end of record, this increments the line counter, clears the character counter (these are only used to show error message) and processing of new record can start.
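The record checks in the steps above can be mirrored on the host side, for example to validate a file before sending it. This is a minimal Python sketch of the same rules (start character, record length, record type 00 or 01, checksum summing to 0x00), not a translation of the microcode; `parse_hex_record` and its error strings are hypothetical.

```python
def parse_hex_record(line: str):
    """Validate one Intel HEX record; return (record_type, address, data).

    Raises ValueError on a bad character, length mismatch or bad checksum.
    """
    text = line.strip()
    if not text.startswith(":"):
        raise ValueError("record must start with ':'")
    try:
        raw = bytes.fromhex(text[1:])
    except ValueError:
        raise ValueError("bad character in record") from None
    if len(raw) < 5:
        raise ValueError("record too short")
    length, record_type = raw[0], raw[3]
    address = (raw[1] << 8) | raw[2]
    if len(raw) != length + 5:
        raise ValueError("bad record length")
    if record_type not in (0x00, 0x01):
        raise ValueError("bad record type")
    # All bytes including the final checksum byte must sum to 0x00 (mod 256).
    if sum(raw) & 0xFF != 0:
        raise ValueError("bad checksum")
    return record_type, address, raw[4:-1]
```

As in the hardware, the data is only handed back after the checksum has been verified, so a damaged record never reaches memory.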
This is how it is hooked up into the design:
hexin: hex2mem Port map (
clk => hex_clk,
reset_in => reset,
reset_out => open,
reset_page => page_sel, -- not really used but i8080-like system would reset at lowest 8k updated
--
debug => hexin_debug(15 downto 0),
--
nWR => nWrite,
nBUSREQ => hexin_busreq,
nBUSACK => hexin_busack,
nWAIT => nWait,
ABUS => ABUS,
DBUS => DOUT,
BUSY => hexin_busy, -- yellow LED when busy
--
HEXIN_READY => hexin_ready,
HEXIN_CHAR => hexin_char,
HEXIN_ZERO => open,
--
TRACE_ERROR => dip_traceerror,
TRACE_WRITE => dip_tracewrite,
TRACE_CHAR => dip_tracechar,
ERROR => LDT2R, -- red LED when error detected
TXDREADY => tty_sent,
TXDSEND => hexin_debug_send,
TXDCHAR => hexin_debug_char
);
Signals:
- clk (IN) - common 12.5MHz, can be virtually any speed, but fast enough to be able to keep up with incoming baudrate
- reset_in (IN) - classic reset
- reset_out (OUT) - will generate a pulse if write is detected to any of the "reset_page" 8k blocks. Typically this would be 0x0000 - 0x3FFF for "PC starts at 0" CPUs (808X, CDP1802), and 0xC000 - 0xFFFF for "reset vector" CPUs (65XX, 68XX, 99XX)
- reset_page (IN) - 8 bits, each indicates 8k block
- debug (OUT) - signals from microcode controller unit, useful for single stepping through microcode
- nWR (OUT) - Z80 style memory write signal
- nBUSREQ (OUT), nBUSACK (IN) - DMA signals. nBUSREQ is first asserted when a valid HEX record has been received. At that point, the memory write can proceed once nBUSACK goes low, which will only happen in this mode of operation.
- nWAIT (IN) - see description below
- ABUS (OUT), DBUS (OUT) - connections to system bus (in this case, wires to 8085 SBC board)
- BUSY (OUT) - blinkenlight :-)
- HEXIN_READY (IN) - connected to UART; when a valid serial character is received, the UART generates a pulse which captures the received character for processing.
- HEXIN_CHAR (IN) - 8 bit ASCII character from input stream
- TRACE_ERROR, TRACE_WRITE, TRACE_CHAR (IN) - 3 independent switches that enable tracing when error, when writing to memory or when character is received. These are simply conditions for microcode, if true then execution branches to tracing (output of a text string). This is a fundamental advantage of microcoded designs as the debug facility can be written along (or best - before) the rest of the design / code!
- ERROR (OUT) - red blinkenlight!
- TXDREADY (IN), TXDSEND (OUT), TXDCHAR (OUT) - these are connected to TTY which allows tracing to be shown on VGA. Note that writing to VGA still takes some time so the input stream should be delayed by character or line when extensive tracing is turned on.
The video is a shaky recording of a session loading a test HEX file into memory. At first it did not work because I forgot to clear the wait mode, so the component was stuck waiting to write a byte (the nWAIT condition stayed false, so repeat kept executing):
// ask CPU for memory, then write 1 byte with any number of optional wait cycles
writemem: ram_addr = bytecnt, nBUSREQ = 0;
ram_addr = bytecnt, nBUSREQ = 0, if nBUSACK then repeat else next;
ram_addr = bytecnt, nBUSREQ = 0, nWR = 0;
ram_addr = bytecnt, nBUSREQ = 0, nWR = 0, if nWAIT then next else repeat;
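The four microinstructions above can be stepped through in a small Python model to see why a forgotten wait mode stalls the write. This is a sketch, not generated from the microcode: `writemem_clocks` is a hypothetical helper that assumes one microinstruction per clock and takes the two active-low inputs as callables.

```python
def writemem_clocks(nbusack, nwait, limit=1000):
    """Count microcode clocks for one byte write, or None if still stuck.

    nbusack and nwait are callables taking the clock number and returning
    the level (0 or 1) of the active-low input sampled on that clock.
    """
    step = 0
    for clock in range(limit):
        if step == 0:            # assert nBUSREQ
            step = 1
        elif step == 1:          # repeat while nBUSACK is still high
            if not nbusack(clock):
                step = 2
        elif step == 2:          # bus granted: assert nWR
            step = 3
        else:                    # repeat while nWAIT is held low
            if nwait(clock):
                return clock + 1
    return None

print(writemem_clocks(lambda c: 0, lambda c: 1))  # bus granted, no wait
print(writemem_clocks(lambda c: 0, lambda c: 0))  # nWAIT stuck low
```

The second call models the situation in the video: nWAIT never goes high, so the last microinstruction repeats until the limit is reached.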
Finally, I typed a few random characters to show how it detected bad input and emitted error message about it:
// error codes are 1 to 6, 0 means no error
errcode: .regfield 3 values
ok,
err_badchar, // ERR1
err_unexpected, // ERR2
err_badchecksum, // ERR3
err_badrecordtype, // ERR4
err_badrecordlength, // ERR5
same
default same;
While I was fiddling with WAIT, the host was sending data, and because there is no handshake, many bytes got lost. Eventually it sync'd up with ":" record start character and after that it wrote to RAM and output the trace:
if TRACE_WRITE then next else nextaddr;
emit(char_A); // A[address]=data
emit(char_open);
printaddr();
emit(char_close);
printram();
The wait circuit is implemented in top level component, because it is reused by HEX2MEM and MEM2HEX. It is triggered by either component activating nRD or nWR signal (nAccess signal). That means memory operation is requested. If the WAIT is enabled (a S/R flip/flop controls that) then nWAIT is locked low until a button is pressed. This way each memory access can be inspected (the A and DBUS values appear on the 7seg LED which is conveniently 6 digits on Anvyl so 4 hex A and 2 hex DBUS can be displayed).
The FF below has a little trick: its clock is multiplexed depending on its own state. When not waiting (nWait = '1') it is triggered by nRD or nWR going low, but once waiting, a press of button(3) flips it around. Therefore:
- start wait mode: button(1)
- advance: button(3)
- stop wait: button(2), reset, or changing mode (reset_sw signal)
-- Wait signal
wait_ena <= not (reset or reset_sw or button(2) or wait_dis);
wait_dis <= not (button(1) or wait_ena);
wait_clk <= (not nAccess) when (nWait = '1') else button(3);
on_wait_clk: process(wait_dis, wait_clk) -- wait_dis is the asynchronous preset read below, so it belongs in the sensitivity list
begin
if (wait_dis = '1') then
nWait <= '1';
else
if (rising_edge(wait_clk)) then
nWait <= not nWait;
end if;
end if;
end process;
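The button behaviour of the wait circuit can also be modelled at the event level, which is handy for reasoning about it without a simulator. The Python class below is only a sketch: `WaitCircuit` is a hypothetical name, it ignores reset and mode changes, and it treats each call to `access()` as the falling edge of nRD or nWR at the start of one bus cycle.

```python
class WaitCircuit:
    """Event-level model of the single-step wait logic (active-low nWait)."""

    def __init__(self):
        self.enabled = False   # state of the S/R latch (wait_ena)
        self.n_wait = 1        # 1 = bus cycle may complete, 0 = held

    def press(self, button: int) -> None:
        if button == 1:                    # start wait mode
            self.enabled = True
        elif button == 2:                  # stop wait mode, release the bus
            self.enabled = False
            self.n_wait = 1
        elif button == 3 and self.enabled and self.n_wait == 0:
            self.n_wait = 1                # advance: let this access finish

    def access(self) -> int:
        """A component pulls nRD or nWR low; return the resulting nWait."""
        if self.enabled and self.n_wait == 1:
            self.n_wait = 0
        return self.n_wait
```

Note that button 3 does nothing unless an access is currently being held, which matches the multiplexed clock: while nWait is high the flip-flop listens to the bus, not to the button.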
Mode 0 - Read memory contents and convert to Intel HEX format ASCII stream
The key component in this mode is predictably the Mem2Hex described here. This is how the component is hooked-up:
hexout: mem2hex port map (
clk => hex_clk,
reset => reset,
--
debug => hexout_debug(15 downto 0),
--
nRD => nRead,
nBUSREQ => hexout_busreq,
nBUSACK => hexout_busack,
nWAIT => nWait,
ABUS => ABUS,
DBUS => DIN,
START => button(0),
BUSY => LDT1Y, -- yellow LED when busy
PAGE => page_sel, -- select any 8k block using micro DIP switches
COUNTSEL => '0', -- 16 bytes per record
TXDREADY => tx_ready,
TXDSEND => hexout_send,
CHAR => hexout_char
);
A few notes:
- I copied the VHDL and microcode (split) because I expected to make more changes to the design, but in the end they were mostly cosmetic.
- clk (IN) - 12.5MHz, but this is not critical, it can go from 0 to 50MHz.
- reset (IN) - Anvyl has no "hardware reset" button, so pressing all BTN together is a "reset"
- debug (OUT) - state from microcode controller driving the design is output and can be shown on 7seg LED (useful to single step through microcode)
- nBUSREQ (OUT), nBUSACK (IN) - Z80 style DMA signals (same directions as for Hex2Mem above).
- nWait (IN) - there is a common WAIT state generation circuit shared by Mem2Hex and Hex2Mem (described above)
- ABUS (OUT), DBUS (IN) - connected to outside world, along with nRD
- START (IN) - triggered manually (see my finger on the button in video below :-) )
- PAGE (IN) - original component supports selecting any combination of 8k pages. I ran out of switches so I combined 2 bits per DIP to configure the memory to be output, so it is 16k blocks ("page_sel")
- COUNTSEL (IN) - allows 16 (0) or 32 (1) bytes per record.
- TXDREADY (IN) - handshake signal for character output. Microcode waits for this signal to go high before next character is emitted.
- TXDSEND (OUT) - if TXDREADY is high, then a character is put into outside buffer and this signal driven high. The UART par2ser implements the reverse side of this protocol. A FIFO could be injected between them.
- CHAR (OUT) - ASCII code of the HEX stream generated appears here.
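For comparison with the dump coming out of the board, the same record format can be produced on the host. This is a minimal Python sketch, not a model of the Mem2Hex microcode: `hex_record` and `dump_to_hex` are hypothetical names, `countsel` only borrows the meaning of COUNTSEL (16 or 32 bytes per record), and the closing end-of-file record is an assumption about what a consumer expects.

```python
def hex_record(address: int, data: bytes, record_type: int = 0x00) -> str:
    """Format one Intel HEX record with its two's-complement checksum."""
    header = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type])
    body = header + data
    checksum = (-sum(body)) & 0xFF
    return ":" + (body + bytes([checksum])).hex().upper()

def dump_to_hex(memory: bytes, start: int = 0, countsel: int = 0):
    """Yield records for `memory`, 16 (countsel=0) or 32 (countsel=1) bytes each."""
    per_record = 32 if countsel else 16
    for offset in range(0, len(memory), per_record):
        yield hex_record(start + offset, memory[offset:offset + per_record])
    yield hex_record(0, b"", record_type=0x01)  # end-of-file record

for record in dump_to_hex(bytes(range(32)), start=0x2000):
    print(record)
```

Comparing such a host-generated dump line by line with the one captured from the UART is a simple way to verify expected ROM content.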
Here is how the send character handshake appears in the microcode:
// "UART" is supposed to signal TXDREADY = 1 when presented 0x00 or when serial transmit is done
emit: if TXDREADY then next else repeat; // sync with baudrate clock that drives UART
if TXDREADY then next else repeat;
if TXDREADY then next else repeat;
if TXDSEND then return else return;
TXDREADY is checked 3 times in a row to prevent any clock domain glitches. Finally, TXDSEND is checked. This condition is hardcoded to "1", so execution always returns to the caller at this point, but a simple comparator is hooked up to detect that this condition is being checked, and that generates the send pulse:
-- hack that saves 1 microcode bit width
TXDSEND <= '1' when (unsigned(m2h_seq_cond) = seq_cond_TXDSEND) else '0';
Sanity check for I/O:
Reading the I/O space can give some indication of whether it "sniffs right", as in this case. The only IC hooked up to I/O space is the 8251 UART, which is enabled when the address is XXXXXXXX0001XXXX. When dumping the I/O space, "something" is visible at the addresses that match, while everywhere else the DBUS returns the default float high.
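To know which dump addresses should show "something", the decode pattern can be turned into a mask and match value. The Python sketch below is an illustration only; the constant and function names are hypothetical, and the mask is read directly off the XXXXXXXX0001XXXX pattern (only A7..A4 take part).

```python
UART_8251_MASK = 0x00F0   # only A7..A4 take part in the decode
UART_8251_MATCH = 0x0010  # pattern XXXXXXXX0001XXXX from the text

def selects_8251(address: int) -> bool:
    """True if a 16-bit bus address falls on the 8251 in I/O space."""
    return (address & UART_8251_MASK) == UART_8251_MATCH

# Addresses in the first 256 bytes of the dump where the 8251 should answer.
hits = [a for a in range(0x100) if selects_8251(a)]
print(f"{hits[0]:04X}..{hits[-1]:04X}, {len(hits)} addresses per 256-byte repeat")
```

Every other address in the dump would be expected to read back as 0xFF, the floating-high default mentioned above.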