Intel HEX file input/output for FPGAs

Uploading and downloading Intel HEX file streams, implemented without an embedded processor!

About the format and use:

Intel HEX files have been used with many microcomputer systems for 40+ years. They are supported by many platforms and development tools, and are still popular in the retro-computing community.

For FPGAs, they can be used in two main ways:
1. Build-time: memory is initialized during the build of the binary file, using a text file in .hex format
2. Run-time: memory is initialized by uploading a .hex file to the running system

It is obvious that for (2) some sort of running processor (in addition to an I/O device) is needed. This is usually provided by an embedded processor of the kind now common with FPGAs, often a "soft-core". But what if having such an embedded processor is an expensive overhead, or is undesirable from a design purity perspective? That's where this project comes in handy!
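For reference, here is a minimal software sketch of what any .hex loader has to do, whether it runs on a processor or in dedicated logic. It relies only on the standard record layout (":", byte count, 16-bit address, record type, data, checksum). The function names and the 32k memory size are assumptions chosen for illustration, and only record types 00 (data) and 01 (end of file) are handled; this is not the project's hex2mem implementation.

```python
def parse_hex_record(line):
    """Decode one Intel HEX record, e.g. ':0300300002337A1E'."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])          # raises ValueError on bad hex digits
    if len(raw) < 5 or len(raw) != raw[0] + 5:
        raise ValueError("record length does not match byte count")
    if sum(raw) & 0xFF != 0:               # all bytes incl. checksum sum to 0 mod 256
        raise ValueError("checksum mismatch")
    address = (raw[1] << 8) | raw[2]
    return raw[3], address, raw[4:-1]      # (record type, address, data bytes)


def load_hex(text, size=32 * 1024):
    """Apply data records to a memory image (size is an assumed 32k)."""
    memory = bytearray(size)
    for line in text.splitlines():
        if not line.strip():
            continue
        rectype, address, data = parse_hex_record(line)
        if rectype == 0x01:                # end-of-file record
            break
        if rectype == 0x00:                # data record
            for offset, value in enumerate(data):
                # wrap the 16-bit address into the smaller memory
                memory[(address + offset) % size] = value
    return memory
```

Calling `load_hex(":0300300002337A1E\n:00000001FF\n")` returns a 32k image with the three data bytes placed at address 0x0030.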

More info to come, but for now a demo video in action:

The best way to test and illustrate new components is to put them into an "end to end" project. It is even better to be able to visualize what is happening, and for that purpose a video output component was reused from another project.

Here is a somewhat simplified schematic of the project's top level file, which is at the same time the test circuit:

The "new and noteworthy" components are explained in the project logs. The dual port RAM is initialized with an image at build time (as is often the case in FPGA projects, for example with "firmware" for the system being implemented). At run time, it can be loaded with a new HEX file stream, or its contents can be read out as a HEX stream, but not both at the same time: switch(0) on the Mercury board selects the mode.

Some other components visible on the schema:

  • 50MHz internal clock - it is divided by two and fed to VGA as the pixel clock, closely matching the needs of the 640*480, 50Hz refresh video mode
  • 96MHz external clock - it is supplied by the "half can" oscillator on the Mercury baseboard. All other clocks in the project are derived from this one. Most importantly:
    • Divided by 2, it becomes hex_clk, which is fed to the hexin and hexout components (max. 12MHz)
    • Divided by a constant, it becomes one of the standard baud rates from 600 to 57600 Hz.
  • vga_controller - it can display the image in 2 different formats, selectable by switch(1). Given that both formats occupy less than the 640*480 resolution, a "hardware window" is displayed on a static background, and can be moved in 4 directions using the buttons on the baseboard
  • uart_modesel - simple 3-bit counter that allows selecting the UART mode (default at reset: 000 == 8 bit, no parity, 1 stop)
  • uart_baudsel - simple 3-bit counter that allows selecting the UART speed (default at reset: 111 == 57600)
  • vram_addrb - address MUX on the read side of the memory. Both vga_controller (to generate the image in the display window) and hexout (to assemble the HEX character stream for output) need access to the memory. Precedence is given to VGA, but if hexout manages to get access and flip the MUX to its side, it keeps it until the end of its read cycle. This causes some "snow" on the image, which can be eliminated with a small speed tradeoff. Note that both components support the full 16-bit address space, but the memory is 32k, so A15 is dropped.
  • vram - this is a 32k*8 dual port RAM, intrinsic to Xilinx Spartan, but most other FPGAs will have it as a standard component. It contains the image, which can be changed via HEX file upload.
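To make the "divided by a constant" step concrete, the following sketch computes the nearest integer divider for a given baud rate and the resulting rate error. The list of intermediate baud rates is an assumption (the text only gives the range 600 to 57600 and a 3-bit selector), and the `oversample` parameter is a hypothetical knob for receivers that need a multiple of the baud rate. The actual divider constants used in the project may differ.

```python
EXT_CLK_HZ = 96_000_000  # external clock on the baseboard

# Assumed set of 8 standard rates selectable by a 3-bit counter.
BAUD_RATES = [600, 1200, 2400, 4800, 9600, 19200, 38400, 57600]


def divider_for(baud, oversample=1, clk_hz=EXT_CLK_HZ):
    """Return (integer divider, baud rate error in percent)."""
    target_hz = baud * oversample
    divider = round(clk_hz / target_hz)
    actual_baud = clk_hz / divider / oversample
    error_pct = 100.0 * (actual_baud - baud) / baud
    return divider, error_pct


if __name__ == "__main__":
    for baud in BAUD_RATES:
        divider, error_pct = divider_for(baud, oversample=4)
        print(f"{baud:>6} baud: divide by {divider:>6}, error {error_pct:+.3f}%")
```

Rates that do not divide the clock evenly (57600 is one of them) end up with a small error, which UART framing tolerates as long as it stays well below a few percent.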

Top level entity description defines the use of hardware resources on the board. A/D, PS/2 and audio in/out are not used. 

entity hex_io_mercury is
    Port ( 
                -- 50MHz on the Mercury board
                CLK: in std_logic;
                -- 96MHz external clock
                EXT_CLK: in std_logic;
                -- Master reset button on Mercury board
                USR_BTN: in std_logic; 

                -- Switches on baseboard
                -- SW(0) -- OFF: accept HEX input, ON: generate HEX output
                -- SW(1) -- OFF: TIM-011 video (512*256, 4 colors), ON: V99X8 video (256*192, 16 colors)
                -- SW(2) -- HEX_CLK speed sel 0 (000 = trace mode, tracer is active)
                -- SW(3) -- HEX_CLK speed sel 1
                -- SW(4) -- HEX_CLK speed sel 2 (111 = 12MHz)
                -- SW(5) -- ON: Enable character echo trace for HEXOUT
                -- SW(6) -- ON: Enable write to memory trace for HEXOUT
                -- SW(7) -- ON: Enable error trace for HEXOUT

                SW: in std_logic_vector(7 downto 0); 

                -- Push buttons on baseboard
                -- BTN0 - HEX input mode: move window right    ; HEX output mode: start output
                -- BTN1 - HEX input mode: move window left    ; HEX output mode: increment mode register
                -- BTN2 - HEX input mode: move window down    ; HEX output mode: select uart_mode
                -- BTN3 - HEX input mode: move window up        ; HEX output mode: select uart_baudrate
                BTN: in std_logic_vector(3 downto 0); 

                -- Stereo audio output on baseboard
                --AUDIO_OUT_L, AUDIO_OUT_R: out std_logic;

                -- 7seg LED on baseboard 
                A_TO_G: out std_logic_vector(6 downto 0); 
                AN: out std_logic_vector(3 downto 0); 
                DOT: out std_logic; 
                -- 4 LEDs on Mercury board...


Project binary file, use with Micro-nova mercury programmer tool.

bit - 146.13 kB - 09/19/2021 at 21:52


  • Tracing and debugging for microcoded controllers

    zpekic, 5 days ago

    More details coming soon.

  • ser2par - a novel UART receiver

    zpekic, 5 days ago

    Refer to the UART basics, and the component source

    UART "receivers" that convert a serial bit stream into a parallel word and a "done" signal are usually implemented as state machines. The trick is to observe the space ('0') state of the RXD input to decide whether it lasts long enough to qualify as a start bit, then determine the mid-point of the start bit and sample the data bits at 1 / baudrate time intervals after that. Once the whole frame is counted, the state machine has to be reset to its initial state to watch for a start bit again. Such a state machine has some complexity, and it has to run at more than twice the frequency of the incoming data stream (sampling theorem), in reality much faster, usually 4 or 8 times faster.

    It can be simplified, and no state machine is needed, with a simple observation:

    • if we have n / 2 + 1 mark samples ('1') in a row, where n is the number of samples per bit (e.g. 3 for baudrate * 4), then it must be either a data 1 or a stop bit
    • if at the same time there are n / 2 + 1 space samples ('0') one frame time in the past, then this must be a stop bit, and everything in between is a data frame

    With this, all that is needed is a 44-bit shift register (a maximum of 11 bits per frame, clocked at baudrate * 4), which receives RXD on the right (shift up) and simultaneously acts as a delay line. The stop bit is detected at the right side ("now"), and the start bit at the left side ("past").
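The delay-line idea can be tried out in software. The sketch below is a behavioral model only, not the ser2par source: the class and helper names are hypothetical, the choice of which sample inside each bit to read is an assumption, and refilling the register with marks after a detection (so that stale data cannot trigger a second match) is also an assumption, since the text does not say how that case is handled. Parity is not checked.

```python
OVERSAMPLE = 4                          # samples per bit (baudrate * 4)
MAX_FRAME_BITS = 11                     # longest supported frame
REG_LEN = OVERSAMPLE * MAX_FRAME_BITS   # 44-bit shift register
WINDOW = OVERSAMPLE // 2 + 1            # 3 consecutive samples


class Ser2ParModel:
    def __init__(self, frame_bits=10):  # 10 = start + 8 data + stop
        self.frame_bits = frame_bits
        self.reg = [1] * REG_LEN        # index 0 is the newest sample ("now")

    def clock(self, rxd):
        """Shift in one RXD sample; return a byte when a frame is seen."""
        self.reg = [rxd] + self.reg[:-1]
        past = (self.frame_bits - 1) * OVERSAMPLE
        stop_now = all(self.reg[0:WINDOW])
        start_then = not any(self.reg[past:past + WINDOW])
        if not (stop_now and start_then):
            return None
        value = 0
        for bit in range(8):            # data is sent LSB first
            # tap one sample near the middle of each data bit (assumed tap)
            pos = (self.frame_bits - 2 - bit) * OVERSAMPLE + 1
            value |= self.reg[pos] << bit
        self.reg = [1] * REG_LEN        # assumed: clear to idle after capture
        return value


def line_samples(byte, frame_bits=10):
    """Oversampled RXD waveform for one frame (test helper)."""
    bits = [0] + [(byte >> i) & 1 for i in range(8)] + [1] * (frame_bits - 9)
    return [b for b in bits for _ in range(OVERSAMPLE)]
```

Feeding `line_samples(0x41) + line_samples(0x5A)` one sample at a time into `clock()` yields the two bytes in order, with no explicit state machine or bit counter involved.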

    (more details soon) 

  • par2ser - a novel UART transmitter

    zpekic, 5 days ago

    Refer to the UART basics, and the component source

    When it comes to converting parallel data to serial format, a shift register comes to mind, and this is how such circuits are often implemented. However, with start / stop / parity bits, the shift register must be longer than the data, and with the parallel data already buffered, the number of register bits doubles.

    This component uses a simple MUX instead, together with a 4-bit counter (bitSel). Operation is as follows:

    1. Reset clears bitSel
    2. if bitSel is 0000, the clock input is MUXed to the "send" input signal
    3. the external circuit presents data at the input, and on the rising edge of "send":
      1. bitSel is incremented to 0001
      2. char is loaded from data (input data is free to change after this)
    4. now that bitSel is != 0000, the clock is MUXed to baudrate
    5. as bitSel is incremented at the baudrate frequency, the 16-to-1 MUX presents the right output to TXD (1, 1, 1, 0, char(0)... char(7)...)
    6. after char(7), the next bit depends on the parity mode, if one is selected
    7. finally a stop bit is transferred to TXD (this is simply a MUX input driven to '1')
    8. when bitSel reaches 1110, it is reset to 0000 and the circuit is ready for step 2 above

    When bitSel = 0000, it can also be used as a ready signal for the higher level circuit, meaning par2ser is idle and waiting to be loaded with data to transmit.
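The steps above can be mirrored in a small behavioral model. This is a sketch under stated assumptions rather than the par2ser source: the class and method names are hypothetical, only the two parity modes with a fixed parity bit are modeled (how shorter no-parity frames are produced is not described here), and the unused MUX inputs are assumed to be tied to '1'.

```python
class Par2SerModel:
    def __init__(self, parity_bit=0):
        self.bit_sel = 0              # 4-bit counter, 0000 means idle
        self.char = 0                 # latched copy of the input data
        self.parity_bit = parity_bit  # fixed parity: space (0) or mark (1)

    @property
    def ready(self):
        return self.bit_sel == 0

    @property
    def txd(self):
        # 16-to-1 MUX: idle/marks, start bit, 8 data bits, parity, stop
        inputs = ([1, 1, 1, 0]
                  + [(self.char >> i) & 1 for i in range(8)]
                  + [self.parity_bit, 1]
                  + [1, 1])           # assumed: unused inputs tied to mark
        return inputs[self.bit_sel]

    def send(self, data):
        """Rising edge of 'send' while idle: latch data, leave state 0000."""
        if not self.ready:
            raise RuntimeError("transmitter busy")
        self.char = data & 0xFF
        self.bit_sel = 1

    def baud_tick(self):
        """One baudrate clock; the counter only runs while not idle."""
        if self.bit_sel == 0:
            return
        self.bit_sel += 1
        if self.bit_sel == 0b1110:
            self.bit_sel = 0


def transmit(tx, data):
    """Collect the TXD level for each baud period of one transmission."""
    tx.send(data)
    bits = []
    while not tx.ready:
        bits.append(tx.txd)
        tx.baud_tick()
    return bits
```

In this model, `transmit(Par2SerModel(parity_bit=1), 0x41)` produces two leading mark periods followed by an 11-bit frame: start, eight data bits LSB first, the parity bit, and the stop bit.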

    Main clock is baudrate * 1, which is the speed at which TXD MUX needs to change inputs. The operation mode is given by 3 mode bits:

    mode   data length   parity      frame length
    100    8             space (0)   11
    101    8             mark (1)    11

  • mem2hex component - read from memory and generate .hex character stream

    zpekic, 5 days ago

    Refer to microcode and source code, details coming soon.

  • hex2mem component - accept .hex character stream and write to memory

    zpekic, 5 days ago

    Refer to microcode and source code. Details coming soon.
