Uploading llvm-mc elf to FPGA and Simulation

After a lot of pain and suffering(tm), I managed to get Vivado to update the bitstream to contain the contents of an Ember elf binary!

I honestly had no idea how convoluted integrating the assembler would be, but it appears to be working now. Each time I implement the design, the tool automatically adds the binary data to the BRAM initialization block, so the program is already on the FPGA when it starts. To get this working, I created two Tcl scripts: one updates a .mem file containing the binary data for simulation and implementation, and the other patches the .bit file directly when I'm only changing the assembly program (so I can upload the patched .bit file instead of waiting minutes for a new one to be implemented).

Previously, I just built the instructions using {, , , } syntax in the memory definition, which is not a long-term solution. It required a full re-synth/impl pass just to change a bit in the test program, and writing instructions meant hand-constructing the opcodes. Now I can write normal assembly in an editor, assemble it with llvm-mc, link it with llvm/lld, then run the scripts to load the resulting elf file into the .mem or .bit file.

Previous method (in .sv file for BRAM module):

30'h00000000: data.mov <= '{OpCode::op_mov, WidthCode::w, MovReg::active, RegSet::gp, Reg::zero, MovReg::active, RegSet::gp, Reg::r2, 11'h000 };
30'h00000001: data.mov <= '{OpCode::op_mov, WidthCode::w, MovReg::active, RegSet::gp, Reg::r2, MovReg::user, RegSet::gp, Reg::r3, 11'h000 };
30'h00000002: data.mov <= '{OpCode::op_mov, WidthCode::w, MovReg::active, RegSet::system, SysReg::pc, MovReg::user, RegSet::gp, Reg::r4, 11'h000 };
30'h00000003: data.sys <= '{OpCode::op_halt, 26'h0 };
                    
30'h00000004: data.mov <= '{OpCode::op_mov, WidthCode::h,  MovReg::active, RegSet::gp, Reg::r2, MovReg::active, RegSet::gp, Reg::r1, 11'h000 };
30'h00000005: data.mov <= '{OpCode::op_mov, WidthCode::sh, MovReg::active, RegSet::gp, Reg::r2, MovReg::active, RegSet::gp, Reg::r1, 11'h000 };
30'h00000006: data.mov <= '{OpCode::op_mov, WidthCode::b,  MovReg::active, RegSet::gp, Reg::r2, MovReg::active, RegSet::gp, Reg::r1, 11'h000 };
30'h00000007: data.mov <= '{OpCode::op_mov, WidthCode::sb, MovReg::active, RegSet::gp, Reg::r2, MovReg::active, RegSet::gp, Reg::r1, 11'h000 };

Assembly file version:

.org 0
_start:

    // MOV Test
    mov      zero, r2 ; write nothing
    mov      r2, ur3  ; user r3 to r2
    mov      pc, ur4  ; skip the following halt
    halt

    mov.h    r2, r1   ; 16-bit (zero extended) r1 to r2
    mov.sh   r2, r1   ; 16-bit (sign extended) r1 to r2
    mov.b    r2, r1   ; 8-bit (zero extended) r1 to r2
    mov.sb   r2, r1   ; 8-bit (sign extended) r1 to r2

    ...

Making this work requires the Vivado updatemem.exe tool (which replaced the data2mem.exe tool around 2015 or so). It is not as simple to use, but once it works, it is quite useful.

Unfortunately, there is not a lot online about the newer tool (outside of its intended use with Xilinx's MicroBlaze IP), but I managed to piece it together, mostly by looking at .mmi and .smi files people have made and posted online for their own CPUs, most of which also use AXI buses. If you are not using a Xilinx IP, or a core with AXI, you can't use the integrated ELF support, unfortunately. This site was particularly helpful if you are interested in the details of creating a .mmi file.

MMI and SMI Files

To use updatemem.exe, I first had to create a .mmi file, which describes where the BRAM is located on the FPGA and how it is configured in the bitstream. Here is what I have for my very simple 4 KB test RAM (basically one 32(+4p)-bit BRAM block on the Spartan-7):

<MemInfo Version="1" Minor="5">
  <Processor Endianness="Little" InstPath="my_bram">
    <AddressSpace Name="my_local_bram" Begin="0" End="4095">
      <BusBlock>
        <BitLane MemType="RAMB36" Placement="X0Y3">
          <DataWidth MSB="31" LSB="0"/>
          <AddressRange Begin="0" End="1023"/>
          <Parity ON="false" NumBits="0"/>
        </BitLane>
      </BusBlock>
    </AddressSpace>
  </Processor>
  <Config>
    <Option Name="Part" Val="xc7s15ftgb196-1"/>
  </Config>
  <DRC>
    <Rule Name="RDADDRCHANGE" Val="false"/>
  </DRC>
</MemInfo>
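
The two address ranges in the file are easy to get out of step, so here is a small sanity check, written as a Python sketch rather than anything Vivado provides. It assumes (based on the numbers above, not on Xilinx documentation) that the AddressSpace Begin/End values are byte addresses and the AddressRange Begin/End values are word indices, so 4096 bytes should equal 1024 words of 32 bits. The function name summarize_mmi is my own, and the embedded text is a trimmed copy of the file above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the .mmi file shown above (Config and DRC sections omitted).
MMI_TEXT = """<MemInfo Version="1" Minor="5">
  <Processor Endianness="Little" InstPath="my_bram">
    <AddressSpace Name="my_local_bram" Begin="0" End="4095">
      <BusBlock>
        <BitLane MemType="RAMB36" Placement="X0Y3">
          <DataWidth MSB="31" LSB="0"/>
          <AddressRange Begin="0" End="1023"/>
          <Parity ON="false" NumBits="0"/>
        </BitLane>
      </BusBlock>
    </AddressSpace>
  </Processor>
</MemInfo>"""


def summarize_mmi(text):
    """Return [(name, byte_count, word_count, bus_width_bits)] per address space.

    Assumption: AddressSpace Begin/End are byte addresses, AddressRange
    Begin/End are word indices. Raises ValueError if the two disagree.
    """
    root = ET.fromstring(text)
    summary = []
    for space in root.iter("AddressSpace"):
        byte_count = int(space.get("End")) - int(space.get("Begin")) + 1
        block_bytes = 0
        word_count = 0
        bus_width = 0
        for block in space.findall("BusBlock"):
            lanes = block.findall("BitLane")
            # The bus width is the sum of the bit lanes in one bus block.
            bus_width = sum(
                int(lane.find("DataWidth").get("MSB"))
                - int(lane.find("DataWidth").get("LSB")) + 1
                for lane in lanes
            )
            # All lanes in a bus block cover the same word range; use the first.
            rng = lanes[0].find("AddressRange")
            depth = int(rng.get("End")) - int(rng.get("Begin")) + 1
            word_count += depth
            block_bytes += depth * bus_width // 8
        if block_bytes != byte_count:
            raise ValueError(
                f"{space.get('Name')}: {byte_count} bytes declared, "
                f"{block_bytes} bytes described by the bit lanes"
            )
        summary.append((space.get("Name"), byte_count, word_count, bus_width))
    return summary


print(summarize_mmi(MMI_TEXT))
```

Running a check like this before calling updatemem catches the case where one range was edited and the other was not.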

Here is the slight variation for the .smi file. It is basically the same as the .mmi file, but it is used to create a .mem file for simulation or synthesis instead of patching the .bit file for the hardware:

<MemInfoSimulation Version="1" Minor="5">
  <Processor Endianness="Little" InstPath="my_bram">
    <AddressSpace Name="my_local_bram" ECC="NONE" Begin="0" End="4095">
      <BusBlock>
        <BitLane MemType="RAMB36" MemType_DataWidth="32" MemType_AddressDepth="4095">
          <DataWidth MSB="31" LSB="0"/>
          <AddressRange Begin="0" End="1023"/>
          <Parity ON="false" NumBits="0"/>
          <MemFile Name="mem/block_ram.mem"/>
        </BitLane>
      </BusBlock>
    </AddressSpace>
  </Processor>
  <Config>
    <Option Name="Part" Val="xc7s15ftgb196-1"/>
  </Config>
  <DRC>
    <Rule Name="RDADDRCHANGE" Val="false"/>
  </DRC>
</MemInfoSimulation>

This method is pretty fragile: if the FPGA implementation changes and uses a different BRAM block, it will fail and I'll have to update the files. Right now it uses the single block at X0Y3 on the chip.
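
One way to soften that fragility is to regenerate the .mmi file from a template whenever the placement changes, instead of editing it by hand. The following is only a sketch in Python: render_mmi and MMI_TEMPLATE are names I made up, the template is the single-block file from above, and I am assuming the placement arrives either as "X0Y3" or with a site prefix such as "RAMB36_X0Y3". How the placement is read out of the implemented design is left out here.

```python
import re

# Template for the single-BRAM .mmi file shown earlier; only the instance
# path, placement and part are substituted.
MMI_TEMPLATE = """<MemInfo Version="1" Minor="5">
  <Processor Endianness="Little" InstPath="{inst_path}">
    <AddressSpace Name="my_local_bram" Begin="0" End="4095">
      <BusBlock>
        <BitLane MemType="RAMB36" Placement="{placement}">
          <DataWidth MSB="31" LSB="0"/>
          <AddressRange Begin="0" End="1023"/>
          <Parity ON="false" NumBits="0"/>
        </BitLane>
      </BusBlock>
    </AddressSpace>
  </Processor>
  <Config>
    <Option Name="Part" Val="{part}"/>
  </Config>
  <DRC>
    <Rule Name="RDADDRCHANGE" Val="false"/>
  </DRC>
</MemInfo>
"""


def render_mmi(placement, inst_path="my_bram", part="xc7s15ftgb196-1"):
    """Return .mmi text for one RAMB36 at the given placement.

    Accepts "X0Y3" or a prefixed site name like "RAMB36_X0Y3" (the prefixed
    form is an assumption about how the site name is reported).
    """
    match = re.fullmatch(r"(?:RAMB\d+_)?(X\d+Y\d+)", placement)
    if match is None:
        raise ValueError(f"unrecognized BRAM placement: {placement!r}")
    return MMI_TEMPLATE.format(
        inst_path=inst_path, placement=match.group(1), part=part
    )


print(render_mmi("RAMB36_X0Y3"))
```

Writing the returned text to mem/block_ram.mmi after each implementation run would keep the file in step with the placer, at least for this one-block design.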

UpdateMem

The Tcl script to call updatemem is super simple. You could parameterize things, but here I just hard-coded them for my testing. Once you have the .mmi and .smi files above, point updatemem at the assembled .elf file and, when patching, at the .bit file along with a name for the patched .bit file. For the -proc parameter, instead of some IP CPU, just set it to the same string as the InstPath in the .mmi file.

Now, to generate a new .mem file from my assembled .elf, I can run this script from the Tcl Console in Vivado (first cd to the root directory of the Vivado project):

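# Regenerate the .mem file named in the .smi file from an assembled .elf.
# src_name is the .elf file name inside rtl/asm/bin.
proc update_mem {src_name} {
    exec updatemem -meminfo mem/block_ram.smi -data ../../rtl/asm/bin/$src_name -proc my_bram -force
}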

The .mem file will look something like the following:

// 
// Mem file initialization records. 
// 
// Data File: C:/Development/Research/FPGA/ember/rtl/asm/bin/TestProgram.elf
// Data File: The input elf file is: 32 address width.
// 
// 
// Copyright 1986-2021 Xilinx, Inc. All Rights Reserved.
// SW Build 3367213 on Tue Oct 19 02:48:09 MDT 2021
// updatemem v2021.2 (64-bit)
// 
// Address Space Name: 'my_local_bram'
//           Data Bus: [31:0].
//      Address Range: [4095:0] [0X0000000000000FFF:0X0000000000000000]
//      Address Depth: [1023:0].
// 
// Bus width = 32 bits, number of bus blocks = 1.

@00000000
   28001000 28051800 28232000 04000000 28840800 29040800 29840800 2A040800
   28801000 29441800 28600800 28621000 28851000 29051800 29858000 2A058800
   2C101234 2C10FFFF 2C901234 2C90FFFF 2D9000FF 2D9000FF 2D104567 2D108EEE
   2E100067 2E1000FA 2C10FFFF 2C105678 2D10FF67 2D9000FF 2E1000F9 2E100039
   40100000 41914001 45914001 45914001 42114002 46114001 46114001 421140FF
   40900000 40914001 44914001 44914001 41114002 45114001 45114001 41117FFF
   00000000 04000000
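
To double-check that the assembler output matches what I used to hand-build, the .mem file can be read back with a few lines of Python. This is a rough sketch, not part of the flow above: parse_mem is my own helper, it treats "@" values as word indices the way $readmemh does, and the opcode extraction assumes a 32-bit instruction whose top 6 bits are the opcode (inferred from the `'{OpCode::op_halt, 26'h0 }` pattern in the old .sv code).

```python
# First data row of the .mem file above, plus a header comment line.
SAMPLE_MEM = """// Mem file initialization records.
@00000000
   28001000 28051800 28232000 04000000 28840800 29040800 29840800 2A040800
"""


def parse_mem(text):
    """Parse $readmemh-style text into {word_index: value}."""
    words = {}
    address = 0
    for raw_line in text.splitlines():
        line = raw_line.split("//", 1)[0]  # drop comments
        for token in line.split():
            if token.startswith("@"):
                # "@hex" sets the index of the next word.
                address = int(token[1:], 16)
            else:
                words[address] = int(token, 16)
                address += 1
    return words


def opcode_of(word):
    """Top 6 bits of a 32-bit instruction (assumed opcode field)."""
    return (word >> 26) & 0x3F


words = parse_mem(SAMPLE_MEM)
for index in sorted(words):
    print(f"{index:08X}: {words[index]:08X} opcode={opcode_of(words[index])}")
```

The fourth word is 04000000, which is an opcode followed by 26 zero bits, matching the halt that follows the three mov instructions in the assembly listing.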

The .mem file needs to be added to BOTH the Design Sources (for Implementation) and Simulation Sources (for Vivado Simulation). You can add the same file to both. This will initialize both the FPGA and Simulation with the contents of the .mem file.

Then add some code in the HDL source like the following Verilog for inferred BRAM:

    initial begin        
        $readmemh("block_ram.mem", memory);        
    end

No path is needed if you add it to sources. Then, when you run the simulation, Vivado will load this file and put the values in memory.

UpdateMem Patching

Once you generate a bitstream, you can patch the .bit file directly to just update the memory contents for a new .elf file. This is useful when you are not changing the CPU hardware implementation, but just want to update the program running on the CPU. This patch happens in just a few seconds, rather than minutes, or longer, for a full build.

This time use the .mmi file, and specify the original .bit file and a new filename for the patched .bit file:

# Patch the implemented .bit file with a new .elf, writing cpu_programmed.bit.
# src_name is the .elf file name inside rtl/asm/bin.
proc update_elf {src_name} {
    exec updatemem -meminfo mem/block_ram.mmi -data ../../rtl/asm/bin/$src_name -bit Ember.runs/impl_1/cpu.bit -proc my_bram -out Ember.runs/impl_1/cpu_programmed.bit -force
}

Be sure to upload the *_programmed.bit file, not the original file!
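
If you would rather drive updatemem from a build script than from the Tcl Console, the same two invocations can be assembled as argument lists. This is a sketch under assumptions: updatemem_args is a name I invented, it only uses the flags that appear in the Tcl procs above, and it refuses to overwrite the original .bit file so the unpatched bitstream stays available.

```python
def updatemem_args(meminfo, elf, proc, bit=None, out=None):
    """Build the updatemem argument list for .mem generation or .bit patching.

    With only meminfo/elf/proc this mirrors the .smi call; with bit and out
    it mirrors the .mmi patching call.
    """
    if (bit is None) != (out is None):
        raise ValueError("bit and out must be given together")
    if bit is not None and bit == out:
        raise ValueError("refusing to overwrite the original .bit file")
    args = ["updatemem", "-meminfo", meminfo, "-data", elf]
    if bit is not None:
        args += ["-bit", bit]
    args += ["-proc", proc]
    if out is not None:
        args += ["-out", out]
    args.append("-force")
    return args


patch_args = updatemem_args(
    "mem/block_ram.mmi",
    "../../rtl/asm/bin/TestProgram.elf",
    "my_bram",
    bit="Ember.runs/impl_1/cpu.bit",
    out="Ember.runs/impl_1/cpu_programmed.bit",
)
print(" ".join(patch_args))
```

The list can be handed to subprocess.run(patch_args, check=True) from the project root, provided updatemem is on the PATH.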

Discussions

Tom wrote 02/20/2022 at 22:40

Cool, thanks. I was just about to write a tool to save out a mem file from an elf when I found that info about the original Data2Mem tool (which is no longer supported) and the newer version. My emulator already loads elf files, so it wouldn't be much work. This is much better though, since not only can I create mem files, but it patches bit files too. So far it works pretty well.


zpekic wrote 02/20/2022 at 18:31

Great job reverse engineering and hijacking the .bit file to speed up development. I was facing a similar problem but took a completely different route to solving it, by creating a "processor" that accepts native Intel .hex file format. If I find time I will add Motorola S-files. The target system can distribute the incoming byte writes to any type of memory or word width format as required. https://hackaday.io/project/181664-intel-hex-files-for-fpgas-no-embedded-cpus
