Note: This project is in progress. Code and app described below may differ from what you may see. At this point I am more interested in learning how to best communicate the intent of desired code changes to the AI tool, than in the actual fully-fledged disassembler functionality.
TL;DR
I estimate that writing the same tool (a simple web-based disassembler) without AI would have taken me at least 10 times longer. Most of that time would have been spent writing boilerplate and finding and integrating the right UX components, both of which are mostly overhead and, for hobby programmers often pressed for time, distract from the "fun parts" of the project. The quality of the generated code is good but depends on the clarity and sequence of the prompts given to the tool. In other words - good general software design skills of the "AI developer" and knowledge of the limitations and capabilities of the tool used are crucial to get the best results.
Background
To catch up with the times, I started tinkering with some AI-based code generation tools, and was curious how well they would do in a retro-computing setting. Combining old and new usually leads to fun, unexpected learning experiences. Writing a disassembler came up as an idea, due to:
- moderate complexity: even a simple implementation can be useful (esp. for more exotic and/or home-brew processors which have fewer tools available)
- quality of the generated output can be easily evaluated when using a binary whose source code is known (I used Tiny Basic for Intel 8080 described here)
- the canonical implementation of a disassembler is through lookup tables and/or switch cases, with complexity added by instruction sizes, mnemonic variations, reference resolution (e.g. relative and absolute jump targets) etc. Therefore, viewing the "design" choices the AI tool makes (or not) to adhere to these patterns gives interesting insights into the code generation process. Fancier disassemblers also recognize ASCII sequences, various programming tricks (e.g. BIT instruction trick for 6502), special locations / vectors (like RST for 8080 and derivatives) etc.
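To make the lookup-table pattern concrete, here is a minimal sketch of a table-driven decoder. It is not the code Lovable generated: the names `OpcodeEntry`, `OPCODES_8080`, and `disassemble` are made up for illustration, only a handful of Intel 8080 opcodes are listed, and `%` is an assumed placeholder for the operand.

```typescript
// Illustrative sketch only; all names are hypothetical, table is a small excerpt.
type OpcodeEntry = {
  mnemonic: string; // "%" marks where the operand is inserted
  size: 1 | 2 | 3;  // total instruction length in bytes
};

const OPCODES_8080: Record<number, OpcodeEntry> = {
  0x00: { mnemonic: "NOP", size: 1 },
  0x13: { mnemonic: "INX D", size: 1 },
  0x1a: { mnemonic: "LDAX D", size: 1 },
  0x3e: { mnemonic: "MVI A, %", size: 2 },
  0xc0: { mnemonic: "RNZ", size: 1 },
  0xc3: { mnemonic: "JMP %", size: 3 },
  0xcd: { mnemonic: "CALL %", size: 3 },
  0xfe: { mnemonic: "CPI %", size: 2 },
};

// Upper-case hex, zero-padded to the requested number of digits (up to 4).
const hex = (value: number, digits: number): string =>
  ("0000" + value.toString(16).toUpperCase()).slice(-digits);

function disassemble(bytes: Uint8Array, origin = 0): string[] {
  const lines: string[] = [];
  let pc = 0;
  while (pc < bytes.length) {
    const entry = OPCODES_8080[bytes[pc]];
    // Unknown opcode or instruction cut off by end of file: emit the byte as data.
    if (!entry || pc + entry.size > bytes.length) {
      lines.push(`${hex(origin + pc, 4)}  DB ${hex(bytes[pc], 2)}h`);
      pc += 1;
      continue;
    }
    let operand = "";
    if (entry.size === 2) operand = `${hex(bytes[pc + 1], 2)}h`;
    // 16-bit operands are stored low byte first.
    if (entry.size === 3) operand = `${hex(bytes[pc + 1] | (bytes[pc + 2] << 8), 4)}h`;
    lines.push(`${hex(origin + pc, 4)}  ${entry.mnemonic.replace("%", operand)}`);
    pc += entry.size;
  }
  return lines;
}
```

The instruction size stored in the table is what keeps the decoder in step with the byte stream; everything else (labels, ASCII detection, alternative mnemonics) can be layered on top of this loop.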
There are many AI tools currently available for assisting software development, and the ecosystem is rapidly evolving. Use of one over the other may become a question of preference, price, and specific usage niche. I used Lovable and found it a very good tool due to:
- "sandbox" integration - app is instantly viewable and code too (this facilitates fast round-trip)
- good github integration
- interaction which provides "echo" explaining what the tool is actually attempting to do, with a link to code changes
- simple publishing of single-page React-based web app (with possibility of publishing to own domain if available)
- specific "knowledge" can be provided for the project (although I could not evaluate to which extent it was used)
On the downside, the daily limit on free use was enforced a bit too harshly - to the point that code generation was simply cut off, leaving the code in a broken state until I upgraded to the paid offering or a new daily "quota" was issued. Eventually this trick worked, as I was sufficiently impressed with the tool to purchase a subscription.
The disassembler
- Link to source code (I deliberately refrained from any "human" touch-ups, all changes are made by the lovable-dev[bot])
- Link to working app
Features:
- Required input:
  - binary file (.bin extension) which is a sequence of bytes one would typically use to burn into an EPROM, or dump code memory etc. For test purposes, I used the "Tiny Basic" binary (same code for which I have written a CPU and "single-board computer" in VHDL)
- Optional inputs:
  - offset address added to the binary byte counter / pointer (e.g. if you have a binary which is supposed to work from address != 0000 - remember code for Intel 8080 and its derivatives is not relocatable)
  - Targeted CPU instruction set (Z80, Intel 8080, 8085)
  - Output format (.lst shows addresses and instruction bytes in hex format along with disassembled code, .asm only the latter, so it is more ready to be re-assembled again)
- Output:
  - Hex view (along with ASCII characters on the side when in range 0x20 .. 0x7E), including option to download in Intel HEX file format.
  - Disassembly view. This allows viewing either as .lst or .asm with possibility to copy and/or download it as file. Flavors:
    - Z80 - uses Z80 type mnemonics, "escape" op-codes (0xCB - bit, 0xDD - IX, 0xFD - IY) are not (yet) implemented (not to mention the undocumented ones)
    - Intel 8080 and 8085 - attempted to use Intel mnemonics, but a bit of hit or miss so far (needs to be refined). Same for RIM and SIM 8085-only instructions
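As an illustration of the hex view and the Intel HEX download listed above, here is a minimal sketch. It is not taken from the generated app: `hexViewRow`, `intelHexRecord`, and `toIntelHex` are hypothetical names, and the sketch assumes 16 data bytes per record and addresses that fit in 16 bits (no extended address records).

```typescript
// Illustrative sketch only; function names are hypothetical.
const hex = (value: number, digits: number): string =>
  ("0000" + value.toString(16).toUpperCase()).slice(-digits);

// One hex-view row: address, up to 16 bytes, then ASCII for 0x20..0x7E and "." otherwise.
function hexViewRow(bytes: Uint8Array, offset: number, origin = 0): string {
  const row = Array.from(bytes.slice(offset, offset + 16));
  const hexPart = row.map((b) => hex(b, 2)).join(" ");
  const padded = (hexPart + " ".repeat(47)).slice(0, 47); // 16 bytes * 3 - 1 columns
  const asciiPart = row
    .map((b) => (b >= 0x20 && b <= 0x7e ? String.fromCharCode(b) : "."))
    .join("");
  return `${hex(origin + offset, 4)}  ${padded}  ${asciiPart}`;
}

// One Intel HEX record: ":" + length + 16-bit address + record type + data + checksum.
// The checksum is the two's complement of the sum of all preceding bytes.
function intelHexRecord(address: number, data: number[], type = 0x00): string {
  const fields = [data.length, (address >> 8) & 0xff, address & 0xff, type].concat(data);
  const sum = fields.reduce((acc, b) => acc + b, 0);
  const checksum = -sum & 0xff;
  return ":" + fields.concat([checksum]).map((b) => hex(b, 2)).join("");
}

// Whole file: data records (type 00) followed by the end-of-file record (type 01).
function toIntelHex(bytes: Uint8Array, origin = 0): string[] {
  const records: string[] = [];
  for (let offset = 0; offset < bytes.length; offset += 16) {
    const chunk = Array.from(bytes.slice(offset, offset + 16));
    records.push(intelHexRecord(origin + offset, chunk));
  }
  records.push(intelHexRecord(0, [], 0x01));
  return records;
}
```

Note that the offset address matters here too: the record addresses should reflect where the code is meant to run, not the position in the file.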

What is not (yet / quite) working:
1. assembly output format does not add any .org or .db .dw or similar pragmas - this would require some detection of byte run patterns (e.g. fills with NOPs or ASCII character sequences)
2. Intel 808x format is not quite as it should be, for example still shows "MOV A, nn" instead of "MVI A, nn" etc.
3. When in 808x mode, unsupported (but well-known) instructions are interpreted as valid Z80 instructions, instead of flagging them in disassembled code or disassembling them to how they actually execute (for example 0x38 is NOP for 8080, not JR C, offset8)
Problem (1) above is solvable by adding more smarts to the decoding logic, which would have to observe "run lengths" beyond single instruction boundaries. Fixing (2) and (3) seems easier, but is actually quite difficult at this point - the reason being that I made a software design error: the Z80 and 8085 instruction sets are both supersets of the 8080 set, therefore I should have started with the 8080, and then implemented the derived ones on top of it. Instead, I started with the Z80 and tried to explain the subset concept to the tool, which led to buggier and messier implementation choices. I estimate that it would be easier and faster to start a new project and build it up the right way than to "explain" to the bot all the refactoring steps needed to bring it to the correct design shape. This is probably the single biggest learning from this project - AI or not, the old advice to design well, think it through and only then code still applies.
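A minimal sketch of the "8080 first" layering described above, under my own assumptions: a common table carries both Intel and Zilog spellings for the opcodes all three CPUs share, and small per-CPU tables add or override the rest. The names `common8080`, `extensions`, and `lookup` are hypothetical, the tables are short excerpts, and `%` is an assumed operand placeholder.

```typescript
// Illustrative sketch only; names are hypothetical and tables are excerpts.
type Cpu = "8080" | "8085" | "Z80";
type Decoded = { mnemonic: string; size: 1 | 2 | 3 };
type CommonEntry = { intel: string; zilog: string; size: 1 | 2 | 3 };

// Opcodes shared by all three CPUs, with both mnemonic flavors side by side.
const common8080: Record<number, CommonEntry> = {
  0x00: { intel: "NOP", zilog: "NOP", size: 1 },
  0x3e: { intel: "MVI A, %", zilog: "LD A, %", size: 2 },
  0xc3: { intel: "JMP %", zilog: "JP %", size: 3 },
  0xe3: { intel: "XTHL", zilog: "EX (SP), HL", size: 1 },
};

// Per-CPU additions and overrides, consulted before the common table.
const extensions: Record<Cpu, Record<number, Decoded>> = {
  "8080": {
    0x38: { mnemonic: "NOP", size: 1 }, // undocumented: executes as NOP
  },
  "8085": {
    0x20: { mnemonic: "RIM", size: 1 },
    0x30: { mnemonic: "SIM", size: 1 },
  },
  "Z80": {
    0x10: { mnemonic: "DJNZ %", size: 2 },
    0x38: { mnemonic: "JR C, %", size: 2 },
  },
};

// Returns undefined for opcodes the selected CPU does not define,
// so the caller can flag them instead of borrowing another CPU's decoding.
function lookup(cpu: Cpu, opcode: number): Decoded | undefined {
  const specific = extensions[cpu][opcode];
  if (specific) return specific;
  const shared = common8080[opcode];
  if (!shared) return undefined;
  return { mnemonic: cpu === "Z80" ? shared.zilog : shared.intel, size: shared.size };
}
```

With this shape, problem (2) becomes a data issue (each opcode carries its own Intel spelling, so "MVI A, nn" and plain "XTHL" come out right), and problem (3) disappears because a Z80-only opcode is simply absent from the 8080 view.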
Development process
Lovable has a simple but very effective web-based IDE (integrated development environment). The process is an interactive dialog with the system, and after each prompt the result can be observed either in the code or in the sandboxed app. Input can be made richer by attaching a file with relevant data, and the response contains a link to the code change and an "echo" of the command as it was interpreted by the system. It is smart enough to figure out if the change is a "new feature", "refactor", or "fix", and the code change produced will reflect that too. Each prompt that results in a code change is effectively a GitHub commit, and this granularity allows tracking and reverting every change.

It took about 20 or so prompts to have an app that was able to disassemble the Tiny Basic, and this could be achieved in a few hours (it took me a few days as I kept running out of daily quota). However, exact tweaking required lots of "chatting", to the point one is tempted to just dig in and "correct" the code manually (which is also possible, and the good GitHub integration makes it "round-trippable").
Closely examining the code and app behavior after each command is an interesting exercise, and it gives insight into how the system "thinks". Observing the patterns it follows allows crafting more efficient commands in the chat pane, and therefore results that align with the intent of the developer. Below are some examples.
Command | Link to commit | Observation |
(Project kick-off) Create a command line tool for windows, which reads .bin files and dissembles them into Z80 assembly source code | Add command line tool · zpekic/bin-to-z80@60d9bde | Generated a very compelling boilerplate React app right off the bat, with binary file upload control and text box to enter the offset for the bytes in binary file. In addition, added a workable initial structure to describe instructions and implemented a few of them:
// Define instruction types
type Z80Instruction = {
mnemonic: string;
operands: string;
bytes: number[];
size: number;
comment?: string;
};
My conclusion is that it must have had knowledge of similar online tools / apps, and from that even anticipated adding a download control for the .asm file. On a funny note - Lovable wanted to keep me happy and responded that it would generate a Windows command line utility, but then still proceeded to do what it knows how to do - generate a React / TypeScript web app :-) |
(feature) if the target of CALL or JP instructions falls into the range of disassembled code, replace the destination address in hex format with text label. | Refactor: Add labels for CALL/JP targets · zpekic/bin-to-z80@c3d04d6 | I had the feeling it was just waiting to implement this one. Obviously picked up from some other project as the response was spot on: The z80-disassembler.ts file has been updated to support labels for jump and call targets. Now, when CALL or JP instructions point to an address within the disassembled code range, the destination address will be replaced with a text label in the format "L_XXXX" (where XXXX is the hexadecimal address). The labels are also displayed in the assembly listing at their corresponding addresses. This makes the disassembled code more readable and closer to what you would see in hand-written assembly code. Unfortunately, it broke the code, missing the implementation of a critical function (see below) |
(feature/refactor/fix) implement isAddressInRange function to return true if input value is range of 0 to number of bytes in input binary file, otherwise false. | Add isAddressInRange function · zpekic/bin-to-z80@bef29f7 | Result as expected, even added a [rather redundant] comment
// Check if an address is within a given range
const isAddressInRange = (address: number, start: number, end: number): boolean => {
return address >= start && address <= end;
};
|
(refactor) Note that the z80-disassembler.ts file is getting quite large (over 580 lines). After this fix, you might want to consider refactoring it into smaller modules for better maintainability. | Fix: Resolve TypeScript errors in disassembler · zpekic/bin-to-z80@617432c | The bot offered to do this refactoring, based on some internal evaluation of complexity / file size, which was pretty smart. Just like when human developers do refactoring - it broke the code :-) However, it detected that and subsequently fixed it. |
(feature) if the single byte value in LD or CP instructions falls in range 0x20 to 0x7F, show the value as ASCII character between single quotes | Fix: Format byte values with ASCII chars · zpekic/bin-to-z80@89685fe | At first, this caused another build break, but after automated fix, eventually it got it mostly right. After this change, some code (such as "return position of first non-blank character in DE") started to look more understandable:
L_0028:
1A LD A, (DE)
FE 20 CP ' ' ; 20h
C0 RET NZ
13 INC DE
C3 28 00 JP L_0028
|
(feature) In disassembly settings, add a dropdown named "Target instruction set" with 3 options: Z80, Intel 8080, Intel 8085. Default is Z80. | feat: Add instruction set selection · zpekic/bin-to-z80@426349f | Really done perfectly. It added the control, ensured that the value is unique (and not the same as the display value, which makes future localization easier), and set the right default. |
(feature) If the target instruction set dropdown is Intel 8080 or Intel 8085 use Intel assembly mnemonics as explained in https://pastraiser.com/cpu/i8080/i8080_opcodes.html, not Zilog ones. | Refactor: Support Intel 8080/8085 mnemonics · zpekic/bin-to-z80@b04234d | This one was fascinating to see. The AI system did this much the way a junior developer might - clunky but somewhat workable. Instead of going back to the design of the instruction lookup table (and extending it with data specific to Intel processors), it introduced a "string translation" lookup from Z80 to Intel mnemonics. Along the way, it "hallucinated" a few 8080 instructions which do not really exist, for example it "invented" the DCR B / JNZ sequence. Another problem with this approach is that it mixes up the mnemonics: EX HL, (SP) becomes XTHL HL, (SP) instead of just XTHL.
// Map Z80 mnemonics to Intel 8080/8085 mnemonics
export const translateToIntel = (z80Mnemonic: string): string => {
// Mnemonic translation table
const mnemonicMap: Record<string, string> = {
// Direct translations
'LD': 'MOV', // Most LD instructions map to MOV in 8080
'CALL': 'CALL',
'RET': 'RET',
'JP': 'JMP', // Jump is JMP in Intel syntax
'JR': 'JMP', // Relative jumps don't exist in 8080, so map to JMP
'DJNZ': 'DCR B / JNZ', // No direct equivalent
'INC': 'INR', // Increment register
'DEC': 'DCR', // Decrement register
'ADD': 'ADD',
'ADC': 'ADC',
'SUB': 'SUB',
'SBC': 'SBB', // Subtract with borrow in Intel
'AND': 'ANA', // Logical AND in Intel
'OR': 'ORA', // Logical OR in Intel
'XOR': 'XRA', // Logical XOR in Intel
'CP': 'CMP', // Compare
'PUSH': 'PUSH',
'POP': 'POP',
'EX': 'XTHL', // Exchange HL with top of stack (specific case)
'IN': 'IN',
'OUT': 'OUT',
'RLCA': 'RLC', // Rotate A left
'RRCA': 'RRC', // Rotate A right
'RLA': 'RAL', // Rotate A left through carry
'RRA': 'RAR', // Rotate A right through carry
'DAA': 'DAA', // Decimal adjust A
'CPL': 'CMA', // Complement A
'SCF': 'STC', // Set carry flag
'CCF': 'CMC', // Complement carry flag
'HALT': 'HLT', // Halt
'DI': 'DI', // Disable interrupts
'EI': 'EI', // Enable interrupts
'NOP': 'NOP' // No operation
};
|
(feature) Add a tab next to "Disassembled Code" with title of "Hex view" which will display content of the binary file in rows of 16 bytes at a time, in hex format | Add hex view tab · zpekic/bin-to-z80@0b21faf | Nailed it flawlessly. Imported the right "stock" UX component, created a tab for it, and hooked it up. Huge time saving. |
(feature) Add a background image to the input form of the application. This image should show Z80 microprocessor on a motherboard | Add background image to input form · zpekic/bin-to-z80@7ed2b28 | Lovable has lots of brains, but sadly no eyes yet - the code change was spot on, but the opacity and positioning got missed so the image was not visible at all. In addition, the image didn't depict the classic 40-pin Z80 we all love, instead...
In the end, after some attempts to display it properly, I asked it to remove the image, but I am sure it would be eventually able to display it properly. |
(refactoring) Trying to implement the 8080, 8085 as deriving from same basic instruction set common to both | Multiple (still needs some refinement) | It took quite a bit of prompting. I had to be very specific to first implement 8080 and 8085 separately (different files) and then extract the commonality into a "base" data structure. Interestingly (maybe for security purposes?) Lovable only takes images as attachments to prompts. So I had to screenshot the instruction tables from the sources below, and feed those images in. Both OCR and semantic recognition of the content were perfect. https://pastraiser.com/cpu/i8080/i8080_opcodes.html converted to code: https://github.com/zpekic/bin-to-z80/blob/main/src/lib/cpu/opcodes/intel8080-opcodes.ts https://pastraiser.com/cpu/i8085/i8085_opcodes.html converted to code: https://github.com/zpekic/bin-to-z80/blob/main/src/lib/cpu/opcodes/intel8085-opcodes.ts |
(documentation) create a summary.md file which summarizes the commits made so far to this project | Add summary file · zpekic/bin-to-z80@6adb658 | Although it did not generate code, this was almost the most impressive response from Lovable. Not only did it provide a perfect, concise summary of the project, it also anticipated what is missing. bin-to-z80/summary.md at main · zpekic/bin-to-z80 |
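The CALL/JP label feature from the table above boils down to two passes over the decoded instructions: first collect the jump targets, then emit the listing with labels. Here is a minimal sketch of that idea, not the generated code. The `Decoded` shape and the `renderWithLabels` name are hypothetical, and as a deliberate assumption a target only gets a label if it lands exactly on the start of a decoded instruction (a stricter test than a plain address range check).

```typescript
// Illustrative sketch only; the Decoded shape and function names are hypothetical.
type Decoded = {
  address: number;  // address of the first byte of the instruction
  mnemonic: string; // e.g. "JP", "CALL", or a complete instruction like "RET NZ"
  target?: number;  // absolute destination for CALL / JP, otherwise undefined
};

const hex4 = (value: number): string =>
  ("0000" + value.toString(16).toUpperCase()).slice(-4);

const labelFor = (address: number): string => `L_${hex4(address)}`;

function renderWithLabels(instructions: Decoded[]): string[] {
  // Pass 1: a target earns a label only if it is the start of a decoded instruction.
  const starts = new Set<number>(instructions.map((ins) => ins.address));
  const labels = new Set<number>();
  for (const ins of instructions) {
    if (ins.target !== undefined && starts.has(ins.target)) {
      labels.add(ins.target);
    }
  }

  // Pass 2: emit label lines and replace labeled targets in the operands.
  const out: string[] = [];
  for (const ins of instructions) {
    if (labels.has(ins.address)) out.push(`${labelFor(ins.address)}:`);
    let operand = "";
    if (ins.target !== undefined) {
      operand = " " + (labels.has(ins.target) ? labelFor(ins.target) : `${hex4(ins.target)}h`);
    }
    out.push(`    ${ins.mnemonic}${operand}`);
  }
  return out;
}
```

Two passes are needed because a backward jump refers to an address that has already been printed; the label line can only be emitted once all targets are known.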