After remembering where I was in the project I started to look for bugs in my CPU. I know it does not fully work, for example since running in TI BASIC I get:
PRINT 1*-1 1
So something is not working in the CPU. In order to work on this, I took advantage of my previous design, which combines a real TMS99105 CPU with my FPGA implementation of the TI-99/4A. Running the above on it yields the correct result, -1.
So I wrote a piece of test code, ran it both on the FPGA CPU and the TMS99105 while capturing the results (by dumping a section of memory on both systems to a file), below is the comparison:
Left hand side is TMS99105, right hand side is my soft CPU, i.e. the FPGA TMS9900 core. Each instruction is tested 8 times with different data. The source code of this test is below, after the explanation below.
Each instruction test output takes 8 bytes or 4 words. The last word of the output are the flags (only top 6 bits preserved). The result words are R1,R2,R3 and flags. The instruction is always executed like SUB R1,R2. Thus R2 is the result and R1 shows the source operand.
With that, we can see that there is a difference with the second instruction under test. It is the subtract instruction.
The first subtraction works fine (SUB 1,2, i.e. 2-1) but the second has a difference in the flags (SUB >7FFF,1) where the soft CPU has >2800 while the real CPU has >2000.
The flag that is different is ST4 i.e. the overflow flag. Also in the the other SUB instructions the overflow flag is sometimes bogus, here is a table of the eight cases:
SUB >1,>2 OK (note that with the TMS9900 this actually is the 2-1 operation)
SUB >7FFF,1 Bug: soft-cpu asserts overflow
SUB >8000,>7FFF Bug: soft-cpu does not assert overflow
SUB >7FFF,>8000 Bug: soft-cpu does not assert overflow
SUB >FFFF,>8000 OK
SUB >8000,>FFFF Bug: soft-cpu asserts overflow
SUB >8000,>8000 Bug: soft-cpu asserts overflow
SUB >0,>8000 OK
Looking at the data sheet carefully, there is a difference how ST4 (overflow) is asserted for adds and subs, the first condition is inverted. I bet I don’t do this.
Adds: If MSB(SA) == MSB(DA) and MSB of result != MSB(DA)
Subs: If MSB(SA) != MSB(DA) and MSB of result != MSB(DA)
The only other difference with this data is at >0120, and here the result is wrong (but flags ok). Since R3 has changed, it must be a DIV or MPY instruction.
First instruction test output at 0..>3F, 2nd at 40..7F, 3rd at 80..BF, 4th at C0..FF, 5th at 100..13F. So the fifth instruction.
And indeed it is the DIV instruction - like PNR reported. One of my test cases at least now catches the problem. It is fifth test case, from above it is
DIV >FFFF,>8000 i.e. >8000 divided by >FFFF. The result should be >8000 as quotient and >8000 as remainder.
But my code gives >FFFE as quotient and >FFFE as remainder too.
Here is the TMS9900 assembler test code (story continues after the listing):
; EP 2018-09-23 - run through a sequence of instructions with data and write ; results to RAM. This is to enable comparing the FPGA CPU and TMS9900. LI R5,>2000 ; point to result table LI R7,TEST_ROUTINES ; point to test routines RUN_TEST MOV *R7+,R8 ; address of routine to test LI R6,TEST_DATA_SEQ ! MOV *R6+,R1 ; fetch test parameters MOV *R6+,R2 CLR R3 ; perform operation under test BL *R8 ; save results MOV R1,*R5+ MOV R2,*R5+ MOV R3,*R5+ STST R3 ANDI R3,>FC00 ; only keep meaningful flags MOV R3,*R5+ CI R6,TEST_DEND JNE -! CI R7,TEST_ROUT_END JNE RUN_TEST ; write end marker to memory LI R3,>1234 MOV R3,*R5+ MOV R3,*R5+ MOV R3,*R5+ MOV R3,*R5+ And here is the data: TEST_DATA_SEQ ; Parameters to pass two various instructions DATA 1,2 ; First data set DATA >7FFF,1 ; 2nd DATA >8000,>7FFF DATA >7FFF,>8000 DATA >FFFF,>8000 ; 5th DATA >8000,>FFFF DATA >8000,>8000 ; 7th DATA 0,>8000 ; 8th TEST_DEND TEST_ROUTINES DATA DO_ADD, DO_SUB DATA DO_SOC, DO_SZC DATA DO_DIV, DO_MPY DATA DO_COMP TEST_ROUT_END DO_ADD A R1,R2 RT DO_SUB S R1,R2 RT DO_SOC SOC R1,R2 RT DO_SZC SZC R1,R2 RT DO_DIV DIV R1,R2 RT DO_MPY MPY R1,R2 RT DO_COMP C R1,R2 RT
Once I had analysed the problems, I fixed both the divider and the overflow flag. The fixes for both problems are in the tms9900.vhd source code committed to the soft-cpu-tms9902 branch at GitHub.
The fix for the overflow flag was easy, just taking into account the difference in flag generation. I took still a few iterations from me to get it working properly. The resulting VHDL code for overflow flag generation is:
-- ST4 overflow alu_flag_overflow <= '1' when (ope = alu_compare or ope = alu_sub) and arg1(15) /= arg2(15) and alu_result(15) /= arg1(15) else '1' when (ope /= alu_sla and not (ope = alu_compare or ope = alu_sub)) and arg1(15) = arg2(15) and alu_result(15) /= arg1(15) else '1' when ope = alu_sla and alu_result(15) /= arg2(15) else -- sla condition: if MSB changes during shift '0';
Fixing the division problem was a little more involved, once I understood that I actually need a 17-bit subtraction the divider started to work properly. I had not captured the problem in my earlier testing because I only manually tested a few cases and did not try dividers above 32767, i.e. where the 16-bit divisor has bit 15 set. Implementing this fix in a simple way required me to add a dedicated 17-bit subtract unit to the divider code, rather than relying on the ALU I already had in the design, here is the key snippet of code, the story continues a bit more after the code:
when do_div1 => dividend(15 downto 0) <= rd_dat; -- store the low word shift_count <= "10000"; -- 16 cpu_state <= do_div2; when do_div2 => dividend(31 downto 0) <= dividend(30 downto 0) & '0'; -- shift left -- perform 17-bit substraction, picking up the bit to shifted out too divider_sub <= std_logic_vector(unsigned(dividend(31 downto 15)) - unsi gned('0' & reg_t)); dec_shift_count := True; -- decrement count cpu_state <= do_div3; when do_div3 => if divider_sub(16)='0' then -- successful subtract dividend(31 downto 16) <= divider_sub(15 downto 0); dividend(0) <= '1'; end if; if shift_count /= "00000" then cpu_state <= do_div2; -- loop back else cpu_state <= do_div4; end if; when do_div4 => -- done with the division.
The good news is that after these fixes the instructions that I now test and compare with a real TMS99105 all work identically. The bad news is that the BASIC statement at beginning of this blog entry still gives me the same result - so on with the bug hunt for further bugs!