
CPU internals

A project log for Bit-serial CPU based on crossbar switch

256 switches and a few shift registers to implement a working 16 or 32-bit integer arithmetic calculator (+, -, *, /, isqrt, BCD / binary conversion)

zpekic, 04/25/2022 at 00:48

The core of the calculator is a micro-coded CPU. For a micro-code refresher, see:

Much of its code has been autogenerated by running the mcc compiler on the microcode. This simplified schema shows the main elements:

A successful microcode compiler run generates 4 VHDL files to be included in the project, along with memory files in several other formats:

.code 8, 52, hexcalc_code.mif, hexcalc_code.cgf, hexcalc_code.coe, hxc:hexcalc_code.vhd, hexcalc_code.hex, hexcalc_code.bin, 8;
.mapper 8, 8, hexcalc_map.mif, hexcalc_map.cgf, hexcalc_map.coe, hxc:hexcalc_map.vhd, hexcalc_map.hex, hexcalc_map.bin, 1;
.controller hexcalc_control_unit.vhd, 8;
.symbol 8, 256, hexcalc_sym.mif, hexcalc_sym.cgf, hexcalc_sym.coe, hxc:hexcalc_sym.vhd, hexcalc_sym.hex, hexcalc_sym.bin, 32;

cu_hxc (hexcalc_control_unit)

This is the autogenerated microcode control unit. It consumes 20 bits of the microcode width for its operation: 4 to select one of 16 conditions, and 8 each for the "then" and "else" branch targets. This means that every microinstruction can execute a conditional program flow instruction or an unconditional subroutine call.
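The branch selection described above can be modeled in a few lines of Python. This is only a behavioral sketch, not the generated VHDL: `next_address` and its parameter names are made up for illustration, and it ignores subroutine call / return handling and any special target encodings the real control unit may have.

```python
def next_address(cond_sel, then_addr, else_addr, conditions):
    """Pick the next 8-bit microcode address.

    cond_sel   -- 4-bit selector (0..15) choosing one of 16 condition inputs
    then_addr  -- 8-bit target used when the selected condition is true
    else_addr  -- 8-bit target used when the selected condition is false
    conditions -- sequence of 16 booleans sampled in this cycle
    """
    if not 0 <= cond_sel < 16:
        raise ValueError("cond_sel must fit in 4 bits")
    if len(conditions) != 16:
        raise ValueError("exactly 16 condition inputs are expected")
    target = then_addr if conditions[cond_sel] else else_addr
    return target & 0xFF  # microcode addresses are 8 bits wide


# Hypothetical condition wiring: input 0 tied to true, all others false.
conditions = [True] + [False] * 15
print(hex(next_address(0, 0x20, 0x30, conditions)))
print(hex(next_address(15, 0x20, 0x30, conditions)))
```

With one condition input tied to true and another tied to false, the same mechanism gives unconditional jumps for free.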

hxc_microcode

The resulting microcode address (8 bits) selects a 52-bit wide microinstruction. With 20 bits consumed by the control unit, 32 bits remain to drive the control signals of the other components in the CPU, and some fields (such as STATUS) directly drive signals going out of the CPU.

hxc_uinstruction <= hxc_microcode(to_integer(unsigned(ui_address))); -- copy to file containing the control unit. TODO is typically replace with 'ui_address' control unit output
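To make the width arithmetic concrete (52 - 20 = 32), here is a small Python sketch that splits one microinstruction word into the control unit fields and the remaining control bits. The bit positions are an assumption made for illustration (condition, then, else in the top 20 bits, control signals below); the real layout is whatever mcc generates from the field order in the microcode source.

```python
WORD_BITS = 52
SEQ_BITS = 20                        # 4 (condition) + 8 (then) + 8 (else)
CONTROL_BITS = WORD_BITS - SEQ_BITS  # 32 bits left for control signals


def split_microinstruction(word):
    """Split a 52-bit word into (cond_sel, then_addr, else_addr, control).

    Assumed layout, MSB first: cond(4) | then(8) | else(8) | control(32).
    """
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word must fit in 52 bits")
    control = word & ((1 << CONTROL_BITS) - 1)
    seq = word >> CONTROL_BITS
    else_addr = seq & 0xFF
    then_addr = (seq >> 8) & 0xFF
    cond_sel = (seq >> 16) & 0xF
    return cond_sel, then_addr, else_addr, control


# Hypothetical word: condition 6, then 0x12, else 0x34, control 0xDEADBEEF.
word = (0x6 << 48) | (0x12 << 40) | (0x34 << 32) | 0xDEADBEEF
print(split_microinstruction(word))
```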

hxc_mapper

256*8 lookup memory - the address is the value of the instruction register, and the output is the address of the 1st word of the microcode routine implementing it. This is easiest to see in the .mif file which is also generated.
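In software terms the mapper is just a 256-entry table indexed by the instruction byte. A minimal Python sketch, assuming made-up opcodes and entry points (the real values live in hexcalc_map.mif and are not reproduced here):

```python
MAPPER_SIZE = 256  # one entry per possible instruction register value


def build_mapper(entry_points, default=0x00):
    """Return a 256-entry table: instruction byte -> microcode start address."""
    table = [default & 0xFF] * MAPPER_SIZE
    for opcode, start in entry_points.items():
        table[opcode & 0xFF] = start & 0xFF
    return table


# Hypothetical opcodes and start addresses, for illustration only.
mapper = build_mapper({0x01: 0x40, 0x02: 0x48})
print(hex(mapper[0x01]), hex(mapper[0x02]), hex(mapper[0x7F]))
```

Every instruction value that has no routine of its own falls through to the default entry.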

The symbol file hexcalc_sym.vhd is consumed one structural level up by the tracer unit, so it is not included inside the CPU.


Data registers

CPU has 8 "program accessible" registers (R0...R7). They support:

In addition there are two internal registers of the same size, which:

Registers 0 and 1 (TOS and NOS) can be driven independently, while R2 and above are all driven together. To save microcode bit width, the combinations needed for the algorithms are collapsed into 8 cases:

// shift register operations in format: TOS_NOS_Other 				
opr 		.valfield 3 values
			np_np_np,	// no shifts
			np_np_ld,	// only affects 0xC[onstant] and 0xD[ata]
			m2_d2_d2,	// TOS shift up, NOS and other regs shift down
			np_m2_m2,
			d2_d2_d2,
			d2_d2_np,	// used for multiplication
			np_d2_d2,
			m2_m2_np default np_np_np;
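The TOS_NOS_Other naming makes the field easy to decode. The Python sketch below assumes that mcc assigns the 3-bit codes in declaration order starting from 0, which the listing does not state. Going by the comments, m2 is a shift up, d2 a shift down, np no operation and ld a load.

```python
# Field values in declaration order; codes 0..7 are an assumption.
OPR_VALUES = [
    "np_np_np", "np_np_ld", "m2_d2_d2", "np_m2_m2",
    "d2_d2_d2", "d2_d2_np", "np_d2_d2", "m2_m2_np",
]


def decode_opr(code):
    """Expand a 3-bit opr code into per-register-group operations."""
    tos, nos, other = OPR_VALUES[code & 0b111].split("_")
    return {"TOS": tos, "NOS": nos, "other": other}


for code in range(8):
    print(code, decode_opr(code))
```

Three bits replace what would otherwise be three separate operation fields, one per register group.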

bitcnt

This is a 5-bit counter, primarily used to count bits in simple operations where a single traversal of the bits is needed. Microcode does not know the value of bitcnt, just that it can be loaded with the maximum value (which is 15 or 31), decremented, or tested for zero (the bitcnt_is_zero condition). This means microcode is agnostic to the length of the registers: if bitcnt and loopcnt could hold a maximum of 255 and the registers were that long, 256-bit numbers could be handled, and 64-digit BCDs converted to binary!
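The "load max, decrement, test for zero" pattern can be sketched in Python as below. `serial_traverse` is a made-up name for this illustration; it passes a register through a 1-bit operation, and the loop body never looks at the counter value, only at whether it has reached zero.

```python
def serial_traverse(value, mode32, bit_op):
    """Apply bit_op to every bit of a register, one bit per step."""
    width = 32 if mode32 else 16
    bitcnt = width - 1                 # "load with maximum value": 15 or 31
    result = 0
    while True:
        bit = value & 1                # bit shifted out at the low end
        value >>= 1
        # the processed bit enters the result register at the high end
        result = (result >> 1) | ((bit_op(bit) & 1) << (width - 1))
        if bitcnt == 0:                # bitcnt_is_zero condition
            break
        bitcnt -= 1                    # decrement
    return result


print(hex(serial_traverse(0x00FF, False, lambda b: b ^ 1)))
print(hex(serial_traverse(0x0000FFFF, True, lambda b: b ^ 1)))
```

Only the line that sets `width` depends on the register length, which is the sense in which the loop is length-agnostic.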


loopcnt 

Similar to bitcnt, but used for the outer loop when iterations are needed (conversions bin <-> BCD, mul, div). It is also loaded with 15 or 31 (note how the mode32 input signal is used as a selector), but it can be incremented as well as decremented.

Non-standard register lengths (e.g. 24 or 48 bits) could be implemented, provided that these counters wrap around properly (0 - 1 = 47 for 48 bits, 23 + 1 = 0 for 24 bits, etc.)

Much of the code for these has also been auto-generated by mcc:

---- Start boilerplate code (use with utmost caution!)
 update_loopcnt: process(clk, hxc_loopcnt)
 begin
	if (rising_edge(clk)) then
		case hxc_loopcnt is
--			when loopcnt_same =>
--				loopcnt <= loopcnt;
			when loopcnt_max =>
				loopcnt <= mode32 & X"F";	-- 31 or 15
			when loopcnt_inc =>
				loopcnt <= std_logic_vector(unsigned(loopcnt) + 1);
			when loopcnt_dec =>
				loopcnt <= std_logic_vector(unsigned(loopcnt) - 1);
			when others =>
				null;
		end case;
	end if;
 end process;
---- End boilerplate code
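For the non-standard lengths mentioned above, the wrap-around condition can be sketched in Python as follows. This models the stated requirement (counting modulo the register length), not the generated 5-bit VHDL vector; the class name and the `length` parameter are made up for illustration, while the operation names follow the VHDL case labels.

```python
class LoopCounter:
    """Behavioral model of loopcnt for an arbitrary register length."""

    def __init__(self, length=16):
        self.length = length
        self.value = 0

    def step(self, op):
        if op == "loopcnt_max":
            self.value = self.length - 1
        elif op == "loopcnt_inc":
            self.value = (self.value + 1) % self.length
        elif op == "loopcnt_dec":
            self.value = (self.value - 1) % self.length
        # any other operation (e.g. "loopcnt_same") keeps the value
        return self.value


c48 = LoopCounter(48)
print(c48.step("loopcnt_dec"))   # 0 - 1 wraps to 47

c24 = LoopCounter(24)
c24.step("loopcnt_max")          # 23
print(c24.step("loopcnt_inc"))   # 23 + 1 wraps to 0
```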

TXDCHAR

The CPU can generate an output ASCII stream, one character at a time. To save on microcode width, only a few characters are supported, such as CR, LF, space, E, R, R. These are generated through a MUX which is fed these ASCII values directly. 8 of the 16 MUX sources come through a hex-to-ASCII lookup table, so that 8 hex values from inside the CPU can be selected and streamed out. Essentially, the CPU is doing its own tracing under microcode control.
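A rough Python model of the character MUX follows. The slot assignment is hypothetical: the text above only names CR, LF, space, E and R, so the remaining fixed slots are NUL placeholders, and placing the fixed characters in selectors 0..7 and the hex sources in 8..15 is an assumption, as is the use of uppercase hex digits.

```python
HEX_TO_ASCII = "0123456789ABCDEF"   # hex-to-ASCII lookup table (assumed uppercase)

# Hypothetical fixed sources; unused slots are NUL placeholders.
FIXED_CHARS = ["\r", "\n", " ", "E", "R", "\0", "\0", "\0"]


def txdchar(sel, hex_sources):
    """Return the character chosen by the 4-bit selector.

    sel         -- MUX selector, 0..15
    hex_sources -- 8 nibble values picked up from inside the CPU
    """
    sel &= 0xF
    if sel < 8:
        return FIXED_CHARS[sel]                      # direct ASCII value
    return HEX_TO_ASCII[hex_sources[sel - 8] & 0xF]  # nibble via the lookup


nibbles = [0xA, 0x5, 0, 0, 0, 0, 0, 0x9]
print(repr(txdchar(3, nibbles)), txdchar(8, nibbles), txdchar(15, nibbles))
```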

errcode

CPU recognizes 2 error conditions:

To have the proper register semantics, two more states are needed:

This gives 4 states, so the register is 2 bits long. Only Z, N and the reset signal clear the error.


ALU

This is as simple as it gets, just a few logic gates:

-- ALU!
row_delay <= d_flag; 
row_not <= not col_not; 
row_and <= col_and1 and col_and2;
row_sum <= c_flag xor (col_adc1 xor col_adc2); -- 1 bit full adder sum
-- 12 is constant register
-- 13 is data register
mt_x(14) <= '0';
mt_x(15) <= '0';

The values of some switch matrix column wires are combined using NOT / AND gates and fed back to switch matrix rows.

The adder has an additional third input, which is the value of the previous carry. The row_delay is hooked up to a flag which can capture the state of a single bit in some previous bit cycle. A typical use is preserving the value of a bit shifted out of a register.
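To see how a 1-bit sum plus a carry flag adds whole registers, here is a bit-serial addition sketch in Python. The sum expression mirrors row_sum above; the carry update is the textbook full-adder carry-out, which is an assumption here since the listing only shows the sum.

```python
def serial_add(a, b, width=16, carry_in=0):
    """Add two registers one bit per step, LSB first.

    Returns (sum, final carry flag).
    """
    c_flag = carry_in & 1
    result = 0
    for i in range(width):
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        row_sum = c_flag ^ (bit_a ^ bit_b)                     # as in row_sum
        c_flag = (bit_a & bit_b) | (c_flag & (bit_a ^ bit_b))  # assumed carry-out
        result |= row_sum << i
    return result, c_flag


print(serial_add(1234, 4321))
print(serial_add(0xFFFF, 1))
```

Starting with the carry flag set and feeding an inverted operand gives subtraction with the same loop.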

FLAGS

Some classic CPU flags are present:

The carry, delay and zero flags for the TOS and NOS registers (and the "AND row") can be inspected: their values are available to the microcode as conditions, which allows the arithmetic algorithms to work.

seq_cond:	.if 4 values 
			true, 			// hard-code to 1
			input_is_zero,	// do not process 0x00 input char
			TRACE_INPUT,
			TRACE_RESULT,
			TXDREADY,
			TXDSEND,
			bitcnt_is_zero,
			loopcnt_is_zero,
			d_flag_is_set,
			c_flag_is_set,
			z_flagand_is_set,
			loopcnt_is_max,
			z_flagtos_is_set,
			z_flagnos_is_set,
			//daa_flag_is_set,
			loopcnt_nibble,
			false			// hard-code to 0
			default true;
