
Everybody loves benchmarks! The numbers are in!

A project log for Bit-serial CPU based on crossbar switch

256 switches and a few shift registers to implement a working 16 or 32-bit integer arithmetic calculator (+, -, *, /, isqrt, BCD / bin conv..)

zpekic, 04/25/2022 at 03:43

When calculations are done 1 bit at a time, one should probably not expect blazing speed. 

To measure the clock cycles per operation, I simply baked an 8-digit BCD counter into the system. It increments on each clock cycle and resets to 0 at the start of each new instruction. This is in the top-level file of the design:

-- count clock cycles per instruction
cclkcnt: bcdcounter Port map (
        reset => (RESET or i_new),
        clk => hc_clk,
        enable => (hc_status(1) and hc_txdready),    -- only count when status is "busy" and not waiting for UART
        value => c_cnt
    );

i_new <= hc_status(1) when (hc_status_old = status_ready) else '0';
on_hc_clk: process(hc_clk)
begin
    if (rising_edge(hc_clk)) then
        hc_status_old <= hc_status;
    end if;
end process;
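
The bcdcounter component itself is not shown in this log. As a rough sketch only, a counter with the same port names could look like the code below. The 32-bit width of value (8 digits x 4 bits), the asynchronous active-high reset, and the ripple-style digit increment are all assumptions for illustration; the actual component in the project may differ.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical 8-digit BCD counter; port names taken from the port map above,
-- everything else (widths, reset style) is assumed.
entity bcdcounter is
    Port ( reset  : in  STD_LOGIC;
           clk    : in  STD_LOGIC;
           enable : in  STD_LOGIC;
           value  : out STD_LOGIC_VECTOR(31 downto 0));
end bcdcounter;

architecture Behavioral of bcdcounter is
    signal digits : unsigned(31 downto 0) := (others => '0');
begin
    value <= std_logic_vector(digits);

    count: process(clk, reset)
        variable nxt   : unsigned(31 downto 0);
        variable d     : unsigned(3 downto 0);
        variable carry : boolean;
    begin
        if (reset = '1') then
            -- assumed asynchronous reset, clears all 8 digits
            digits <= (others => '0');
        elsif (rising_edge(clk)) then
            if (enable = '1') then
                nxt := digits;
                carry := true;    -- increment starts at the lowest digit
                for i in 0 to 7 loop
                    if carry then
                        d := nxt(4*i+3 downto 4*i);
                        if (d = 9) then
                            -- digit wraps to 0, carry moves to the next digit
                            nxt(4*i+3 downto 4*i) := "0000";
                        else
                            nxt(4*i+3 downto 4*i) := d + 1;
                            carry := false;
                        end if;
                    end if;
                end loop;
                digits <= nxt;
            end if;
        end if;
    end process;
end Behavioral;
```

Keeping the count in BCD means each 4-bit group can be shown directly as a decimal digit, with no binary-to-decimal conversion step on the display side.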

i_new is a pulse (one hc_clk clock cycle wide) that is generated when execution of a new instruction starts. It is detected when the CPU status goes from STATUS_READY to any of the STATUS_BUSY values (bit 1 of the status field is 1). The status is generated by each and every microinstruction (and unless stated otherwise, will be "busy"):

// Component interface signals
STATUS        .valfield 2 values
        ready,    // waiting for input character
        done,    // input processed, will go ready on next clock cycle
        busy,    // processing
        busy_using_mt default busy;    // processing and needing the MT_8816

hc_txdready is the signal that comes back from the UART sender. Because the UART (typically at 38400 baud) is slower than the CPU, the time spent waiting for it is excluded from the count: counting is disabled while the CPU waits.

Finally, the value of the counter is fed into the "hardware window" to be displayed on VGA:

(measurement converting 9999 to binary in 16-bit mode, with tracing input and tracing output disabled, clock 195kHz)

| Operation | 16-bit | 32-bit | Notes |
|---|---|---|---|
| 0..F (enter digit into TOS) | 36 | 52 | |
| ENTER (push registers down, clear TOS) | 47 | 79 | |
| Z (clear TOS) | 46 | 78 | |
| N (clear all regs) | 39 | 71 | |
| R (rotate registers) | 47 | 79 | |
| U (duplicate TOS) | 48 | 80 | |
| S (swap TOS and NOS) | 48 | 80 | |
| < (shift TOS up) | 50 | 82 | |
| > (shift TOS down) | 50 | 82 | |
| + (add TOS + NOS) | 48 | 80 | |
| - (sub TOS - NOS) | 49 | 81 | |
| * (multiply TOS * NOS) | 51 (best), 786 (typical) | 83 (best), 2578 (typical) | Best: TOS = 0. Typical: all other |
| / (TOS / NOS) | 95 (best), 1384 (typical) | 127 (best), 4830 (typical) | Best: TOS = 0 (divide by 0 detected). Typical: all other |
| # (convert to BCD) | 1529 | 5493 | for arg 9999/99999999 |
| $ (convert to binary) | 1316 | 4656 | for result from above |
| Q (integer square root of TOS) | 16305 | 94195 | Worst: max argument |
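
To put the cycle counts in perspective, they can be converted to wall-clock time. The worked example below assumes the 195 kHz clock quoted for the measurement above; the cycle counts themselves do not depend on the clock, so at a different clock frequency the times scale accordingly.

```latex
\[
t = \frac{N_{\text{cycles}}}{f_{\text{clk}}}, \qquad f_{\text{clk}} = 195\,\text{kHz}
\]
\[
\text{+ (16-bit)}: \quad t = \frac{48}{195\,000\ \text{Hz}} \approx 0.25\ \text{ms}
\]
\[
\text{* (16-bit, typical)}: \quad t = \frac{786}{195\,000\ \text{Hz}} \approx 4.03\ \text{ms}
\]
\[
\text{Q (32-bit, worst case)}: \quad t = \frac{94\,195}{195\,000\ \text{Hz}} \approx 483\ \text{ms}
\]
```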

General observations:

How to improve performance:
