Close

Test code - Fibonacci sequence generator

A project log for Iskra EMZ1001A - a virtual resurrection

4-bits wonder to print "Hello World!" and to calculate Fibonacci numbers to 13 decimal digits!

zpekiczpekic 12/12/2022 at 05:460 Comments

Why Fibonacci sequence? The algorithm is simple enough to fit into code and data memory, and allows to test lots of the processor functionality:

Description below refers to the code here, note that line numbers and labels may differ as I still tweak the program a bit.

The most common application of EMZ1001A were in small appliances or devices with simple capacitive keyboard and few LED 7-segment digits (for example, a microwave oven, digital scale etc.). For these to be useful, LED display must at all times display some correct data continuously. Interrupts do not exist, instead the display refresh is the main loop of the program, and everything else that needs to happen must be done within that dead loop (old calculators worked the same way, that's why during calculation their display was momentarily blank or scrambled). 

LED display can be only driven from RAM location, so at the time of LED display update RAM must contain valid data, in this case a valid Fibonacci number in the sequence. It is held in bank 2 (used for display), and previous 2 numbers are in banks 0 and 1. The debug display on VGA illustrates the placement of data (8+13 = 21):

So the algorithm is:

  1. Initialize the processor
  2. clear bank 0 (column 0 in the pic, now contains 0000000000000000)
  3. clear bank 1
  4. flip LSBit of LSDigit in bank 1 to 1 (so it now contains 0000000000000001)
  5. Add banks 0 and 1 and store to bank 2 (BCD add)
  6. Display bank 2
  7. Go back to step 6 and refresh LEDs until 1s tick is detected
  8. copy bank 1 to 0 and 2 to 1 (so we have always the previous 2 numbers in banks 0 and 1)
  9. go to step 5
  10. BUG: I wanted to detect carry from MSDigit to restart from 0 but somehow it is missed so it continues indefinitely generating last 13 valid Fibonacci decimal digits)

While the processor is relatively simple, it does have some clever capabilities that can be used to accomplish some tasks efficiently. Few highlighted below.

Looping over RAM locations

64 nibbles can be thought as a 2D 4*16 array, addressed by BU (2 bit) and BL (4 bit) registers. Loops can be started at any BU and BL either 0 or 15 and then traversed towards the opposite end. End condition is baked in into the instruction based on the direction of the BU increment/decrement direction. Example:

DEADLOOP:   LBZ 0;        // select RAM column 0, row 0
            JMS  CLEAR;    // clear RAM column 0
            
            LBZ 1;        // select RAM column 1, row 0
            JMS  CLEAR;     // clear RAM column 1

(omitted)
//    ---------------------------------------------------------------------------
            .org 0b1111000000;
//    Page 15 in the bank is the default place for subroutines
//    ---------------------------------------------------------------------------
CLEAR:  LAI 0;            // A = 0, BU set by caller
        XCI 0;            // Exchange with M[BU, BL], BU = BU, BL++
        JMP CLEAR;        // repeat until all covered (BU = 0)
        RT;                // back

Efficient and fast subroutine calls

Program memory can be up to 8k (13-bit address), but JMS instruction has only 6 - where does the rest come from? Best if from nowhere (it is implied) - unless PP was executed before (to change the page), JMS will jump to [curent bank][page 15][6-bit destination]. Storing entry points of subroutines on page 15, and only if that page does not have enough space (1 page = 64 locations), branching with PP + JMP to the final destination. Example:

(on page 0)

ADDLOOP: JMS  BCDADD;    // RAM[2,*] = RAM[0,*] + RAM[1,*]

(on page 15)

BCDADD: LBZ 0;           // BL = 0, BU = 0
        RSC;             // clear carry
ALOOP:  LAM 0b01;        // A = M[0, BL], BU = 1
        ADCS;            // C,A = A + M[1, BU] + C
        JMS V16_TO_18;   // carry set
        JMS V0_TO_15;    // carry not set
        XAE;             // save A in E
        LAI 2;           // A = 2
        XABU;            // BU = 2
        XAE;             // restore A
        XCI 0b10;        // M[2, BL] = A, BU = 0
        JMP ALOOP;       // repeat until all 16 digits
        RT;              // done

Similar to Intel 8008, stack is built into the processor (3 levels, but this implementation has 4 as that is easier to implement). Actually, there is no stack, but a memory block of 4 13-bit locations, and the 2-bit stack pointer selects one of them which is used as program counter; JMS and RT increment/decrement this pointer.  

Jumping over RAM banks 

In a single instruction of type XC* not only BL can be incremented / decremented, but next bank can be different or same using a simple 2-bit XOR trick: new BU = old BU xor instruction[1:0]. 

Example SHIFT0: banks will be:

After LBZ .... 1

After LAM .... 0

After XCI ... 1

(loop back to LAM so again 0/1, 0/1)

Example SHIFT1: banks will be 2/1, 2/1 ...)

SHIFT0:        LBZ 0b01;        // row (BL) = 0, col (BU) = 1
SHIFT0LOOP:    LAM 0b01;        // A = M[1, BL], BU = 0 (LSB flipped)
               XCI 0b01;        // exchange with M[0, BL], BU = 1 (LSB flipped back), BL++
               JMP SHIFT0LOOP;    // repeat until all moved from column 1 to 0
               RT;                
            
SHIFT1:        LBZ 0b10;        // start with column 2
SHIFT1LOOP:    LAM 0b11;        // switch to column 1 (both bits flipped!)
               XCI 0b11;        // switch back to column 2        
               JMP SHIFT1LOOP;
               RT;


7-seg LED display using DISN instruction

EMZ1001A has everything needed to drive 7-seg displays (actually 8, as DP (decimal point) dot can be driven too) of up to 13 digits. DISN instruction will "lookup" the pattern for hexadecimal digits 0-F in the internal ROM which will be output to D6..D0 and carry flag will directly drive D7 (decimal point). In the CPU logic:

...
-- seven segment pattern for DISN instruction
constant sevseg: mem16x8 := (
	"01111110",	-- 0
	"00110000",	-- 1
	"01101101",	-- 2
	"01111001",	-- 3
	"00110011",	-- 4
	"01011011",	-- 5
	"01011111",	-- 6
	"01110000",	-- 7
	"01111111",	-- 8
	"01111011",	-- 9
	"01110111",	-- A
	"00011111",	-- B
	"01001110",	-- C
	"00111101",	-- d
	"01001111",	-- E
	"01000111"	-- F
);
...
disn_out <= eur_invert xor (mr_cy & sevseg(to_integer(unsigned(mr_a)))(6 downto 0));	-- combine carry with lower 7 bits of 7seg lookup
...

A(address) and D(ata) lines driving 7-seg are also used for accessing external ROM, so there is lots of multiplexing:

CodeT1, T3 (SYNC low)T5, T7 (SYNC high)
common anodePSH to set pin A[BU] high, others low
EUR bit 0 set to 1 to not invert D (default)
drive LEDs only when not in test mode, using internal ROM and not in multiplexed mode (PSH with BL 14)drive LEDs in all modes except test and if not in D float mode (PSL with BL 14)
common cathodePSL to set pin A[BU] low, others high
EUR bit 0 set to 0 to invert D
drive LEDs only when not in test mode, using internal ROM and not in multiplexed mode (PSH with BL 14)drive LEDs in all modes except test if not in D float mode (PSL with BL 14)

Fibonacci code is in external ROM and the board uses "inverted common anode" so the bolded combination is set up by the program. Simple logic detects external ROM use during SYNC low and blanks the display. 

-- when using external ROM, prevent A and D messing up the LEDs when SYNC is low
blank <= not (SYNC or sw_internalrom);
	
DOT <= D(7);
A_TO_G <= D(6 downto 0);
AN <= "1111" when (blank = '1') else A(3 downto 0);

Now for some pains:

This pain can be seen in the BCD addition loop. Digits from banks 0 and 1 are added and stored to bank 2, but each stored value has to be corrected depending if it was greater than 15 (carry in this case is equivalent to 8080 H-flag, consumed by DAA instruction), or if between 10 and 15 - both of these require correction of adding 6 and setting of carry flag (which is now decimal, not binary carry) 

BCDADD: LBZ 0;           // BL = 0, BU = 0
        RSC;             // clear carry
ALOOP:  LAM 0b01;        // A = M[0, BL], BU = 1
        ADCS;            // C,A = A + M[1, BU] + C
        JMS V16_TO_18;   // carry set
        JMS V0_TO_15;    // carry not set
        XAE;             // save A in E
        LAI 2;           // A = 2
        XABU;            // BU = 2
        XAE;             // restore A
        XCI 0b10;        // M[2, BL] = A, BU = 0
        JMP ALOOP;       // repeat until all 16 digits
        RT;              // done

V0_TO_15:  XAE;
           LAE;            // E = A
           ADIS 6;         // Classic DAA adjust
           JMP BCDCARRY;   // carry set, adjustment was needed, A is correct BCD
           XAE;            // carry not set, adjustment was not needed, restore A
           RT;
BCDCARRY:  STC;
           RT;
            
V16_TO_18:  ADIS 6;     // 0x12 + 6 + 1 becomes 0x18 etc., carry remains set
            NOP;
            STC;        // decimal carry forward
            RTS;        // skip next

Discussions