The new breakpoints system

A project log for Discrete YASEP

a 16-bits YASEP computer (mostly) made of DIP/SOIC chips like in the 70s and 80s... with 2010's twists!

Yann Guidon / YGDES, 11/09/2015 at 16:21, 17 Comments

The Discrete YASEP enables a lot of explorations. The YASEP architecture is rather well defined and understood, but other aspects need innovation, in particular the breakpoint and trace systems, which have been essentially software-driven in the last decades. It's exciting to play and find new ways to do things that our predecessors could only dream of!

The design of the breakpoints started with the log Dear SN74HC688 where a classic system was described. Then @esot.eric gave me an even better idea (in the comments of Programming systems) that totally fits in the spirit of the Discrete YASEP project: use cheap, widely available parts to do all the work, in ways that break with the tradition of scarcity of the previous generations. The Discrete YASEP is the celebration of dirt cheap parts ;-)

Instead of using a 574 and a 688 for each individual breakpoint, a single 64K SRAM chip will use less room but provide 8 breakpoints, not just on one data point but on a whole range or even totally unrelated points. Alternatively, the breakpoints can be also used for code/data coverage and sent to a host for analysis.
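To make the idea concrete, here is a minimal software model of the lookup, not the actual hardware: a 64K×8 array is indexed by the 16-bit value on the spied bus, and each of the 8 data bits acts as one independent breakpoint channel. The names `bp_ram`, `arm` and `lookup` are made up for this sketch.

```python
# Software model of a 64K x 8 SRAM used as a breakpoint lookup table.
# Each of the 8 data bits is one independent breakpoint channel.
SRAM_SIZE = 1 << 16              # 16-bit spied bus -> 65536 entries

bp_ram = bytearray(SRAM_SIZE)    # all zeros: no breakpoint armed


def arm(channel, values):
    """Set bit `channel` (0..7) for every bus value in `values`."""
    mask = 1 << channel
    for v in values:
        bp_ram[v & 0xFFFF] |= mask


def lookup(bus_value):
    """Return the list of channels that fire for this bus value."""
    byte = bp_ram[bus_value & 0xFFFF]
    return [ch for ch in range(8) if byte & (1 << ch)]


arm(0, [0x1234])                    # a single point
arm(1, range(0x8000, 0x8100))       # a whole range
arm(2, [0x0010, 0x4000, 0xFFFF])    # totally unrelated points

print(lookup(0x1234))   # [0]
print(lookup(0x8042))   # [1]
print(lookup(0x4000))   # [2]
print(lookup(0x0000))   # []
```

In hardware the "lookup" is just the SRAM read access time, whatever the number of points armed on each channel.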

The more I think about it, the more possibilities appear, so I'll use this log to gather current and future ideas. Comments about it are piling up on an unrelated log so I'll move things here :-)


Update: In the comments, @K.C. Lee suggested to feed the BP signals to a set of counters for performance monitoring. This project is getting better and better thanks to your input, guys :-D

So now we have 4 counters made of 74HC4040, 36 bits long. An overflow bit is needed too. The 4040 are easy to clear and increment but it gets tough to manage this quantity of data: each stage of 4×4040 requires 6×HC253, 24 of them overall just to select one 36-bits counter out of 4. Then individual bytes must be selected for output/display...
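A rough behavioural sketch of these counters, under a few assumptions of mine: three 12-bit 4040s are chained per counter (3×12 = 36 bits), the overflow bit is sticky until the counter is cleared, and BP bits 0 to 3 each feed one counter. The class and function names are hypothetical.

```python
COUNTER_BITS = 36     # assumed: three 12-bit 74HC4040 in cascade
NUM_COUNTERS = 4


class EventCounter:
    """Ripple counter with a sticky overflow flag (an assumption)."""

    def __init__(self, bits=COUNTER_BITS):
        self.mask = (1 << bits) - 1
        self.value = 0
        self.overflow = False

    def clear(self):
        self.value = 0
        self.overflow = False

    def tick(self):
        self.value = (self.value + 1) & self.mask
        if self.value == 0:          # wrapped around
            self.overflow = True

    def byte(self, index):
        """Byte `index` (0 = least significant) for output/display."""
        return (self.value >> (8 * index)) & 0xFF


counters = [EventCounter() for _ in range(NUM_COUNTERS)]


def clock(bp_byte):
    """Assumption: BP bit i enables counter i on this cycle."""
    for i, c in enumerate(counters):
        if bp_byte & (1 << i):
            c.tick()


for bp_byte in (0b0001, 0b0011, 0b0001, 0b1000):
    clock(bp_byte)

print([c.value for c in counters])   # [3, 1, 0, 1]
```

The `byte()` method stands in for the byte-selection stage: a 36-bit value needs five byte reads, the last one only half used.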


The trace & breakpoint boards are amazing features, and like some other boards, they make this project a comfortable development environment. But it appears to get out of hand and the complex RAM-based breakpoints may drag down the whole project's development. They are now "optional" and I'm considering the return of a basic 688-based breakpoint for the "base configuration". I want things to get done ASAP, they can be extended later :-)

Discussions

Eric Hertz wrote 11/11/2015 at 11:50 point

Alright, forgive my ignorance... but my understanding is that a breakpoint is usually an instruction address, that when that instruction is accessed, the processor stops execution (where you can continue by single-stepping, or check register contents, etc...)... So, from this (limited/mis-) understanding, couldn't you have as many breakpoints as there are data locations in the "breakpoint [S]RAM?" (Granted, most SRAM chips probably have quite a few fewer address-bits than your address-bus, but so does the '688...).

I mean, basically what I visualized was that you were using the '688s to detect a "match" between your breakpoint address and the address on the program-counter. To match an address with SRAMs, just wipe the SRAM to 0, and put a '1' at the desired address(es)... That'd allow for, literally, thousands of breakpoints... How'd you come up with 8? (Oh, and each of those breakpoints could have many different categories, for e.g. masking)

Sorry, it's not really important that *I* understand, but I don't understand :)

Yann Guidon / YGDES wrote 11/11/2015 at 14:26 point

Hi,

I think I should make drawings/schematics ASAP :-D

The magic 8 number comes from the width of the SRAM. I simply connect the whole address bus (16 bits) to the bus to spy, and the SRAM's data bus is 8 bits wide, providing 8 independent sets of arbitrary condition (a bp is either disabled, fed to a counter, stops or gets combined with another).

For the prototype, with 5V DIP parts, I have a few 61512 parts (64K×8) at about 15ns.

For the 3.3V SMD version, I can use 32K×16 parts, and the upper/lower byte is selected by the missing address bit (with an inverter for the /UB or /LB signal). If it's too awkward or slow, I'll use the 64K×8 and just waste one of the data bytes. I don't think that having more than 8 bp would be wise, it would cost too much with only marginal usefulness. Prove me wrong and I'll put 16 BP in the 32-bits version of the YASEP ;-)
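The byte-lane trick can be sketched like this. It is only an illustration: I assume here that the "missing" bit is the top one (A15) and that A15 = 0 selects the lower byte, the real wiring may differ. The function names are invented for the example.

```python
def decode_32kx16(bus_value):
    """Split a 16-bit bus value into chip address and byte-lane strobes.

    Assumption: A15 is the 'missing' address bit.
    Both strobes are active low, like the /UB and /LB pins.
    """
    chip_addr = bus_value & 0x7FFF     # A0..A14 go to the SRAM
    a15 = (bus_value >> 15) & 1
    lb_n = a15                         # /LB is A15 directly
    ub_n = 1 - a15                     # /UB is A15 through the inverter
    return chip_addr, ub_n, lb_n


def read_bp_byte(ram_words, bus_value):
    """Return the 8 breakpoint bits seen for this bus value."""
    chip_addr, ub_n, lb_n = decode_32kx16(bus_value)
    word = ram_words[chip_addr]
    if ub_n == 0:                      # upper byte lane enabled
        return (word >> 8) & 0xFF
    return word & 0xFF                 # lower byte lane enabled


ram = [0] * (1 << 15)                  # 32K words of 16 bits
ram[0x0042] = 0xAB12

print(hex(read_bp_byte(ram, 0x0042)))  # 0x12
print(hex(read_bp_byte(ram, 0x8042)))  # 0xab
```

Both halves of the 16-bit bus range land on the same word, so the 32K×16 part behaves like a 64K×8 one at the cost of one inverter.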

(btw, I can't remove your duplicate comment, can you do it ?)

K.C. Lee wrote 11/11/2015 at 15:06 point

128Kx8 (1 Mbits) are available at $2.40 and only cost 2X the old (256k bit) 32Kx8.  I am sure you can find something to do with the extra address bit.  :)  I got piles of PCB with 256Kx16 SRAM.
64Kx8, 32Kx16 (512kbits density) are the odd balls.  I wouldn't bother with those.

Yann Guidon / YGDES wrote 11/11/2015 at 15:12 point

I have "a certain quantity" of SRAM chips, 32K×16 up to 256K×16 in TSOP44, bought for cheap on the bay, so this project is the best way to use them at last ;-)

K.C. Lee wrote 11/11/2015 at 04:40 point

A breakpoint system would only need a single bit of matching.  You can recover the info of which breakpoint it is after the fact at the break point anyway.

Think beyond breakpoints and more towards a tagging system.  With a bank of counters enabled by the tag, you could use it as a logging system for profiling your code and how many cycles it spends on each of the regions or even drive a logic analyzer/scope for looking at things in real time.

Yann Guidon / YGDES wrote 11/11/2015 at 05:01 point

It's slowly but certainly getting out of hand ;-)

There is already a bank of up-down counters: that's the register set. They already require 11×4=44 DIL16 chips :-D

Now I could implement a few counters, just like the Pentium's performance counters. Even a "TSC" would be good. Events would be pipelined to keep the pipeline short. I suppose that 16 bits counters are not enough. Instead of "reloadable" 4 bits counters (I use a bunch of '193 already for the registers) it would be more practical to use a pair of CD4040. That's 12 bits×2, are 24 bits enough for you ? :-)

(Oh, the CD4040 is a "slow" chip and might not work at the expected processor speed, "1.5 MHz min @5V", so I need another reference that could work @3.3V...)

Let's say there are 4 counters and they can be fed from certain breakpoints (otherwise the MUX tree would be too large and slow). Count up to 32 bits max. They have to share a Hex display, and be read back from the host...

It's a beginning.

K.C. Lee wrote 11/11/2015 at 05:12 point

There are faster 74HC4xxx for some of the 4000/4500 CMOS chips. Ripple counters are on that list.
I like complexity - I went the opposite direction - my FPGA computer project.

Yann Guidon / YGDES wrote 11/11/2015 at 05:28 point

Hah, I just checked and I see that I was smart enough to take the 74HC version in my local store :-) the sn74hc is rated at 28MHz @ 4.5V/25°, should be enough.


As you can see, I'm on a sabbatical from FPGA world ;-) But I will return with even more experience.

danjovic wrote 11/10/2015 at 10:09 point

What about expanding the 8 breakpoints with some 74LS138s ? After all you have 8 bits to play with. Or maybe you can use the 8 bits to load a partial address area where your debug lies within, thus allowing to have different trap addresses.
Another thing, the most brilliant part of the idea of using a SRAM for trapping debug addresses is that you don't need only to trap 'addresses'. You can also trap 'ranges of addresses', got it?

Yann Guidon / YGDES wrote 11/10/2015 at 23:04 point

Hi,

Yes, I "got" the "ranges" thing before you mentioned it, and not just ranges, but any set (even not contiguous). This is why it's also interesting for code coverage : if the output is left in "set the bit" mode (which is usually toggled manually) then the SRAM gets filled with 1s for all the values encountered. That's called "coverage" : the whole algorithm can be covered, not just the instruction space (the classic "code coverage" thing that some SW do) but also opcode coverage (already done by a JavaScript assembler but heh.) and even data coverage, to see if result values exceed a given range, for example...
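As a small illustration of that "set the bit" mode (a software model only, with hypothetical names): every value seen on the spied bus sets its bit in the RAM, and afterwards the host reads the RAM back to see what was covered or which range of values was reached.

```python
coverage_ram = bytearray(1 << 16)      # one byte per 16-bit bus value


def record(channel, bus_value):
    """'Set the bit' mode: mark this value as seen on `channel`."""
    coverage_ram[bus_value & 0xFFFF] |= 1 << channel


def covered(channel):
    """Host-side readback: sorted list of all values seen."""
    mask = 1 << channel
    return [v for v in range(len(coverage_ram)) if coverage_ram[v] & mask]


def value_range(channel):
    """Smallest and largest value seen, or None if nothing was seen."""
    seen = covered(channel)
    return (min(seen), max(seen)) if seen else None


# Code coverage: channel 0 spies the instruction addresses.
for pc in (0x0100, 0x0102, 0x0104, 0x0102, 0x0104, 0x0106):
    record(0, pc)

# Data coverage: channel 1 spies result values.
for result in (12, 500, 37, 499):
    record(1, result)

print(covered(0))        # [256, 258, 260, 262]
print(value_range(1))    # (12, 500)
```

Checking "did a result ever exceed a given limit" is then a simple comparison on the host side, for example `value_range(1)[1] > 255`.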

Your '138 idea: I'm not sure I understand how you want to connect them.

danjovic wrote 11/11/2015 at 00:02 point

Hi,

I have understood that each bit from the data lines of the RAM is meant for one breakpoint, and that's why the number of breakpoints would be 8. But if you consider 8 bits you can have up to 255 break points (2^8 less one for 'idle'). But thinking better, forget the 138's. You can use the data bus from 'trap ram' as a partial address to a jump table and then have up to 255 trap address entries. Of course you can also have a master 'break point' execution address that reads the 8 bits from the 'trap ram' and then decides what to do.

Yann Guidon / YGDES wrote 11/11/2015 at 00:38 point

Your approach makes all breakpoints related in a way or another. When 2 bp share common conditions, you have to allocate 2²=4 codes for all the possibilities (one, the other, none, both). Which is equivalent to 2 independent bits... If you want to be "smart", you end up needing smart software that uses set theory to "compile" compact codes for all the combinations of requested conditions. That's... unnecessary. This will confuse the novice users. 8 breakpoints is already more than I've ever used (at most 2 or 3 in practice). Remember, there are already 8 bp for each of the 5 spied buses !
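Here is a toy version of that "compile" step, just to show the counting argument. It is a sketch with invented names, and code 0 is assumed to mean "idle". Each distinct combination of overlapping breakpoints needs its own code, so two overlapping breakpoints already use 4 codes, which is exactly what 2 independent bits give for free.

```python
def compile_codes(bp_sets):
    """Give one code to each combination of breakpoints that can fire.

    bp_sets: list of collections of addresses, one per breakpoint.
    Returns (combos, table): combos maps a tuple of breakpoint indices
    to its code, table maps each address to the code stored in the RAM.
    Code 0 is reserved for 'idle' (an assumption of this sketch).
    """
    combos = {}
    table = {}
    for addr in set().union(*bp_sets):
        combo = tuple(i for i, s in enumerate(bp_sets) if addr in s)
        table[addr] = combos.setdefault(combo, len(combos) + 1)
    return combos, table


def independent_bits(bp_sets, addr):
    """Same information with one bit per breakpoint, no compiler."""
    return sum(1 << i for i, s in enumerate(bp_sets) if addr in s)


bp_a = range(0x00, 0x10)     # two overlapping ranges
bp_b = range(0x08, 0x18)

combos, table = compile_codes([bp_a, bp_b])
print(len(combos) + 1)                      # 4 (A only, B only, both, idle)
print(independent_bits([bp_a, bp_b], 0x0C)) # 3 (both bits set)
```

With n breakpoints that may all overlap, the encoded scheme needs up to 2^n codes, so 8 data bits give 8 freely combinable breakpoints either way; the 255 figure only holds when the breakpoints never coincide.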

The other constraint is to keep the logic delay as short as possible because the bp becomes part of the processor's critical datapath. I know it's going to be slow but if I can make it a bit faster, like 3 MIPS (160ns minor cycle), my inner speed demon will grin :-) at 4 MIPS (125ns) I'll be the king of the world :-P

danjovic wrote 11/11/2015 at 12:58 point

Ok, You got a point. 8 BPs are good enough :) 

Eric Hertz wrote 11/10/2015 at 07:59 point

Hah, this breakpoint thing is kinda blowing my mind, but I'm glad you understand it :)

Yann Guidon / YGDES wrote 11/10/2015 at 08:44 point

The breakpoint system is as crazy as the trace system ;-) I have not yet counted how many SRAM chips it's going to use, but way more capacity than the 3 user memory banks...

Eric Hertz wrote 11/10/2015 at 09:19 point

I think you need to add SDRAM in there, somewhere... But of course, I'm kinda partial to it...

Yann Guidon / YGDES wrote 11/10/2015 at 22:07 point

Moooaaaaahr Breakpoints ! :-P
Shouldn't 8 bp be enough for anybody ?...
