Regression Testing

A project log for Bexkat1 CPU

A custom 32-bit CPU core with GCC toolchain

Matt Stock, 12/12/2015 at 15:01

So far in these projects, I've been able to build iteratively and not run into too many nasty bugs. There are many layers of abstraction though (libraries, compiler, assembler, machine, CPU), and so when a bug does crop up, it can be really challenging to find.

Most recently, I found that I had misunderstood some subtleties of transferring data between registers. The fix was simple: an opcode that zero fills the upper bits when you copy an object smaller than the register size. The symptom, though, was that printf() sometimes printed the wrong character when printing a number. Eventually I isolated this to 33 % 10 evaluating to 9 instead of 3, which meant I didn't have to debug libc. After narrowing the problem down further to a very small test case, I could see why the CPU was generating the incorrect value. The whole thing probably took me 4 days to debug.
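To make the zero-fill idea concrete, here is a minimal C sketch of the two copy behaviours. It is a hypothetical model, not the actual Bexkat1 datapath: the function names and the input value 0x121 are my own, chosen for illustration only, and the log does not say what the stale upper bits really were.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not Bexkat1's real datapath: a register-to-register
   copy of an object narrower than the 32-bit register. */

/* Copies all 32 bits, so any stale upper bits in src come along. */
uint32_t copy_raw(uint32_t src)
{
    return src;
}

/* Copies the low `bits` bits and zero fills everything above them. */
uint32_t copy_zero_fill(uint32_t src, unsigned bits)
{
    assert(bits >= 1 && bits <= 31);
    uint32_t mask = (UINT32_C(1) << bits) - 1u;
    return src & mask;
}

/* A full-width operation performed after the copy, here modulo 10. */
uint32_t last_decimal_digit(uint32_t reg)
{
    return reg % 10u;
}
```

With an assumed register value of 0x121 (the byte 0x21 is 33, with one stale bit above it), `last_decimal_digit(copy_zero_fill(0x121, 8))` gives 3, while `last_decimal_digit(copy_raw(0x121))` gives 9 because the full register reads as 289. That is one way stale upper bits could corrupt a later full-width operation.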

As I plan on making some radical changes that could break things, I need to consider how best to avoid introducing more issues like this and, if one does slip in, how to track it down quickly.

The best idea I've got right now is to leverage the space I have within the FPGA and build more stuff. Since I plan to start reducing the size of the combinatorial paths within the CPU, which could affect timing, I'll create a second CPU. The new CPU will be the one I modify, and the first one will be my canary. I can feed them both the same data in parallel. The output from the canary will not be connected to the rest of the system, but will instead feed into a testing module. The module will also get taps from the second processor, and if the outputs diverge it can raise a signal that I can catch with the debug tools.
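The comparison logic in that testing module can be sketched as a small software model. This is only an illustration in C, not HDL and not the author's design: the `cpu_taps` fields, the sticky flag, and the cycle counter are all assumptions about what such a module might watch and record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical set of signals tapped from each CPU on every clock. */
typedef struct {
    uint32_t address;
    uint32_t write_data;
    bool     write_enable;
} cpu_taps;

/* Testing-module state: a sticky error flag plus the first bad cycle. */
typedef struct {
    bool     diverged;
    uint64_t cycle;
    uint64_t first_bad_cycle;
} lockstep_checker;

void checker_reset(lockstep_checker *chk)
{
    assert(chk != NULL);
    chk->diverged = false;
    chk->cycle = 0;
    chk->first_bad_cycle = 0;
}

static bool taps_equal(const cpu_taps *a, const cpu_taps *b)
{
    if (a->address != b->address || a->write_enable != b->write_enable)
        return false;
    /* Write data is only meaningful while a write is in progress. */
    return !a->write_enable || a->write_data == b->write_data;
}

/* Call once per clock with the canary's taps and the modified CPU's taps.
   Returns the sticky divergence flag. */
bool checker_step(lockstep_checker *chk,
                  const cpu_taps *canary, const cpu_taps *modified)
{
    assert(chk != NULL && canary != NULL && modified != NULL);
    if (!chk->diverged && !taps_equal(canary, modified)) {
        chk->diverged = true;
        chk->first_bad_cycle = chk->cycle;
    }
    chk->cycle++;
    return chk->diverged;
}
```

Latching the first divergent cycle, rather than only raising a flag, is a design choice in this sketch: it gives the debug tools a starting point even if the two CPUs happen to agree again later.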

The nice thing about this is that it's fairly lightweight, and it will let me see immediately if the timing has changed. It doesn't rely on any other device in the system, so I don't need special test programs or anything like that. That said, a program that ran some additional self-tests would still be beneficial.
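A self-test program of the kind mentioned above could start very small. The following C sketch is an assumption about what it might check, seeded with the 33 % 10 case from the earlier bug; the failure-bit names and the `run_self_tests` function are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per check, so a single result word identifies what failed. */
enum {
    FAIL_MOD      = 1u << 0,
    FAIL_DIV      = 1u << 1,
    FAIL_BYTECOPY = 1u << 2
};

uint32_t run_self_tests(void)
{
    /* volatile keeps the compiler from folding these at build time,
       so the arithmetic really executes on the target CPU. */
    volatile uint32_t n = 33, d = 10;
    volatile uint8_t  narrow = 0x21;
    uint32_t failures = 0;

    if (n % d != 3) failures |= FAIL_MOD;
    if (n / d != 3) failures |= FAIL_DIV;

    /* Widening a byte must leave the upper 24 bits clear. */
    uint32_t wide = narrow;
    if (wide != 0x21u) failures |= FAIL_BYTECOPY;

    return failures;
}

/* Stops the program via assert if any check fails. */
void require_self_tests_pass(void)
{
    assert(run_self_tests() == 0);
}
```

Running the same program on both CPUs would exercise the comparison hardware with known-interesting instructions, and the returned bitmask narrows down which operation misbehaved.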

I'm curious if anyone has other ideas on how to build the equivalent of unit tests for systems of this complexity? I never got into the simulation aspects of Verilog - is that something that is worth the time to retrofit, or is the benefit of simulation more pre-synthesis?

Discussions

Yann Guidon / YGDES wrote 12/12/2015 at 23:18

You're entering the heart of the art of architecture design :-)

Differential testing is one method, used when no compiler or simulator was available.

I describe the design in more than one language to catch semantic problems: JavaScript serves as a "golden" model, and the VHDL is derived from it.

Simulation saves all the compilation time, and you can (and should) observe any signal and trace an error back to its source step by step. In fact, you should start the design in simulation and synthesise only when the code is clean and working.

Modularity helps too :-)
