Real-time scanline benchmarks

A project log for PICO9918

A replacement for the classic TMS9918A/TMS9929A VDP, powered by a Raspberry Pi Pico (RP2040)

Troy Schrapel, 06/13/2024 at 07:57

Thought I'd do some initial real-time benchmarks - just to see how much wiggle room I have to implement more advanced features. It's looking quite good. 

These images will look bad, but bear with me. :D

What's happening here is that I switch the VGA signal to output BLUE as soon as it has finished generating the next scanline. So, if it was spending 100% of the available time generating scanlines, these images would look perfectly normal, since it would never have time to switch to BLUE. If the image is mostly blue, that means there's loads of time left to do more cool stuff. Keep in mind that the VGA output is 640x480, so I'm doubling each scanline vertically for a virtual resolution of 640x240 (it used to be 320x240, but I switched to 640x240 to support 80 column mode). I generate a scanline once and use it twice. With that in mind, if I was using 50% of my single CPU core's capacity to generate scanlines, you would see alternating lines: correct image, BLUE, correct image, BLUE, etc.
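To make the picture-to-load mapping concrete, here is a minimal sketch in C. It is only a model of the idea described above, not the PICO9918 firmware: the names OverlayResult and overlay_for_load are made up for illustration, and it assumes the budget is exactly two VGA lines, that rendering starts at the beginning of the first of them, and it ignores horizontal blanking.

```c
#include <assert.h>

/* Hypothetical model: what the monitor shows on the two VGA lines
   that make up one doubled scanline. Values are fractions 0..1. */
typedef struct {
    double pass1_image; /* part of the first VGA line showing the normal image */
    double pass2_image; /* part of the second VGA line showing the normal image */
} OverlayResult;

/* used: fraction (0..1) of the two-line budget spent generating the
   next scanline. Assumes output flips to BLUE the moment rendering
   ends, so everything after that point on both lines is blue. */
OverlayResult overlay_for_load(double used)
{
    assert(used >= 0.0 && used <= 1.0);

    OverlayResult r;
    double lines = used * 2.0; /* render time measured in VGA lines */

    r.pass1_image = lines < 1.0 ? lines : 1.0;
    r.pass2_image = lines > 1.0 ? lines - 1.0 : 0.0;
    return r;
}
```

With this model, overlay_for_load(0.5) gives a full image line followed by a fully blue line, which is the alternating pattern described above, and overlay_for_load(1.0) gives two full image lines, i.e. a normal-looking picture.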

OK, that's the gist of it. Now here are the results:

Graphics I - No sprites 


Using around 33% of the available scanline generation time, i.e. it finishes around 2/3 of the way through the first pass of a doubled scanline.
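For a feel of what that percentage means in absolute time, here is a rough sketch in C. It assumes the industry-standard 640x480 @ 60 Hz timing (25.175 MHz pixel clock, 800 pixel clocks per line including blanking); the actual PICO9918 pixel clock may differ slightly, so treat the numbers as approximate. The function names are hypothetical.

```c
#include <assert.h>

/* Assumed standard 640x480 @ 60 Hz VGA timing (not taken from the
   PICO9918 source): 25.175 MHz pixel clock, 800 clocks per line. */
#define PIXEL_CLOCK_HZ        25175000.0
#define CLOCKS_PER_LINE       800.0
#define VGA_LINES_PER_SCANLINE 2.0 /* each scanline is shown twice */

/* Time available to generate one doubled scanline, in microseconds. */
double scanline_budget_us(void)
{
    return VGA_LINES_PER_SCANLINE * CLOCKS_PER_LINE / PIXEL_CLOCK_HZ * 1e6;
}

/* Time spent rendering, given the fraction (0..1) of the budget used. */
double render_time_us(double used)
{
    assert(used >= 0.0 && used <= 1.0);
    return used * scanline_budget_us();
}
```

Under those assumptions the budget is roughly 63.6 microseconds per doubled scanline, so a 33% load corresponds to about 21 microseconds of rendering.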



About the same: 33%. 

Mixed modes


Here, you can see MC (Multicolor) mode is the fastest. Graphics II lags a bit, using roughly 40% of the available time.




Adding in sprites gets us to a pinch over 50% of the scanline time used (hence the black appearing on the left where sprites appear).

Anyway, all in all, I'm happy with the results. It means I have some room to move. For the most part, I could render the entire display twice (or more). I also haven't done much work to target my TMS9918A library to the Pico (it was originally written for desktop use), so I'm sure there are performance gains to be had by keeping the RP2040 in mind. Also, the scanline rendering is all taking place on a single CPU core. The second core is mostly twiddling its thumbs waiting for interrupts from the PIOs - I could allocate some work (such as the sprite layer) to that CPU core.
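The "render the display twice (or more)" estimate is just the reciprocal of the load. A small sketch in C, using a hypothetical helper name and the approximate loads quoted in this log (33%, 40%, and a little over 50%):

```c
#include <assert.h>

/* How many complete scanline renders fit into one doubled-scanline
   budget at a given load. used is the fraction (0..1] of the budget
   that a single render consumes. Hypothetical helper for illustration. */
int full_renders_in_budget(double used)
{
    assert(used > 0.0 && used <= 1.0);
    return (int)(1.0 / used); /* truncate: partial renders don't count */
}
```

At 33% this gives 3 full renders, at 40% it gives 2, and at exactly 50% it gives 2. Anything even slightly over 50% (for example 0.52) drops to 1, which is why the sprite-heavy case is the one that would benefit most from moving work to the second core.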