
Instead of CGIA, go with a super-VDP instead?

A project log for Kestrel Computer Project

The Kestrel project is all about freedom of computing and the freedom of learning using a completely open hardware and software design.

Samuel A. Falvo II • 05/11/2016 at 23:52 • 3 Comments

After stumbling upon a fairly sizable collection of design notes and interviews from the inventor of the Texas Instruments TMS9918A VDP (video display processor), I began to entertain the possibility that maybe the Kestrel-3 should use a VDP-like video core to replace the MGIA instead of my original CGIA (Configurable Graphics Interface Adapter) idea.

Recap: CGIA

I haven't disclosed many details about the CGIA anywhere because I never had a need to, but this is what I've been thinking. Take the MGIA logic as-is, and expose some of its guts to the programmer. In particular, expose the DMA engine and make it programmable. Video data would be streamed from a single buffer, using a single DMA fetch pointer. Every HSYNC, the CGIA would fetch the next N bytes from the video buffer into a waiting scanline buffer, where N is configurable via a control register. The fetch logic would make no attempt to interpret the meaning of the bits it reads. Then, on the next HSYNC, the pending scanline buffer becomes the currently displayed buffer (leaving the formerly displayed buffer as the new pending buffer for DMA purposes). From there, the contents of the buffer would be clocked out at a configurable rate, with the bits routed to a palette register bank in a configurable manner. In this way, the programmer would have complete control over video bandwidth/CPU bandwidth tradeoffs.
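To make the double-buffering scheme concrete, here is a minimal C sketch of the per-scanline behavior described above. The names (`cgia_t`, `cgia_hsync`, `MAX_FETCH`) are invented for illustration; the real CGIA would do this in FPGA logic, not software.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_FETCH 256  /* assumed upper bound on the per-line fetch count N */

typedef struct {
    const uint8_t *fetch_ptr;     /* single DMA fetch pointer into video RAM */
    uint32_t fetch_count;         /* N: bytes fetched per HSYNC (control register) */
    uint8_t  line[2][MAX_FETCH];  /* the two scanline buffers */
    int      display;             /* index of the currently displayed buffer */
} cgia_t;

/* On each HSYNC: the pending buffer becomes the displayed buffer, and the
 * next N bytes are fetched into the buffer just vacated.  The fetch makes
 * no attempt to interpret the bits it reads. */
static void cgia_hsync(cgia_t *c)
{
    c->display ^= 1;                    /* pending becomes displayed */
    memcpy(c->line[c->display ^ 1],     /* old displayed buffer is refilled */
           c->fetch_ptr, c->fetch_count);
    c->fetch_ptr += c->fetch_count;
}
```

How the bits in the displayed buffer are routed to the palette registers, and at what dot-clock rate, would be governed by separate control registers not modeled here.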

Recap: VDP and its Progeny

The TMS9918A exposes only a minimal amount of configuration to its user; coming out of reset, it is basically configured in a 32x24 character display (what they call "pattern graphics"). The configuration it does offer is, for example, whether or not you're using 4K or 16K dynamic RAMs, whether or not background video is genlocked, and where the various display tables will appear in video RAM. The number of colors supported and the monitor synchronization timing are all hard-wired to support NTSC television, not unlike the Kestrel-3's MGIA being hardwired to support IBM VGA. A completely separate chip had to be made a few years later to address the PAL market.
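For a sense of just how small that configuration surface is, here is a sketch of setting up a TMS9918A-style VDP for pattern graphics. The two-byte register-write protocol and the bit assignments are from my memory of the TMS9918A programming model and should be checked against the datasheet; `vdp_port_write` is a stand-in for the real control port, recording writes so the sequence can be inspected.

```c
#include <stdint.h>

static uint8_t port_log[64];  /* captured port writes, for inspection */
static int     port_len = 0;

/* stand-in for the VDP's control port */
static void vdp_port_write(uint8_t byte) { port_log[port_len++] = byte; }

/* TMS9918A register writes take two port writes: the data byte first,
 * then 0x80 OR'd with the register number. */
static void vdp_write_reg(uint8_t reg, uint8_t value)
{
    vdp_port_write(value);
    vdp_port_write(0x80 | (reg & 0x07));
}

/* Everything configurable fits in a handful of registers: mode bits,
 * DRAM size, and table base addresses.  Video timing is not among them. */
static void vdp_init_pattern_mode(void)
{
    vdp_write_reg(0, 0x00);  /* Graphics I ("pattern") mode              */
    vdp_write_reg(1, 0xC0);  /* 16K DRAMs selected, display enabled      */
    vdp_write_reg(2, 0x05);  /* name table base    = 0x05 * 0x400        */
    vdp_write_reg(3, 0x80);  /* color table base   = 0x80 * 0x40         */
    vdp_write_reg(4, 0x01);  /* pattern generator  = 0x01 * 0x800        */
    vdp_write_reg(7, 0x01);  /* backdrop color                           */
}
```

Note that nothing in this sequence touches sync timing: the NTSC parameters are baked into the silicon, which is exactly the limitation I'd want to remove.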

Later generations of the VDP, such as the V9938, V9958, and V9990, all added higher bandwidth paths to memory when they became available, support for higher color depths, planar as well as patterned graphics, etc.; but, they otherwise retained their hardwired video timing parameters. An open source clone of the TMS9918A that borrows some V9938 features is also available online.

Finally, there is the Gameduino 1.0 device, which is perhaps the first VDP-like video interface to actually drive SVGA monitors (800x600 72Hz, to be precise). Its feature list is most impressive; however, it is limited in its resolution (400x300 addressable pixels) and available on-screen colors (you can show lots of colors, but your palette selections are limited and highly optimized for tile-based graphics, such as you'd find on NES-like gaming consoles). It fills its niche quite well, but I don't think it's appropriate for the Kestrel-3.

My Thoughts

The VDP clone route is appealing for a number of reasons, and if I were to take it, I see no reason why I couldn't add the programmable video timing parameters I'd like to have. Moreover, I can have it bootstrap into a planar graphics video mode that is backward compatible with the MGIA, thus allowing existing system software to work with it. No need to recompile any existing graphics drivers except where color, sprites, or different resolutions are needed.

Benefits of my original CGIA concept:

Detriments of my original CGIA concept:

Benefits of adopting a VDP-like architecture:

Detriments of adopting a VDP-like architecture:

I can't think of any further disadvantages. I should point out that the lack of horizontal smooth scrolling stems from the same underlying problem in both the VDP and CGIA designs, so while I list it as a detriment for both, any solution I come up with would apply equally to either.

So I'm curious about your thoughts; should I try to go with a VDP architecture? Would this make the Kestrel-3 more appealing for others to use or program for? Should I stick with a simpler graphics architecture and rely on improving CPU performance to make up for sluggish frame rates in the future?

Discussions

Sean R. Lynch wrote 12/23/2016 at 05:03 point

It seems like whatever you do with the VDP approach will end up being overtaken by improving CPU performance. I did like being able to write characters on the screen and move sprites around just by poking a couple of bytes on my Commodore 64, and I'm enamored with hardware sprites and hardware text, but practically speaking SDL works just fine, and both it and the API it requires from the graphics card are quite simple. If you do go with any kind of acceleration, I'd focus on low-level stuff that makes it possible to accelerate libraries like Cairo.


Samuel A. Falvo II wrote 12/23/2016 at 07:03 point

Interesting; I find SDL 2.0 and Cairo utterly inscrutable.  To get my Kestrel-3 emulator working, I spent the better part of a month figuring out how to properly paste a frame buffer on the screen.  :(

Most of its performance, though, comes from hardware acceleration on the video card itself.  SDL 2 would be a performance dog without the help of the GPU.  Granted, it's just using the GPU as a glorified blitter; but, still...

I've decided to stick with the CGIA approach to generating video, as you might have guessed by reading subsequent blog posts.  VDP-style video controllers just don't interface that well to SDRAM.  However, the CPU performance really isn't expected to get higher than 24 MIPS in the coming years (assuming I pull a miracle out of .... the closet, and somehow manage to implement a real, 5-stage pipelined CPU microarchitecture coupled to a large enough cache memory), so I may end up having to introduce some sprite and/or blitter mechanisms to overcome the CPU's lack of throughput anyway.

But, for now, the CPU will happily bit-bang the frame buffer.  Optimizations can come later, especially once I can get some baseline measurements on performance.  :)


Sean R. Lynch wrote 01/03/2017 at 05:03 point

I guess I underestimated the bandwidth difference. But a blitter and some accelerated drawing primitives with dedicated framebuffer RAM seem like a more general solution than sprites, if they're sufficient for the performance you want.
