
VGA Timing Improvements

A project log for TRS-80 Model 1 on a PIC32

TRS-80 Model I emulated on a PIC32MX processor; VGA, PS/2, and SD for tape and disk images. Oh, and glorious cassette sound.

ziggurat29 09/16/2016 at 17:55

Having grown weary of swapping my main monitor back and forth between the dev computer and the project board, I decided to invest in an inexpensive ($35) 7" VGA monitor off eBay from China. After about a month, it arrived, and I eagerly plugged it in. And... nothing... (sigh). I was able to get it to display a little bit by fiddling with the controls, but it was still way off screen to the left and unusable. Hmm.

Up to this point I have been using a video driver which I had borrowed and hacked upon to make signals that would be compatible with VGA multisync monitors, but also be some integral multiple of the archaic 384x192 resolution of the TRS-80 Model I (which is too low for modern things). Truthfully, and I am chagrined to admit it, I did less calculating and measuring than fiddling with constants until it worked. Since this project started as a lark, and since it had worked on the few monitors I had tested, I didn't really think that much about it. But, now, OK, maybe we need to check our assumptions with the oscilloscope and some standards.

OK, so the current video is not exactly any standard mode, but it is closest to SVGA 800x600/60 Hz. I generate it by tripling the vertical lines and doubling the horizontal pixels to 768x576. So, I decided to close the gap to 800x600 with some blanked lines (handled in the frame gen state machine). The horizontal doesn't strictly matter: there's some blank to the right that you don't really notice unless you want to, and all the more so because monitors these days are usually widescreen anyway, so there's already plenty of blankness. (Strictly, there is an old VESA mode of 768x576 which would be perfect, but I doubt that would be supported on cheap monitors like the one I got. Anyway, the dot clock in that mode is 34.96 MHz, as opposed to 40 MHz for 800x600, which is an integral divisor of the PIC32 system clock (80 MHz).)
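The scaling arithmetic above can be written down as a minimal sketch in C. The constant and function names are made up for illustration and are not taken from the project source; the numbers are the ones quoted in the text.

```c
/* Native TRS-80 Model I resolution, the scale factors, and the target
 * SVGA active area, as quoted in the text. Names are illustrative only. */
enum {
    SRC_W   = 384, SRC_H   = 192,
    H_SCALE = 2,   V_SCALE = 3,
    MODE_W  = 800, MODE_H  = 600
};

/* Size of the scaled image in VGA pixels and raster lines. */
int scaled_width(void)  { return SRC_W * H_SCALE; }
int scaled_height(void) { return SRC_H * V_SCALE; }

/* Raster lines the frame state machine has to leave blank in total. */
int blank_lines_total(void) { return MODE_H - scaled_height(); }

/* Pixel columns left blank (all on the right, per the text). */
int blank_cols_total(void) { return MODE_W - scaled_width(); }
```

With these figures the image is 768x576, leaving 24 blank raster lines and 32 blank pixel columns inside the 800x600 active area. How the 24 lines are split between top and bottom is not stated in the text.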

So, having measured, the vertical looks good at about 59 Hz, and the horizontal looks good at about 37 kHz. The vertical sync was wildly out of compliance at 54.8 us (instead of 105.6). That's easy: there's a constant which defines the width of the vertical sync in horizontal lines, which was 2 and now is 4, so it's good now at about 108.8 usec.
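For reference, here is a small sketch of where the spec numbers come from, using the nominal 800x600/60 Hz horizontal timing (800 active + 40 front porch + 128 sync + 88 back porch dot clocks at 40 MHz). The function names are illustrative, not from the project source.

```c
/* Nominal SVGA 800x600 @ 60 Hz horizontal timing, in dot clocks. */
enum {
    H_ACTIVE      = 800,
    H_FRONT_PORCH = 40,
    H_SYNC        = 128,
    H_BACK_PORCH  = 88,
    DOT_CLOCK_MHZ = 40
};

/* Duration of `dots` dot clocks, in nanoseconds. */
long dots_to_ns(long dots)
{
    return dots * 1000 / DOT_CLOCK_MHZ;
}

/* Length of one full raster line, in nanoseconds. */
long line_period_ns(void)
{
    return dots_to_ns(H_ACTIVE + H_FRONT_PORCH + H_SYNC + H_BACK_PORCH);
}

/* Width of a vertical sync pulse that lasts `lines` raster lines. */
long vsync_width_ns(int lines)
{
    return lines * line_period_ns();
}
```

A nominal line is 26.4 us, so a 4-line vertical sync is 105.6 us and a 2-line one is only 52.8 us. The values measured on the board (54.8 and 108.8 us) are a little longer, consistent with its line period of roughly 27 us.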

The horizontal sync looked good at 3.8 usec (spec calls for 3.2), but the horizontal back porch is wildly out of spec at 1.6 us (spec says 2.2 usec). This almost certainly explains why the video is hopelessly off the screen to the left; that 1 us is 40 pixels! How to fix?

OK, the video generation is done mostly using hardware peripherals: 3 SPI for pixel data, 3 DMA to feed the SPIs with no CPU overhead, a Timer set to count up and recycle once per raster line, and an Output Compare unit to generate horizontal sync pulses based on the counter value in the Timer. Also, the rising edge of the Output Compare (end of horz sync) is what triggers the SPI to start, because the 3 SPI are configured into 'frame slave mode' with the horz sync fed back into the frame start inputs. The SPI is in 32-bit mode, and is preloaded with a value of '0', and is clocked at sysclock/4 = 20MHz (since I'm doubling the physical horizontal resolution from 384 to 768, I halve the dot clock rate to make each pix twice as long). The DMA starts clocking out the actual frame buffer once that initial '0' has been sent.

There is a short ISR entered at the rollover of the timer (i.e. at the beginning of the horizontal sync pulse) which drives a state machine that bit-bangs the vertical sync, and sets up the DMA to point to the current frame buffer line. All this preparatory work happens while the sync is low, with negligible CPU overhead. The time-critical magic happens under hardware control for all three color planes, in sync, when the Output Compare goes high (i.e. the end of the horizontal sync pulse). Meanwhile the CPU is off tending to other stuff, not to be bothered again for about 27 usec.
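The per-line decision that ISR makes can be modelled on a host machine without touching any PIC32 registers. The sketch below is hypothetical, not the project's code: the names, the 628-line frame total, and the position of the first active line are all assumptions chosen to resemble 800x600/60 Hz timing.

```c
#include <stdbool.h>

enum {
    SRC_ROWS     = 192, /* TRS-80 frame buffer rows (from the text)        */
    V_SCALE      = 3,   /* each row is shown on three raster lines         */
    VSYNC_LINES  = 4,   /* vertical sync width in lines (from the text)    */
    TOTAL_LINES  = 628, /* ASSUMED: nominal 800x600@60 lines per frame     */
    FIRST_ACTIVE = 39   /* ASSUMED: first raster line that carries pixels  */
};

struct line_info {
    bool vsync_active; /* should the bit-banged vsync be asserted?         */
    int  src_row;      /* frame buffer row to aim the DMA at, -1 if blank  */
};

/* Called once per raster line, as the timer-rollover ISR would be.
 * Reports what to do for the current line, then advances the counter. */
struct line_info next_line(int *line)
{
    struct line_info li;
    int rel = *line - FIRST_ACTIVE;

    li.vsync_active = (*line < VSYNC_LINES);
    li.src_row = (rel >= 0 && rel < SRC_ROWS * V_SCALE) ? rel / V_SCALE : -1;

    *line = (*line + 1) % TOTAL_LINES;
    return li;
}
```

In a real ISR the two results would turn into a pin write for vertical sync and a DMA source address for each colour plane. The integer division by `V_SCALE` is what makes each frame buffer row repeat on three consecutive raster lines.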

All this is great, except that the horizontal back porch (i.e., the quiet time after hsync ends, and before pixel data begins) is really being generated by that initial 32-bit '0' being loaded into the SPI. 32 bits @ 20 MHz is the 1.6 us I am seeing for the current back porch. I need another 600 ns, which is 30 pixels. The problem is, this is not a constant I can change easily with the current code -- you can't preload two words into the SPI. I need to do some surgery.
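The link between the preloaded word and the measured back porch is plain shift-register arithmetic. A minimal sketch, with illustrative function names and integer nanoseconds to avoid floating point:

```c
/* Time, in nanoseconds, to shift out `bits` bits at `spi_clock_mhz` MHz. */
long shift_time_ns(long bits, long spi_clock_mhz)
{
    return bits * 1000 / spi_clock_mhz;
}

/* How far short of the spec back porch a preload of `bits` zero bits
 * falls, in nanoseconds (negative would mean it overshoots). */
long back_porch_shortfall_ns(long spec_ns, long bits, long spi_clock_mhz)
{
    return spec_ns - shift_time_ns(bits, spi_clock_mhz);
}
```

With a 32-bit preload at 20 MHz the back porch is 1600 ns, which is 600 ns short of the 2200 ns the 800x600/60 Hz spec calls for.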

The first attempt was simply to pad the frame buffer with an extra word (32 bits is close enough to the 30 bits I technically need), always keep that word 0, and adjust all the frame buffer routines to account for it. This took a fair amount of hacquery, but I got it in, and made the number of words a #define so I can tweak it with ease. It works fine now, and my cheap monitor locks onto the signal and I can now see the full TRS-80 screen. But I really dislike this solution because I am wasting precious RAM. At one word per raster line per colour plane, I am wasting 2.3 KiB! Yikes! But it is a baseline.
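The RAM cost of the padding is easy to tally. In this sketch the macro names are hypothetical (the text only says the pad count is a #define), and one bit per pixel per colour plane is assumed, which gives 384 / 32 = 12 data words per row.

```c
#include <stddef.h>
#include <stdint.h>

#define FB_ROWS       192  /* TRS-80 rows, from the text                  */
#define FB_PLANES     3    /* one plane per SPI channel, from the text    */
#define FB_PIXELS     384  /* pixels per row, from the text               */
#define FB_PAD_WORDS  1    /* hypothetical name for the tweakable #define */

/* 32-bit words per row per plane, assuming 1 bit per pixel per plane. */
size_t words_per_row(void)
{
    return FB_PAD_WORDS + FB_PIXELS / 32;
}

/* Bytes spent purely on the always-zero padding words. */
size_t padding_bytes(void)
{
    return (size_t)FB_ROWS * FB_PLANES * FB_PAD_WORDS * sizeof(uint32_t);
}

/* Total frame buffer size in bytes, padding included. */
size_t framebuffer_bytes(void)
{
    return (size_t)FB_ROWS * FB_PLANES * words_per_row() * sizeof(uint32_t);
}
```

One pad word costs 192 x 3 x 4 = 2304 bytes, a bit under 2.3 KiB, on top of the 27648 bytes of actual pixel data under these assumptions.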

For the second attempt, I am using an additional Output Compare (I had two spares) configured just like the horz sync one, but with a larger compare value (so it triggers at the end of the back porch). This works great and I get my RAM back! However, there is a problem: the frame sync must come in from a physical pin (it is not internally routed in silicon), and the frame sync pins are already wired to the Output Compare for horz sync, and not to the new one (D4). Sigh. I don't mind cutting traces on my board, but I doubt anyone else will want this to be a requirement to play with the software on an off-the-shelf board.

So, alas, for now I am stuck wasting the 2.3 KiB. I guess it's not the end of the world, since I have it there to waste, but I'm not particularly happy about it. But, if I ever decide to make my own board for this project, I will then have the opportunity to take this better approach now that I've figured it out.
