Just to spend some time hacking on something I've never done before, I started hacking around a bit to test how stable a PAL TV-out would be on a PIC without a crystal.
I decided to make a low-res 64x32 pixel display since the PIC doesn't have a lot of RAM. At 1 bit per pixel this only requires 256 bytes.
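To make the 256-byte figure concrete, here is a minimal sketch of a 1 bit-per-pixel frame buffer. It is Python rather than PIC assembly, and the names (`set_pixel`, `get_pixel`) and the MSB-first bit order are assumptions made for illustration only.

```python
WIDTH, HEIGHT = 64, 32
BYTES_PER_ROW = WIDTH // 8                      # 8 bytes hold one 64-pixel row

# 8 bytes per row * 32 rows = 256 bytes in total
framebuffer = bytearray(BYTES_PER_ROW * HEIGHT)


def set_pixel(fb, x, y, on=True):
    """Set or clear one pixel (assumed layout: leftmost pixel in bit 7)."""
    index = y * BYTES_PER_ROW + x // 8
    mask = 0x80 >> (x % 8)
    if on:
        fb[index] |= mask
    else:
        fb[index] &= ~mask & 0xFF


def get_pixel(fb, x, y):
    """Return True if the pixel at (x, y) is lit."""
    index = y * BYTES_PER_ROW + x // 8
    return bool(fb[index] & (0x80 >> (x % 8)))


set_pixel(framebuffer, 0, 0)       # top-left pixel
set_pixel(framebuffer, 63, 31)     # bottom-right pixel
print(len(framebuffer), hex(framebuffer[0]), hex(framebuffer[255]))
```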
The internal RAM in PIC devices is usually split into a number of separate banks of 80 bytes each. But in some of the newer devices all the separate banks can be accessed through the FSR registers as an index into a special linear memory region. The PIC16F1xxx series is one of the newer families that has this nice feature.
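As a rough illustration of how the linear region lines up with the banks, the sketch below converts a linear address into a bank number and an in-bank address. The constants reflect my understanding of the enhanced mid-range layout (linear region starting at 0x2000, 80 general purpose bytes at 0x20-0x6F in each bank); check them against the datasheet for the exact device. The function name is made up for this example.

```python
LINEAR_BASE = 0x2000     # assumed start of the linear data memory region
BANK_GPR_START = 0x20    # assumed first general purpose byte within a bank
BANK_GPR_SIZE = 80       # 80 bytes of general purpose RAM per bank


def linear_to_banked(linear_addr):
    """Map a linear address to (bank number, address inside that bank)."""
    offset = linear_addr - LINEAR_BASE
    if offset < 0:
        raise ValueError("address is below the linear region")
    bank, index = divmod(offset, BANK_GPR_SIZE)
    return bank, BANK_GPR_START + index


# A 256-byte buffer placed at the start of the linear region spans
# banks 0..3 without the code ever having to switch banks.
first = linear_to_banked(LINEAR_BASE)
last = linear_to_banked(LINEAR_BASE + 255)
print(first, last)
```

With this layout the video routine can walk the whole buffer just by incrementing the FSR, which is what makes the feature so convenient here.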
With its very small instruction set the PIC is relatively easy to learn, but it has so many quirks and strange limitations that it is much harder to code in assembly for it than for its main competitor - the AVRs from Atmel.
I spent some hours coding up the Chip-8 interpreter and got most of it implemented very easily.
But I still have to take care of the problem with the Chip-8 applications running from flash, which prevents the applications from using unused parts of the 4K area as extra variable storage. Basically I just have to implement a lookup table pointing to the PIC's SRAM area and use that whenever a read/write to the Chip-8 area is detected. This is probably a few hours of coding and debugging - but that's something one has to accept when selecting the "wrong" hardware for a project. :-)
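One way such a lookup table could work is sketched below in Python. Everything here is hypothetical: the 64-byte page size, the class and method names, and the policy of copying a page from flash into RAM when it is first mapped are assumptions for illustration, not the firmware's actual design.

```python
PAGE_SIZE = 64       # hypothetical page granularity
MEM_SIZE = 4096      # Chip-8 address space


class OverlayMemory:
    """Read-only 'flash' image with a few RAM-backed, writable pages."""

    def __init__(self, flash_image, ram_pages):
        self.flash = bytes(flash_image).ljust(MEM_SIZE, b"\x00")
        self.ram = bytearray(ram_pages * PAGE_SIZE)
        self.used_pages = 0
        # Lookup table: Chip-8 page number -> offset into self.ram, or None
        self.table = [None] * (MEM_SIZE // PAGE_SIZE)

    def map_page(self, addr):
        """Back the page containing addr with RAM, seeded from flash."""
        page = addr // PAGE_SIZE
        if self.table[page] is not None:
            return
        if (self.used_pages + 1) * PAGE_SIZE > len(self.ram):
            raise MemoryError("no RAM pages left")
        base = self.used_pages * PAGE_SIZE
        start = page * PAGE_SIZE
        self.ram[base:base + PAGE_SIZE] = self.flash[start:start + PAGE_SIZE]
        self.table[page] = base
        self.used_pages += 1

    def read(self, addr):
        base = self.table[addr // PAGE_SIZE]
        if base is None:
            return self.flash[addr]
        return self.ram[base + addr % PAGE_SIZE]

    def write(self, addr, value):
        """Return True if the write landed in RAM, False if it hit flash."""
        base = self.table[addr // PAGE_SIZE]
        if base is None:
            return False
        self.ram[base + addr % PAGE_SIZE] = value & 0xFF
        return True


mem = OverlayMemory(b"\x12\x34", ram_pages=8)   # 8 pages = 512 bytes of RAM
mem.map_page(0x300)
mem.write(0x300, 7)
print(mem.read(0), mem.read(0x300))
```

Which pages to map is the open question; that depends on which addresses each game actually writes to.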
The PIC16 I'm using here only has 1K of RAM, of which 256 bytes are used for the video buffer, and the Chip-8 spec calls for 4K of memory. So my Chip-8 games have to live in flash instead, but many of the old games use some of that memory to store data, which won't work too well when it is flash. ;-)
So I think I'll take this golang project and hack it to show which areas of memory the games use for read/write data storage, so I can "overlay" my remaining 0.5K of RAM over the flash at a suitable location in the memory space and at least get some of the old games working....
Ok... Spent an hour or two rewriting most of the code today to use interrupts at 64us intervals instead of precisely timed loops.
The code ended up about the same size but is now easier to maintain, and it makes it possible to run application software in the foreground without disturbing the video, which is handled in the background by the interrupts.
About 2/3 of the CPU time is spent inside the interrupt handler doing video & sync output. The remaining 1/3 is available for the foreground application/game. That should be plenty for a Chip-8 interpreter to run at full original speed.
The output is still a little bit glitchy since I haven't yet been able to de-jitter the interrupt handler, but for the time being it just adds to the 1970's retro feeling :-)
In the first log entry I showed a checkerboard image on the screen where the last line looked like it was out of sync. I just realized that the syncs are just fine (even if the screen is not rock stable - but what can be expected with an internal mcu oscillator PLL'ed up by 4?).
It's actually the top line (no. 1 of 7) of the first pixel row that is shown as the last line. The lines are all offset by one.
I realized that I use the idle time in the sync pulse interval in the subroutine that outputs the lines with video data to increment the 1-of-7 counter and update the pixel counter. This is done *before* the video is output, meaning that I've skipped the very first line and it turns up as the last line after a premature wrapping of the counter.
Easy fix: at boot, initialize the 1-of-7 counter with 0xFF instead of 0x00 - then it will all line up nicely :-)
Actually I'm using a down-counter decrementing from 7 to 0 with a "decfsz" instruction, so the initial value should be incremented by one to get it all adjusted correctly.
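The off-by-one is easy to reproduce in a small model. The sketch below is Python rather than PIC assembly, and the reload-and-advance logic is an assumption about how a decfsz-style counter that is updated before the video output could behave, not the actual firmware.

```python
LINES_PER_ROW = 7
ROWS = 32
VISIBLE_LINES = LINES_PER_ROW * ROWS      # 224 scan lines carry pixel data


def rows_shown(initial_counter):
    """Return the pixel row displayed on each visible scan line."""
    counter = initial_counter
    row = 0
    shown = []
    for _ in range(VISIBLE_LINES):
        # Counter update happens in the sync interval, BEFORE the video
        counter -= 1                      # models the decrement of decfsz
        if counter == 0:                  # counter expired: next pixel row
            counter = LINES_PER_ROW
            row = (row + 1) % ROWS
        shown.append(row)
    return shown


buggy = rows_shown(LINES_PER_ROW)         # counter starts at 7
fixed = rows_shown(LINES_PER_ROW + 1)     # counter starts one higher

print(buggy[:8], buggy[-1])
print(fixed[:8], fixed[-1])
```

In this model the buggy start value shows row 0 for only six lines at the top and once more on the very last line, while the corrected start value gives every row exactly seven lines.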
The PIC runs from an internal 32MHz clock (actually an internal 8MHz oscillator multiplied by 4 by an internal PLL). A PIC instruction cycle takes four clock cycles, which gives a base 8MHz / 125ns instruction rate.
In non-interlaced mode the PAL TV standard has 312 lines of 64us each. This adds up to a frame time of 19.968ms, or a frame rate of 50.08Hz.
So each line is 64us long, and at a 125ns (0.125us) instruction rate I can (or actually rather "have to") use 512 instruction cycles for each line. That is not the same as 512 instructions, since some instructions (most notably gotos, calls and conditional skips/branches) take two instruction cycles.
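The timing numbers above can be checked with a few lines of arithmetic. This is just a Python sketch of the calculation; the variable names are made up for the example.

```python
F_OSC_HZ = 32_000_000          # 8MHz internal oscillator * 4 (PLL)
CLOCKS_PER_CYCLE = 4           # one PIC instruction cycle = 4 clock cycles
LINE_US = 64                   # length of one PAL line
LINES_PER_FRAME = 312          # non-interlaced PAL

cycle_rate_hz = F_OSC_HZ // CLOCKS_PER_CYCLE            # instruction cycles/s
cycle_ns = 1_000_000_000 // cycle_rate_hz               # length of one cycle
cycles_per_line = LINE_US * cycle_rate_hz // 1_000_000  # budget per line
frame_us = LINE_US * LINES_PER_FRAME                    # one full frame
frame_rate_hz = 1_000_000 / frame_us

print(cycle_rate_hz, cycle_ns, cycles_per_line)
print(frame_us / 1000, round(frame_rate_hz, 2))
```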
I'm outputting a composite video signal where the two sync signals (vertical and horizontal) and the video/pixel data are mixed together into one signal.
The horizontal/line sync is normally a 4us long low pulse that is sent at the very beginning of each of the 312 lines. But at the beginning of each frame there is a several hundred microseconds long vertical sync pulse as well. During this time we still need to regularly send the horizontal/line sync pulses but the output is already held low by the vertical sync, so during that period they have to be sent as short high "interruptions" instead.
Apparently there is some trickery needed to coax the TV into behaving nicely in non-interlaced mode. The TV is most happy with the interlaced mode where the odd lines are sent in the first field and the even lines are sent in the next, but this is a pain in the neck to do from a lowly microcontroller as it requires the last line to be split into two halves and some other complications that I'd rather live without. But by sending some extra short and extra super long sync pulses during the frame sync (beginning of each full frame) the TV seems happy enough.
My code is currently timed by keeping track of the number of instructions and their execution time (1 or 2 cycles). I'm not using interrupts to control the timing; that is the next step now that I've got the basic functionality tested and working. I've made a number of subroutines (aka "functions" for those of you in the high-level language crowd) that each take care of a specific type of line. Each one takes exactly 507 cycles, to allow for some extra code like the loops to be added between the calls. If I don't need to do anything between two calls I simply add a 5 cycle delay there to pad the time up to a perfect 512.
A few of them handle the special vertical syncs. One does the regular video lines but outputs no data, just to get some margins at the top and bottom of the screen. And one actually outputs video data by shifting out 8 bytes one by one on a GPIO pin, resulting in 64 pixels of X-resolution.
The same 8 bytes of data are used on 7 consecutive lines to build up roughly square pixels. Then the next set of 8 bytes is used for the following 7 lines.
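The mapping from scan line to pixel data can be sketched like this. It is a Python model, not the bit-banging routine itself; the function name, the MSB-first shift order and the test pattern (rows alternating between 0x55 and 0xAA) are assumptions for the example.

```python
BYTES_PER_ROW = 8      # 8 bytes * 8 bits = 64 pixels per line
LINES_PER_ROW = 7      # each pixel row is repeated on 7 scan lines
ROWS = 32


def scanline_pixels(fb, line):
    """Return the 64 pixel values (0/1) shifted out on a visible scan line."""
    row = line // LINES_PER_ROW                  # 7 lines share one row
    data = fb[row * BYTES_PER_ROW:(row + 1) * BYTES_PER_ROW]
    pixels = []
    for byte in data:
        for bit in range(7, -1, -1):             # assumed MSB-first shifting
            pixels.append((byte >> bit) & 1)
    return pixels


# Test pattern: even rows filled with 0x55, odd rows with 0xAA
pattern = bytearray(
    (0x55 if row % 2 == 0 else 0xAA)
    for row in range(ROWS)
    for _ in range(BYTES_PER_ROW)
)

print(scanline_pixels(pattern, 0)[:8])
print(scanline_pixels(pattern, 7)[:8])
```

Lines 0 to 6 produce identical output, and line 7 is the first one to pick up the next 8 bytes.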
I soldered up a 4x4 matrix keyboard together with the PIC16F1705 and two resistors for mixing the monochrome video and the syncs.
After spending 4-5 hours coding up the basic code for outputting stable syncs and bit-banged video with the pixels coming from RAM (not hardcoded), I was amazed that my TV actually showed a fairly stable image on the first try.
I must admit that I used the simulator in MPLAB X to check the timings - but still... First try!
This is what the 64x32 pixels with a 0x55/0xAA grid look like.
There seems to be some sync issue in the last row....