As the video signal and the corresponding sync signals are generated by software, the console contains a minimum of hardware. There is also an audio signal output with five binary tone channels, mixed by a passive resistor network. Two of those channels are used for sound effects, similar to ones used in video games of that time (early eighties) and three for background music. This output is capable of driving line output for PC speakers or headphones.
It should be noted that there is no video processing unit, PGA or any special purpose chips, and that PIC microcontrollers are not designed for video signal generation. Everything is achieved by a series of design tricks and some compromises. As the game does not run on a PC but on a stand-alone unit, screenshots are taken by camera from a VGA monitor or directly from Photoshop, which was used in the bitmap creation process.
Video and audio generators, which are the vital parts of the firmware, are the parts of the operating system, which will soon be documented, and can be used for any other game or application. As the timings are critical, those parts are written in assembly language, but all the other parts of the program (scenario for some other games or any other application) may also be written in some other programming language, preferably Microchip's C. In this case all parts are written in Assembler, but only as a result of author's preference.
At the moment, only the game Jumping Jack is written for the platform, well known to those who played with the Spectrum personal computer back in the day. However, once a new game is created, it is easy to download it from the computer via the serial port (which is not visible in the photo). The console has a USB connector, but it is used only for the 5V power supply. Unfortunately, microcontrollers which are packed in DIP packages (with thru-hole soldering, convenient for DIY projects and workshops) do not have a USB interface but only serial ports, so you have to use RS 232 if you want to download a new game.
This project is based on the PIC24EP512GP202 microcontroller, which is a 16-bit Microchip MCU with 512K of internal flash program memory and 48K of internal data memory, packed in a standard 28-pin Shrink-DIP case. This is the schematic diagram:
The whole unit uses +5V power supply (2.1 mm coaxial CON1A or mini USB CON1B connector, but take care not to use both!). Measured current consumption is 77 mA. LDO regulator MCP1702-33 is used for +3.3V supply for MCU.
Instead of a quartz crystal, you can use a ceramic resonator for the oscillator, but some resonators will produce significant frequency jitter, which is visible as horizontal pixel instability on the screen. It is also possible to use the internal FRC oscillator with PLL, but, of course, the jitter will be even worse.
R, G and B video hardware drivers are realized by simple passive attenuators (R1R2, R4R5 and R7R8) and NPN transistors (as emitter followers) T1-T3. There are no special requirements for those transistors, but low current types (≤ 100 mA) are recommended, as high current ones typically have lower bandwidth. Intensity control (for "dark" colours and gray) uses standard silicon diodes D1-D4. Do not use Schottky diodes, as their forward voltage drop is too low for proper dark colour representation in the analog video signal.
5-channel audio mixer uses simple passive resistor network. Capacitor C9 is used for passive, first order low-pass RC filter. Pull-up resistor R14 is used in two-step volume level control circuit. As all outputs are pure binary (on-off), low volume level is managed by software, when the output port pins are held in open-drain state (selectable in ODCA and ODCB registers), as in that case resistors R14-R20 create additional voltage attenuator, for each output individually.
Video processors are not typically embedded in microcontrollers, so a gaming console would normally use an external video display unit. As this is a minimalistic project, the VGA video signal is generated by software, based on an interrupt-driven kernel.
The routine which generates the VGA signal is part of the T2 (Timer 2 module) interrupt service routine. This routine also services the vertical sync pulse, the markers for monitor auto adjustment and the bottom line text routine. In this version, no other interrupts are active, but the user can add other interrupt sources, as long as they have a lower priority level.
VGA timings for 800x600 resolution at 56 Hz refresh rate are shown on the drawing. Here are the detailed timing data:
Pixel clock: 36 MHz (27.78 ns)
Horizontal frequency/period: 35.16 kHz (28.44 us)
Visible area: 800 pixels (22.22 us)
Front porch: 24 pixels (0.67 us)
Sync pulse: 72 pixels (2 us)
Back porch: 128 pixels (3.56 us)
Vertical frequency/period: 56 Hz (17.86 ms)
Visible area: 600 lines (17.067 ms)
Front porch: 1 line (28.44 us)
Sync pulse: 2 lines (56.89 us)
Back porch: 22 lines (625.78 us)
Whole frame: 625 lines (17.78 ms)
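The figures in this list can be cross-checked from the pixel clock alone. The following Python sketch is only a host-side calculation, not part of the firmware; the constant and function names are invented for illustration.

```python
# Derive the line and frame timings from the 36 MHz pixel clock.
# All names here are illustrative; only the numbers come from the list above.
PIXEL_CLOCK_HZ = 36_000_000

H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 800, 24, 72, 128   # pixels
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 600, 1, 2, 22      # lines


def line_timing():
    """Return (pixels per line, line period in microseconds)."""
    total_pixels = H_VISIBLE + H_FRONT + H_SYNC + H_BACK
    line_period_us = total_pixels / PIXEL_CLOCK_HZ * 1e6
    return total_pixels, line_period_us


def frame_timing():
    """Return (lines per frame, frame period in ms, refresh rate in Hz)."""
    total_lines = V_VISIBLE + V_FRONT + V_SYNC + V_BACK
    _, line_period_us = line_timing()
    frame_period_ms = total_lines * line_period_us / 1000
    return total_lines, frame_period_ms, 1000 / frame_period_ms


if __name__ == "__main__":
    print(line_timing())
    print(frame_timing())
```

With these inputs a line is 1024 pixel clocks long and a frame is 625 lines, which gives a frame period of about 17.78 ms (56.25 Hz). The 17.86 ms value quoted for the vertical period corresponds to the nominal 56 Hz.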
Dot clock for 800x600 resolution @ 56 Hz vertical frequency is exactly 36 MHz, and the maximum execution speed for the PIC24E family is 70 MIPS. So the MCU has to be slightly overclocked to 72 MHz to get the desired instruction/pixel clock rate. This overclocking is only 2.8%, which is negligible and will not noticeably affect operational safety or thermal dissipation.
As noted, each pixel takes up a 2x2 pixel area of the screen, so the actual dot clock is not 36 but 18 MHz. That gives the processor enough time to execute four instructions per pixel. In addition, every scan line is displayed twice, so there is even more time for buffer setup during horizontal sync and porches.
Video memory is located in the internal 48 KB RAM, where it occupies 45600 bytes. All video signal timings match the VGA standard in 800x600 mode, but, due to RAM limitations, the actual displayed resolution is only 380x240, and it is displayed on a 760x480 pixel area of the original screen.
To use the whole 800x600 display area in 8-bit pixel mode, we need 800x600=480,000 bytes of memory, but in the best case, all that PIC microcontrollers offer at this time is only 48K (49,152 bytes), which is far from what we need. There are some 16-bit PICs with 96K RAM, but they are too slow for video signal generation, and some 52K PICs, but they are in SMD 64-pin packages with 0.5 mm pitch, which is quite inconvenient for DIY projects. Although it is possible to add external RAM, it is of no practical use, as access to the external RAM would be too slow for VGA signal generation. So we have to do it with 48K RAM MCUs somehow.
To do that, we have to make some compromises:
1. The colour of each pixel is defined by four bits only, so it works in 16-colour mode. To make things worse, only 15 colours are used, as one of them (binary represented as 0000) does not mean "black" but "transparent", which is used in sprite handling (black is 1000). More about that later.
2. Each pixel from the video memory is displayed on 4-pixel (2x2) area of the VGA screen.
3. Actual displayed resolution is 380x240, which occupies 760x480 screen pixels. The 20 pixel wide margins on the top, left and right sides of the monitor are not used and are left black. At the bottom there is one line (39 characters) of text. It needs no frame buffer, as the routine interprets text directly from the text buffer in RAM.
This organization gives 380x240=91,200 graphical pixels, but as each pixel is covered by 4 bits, the video memory needs only 91,200/2=45,600 bytes of memory. Bottom line text needs no video buffer and it occupies only 78 bytes (39 for text and 39 for colour attributes). So there are 49,152-45,678=3,474 free bytes, which is quite enough for housekeeping (internal buffers and general purpose registers).
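The arithmetic above is easy to verify. This is a minimal Python sketch that recomputes the RAM budget from the stated resolution, colour depth and text line length; the names are made up for the example.

```python
# Recompute the RAM budget described above (names are illustrative only).
RAM_BYTES = 48 * 1024          # 49,152 bytes of internal data memory
WIDTH, HEIGHT = 380, 240       # displayed resolution
BITS_PER_PIXEL = 4             # 16-colour mode
TEXT_COLUMNS = 39              # bottom text line


def memory_budget():
    """Return (frame buffer bytes, text bytes, free bytes)."""
    frame_bytes = WIDTH * HEIGHT * BITS_PER_PIXEL // 8
    text_bytes = TEXT_COLUMNS * 2   # one character byte plus one attribute byte
    free_bytes = RAM_BYTES - frame_bytes - text_bytes
    return frame_bytes, text_bytes, free_bytes


if __name__ == "__main__":
    print(memory_budget())
```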
With the processing power of 72 MIPS, it would be easy to generate the video signal by software, if the only requirement were to show the contents of video memory. As there is no video processing unit here and the MCU has to handle one pixel at a time, such a concept would be useful for static images or very small movable blocks of pixels, but not for a game, which requires real time processing of large memory blocks. Even worse, the CPU is busy generating the video signal more than 2/3 of the time, which leaves only 1/3 for housekeeping and active memory handling.
The solution to this problem is to use sprites, which are 2D images located outside the video RAM and superimposed on the main scene. Video units in some of the first personal computers could handle sprites in hardware, but in this project it is realized in software. The sprites are in the internal program memory of the MCU and they are combined with the video RAM contents to generate the full video signal. That means that there is no way to manipulate the sprite contents; a sprite can only be displayed at the desired location on the screen. As most of the characters in this game are animated, there is a large number of sprites, and each of them represents one frame of that character in the animation. Luckily, there is enough space in program memory for the program, the background image and the sprite frames. In this case, more than 95% of the 512 Kbyte program memory space is used.
Here is an example of Jack's jump. Note that the absolute X and Y position on the screen is permanently adjusted during the jump, as well as the order of slides in the jump sequence (which is listed in the script table in the firmware), so it gives much more freedom in creating the mise en scène for the game - this jump is, in reality, much higher and lasts longer than it may look while just watching those slides. Some of those slides are repeated, some are used randomly, and most of them are dynamically relocated on the screen. So there is no need to draw identical slides again, as each of them can be called repeatedly in the script table. In this example, the last five slides are repeated only because of the hair "splash", otherwise slides 11, 12, 13, 14 and 15 could be omitted and listed as 9, 8, 6, 4 and 3 in the script. The same slides are used for the jump up and for the jump down to the lower floor, but with different script tables.
All that software has to do while servicing the video scenario, is to preset the special sprite registers, determining X and Y positions (relative to the left and upper border of the active portion of the screen), width and height of one slide image, and address of the current slide in program memory. Video firmware, located in the interrupt routine, will superimpose that sprite in the content of the background video memory during RGB signal generation.
One more thing to note is that the orange colour in sprites means "transparent" (there is no orange colour in the game palette, only in the palette of the PC drawing program during the sprite design process). Each orange pixel of the sprite will be displayed from the video memory, which will typically hold the background image. Yet, there is one drawback to this principle. If two or more sprites overlap, then the transparent (orange) pixels of the first of them (the one which holds the highest priority, that is, which is located higher in the special sprite table in RAM) will partly cover the lower sprite, displaying the background instead of the lower sprite's active pixels. The first (simulated) screenshot shows that example.
There is a way to solve this problem, but, due to execution time limitations, only for a limited number of sprites. Some sprites can be treated as "special", and they do not have that drawback (see the second screenshot). The only problem with those sprites is that they require 18 times more time for the video routine to execute, so the programmer has to take care not to use this option if it is not necessary, as it could result in losing scan lines on the screen.
How do we tell the video routine which sprite is special and which is not? The sprite list (located in RAM and named SPRITELIST) holds pointers for active sprites. The video routine can place (or remove) any sprite in that list at any time, and at any table position which is not currently occupied. This table can hold a maximum of 20 sprites at the same time. Only four sprites (numbers 17, 18, 19 and 20) are treated as "special" ones - they are executed much slower, but they do not generate the described problem in an overlapping conflict, or at least it is minimized so that it is not noticeable. In this game, only one sprite (Jack himself) has that privilege, as the game scenario is such that the other sprites never overlap.
Theory of operation
The most significant part of the video routine uses SPRITEBUFFER, which is 190 bytes long (equal to one scan line in video memory), and in which the video routine prepares the sprite contents for the current line, before it merges them with the background image and outputs that scan line. So the video memory has two layers: the lower layer is the large video memory itself, which mainly contains the background picture, and the upper layer, which is only one scan line large and which contains the sprite pixels for that line. Those pixels are copied from the sprite tables located in program memory, before the video routine starts outputting data. So, this layer is dynamically changed for each scan line (more specifically, each two equal scan lines) during the horizontal sync and the back and front horizontal porches.
Here is how the video routine outputs RGB video signal to the port pins B8, B10, B12 and B14 (Red, Green, Blue and Intensity, respectively). Four instructions (total of 55.55 ns) are used for single pixel, and this part of program (repeated 190 times) outputs 380 pixels. Odd pixels (1, 3, 5...) are generated when bits #0, #2, #4 and #6 from the corresponding byte of SPRITEBUFFER are copied to port pins B8, B10, B12 and B14 (red listing), and even pixels (2, 4, 6...) are generated the same way, except they are rotated, so that bits #1, #3, #5 and #7 are copied to the same pins (blue listing). W13 register already points to the high byte of LATB register (not shown on the listing), w3 register points to the start of SPRITEBUFFER minus 1, and w12 register contains offset from SPRITEBUFFER to the main background (video memory) buffer (it should be correctly calculated before each scan line execution). W7 and w14 are simple masks used for odd/even pixels separation.
If you have to redesign the hardware of this project, you must know that the remaining bits of the high byte of LATB (#9, #11, #13 and #15) can not be used for simple output functions, as they will be corrupted in this routine (this does not apply to remappable pin functions, as they are not altered by writing to LATB). As you can see, each 4-instruction part (both blue and red) first fetches a single byte from SPRITEBUFFER and tests it for zero at the same time. If it is zero (if the pixel in the sprite contains the "transparent colour"), it fetches the pixel content from the video memory. Finally, the pixel (whether it is from the sprite or the background) is output to the port. Here follows the vital part of the video routine:
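The per-byte logic described above can also be modelled on a PC. The Python sketch below is not the firmware's assembly and makes several assumptions: the odd/even masks are taken to be 0x55 and 0xAA, the "rotation" of even pixels is modelled as a one-bit right shift, and the function names are invented for the example.

```python
# Host-side model of the sprite/background merge for one scan line.
# Assumptions (not confirmed by the original listing): mask values and
# the shift used for even pixels; all names are illustrative.
ODD_MASK = 0x55    # bits 0, 2, 4, 6: first pixel of the byte
EVEN_MASK = 0xAA   # bits 1, 3, 5, 7: second pixel of the byte


def merge_byte(sprite_byte, background_byte):
    """Return the two values written to the port for one buffer byte."""
    odd = sprite_byte & ODD_MASK
    if odd == 0:                       # transparent: take the background
        odd = background_byte & ODD_MASK
    even = sprite_byte & EVEN_MASK
    if even == 0:                      # transparent: take the background
        even = background_byte & EVEN_MASK
    return odd, even >> 1              # align even pixel with bits 0, 2, 4, 6


def pins(value):
    """Decode a port value into (red, green, blue, intensity) pin levels."""
    return tuple((value >> bit) & 1 for bit in (0, 2, 4, 6))


def render_line(sprite_line, background_line):
    """Merge one 190-byte sprite line with one background line."""
    pixels = []
    for sprite_byte, background_byte in zip(sprite_line, background_line):
        odd, even = merge_byte(sprite_byte, background_byte)
        pixels.append(pins(odd))
        pixels.append(pins(even))
    return pixels
```

In this model a sprite byte of 0x40 with a background byte of 0xFF gives a black sprite pixel (intensity bit only) followed by a white background pixel, because the second pixel of the sprite byte is transparent.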
Of course, SPRITEBUFFER must be properly loaded with sprite pixels before current scan line starts displaying in T2 interrupt. This can be done only during horizontal sync and back and front horizontal porches, and it leaves 6.23 us (about 448 instructions) which can be used for SPRITEBUFFER preparation. In reality, some of those instructions will be spent on register presets and w12 (offset) calculation, horizontal sync synchronization and SPRITEBUFFER clearing at the beginning, so in the best case we can count on about 300 instructions. This is surely not enough time to test 20 possible sprites, to check if they exist in the current scan line, calculate position inside sprite lookup table and to move their contents from program memory to the SPRITEBUFFER. Most of the time will be spent on the last item, reading program memory and moving its contents to the SPRITEBUFFER. To make it worse, reading from program memory takes 5 instruction cycles for each word, but, luckily, if you use PSV (Program Space Visibility) mode, only the first word transfer will take 5 instruction cycles, and the others only one. This is, of course, used in this project, otherwise it would not be possible.
Unfortunately, this is valid only if you move 16-bit words in PSV mode (e.g. mov [w3++],[w4++]), but if you use the same technique in byte mode (e.g. mov.b [w3++],[w4++]) you still need 5 instruction cycles for every byte (this is not documented in Microchip's manuals, so I had to learn it the hard way). The consequence of this PIC24E drawback is that it is not possible to move a single byte (2 pixels) of video content, but only word by word, which is 4 pixels. So the X pointer for each sprite should point to 0, 4, 8, 12, 16, 20... and not to locations which are not divisible by 4. This gives the programmer more headaches, even during slide design in sprite animation.
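To get a feeling for these numbers, here is a rough Python estimate of the blanking-time budget and of the read cost for one sprite row. It only counts the cycle figures quoted in the text (5 cycles for the first PSV word, 1 for each following word, 5 for every byte in byte mode) and ignores loop overhead, pointer setup and the comparison work done for "special" sprites. The function names are hypothetical.

```python
# Rough cycle estimates based only on the figures quoted in the text.
# Loop overhead and pointer setup are deliberately ignored.
INSTRUCTION_RATE_HZ = 72_000_000
PIXEL_CLOCK_HZ = 36_000_000
BLANKING_PIXELS = 24 + 72 + 128   # front porch + sync + back porch


def blanking_cycles():
    """Instruction cycles available during horizontal blanking."""
    return BLANKING_PIXELS * INSTRUCTION_RATE_HZ // PIXEL_CLOCK_HZ


def word_copy_cycles(width_pixels):
    """Read cost of one sprite row in 16-bit PSV mode (4 pixels per word)."""
    if width_pixels <= 0 or width_pixels % 4:
        raise ValueError("width must be a positive multiple of 4 pixels")
    words = width_pixels // 4
    return 5 + (words - 1)


def byte_copy_cycles(width_pixels):
    """Read cost of one sprite row in byte mode (2 pixels per byte)."""
    if width_pixels <= 0 or width_pixels % 2:
        raise ValueError("width must be a positive multiple of 2 pixels")
    return 5 * (width_pixels // 2)


def x_is_aligned(x):
    """True if a sprite X position falls on a 4-pixel (one word) boundary."""
    return x % 4 == 0
```

For a 32-pixel wide sprite row this estimate gives 12 cycles in word mode against 80 in byte mode, which illustrates why the word-aligned X restriction is accepted.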
What is so special about the last four sprites in the table, so that they can correctly cover another lower priority sprite? They do not use the fast (and "blind") PSV mode, but a slow byte-by-byte comparison and transfer. This takes 18 times more time to handle one sprite, so it should be used with special attention, and for sprites which are not too wide (height does not matter). There is still one possible pixel of "error" in overlapping sprites, when the area between overlapped sprites could contain a single transparent pixel, but this is unnoticeable on the screen.
As noted, there is not enough time to handle all sprites before each scan line. Luckily, there are two equal scan lines for every video line, so if we use both of them, we have twice as much time. The only problem is that there is no way to start preparing the SPRITEBUFFER before it is completely displayed in the second scan line. That is why, instead of SPRITEBUFFER, there are two independent sprite buffers - SPRITEBUF1 and SPRITEBUF2. While the video routine displays the contents of the first one, the second one is prepared, and vice versa. That small pipeline is not as confusing as it seems, and it was the last trick which made the project possible.
So there are four basic steps, each of them executed before the scan line is output to the port:
1. Test for every sprite in SPRITELIST and calculate pointers for the sprites which are present in scan line N+2 (and N+3), then load COPYLIST table with those pointers... then generate scan line N, using SPRITEBUF1
2. Use the COPYLIST to transfer pixel data from program memory to SPRITEBUF2... then generate the equal scan line N+1, using SPRITEBUF1
3. Test the sprites in SPRITELIST and calculate pointers if sprites are present in scan line N+4 (and N+5), then load COPYLIST table with those pointers... then generate the new scan line N+2, using SPRITEBUF2
4. Use the COPYLIST to transfer pixel data from program memory to SPRITEBUF1... then generate the equal scan line N+3, using SPRITEBUF2
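The four steps above repeat with the two buffers swapping roles. The following Python sketch only prints that schedule so the alternation is easier to follow; it does not touch any sprite data, it assumes the first line starts a new pair of equal scan lines, and the function name is made up for the example.

```python
# Print the two-buffer schedule described in the four steps above.
# Assumption: first_line is the first scan line of a pair of equal lines.
BUFFERS = ("SPRITEBUF1", "SPRITEBUF2")


def pipeline_steps(first_line, count):
    """Return (preparation work, scan line output, buffer displayed) tuples."""
    steps = []
    for i in range(count):
        line = first_line + i
        pair = i // 2                        # each video line is shown twice
        shown = BUFFERS[pair % 2]
        prepared = BUFFERS[(pair + 1) % 2]   # the buffer not being displayed
        if i % 2 == 0:
            work = f"scan SPRITELIST for lines {line + 2}/{line + 3}, fill COPYLIST"
        else:
            work = f"copy COPYLIST sprites into {prepared}"
        steps.append((work, line, shown))
    return steps


if __name__ == "__main__":
    for work, line, shown in pipeline_steps(0, 8):
        print(f"{work:55s} then output line {line} from {shown}")
```

The buffer that is being filled is never the one currently displayed, which is the whole point of using two buffers.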
By the way, SPRITEBUF1 and SPRITEBUF2 are spaced and surrounded by three areas named DUMMUSPACE1, DUMMUSPACE2 and DUMMUSPACE3, each of them 86 bytes wide. They are used for nothing except to store dummy pixels for sprites which are close to the borders of the screen or even outside the screen. So X pointers can point up to -172 to the left or (380+172-sprite width) to the right, and the sprites will be correctly hidden if they are outside the screen. Y pointers can be stretched without limit, with no special care.
How to draw your own sprites and convert them to data tables
Both in video memory and in sprite tables, pixels are organized in the same way: bits #0,#2,#4,#6 are for the first pixel, bits #1,#3,#5,#7 of the same byte for the next one, and so on. That is how they have to be arranged when the sprite is created and the pixel data table is generated. It can be a .byte or .word data list, so the video routine can access it. Bits 16...23 of program memory are not used by the video routine. Sprite tables can be located in any page of program memory.
There are many ways to create image or sprite data tables. One possible way is to use a drawing program (e.g. Photoshop) to create a 16-colour palette, with colours arranged in this way:
| 0 Orange | 4 Dark blue | 8 Black | 12 Light blue |
| 1 Dark red | 5 Dark violet | 9 Light red | 13 Light violet |
| 2 Dark green | 6 Dark cyan | 10 Light green | 14 Light cyan |
| 3 Dark yellow | 7 Gray | 11 Light yellow | 15 White |
Now draw the sprite or slides for the animation in Indexed Color mode (with all transparent areas painted orange), and save it in .RAW format. If you look at the .RAW file in some hex editor, you shall see that the colour for every pixel is represented in a single byte. Now you have to create the simple program which converts the file to ASCII data table, respecting bit orders represented on the drawing.
That program should create ASCII directive .WORD or .BYTE, numeric constant prefixes 0x (if bytes are converted to hex), commas as table separators and line feeds, so the output should possibly look like this:
Or like this, depending on the mode used:
You can copy your table as text into the source file of your application.
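A converter of this kind can be very short. The Python sketch below is one possible version, not the tool used for this project. It assumes that the .RAW file holds one palette index (0...15) per byte, that bit k of the index goes to bit 2k of the output byte for the first pixel and to bit 2k+1 for the second (which agrees with the palette order above, where 1 is red, 2 is green, 4 is blue and 8 is intensity), and that a lowercase .byte directive with hex constants is acceptable. The function names are invented for the example.

```python
# Convert one-byte-per-pixel indexed image data to a packed .byte table.
# Assumed bit layout: index bit k -> byte bit 2k (first pixel of the pair)
# and byte bit 2k+1 (second pixel). Names are illustrative only.


def pack_pixels(first, second):
    """Interleave two 4-bit colour indices into one video memory byte."""
    packed = 0
    for k in range(4):
        packed |= ((first >> k) & 1) << (2 * k)
        packed |= ((second >> k) & 1) << (2 * k + 1)
    return packed


def raw_to_table(raw, width, per_line=8):
    """Return assembler text with .byte directives for the given raw pixels."""
    if width <= 0 or width % 2 or len(raw) % width:
        raise ValueError("width must be even and divide the pixel count")
    if any(pixel > 15 for pixel in raw):
        raise ValueError("palette indices must be in the range 0...15")
    packed = [pack_pixels(raw[i], raw[i + 1]) for i in range(0, len(raw), 2)]
    lines = []
    for i in range(0, len(packed), per_line):
        chunk = packed[i:i + per_line]
        lines.append(".byte " + ",".join(f"0x{value:02X}" for value in chunk))
    return "\n".join(lines)


if __name__ == "__main__":
    # Four pixels: white, orange (transparent), orange, white.
    print(raw_to_table(bytes([15, 0, 0, 15]), width=4))
```

To convert a real file, read it in binary mode and pass its contents and the image width to raw_to_table. Because sprites are copied word by word, it is sensible to keep sprite widths at a multiple of 4 pixels.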
The audio signal is generated by software and internal MCU peripherals. There is no dedicated sound controller among the PIC peripherals, so some other resources are used instead. As a consequence of this compromise, the audio output signal is pure binary (square wave), just like in the old computers and game consoles. To make the sound more pleasant, there is a passive, first order low-pass RC filter (R15-R19 and C9). There are five audio channels, each of them controlled by software. On-off, tone frequency and two-step volume control are supported. Three channels are typically used for music, one is for Jack's sound effects and one for his enemies' sound effects.
The most convenient peripheral for audio signal generation is OC (Output Compare), which is preset to Center-Aligned PWM mode. Unfortunately, there are only four OC channels in the PIC24EP512GP202 (or any other 28-pin PIC MCU), and one of them is already used in the horizontal sync generator, so we have only three for audio. They are used for music.
The remaining two audio channels (sound effects) are created using the TX units of the UART1 and UART2 peripherals. TX buffers (U1TXREG and U2TXREG) are permanently loaded with 10101010s, so the whole transmitted sequence, including Start and Stop bits, contains 0101010101s, repeated endlessly. BRG (Baud Rate Generator) determines the frequency, and On-Off is obtained by remapping Peripheral Pin Select to TX out or Port Latch.
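The bit pattern can be checked with a small model of a UART frame. This Python sketch assumes standard 8N1 framing (one start bit at 0, eight data bits sent least significant bit first, one stop bit at 1) and non-inverted polarity; it is only an illustration and the function names are invented.

```python
# Model one UART frame to show why the TX pin carries a square wave.
# Assumption: standard 8N1 framing, LSB first, non-inverted output.


def uart_frame_bits(data_byte):
    """Return the 10 bits of one frame in the order they appear on the wire."""
    data_bits = [(data_byte >> k) & 1 for k in range(8)]   # LSB first
    return [0] + data_bits + [1]                           # start + data + stop


def tone_frequency(baud_rate):
    """One square wave period spans two bit times."""
    return baud_rate / 2


if __name__ == "__main__":
    print(uart_frame_bits(0x55))
    print(tone_frequency(880))
```

Under these assumptions the data bits 10101010 in transmission order correspond to a register value of 0x55, and back-to-back frames form a continuous square wave at half the baud rate.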
In the Menu (or Pause) page, each channel is adjusted to Off, Low and High volume. The only difference which determines volume level, is OD (Open Drain) bit in ODCx register.
A music script consists of 4-byte groups. The first byte is for tone channel #1, then come tone channel #2, tone channel #3 and the duration (in 17.86 ms steps, which is equivalent to 56 Hz, same as the VGA vertical frequency). The table at the bottom is for tone pitch bytes (red numbers, 1-75), showing the corresponding tones (blue, A2-B8) and frequencies (black, 110-7920 Hz). If the tone byte contains 0, it is a pause (no tone on that channel), and if it contains 99, it is "no change" (the previous tone continues). All table members must contain four bytes. The table terminator contains 0,0,0,255, followed by a one-word address (little-endian) of the new table which shall be executed. So "0,0,0,255" is the opcode "jump" and the following word is the jump address.
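The script format is simple enough to decode in a few lines. The Python sketch below walks a music table and resolves the "no change" bytes; it is a host-side illustration with invented names, not the firmware player. The pitch_frequency helper additionally assumes equal temperament with pitch byte 1 at A2 (110 Hz) and one semitone per step, so its results may differ slightly from the project's own frequency table, and it ignores the global pitch shift variables.

```python
# Decode a music script made of 4-byte groups (three tones + duration).
# Names are illustrative; the frequency formula is an assumption.
PAUSE, NO_CHANGE, JUMP = 0, 99, 255


def pitch_frequency(pitch):
    """Assumed mapping: byte 1 = A2 = 110 Hz, one semitone per step."""
    return 110.0 * 2 ** ((pitch - 1) / 12)


def decode_music(script):
    """Return ((ch1, ch2, ch3), duration) events, ending with ("jump", address)."""
    events = []
    current = [PAUSE, PAUSE, PAUSE]
    i = 0
    while i + 4 <= len(script):
        *tones, duration = script[i:i + 4]
        if tones == [0, 0, 0] and duration == JUMP:
            if i + 6 > len(script):
                raise ValueError("jump opcode without an address word")
            address = script[i + 4] | (script[i + 5] << 8)   # little-endian
            events.append(("jump", address))
            break
        # 99 keeps the previous tone on that channel, anything else replaces it
        current = [old if tone == NO_CHANGE else tone
                   for old, tone in zip(current, tones)]
        events.append((tuple(current), duration))
        i += 4
    return events


if __name__ == "__main__":
    demo = bytes([1, 13, 99, 10,  99, 0, 25, 5,  0, 0, 0, 255,  0x34, 0x12])
    for event in decode_music(demo):
        print(event)
```

Each duration unit is one video frame, so a duration byte of 56 lasts roughly one second.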
For sound effects, the script contains 2-byte members. The first byte is the tone pitch, and the second one is the duration in 17.86 ms steps (56 Hz). The terminator contains two zero bytes. There is no loop address like in the music script, as sound effects should be executed only once for each game event. A pause is represented as 0 (followed by a pause duration byte which is >0), and there is no "equal" byte, as this is a monotone effects table.
There are two global variables, named music.shift and effects.shift, which are used as global pitch shifters for final frequency adjustment in one-tone steps. By default, they are 8 and 6, respectively.
The first example is for the part of Jumping Jack main theme (pp is pause and it equals to 00, ee means "equal" and it equals to 99 in the table):
and the second one is sound effect for Jack's jump:
Note: Special credits go to Marko Antonić, who made the arrangements for the music and adapted them for this medium. Thanks to him, the music sounds surprisingly good, even with this modest hardware.
Horizontal sync is generated by the OC1 (Output Compare 1) register, which is synchronized to T1 (Timer 1). For full horizontal clock synchronization, a special routine is executed after each horizontal sync signal, otherwise permanent horizontal jitter would be generated. This routine is very important in a software driven video signal, and it computes (blue) and executes (red) the series of NOPs to get proper horizontal timing.
1:      btss    IFS0,#T1IF      ; test horizontal sync interrupt flag
        bra     1b              ; wait for horizontal sync
        mov     #14,w0          ; minuend (may be changed to adjust horizontal picture position)
        subr    TMR1,WREG       ; 14 - TMR1 ---> w0
Vertical sync is generated by software. Timer 2 generates interrupts in variable timing periods, and it services four different events:
1. Vertical sync (one dummy line plus two lines)
2. Markers for automatic monitor adjustment (one dummy line plus one line)
3. Main graphical image (one dummy line plus 240*2 lines)
4. Bottom line text image (one dummy line plus 10*2 lines)
Each of those events is initiated by a T2 (Timer 2) interrupt. A special counter (VGATASK), which counts interrupts in the range 0...3, determines which routine will be executed. At the end of that routine, the number of lines (timing until the next routine) is written to PR2, so that T2 triggers a new interrupt at the right moment. These timings may be used to adjust the vertical position of each picture portion.
Red coordinates on the previous picture signify the periods in which the non-interrupt program code (game scenario and logic, keyboard scan, pointer and sprite servicing, etc.) is executed. Black ones belong to the video sync routine (black line at the top of the drawing), monitor auto adjusting markers (two blue lines at the top), main image and bottom line text. Blue coordinates (1 line each) are for timing synchronization.
The polarity of both sync signals is positive. Although 24E-series PIC MCUs use only a 3.3 V power supply, some port pins are 5 V tolerant, not only as inputs, but also as outputs, when they are defined as Open Drain output pins. In that case, pull-up resistors to +5 V are used to define the high output voltage level.
This game originates from the early days of personal computing. The scenario is very simple: there is a ground level and six floors, with several holes which permanently travel left and right. Jack can jump up through those holes, but also fall down, in which case he will be unconscious for some time. If he falls down to the ground level, he loses one life. He also has to avoid several enemies, coming from the left, from the right and falling from above, as each contact with them will cost him one life. One of those figures (the boxed rotating heart at the left of the screenshot) will not take but give him one life, and the shield (the big dotted rotating circle, top right on the picture) will not only add one more life, but also protect him from all enemies for some time. There is a total of five lives at the beginning.
There are two elevators which will not take him upstairs, but only downstairs. An elevator will automatically come when he approaches it. When he enters the elevator on the upper floor, the light in that elevator will automatically turn on. His goal is to switch the light on in both elevators, and then the next level will start. There is a total of 13 levels, each of them harder than the previous one, as more and more enemies appear. After level 7, some obstacles will also be positioned on some floors.
Cursor keys are for Jack's navigation. The "Up" key enables Jack to stick to the ceiling, to avoid the hole under him, and the "Down" key makes him jump down one floor without consequences. If you press the "Jump" key at the right moment, Jack will jump one floor up.
The "Pause" key temporarily stops the game and opens the navigation through game levels, music select and two-step volume control for music and sound effects. If you prefer the right-hand arrow keys, you can switch "Rotate Joystick" to "Yes" and, when you exit the Pause/Menu screen, all controls will be rotated by 180°.
You can download software and PCB file here:
Note that there is still no bootloader in the software, so there is no RS232 input support. It will be added soon.