Video conversion using dual port RAM in FPGA

A project log for MSX2(+) video to VGA conversion (proof of concept)

V9938 and V9958 Video display processors were successors to TMS99X8 - this is an attempt to convert their video signal to VGA using FPGA

zpekic, 03/29/2021 at 04:06

The basic approach is essentially the same as described here:

The key differences are:

Parameter          | Earlier project                             | This project
Resolution         | 512*256                                     | 256*192 (typically)
Colors             | 4 (2 bit "intensity")                       | 8 (1 bit per R, G, B)
Pixels per byte    | 4 (b7:b0 = VvVvVvVv)                        | 2 (b7:b0 = -RGB-RGB)
Pixel clock        | 12MHz                                       | 5.3693175MHz
Data sampler clock |                                             |
Horizontal sync    | positive HSYNC, video signal has no porches | positive HSYNC, video signal has front and back porch
Vertical sync      | positive VSYNC, video signal has no porches | regenerated from CSYNC, video signal has top and bottom porch
Window on VGA      | 512*256                                     | 512*384
Memory used        | 32k                                         | 24k

Refer to the following files for key components:


This is the main top-level component. The video signals come in through 8-pin PMOD port:

alias VIDEO_HSYNC: std_logic is PMOD(7); -- BB6 on Anvyl (white)
alias VIDEO_CSYNC: std_logic is PMOD(6); -- BB5 on Anvyl (blue)
alias VDP_B_DIG: std_logic is PMOD(3);     -- "digitized" blue signal (using LM339 1-bit ADC)
alias VDP_G_DIG: std_logic is PMOD(2);     -- "digitized" green signal (using LM339 1-bit ADC)
alias VDP_R_DIG: std_logic is PMOD(1);     -- "digitized" red signal (using LM339 1-bit ADC)
alias VDP_CPUCLK: std_logic is PMOD(0);     -- v9958 pin 8 (XTAL/6 == 3.579545MHz)

(simplified here, the actual code contains overlapped signals for TIM-011 mode)

Of these signals, only VIDEO_HSYNC is used directly, as it is a positive pulse that resets the horizontal scan counter and drives the vertical scan.


VIDEO_CSYNC contains both the VSYNC and the HSYNC signals. To extract VSYNC only, a simple delay line is used that filters out any pulse no longer than HSYNC (24 pixels = 96 XTAL cycles).

--generate VSYNC by filtering out HSYNC from CSYNC using a delay line
on_vdp_cpuclk: process(reset, VDP_CPUCLK, VIDEO_CSYNC, VIDEO_HSYNC)
begin
    if (rising_edge(VDP_CPUCLK)) then
        csync_line <= csync_line(30 downto 0) & VIDEO_CSYNC;
    end if;
end process;

vdp_vsync <= not (VIDEO_CSYNC or csync_line(17)); -- 24 pixels long ~ 17 CPUCLK
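
To see why this expression drops short pulses, a small behavioral model in Python helps. This is only a sketch, not the project's code: it assumes CSYNC idles high and pulses low (which the `not (... or ...)` expression suggests), treats bit 17 as the sample taken 17 clocks earlier, and uses an illustrative 100-cycle pulse to stand in for VSYNC. The 16-cycle pulse corresponds to the 96 XTAL cycles of HSYNC at XTAL/6.

```python
def extract_vsync(csync_samples, tap=17, width=32):
    """Software model of the csync_line delay line (one entry per CPUCLK)."""
    mask = (1 << width) - 1
    line = mask                      # assumption: the line starts out all high
    vsync = []
    for csync in csync_samples:
        line = ((line << 1) | csync) & mask   # shift in the newest sample
        delayed = (line >> tap) & 1           # sample from `tap` clocks ago
        # vsync is high only if CSYNC is low now AND was low `tap` clocks ago
        vsync.append(0 if (csync or delayed) else 1)
    return vsync

# idle, HSYNC-like pulse (16 clocks), idle, VSYNC-like pulse (100 clocks), idle
csync = [1] * 20 + [0] * 16 + [1] * 40 + [0] * 100 + [1] * 20
vsync = extract_vsync(csync)

print("clocks with vsync high:", sum(vsync))
```

In this model the short pulse never reaches the output, while the long pulse appears shortened by the 17-clock tap delay.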


VDP_CPUCLK is the master clock used to synchronize the pixel clock. Its frequency is XTAL/6, so it has to be multiplied to recover XTAL. The multiplication factor used is 12, implemented with a built-in DCM ("digital clock manager") circuit baked into the Xilinx FPGA; almost all FPGAs provide similar DCM or PLL circuits that can generate clocks of almost any frequency. However, multiplying by 12 is not perfect: the error shows up as vertical bars that appear when digitizing the R, G, B signals.
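
As a quick sanity check of the clock arithmetic, here is a small Python sketch (not part of the FPGA project; the variable names are made up for illustration). It starts from the 3.579545 MHz CPUCLK value given in the pin comment and assumes the DCM multiplies by exactly 12:

```python
from fractions import Fraction

CPUCLK_HZ = Fraction(3_579_545)   # V9958 pin 8, XTAL/6 (from the alias comment)
DCM_MULTIPLIER = 12               # multiplication factor named in the text

dcm_out_hz = CPUCLK_HZ * DCM_MULTIPLIER   # clock that drives the delay lines
xtal_hz = dcm_out_hz / 2                  # divided by 2 to get back to XTAL
pixel_clock_hz = xtal_hz / 4              # 4 XTAL cycles per pixel

for name, hz in [("DCM output", dcm_out_hz),
                 ("XTAL", xtal_hz),
                 ("pixel clock", pixel_clock_hz)]:
    print(f"{name}: {float(hz) / 1e6} MHz")
```

The last value agrees with the 5.3693175 MHz pixel clock listed among the key differences.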

The clock produced (42.95454 MHz) is then divided by 2 but also used to drive delay lines for digitized R, G, B:

on_vdp_xtal_int2: process(VIDEO_HSYNC, vdp_xtal_int2, VDP_R_DIG, VDP_G_DIG, VDP_B_DIG, r_line, g_line, b_line)
begin
--	if (VIDEO_HSYNC = '1') then
--		vdp_xtal_int <= '0';
--	else
		if (rising_edge(vdp_xtal_int2)) then
			vdp_xtal_int <= not vdp_xtal_int;
			r_line <= r_line(6 downto 0) & VDP_R_DIG;
			g_line <= g_line(6 downto 0) & VDP_G_DIG;
			b_line <= b_line(6 downto 0) & VDP_B_DIG;
		end if;
--	end if;
end process;


VDP_R_DIG, VDP_G_DIG and VDP_B_DIG are the "raw" 1-bit color signals from the LM339. They are not fed directly to the sampler, however: the timing can be tweaked a bit by tapping into the delay line at different points. This allows the video signals to be sampled at a precise moment, which avoids some of the noise.

r_delayed <= r_line(to_integer(unsigned(switch(7 downto 6) & '1')));
g_delayed <= g_line(to_integer(unsigned(switch(5 downto 4) & '1')));
b_delayed <= b_line(to_integer(unsigned(switch(3 downto 2) & '1')));

Six switches on the Mercury baseboard select the moment to sample the color signal. 
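
The tap arithmetic can be checked with a short Python sketch. It is an illustration only: `selected_tap` is a made-up helper that mirrors the `switch(...) & '1'` concatenation, and the delay-line clock is taken to be the 42.95454 MHz DCM output mentioned earlier.

```python
DELAY_CLOCK_HZ = 42_954_540   # clock driving r_line / g_line / b_line

def selected_tap(switch_pair):
    """Model of to_integer(unsigned(switch_pair & '1')) for a 2-bit value."""
    if not 0 <= switch_pair <= 3:
        raise ValueError("each color uses a 2-bit switch pair")
    return (switch_pair << 1) | 1   # append a '1' as the least significant bit

taps = [selected_tap(pair) for pair in range(4)]
tap_step_ns = 2 * 1e9 / DELAY_CLOCK_HZ   # neighbouring settings are 2 clocks apart

print("selectable taps:", taps)
print("step between settings (ns):", round(tap_step_ns, 2))
```

So each pair of switches picks one of the odd taps of the 8-bit delay line, and adjacent settings differ by two delay-line clocks.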

With these signals ready, they are fed into the "sampler" component:

offset_vdp <= button(3 downto 0) when (switch_tms = '1') else "0000";
vdp: vdp_sampler2 port map (
		reset => RESET,
		clk => vdp_xtal_int, -- DCM output divided by 2
		hsync => VIDEO_HSYNC,
		vsync => vdp_vsync,
		pixclk => vdp_pixclk,
		offsetclk => freq4, 
		offsetcmd => offset_vdp, -- in TMS mode move the 0, 0 dot within the window
		r => r_delayed, --VDP_R_DIG,
		g => g_delayed, --VDP_G_DIG,
		b => b_delayed, --VDP_B_DIG,
		a => vdp_sampler_a,
		d => vdp_vram_dina,
		limit => "001110", --switch_limit, 
		we_in => we_in,
		we_out => vdp_sampler_wr_nrd
	);

The sampler takes the following inputs:



The "sampler" circuit is relatively simple. The key to remember is:

4 XTAL = 1 pixel ("sample_pulse")

2 pixel = 1 byte ("write_pulse")

8 XTAL = 1 byte

So in 8 input clock cycles, the R, G, B signals have to be sampled twice and a byte containing the xRGBxRGB data written once:

-- 8 xtal cycles == 2 pixel clock == 1 byte
on_clk: process(clk, hsync, cnt, r, g, b)
begin
	if (hsync = '1') then
		-- hold the counter at 0 during HSYNC so every line starts in phase
		cnt <= "000";
	else
		if (falling_edge(clk)) then
			cnt <= std_logic_vector(unsigned(cnt) + 1);
		end if;
	end if;
end process;

pixclk <= cnt(1);
write_pulse <= (limit(5) xor clk) when (cnt = limit(2 downto 0)) else '0';
sample_pulse <= (limit(5) xor clk) when (cnt(1 downto 0) = limit(4 downto 3)) else '0';

The exact moment when this happens within the 8-cycle sequence is determined by the "limit" parameter, set as a constant from outside (it is somewhat tweakable).
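
To make the effect of "limit" concrete, the following Python sketch decodes the constant "001110" used in the port map and lists the counter values at which each pulse is enabled. It is a hypothetical helper, not project code, and it assumes "limit" is declared as a (5 downto 0) vector so that the leftmost character is limit(5).

```python
def decode_limit(limit_bits):
    """Split a 6-character bit string, written as limit(5) .. limit(0)."""
    if len(limit_bits) != 6 or set(limit_bits) - {"0", "1"}:
        raise ValueError("expected 6 binary digits")
    invert_clk = limit_bits[0] == "1"        # limit(5): xor'ed with clk
    sample_phase = int(limit_bits[1:3], 2)   # limit(4 downto 3)
    write_count = int(limit_bits[3:6], 2)    # limit(2 downto 0)
    return invert_clk, sample_phase, write_count

invert_clk, sample_phase, write_count = decode_limit("001110")

# counter values (0..7) during which each pulse is gated through
sample_counts = [c for c in range(8) if (c & 0b11) == sample_phase]
write_counts = [c for c in range(8) if c == write_count]

print("clock inverted:", invert_clk)
print("sample at cnt =", sample_counts)
print("write at cnt  =", write_counts)
```

With this setting there are two sample slots and one write slot per 8 counts, and the write follows directly after the second sample.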

The "sample" pulse drives a shift register that moves by 4 bits (note that MSB is set as '0'), and lower 3 bits capture the RGB color:

on_sample_pulse: process(sample_pulse, r, g, b, sample)
begin
	if (rising_edge(sample_pulse)) then
		sample <= sample(3 downto 0) & '0' & r & g & b;
	end if;
end process;
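
The same shift can be modelled in a few lines of Python to show where each pixel ends up in the byte. This is an illustrative sketch; `shift_in_pixel` is a made-up name and the two example colors are arbitrary.

```python
def shift_in_pixel(sample, r, g, b):
    """Model of: sample <= sample(3 downto 0) & '0' & r & g & b."""
    nibble = (r << 2) | (g << 1) | b        # bit 3 of the nibble stays 0
    return ((sample & 0x0F) << 4) | nibble  # old low nibble moves up

sample = 0
sample = shift_in_pixel(sample, 1, 0, 0)   # first pixel: red only
sample = shift_in_pixel(sample, 0, 1, 1)   # second pixel: green + blue

print(f"{sample:08b}")   # layout is -RGB-RGB, first pixel in the high nibble
```

After two sample pulses the byte holds the earlier pixel in bits 6:4 and the later pixel in bits 2:0.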

How is the sampled color byte (containing 2 pixels) stored in the memory?

The scan line is typically 256 pixels, which means 128 bytes, so 7 bits of address. There are 192 rows, which fit in 8 bits. So the 15-bit address is:


-- output signals
d <= sample;

a <= v_off(7 downto 0) & h_off(7 downto 1);
we_out <= write_pulse and (not h_off(8)) and (not v_off(8));

-- offset to ignore "left" before real pixel data comes in
h_off <= std_logic_vector(unsigned(h) + unsigned(h_offset(8 downto 0)));--unsigned(limit(2 downto 0) & "00"));
-- offset to ignore "top" before real pixel data comes in
v_off <= std_logic_vector(unsigned(v) + unsigned(v_offset(8 downto 0)));--unsigned(limit(5 downto 3) & "00"));
v_ok <= '0' when (unsigned(v_off) > 191) else '1';
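
The address concatenation (8 row bits followed by 7 column bits) can be checked with a short Python model. This is a sketch for illustration; `vram_address` is a made-up name and the inputs are the already-offset coordinates.

```python
def vram_address(v_off, h_off):
    """Model of: a <= v_off(7 downto 0) & h_off(7 downto 1)."""
    row = v_off & 0xFF                 # 8 bits of row
    byte_in_row = (h_off >> 1) & 0x7F  # 2 pixels per byte, so drop bit 0
    return (row << 7) | byte_in_row

first = vram_address(0, 0)        # top-left pixel pair
last = vram_address(191, 255)     # bottom-right pixel pair of a 256*192 frame
bytes_used = last + 1

print(first, last, bytes_used)
```

The highest address for a 256*192 frame works out to exactly 24k bytes, matching the memory figure given among the key differences.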

However, V and H are not direct horizontal or vertical counters. The pixels do not start right after the VSYNC and HSYNC signals; there are "porches" that delay the start. So both directions have offsets that can be tweaked using two up/down counter registers:

h_reg: offsetreg Port map ( 
				reset => reset,
				initval => "1111100110", -- -26 (0x3E6)
				mode => offsetcmd(1 downto 0),
				clk => offsetclk,
				sel => '0',
				outval => h_offset
			);

v_reg: offsetreg Port map ( 
				reset => reset,
				initval => "1111100101", -- -27 (0x3E5)
				mode => offsetcmd(3 downto 2),
				clk => offsetclk,
				sel => '0',
				outval => v_offset
			);
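
The initial values are negative numbers in two's complement, which is how the porches get skipped. The Python sketch below illustrates this under two assumptions that are not stated in the excerpts: the raw h and v counters are 9 bits wide, and they start from 0 after each sync pulse. All function names are made up for the example.

```python
def to_signed(bits):
    """Interpret a bit string as a two's complement number."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)
    return value

def apply_offset(counter, initval):
    """Model of: off <= counter + offset(8 downto 0), kept to 9 bits."""
    offset9 = int(initval, 2) & 0x1FF
    return (counter + offset9) & 0x1FF

def write_enabled(h_off, v_off):
    """we_out gating (ignoring write_pulse): bit 8 must be clear in both."""
    return not (h_off & 0x100) and not (v_off & 0x100)

H_INIT = "1111100110"
V_INIT = "1111100101"

print(to_signed(H_INIT), to_signed(V_INIT))
print(apply_offset(0, H_INIT), apply_offset(26, H_INIT))
```

In this model, counts inside the porch produce an offset value with bit 8 set, so nothing is written, and the first stored column and row correspond to raw counts 26 and 27.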