Remex I/O Channels

A project log for Kestrel Computer Project

The Kestrel project is all about freedom of computing and the freedom of learning using a completely open hardware and software design.

Samuel A. Falvo II, 02/05/2017 at 08:02, 9 Comments

IBM mainframes have some pretty nice names for their channel architectures. The original, of course, simply is known as "channels." But, when they needed higher performance, IBM released something called ESCON. Later, when that wasn't enough, they released a fiber-optic and substantially faster version called FICON.

As you might guess, I'm not particularly interested in being sued by IBM for infringing on their trademarks, so KESCON or some similar portmanteau or initialism is simply out of the question. Thankfully, it's not a big problem to come up with a decent name of my own: Remex channels.

I selected the name remex because remiges (singular: remex) are the flight feathers of a bird; in a way, they are among the "primary interfaces" between a bird and its environment.

Kestrel-3's I/O channels are based on 1x6 Pmod connectors, 3.3V logic, IEEE-1355 DS-SE-02 signalling, and a modified Spacewire-like protocol for communications between the computer and peripherals. The result is not compatible with Spacewire or even stock IEEE-1355, due to my insistence on supporting bit-banged peripherals on Arduino-class microcontrollers, which, depending upon how they're programmed, can operate at best in the kilobits-per-second range. However, if the device relies on an FPGA or a GA144-type microcontroller, performance can easily reach many tens of megabits per second.
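For readers unfamiliar with data-strobe (DS) signalling, the general rule behind IEEE-1355 DS links can be sketched in a few lines of Python: the data line carries the bit value directly, and the strobe line toggles only when the data line does not, so exactly one of the two wires changes in every bit period and the XOR of the two recovers a bit clock. This is only an illustration of the encoding idea, not the Remex receiver itself; the function names `ds_encode` and `ds_decode` and the idle levels of 0 are assumptions made for the example.

```python
def ds_encode(bits, data0=0, strobe0=0):
    """Return (data, strobe) line levels, one sample per bit period.

    data0 and strobe0 are the assumed idle levels before the first bit.
    """
    data_line, strobe_line = [], []
    prev_d, s = data0, strobe0
    for b in bits:
        if b == prev_d:
            s ^= 1          # data line unchanged, so the strobe toggles
        data_line.append(b)  # data line simply follows the bit value
        strobe_line.append(s)
        prev_d = b
    return data_line, strobe_line


def ds_decode(data_line, strobe_line, data0=0, strobe0=0):
    """Recover bits by sampling data whenever (data XOR strobe) changes."""
    bits = []
    prev_clk = data0 ^ strobe0
    for d, s in zip(data_line, strobe_line):
        clk = d ^ s
        if clk != prev_clk:  # one edge of the recovered clock per bit
            bits.append(d)
        prev_clk = clk
    return bits


if __name__ == "__main__":
    message = [0, 1, 1, 0, 0, 0, 1]
    d, s = ds_encode(message)
    print("data:  ", d)
    print("strobe:", s)
    print("round trip ok:", ds_decode(d, s) == message)
```

A real receiver works on sampled or asynchronous wire levels rather than tidy per-bit lists, which is where most of the design effort goes.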

As I type this, I have completed a preliminary data-strobe decoder and character decoder for the receive pipeline, which is arguably the most performance-critical part of a Remex link. (See GitHub repo.) Right now, icetime reports that the top clock rate for the receiver is 157 MHz, which means you could theoretically feed it a 51 Mbps input data rate. (Unlike professionally made IEEE-1355 links, I'm not using self-clocked receiver logic, due to the innate difficulty of getting such a thing working on a single development tool-chain, much less across a plurality of different FPGA development systems!) The icoBoard Gamma has a 100 MHz oscillator as standard, so I expect to drive the receiver at 100 MHz to achieve a top throughput of 33 Mbps. That's not a fantastically high data rate (a smidge over 2.5 MB/s peak; real-world performance remains to be measured), but for an amateur production like mine, it should be plenty powerful enough for a long time to come.

Besides, if we really need 200Mbps throughput, someone can release an FPGA-/toolchain-optimized revision to the core which enables the receiver to be truly self-clocked. One thing is for sure: 2.5 MB/s isn't fast enough to support even monochrome 640x480 bitmapped displays at 60fps. However, it is capable of 30fps (needs only 1.6 MB/s), so basic animations should still be doable.
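As a rough check of the display arithmetic, here is a minimal Python sketch. It assumes 1 bit per pixel and compares two ways of counting: the visible 640x480 area only, and the full 800x525 raster of standard 640x480 VGA timing (blanking included), which is what a source that streams pixels in step with the beam has to sustain. The 2.5 MB/s link figure is the peak estimate from the post; the helper name `bytes_per_second` is made up for this example.

```python
def bytes_per_second(width, height, fps, bits_per_pixel=1):
    """Raw bandwidth needed to deliver width x height pixels fps times a second."""
    return width * height * bits_per_pixel * fps / 8


LINK_PEAK = 2.5e6  # bytes/s, the post's peak estimate for a 33 Mbps link

cases = {
    "visible 640x480 @ 60 fps": bytes_per_second(640, 480, 60),
    "full raster 800x525 @ 60 fps": bytes_per_second(800, 525, 60),
    "full raster 800x525 @ 30 fps": bytes_per_second(800, 525, 30),
}

if __name__ == "__main__":
    for name, need in cases.items():
        verdict = "fits" if need <= LINK_PEAK else "does not fit"
        print(f"{name}: {need / 1e6:.3f} MB/s, {verdict} in 2.5 MB/s")
```

The visible-area case comes in just under the peak link rate, which leaves essentially no margin for protocol overhead or stalls; the full-raster case at 60 fps exceeds it, and halving the frame rate brings it down to about 1.6 MB/s.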

I'm still playing around with the circuit details as I develop it, since this is the very first time I've ever made any IEEE-1355-compliant link. It's also why I'm not writing any unit tests at this time; things are prone to change quite drastically as I learn more about the requirements of the circuit. For now, all the test-benches just generate waveforms for viewing in GTKWave or a similar tool.

Discussions

K.C. Lee wrote 02/07/2017 at 05:59

>One thing is for sure: 2.5 MB/s isn't fast enough to support even monochrome 640x480 bitmapped displays at 60fps.

FYI:  The dot clock for VGA is 25.175MHz.  The $0.44 ARM in my VGA terminal project is pushing pixels at 25Mbps from SPI to drive monochrome 640x480 at 60fps VGA output.

Samuel A. Falvo II wrote 02/08/2017 at 18:23

That's 25Mbps if, and only if, the CPU can consistently pump bits into the shift register.  For a general purpose computer, running a general purpose operating system, running applications designed to run in a general purpose environment, that's not going to happen.  Ever.

It's far better to off-load that work onto a dedicated terminal/video device, and just send video diffs when/where they're needed.  Problem is, it's not hard real-time, and is not properly framed for beam-racing.

Finally, to display a 640x480 monochrome display, you need 800*525*60/8=3.15 MB/s throughput.  2.5MB/s is not fast enough.

Whether said external/dedicated display device is your $0.44 ARM or not, that's up to you or whoever else is interested in making that port happen.  Personally, I'm going to use my laptop PC and a port of the Kestrel-2 as a terminal at some point in the future.  The raw computer itself is not in any position to display raw video on its own.  Not in a small FPGA like an iCE40.

K.C. Lee wrote 02/08/2017 at 18:53

The pixel clock *already* includes the non-displayable area.  Your assumption is wrong.  Kind of silly to argue with the math of someone that implemented a working design.

I am only stating that your statement "One thing is for sure: 2.5 MB/s isn't fast enough to support even monochrome 640x480 bitmapped displays at 60fps." is wrong.  I didn't care what you are trying to do.

"640 x 350 (EGA on VGA)"    "640 x 400 VGA text"        "VGA industry standard"
Clock frequency 25.175 MHz  Clock frequency 25.175 MHz  Clock frequency 25.175 MHz
Line  frequency 31469 Hz    Line  frequency 31469 Hz    Line  frequency 31469 Hz
Field frequency 70.086 Hz   Field frequency 70.086 Hz   Field frequency 59.94 Hz
One line:                   One line:                   One line:
  8 pixels front porch        8 pixels front porch        8 pixels front porch
 96 pixels horizontal sync   96 pixels horizontal sync   96 pixels horizontal sync
 40 pixels back porch        40 pixels back porch        40 pixels back porch
  8 pixels left border        8 pixels left border        8 pixels left border
640 pixels video            640 pixels video            640 pixels video
  8 pixels right border       8 pixels right border       8 pixels right border
---                         ---                         ---
800 pixels total per line   800 pixels total per line   800 pixels total per line                             
350 lines video             400 lines video             480 lines video
  6 lines bottom border       7 lines bottom border       8 lines bottom border
---                         ---                         ---
449 lines total per field   449 lines total per field   525 lines total

Samuel A. Falvo II wrote 02/08/2017 at 19:32

I think you're misleading yourself.  You imply that I've not done this before, but in fact I have: please refer to my MGIA core, used as the video core for my previous computer design, Kestrel-2.  I also have plans on the backburner for a CGIA core to serve as a massive upgrade from the MGIA, covering both higher resolutions as well as color and maybe even 2D bit-blit acceleration longer term down the road.

Your resolution figures are a distraction; if you divide the 25.175MHz dot clock by 8 pixels per byte, you get (roughly) 3.15 megabytes per second.  You need AT LEAST that data rate in order to RACE THE BEAM, which is exactly what your Chibi terminal is doing by using DMA-driven SPI.  My math is dead on.  My claim is similarly dead-on -- you *cannot* direct-drive a video display at 640x480 resolution with the Remex ports; they're just not fast enough as currently conceived.

The best you could hope for is to send a bitmap at 2.3MB/s (80 bytes*480 lines*60 fps), but this AGAIN assumes that the main CPU, the Remex interfaces on both the host and the slave video display device, AND the slave hardware all cooperate to stream this data at 100% efficiency at all times. These are unrealistically tall orders to fulfill concurrently.

Eric Hertz wrote 02/09/2017 at 03:13

Most distractive "distraction" I've seen all day. I needed that. Thanks guys!

Vince Hodges wrote 02/06/2017 at 20:53

Re: Graphics, don't give up yet!  By making the graphics 'card' high level (i.e., by sending drawing commands) you can do much with limited bandwidth.  Gameduino works this way, as does https://hackaday.io/project/5651-homer (alas, it seems stalled) - be sure to watch the videos in their project updates to see what can be done animation-wise!

When my HiFive board arrives, one of the projects I want to do is a high-level graphics subsystem like this (but since I have zero hardware experience, it will be to my Linux box using something like a Bus Pirate).

Samuel A. Falvo II wrote 02/07/2017 at 00:31

That's the plan.  I want a generic command set though, so the exact graphics terminal is at least minimally abstracted from the computer.  Needing to write device drivers for every kind of attached terminal would truly hurt.

Ultimately, I'm thinking of having terminals expose a 9P filesystem that is compatible with the Rio window manager in Plan 9.

Vince Hodges wrote 02/07/2017 at 03:33

Yep, sounds awesome!
