Kestrel Computer Project

The Kestrel project is all about freedom of computing and the freedom of learning using a completely open hardware and software design.

With each passing day, technically capable consumers of computing technology increasingly lose their rights with computer hardware. While some look to prominent Linux suppliers as an escape from the Intel/Microsoft/Hollywood oligarchy, I have taken a different route -- I decided to build my own computer completely from scratch. My computer architecture is fully open; anyone can review the source, learn from it, and hack it to suit their needs.

From the main project website:

  • No back doors. No hardware locks or encryption. Open hardware means you can completely understand the hardware.
  • No memberships in expensive special interest groups or trade organizations required to contribute peripherals.
  • No fear of bricking your computer trying to install the OS of your choice. Bootstrap process is fully disclosed.
  • Designed to empower and encourage the owner to learn about and even tweak the software and the hardware for their own benefit.
  • Built on 64-bit RISC-V-compatible processor technology.

More precisely, the Kestrel-3, my third generation design, aims to be a computer just about on par with an Atari ST or Amiga 1200 computer in terms of overall performance and capability, but comparable to a Commodore 64 in terms of getting things to work.


This block diagram illustrates my vision of a Furcula-to-Wishbone bus bridge. The KCP53000 CPU exposes a Furcula bus for both its instruction and data ports. Once these buses are arbitrated to a single interconnect, the KCP53001 is used to talk to Wishbone peripherals and memory.

JPEG Image - 205.76 kB - 11/13/2016 at 15:59



This block diagram illustrates how the pieces of the CGIA fit together to serialize graphics data to the VGA port.

JPEG Image - 1.10 MB - 06/16/2016 at 18:57



Here, I draw a GEOS-inspired dialog box-like thing, interactively as you can see.

Portable Network Graphics (PNG) - 22.93 kB - 04/11/2016 at 20:23



Here, I'm writing software to draw simple boxes to the screen using the XOR operator directly on the framebuffer bitmap.

Portable Network Graphics (PNG) - 54.16 kB - 04/11/2016 at 20:22



Finally got block storage working inside the emulator, and along with it, a visual block editor. It's based on my own Vi-Inspired Block Editor (VIBE).

Portable Network Graphics (PNG) - 52.55 kB - 04/11/2016 at 20:21



  • DX-Forth for Kestrel-2DX Now Prints Numbers

    Samuel A. Falvo II, 01/08/2018 at 17:55, 0 comments

    DX-Forth now prints numbers.  Oh, you don't know about DX-Forth?  You should probably read up on the progress of the Kestrel-2DX then.

  • Kestrel-2DX Update: TIM/V

    Samuel A. Falvo II, 09/24/2017 at 18:19, 0 comments

    In case you missed the good news, you might be interested in reading my recent announcement of TIM/V for the Kestrel-2DX.

  • Kestrel-2DX to be submitted for 2017 Hackaday Prize.

    Samuel A. Falvo II, 09/05/2017 at 03:00, 0 comments

    For grins, I've decided to place the Kestrel-2DX up for the 2017 Hackaday Prize, 5th category.  To make this easier for Hackaday folks, I've created a new project dedicated to the Kestrel-2DX.  I will continue logging my progress with it on that project page.  So folks who are interested in the Kestrel, please do follow the Kestrel-2DX project page too!  Thanks!

  • Kestrel-2DX Runs its First C Program

    Samuel A. Falvo II, 09/02/2017 at 03:56, 4 comments

    So, I decided to try getting my RISC-V GCC compiler working with my Kestrel-2DX again, and this time, for whatever reason, my mental model just "clicked" and things worked.  It was an iterative process to get this far; however, I'll describe things in roughly the order I accomplished them.

    Here's the C program I wanted to run:

    /* Invert the contents of the video frame buffer on cold boot. */
    unsigned long long *scrp, *endp;

    void
    _start(void) {
        scrp = (unsigned long long *)0x10000;
        endp = scrp + 2000;
        while(scrp < endp) {
            *scrp ^= 0xFFFFFFFFFFFFFFFF;
            scrp++;
        }
        while(1) ;
    }

    I was able to compile this into a statically linked ELF binary with the following commands:

    ./riscv64-unknown-elf-gcc -O1 -c floop.c -o floop.o -march=RV64I
    ./riscv64-unknown-elf-ld floop.ld floop.o -o floop.exe -nostdlib

    You'll notice that I have a custom loader script, which looks like this:

    MEMORY
    {
            ROM (rx) : ORIGIN = 0x00000, LENGTH = 0x8000
            RAM (rwx) : ORIGIN = 0x14000, LENGTH = 0x8000
    }

    SECTIONS
    {
            .text :
            {
                    . = ALIGN(4);
                    *(.text)
                    . = ALIGN(4);
            } >ROM
            .rodata :
            {
                    . = ALIGN(8);
                    *(.rodata)
                    . = ALIGN(8);
            } >ROM
            .data :
            {
                    . = ALIGN(8);
                    *(.data)
                    . = ALIGN(8);
            } >RAM
            .bss :
            {
                    . = ALIGN(8);
                    *(.bss)
                    . = ALIGN(8);
            } >RAM
    }

    RAM technically starts at address 0x10000; however, the MGIA fetches its video frame from that location, so we configure the linker to place global data variables 16KB away.
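
    The numbers behind that layout can be checked with a little arithmetic.  The sketch below is only an illustration: the addresses and the 640x200 monochrome resolution come from this page, while the macro and function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses taken from the text and the linker script. */
#define RAM_BASE   0x10000u  /* start of RAM; the MGIA fetches video from here */
#define DATA_BASE  0x14000u  /* RAM ORIGIN used by the linker script */
#define FB_WIDTH   640u
#define FB_HEIGHT  200u

/* Bytes needed for a 1-bit-per-pixel bitmap of the given size. */
static uint32_t framebuffer_bytes(uint32_t width, uint32_t height)
{
    assert((width * height) % 8u == 0u);
    return (width * height) / 8u;
}

/* Number of 64-bit words the inversion loop has to touch. */
static uint32_t framebuffer_dwords(uint32_t width, uint32_t height)
{
    return framebuffer_bytes(width, height) / 8u;
}

/* Gap between the start of RAM and the first linker-placed global. */
static uint32_t reserved_bytes(void)
{
    return DATA_BASE - RAM_BASE;
}
```

    With these definitions, the frame buffer needs 16000 bytes (2000 dwords, matching `scrp + 2000` in the C program), which fits inside the 16384 bytes reserved ahead of the globals.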

    Then, to pull out just the code and constant data, I used the following:

    ./riscv64-unknown-elf-objcopy -j .text -j .rodata floop.exe -O binary floop.bin

    At this point, I have a raw binary image.  A problem remains, however.  The constant data precedes the code; thus, I cannot just have the processor reset directly into _start.

    $ xxd floop.bin   # note data at offset $00, not $38 or something.
    0000000: dec0 ad0b ea1d ad0b 1000 0000 0000 0000  ................
    0000010: b707 0100 3747 0100 1307 07e8 83b6 0700  ....7G..........
    0000020: 93c6 f6ff 23b0 d700 9387 8700 e398 e7fe  ....#...........
    0000030: b747 0100 9387 07e8 3707 0100 2330 f700  .G......7...#0..
    0000040: 6f00 0000                                o...

    Thus, I need a bootstrap of some kind, and I need to place the C code somewhere away from address 0.

    So, I wrote a simple bootstrap routine in raw assembly to scan ROM for a special "resident" structure (an idea I learned from coding directly on and for AmigaOS); nothing fancy, just something that would let me find the address of _start.

            include "regs.i"
            addi    a0, x0, $200    ; Start at address 512
    L0:     auipc   a1, 0
            ld      a1, chkdword-L0(a1)
            lui     a2, $8000
    L1:     ld      a3, 0(a0)       ; Did we find the checkword?
            beq     a3, a1, yup
            addi    a0, a0, 8
            blt     a0, a2, L1
            addi    a0, x0, -1      ; Deadlock with all LEDs lit if not found.
            lui     a1, $20000
            sh      a0, 2(a1)
            jal     x0, *
    yup:    ld      a3, 8(a0)       ; Get startup procedure's *offset*
            add     a3, a3, a0      ; Get startup procedure's *address*
            lui     sp, $14000      ; Set up C stack pointer.
            jalr    x0, 0(a3)       ; Let there be C.
            align   8
    chkdword:       dword   $0BAD1DEA0BADC0DE
            adv     $8000, $CC 
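
    For readers who find C easier to follow than assembly, here is a rough host-side model of the same scan.  It is a sketch, not the code that runs on the Kestrel: `find_entry`, `read_le64`, and the `NOT_FOUND` sentinel are hypothetical names, and the loop bound is slightly stricter than the assembly's so that the model never reads past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCAN_START  0x200u                 /* the bootstrap starts scanning at 512 */
#define MATCH_WORD  0x0BAD1DEA0BADC0DEull  /* checkword from the listing above */
#define NOT_FOUND   UINT32_MAX             /* hypothetical "no resident" sentinel */

/* Read a little-endian 64-bit value, as RISC-V's LD would. */
static uint64_t read_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Scan on 8-byte boundaries for the checkword.  The dword that follows it
 * holds an offset, which is added to the structure's own address to get
 * the address of the startup procedure. */
static uint32_t find_entry(const uint8_t *rom, size_t rom_size)
{
    assert(rom != NULL);
    for (size_t addr = SCAN_START; addr + 16 <= rom_size; addr += 8) {
        if (read_le64(rom + addr) == MATCH_WORD) {
            uint64_t offset = read_le64(rom + addr + 8);
            return (uint32_t)(addr + offset);
        }
    }
    return NOT_FOUND;
}
```

    Feeding it a 32KB buffer with the first 16 bytes of the hex dump above placed at offset 512 yields an entry address of 0x210, which is where the code lands once the image is embedded at that offset.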

    I then altered the C code to include the following at the very start:

    struct Resident {
            unsigned long long r_matchWord;
            void (*r_fn)();
    };
    void _start(void);
    const struct Resident R = { 0x0BAD1DEA0BADC0DE, &_start };
    static unsigned long long *scrp, *endp;
    // ...etc...

    After recompiling as above, I now needed to embed the C code into the binary file my personal assembler produced.

    dd if=floop.bin of=rom.bin bs=512 seek=1

    Whoops, this has the effect of truncating the file; I have to re-pad it to 32KB before making the Verilog module with the ROM's contents.

    dd if=/dev/zero of=rom.bin bs=1 count=1 seek=32767

    There, I now have a completed 32KB image.  I rebuild the Verilog ROM module:

    make rtl/rom.v

    I then edit the resulting Verilog file, because Xilinx's flavor of Verilog won't accept the sane syntax that Yosys, Icarus, *AND* Verilator all accept.  One of these days, I'll fix my tooling to automate this.



  • Kestrel-2DX

    Samuel A. Falvo II, 08/31/2017 at 16:13, 0 comments

    For those wondering what I've been up to in my ever-so-copious amounts of spare time, I've been hacking anew on the Kestrel-2 using my Nexys-2 FPGA board.  I didn't want to announce it until I was confident that I could "complete" (for some definition of complete) the build.  I'm now at that stage where I'm confident I can complete it in a reasonable time period.


    • KCP53000 CPU at 25MHz.  This provides an RV64I instruction set architecture for you to play with.  (WORKS)
    • MGIA provides 640x200 bitmapped, monochrome graphics.  (WORKS)
    • 32KiB of actual ROM.  This means more space available for OS and programs loaded from SD card.  (WORKS)
    • 48KiB of block RAM.  Two bits of the GPIA's output register now let you select which 16KiB page of memory to use as the MGIA frame buffer.  (NOTE: This, unfortunately, needs to drop to 24KB if I were to port the design to the Terasic DE-1 board that I have.  Its FPGA just doesn't have the block RAM resources that the Nexys-2 does.  Sorry.)  (WORKS)
    • GPIA for, among other things, talking to the SD card, detecting VSYNC from the MGIA, and controlling the Nexys-2's 4-digit, 7-segment LEDs (though, if I can't afford it any more, this last feature is the first to go away).  (WORKS)
    • Two PMODs reserved for SD card interfaces (versus Kestrel-2's one).  (Planned)
    • KIA cores for using a PS/2 keyboard, and perhaps in a later release, the mouse.  Strongly considering getting another PS/2 port for the board, so I can have both keyboard and PS/2-compatible mouse.  (Planned)
    • 16MB of "expansion RAM" allocated for experimentation with accessing external static, pseudo-synchronous, or synchronous RAM resources.  (Honestly, it's really however much memory space you need; the "CPU" module only exposes a 25-bit address by default, but with some editing of the Verilog files, it can be widened as far as you need.)


    Frankly, my goal is to explore how to talk to external RAM chips reliably, thus opening up the opportunity to realize my ideal Kestrel-3 concept again.  Nothing more than that; as such, I figure 32KB of ROM and 48KB of RAM ought to be way, way more than enough for my needs.  Hence the DX part of the 2DX badge: it's a Developer Architecture.

    Indeed, the Kestrel-2DX is the "test mule" I've always wanted, but was never able to get off the ground.  Except for some reason, I'm now able to get it off the ground, and I'm frankly quite ecstatic about it.  Maybe this is a sign of things to come with respect to the Kestrel-3.

    Where is the Repository?

    I'm maintaining development in a Fossil repository independent of my mainline Github account, on account of its still extremely experimental nature.  I didn't want to get any hopes up by announcing, "Oh, hey, look what I'm doing!", only to get distracted with life, and have it fall into disarray, disuse, and eventually be removed in one of my famous fits of fury.  Also, I find working with Fossil far easier and more productive than with the Git + Github combination.

    When I'm happy with the result, I'll merge everything back into the official Github repository as an official version of the Kestrel-2 lineage.

    Why Fossil?

    Considering that Fossil is a single binary, compiled using little more than a plain-vanilla dialect of C and some POSIX libraries, I'm seriously thinking of moving all of my Kestrel-related material into Fossil instead of Git.  My reasoning is as follows:

    1. To get Git working, you need to port a C compiler, then Perl, Python, Bash, plus a litany of dependencies: for Perl, for Python, for Bash, and then you need the Git-specific dependencies.  It's probably easier to port Linux and its userspace as an entity than it would be to port Git to a foreign, brand-new platform like the Kestrel.  Meanwhile, the footprint to get Fossil ported to the Kestrel-3 remains daunting; but, it's substantially smaller than Git's: a C compiler, and a POSIX environment. ...

  • Reminder: My Current Plan of Attack

    Samuel A. Falvo II, 08/20/2017 at 04:44, 0 comments

    This log is more for me than it is for you; yet, you might find this somewhat informative.  ;)  I need to remember this for posterity, especially considering I'm taking forever making real progress from an external point of view.

    To reaffirm, my immediate goal is to make a circuit inside a Lattice iCE40HX8K-compatible FPGA that can:

    • Accept addresses and data from my host PC, and populate static RAM and/or I/O registers accordingly.
    • Accept addresses from my host PC and report back the contents of memory and/or I/O.

    That literally is it.  No Turing completeness.  No asynchronous behavior of any kind.  However, I do want it to do these things using RISC-V instructions.  Here's how I intend on making this work.

    The IPA

    The Initial Program Adapter (IPA) is a core I've developed which causes the reading bus master to block until the next 16-bit halfword arrives over a serial interconnect.  This could be a Z-80, a 6502, a 68060, or your choice of RISC-V hardware.  It exposes no addressable registers; rather, it's intended to sit throughout the address space you'd normally place ROM.

    This core is already done.

    The "Processor"

    The KCP53010 is not a self-standing processor at the moment.  In fact, it's only now starting to resemble a proper five-stage pipeline.  The pipeline is naked at the moment; to feed it instructions, you must do so using the pipeline's own control signals.  The pipeline currently implements STORE instructions (8, 16, 32, and 64-bit), as well as all OP-IMM instructions.  This is sufficient for populating RAM, but not for inspecting its contents.  For example, to load a code or data image into static RAM, you might want to send the following instruction stream to the IPA:

    ADDI X1, X0, 0      ; x1 points into RAM, where code image is to go.
    ADDI X2, X0, aaa    ; x2 is first byte
    SB   X2, 0(X1)
    ADDI X2, X0, bbb    ; x2 is second byte
    SB   X2, 1(X1)
    ADDI X2, X0, ccc    ; If x2 happens to get valid halfword,
    SH   X2, 2(X1)      ; then we can optimize and store it.
    ... etc ...
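
    On the host side, a stream like this has to be generated as raw 32-bit instruction words.  The sketch below shows one way that could look, using the standard RV64I encodings for ADDI (I-type) and SB (S-type).  The function names are hypothetical, and it is limited to images of at most 2048 bytes so that every store offset fits the 12-bit immediate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* I-type: ADDI rd, rs1, imm  (opcode OP-IMM = 0x13, funct3 = 0). */
static uint32_t encode_addi(unsigned rd, unsigned rs1, int32_t imm)
{
    assert(rd < 32 && rs1 < 32 && imm >= -2048 && imm <= 2047);
    return (((uint32_t)imm & 0xFFFu) << 20) | (rs1 << 15) | (rd << 7) | 0x13u;
}

/* S-type: SB rs2, imm(rs1)  (opcode STORE = 0x23, funct3 = 0). */
static uint32_t encode_sb(unsigned rs2, unsigned rs1, int32_t imm)
{
    assert(rs2 < 32 && rs1 < 32 && imm >= -2048 && imm <= 2047);
    uint32_t u = (uint32_t)imm & 0xFFFu;
    return ((u >> 5) << 25) | (rs2 << 20) | (rs1 << 15)
         | ((u & 0x1Fu) << 7) | 0x23u;
}

/* Emit "ADDI X1,X0,0", then one ADDI/SB pair per image byte.
 * Returns the number of instructions written, or 0 if the image is too
 * large for a single base pointer or the output buffer is too small. */
static size_t build_load_stream(const uint8_t *image, size_t len,
                                uint32_t *out, size_t cap)
{
    if (len > 2048 || 1 + 2 * len > cap)
        return 0;
    size_t n = 0;
    out[n++] = encode_addi(1, 0, 0);            /* X1 = base of RAM image */
    for (size_t i = 0; i < len; i++) {
        out[n++] = encode_addi(2, 0, image[i]); /* X2 = next byte */
        out[n++] = encode_sb(2, 1, (int32_t)i); /* store it at i(X1) */
    }
    return n;
}
```

    This version only emits byte stores; the halfword optimization mentioned in the listing's comments is left out to keep the example short.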

    Before implementing an "instruction fetch" stage to the pipeline, though, I need to complete the LOAD class of instructions.  This will enable me to support inspection of memory as well, so I can implement, for example, a memory test routine on my host PC.

    After LOAD and the instruction fetch stage are complete, I intend to implement JAL and JALR instructions.  This should make the KCP53010 Turing complete, although still not a complete RISC-V implementation.  This should be good enough to support basic programs like "Hello world, what is your name?" "Oh, hello Foo!" type programs.

    The SIA

    The Serial Interface Adapter (SIA) core implements the serial interface used to communicate with the host PC using a basic 4-wire synchronous serial interface.  This core is already done.

    The Host PC Interface

    I dread this step the most.

    This is actually an ESP8266 device which I had received courtesy of Dr. Ting from an SVFIG meeting.  This will adapt the USB interface from the host PC to the 4-wire synchronous serial that the IPA and SIA implement.  This will make programming the device challenging, since I will need to bit-bang the interface to the FPGA.  Absolute yuck.  It's not even very clear to me how to go about testing this module to know it's working properly.


    Things were much, much easier with the Kestrel-2, where I could rely on block RAM, a significantly simpler processor model, and a working video interface from day one.  Bootstrapping the Kestrel-3 is significantly more complex, and I'm not too happy with it.  However, as my log history shows, every attempt to go a simpler route has so far failed.  I truly wish I had a better way of bringing a system up quickly.

  • KCP53010 Pipeline Register Bypass Works

    Samuel A. Falvo II, 08/14/2017 at 05:51, 0 comments

    I just finished register bypass logic for the KCP53010 core.  This allows execute and memory pipeline stages to feed their destination register contents back to the decode stage before register write-back actually completes, preventing a pipeline stall when, for example, the destination of one instruction is used as the source for the next instruction or two.
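
    In software terms, the bypass amounts to a priority selection on every operand read.  The model below is a simplified sketch and not the KCP53010's Verilog; the `pending` structure and `read_operand` name are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A result sitting in a later pipeline stage, not yet written back. */
struct pending {
    bool     valid;  /* stage holds an instruction that writes a register */
    unsigned rd;     /* destination register number, 0..31 */
    uint64_t value;  /* value that will eventually be written */
};

/* Value the decode stage should see for source register rs.
 * The execute stage holds the youngest result, so it takes priority
 * over the memory stage, which in turn beats the register file. */
static uint64_t read_operand(const uint64_t regs[32], unsigned rs,
                             struct pending ex, struct pending mem)
{
    assert(rs < 32);
    if (rs == 0)
        return 0;                       /* x0 is hardwired to zero */
    if (ex.valid && ex.rd == rs)
        return ex.value;
    if (mem.valid && mem.rd == rs)
        return mem.value;
    return regs[rs];
}
```

    Checking the execute stage first matters when two back-to-back instructions write the same register: the consumer must see the newer of the two values.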

    I'm sure things are still buggy; however, my tests so far seem to indicate everything is working.

    Prior to this logic being added, you had to manually 'pad' instructions in the pipeline.  E.g., to add a constant to a register, you'd need to feed the pipeline with five instructions, like so:

    ADDI X1, X0, 256
    NOP  ; or, ADDI X0, X0, 0
    NOP
    NOP
    ; At this point, the value will be stored in X1.
    ; We can now use it.
    ADDI X1, X1, 256

    So, as you can imagine, if you wanted to execute something like:

    ADDI X1, X0, 256   ; X1 := 256
    ADDI X1, X1, 256   ; X1 := X1 + 256 = 512
    SD   X1, 1(X2)

    you would consume something on the order of 15 clock cycles.  With the feedback logic, we no longer need to pad instructions out like this, since we can execute read-after-write instructions immediately:

    ADDI X1, X0, 256   ; [1]
    ADDI X1, X1, 256
    SD   X1, 1(X2)
    NOP  ; decode SD
    NOP  ; effective address calculation [2]
    NOP  ; store bits 63..48
    NOP  ; store bits 47..32
    NOP  ; store bits 31..16
    NOP  ; store bits 15..0

    The value for X1 in instruction [1] above actually gets written into the register file at point [2] in the instruction stream.  But, thanks to forwarding/feedback logic, we can use that value (and its subsequent replacement!) in intervening cycles.

    This reduces the instruction stream's latency to just 9 clock cycles.  The bulk of the time is consumed by the lengthy store operation.
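
    The 15-versus-9 figures can be reproduced with a very small cost model.  It is an assumption-laden sketch: it supposes three padding slots per read-after-write dependency (which is what a five-instruction padded sequence implies) and six trailing slots for a 64-bit store (decode, effective address, and four 16-bit transfers).  The constant and function names are hypothetical.

```c
#include <assert.h>

#define PAD_NOPS     3u  /* padding slots per dependency without bypass (assumed) */
#define STORE_DRAIN  6u  /* decode + address calculation + four 16-bit transfers */

/* Clock cycles for a straight-line sequence that ends in a 64-bit store.
 * instructions: real instructions in the sequence
 * raw_hazards:  how many of them consume the result of their predecessor */
static unsigned stream_cycles(unsigned instructions, unsigned raw_hazards,
                              int has_bypass)
{
    assert(instructions == 0 || raw_hazards < instructions);
    unsigned pad = has_bypass ? 0u : PAD_NOPS * raw_hazards;
    return instructions + pad + STORE_DRAIN;
}
```

    For the three-instruction ADDI/ADDI/SD sequence with two dependencies, the model gives 15 cycles without bypass and 9 with it.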

    There are opportunities for speeding this up further; but, I'm going to leave it as-is for now.  I still need to implement pipeline stall logic, so that the pipeline stalls while a memory fetch or store operation is in-progress.

    An obvious opportunity for performance enhancement is to perform memory writes in the background (ZipCPU does this, for example); however, this optimization may not always work for memory reads (you'd have to be careful about choosing your registers wisely to avoid blocking).  The logic to detect when to stall in this case is pretty tricky, so for now, I'd like to keep things simple.  9 cycles for a 64-bit, 3-instruction write sequence is not horrible.

  • Quick Update Before Work...

    Samuel A. Falvo II, 07/28/2017 at 15:12, 2 comments

    Just a quick update before I rush into the office.

    I've been working on the KCP53010 CPU's pipeline stages on and off over the last couple of weekends.  Overall, I'm happy with the results so far.  You might say that the pipeline fully recognizes STORE and OP-IMM instructions, although supporting more than these (especially LOAD, OP, OP-IMM-32, and OP-32) is quite easily implemented with only a handful of Verilog lines of code.

    Last night, I wrote the very first lines of code that integrates these different stages together into a real pipeline.  It does not work yet, but its current behavior is very promising indeed.  I haven't had the time to implement a real integration test for it yet, so I just relied on the RESET behavior and how it pipes a NOP (ADDI X0, X0, 0) instruction through the queue.  After looking at the waveforms manually, I'm pleased at the results so far.

    Some things which need to be done include (but aren't limited to):

    • Clock the register file's source addresses on the falling clock edge, instead of on the rising edge.  ALTERNATIVELY, take the source addresses from the instruction register's inputs instead of from the instruction register itself.  Either one of these approaches allows me to deliver the contents of the register file concurrently with the instruction decoder outputs, thus letting me keep a 5-stage pipeline.  Otherwise, I'd need to introduce a separate "register fetch" pipeline stage.
    • Move SEL_O signal generation into the memory stage (load/store unit).  Right now, it's an explicit input; however, since its value depends on the output of the ALU in the execute stage, there's no way to precompute it at any earlier stage.
    • Make use of BUSY_O and related pipeline stall signals to control instruction flow through the pipeline.  Right now, instructions just flow synchronously with the clock.
    • Implement register bypass/feedback logic.  This would prevent pipeline stalls or erroneous computations when the source of an instruction I comes from the destination of the previous instruction I-1.
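
    To illustrate why SEL_O cannot be computed before the execute stage, here is a sketch of a byte-lane select for a single transfer on a 16-bit bus.  It assumes a little-endian lane layout where bit 0 of the select covers data bits 7..0; that layout, and the `sel_for` name, are assumptions made for the example rather than facts taken from the Verilog sources.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-lane select for one transfer on a 16-bit bus.
 * addr is the effective address produced by the ALU;
 * size_bytes is 1 (byte) or 2 (aligned halfword). */
static unsigned sel_for(uint64_t addr, unsigned size_bytes)
{
    assert(size_bytes == 1 || size_bytes == 2);
    if (size_bytes == 2)
        return 0x3u;                    /* both lanes */
    return (addr & 1u) ? 0x2u : 0x1u;   /* odd address: upper lane */
}
```

    The result depends on the low bit of the effective address, which only exists after the ALU has added the base register and the immediate.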

    There's a lot of work that needs to happen yet; but, I think I can swing it.  I just need to take this slowly, one step at a time.

  • Commencing Third Pipeline Stage

    Samuel A. Falvo II, 07/02/2017 at 23:27, 0 comments

    Since my last update, I've made many small and incremental improvements to the load/store unit and the register "write-back" side of the X-Register Set modules. To a reasonably good approximation, I think this completes 90% of my work on these stages. I think there are some small artifacts that need to be added still, but these will depend upon the cooperation of other units not yet written, so will have to wait.

    With that said, I think it's time to start on the Integer Execute stage of the pipeline. This is the stage that basically encapsulates the ALU I've already written for the KCP53000.

    LSU Features

    The KCP53010's front-side bus will conform to Wishbone B.4 Pipeline Mode specifications. This new direction satisfies several problems I was having before with the KCP53000, allowing me to collapse several support modules into the core of the CPU effortlessly.

    The B.3/B.4 Standard Mode/Furcula bus ties the master and slave side of the bus inextricably together, which required more sophisticated state machines when adapting to other buses. The 64-bit to 16-bit bridge (KCP53003) added a significant amount of overhead to the circuit, as did all the other bridges that were required to interface the KCP53000 to the Kestrel-2 hardware. It worked; but, it was very slow, and only just barely met timing requirements for a working computer.

    The B.4 Pipelined operation greatly reduces the complexity involved with bridging different bus widths. Supporting 64-bit, 32-bit, 16-bit, and 8-bit transfers over a 16-bit external bus came surprisingly easy once I realized that the command and response (or master and slave, as referenced in the Verilog sources) sides of the bus can be cleanly divorced from each other. I'm banking on this simplification to reduce both layout pressure as well as bump the CPU's operating frequency to a more comfortable rate.
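
    The command side of such a bridge can be pictured as a loop that issues one 16-bit beat per halfword without waiting for each response.  The sketch below models only the data slicing.  It assumes the most significant halfword goes out first, which is the order suggested by the store comments in the register bypass log, and `split_store` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Break a store of size_bytes (1, 2, 4, or 8) into 16-bit bus beats.
 * Returns the number of beats written to out.  Addresses and byte-lane
 * selects are not modelled here. */
static size_t split_store(uint64_t value, unsigned size_bytes, uint16_t out[4])
{
    assert(size_bytes == 1 || size_bytes == 2 ||
           size_bytes == 4 || size_bytes == 8);
    size_t beats = (size_bytes <= 2) ? 1 : size_bytes / 2;
    for (size_t i = 0; i < beats; i++) {
        unsigned shift = 16u * (unsigned)(beats - 1 - i); /* high halfword first */
        out[i] = (uint16_t)(value >> shift);
    }
    if (size_bytes == 1)
        out[0] &= 0x00FFu;              /* only the low byte is meaningful */
    return beats;
}
```

    A 64-bit store becomes four beats, a 32-bit store two, and byte or halfword stores a single beat.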


    Because I now natively support Wishbone, the CPU is now directly responsible for handling address misalignment and data path routing. Right now, the LSU doesn't take misalignment into consideration. This is a known bug, but will be addressed later. However, I'm thinking the hardware to detect and respond to this (and similar) condition(s) will still result in a net reduction in complexity.

  • Mega Progress Update

    Samuel A. Falvo II, 06/04/2017 at 19:34, 0 comments

    I could have sworn that I'd posted an update already, but looking at my logs feed, I clearly have not.

    Topics covered below include:

    • Serial Interface Adapter Core Completed
    • Initial Program Adapter Core
    • KCP53010: Successor to KCP53000 CPU

    Serial Interface Adapter Core Completed

    Not much more to say than that. It's done. It's not as small as I'd like, but on the other hand, it's also more flexible than your typical UART design. It allows you to send and receive serial data streams (LSB first only), with or without start bits, stop bits, etc. Frame checking is up to the software using it. It supports configurable FIFO depths and widths (up to 16-bits wide), allowing you to tune the core for your needs. Those who have programmed the Commodore-Amiga's internal UART will be right at home with how this adapter works. A nice, wide divisor allows for data rates as low as hundreds of bits per second, to as high as tens of megabits per second.

    Data is sent over a pair of wires, TXD and TXC, forming data and forwarded clock, respectively. Data is received on RXD and RXC, respectively. It should be noted that it can be synchronized on RXD, RXC, or both. For lower-speed applications, RXD is sufficient. For higher-speeds, you probably want to ignore RXD and focus just on RXC. The choice is yours.

    This core provides a 16-bit Wishbone B.4 Pipelined Mode slave interface; it should be easily usable with 8-bit devices as well.

    New Initial Program Adapter Core

    The Kestrel-3 code-base now includes a new core, currently with the name "IPA". This core has one mission: to facilitate loading the initial bootstrap code into RAM on a ROM-less computer design. From the processor's perspective, it looks exactly like ROM memory, and sits where ROM normally would; however, on the back-end, it parasitically feeds off the RXD and RXC pins of the SIA core. The idea is simple: when the processor reads a half-word from anywhere in ROM's address space, it blocks until the IPA receives two bytes. The bytes must be sent in PC-standard 8N1 serial format. The IPA is synchronized on the RXC input, so you'll need either a proper USART or a microcontroller to drive it. Since I have two Arduinos and an ESP8266 microcontroller at my disposal, this is not a blocking drawback.

    The idea is you spoon-feed the computer an instruction stream designed to explicitly store data into memory, like so:

    ; X1 = pointer into RAM
    ; X2 = value to store (byte)
    ADDI    X1,X0,0
    ADDI    X2,X0,$03
    SB      X2,0(X1)
    ADDI    X2,X0,$7F
    SB      X2,1(X1)
    ; ...etc...
    and so on until you have loaded 1KB to 2KB worth of code into RAM. If you need more than this, you'll need to manually reset X1 somehow, and continue loading your data. This approach is slow, of course; however, it saves me the hassle of needing to implement a DMAC just for the serial port. LUTs are precious in these smaller FPGAs, so this is a pretty big win for me. Besides, this only has to happen exactly once upon system reset, and the bootstrapper doesn't need to be terribly large (4KB seems like an awfully large bootstrapper to me).
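
    A quick estimate shows how slow "slow" is.  The sketch below counts instructions and serial payload bytes for a given image size.  It assumes two instructions per stored byte, a 2048-byte window per value of X1 (store offsets 0 through 2047), and a caller-supplied guess for how many instructions each change of X1 costs.  The names are hypothetical, and 8N1 framing overhead is not included.

```c
#include <assert.h>
#include <stddef.h>

#define WINDOW_BYTES        2048u  /* offsets 0..2047 fit the 12-bit store immediate */
#define INSNS_PER_BYTE      2u     /* one ADDI to form the value, one SB to store it */
#define HALFWORDS_PER_INSN  2u     /* the IPA hands over 16 bits per read */

/* Instructions needed to load image_bytes, where each (re)load of X1
 * is assumed to cost reset_insns instructions. */
static size_t loader_instructions(size_t image_bytes, size_t reset_insns)
{
    assert(reset_insns >= 1);
    size_t windows = (image_bytes + WINDOW_BYTES - 1) / WINDOW_BYTES;
    return windows * reset_insns + INSNS_PER_BYTE * image_bytes;
}

/* Payload bytes that must cross the serial link for that stream. */
static size_t loader_serial_bytes(size_t image_bytes, size_t reset_insns)
{
    return loader_instructions(image_bytes, reset_insns)
         * HALFWORDS_PER_INSN * 2u;
}
```

    Under these assumptions a 2KB bootstrap costs 4097 instructions, or a little over 16KB of serial payload: roughly eight bytes sent for every byte stored.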

    When the initial program is loaded, you kick it off by sending a JALR X0, 0(X0) instruction.

    The IPA exposes a Wishbone B.4 Pipelined Slave interface, and only supports 16-bit half-words. Attempting to read or write bytes from this space will fail in unpredictable ways. Don't do it. Thankfully, when the CPU fetches instructions, it fetches them 16-bits at a time.

    This is not the first ROM-less Kestrel computer I've made. Indeed, my very first, the W65C816-based proof of concept Kestrel-1, only connected to SRAM and a single VIA chip for I/O. The architecture of the Kestrel-1 and the iCE40-targeting Kestrel-3 designs share much in common.

                     Kestrel 1p4                       Kestrel-3
    CPU              W65C816P-14, 4MHz                 KCP530x0, 25MHz
    Performance      2 MIPS max.                       6 MIPS max. (KCP53000), 12 MIPS est. max. (KCP53010)
    Word Width       8/16                              16/64
    RAM              32KB max.                         256KB min., 512KB typ., 2^60 B max.
    I/O              1 VIA with 16-bit parallel I/O    1 SIA, V.4 compatible serial, 110bps to 12.5Mbps possible.
    IPL Mechanism    Bus mastering...




f4hdk wrote 09/20/2017 at 21:09


I'm happy to see that you still continue with this project.

Have you seen my A2Z project here?

It is quite a similar project, a full computer based on FPGA, but it is much simpler than yours. I've also coded a homemade compiler.


Samuel A. Falvo II wrote 09/22/2017 at 16:56

I believe I've seen it when I first joined the Hackaday community; however, I regret that I haven't been following up on my interests.  Now that I'm fully employed, I tend to focus my free time on my family engagements and, only occasionally, on Kestrel stuff.  :)  Apologies.

You are much further along in your project than I am with mine, though.  Right now, my biggest difficulty is getting reliable SD card operation.  After that, I'll need to make some kind of bootstrap mechanism.  I'm hoping progress will be more forthcoming once I achieve those milestones.


JL9791 wrote 11/27/2016 at 01:20

I see you are still working with Forth :)  I came upon this by accident when researching stack CPUs
I would like to learn Forth someday, I like the simplicity of stacks (which reminds me of my Magic the Gathering days).


Samuel A. Falvo II wrote 11/27/2016 at 01:32

Not having to name every intermediate computation is quite liberating.  But if taken to an extreme, it can also be quite confusing.  :)  The solution is to learn to hyper-factor your code.  A single function in C could well take 16 word definitions in Forth.  Naming procedures is a nice trade-off, because it almost serves to document why your code is the way it is.  Not quite, but good enough for most purposes.  :)  Plus, it really aids in testing code to make sure things work as you expect them to.


JL9791 wrote 11/09/2016 at 01:09

I have been following your project for a while, particularly because you selected the RISC-V ISA to build your CPU around.  I recently came across something I had forgotten about:  the now open source Hitachi CPUs (Sega Genesis, Saturn, Dreamcast) found here

Did you consider those as the brain of your Kestrel?  If not, perhaps they may be a good alternative. :)


Samuel A. Falvo II wrote 11/09/2016 at 01:16

Nope, and I have no intentions to either.  I've invested too much into RISC-V to change now.  Switching ISAs today would literally set me back two years of effort.  Besides, the performance of RISC-V CPUs is quite good in general; that my own CPU is as slow as a 68000 should not be taken as an indication that all such CPUs are that way.

In the future, I'd like to one day hack a BOOM processor into the Kestrel, which would give it a 4-way superscalar CPU.  But, for now, I just want something simple enough that people can understand.

Another reason for adopting RISC-V is that it has learned many things from both the successes and the failures of past architectures.

Thanks for the link though.  You're not the first to suggest it.  :)


JL9791 wrote 11/09/2016 at 01:18

Sure thing.  Yeah, I was not suggesting you scrap all your hard work, just curious.  Glad you are coming along pretty well with it now after the..uh..hiccups :)
