FelinityVM | Details | Hackaday.io

FelinityVM is a new virtual machine designed with the following goals:
-Simple to implement

Simplicity can be a vague concept, so to clarify what it means for this project: someone familiar with computer architectures should be able to code at least a rudimentary implementation within a week. Of course, more advanced optimizations may take longer to research and implement, but from an operational standpoint, it should be that easy to create a new VM from scratch according to this specification. This not only helps propagate the VM to embedded developers who are under tight time restrictions, but also helps prevent the many problems that come from the same code base being ported over and over. If the VM is simple enough to rewrite from scratch in a way that is integrated into your project and system design, why wouldn't you do that instead of taking a machine that was meant for another system and cramming it onto a new one, adding band-aid after band-aid until it finally works? Good implementations will be available for most major systems, so reimplementing from scratch will not be the norm for most application developers, but having that option frees programmers from worrying about a particular implementation restricting what they can do with their application.

-Well documented

A lot of VMs, and software in general these days, are "code first, document later." This project is more about creating a specification that can be easily implemented on any system than about a specific implementation, although several reference implementations will be available, along with implementations for most major architectures including x86-64, ARM, etc.

-Powerful yet small instruction set

The CISC instruction set used for this VM gives power back to the programmer, making assembly writing intuitive and fun by having only a small number of instructions that can do more than most large instruction sets. I don't know how many people still write assembly for modern architectures. I'm sure there are quite a few, but personally I find it a rather tedious experience, and I'm sure I am not the only one. The reason is that people are not meant to write assembly most of the time: today's architectures are bloated with niche operations and complex operand formats that are meant as optimization tools for the assembly that higher-level languages generate. But assembly can be made much easier and far more entertaining if the instruction set is not geared toward compilers. That's not to say that the FelinityVM instruction set ignores the work that compilers need to do, or that compilers should not be written for the VM, but merely that it is designed so that those optimization tools are elegant features, core to the design rather than side features tacked on after the fact that complicate the rest of the architecture.

-Doesn't take away from native features

Most VMs like to abstract you from your native environment, which is good because you want code to be portable. Sometimes, however, you want to take control of specific elements of a native environment. The solution in other VMs such as Java is to write native code for that system, then write some more native code to act as an interface between the VM and the native code you just wrote, creating messy layers of calls to do what you want. FelinityVM allows access to native memory, system calls, and even native processor registers, all from within the VM. Furthermore, since these types of access are all primitives within the VM, they are easily disabled when a sandboxed environment is needed. This gives FelinityVM the flexibility to behave as a sandboxed machine when running untrusted applications; as a microcontroller VM, allowing the same code to run on microcontrollers from all sorts of manufacturers with only minimal changes to the runtime selection of registers; and as a full-fledged bare-metal VM capable of implementing hypervisors, bootloaders, and OSs. Still want to run native code instead of executing in the VM? FelinityVM makes this as simple as setting up your arguments and return address in native memory and then jumping to that native code with a special VM instruction. There are no wrapper APIs that have to be generated ahead of time or anything of that sort. Part of the VM specification is giving you the primitives you need to make native execution simple.

-Runs like a native executable

Most VMs have to be installed before you can run your code, and then to run it you have to invoke the VM (a native executable on your system) and pass your VM code to it as an argument. This is a problem: your desktop manager now has to know what arguments to pass to the VM when you click on an executable, and simple scripts that execute commands in sequence now have to know the location of the VM executable in addition to the arguments it needs. These problems are usually handled with environment variables and other operating system tools, and those tools have drawbacks of their own. While this might be sufficient, it is also unnecessary and convoluted. FelinityVM has the option to build a small copy of the VM into the executable, along with native wrapper symbols for all functions and symbols defined by the VM. This means that on Windows, a compiled FelinityVM executable will be a .exe file and behave as any other .exe would. On Linux, it will be an ELF file with full symbol tables pointing to wrapper functions for the global symbols exported at compile time in the VM.

-New Ideas in CPU-Memory interaction

In most VM or CPU architectures, there are instructions that operate on memory, between registers and memory, or just on registers, and any of these may take immediate or implied operands. In FelinityVM, all operations act on registers. The next natural question is, of course, "How do you manipulate memory?" The answer is revolutionary (at least I think so): every register on the VM is identified by an 8-bit ID. These 8-bit IDs follow the 8-bit opcode of an instruction to identify the registers that hold the operands the instruction needs. But here is the catch: not all 256 IDs are physical registers; some are "pseudoregisters." A pseudoregister ID can be used anywhere a register ID can be used, except that instead of fetching the value from a register, the instruction fetches it from the location denoted by the pseudoregister ID. Some pseudoregisters denote commonly used constants like "0" or "1", while others tell the instruction to fetch the operand as an immediate value that follows the instruction and operand IDs in memory. Now you're probably thinking, "Isn't that what MIPS does?" You are partially correct, except that MIPS is limited as to which registers you can access in this way, and MIPS does not allow accessing memory in this way, so not every instruction can act on an operand in any form. On FelinityVM, one pseudoregister is special: it lets you access memory. This means any operation that would store a value to a register can also store a value directly to memory. Likewise, any instruction that reads an operand from a register can read that operand from an arbitrary location in memory.
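
To make the pseudoregister idea concrete, here is a minimal Python sketch of how an interpreter might resolve operand IDs. Everything specific in it is an assumption made for illustration and is not taken from the FelinityVM specification: the ID values, the 8-bit word size, the single `OP_ADD` opcode, and especially the rule that the memory pseudoregister uses the address held in a designated register (`ADDR_REG`).

```python
# Hypothetical ID assignments; the real FelinityVM specification may differ.
NUM_PHYSICAL = 16   # assumed: IDs 0x00-0x0F are physical registers
PR_ZERO = 0xF0      # pseudoregister: the constant 0
PR_ONE = 0xF1       # pseudoregister: the constant 1
PR_IMM = 0xF2       # pseudoregister: immediate byte after the operand IDs
PR_MEM = 0xF3       # pseudoregister: memory cell addressed by ADDR_REG
ADDR_REG = 0x0F     # assumed: register holding the address used by PR_MEM
OP_ADD = 0x01       # hypothetical opcode: dst = a + b


class TinyVM:
    """Toy 8-bit machine showing operand resolution through 8-bit IDs."""

    def __init__(self, code, mem_size=256):
        self.regs = [0] * NUM_PHYSICAL
        self.mem = bytearray(mem_size)
        self.code = bytes(code)
        self.pc = 0

    def fetch(self):
        """Read the next byte of the instruction stream and advance the PC."""
        byte = self.code[self.pc]
        self.pc += 1
        return byte

    def read(self, op_id):
        """Return the value named by a register or pseudoregister ID."""
        if op_id < NUM_PHYSICAL:
            return self.regs[op_id]
        if op_id == PR_ZERO:
            return 0
        if op_id == PR_ONE:
            return 1
        if op_id == PR_IMM:
            return self.fetch()
        if op_id == PR_MEM:
            return self.mem[self.regs[ADDR_REG]]
        raise ValueError(f"unknown operand ID {op_id:#04x}")

    def write(self, op_id, value):
        """Store a value to a physical register or, via PR_MEM, to memory."""
        value &= 0xFF
        if op_id < NUM_PHYSICAL:
            self.regs[op_id] = value
        elif op_id == PR_MEM:
            self.mem[self.regs[ADDR_REG]] = value
        else:
            raise ValueError(f"operand ID {op_id:#04x} is read-only")

    def step(self):
        """Execute one instruction: opcode, then dst, a and b operand IDs."""
        opcode = self.fetch()
        dst, a, b = self.fetch(), self.fetch(), self.fetch()
        if opcode != OP_ADD:
            raise ValueError(f"unknown opcode {opcode:#04x}")
        self.write(dst, self.read(a) + self.read(b))


if __name__ == "__main__":
    program = [
        OP_ADD, 0x00, PR_IMM, PR_ONE, 41,   # r0 = 41 + 1
        OP_ADD, PR_MEM, 0x00, 0x00,         # mem[r15] = r0 + r0
    ]
    vm = TinyVM(program)
    vm.regs[ADDR_REG] = 0x10
    vm.step()
    vm.step()
    print(vm.regs[0], vm.mem[0x10])  # prints "42 84"
```

The same `OP_ADD` handler serves register, constant, immediate, and memory operands because all of the variation lives in `read` and `write`, which is the property the paragraph above describes.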

-Flexible Word Sizes and Endianness

With most CPU architectures, the size of word you can operate on is limited by the size of the register. x86 has some flexibility here in that you can break up the accumulator registers into smaller sub-registers, but overall the experience of using them is rather dull. While your average programmer probably doesn't feel limited by using a larger word size than needed, the flexibility to use a variety of word sizes can nevertheless be a powerful tool for simplifying code. Likewise, with a choice of endianness, the programmer is no longer limited to one or the other, and no longer has to write messy code swapping back and forth between endiannesses depending on where the value is used. Sections of code that need big-endian values need only set a flag in the status register, and those that need little-endian values need only clear it.
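
As a rough illustration of the status-flag approach, the Python sketch below models memory accesses whose byte order follows a flag and whose width is chosen per access. The flag's bit position (`FLAG_BIG_ENDIAN`), the `Memory` class, and passing the width as a byte count are all assumptions for the example, not details from the specification.

```python
FLAG_BIG_ENDIAN = 0x01  # hypothetical bit position in the status register


class Memory:
    """Toy memory whose load/store byte order follows a status flag."""

    def __init__(self, size=64):
        self.data = bytearray(size)
        self.status = 0  # flag clear means little-endian

    def _order(self):
        return "big" if self.status & FLAG_BIG_ENDIAN else "little"

    def _check(self, addr, width):
        if addr < 0 or width < 1 or addr + width > len(self.data):
            raise IndexError("access outside memory")

    def store(self, addr, value, width):
        """Write the low `width` bytes of value in the current byte order."""
        self._check(addr, width)
        mask = (1 << (8 * width)) - 1
        self.data[addr:addr + width] = (value & mask).to_bytes(width, self._order())

    def load(self, addr, width):
        """Read a `width`-byte unsigned word in the current byte order."""
        self._check(addr, width)
        return int.from_bytes(self.data[addr:addr + width], self._order())


if __name__ == "__main__":
    mem = Memory()
    mem.store(0, 0x11223344, 4)        # little-endian: 44 33 22 11
    mem.status |= FLAG_BIG_ENDIAN      # switch to big-endian
    print(hex(mem.load(0, 4)))         # prints "0x44332211"
    mem.status &= ~FLAG_BIG_ENDIAN     # back to little-endian
    print(hex(mem.load(0, 2)))         # prints "0x3344"
```

No explicit byte-swapping code appears at the call sites; the section that needs the other byte order only toggles the flag.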

-New Ideas about Flow and Logic Control

In most CPU architectures, when you want to call subroutines or jump to other sections of code, you have to use a jump or branch instruction, of which there are typically several forms, including but not limited to jumps that move forward in the code by a specified amount, jumps that go to a specific address, and jumps that are only executed if a certain condition is met. FelinityVM simplifies this by allowing the programmer to manipulate the program counter directly. This may sound dangerous to someone who is not used to dealing with flow control in this manner, but keep in mind that whatever value someone might set the program counter to by writing to the register, they could just as easily reach on common architectures with one of the jump instructions. The advantage of controlling the flow of your program this way is that you don't need any special instructions to do jumps. There are already conditional register assignment instructions, which can be used for conditionally assigning a value in an arithmetic problem or for conditionally setting the new address in the program counter. Furthermore, the conditional assignment instructions take a mask argument that is used when evaluating the truth of a condition. In this way, a single instruction can take the place of virtually all the various conditional branches such as "jump if zero," "jump if not zero," "jump if carry flag is set," etc.

Similarly, all bit manipulation instructions have been replaced by three simple instructions that give you more possibilities than the many other logic instructions. The first instruction is for bitwise operations: your typical ANDs, ORs, NOTs, EORs, etc. Instead of having a separate instruction for each of these, the programmer simply specifies the truth table of the bitwise operation they want to perform. Pseudoregisters exist with the truth table templates for AND, OR, NOT, and EOR so that you need not waste precious cycles preloading a general-purpose register with the table when you want one of these functions. But if you want to do a NAND operation or something more creative, all in one instruction, you can, simply by specifying the truth table of the operation you want to perform. The second instruction also performs a custom operation specified by a truth table, except that its inputs and outputs are treated logically rather than bitwise. That is, an input is considered false if the value in the operand is 0, and true otherwise. The output value returned is then either a 0 or a 1 in the least significant bit of the register. The third bit manipulation instruction deals with moving bits and takes an operand that specifies how the bits should be moved, e.g. shifted left, rotated right, etc. These values are also available as pseudoregisters.
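
The following Python sketch illustrates two of the ideas in this section: a conditional assignment that doubles as a conditional jump when its destination is the program counter, and truth-table-driven bitwise and logical operations. The encodings are assumptions chosen for the example, not the FelinityVM specification: the flag bit positions, the rule that a condition holds when any masked status flag is set, and the 4-bit table layout in which bit `(a << 1) | b` holds the result for inputs `a` and `b`.

```python
FLAG_ZERO = 0x01    # hypothetical status flag bits
FLAG_CARRY = 0x02

# 4-bit truth tables; bit index (a << 1) | b holds the result for inputs a, b.
T_AND = 0b1000
T_OR = 0b1110
T_EOR = 0b0110
T_NAND = 0b0111
T_NOT_A = 0b0011    # NOT of the first operand, ignoring the second


def cond_assign(regs, dst, value, status, mask):
    """Assign value to regs[dst] if any status flag selected by mask is set.

    With dst set to the program counter this acts as a conditional jump.
    The any-flag-set rule is an assumption made for this sketch.
    """
    if status & mask:
        regs[dst] = value


def bitwise_table(a, b, table, width=8):
    """Apply a two-input truth table to each bit position of a and b."""
    result = 0
    for i in range(width):
        index = (((a >> i) & 1) << 1) | ((b >> i) & 1)
        result |= ((table >> index) & 1) << i
    return result


def logical_table(a, b, table):
    """Apply the truth table once, treating nonzero operands as true."""
    index = (int(a != 0) << 1) | int(b != 0)
    return (table >> index) & 1


if __name__ == "__main__":
    regs = {"pc": 0x04, "r0": 0}
    cond_assign(regs, "pc", 0x40, FLAG_ZERO, FLAG_ZERO)   # "jump if zero"
    print(hex(regs["pc"]))                                # prints "0x40"
    print(bin(bitwise_table(0b1100, 0b1010, T_EOR)))      # prints "0b110"
    print(hex(bitwise_table(0b1100, 0b1010, T_NAND)))     # prints "0xf7"
    print(logical_table(5, 0, T_AND))                     # prints "0"
```

Because the operation is data rather than an opcode, adding NAND here needed only a new table constant and no new code path, which mirrors the argument for replacing many logic instructions with one.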
