Close

How about higher-order blocks?

A project log for Bitlang

Minimalistic language for Turing machines and beyond.

joelsJoelS 08/20/2014 at 23:580 Comments

When you read this article, remember the reason for doing this:

* Specification

* Code analysis

* Testing: We can tell the compiler to not call that block (function) after all, hijacking it or spying on it.

* Keep the language small at all costs and put everything into the standard library. Writing new compilers becomes easy. Sometimes you don't need speed, you just need to get it working on a new platform (even if the examples below are extreme).

* The development of the Bitlang Compiler will be easier. (We'll use a very simple parser + LLVM.)

Higher-order blocks aren't really C-macros or C++ templates / Java generics. They're more than that and need to be powerful. For example, let's consider this:

if(a() || b()) {

...

} else if(c.blah) {

...

}

When we implement this, there are a few things to consider. If a is false, we don't want to evaluate b. Likewise, if the first if fails, we need to do the else.

To implement this, we could define (in pseudocode now):

first = 2^{ |first, second| return first }

if = 2^{ |bit expression, block run|

  return ELSE(expression ? first(false, run()) : first(true, nothing()))

}

It gets a bit ugly, but once we have if, everything will be nice in the future. Also, note that this is the specification. The heavily optimized standard library in the compiled code will look more or less exactly as if a C compiler produced it.

So, "if" is a named block that evaluates the block if expression is true. It has two parameters. Everything works nicely, we run if the evaluated expression is true and return some kind of else-type. The else type (as well as the logical-or operator || that need to skip evaluation of b() if a() is true) is harder. Here, we need to be able to take an expression and evaluate it at will. This can be used in two ways:

* Branching. This is where conditional branching comes into the picture. For not-yet optimized compilers, it happens in the ternary expression. For optimized ones, it happens right in the source code as "branch if xxx".

* Analysis. A simple example is hijacking 0^{ this is a %s string } or "this is a %s string" and return the right type. More examples would be in testing where we need to stop certain things from happening.

Things get more complex, because we can have higher-than-2 degree of blocks too, i.e. blocks than run inside the blocks inside the compiler/tool. That will be very handy for writing the compiler that interfaces LLVM.

Discussions