
Typing (for realz)

A project log for BCFJ

Because I want to create a good-for-all language borrowing qualities from Bash, BASIC, C, Forth and JavaScript

Yann Guidon / YGDES, 05/20/2024 at 02:41

Hello 2024, here I am again !

The old log "5. Typing" had a few ideas, but a lot of water has flowed under the bridge since...

Concerning the "just an integer" idea, I have discarded it because it's just a bat5h1t crazy time bomb that's furiously ticking and eager to blow just like it did in C. So I have decided that all the types should be fully defined at declaration time.

But what if I want a "whatever" type, just to get the ball rolling and not care until later ? Enter Mr Crockford with his insane DEC64 format. Look at https://www.crockford.com/dec64.html ... To say that I endorse it would be an exaggeration, but in the context of what I intend to use it for (prototyping algorithms before refining the implementation), it's "good enough" and provides some convenience for integer-only platforms. It is somewhat inspired by the JavaScript tradition and provides 56 bits of integerness, with some weird scaling rules, but it is not as inconvenient as IEEE754 and I guess I can use it for some DSP work, for example.
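To make the "56 bits of integerness" concrete, here is a minimal sketch of how a DEC64 word splits into its two fields, going by the description on Crockford's page: a signed 56-bit coefficient in the high bits, a signed 8-bit exponent in the low bits, value = coefficient * 10^exponent, and exponent -128 reserved for nan. The function names are mine (hypothetical), and this is not the reference implementation, just an illustration.

```python
from fractions import Fraction

NAN_EXPONENT = -128  # DEC64 reserves this exponent for "not a number"

def dec64_pack(coefficient, exponent):
    """Build a 64-bit word from a signed coefficient and a signed exponent."""
    return ((coefficient & ((1 << 56) - 1)) << 8) | (exponent & 0xFF)

def dec64_unpack(word):
    """Split a 64-bit word into (coefficient, exponent), both signed."""
    word &= (1 << 64) - 1
    exponent = word & 0xFF
    if exponent >= 0x80:            # sign-extend the low 8 bits
        exponent -= 0x100
    coefficient = word >> 8
    if coefficient >= 1 << 55:      # sign-extend the high 56 bits
        coefficient -= 1 << 56
    return coefficient, exponent

def dec64_value(word):
    """Exact value as a Fraction, or None when the word encodes nan."""
    coefficient, exponent = dec64_unpack(word)
    if exponent == NAN_EXPONENT:
        return None
    return coefficient * Fraction(10) ** exponent
```

With a zero exponent the coefficient is simply an integer, which is why the format is convenient on integer platforms.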

So here are the scalar types :

The type upper case letter is followed by a number that is a power of 2, at least 8. So valid sizes are :

Modifiers:

padding, ro and wo can be combined into a 2-bit field:

00 : padding (no read, no write)
01 : read only (const flag)
10 : write only (could be a "sink" or dummy)
11 : normal variable
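Read this way, the low bit of the field means "readable" and the high bit means "writable". A minimal sketch of the four codes above; the class and helper names are hypothetical, only the bit values come from the table:

```python
from enum import IntFlag

class Access(IntFlag):
    PADDING = 0b00     # no read, no write
    READ_ONLY = 0b01   # const flag
    WRITE_ONLY = 0b10  # "sink" or dummy
    NORMAL = 0b11      # READ_ONLY | WRITE_ONLY

def can_read(access):
    """True when the 'readable' bit (bit 0) is set."""
    return bool(access & Access.READ_ONLY)

def can_write(access):
    """True when the 'writable' bit (bit 1) is set."""
    return bool(access & Access.WRITE_ONLY)
```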

Even that is not able to fully describe a scalar value. So a sort of "syntax" is needed...

I'm reinventing some sort of ASN.1 binary syntax in fact! But adapted and constrained to the types I expect to handle under the hood of my toy language.

Anyway, a scalar type descriptor will not fit into a byte.

The size field is 4 bits to accommodate 16 possible sizes in bits:

8 16 32 64 128 256 512 1024
2048 4096 8192 16384 32K 64K 128K 256K

256K is a lot... but you never know and the bits are there. You're free to set your own limit for the number of bits you want to support.
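Since the sizes listed above are consecutive powers of 2 starting at 8, the 4-bit code can simply be a shift count. A minimal sketch, assuming code 0 means 8 and code 15 means 256K (the function names are hypothetical):

```python
def size_from_code(code):
    """Map a 4-bit size code (0..15) to a size: 8, 16, ... 256K."""
    if not 0 <= code <= 15:
        raise ValueError("size code must fit in 4 bits")
    return 8 << code

def code_from_size(size):
    """Inverse mapping; rejects anything that is not 8 << n with n in 0..15."""
    code = size.bit_length() - 4
    if not 0 <= code <= 15 or size != 8 << code:
        raise ValueError("unsupported size: %r" % (size,))
    return code
```

An implementation that wants a lower limit only has to reject codes above its own maximum.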

I have also defined 5 types: U/S/F/D/B, so that's 3 bits with some margin. Unicode points are just a subset, for example.

Then the modifiers:

Another property I would like to add is overflow behaviour:

That fits in two more bits.

The total is 5+3+4+2=14 bits, fitting in a U16 scalar with 2 bits left for extensions. That margin is welcome, because I have not yet found a way to describe a fixed-point integer.
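To check the bit budget, here is a minimal sketch that packs the four fields into one 16-bit word. Only the widths (5 modifier bits, 3 type bits, 4 size bits, 2 overflow bits) come from the text; the field order, the names and the idea of treating the modifier and overflow fields as opaque integers are my assumptions for illustration.

```python
from collections import namedtuple

# Hypothetical layout, listed from the least significant bit upward.
FIELDS = (("size_code", 4), ("type_code", 3), ("modifiers", 5), ("overflow", 2))

Descriptor = namedtuple("Descriptor", [name for name, _ in FIELDS])

def pack(desc):
    """Pack a Descriptor into an integer below 1 << 14 (top 2 bits of the U16 stay 0)."""
    word, shift = 0, 0
    for name, width in FIELDS:
        value = getattr(desc, name)
        if not 0 <= value < (1 << width):
            raise ValueError("%s does not fit in %d bits" % (name, width))
        word |= value << shift
        shift += width
    return word

def unpack(word):
    """Inverse of pack(); ignores the 2 spare bits at the top."""
    values = []
    for _, width in FIELDS:
        values.append(word & ((1 << width) - 1))
        word >>= width
    return Descriptor(*values)
```

The 2 spare bits stay at the top of the word, so they can be claimed later without moving the other fields.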

So what's the purpose of this internal, binary, unambiguous representation ?

Oh, there are many reasons to do so.

First it is a great way to compare function prototypes without crazy hassles later.

Imagine: you describe a function with a list of parameters, and that list gets encoded as a binary chain. That chain is all you need to make sure an API matches between caller and callee: just compare the two chains for equality.
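A minimal sketch of that comparison, assuming each parameter is already a 16-bit descriptor and that the chain is just those descriptors laid end to end (little-endian here, an arbitrary choice). The function names and the descriptor values in the usage lines are made up for illustration.

```python
import struct

def encode_prototype(descriptors):
    """Concatenate 16-bit type descriptors into one byte chain."""
    return struct.pack("<%dH" % len(descriptors), *descriptors)

def prototypes_match(caller, callee):
    """Two prototypes agree when their chains are byte-for-byte equal."""
    return encode_prototype(caller) == encode_prototype(callee)

# Usage with made-up descriptor values:
declared = [0x0013, 0x0002]
called = [0x0013, 0x0002]
print(prototypes_match(declared, called))
```

No parsing and no type inference are needed at the call site: a mismatch in the number, order or description of the parameters shows up as a difference in the bytes.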

It is the basis for a typing hierarchy and a bytecode-like version of a program, which can be later described unambiguously across languages.

The remaining 2 bits can encode the type of the U16:

Aligned Strings are missing as well.

I also need to define a range, that would be in a struct probably.

But since it is a "bootstrap language" it is not required to support every bell and whistle, right ?

Edit :

I have forgotten the "Function" type...
