
BCFJ

Because I want to create a good-for-all language borrowing qualities from Bash, BASIC, C, Forth and JavaScript


One of those things I have been thinking about for countless years...

A language that can be very simple and easy to use, like BASIC but with the power of Bash and others.

A language that can span many levels, from the startup firmware of a machine, to user applications, from the machine code to the abstract concepts.

A language that is efficient, to build optimised code like in C (not just by making a powerful compiler but by letting you guide and iterate on the compilation).

A language that can work with interpreted and compiled code (like FORTH but more advanced ?)

...

This language is inspired by Bash/C/FORTH/JavaScript as well as a few others (LISP/Scheme or the Pascal/Ada/VHDL lineage for example) for:

  • ease of programming (no crazy syntax, as Python advocates, so no obscure stack stuff like FORTH)
  • security/safety (inherent checks like Ada)
  • performance (so you can tweak a script into fine-tuned machine code at will)
  • self-sufficiency (store the system in a Flash/ROM)
  • interactive console use (like Bash)
  • building the OS
  • support of my CPUs' features (so the F-CPU and the YASEP have something to run)

This keeps many nice things we have come to expect from today's programming languages but some concepts diverge :

  • Don't bother with POSIX
  • Enforce sandboxed and separate units of code (safety, modularity, reuse) with capability-based access rights. Just like HURD's "everything is a server".
  • Several subsets can be enabled/disabled depending on the use :
    - machine code generation is allowed only in the compiler and assembler context
    - hardware features, IO, protections etc are allowed only in the kernel modules
    - implicit dynamic features are not allowed for code that will be compiled (like VHDL code that can't be synthesised)
    - code introspection (like in FORTH or JS's "eval") only available in development mode (introspecting code can't be compiled)

The main idea is to create a baseline interpreted language that is used to compile itself. It must be able to generate machine code from its own source code, starting from a basic assembler and evolving into a more featured compiler. The interpreter's command line can also serve as a classic shell to administer the computer.


Logs:
1. A bit of historical perspective on early language design
2. The need for a preprocessor
3. Note for later
4. Units
5. Typing
6. Standard types

Chistory.html

The Development of the C Language Dennis M. Ritchie (Bell Labs/Lucent Technologies) Copyright 1993 Association for Computing Machinery, Inc. This article was presented at Second History of Programming Languages conference, Cambridge, Mass., April, 1993.

HyperText Markup Language (HTML) - 63.66 kB - 03/09/2018 at 05:32


  • Standard types

    Yann Guidon / YGDES, 04/02/2018 at 04:25, 0 comments

    BCFJ's fundamental type is equivalent to BASIC's number, or C's "int". It's enough to get a few things done but won't go far...

    Scalar types :

    Defining the size explicitly is important, otherwise many problems appear.

    Integer numbers are either signed or unsigned so the following types should be supported (if the processor can handle it) :

    S8, S16, S32, S64 : Signed integer of 8/16/32/64 bits
    U8, U16, U32, U64 : Unsigned integer of 8/16/32/64 bits
    F16, F32, F64 : Floating point of 16/32/64 bits

    (I didn't add support for logarithmic numbers, unlike in the early F-CPU, where they were a dead-end...)
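    As a rough illustration of why explicit sizes matter, here is a minimal Python sketch (not BCFJ code) that maps the type names above to fixed widths and value ranges. The table `SCALAR_FORMATS` and the helpers `size_in_bits`, `integer_range` and `fits` are hypothetical names made up for this example; only the type names come from the list above.

```python
import struct

# Hypothetical mapping from the type names above to Python "struct" format codes.
SCALAR_FORMATS = {
    "S8": "b", "S16": "h", "S32": "i", "S64": "q",
    "U8": "B", "U16": "H", "U32": "I", "U64": "Q",
    "F16": "e", "F32": "f", "F64": "d",
}


def size_in_bits(type_name):
    """Return the storage width of a scalar type, in bits."""
    # "<" selects the standard sizes, independent of the host platform.
    return struct.calcsize("<" + SCALAR_FORMATS[type_name]) * 8


def integer_range(type_name):
    """Return (min, max) for an integer type such as "S16" or "U8"."""
    bits = size_in_bits(type_name)
    if type_name.startswith("S"):
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if type_name.startswith("U"):
        return 0, (1 << bits) - 1
    raise ValueError(type_name + " is not an integer type")


def fits(type_name, value):
    """Tell whether an integer value can be stored without overflow."""
    low, high = integer_range(type_name)
    return low <= value <= high
```

    With the size stated in the name, a check such as `fits("U8", 256)` has one answer on every machine, which is the point of defining the size explicitly.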

    Characters are another matter. ASCII is only a relic now and UTF-8 is the norm, with a crazy dynamic range: Unicode limits code points to 21 bits, which UTF-8 serialises as 1 to 4 bytes. So there is no actual "character" type as in C, because the representation can vary wildly. However, intermediary types are possible for the various representations (Unicode code point or serialised bytes).
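    The two representations can be compared with a short Python sketch. The helper name `describe` is made up for this example; `ord`, `chr` and `str.encode` are standard Python.

```python
def describe(char):
    """Return both representations of one character:
    its Unicode code point and its UTF-8 byte serialisation."""
    code_point = ord(char)
    utf8_bytes = list(char.encode("utf-8"))
    return code_point, utf8_bytes


# Plain ASCII: one code point, one byte, same value.
ascii_example = describe("A")

# U+00E9 (e with acute accent): one code point, two bytes.
latin_example = describe("\u00e9")

# The highest code point, U+10FFFF, needs 21 bits and serialises as 4 bytes.
highest = chr(0x10FFFF)
highest_bits = ord(highest).bit_length()
highest_bytes = len(highest.encode("utf-8"))

# The same string has a different length in each representation.
text = "h\u00e9"
length_in_points = len(text)
length_in_bytes = len(text.encode("utf-8"))
```

    The last two values differ (2 code points, 3 bytes), which is why a single "character" type cannot cover both views.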

    Strings too can have multiple representations (code points or bytes) and can't be handled directly in the core language. Yet they are very important and useful, so a careful and simple design is required early on. Without proper string handling, no parser is possible... Strings are usually very easy to process in plain ASCII, but I don't want to start with an ASCII-only mode because UTF-8 would then never get implemented. Is the development of a UTF-8 library holding everything back ?

  • Typing

    Yann Guidon / YGDES, 04/02/2018 at 03:26, 15 comments

    What's best ? Pascal's strong typing or JS' weak typing ?

    Both have excellent arguments but also significant drawbacks.

    It's hard to reconcile both.

    Since BCFJ starts as an interpreted language, typing can be weak and in fact, the first type is "a number" (think "int" in C).

    Later, when objects ("blobs") are introduced (à la JS) a stronger typing mechanism can be introduced. The trick is to have/check the "type" attribute, which points to a collection of handlers that then resolve the types, convert or compute values...
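    One possible reading of this "type attribute pointing to handlers" idea, sketched in Python rather than BCFJ. Everything here is an assumption made for illustration: the blob layout (a dict with "type" and "value"), the handler names "add" and "to_number", and the helpers `make_blob` and `dispatch`.

```python
def make_blob(handlers, value):
    """A blob is a plain record whose "type" points to a table of handlers."""
    return {"type": handlers, "value": value}


def dispatch(operation, blob, *args):
    """Look up the operation in the blob's handler table and run it."""
    handler = blob["type"].get(operation)
    if handler is None:
        raise TypeError("no handler for " + operation)
    return handler(blob, *args)


def number_add(a, b):
    # The other operand is converted by its own handlers, so the
    # resolution policy lives in the tables, not in the core language.
    return make_blob(NUMBER_HANDLERS, a["value"] + dispatch("to_number", b))


NUMBER_HANDLERS = {
    "add": number_add,
    "to_number": lambda blob: blob["value"],
}

TEXT_HANDLERS = {
    # Text only knows how to convert itself to a number in this sketch.
    "to_number": lambda blob: int(blob["value"]),
}

two = make_blob(NUMBER_HANDLERS, 2)
forty = make_blob(TEXT_HANDLERS, "40")
total = dispatch("add", two, forty)
```

    Here `total` is a number blob holding 42, while `dispatch("add", forty, two)` raises a `TypeError` because the text table defines no "add" handler. Stricter or looser behaviour is only a matter of which handlers a table provides.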

    So BCFJ does not enforce a typing mechanism or precise resolution behaviours from the beginning (this removes a LOT of red tape from the language). Later, conventions can be added to prevent inconsistencies or adopt the preferred behaviours.

    So BCFJ starts "naked", like in FORTH, with an "integer" type, which is then completed... See the next logs.


    Oh my...

  • Units

    Yann Guidon / YGDES, 03/31/2018 at 04:52, 0 comments

    Have you ever used Turbo Pascal ? Or Ada ? Or VHDL ?

    In TP, you can make a program, but also share code with what's called "units". They are named "libraries" in VHDL.

    In BCFJ, everything is a "unit".

    A "unit" is code, linked to other units by a collection of entry points.

    • One entry point is meant for initialisation (it's called "init", and might be the first ever entry point)
    • Others can provide functions to accepted processes.

    When "init" is void/empty/absent, the unit is comparable to a shared library/dll.

    The "init" entry point can also be equivalent to "main" in C/POSIX. Other entry points are not absolutely required but they can provide asynchronous signal handling for the unit.

    Unlike with POSIX shared libraries, a direct call is not possible: units have separate and protected address spaces. The calling process must perform an Inter-Process Call (IPC) and send the information through the registers and shared memory spaces.

    Some units can be IPCalled and access the calling process' address space BUT can't use any other space at the same time : they provide shared routines, that's all. Data leaks shouldn't be possible unless explicitly requested...

    Remember : units do not have to trust each other. The caller and the callee should not assume benevolence, security and/or safety. A unit could be replaced by a different version between IPCs, either for maintenance or because something went wrong... So don't let any unit's code access your private data.

    This is where "capabilities" and process properties become essential. In order to let other units call you or accept your call, you need reliable information about their rights. For example, you can't let just anybody call your "init" entry point: only a given process can do this, and its process ID might change, but not its access rights.
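    A toy Python model of the rule above, where an entry point is reachable only by callers holding the right capability. It does not model separate address spaces or real IPC, and all names (`Unit`, `export`, `ipc_call`, the capability strings "start_units" and "use_logger") are invented for this sketch.

```python
class Unit:
    """A unit: a named collection of entry points, each guarded by a capability."""

    def __init__(self, name):
        self.name = name
        self.entry_points = {}  # entry point name -> (required capability, function)

    def export(self, entry_name, required_capability, function):
        self.entry_points[entry_name] = (required_capability, function)


def ipc_call(caller_capabilities, unit, entry_name, *args):
    """Call an entry point on behalf of a caller, checking its rights first.
    The check relies on the caller's capabilities, never on its process ID."""
    if entry_name not in unit.entry_points:
        raise LookupError(unit.name + " has no entry point " + entry_name)
    required_capability, function = unit.entry_points[entry_name]
    if required_capability not in caller_capabilities:
        raise PermissionError("caller lacks capability " + required_capability)
    return function(*args)


# Example unit: a logger whose "init" is reserved to a privileged caller.
log_lines = []


def logger_init():
    log_lines.clear()
    return "ready"


def logger_write(text):
    log_lines.append(text)
    return len(log_lines)


logger = Unit("logger")
logger.export("init", "start_units", logger_init)
logger.export("write", "use_logger", logger_write)
```

    A caller holding only `{"use_logger"}` can reach "write" but gets a `PermissionError` on "init". A caller holding `{"start_units"}` can run "init" whatever its process ID happens to be.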

  • Note for later

    Yann Guidon / YGDES, 03/16/2018 at 13:56, 0 comments

    Nobody reads the early design notes. But dear future self, I warn you about the dangers of this.

    https://thedailywtf.com/articles/A_Case_of_the_MUMPS

    https://thedailywtf.com/articles/MUMPS-Madness


    ...

    On the sunnier side, https://blog.codinghorror.com/a-scripter-at-heart/

  • The need for a preprocessor

    Yann Guidon / YGDES, 03/10/2018 at 11:05, 0 comments

    Compiled languages often have a preprocessor (I'm looking at you, Ada/VHDL...). Scripting languages don't.

    Since this is based on a scripting engine (with an interpreter that manages file inclusions etc.) there is no need for preprocessing or substitution. The dynamic environment does the preprocessor's work as well as "elaboration" (like in Ada/VHDL). The "source code" contains not only the instructions to execute but also how to execute them...
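    To make the comparison concrete, here is a small Python sketch of a script doing at load time what a C programmer would do with `#ifdef` and what Ada/VHDL do during elaboration. The function `load_unit` and the configuration keys "debug" and "table_size" are hypothetical, chosen only for this example.

```python
def load_unit(config):
    """Build a unit's definitions from a configuration, at load time."""
    definitions = {}

    # Equivalent of "#ifdef DEBUG": ordinary code picks which definition exists.
    if config.get("debug"):
        def trace(message):
            return "[trace] " + message
    else:
        def trace(message):
            return ""
    definitions["trace"] = trace

    # Elaboration-like step: a table whose size is computed before any use.
    definitions["table"] = [0] * config.get("table_size", 4)
    return definitions


debug_build = load_unit({"debug": True, "table_size": 8})
release_build = load_unit({})
```

    No separate substitution pass is involved: the same language that runs the program also decides what the program contains.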

  • A bit of historical perspective on early language design

    Yann Guidon / YGDES, 03/09/2018 at 05:31, 0 comments

    https://www.bell-labs.com/usr/dmr/www/chist.html has been added to the Files section.


Discussions

Yann Guidon / YGDES wrote 05/20/2018 at 01:33

A nice find by Peter Forth : http://beza1e1.tuxen.de/articles/forth.html
I'd love to come up with such a simple and clean path for BCFJ :-)


Morning.Star wrote 03/05/2018 at 06:33

Nice, it's about time somebody made a decent language. Keeping my eye on this Yann :-)


Yann Guidon / YGDES wrote 04/02/2018 at 03:31

Don't hold your breath tooooo long, as I've been tinkering with many of these ideas ever since I learned to program. I've often thought "why is this feature so lousy ? Why couldn't it be done differently ?" So I have collected a pile of thoughts...

Now I realise I can't design my CPUs without some sort of decent support language. I can't procrastinate anymore...
