
Prometheus A.I.

One set of rings that will control all.

What if we could add a kind of "digital DNA" to a classic language like UCSD Pascal, with an eye toward an efficient platform for building projects that combine feature sets from several programming languages and styles within a single unified framework - one that also ties the functionality of multiple devices into a cohesive environment? In effect, such a framework might allow an aspiring robot designer to write an application in a high-level language such as Pascal, then cross-compile it to an intermediate language such as a specialized variant of LISP. From there, the result could either be implemented as a C++ library that provides some of the features of LISP, or another meta-compiler could convert the intermediate representation to run on a microcontroller such as a Propeller 2 - targeting the built-in FORTH interpreter, native assembly, or even traditional UCSD p-code.

Several approaches are frequently taken when developing projects that involve some type of AI.  In the traditional approach, interaction with a simulated intelligence is produced by combining simple pattern-matching techniques with some type of scripting language, which in turn provides a seemingly life-like experience that can be highly effective within certain contexts, even if only up to a point.  This is the approach taken by classic chatbots such as ELIZA, PARRY, MEGAHAL, SHRDLU, and so on.  Whether this type of AI is truly intelligent is a subject for debate, and arguments can be made both for and against the claim that such systems are in some way intelligent - even though nobody can reasonably claim that such systems are in any way sentient - yet WHEN they work, they tend to work extremely well.

Most modern attempts at developing AI seem to be focused on applications that more accurately model some of the behaviors of the neural networks found in actual biological systems.  Such systems tend to be computationally intensive, often requiring massively parallel architectures capable of executing billions of concurrent, pipelined, non-linear matrix transformations just to perform even the simplest simulated neuronal operations.  Yet this approach gives rise to so-called learning models that not only have the potential to recognize puppies, etc., but can also be turned toward more esoteric problems, like certain issues in bio-molecular research, mathematical theorem proving, and so on.

Thus, the first approach seems to work best for problems that we already know how to solve, and it therefore leads to solutions that - when they work - are both highly efficient and provable, with the main costs being the work that goes into content creation, debugging, and testing.

The second approach seems to offer the prospect of systems that are arguably crash-proof, at least in the sense that it should be possible to build simulations of large neural networks as massively parallelized, pipelined, matrix-algebraic data-flow engines - which, from a certain point of view, is simplicity itself.  That would seem to imply that the hardware, at least, can be made crash-proof within reasonable limits, even if an AI application running on such a system might still hang, in the sense that the proposed matrix formulation of some problem of interest might fail to settle on a valid eigenstate.

So, let's invent a third approach: introduce some type of neural network of the second kind that can hopefully be conditioned to create script engines of the first kind.  Not that others haven't tried something like this with so-called hidden Markov models, which just as often introduce some kind of Bayesian inference over a hierarchical model.  There have been many attempts at this sort of thing, often with interesting, if at times nebulous, results, e.g., WATSON, OMELETTE.  So, obviously - something critical is still missing!

Now as it turns out, the human genome consists of about 3 billion base pairs of DNA, each of which encodes up to two bits of information - which would therefore fit nicely in about 750 megabytes for a single set of up to 23 chromosomes, if stored in a reasonably efficient, but uncompressed, form.  Now if it turns out that 99% of this does not code for any actual proteins, then it might very well be that all of the information needed to encode the proteins that go into every cell in the human body might only need a maximum of about...
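Picking up on that thread, here is a quick back-of-the-envelope check of the arithmetic in C++ - a minimal sketch, with the roughly 1% coding fraction taken from the figures above:

    #include <cstdio>

    int main()
    {
        // 3 billion base pairs at 2 bits each, 8 bits to the byte
        const double base_pairs   = 3.0e9;
        const double bytes        = base_pairs*2.0/8.0;  // 750 MB, uncompressed
        const double coding_bytes = bytes*0.01;          // if only ~1% codes for proteins
        std::printf("whole genome: %.0f MB\n", bytes/1.0e6);
        std::printf("protein-coding fraction: %.1f MB\n", coding_bytes/1.0e6);
        return 0;
    }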


prometheus.pdf

A draft copy of the text of the Description, Details, and Project Logs for this project in pdf form. Enjoy!

Adobe Portable Document Format - 225.17 kB - 07/14/2022 at 12:12


pascal_procedures.pdf

A somewhat more condensed version of the list of procedures in the original UCSD Pascal compiler source code. Numbers next to each procedure name are not line numbers; rather, they are symbol numbers based on a version of the tokenizer that captures each symbol's index number as it is parsed. Quotation marks are added when the debugging code reports the name of a variable or string literal.

Adobe Portable Document Format - 9.59 kB - 07/08/2022 at 07:57


UCSD Pascal 1.5 Compiler.pdf

This is the source code for the original UCSD Pascal 1.5 compiler, which was released for free, non-commercial, and/or educational use by the University of California, San Diego sometime around 2003. I have taken the original text file from the UCSD distribution, added line numbers for reference purposes, and rendered it in pdf form.

Adobe Portable Document Format - 221.37 kB - 06/27/2022 at 02:09


  • What answer would please you the most?

    glgorman • 08/08/2022 at 02:52

    I found the original source code for Eliza, as it was written in MAD SLIP.  Unfortunately, when I tried OCR in Adobe Acrobat, the text came out hopelessly garbled.  The original Eliza script ends up looking something like this - if you let the OCR pretend that it knows what it is doing.

    (DOES IT PLEASE YOU TO BELIEVE THAT) 
    C::O YO!..: SOMETI.'·~ES i..JI SH YCU H::R::- 4) 
    (PE~~AP$ YOlJ VlO~LO LIKE TO 3E 4)) 
    ( { 0 ! D YJU) {iil-lY DO YOU Tf-ll \K 1 3 y~~) 
    (YO:...: LIKE TO THINK I 3 YOU - CO~~ 1 T Y-2\...}

    Now I am not going to jump on the current bandwagon (just yet) about the notion that, because of AI, computers have invented their own language which scientists can't understand.  Really, it's just garbled scanned text.  Or at least it is for now.  Yet what if we invented a programming language in which there is no such thing as a syntax error, so that anything will compile, and possibly execute?  Obviously, if Conway's Game of Life is known to be Turing complete, then, at least in principle, it should be possible to implement some type of ostensibly sentient A.I. that might work like ELIZA or GPT-N - that is, when equipped with some kind of learning mode that allows the user to explain things to it, like the fact that "a boat might be a kind of thing that fills a hole in the water that you throw money into," and so on.  This has already led to some dubious results, for others, which I won't go into further - here, just yet.

    Instead, I have taken up the task of retyping the original MAD SLIP source code for Eliza into a fresh text file, hopefully with far fewer garbled characters than Adobe seems able to manage.

    CHANGE MAD
              EXTERNAL FUNCTION (KEY,MYTRAN)
              NORMAL MODE IS INTEGER
              ENTRY TO CHANGE.
              LIST.(INPUT)
              V'S G(I)=$TYPE$,$SUBST$,$APPEND$,$ADD$,
              1$START$,$RANK$,$DISPLAYA$
              V'S SNUMB = $ I3 *#*$
              FIT=0
    CHANGE    PRINT COMMENT $PLEASE INSTRUCT ME$
              LISTD.(MTLIST.(INPUT),0)
              JOB=POPTOP.(INPUT)
              T'H IDENT, FOR J=1,1, J.G. 7
    IDENT     W'R G(J) .E. JOB, T'O THEMA
              PRINT COMMENT $CHANGE NOT RECOGNIZED$
              T'O CHANGE
    THEMA    W'R J .E. 5, F'N IRALST.(INPUT)
             W'R J .E. 7
                  T'H DISPLA, FOR I=0,1, I .G. 32
                  W'R LISTMT.(KEY[I]) .E. 0, T'O DISPLA
    READ(7)          S=SEQDR.(KEY[I])
                  W'R F .G. 0, T'O DISPLA
                  PRINT COMMENT $*$
                  TPRINT.(NEXT,0)
                  PRINT FORMAT SNUMB,I
                  PRINT COMMENT $ $
                  T'O READ(7)
    DISPLA        CONTINUE
                  PRINT COMMENT $ $
                  PRINT COMMENT $MEMORY LIST FOLLOWS$
                  PRINT COMMENT $ $
                  T'H MEMLST, FOR I=1 , 1, I .G. 4
    MEMLST        TXTPRT.(MYTRAN(I),0)
                  T'O CHANGE
              E'L

    Now actually, there are about seven pages of this stuff, which I am in the process of cleaning up, and which of course adds yet one more TODO to my bucket list: why not implement "just enough MAD-SLIP" to run on an Arduino or a Propeller, so we can finally have the "original", or at least feel a little closer to the metal than some JavaScript implementations, no matter how nice, etc.  Apparently, the IBM 7090 had 32,768 words of 36-bit core memory - about 144 KB in modern terms.  So that sort of thing seems reasonably doable, at least as far as the memory footprint is concerned - and that doesn't mean there is any need to actually compile and run this on an IBM 7090 in emulation, even if stuff like that exists in Hercules.  No thank you, at least not yet.  I don't really like anchovies either.

    Yet here we can easily see that ELIZA had a mode where she could say "PLEASE INSTRUCT ME", or "CHANGE NOT RECOGNIZED".  So there WAS a learning mode!  Or at least there was one in the works!  Now the challenge begins to take on a different flavor: not just to implement LISP, MAD SLIP, Pascal, and C, and so on, in a suitable microcontroller environment, but to REALLY "hack it back", by perhaps getting "just enough Turing completeness" into the C/C++ preprocessor, or an EBNF lexer,...


  • Through the Maze, Down the Rabbit Hole, Into the Labyrinth

    glgorman • 07/18/2022 at 19:24

    It goes something like this.  

    PASCALCOMPILER::THREAD_ENTRY gets called.  That calls PASCALCOMPILER::COMPILER_MAIN, which leads to PASCALCOMPILER::BLOCK, which in turn brings us into PASCALCOMPILER::BODY, which gets us into BODYPART::MAIN via the "rabbit hole" of doing a reinterpret_cast on the compiler object - an object which was Frankensteined together via a custom allocator, rather than with a regular constructor, which shouldn't be possible if the inheritance is based on a set of virtual base classes, but it does seem to be working, for now.  Then we cross the Rubicon into the Labyrinth of mutually recursive calls: back into DECLARATIONPART::MAIN, then back up to PASCALCOMPILER::BLOCK (!), followed by a trip into PASCALCOMPILER::DECLARATIONS, which somehow finds DECLARATIONPART::MAIN again, where it finally falls to the center of the earth looking for a symbol that it doesn't find, via a SKIP loop - finally emitting ERROR 18, "Error in Declaration Part".

    O.K., so now we have proof of concept: Pascal-style mutually recursive nested functions in C/C++, even though "local procedures are illegal" in C.

    Time to say oh, la, la! and order a deluxe pizza?  Maybe.  Even though it is still quite a while before I will be generating a binary that will run on the Propeller, Arduino, or a TTL-NOR computer, for that matter.  Yet there is something interesting that comes to mind.  Most of the code conversion was done in a word processor, using find and replace, and then manually editing the prototypes, as previously discussed.  So I know there are plenty of other places where some block might not have gotten an extra left or right curly brace to properly match up and properly format some DO WITH this WHILE that mess, which might also have nested cases, etc.  So I simply added or deleted braces by eye, often without looking at the original source, just to get it to compile!  I mean, so what?  Just give it the brains of Abby Normal, right?  Maybe Eliza could do a better job.

    Maybe Eliza COULD do a better job!  Right now Eliza is pretty good at substitutions, but not so good at permutations and re-arrangements.  Clearly something a little more sophisticated than diff and patch is needed, and yet grep isn't quite the right answer either.  Obviously, one could grep out the FOR loops and the IF-THENs, etc., and then compare the original code with the translated code, but I don't think that grep knows how to do that.  That's more like the task of running two or more lexers side by side, each of which is somehow an "expert" at some part of the structure while ignoring constructs that it doesn't understand.

    Sort of like what GPT-3 tries to do - predict the next word in a sentence, based on an analysis of hundreds of gigabytes of text, whether standard English or code.  Then there is also GitHub Copilot - which I haven't tried yet - I don't know if it is even up and running, and in any case, can it use its AI to compare a mostly correct but slightly buggy Pascal source with a brand new but still seriously broken C version of the same program?  Probably not.  Still, a toy version of the side-by-side lexer idea is sketched below.
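    This is purely a hypothetical sketch - naive whitespace tokenizing stands in for a real lexer, and structure_counts is an invented name - but it shows the shape of the comparison: each "expert" counts only its own language's control keywords, and mismatched counts would flag a suspect translation.

    #include <iostream>
    #include <map>
    #include <sstream>
    #include <string>
    #include <vector>

    std::map<std::string,int> structure_counts (const std::string &source,
                                                const std::vector<std::string> &keywords)
    {
        std::map<std::string,int> counts;
        std::istringstream in(source);
        std::string token;
        while (in >> token)                     // crude stand-in for a real lexer
            for (const std::string &k : keywords)
                if (token==k)
                    counts[k]++;
        return counts;
    }

    int main()
    {
        auto pas = structure_counts("FOR I := 1 TO 10 DO BEGIN END",
                                    {"FOR","IF","WHILE","CASE"});
        auto cpp = structure_counts("for (i = 1; i <= 10; i++) { }",
                                    {"for","if","while","switch"});
        // equal counts suggest the loop survived translation intact
        std::cout << "Pascal FOR: " << pas["FOR"] << ", C for: " << cpp["for"] << "\n";
        return 0;
    }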

    Yet - I knew that there was a reason for creating THIS mess, when I did, back around mid-summer 1997.

    class text_object
    {
        CCriticalSection critical;
    
    public:
        bool m_bEnd;
        UINT    m_code_page;
        node_list<char*> m_nList;
        node<char*> *m_nPos;
        s_node<char*, language> m_sList;
    
    public:
        text_object ();
        text_object (char *m_pText);
        text_object (const text_object &copy);
        text_object (node_list<char *> *copyFrom);
        text_object (bTreeType<char*> **source);
        ~text_object ();
        // ... etc ...

    It is a text object class that encapsulates several types of data structures, and which can be constructed from simple pointers to char, or from linked lists of pointers to char of two different types, depending...


  • It's ALIVE!

    glgorman • 07/17/2022 at 20:50

    I've managed to get all of the functions from the compiler part, unit part, and declaration part to the point of compiling and linking, with a lot of work still remaining on the body part, which is another 2000 lines.  Those mainly need their WITH statements fixed up, along with figuring out exactly how I am going to deal with the fact that PASCAL allows nested procedures, but C/C++, of course, does not.  Do I pass a pointer to any variables that I need to share from the nesting function to the nested functions, or is there a more elegant way that will work out better in the end - like some sneaky way to use C++ nested classes (!), which might work out really nicely if I could figure out how to use "placement new" to construct an instance of a virtual derived class that somehow encapsulates an instance of an existing base object?  Now THAT would be nice.

    This method seems to work fairly straightforwardly.  Just derive DECLARATIONPART from PASCALCOMPILER and use a constructor that copies all of the member variables from the nesting class to the nested class - only about 2800 or so bytes, since a lot of the data lives in linked lists or tree structures, and we really only need to borrow a copy of the master pointers.

    DECLARATIONPART::DECLARATIONPART(COMPILERDATA *ptr)
    {
    	// Shallow-copy the entire shared compiler state into this object.
    	// Only the master pointers are duplicated; the lists and trees they
    	// point to remain shared with the nesting "parent" compiler.
    	size_t sz = sizeof(COMPILERDATA);
    	COMPILERDATA *ptr2;
    	ptr2 = (COMPILERDATA*)this;
    	memcpy(ptr2,ptr,sz);
    }

    Now for something REALLY weird!  Why not just construct an object and then call a member function on the resulting temporary - that is, without ever giving the object a name?  Apparently anonymous objects are allowed, but it would be nice if they looked a little prettier.

    if (!(options.INMODULE&&(SY==ENDSY)))
    {
    	CERROR(6);
    	SKIP(FSYS);
    	DECLARATIONPART(this).MAIN(FSYS);
    }

    Or maybe use placement new to construct the derived class on top of the existing object hierarchy, overwriting the stack in the process, while perhaps figuring out a way to make the stack "look" exactly like what it would look like if C/C++ allowed nested procedures.  There is also the "alloca" method of reserving space on the stack and then manipulating object trees over that - which brings me back to placement new.  Yet for now, the method of copying the whole base object seems to work well enough to get me into debugging the declaration part - and hopefully sooner rather than never, I will actually be generating some code files that can actually be run.
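    For reference, the placement-new trick boils down to a minimal, self-contained sketch like the following - with hypothetical stand-in types here, not the compiler's actual classes - showing how a "nested" object can be constructed over raw, stack-reserved storage:

    #include <cstdio>
    #include <new>

    struct Base { int shared_state = 42; };

    struct Nested : Base
    {
        void MAIN() { std::printf("state = %d\n", shared_state); }
    };

    int main()
    {
        // reserve raw, properly aligned storage on the stack
        alignas(Nested) unsigned char slot[sizeof(Nested)];
        Nested *p = new (slot) Nested;    // construct in place - no heap involved
        p->MAIN();
        p->~Nested();                     // placement new requires manual destruction
        return 0;
    }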

  • Project Status Update

    glgorman • 07/14/2022 at 12:21

    I have posted pdf versions of the project description, details, and project logs in the files section of this project, along with a pdf version of the original source code for the UCSD p-system compiler, for more pleasant reading.  So much more fun if you are viewing with a tablet.  Additional source files are available on GitHub, in standard form, and will be updated regularly as things continue to progress.

    I read that CP/M has now been officially liberated.  So now it might be worthwhile to consider adding CP/M compatibility, instead of the original UCSD p-system file system.  Lots to do.

    In the meantime: Enjoy!

    I think I will rewrite the tokenizing function INSYMBOL even further, so as to, hopefully, eliminate all of the case statements, which would otherwise just get replaced with switch statements in C/C++.  Doesn't this look much nicer?

    namespace pascal0
    {
    key_info operators[] = 
    {
    	key_info(":=",BECOMES,NOOP),
    	key_info("(*",COMMENTSY,NOOP),
    	key_info("{",COMMENTSY,NOOP),
    	key_info("*)",SEPARATSY,NOOP),
    	key_info("}",SEPARATSY,NOOP),
    	key_info("<>",RELOP,NEOP),
    	key_info(">=",RELOP,GEOP),
    	key_info("<=",RELOP,LEOP),
    	key_info("..",COLON,NOOP),
    	key_info(".",PERIOD,NOOP),
    	key_info(":",COLON,NOOP),
    	key_info(";",SEMICOLON,NOOP),
    	key_info("^",ARROW,NOOP),
    	key_info("[",LBRACK,NOOP),
    	key_info("]",RBRACK,NOOP),
    	key_info("(",LPARENT,NOOP),
    	key_info(")",RPARENT,NOOP),
    	key_info(",",COMMA,NOOP),
    	key_info("+",ADDOP,PLUS),
    	key_info("-",ADDOP,MINUS),
    	key_info("*",MULOP,MUL),
    	key_info("/",MULOP,RDIV),
    	key_info("=",RELOP,EQOP),
    	key_info(">",RELOP,GTOP),	
    	key_info("<",RELOP,LTOP),
    	key_info("\'",STRINGCONST,NOOP),
    	key_info("",OTHERSY,NOOP),
    };
    };
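    For illustration, the matching loop that replaces those case statements might look something like the following sketch - match_operator is a hypothetical name, <cstring> is assumed, and it relies on the table ordering above, where multi-character entries like ":=" are listed ahead of their one-character prefixes:

    key_info *match_operator (const char *src)
    {
    	key_info *k = pascal0::operators;
    	while (k->ID[0]!='\0')                      // stop at the "" OTHERSY sentinel
    	{
    		if (strncmp(src,k->ID,strlen(k->ID))==0)
    			return k;                           // first match wins, so order matters
    		k++;
    	}
    	return k;                                   // no match: fall back to OTHERSY
    }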

    And thus another long chain of case statements bites the dust. 

  • Eliza Learns Pascal?

    glgorman • 07/10/2022 at 08:25

    Well, sort of - this is going to be a LONG journey - but things are starting to move very quickly as of late.  Writing code is like that - months go by and NOTHING gets done - then in a couple of weekends I write a few thousand lines of code.  This should be fun after all.

    As if figuring out how to write a completely independent lexer that works as well as, or better than, the original wasn't enough work - there is also the notion of how to create ASTs (abstract syntax trees) that work not only with PASCAL and C/C++, but also with standard English grammar, which might contain dialog, or commands like "KILL ALL TROLLS!", or "Build me a time machine".  Oh, what fun.

    int PASCALSOURCE::SYMBOL_DUMP (LPVOID)
    {
        size_t i;
        CREATE_SYMLIST(NULL);
        size_t sz = m_symbols.size();
        for (i=0;i<sz;i++)
        {
            DEBUG_SY(m_symbols[i],FORSY,DOSY);
        }
        WRITELN(OUTPUT);
        WRITELN(OUTPUT,(int)sz," decoded");
        return 0;
    }

    Yet isn't it nice to contemplate being able to search a project for every FOR statement or every IF-THEN, or to make a list of all of the procedures in the source, so as to be better able to make sure the conversion is going correctly?  And why not search "The Adventures of Tom Sawyer" for every reference to whitewash preceded or followed by fence, or for paragraphs that contain the name Injun Joe, with cave or caves in the same, the preceding, or the following sentence, paragraph, or context?  Seems like a daunting task, but is it?  Maybe, or maybe not.

    So, let's throw another log on the fire, and do it not with string-manipulating functions like strcmp, strcpy, etc., but with abstract functions that can operate on, and transform, text objects - whether they are in the form of pure ASCII strings, or tables, or linked lists, or vectors, or connection maps that link tree structures in which the individual nodes of the subtrees point to linked lists or vectors of tokenized, and possibly compressed, input, which might in turn reference tables of dictionary pointers.

    Writing, or re-writing a compiler is quite a chore.  Having some interesting code analysis tools makes things a LOT more interesting.

     Now, back to killing trolls, and inventing time travel?

    Not quite yet.  Let's suppose that we are analyzing real DNA.  One way of doing THAT involves lab techniques like restriction enzymes, centrifuges, HPLC, CRISPR, DNA chip technology, etc.  All so that we can later look at a genome, among other things, and have some way of saying "find sequences that have CATTAGGTCTGA followed by ATCTACATCTAC", or something like that, with whatever else might be in the middle.  As if we had a partial analysis of some fragments of a real protein that we want to learn more about, and needed to find out where in some three billion base pairs it might be encoded - even if it is also in fragments, and subject to later post-translational editing.

    Something like this looks VERY doable.

    DEBUG_GENE ( genome, "CATTAGGTCTGA" , "ATCTACATCTAC" );

     Just in case that sort of thing might be useful to someone.
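    Under the hood, a first cut at a DEBUG_GENE-style search might look like the sketch below - debug_gene and the std::string genome are stand-ins here, not the project's actual types - reporting every place the left motif is eventually followed by the right one, with anything at all in between:

    #include <cstdio>
    #include <string>

    void debug_gene (const std::string &genome,
                     const std::string &left, const std::string &right)
    {
        for (size_t i = genome.find(left); i!=std::string::npos;
             i = genome.find(left,i+1))
        {
            size_t j = genome.find(right,i+left.size());
            if (j==std::string::npos)
                break;                   // no right motif remains downstream
            std::printf("match at %zu, gap of %zu bases\n", i, j-(i+left.size()));
        }
    }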

    Suffice it to mention, also, that if you have been programming long enough, then you know what it is like to sprinkle your code with thousands of TRACE statements, or to try to pipe debugging information to a logfile with fprintf statements, with all of the hassle that goes into creating the format strings, setting up and cleaning up buffers for all of that, and so on.  When PASCAL does it so nicely - like this --

    WRITE (OUTPUT,' ',SYMBOL_NAMES2[p.SY]);
    WRITE (OUTPUT,'(',p.VAL.IVAL,')');

    Letting us use the PASCAL-style WRITE and WRITELN functions, which are perfectly happy to accept strings, characters, integers,...
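    In modern C++, one way to sketch that kind of WRITE/WRITELN - just a guess at the general shape, not this project's actual implementation - is a pair of C++17 variadic templates over an output stream, so that WRITELN(std::cout, ' ', name, '(', val, ')') comes out looking much like the Pascal original:

    #include <iostream>
    #include <utility>

    template <typename... Args>
    void WRITE (std::ostream &out, Args&&... args)
    {
        (out << ... << args);       // fold expression: stream each argument in turn
    }

    template <typename... Args>
    void WRITELN (std::ostream &out, Args&&... args)
    {
        WRITE(out, std::forward<Args>(args)...);
        out << '\n';
    }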


  • The Road Much Less Travelled.

    glgorman • 07/08/2022 at 05:55

    I cooked up an Eliza-based Pascal source tokenizer and tried using it to see how good it is at doing some of the initial steps in converting the Pascal compiler to C++.  Although the initial results seem a bit cringe-worthy, they are not a complete disaster either.  So, I got really aggressive in creating a debugging environment for both the Eliza-based tokenizer and the original, and together these results are looking quite promising.  First, a glimpse of the Eliza-based method.

    void PASCALCOMPILER::SOURCE_DUMP ()
    {
        ELIZA eliza;
        text_object source;
        char *buff1, *buf2;
        int line = 0;
        if (SYSCOMM::m_source==NULL)
        {
            WRITELN(OUTPUT,"NULL source file");
            return;
        }
        else if ((*SYSCOMM::m_source).size()==0)
        {
            WRITELN(OUTPUT,"Empty source file");
            return;
        }
        else do
        {
            buff1 = (*SYSCOMM::m_source)[line];   // next raw source line
            source = buff1;
            eliza.process = source;
            eliza.pre_process (pascal2c);         // apply Pascal-to-C substitutions
            eliza.process >> buf2;                // read back the transformed text
            WRITE(OUTPUT,buf2);
            delete [] buf2;                       // assumes operator>> allocates with new[]
            line++;
        }
        while (buff1!=NULL);
    }

    The mostly complete source for this mess can of course be found in the GitHub repositories for this project and will be updated regularly.  Be very afraid.  Use at your own risk.  Guaranteed to contain LOTS of bugs.  On the other hand, creating a bunch of debugging code that inspects each symbol as it is parsed, and which selects for things like whatever is found starting at every occurrence of the keyword PROCEDURE and continuing until the first SEMICOLON encountered thereafter, yields a very promising result - which looks (in part) like this.

    12762: PROCEDURE
    12763:  "ASSIGN"
    12764: (
    12765:  "EXTPROC"
    12766: :
    12767:  "NONRESIDENT"
    12768: )
    12769: ;
    
    12859: PROCEDURE
    12860:  "GENJMP"
    12861: (
    12862:  "FOP"
    12863: :
    12864:  "OPRANGE"
    12865: ;
    
    13012: PROCEDURE
    13013:  "LOAD"
    13014: ;
    
    13017: PROCEDURE
    13018:  "GENFJP"
    13019: (
    13020:  "FLBP"
    13021: :
    13022:  "LBP"
    13023: )
    13024: ;
    
    13048: PROCEDURE
    13049:  "GENLABEL"
    13050: (
    13051: VAR
    13052:  "FLBP"
    13053: :
    13054:  "LBP"
    13055: )
    13056: ;
    
    13078: PROCEDURE
    13079:  "PUTLABEL"
    13080: (
    13081:  "FLBP"
    13082: :
    13083:  "LBP"
    13084: )
    13085: ;
    
    13175: PROCEDURE
    13176:  "LOAD"
    13177: ;
    
    13469: PROCEDURE
    13470:  "STORE"
    13471: (
    13472: VAR
    13473:  "FATTR"
    13474: :
    13475:  "ATTR"
    13476: )
    13477: ;
    
    13560: PROCEDURE
    13561:  "LOADADDRESS"
    13562: ;

    Now without taking another digression into a discussion of the meaning of the word SELECT, and what it might mean in the context of relational databases, it should be easy to see that if all we do is tokenize the input and then select sub-sections according to certain properties, this leads to something that looks like it might be handled quite easily by some kind of #define TYPEGLOB_REORDER (A, B, C, ...) macro.  Even if I am not proceeding at this point with trying to do a pure preprocessor macro-based language scheme.  Somewhere, over the rainbow, maybe someday?
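    For the record, the kind of "select" used to produce the listing above reduces to a simple scan over the token stream.  The token type and names below are simplified stand-ins for the compiler's actual symbol records:

    #include <string>
    #include <vector>

    struct token { std::string name; };    // simplified stand-in for a symbol record

    // collect every run from PROCEDURE up to the first SEMICOLON after it
    std::vector<std::vector<token> > select_procedures (const std::vector<token> &syms)
    {
        std::vector<std::vector<token> > found;
        for (size_t i = 0; i<syms.size(); i++)
        {
            if (syms[i].name!="PROCEDURE")
                continue;
            std::vector<token> run;
            while (i<syms.size() && syms[i].name!=";")
                run.push_back(syms[i++]);
            if (i<syms.size())
                run.push_back(syms[i]);    // keep the terminating semicolon
            found.push_back(run);
        }
        return found;
    }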

  • Eliza meets C, whether this is to be, or not to be - we shall see.

    glgorman • 07/06/2022 at 14:34

    Further integration of my 25-year-old C++ port of the '70s-vintage Eliza program into the Pascal compiler is moving along nicely.  The original code had an important step, referred to as conjugation, wherein words like myself and yourself, or you and I, would be swapped.  After realizing the similarities between this process and what goes on in the C/C++ preprocessor, I decided to rename the function pre_process, for obvious reasons - since that is one of the directions in which I want this project to be headed.  So even though I have no desire to learn APL, there is still some appeal to the notion that perhaps an even better ELIZA could be done in just one line of something that conveys the same concepts as APL, as if there is any concept to APL at all.

    void ELIZA::pre_process (const subst *defines)
    {
        int word;
        bool endofline = false;
        char *wordIn, *wordOut, *str;
        node<char*> *marker;
        process.rewind();
        while (endofline==false)
        {
            marker = process.m_nPos;           // remember where this word lives
            process.get (str);
            endofline = process.m_bEnd;
            for (word=0;;word++) {
                wordIn = (defines[word]).wordin;
                wordOut = (defines[word]).wordout;
                if (wordIn==NULL)
                    break;                     // end of the substitution table
                if (compare (wordIn,str)==0) {
                    marker->m_pData = wordOut; // swap the word in place
                    break; }
            }
        }
    }
    

    Thus, with further debugging, I can see how a function like this should most likely be moved into the FrameLisp::text_object class library, since in addition to being generally useful for other purposes, it also helps to eliminate as many references to objects of char* type from the main body of the program as possible, with an eye toward an eventual UNICODE version that can handle other languages, emojis, etc.  That should certainly be doable, but it could turn into a debugging nightmare if it becomes necessary to hunt down thousands of char and char* objects.  Thus, I have created my own node<char*>, node_list<char*>, and text_object classes using templates, for future extension and modification.

    Thus, even though ELIZA is kind of broken right now, and is being debugged, this pretty much embodies the simplicity of the algorithm:

    text_object ELIZA::response ()
    {
        int sentenceNum;
        text_object result, tail;
        char *str = NULL;
        node<char*> *keyword, *tail_word, *last_word;

        process = textIn;
        pre_process (conjugates);
        return process;   // early out while debugging; everything below is bypassed

        keyword = find_keyword ();
        sentenceNum = currentReply [key];
        currentReply [key]++;
        if (currentReply[key]>lastReply[key])
            currentReply[key] = firstReply [key];

        result = replies [sentenceNum];
        node<char*> *marker = process.m_nPos;
        if (keyword!=NULL)
            tail.m_nList.m_nBegin = marker;
        else
            tail = "?";

        tail_word = result.findPenultimate (str);
        result.get (str);
        result.peek (str);
        if (strcmp(str,"*")==0) {
            last_word = tail_word->m_pNext;
            delete last_word;
            tail_word->m_pNext = NULL;
            result.m_nList.m_nEnd = tail_word;
            result.append (tail);        // splice the user's words in for "*"
            result.append ("?");
        }
        result.m_nPos = result.begin();
        return result;
    }

    Yep, maybe the ELIZA algorithm, with the right text processing libraries just might only take about 40 lines or so of code, with no APL needed or desired.  Now testing just the...


  • Life on Square One - Part II

    glgorman • 07/04/2022 at 03:49

    In an earlier project, I was looking at how it might be possible to get the C/C++ preprocessor to chow down on Pascal programs - that is, if the preprocessor would allow us to do things like temporarily redefining the semicolon or equals symbols, and so on, with nested #ifdef's, #undef's, and the like.  Sort of like what follows - which doesn't actually work with all of the macros, but does work with some, so that you could at least partially convert a Pascal program to C/C++ by creating some kind of "pascal.h" file, adding #include "pascal.h" to your Pascal code, and then grabbing the preprocessor output, right?  Well, no - but almost, very very almost, like this:

    #define {            /*
    #define }            */
    #define PROCEDURE     void
    #define BEGIN        {
    #define END          }
    #define :=          ASSIGN_EQ
    #define =        COMPARE_EQ
    #define IF            if (
    #define ASSIGN_EQ     =
    #define COMPARE_EQ    ==
    #define THEN        )
    #define REPEAT        do {
    #define UNTIL        } UNTIL_CAPTURE
    #define UNTIL_CAPTURE    (!\
    #define ;            );\
    #undef            UNTIL_CAPTURE
    #define ;           );
    #define = [        = SET(\
    #define ]        )\
    #define )        \
    #undef    =        [\
    #undef    ]
    // so far so good ....
    #define WITH        ????????

    I mean, if someone else once figured out how to get the GNU C/C++ preprocessor to play TETRIS... then it should be possible to make it do whatever else we want, even if some authorities claim that, strictly speaking, the preprocessor isn't fully Turing complete in and of itself, but is actually only some kind of push-down automaton, because of issues like the 4096-byte limit on the length of string literals, and so on.  Yeah, right - I can live with that one, if what they are saying is, in effect, that it is probably as Turing complete as anyone might actually need.

    Still, this gives me an idea that seems worth pursuing: what does ELIZA have in common with the preprocessor, or a full-blown compiler for that matter?  Well, the Eliza code from the previous log entry used the following static string tables, arrays, or whatever you want to call them, based on a C++ port of an old C version, which was in turn converted from an example written in BASIC that most likely appeared in some computer magazine - most likely Creative Computing - back in the '70s.

    char *wordin[] =
    {
        "ARE", "WERE", "YOUR", "I'VE", "I'M", "ME",
        "AM", "WAS", "I", "MY", "YOU'VE", "YOU'RE", "YOU",NULL
    };
    
    char *wordout[] =
    {
        "AM", "WAS", "MY", "YOU'VE", "YOU'RE", "YOU",
        "ARE", "WERE", "YOU", "YOUR", "I'VE", "I'M", "ME",NULL
    };

    This could probably be fixed up a bit, to be more consistent with the methods that I am using in my port of the UCSD Pascal compiler to solve the problem of keyword and identifier recognition, as was also discussed earlier, and for which I had to write my own TREESEARCH and IDSEARCH functions.

    struct subst
    {
        char *wordin;
        char *wordout;
        subst();
        subst (char *str1, char *str2)
        {
            wordin = str1;
            wordout = str2;
        }
    };

    Which should allow us to do something like this - even if it is, as of right now, untested.

    subst conjugates [] = 
    {
        subst("ARE","AM"),
        subst("WERE","WAS"),
        subst("YOUR","MY"),
        subst("I'VE","YOU'VE"),
        subst("I'M","YOU'RE"),
        subst("ME","YOU"),
        subst("AM","ARE"),
        subst("WAS","WERE"),
        subst("I","YOU"),
        subst("MY","YOUR"),
        subst("YOU'VE","I'VE"),
        subst("YOU'RE","I'M"),
        subst("YOU","I"),
        subst(NULL,NULL),
    };

    So, I searched Google for Eliza source code, and among other things, I found that variations of Weizenbaum's original paper on the subject are now available, as well as things like GNU SLIP, which is a C++ implementation of the symmetric list processing language that the original Eliza was written in - since it seems that Eliza wasn't actually written in pure Lisp at all, contrary to popular belief!  Yet, documentation...


  • Oh Lazarus, where 'art thou?

    glgorman • 07/02/2022 at 11:14

    The Art Officials never commented on my last post, and it is going to be a while before I actually get any version of Pascal - whether it is Lazarus, some other version of FreePascal, or UCSD - up and running on the Parallax Propeller P2.  So I figure that this is as good a time as any for a quick conversation with Eliza.

    Now as it turns out, in an earlier project I discussed how I have been working on a library called Frame-Lisp, which is a frames-based library of Lisp-like functions that I would like to eventually get running as a back end for ports of ELIZA, SHRDLU, PARRY, MEGAHAL, and pretty much any compiler that I would like to create, invent, or simply port to other interesting and fun platforms, like the Propeller, or Arduino, or FPGA, or pure retro TTL-based systems.  Well, you get the idea.  Yet, well then - guess what?  It also turns out that I did ELIZA something like 25 years ago, and I recently managed to find the archive of that build and get it running again, sort of.  Which of course gives me an idea - since what the original Eliza lacked, like many attempts at creating chatbots, is some kind of internal object compiler that could in principle give a language like C/C++ some capacity for creating new object types at run time - which, according to some, is considered a form of reflection - and which is, of course, going to be necessary if we are going to try to simulate any kind of sentience.

    Getting back, therefore, to the idea that a compiler should be able to recompile itself: this is, I think, important.  There is also the idea that if the human genome actually contains only around 20,000 coding genes, of which only about 30% are directly involved in the major functions of the brain and how it is wired, then I am thinking that the complexity of a successful A.I. that is capable of actual learning might not be as great as others are making it out to be.  It is simply going to be a matter of building upon the concepts of how compilers work, on the one hand, while developing data-flow concepts based on the contemporary neural network approach on the other.

    Interestingly enough, this particular ELIZA needs only about 150 lines of code to implement, along with about 225 lines for the hard-coded script, i.e., the canned dialog and keywords.  That is in addition to the few thousand or so lines that are needed to run the back-end Lisp-like stuff.  So, is it possible that this is where others are failing - because they are failing to include essential concepts of compiler design in their approach to A.I.?

    Along another line of reasoning, I have never been a particular fan of Maslow's hierarchy of needs, which I won't get into quite yet, other than to say that I think Erikson's stages of conflict throughout life work out much better in the sense of how the critical-period notion affects psycho-social development.

    Even if Eliza doesn't actually learn, there is still some appeal to writing an AI that can re-compile itself.  Hidden Markov models do pretty well with learning, up to a point; and then there was M5 of course, in classic Star Trek, which Daystrom programmed with engrams, or so we were told, including the one "this unit must survive."

  • Art Official Intelligence

    glgorman • 06/27/2022 at 07:13

    And in other news, I am continuing to work on porting the UCSD Pascal compiler to C++, so that I will eventually be able to compile Pascal programs to native Propeller assembly, or else implement p-code for the Propeller, or Arduino, or both, or perhaps for a NOR computer.  Approximately 4000 of the nearly 6000 lines of the original Pascal code have been converted to C++, to the point that the result compiles and is for the most part operational, but in need of further debugging.  As discussed in a previous project, I had to implement some functions that are essential to the operation of the compiler, such as the undocumented and missing TREESEARCH function, as well as another function that is referred to as being "magic" but is also missing from the official distribution - the one referred to as IDSEARCH.  Likewise, I had to implement Pascal-style SETs, as well as some form of the WRITELN and WRITE functions, and so on - amounting to several thousand additional lines of code that will also need to be compiled into any eventual Arduino or Propeller runtime library.  Then let's not forget the p-machine itself, which I have started on, at least to the point of having some floating-point functionality for the Propeller or Arduino, or NOR machine, etc.

    Here we can see that the compiler, which is being converted to C++, is now - finally starting to be able to compile itself.  The procedure INSYMBOL is mostly correct and the compiler is getting far enough into the procedures COMPINIT and COMPILERMAIN so as to be able to perform the first stages of lexical analysis.  

    Now, as far as AI goes, where I think this is headed is that it will eventually prove useful to be able to express the complex types of grammar that might be associated with specialized command languages in some kind of representational form that works sort of like BNF, or JSON, but ideally is neither - and that is where the magic comes in.  For example, suppose we have this simple struct definition:

    struct key_info
    {
        ALPHA        ID;
        SYMBOL        SY;
        OPERATOR    OP;
        key_info() { };
        key_info(char *STR, SYMBOL _SY, OPERATOR _OP)
        {
            strcpy_s(ID,16,STR);
            SY = _SY;
            OP = _OP;
        }
    };

     Then we can try to define some of the grammar of Pascal like this:

    key_info key_map[] =
    {
        key_info("DO",DOSY,NOOP),
        key_info("WITH",WITHSY,NOOP),
        key_info("IN",SETSY,INOP),
        key_info("TO",TOSY,NOOP),
        key_info("GOTO",GOTOSY,NOOP),
        key_info("SET",SETSY,NOOP),
        key_info("DOWNTO",DOWNTOSY,NOOP),
        key_info("LABEL",LABELSY,NOOP),
        key_info("PACKED",PACKEDSY,NOOP),
        key_info("END",ENDSY,NOOP),
        key_info("CONST",CONSTSY,NOOP),
        key_info("ARRAY",ARRAYSY,NOOP),
        key_info("UNTIL",UNTILSY,NOOP),
        key_info("TYPE",TYPESY,NOOP),
        key_info("RECORD",RECORDSY,NOOP),
        key_info("OF",OFSY,NOOP),
        key_info("VAR",VARSY,NOOP),
        key_info("FILE",FILESY,NOOP),
        key_info("THEN",THENSY,NOOP),
        key_info("PROCEDURE",PROCSY,NOOP),
        key_info("USES",USESSY,NOOP),
        key_info("ELSE",ELSESY,NOOP),
        key_info("FUNCTION",FUNCSY,NOOP),
        key_info("UNIT",UNITSY,NOOP),
        key_info("BEGIN",BEGINSY,NOOP),
        key_info("PROGRAM",PROGSY,NOOP),
        key_info("INTERFACE",INTERSY,NOOP),
        key_info("IF",IFSY,NOOP),
        key_info("SEGMENT",SEPARATSY,NOOP),
        key_info("IMPLEMENTATION",IMPLESY,NOOP),
        key_info("CASE",CASESY,NOOP),
        key_info("FORWARD",FORWARDSY,NOOP),
        key_info("EXTERNAL",EXTERNLSY,NOOP),
        key_info("REPEAT",REPEATSY,NOOP),
        key_info("NOT",NOTSY,NOOP),
        key_info("OTHERWISE",OTHERSY,NOOP),
        key_info("WHILE",WHILESY,NOOP),
        key_info("AND",RELOP,ANDOP),
        key_info("DIV",MULOP,IDIV),
        key_info("MOD",MULOP,IMOD),
        key_info("FOR",FORSY,NOOP),
        key_info("OR",RELOP,OROP),
    };
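    Given a table like that, the keyword lookup itself reduces to a simple scan - a hypothetical sketch standing in for the real TREESEARCH/IDSEARCH, and assuming <cstring>:

    key_info *find_keyword (const char *id)
    {
        const size_t n = sizeof(key_map)/sizeof(key_map[0]);
        for (size_t i = 0; i<n; i++)
            if (strcmp(key_map[i].ID,id)==0)
                return &key_map[i];        // exact, case-sensitive match
        return NULL;                       // not a keyword: an ordinary identifier
    }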

    And just like that - who needs BNF, or regex, or JSON?  That is where this train is headed - hopefully!  The idea is, of course, to extend this concept so that the entire specification of any programming language (or...

