
Autobogotiphobia

A project log for UBASIC And The Need For Speed

I basic, 'ubasic', they basic, we basic; wouldn't you like a speedy BASIC, too?

ziggurat29 06/20/2018 at 01:56

Jun-15 8:01 PM

When writing, I try to edit, but I know that I will fail and there will be silly typos.  I have gotten over this.  Less so with errors of fact.  And so it was when I looked at the data again and noticed that some things did not quite jibe.

For one thing, the time taken by strncmp() was shown as zero, even though it was called 125 million times.  Somehow I doubt that.  get_next_token()'s implementation is, for the most part, a loop over a list of keywords, calling strncmp() on each.  So it seemed that strncmp()'s execution time was somehow being absorbed into get_next_token()'s.

Another thing was that jump_linenum() was being called an awful lot.  Doing a quick cross-reference of the call sites for this function, I discovered that it is used not only for 'goto', but also for 'gosub', 'return', 'next', 'if', and 'then' -- pretty much anything that changes flow-of-control other than sequentially forward.  That makes some sense.  But then I noticed the implementation details, and things really made sense:

static void jump_linenum(int linenum)
	{
	tokenizer_init(program_ptr);
	while(tokenizer_num() != linenum)
		{
		do
			{
			do
				{
				tokenizer_next();
				}
			while(tokenizer_token() != TOKENIZER_CR &&
			        tokenizer_token() != TOKENIZER_ENDOFINPUT);
			if(tokenizer_token() == TOKENIZER_CR)
				{
				tokenizer_next();
				}
			if (tokenizer_token() == TOKENIZER_ENDOFINPUT)
				{
				sprintf(err_msg,"Unreachable line %d called from %d\n",linenum,last_linenum);
				stdio_write(err_msg);
				tokenizer_error_print();
				longjmp(jbuf,1);             // jumps back to where setjmp was called - making setjmp now return 1
				}
			}
		while(tokenizer_token() != TOKENIZER_NUMBER);
		DEBUG_PRINTF("jump_linenum: Found line %d\n", tokenizer_num());
		}
	}

So, the first thing that caught my eye is "tokenizer_init(program_ptr);".  That means any time we change flow-of-control, we re-initialize the parsing of the entire program from the beginning.  The init call itself is not expensive -- it just sets up some pointers -- but it makes clear that every transfer of control sets up a scan of the whole program to find the target line number.

The next thing that caught my eye was the loops within.  Essentially, they tokenize the program and throw away all the tokens except for the end-of-line.  This is used to sync the parser to the start of each line (the line number being the first parsed value) to test whether it is the target line number.  That's when I had a realization:  probably the reason so much time is spent in the tokenizer routine get_next_token() is that every transfer-of-control implies re-parsing the program text from the top until the target line is found.

In retrospect, maybe this should have been a bit obvious looking at the numbers from the profiler, the 17,186 jump_linenum calls being related to 5,563,383 get_next_token calls, but I didn't think much of it at the time.

So, I suddenly got a sinking feeling that I had given bad advice in recommending a focus on improving get_next_token().  That function could still benefit from improvement, but the real gains might come from simply calling it far less often.

The other thing that puzzled me was the apparently free calls to strncmp that were made 125 million times.  Now that would be some magic.  Having freshly studied jump_linenum(), I hit upon an intuition:  maybe information is being lost behind the 'static' declaration.  This warrants a little explaining...

'static' is an odd C keyword in that it is overloaded to mean either of:

- at file scope, internal linkage -- the function or variable is private to its translation unit
- inside a function, static storage duration -- a local variable that persists across calls

In this case it means 'private'.  Compilers are free to realize this capability however they like, and it is done differently in gcc and msc, but either way the result is to remove some symbolic information from the resulting object code.  Could the 'static' specifier be warping my profiler results?

Well, that was just a /*...*/ away, so I removed the 'static' specifier from jump_linenum() and re-profiled.  Now my data look like:

That is a little more sane!  Now I can see that nearly 100% of the time is spent in strncmp -- more or less as expected, since that is how this tokenizer is implemented:  a loop over a list of keywords, comparing against each.  Really, I should strip all the 'static' specifiers and re-run, but I think I've got enough of a view from just this one change.

Ugh, I hate it when I do that.  Aside from some embarrassment at being wrong, I was more concerned that maybe someone could be off on a wild goose chase based on faulty advice.  The likelihood of that was low, but I felt I needed to convey the updated findings.

I quickly crafted and sent an amended statement, this time suggesting that maybe focusing on improving jump_linenum() might be the big win, rather than squeezing blood from a turnip with get_next_token().

So my moral duty was done.  A little bit frustrated with myself for having not scrutinized the data more closely before formulating advice, I decided to actually try to implement the recommendations so I could measure the impact.
