I've been making slow but steady progress for a little while, having implemented next1, next2 and next1-renter. I've even written pretty good unit tests for some of them. This has been a hard process: getting the tests right is harder than getting the code right!
Yesterday it occurred to me that I could string all the routines together, saving on the jumps. I tried it out, and while neat, it actually made the code slower. So then I decided to go the other way - inline everything. This makes the code faster in some cases, but massively changes the invariants of the routines, meaning the tests need to be rewritten. This whole business took most of my available time for a day.
So now I have a dilemma: carry on refactoring, for only a possible benefit (let's face it, there's no real-world benefit to any of this), or go back to my suboptimal code and push forward with that. Neither sounds good. I have also come up with a few more approaches to saving time in dispatch.
This is what I meant about shaving yaks!
Part of the problem is that I don't really have a good way of evaluating any solution right now.
I'm going to commit my rewritten code as a branch, and revert.
In the forum, Marcel posted a vCPU implementation of parts of a Forth kernel, and it's remarkably small compared to mine. I intend to post in the forum when my version reaches parity with his.