
Where Am I?

A project log for Vintage Z80 palmtop compy hackery (TI-86)

It even has a keyboard!

Eric Hertz 01/06/2022 at 04:55

So, in a recent past-log I finally explained what it is I plan to do with this calc-hackery...

And that had been kinda side-lined due to other/new issues with the very thing /this/ thing was intended to diagnose. heh.

Anyhow, Thank Goodness, it seems those have subsided, so I can return to this to help diagnose that thing's long-term ailments.

...

I'm interfacing with a 1-wire bidirectional UART... It works on a request/response basis. And, frankly, I don't really have a whole lot of /reliable/ information to go on about the structure of a request. So, until I get that right-enough, the responder might very well stay completely silent.

This, combined with the fact the z80 CPU is clocked by a somewhat variable RC-clock, makes for a bit of an issue when it comes to bitbanging a UART to communicate with it... If the responder sent out a message every so often, I could determine its bit-duration (with respect to the CPU frequency, NOT WRT seconds!) with some sort of autobaud system. But, since it only speaks when spoken to (correctly!), I can't autobaud from it.

Thankfully there are /two/ serial communication busses in this system. The other, which I will otherwise not interact with, has chitter-chatter going on quite regularly. So, I can autobaud off that!

Sorta. It communicates at a MUCH higher baudrate, so fast, in fact, that the calculator CPU can /barely/ read the port fast enough to catch every bit, and certainly too fast and too regularly for the calculator to actually process the data.

But, it /can/, just barely, catch every bit. And the packets are also rather long, which is helpful because then I can get a good sense of the CPU's clock speed with respect to that baudrate... 41.6K. It's also quite handy that the protocol is "PWM" rather than a typical UART, because that means Every Bit has both a high and a low, and thus an inherent bit-clock.

So, if I create the fastest sampling loop possible on the z80, I can sample just fast enough to see every high and low. Then I can divide the number of samples by the number of bits, and do a little dimensional-analysis math to determine the actual CPU speed... in actual Hertz.

From there, I think my references are pretty consistent that my UART needs to communicate at 10.4Kbps. So, now I can figure out how many CPU clock cycles should occur between each UART bit.
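
For concreteness, the dimensional-analysis boils down to something like this (the symbol names are mine, purely for illustration): if the sampling loop burns T_samp T-states per sample, and S samples span B bits of the 41.6K stream, then

\[ f_{CPU} \;=\; \frac{S}{B}\,\frac{\text{samples}}{\text{bit}} \times T_{samp}\,\frac{\text{T-states}}{\text{sample}} \times 41600\,\frac{\text{bits}}{\text{sec}} \;=\; \frac{S \cdot T_{samp} \cdot 41600}{B}\ \text{Hz} \]

\[ T_{bit} \;=\; \frac{f_{CPU}}{10400} \;=\; \frac{4\,S\,T_{samp}}{B} \]

(41600/10400 is exactly 4, handily, so the Hz figure itself cancels right out of T_bit... though it's still nice to have for display and sanity-checking.)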

YAY!

Since I'm going into this rather blindly, I've been developing quite a few tools, which are unnecessary in the final project, in order to test things as I go with an actual computer rather than with the most-likely silent end-goal device.

That means e.g. in order to develop the UART bitbanging code I needed to interface it with RS-232 at a normal baudrate ("10.4 Good Buddy!" just ain't normal). And detecting the CPU frequency, in that case, had to come from the computer, too, rather than my 41.6Kbps serial bus. So, that meant developing a real "autobaud" function which will never be used in the final system. heh!

Similarly, Bitbanging the receiver is very different than bitbanging the transmitter... So, those two systems are different "libraries" altogether... But they have to work together. Since RS-232 is /not/ one-wire, I started with the receiver and transmitter on separate pins on the calculator's link port. But, later they'll be on the same pin. Thus I continue to develop "iPort7," "iUAR," and "iUAT" such that switching pins is merely a matter of changing an ".equ".
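
Something like this, in spirit (names and bit numbers are illustrative, not the actual iPort7/iUAR/iUAT source):

UAT_PIN  .equ 0            ; transmit pin (red wire)
UAR_PIN  .equ 1            ; receive pin -- separate, for RS-232 testing
;UAR_PIN .equ 0            ; ...later: share the Tx pin for true 1-wire
UAT_MASK .equ %00000001    ; bit mask matching UAT_PIN
UAR_MASK .equ %00000010    ; bit mask matching UAR_PIN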

But, that's further complicated by the fact that Port7 is /not/ Read-Modify-Writeable! So, every time one pin on the port is written or reconfigured (input, output, pulled-up, driven high, pulled low, Hi-Z, ...) the /other/ pin must be accounted-for, too. iPort7 now makes that quite a bit easier, such that e.g. I can work on developing iUAT without knowing or caring what the other pin is used for. 
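
The gist of the trick is a shadow copy in RAM. A minimal sketch, using the masks from above (the real iPort7, and the port's actual bit layout, surely differ):

port7_shadow: .db 0        ; last value written to port 7

uatDriveHigh:              ; raise our pin, leave the other pin alone
    ld   a,(port7_shadow)  ; recall the state of BOTH pins
    or   UAT_MASK          ; flip only our own bit
    ld   (port7_shadow),a  ; remember it for next time
    out  (7),a             ; write the whole byte
    ret

Every write goes through the shadow, so iUAT never needs to know what iUAR (or anything else) last did to the other pin.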

Also making that easier is the fact that these processes pretty much /can't/ be multitasked. There's really not much CPU time left once you shift bits and add a variable time delay... So, kinda inherently one knows that the unused pin shouldn't change in configuration during a byte transfer.

This is a bit of a backwards step for me, I'm used to trying to multitask in such projects. A 20MHz AVR is literally on the order of 100 times faster than a 6MHz z80! That's something that's really taken quite a bit to wrap my head around... I really thought it'd be more on the order of 10x. So, e.g. new Arduino users might make an LED blink by toggling its pin then delaying for a second, then repeating. And as they progress they might do similar in blinking that LED much faster for, say, an IR remote control. Using microsecond delays. But, that AVR could /easily/ be doing many other things during those microsecond delays if it was coded differently. So I've long since developed a round-robin system where nearly all my libraries do a small task then allow another to do its small task. E.G. "if it's time, toggle the LED".

That system also relies heavily on a constantly-running microsecond-ish timer... which the z80 doesn't provide. But, even if it did, frankly, I don't think it would be fast enough to do any other, even /tiny/, tasks during a delay of ~0.05ms (for 9600bps) between bits (after calculations, shifts, port-writing, etc.). In fact, I wonder if it could even handle a few pushes/pops (nevermind calls/rets) in that time.

Hah! Seriously, this is a very strange concept to me... If you want to toggle a pin ten times a millisecond, then put in ten 0.1ms delays! I dunno, maybe I'm the only one who sees this as weirdly foreign.

Anyhow, so that means we have to do everything step-wise... 

First, sample the "white wire" as fast as you can, storing raw samples straight to RAM... that's 1024 BYTES for 1024 BITS of useful data, which might've been far less achievable in the original z80 era, but now we've got bank switching and more (but not /far/ more) RAM.
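
The tightest pure-grab loop I know of on a z80 is unrolled INI, so here's a sketch in that spirit (buffer names and numbers illustrative, not necessarily the project's actual loop):

; INI is one 16T instruction: (HL) <- port(C), then HL++, B--.
; Unrolled, that's 16 T-states per raw sample -- roughly 300K
; samples/sec at ~5MHz, a handful of samples per 41.6K bit.
    ld   c,7               ; the link port
    ld   hl,sample_buf     ; 1024-byte destination buffer
    ld   b,0               ; B wraps back to 0 every 256 INIs
    ini                    ; 16T \
    ini                    ; 16T  > ...repeated 256 times, unrolled
    ini                    ; 16T /  (macro, or copy-paste)
    ; (four such passes fill the full 1024-byte buffer)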

Second, walk through every sample, counting high-low transitions... oh, but first, look for the start of a serial frame, because the bus /does/ idle briefly sometimes, and that'd screw up timing measurements. Stop when you see the end of a frame. Keep track of how many samples were within that frame, and how many bits.
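
The edge-counting walk might look something like this (simplified: it assumes the frame boundaries were already found by that search, and everything's illustrative as ever). This pass isn't time-critical, so it can afford memory accesses:

WHITE_MASK  .equ %00000100 ; wherever the white wire reads back (illustrative)
frame_start: .dw 0         ; filled in by the frame-search pass
frame_len:   .dw 0         ; samples within the frame
prev:        .db 0

countEdges:
    ld   hl,(frame_start)  ; first sample inside the frame
    ld   bc,(frame_len)    ; how many samples the frame spans
    ld   de,0              ; DE = edge count
    ld   a,(hl)
    ld   (prev),a          ; prime "previous sample"
walk:
    ld   a,(prev)
    xor  (hl)              ; changed since last sample...
    and  WHITE_MASK        ; ...on the white wire?
    jr   z,noEdge
    inc  de                ; count the edge
noEdge:
    ld   a,(hl)
    ld   (prev),a          ; this sample becomes "previous"
    inc  hl
    dec  bc
    ld   a,b
    or   c
    jr   nz,walk
    ret                    ; bits = DE/2: one rising + one falling edge per PWM bit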

Third, calculate the CPU frequency. Then the number of T-States that should occur between each bit at 10400bps.

Oh, but now it gets really complicated, because we're ready to transmit a request over the "red wire" UART...

So, how many T-States (CPU clocks) does it take to set up a bit; shifting it into the Carry flag, then jumping as-necessary to write either a one or a zero to the red wire, but not affecting the configuration of the white wire? And how many nops must be placed, where, to make sure that both those jump branches take the same number of T-States? And, wait, nop is 4 T-States, but I need 5! (BTW, ret z is 5 T-States when Z is not true, so there's your very weird 5T-State "nop"). OK, we've used X T-States, but we need Y between bits... But Y is variable depending on the temperature, or battery level, etc.
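
To make that concrete, here's a sketch of one bit's worth of transmit, both branches padded to identical timing. Totally illustrative, not the actual iUAT: it leans on the shadow/mask sketches from earlier, and it assumes something (say, a stop-bit sentinel '1' still sitting in E) keeps E nonzero until the frame's done, so Z stays safely clear for that ret z trick:

sendBit:
    srl  e                 ;  8T: next data bit -> Carry (LSB-first)
    jr   c,bitHigh         ; 12T taken / 7T fall-through
bitLow:
    ret  z                 ;  5T: the weird 5T "nop"; 7+5 matches the taken jr's 12T
    ld   a,(port7_shadow)  ; 13T
    and  $FF-UAT_MASK      ;  7T: our pin low, white wire untouched
    jr   bitWrite          ; 12T
bitHigh:
    ld   a,(port7_shadow)  ; 13T
    or   UAT_MASK          ;  7T: our pin high, white wire untouched
    nop                    ;  4T \
    nop                    ;  4T  > 12T, matching bitLow's trailing jr
    nop                    ;  4T /
bitWrite:
    ld   (port7_shadow),a  ; 13T: keep the shadow current
    out  (7),a             ; 11T: the edge lands here
    ret

Either path burns 52 T-States from the srl to the out, so the edge lands at the same instant whether the bit's a one or a zero.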

So I can't just delay 1/10400th of a second, for MANY reasons: Setting up that pin toggling takes a surprisingly huge fraction of that... that must be subtracted. And, of course, if it's cold out 1/10400th of a second might be 400 CPU clocks, but 450 once the cabin's heated up.

So now we need a pretty sophisticated delay function: it needs to know how many T-States were used externally, how many (varying) T-States it /should/ delay, calculate how many are remaining, subtract the number of T-States /it/ used in those calculations, then, finally, delay.

By that time, really, there's less than 100 T-States to delay. And the delay loop, itself, takes around twenty. And, you may've noticed, 20 T-States of error in 400, ten times (for a serial frame with a start and stop bit), is 200 T-States of error (half a bit-duration!) by the end of the serial frame. That /might/ work with some UART receivers, but is far from reliable.

So, now, the delay function has to track its error and make up for the last error the next time(s) it's called. This is OKish, I think, because most UART receivers I've encountered sample each bit near the middle, and ±20/400ths is nowhere near the middle of a bit. So, if it goes over by 20 T-States in the first bit, it'll be 20 fewer in the next. (hmmm, is that bit then 40 short?). Of course, if it goes over by 5 each time, then it'll take 4 bits/delays to make up for it. Heh! But, I'm pretty certain that inter-bit "jitter" is acceptable... FAR better than 200T of additive error.
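
One way to do that bookkeeping: make the remainder itself the carried error. The loop below pays off HL in LOOP_T-sized chunks; whatever's left over (too small to loop on) gets stashed and folded into the next bit's delay. (Sketch only: the fixed call/setup overhead that also has to come out of HL is omitted for clarity, and the constants are illustrative.)

LOOP_T  .equ 23            ; T-states per trip around dLoop (11+12)
bitErr: .dw 0              ; leftover T-states owed from last bit

delayT:                    ; HL = T-states to burn for this bit
    ld   de,(bitErr)
    add  hl,de             ; owe last bit's leftover, too
    ld   de,-LOOP_T
dLoop:
    add  hl,de             ; 11T: pay off one loop's worth
    jr   c,dLoop           ; 12T: keep going while HL >= LOOP_T
    ld   de,LOOP_T
    add  hl,de             ; HL = what's still owed (0..LOOP_T-1)
    ld   (bitErr),hl
    ret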

The weird thing is how all these numbers seem to /just/ fit... After all those calculations, and the tightest delay loop I could come up with, it really only delays maybe four or five loops. Any fewer and I dunno if it'd be accurate-enough to work with both 10400 and 9600 (for testing) at various temps or battery charges.

Similarly /just/-fitting: the 41.6K signal, the fact it's PWM, and the fastest z80 sampling loop I could come up with. It's /just/ fast-enough to be able to pick up every high/low in each bit (1 or 0), with /just/ enough margin to work when the CPU is running slower than usual. Weird.

Anyhow, then, of course, after transmitting, it'll expect a response. So it switches the wire to an input, and reception is similar-ish.

But, as I was saying before, this whole thing is very much a singular state-machine, no multitasking. Sample the 41.6kbps signal, parse the samples, calculate CPU frequency, transmit a bit, delay, transmit another bit, repeat 10 times, repeat for N bytes, receive...
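
So the top level really is just a straight line (routine names illustrative, of course):

main:
    call sampleWhite       ; 1: raw samples of the 41.6K bus -> RAM
    call parseSamples      ; 2: find a frame, count edges and bits
    call calcBitTime       ; 3: CPU freq -> T-states per 10.4K bit
    call sendRequest       ; 4: bitbang the request out the red wire
    call recvResponse      ; 5: flip the pin to input, listen
    jp   main              ; ...and around again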

It takes a different mindset, to develop an entire system that only does one task at a time, and each to completion...

So far I've got all that going... I haven't yet tested the 41.6kbps sampling/parsing/calculating AND the UART at the same time, since now I need to connect both the computer /and/ the device... I guess that's next.

It's also entirely plausible I'll never get the requests right to get a response... But, if that's the case, most all this work /could/ be applicable elsewhere. E.G. my "blacklink" cable, connected via a USB-Serial dongle, tops out at "0.1K/s." Hah! I keep running into memory-fulls, which means something like 80KB of code to backup at 0.1KB(b?)/s is no fun... But, all this UART stuff could increase that speed dramatically... and, if I go that route with it, the autobaud code, which won't end up in /this/ project, would be quite useful.

Also, since I've made the UART pins configurable and overlappable, it shouldn't be too difficult to use the same UART functions on different wires at different times. So, e.g. the Calculator could support /two/ bidirectional 1-wire UARTs. And it's entirely plausible developing the request/response code in assembly will just be too durn exhausting for me, especially since responses may not come to bad requests.

Heh. I actually got overwhelmed by that prospect a while back and said "F it, I know enough about the PL-2303 to know it /can/ be configured for custom baudrates... Why don't I just hook up the computer directly, and scrap the calculator idea? It'd give me logging, too." Yikes, sad albeit brief moment. Turns out my PL-2303 driver /doesn't/ implement custom bauds... I tried /really/ hard. But I'm not in the position to recompile my kernel. So, back to Calcy, kinda to my relief that I hadn't done MONTHS of work just to scrap it. So, if I just can't take any more z80 assembly, or run out of RAM, or decide I need logging in CSV's, I could just use Calcy a bit like I kinda saw it /before/ I remembered why I'd been looking for a handheld terminal in the first place... as a microcontroller that just happens to have a terminal attached. 

If I'd've gone about this my usual way, I'd just grab an AVR and load up my UAR and UAT libraries, configure two of each on two pins, (Tx and Rx sharing pins? hmm, never did 1-wire with the AVRs... would that long-developed code support it?) one at 10.4, the other at 19.2, and with multitasking it'd just forward data from one to the other... Now's where the multitasking thing would be my default go-to... but, why?

The protocol is request/response, 1-wire is inherently half-duplex, this serial-forwarding /could/ happen at any rate... The computer sends the request at 9.6, the calculator repeats it at 10.4, grabs the response, retransmits it at 9.6. ... heck, even 1200bps would work (just to make my point that this is a singular state-machine, multitasking isn't necessary, here, even /with/ the intermediate data-forwarder).

So, an AVR would be easier (for me) and smaller and faster, and so-forth, but that system lacks something key to the end-goal... I can't always drive around with a full-on computer as its only interface. Calcy's a microcontroller with a terminal, project-box, batteries, etc., that I can velcro to the dashboard. And connecting to compy for more fancy features at times is definitely an option. And even using Calcy as nothing more than a baud-converter, at least while I try to figure out the request-protocol, isn't a bad use of Calcy; but when I get that protocol figured out, I can assemble it up and leave the Compy out of the equation. Yeah!

We'll see. Next baby-step is connecting a potentially 12V pwm signal and a 3.6V UART to the calculator's link port at the same time... and wires dangling at my feet. 

(I didn't have to worry about the PWM nor PL2303 voltage levels before, since I'm using the blacklink as a level-converter that also draws its supply voltage from its signals! How handy! Until you need two voltages.)

Hey, the engine may not be particularly happy, but it's keeping me warm, even through the recent snow! Which is more than I could say just a couple months ago. But, I kinda owe her a deeper diagnostic like this sooner rather'n later. Back to work, slacker!
