I stayed up pretty late looking at this stuff - it just didn't make sense. Finally, after playing with the simulations, I found something. The complementary NPN/PNP pair can switch very fast if it is switched continuously - like with a 50% duty cycle pulse. This somehow keeps both transistors biased, or keeps them out of saturation, or something, that allows them to switch very quickly. After a period of time this "magic quality" wears off, and they switch slowly again. I don't have a full explanation yet, but I've seen it in simulation and on the hardware now. I started with the simple inverter circuit again: here simulated with 1 nF and 100 pF caps:
when you drive the 100 pF cap version with a 50% duty cycle square wave, you can get it to go 25 MHz (the limit of my current testing setup). With duty cycles other than 50%, the waveforms fall apart. Here are some scope shots for a 3 MHz input with various duty cycles. Yellow (lower trace) is the input to the inverter; cyan (upper trace) is the output.
This is not just a case of minimum pulse widths - for example, here's the same circuit driven at 10 MHz with 50% duty cycle. The high time and the low time are both shorter than the 30% and 80% duty cycles above, but there's no problem at all with the output:
The same thing happens with the 1 nF capacitor, just at different frequencies. This effect is also present in the ring oscillator, which naturally assumes around a 50% duty cycle, so it is able to oscillate very quickly.
Something is happening in the transistor pair that I don't understand. I don't think the effect is really due to the duty cycle - I suspect that a recent pulse makes the pair faster for a period of time. If another transition happens within this window, the pair will remain fast. Once the effect has worn off, the pair will be slow for the next transition. I can't fully explain what is happening inside the transistors yet, but I see the same effect in LTspice, where it's much easier to make measurements, so I have a shot at figuring it out.
The real bummer is that this invalidates all the previous speed testing. I haven't fully re-run the tests yet with isolated pulses instead of square waves, but a quick look at the scope puts the propagation delays in the 25-35ns range. It's not quite as bad as a CD4001, but it's no match for even the 7404.