I forget exactly when I imagined this protocol, but it was at least 10 years ago: Hackaday.io didn't exist and USENET was already a shadow of itself... Without a practical use case, I had no implementation, thus no feedback, no challenge or interaction about my "ideas". Today I can go further because I can exchange with more people! Paul's comments made me dig further and that's how I saw a flaw in my initial scheme.
The last logs said "don't ACK an ACK to prevent endless ACKing" because endless ACKing would overload the link and increase power draw. Well, that rule was not a good idea because it can block/stop the link, and I hadn't simulated this case in my head. So I did more head simulations and here are the results.
The first issue is that the link should minimise bit toggling (to save on power, EMI etc.) but the raw protocol relies on each peer replying as fast as possible to let the other send its own data or ACK. If one stops, the other can't talk either (until timeout). But if each ACK is answered with an ACK then the link is overwhelmed.
Let's imagine this case: Peer1 sends a frame, Peer2 has nothing to say and simply ACKs, so Peer1 receives only ACKs. When Peer1 is done with the frame and sends an ACK to close it, Peer2 will not answer that ACK with an ACK. This blocks Peer1 until Peer2 times out or has something to say, which is undesired, to say the least...
What if Peer1 had another frame to send just after? So let's modify the rule and count how many ACKs are answered with an ACK. Let's say "don't ACK more than 3 ACKs in a row" to unlock the situation, as it fits easily in a 2-bit saturating counter (2 DFF and a few gates). This solves this special case but not the whole problem.
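As a rough illustration, the "no more than 3 ACKs in a row" rule could be modelled in software as below. This is only a sketch: the names `AckLimiter` and `on_symbol`, and the choice to reset the counter whenever a data bit arrives, are assumptions made for the example and not part of the protocol as described so far.

```python
class AckLimiter:
    """Software model of a 2-bit saturating counter (assumed semantics)."""

    MAX_COUNT = 3  # largest value a 2-bit counter can hold

    def __init__(self):
        self.count = 0  # number of consecutive ACKs answered with an ACK

    def on_symbol(self, received_ack: bool) -> bool:
        """Return True if this peer should answer with an ACK."""
        if not received_ack:
            # Assumption: a data bit breaks the run of ACKs and is always acknowledged.
            self.count = 0
            return True
        if self.count < self.MAX_COUNT:
            self.count += 1  # one more ACK answered with an ACK
            return True
        return False  # saturated: stay silent and let the timeout take over


limiter = AckLimiter()
# Five ACKs in a row arrive: only the first three get an answer.
print([limiter.on_symbol(True) for _ in range(5)])
# prints [True, True, True, False, False]
```

Once the counter saturates the peer goes quiet, so the link still relies on the timeout to wake up again.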
The worst case will be a 1 Hz update, depending on which peer times out first and toggles an ACK to keep the link open/active. This is an unacceptable wait in practice but still works as a watchdog or failsafe "just in case".
The following discussion primarily applies to the above requirement to minimise both latency and useless ACKs, though it remains valid otherwise, so let's go and state the obvious facts:
There will always be one peer that initiates, or re-initiates, the protocol's ping-pong.
Because the ping-pong has to start in some way, the initial protocol must be violated. Some peer must take the initiative and this breaks the symmetry.
3) Race conditions
Since violations will necessarily occur (not necessarily often, but they must be addressed), race conditions will occur at the edges of the protocol.
From there we can unwind these assertions and see that if we can solve the problem of the race conditions, then it is safe to (re)initiate the protocol. Note that the race conditions cannot be avoided, only reduced at most, but if they can be detected and managed, then it's good. Now we have to identify the edges of the protocol.
So far a frame is defined by a run of data bits followed by an ACK trit. There can be any number of ACKs, no problem.
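To make this frame definition concrete, here is a minimal sketch that cuts a stream of trits from one peer into frames. The numeric encoding (0 and 1 for data bits, 2 for the ACK trit) and the function name `split_frames` are hypothetical, chosen only for the example.

```python
ACK = 2  # hypothetical encoding: 0 and 1 are data bits, 2 is the ACK trit


def split_frames(trits):
    """Split a trit stream into frames.

    A frame is a run of data bits closed by an ACK. Extra ACKs that
    follow carry no data and are simply skipped.
    Returns (complete_frames, pending_bits).
    """
    frames = []
    current = []
    for t in trits:
        if t == ACK:
            if current:  # this ACK closes the frame in progress
                frames.append(current)
                current = []
            # otherwise: a repeated ACK, nothing to close
        elif t in (0, 1):
            current.append(t)
        else:
            raise ValueError(f"not a trit: {t!r}")
    return frames, current  # current = bits of a frame not yet closed


frames, pending = split_frames([1, 0, 1, ACK, ACK, ACK, 0, 0, ACK, 1])
print(frames, pending)
# prints [[1, 0, 1], [0, 0]] [1]
```

The run of three ACKs in the middle yields no empty frames, which matches the "any number of ACKs" remark above.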
The initial "perfect ping-pong" assertion ensures that there is no violation or race condition as long as the traffic continues. But this traffic must stop when the "data to send" is exhausted. So it must be reinitiated before the timeout, which is by definition... asymmetric, because we can't know which peer will restart first. In the worst case, both can start at exactly the same time, and this risk is increased by the enlarged sampling window that compensates for variations in wire length/capacitance/propagation...
So a safe link establishment must be created, which takes both race conditions and sampling issues into account.