Line scrambling with PEAC

A project log for PEAC Pisano with End-Around Carry algorithm

Add X to Y and Y to X, says the song. And carry on.

Yann Guidon / YGDES, 04/24/2022 at 11:51

One significant application for binary PEAC is scrambling data over a serial link. LFSRs are the traditional candidates because, well, what else? But their shortcomings quickly become apparent.

State of the art with LFSR

I'll let you browse Wikipedia for the references and definitions but here are the major purposes of a line scrambler :

LFSR-based systems are used almost everywhere and are well characterised. Their ability to "crash", however, is not as well covered. Wikipedia gives an example where an old transmission protocol used a 7-bit LFSR that was trivial to attack: just transmit all possible sequences of the LFSR output, so that one (or more) packets end up all (or mostly) 0 (or 1) on the line. The receiver then can't recover the clock, which leads to a loss of synchronisation and a failure of the link.
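The attack can be sketched in a few lines. This is only an illustration under assumptions: the polynomial (x^7+x^6+1), the seed and the helper names are picked for the example and are not taken from the protocol mentioned above. The scrambler is additive (data XOR keystream), so a payload that equals the keystream cancels it out.

```python
def lfsr7_keystream(n, state=0x7F):
    """n output bits of a 7-bit Fibonacci LFSR (assumed taps: bits 6 and 5)."""
    out = []
    for _ in range(n):
        out.append((state >> 6) & 1)               # output the oldest bit
        feedback = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | feedback) & 0x7F   # shift in the new bit
    return out

def scramble(data, state=0x7F):
    """Additive scrambler: XOR the data with the LFSR keystream."""
    return [d ^ k for d, k in zip(data, lfsr7_keystream(len(data), state))]

def longest_run(bits):
    """Length of the longest run of identical bits."""
    best = run = 0
    prev = None
    for b in bits:
        run = run + 1 if b == prev else 1
        prev = b
        best = max(best, run)
    return best

# The attacker does not know the scrambler phase, so one payload is sent
# for each of the 127 non-zero states.
hostile_payloads = [lfsr7_keystream(64, s) for s in range(1, 128)]

secret_state = 0x2A   # hypothetical phase of the victim's scrambler
flat_lines = [p for p in hostile_payloads
              if longest_run(scramble(p, secret_state)) == 64]
print(len(flat_lines), "payload(s) leave the line without any transition")
```

With only 127 phases to cover, the attacker needs very little traffic to hit the right one, which is why longer registers were adopted later.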

Modern links use 64B66B (that is: a raw copy of the data plus a couple of signaling bits) followed by a 58-bit LFSR. The "clock crash" attack becomes practically impossible, but then you have to deal with the possibility of sequences of 57 identical bits. From what I have gathered, another smaller LFSR or encoder is used again to further ensure DC balance. At this point the designers bet on the statistical properties of the data and channel to "push" the probability of an "adverse condition" into the "extremely unlikely" domain, but it's still not impossible. Apparently, low encoding overhead matters more, in order to push the bit-per-symbol ratio.

Of course, LFSRs must run at the bit/line speed, so a 10Gbps link requires a 10GHz LFSR. The same goes if you want to reach 40 Gbps, and there the available technology requires a parallel LFSR implementation, meaning that the poly must be extremely sparse to allow operation at a low-enough ratio. With the IEEE proposal x^58+x^19+1, there are 19 layers of XOR to implement before the circuit becomes too complex, so this basically limits the slowdown to 1/19. For a 1GHz logic speed, the bit speed is at most about 19Gbps, and the scrambler size (and power draw) scales linearly with the slowdown.
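Here is a rough sketch of where the number 19 comes from, assuming a multiplicative (self-synchronous) scrambler whose output bit is s[n] = d[n] ^ s[n-19] ^ s[n-58], with the taps quoted above. The function names and the list-based history are made up for the illustration. Within a block of 19 bits, every output depends only on bits that are already in the history, so the whole block can be computed in one step and must match the bit-serial result.

```python
import random

TAP_NEAR, TAP_FAR = 19, 58   # taps as quoted in the text (x^58 + x^19 + 1)

def scramble_serial(bits, history):
    """One bit per step. history holds past scrambled bits, most recent last."""
    hist = list(history)
    out = []
    for d in bits:
        s = d ^ hist[-TAP_NEAR] ^ hist[-TAP_FAR]
        out.append(s)
        hist.append(s)
    return out

def scramble_blocks(bits, history):
    """Up to 19 bits per step: no bit of a block depends on another bit of it."""
    hist = list(history)
    out = []
    for start in range(0, len(bits), TAP_NEAR):
        block = bits[start:start + TAP_NEAR]
        base = len(hist)     # everything below this index is already known
        new = [d ^ hist[base + k - TAP_NEAR] ^ hist[base + k - TAP_FAR]
               for k, d in enumerate(block)]
        out.extend(new)
        hist.extend(new)
    return out

rng = random.Random(1)
seed_history = [rng.randint(0, 1) for _ in range(TAP_FAR)]
payload = [rng.randint(0, 1) for _ in range(200)]
print(scramble_serial(payload, seed_history) == scramble_blocks(payload, seed_history))
```

Going beyond 19 bits per step makes later bits of the block depend on earlier ones, and that is where the XOR layers start to pile up.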

One thing I was surprised not to find is a mechanism that detects when the LFSR gets stuck or "0-crashed", in order to restart it automatically. It would depend on whether the scrambler is additive or multiplicative but apparently, handwaving and reducing the risks down to statistics is the preferred engineering method.

Links:

https://en.wikipedia.org/wiki/Linear-feedback_shift_register#Scrambling

https://en.wikipedia.org/wiki/Scrambler#Additive_(synchronous)_scramblers

https://en.wikipedia.org/wiki/Pulse-amplitude_modulation

https://en.wikipedia.org/wiki/64b/66b_encoding

And now with PEAC

PEAC works with integer additions instead of simple XOR gates, but the complexity grows differently and more favourably. An LFSR computes one bit per cycle, so N layers of XORs must be implemented to compute N bits in parallel: the latency grows proportionally to N and the circuit size grows as N². With PEAC the latency grows as log(N) and the size as N×log(N). There is no "low density poly" to choose either. This favours wide systems, with 16 or 26 bits for the "slow"/parallel processing, and the serialiser that follows can be smaller and faster.
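The log(N) latency and N×log(N) size are what a parallel-prefix adder provides. The sketch below models one such structure (Kogge-Stone style) in software, purely as an illustration: this log does not say which adder architecture would be used, and prefix_add is a hypothetical name. Each pass of the loop stands for one logic level, and each level touches all N bit positions.

```python
def prefix_add(a, b, carry_in, width):
    """Add two width-bit integers with a Kogge-Stone style prefix network.

    Returns (sum, carry_out, levels), levels being the number of prefix levels.
    """
    mask = (1 << width) - 1
    p0 = (a ^ b) & mask          # per-bit propagate
    G = a & b & mask             # group generate, starts as per-bit generate
    P = p0                       # group propagate, starts as per-bit propagate
    dist, levels = 1, 0
    while dist < width:
        low = (1 << dist) - 1    # low bits whose prefix is already complete
        # Both right-hand sides use the values of the previous level.
        G, P = (G | (P & (G << dist))) & mask, P & ((P << dist) | low) & mask
        dist <<= 1
        levels += 1
    cin_all = mask if carry_in else 0
    # Bit i of carries is the carry entering bit i; bit `width` is the carry out.
    carries = ((G | (P & cin_all)) << 1) | (carry_in & 1)
    return (p0 ^ carries) & mask, (carries >> width) & 1, levels

for w in (16, 26):
    total, carry, levels = prefix_add((1 << w) - 1, 1, 0, w)
    print(w, "bits:", levels, "levels, sum", total, "carry", carry)
```

The carry_in argument is there because the carry of one addition is fed to the next one in PEAC. Doubling the width adds a single level, whereas a parallel LFSR needs one more XOR layer for each extra bit.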

But then, this all depends on the overhead and clock recovery characteristics. One feature in particular can be used to enhance clock recovery, DC balancing and error detection: the carry bit could be inserted in the bitstream to provide three features for the cost of one.

With proper initialisation, the PEAC checksum/scrambler is guaranteed to never reach the all-1 or all-0 state. Even if one or both registers are all-0, the carry must have a complementary value, by design, which frees the designer from estimating probabilities.
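This guarantee can be checked by brute force on a narrow toy version. Take this as a sketch only: the step function (t = X + Y + C, then Y takes the old X, X takes the low bits of t and C its overflow) is an assumed reading of "add X to Y and Y to X, and carry on", not a definition given in this log. The names peac_step and steps_into_stuck are illustrative too.

```python
def peac_step(x, y, c, width):
    """One assumed PEAC iteration on two width-bit registers plus a carry."""
    mask = (1 << width) - 1
    t = x + y + c
    return t & mask, x, t >> width      # new X, new Y, new carry

def stuck_states(width):
    """The two states that map to themselves: all-0 and all-1."""
    mask = (1 << width) - 1
    return {(0, 0, 0), (mask, mask, 1)}

def steps_into_stuck(width):
    """Exhaustively list every state whose successor is a stuck state."""
    stuck = stuck_states(width)
    size = 1 << width
    return {(x, y, c)
            for x in range(size) for y in range(size) for c in (0, 1)
            if peac_step(x, y, c, width) in stuck}

# If only the stuck states lead to the stuck states, then no other
# starting point can ever fall into them.
print(steps_into_stuck(4) == stuck_states(4))
```

Under this assumed recurrence, "proper initialisation" boils down to not starting from one of the two stuck states. Both registers at 0 with the carry set, for example, is a valid start.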

16 bits of data can be encoded with 17 bits, with the 17th being the carry, which acts like an enhanced parity bit. The scrambler is a direct derivation of the checksum circuit and preserves all its properties.

If the data symbols are encoded with PAM4, the extra carry bit could be encoded with PAM2: this encoding difference will be enough to resynchronise an incoming bit stream. A group of symbols that encodes 16 bits will have 8 PAM4 symbols followed (or preceded) by one PAM2 symbol, with its distinctive lack of 2 levels, as a marker. A frame misalignment will be detected like a transmission error, or vice versa.
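The framing idea can be sketched as follows. The mapping is an assumption made for the example: levels are plain integers 0 to 3, PAM4 symbols carry 2 bits each with the most significant pair first, and the PAM2 marker uses only the two outer levels. The function names are hypothetical.

```python
PAM2_LEVELS = (0, 3)   # assumption: the marker only uses the two outer levels

def encode_group(data16, carry):
    """16 data bits as 8 PAM4 symbols, followed by the carry as one PAM2 symbol."""
    symbols = [(data16 >> shift) & 3 for shift in range(14, -2, -2)]
    symbols.append(PAM2_LEVELS[carry & 1])
    return symbols

def decode_group(symbols):
    """Return (data16, carry), or None if the marker slot holds a middle level."""
    if len(symbols) != 9 or symbols[8] not in PAM2_LEVELS:
        return None
    data16 = 0
    for s in symbols[:8]:
        data16 = (data16 << 2) | s
    return data16, PAM2_LEVELS.index(symbols[8])

def find_alignment(stream, groups=16):
    """Offsets (0..8) where every 9th symbol looks like a PAM2 marker."""
    return [off for off in range(9)
            if all(stream[off + 8 + 9 * g] in PAM2_LEVELS for g in range(groups))]

# 0x6996 only produces middle levels, so the marker slot stands out here.
stream = []
for g in range(20):
    stream += encode_group(0x6996, g & 1)
print(find_alignment(stream))
```

With real scrambled data, a wrong offset looks like a valid marker about half of the time for each group, so the receiver has to watch several groups before it trusts an alignment.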

PEAC26x2 can also replace the 58-bit LFSR for coarse data scrambling: it easily works at the word level without optimisation, so its logic runs 26 times slower than the bit frequency. PEAC26x2 has 52 bits of state, plus the carry, but processes 26 bits at once (almost 2^5), which makes the bit period almost equivalent (52+5=57).