(The "black hole" in this log is also called "fixed" or "Sticky state", as in the FNV algorithm)
Remember the last log 11. Two hairy orbits : when not perturbed (in "generator" or "freewheel" mode), the checksum algorithm loops through the (roughly) 2³¹ states of one of the two orbits. This is already a huge progress compared to Fletcher and a "worst case" where the first accumulator is for example 32768 and the input data is a long string of 0s : the first accumulator will not change while the second one will simply toggle its MSB. Adler solves this with a modulo-prime trick that increases the worst case period but 65521 does not use the full 64Ki state space (what happens if I input a checksum value of 65522 ?).
When dealing with a string of 0s, the Fibonacci checksum has a period of 2³¹ but there are two rare exceptions: the states (0,0,0) and (0xFFFF,0xFFFF,1) are "black holes" where the algorithm remains stuck.
The good news is that these states don't occur naturally. This problem is also present with other checksums, in particular CRCs and derivatives, though CRC/LFSRs have only one such forbidden state, and similar measures can be used.
The state (0,0,0) can not be reached by itself : to reach X=0, the sum must wrap around, which would set the Carry bit, but here it is not set. And since Y=0, the previous Y must also wrap around and/or be 0... This situation is impossible, as there is no subtraction. As a corollary: 0+0=0 and the state can't change.
A similar type of proof shows that (0xFFFF,0xFFFF,1) can't be reached by an orbit. If we added 0xFFFF + 0xFFFF, the sum would be 0x1FFFE, which does not match the expected result. The only way to have 0xFFFF is when the carry is already set (which goes back to the impossible state)... In other terms: to reach 0x1FFFF while one operand is 0xFFFF (because it is then part of the new state), the other operand must be 0x10000 which is impossible, unles the carry is already set (because 0x10000=0xFFFF+C.
Anyway, the exhaustive mapping of the state space has verified these proofs. So we are safe when the algorithm is freewheeling.
But the checksum digests external, arbitrary data... So what are the chances and/or conditions to be swallowed by one of these two black holes (compared to other algorithms) ?
This algorithm has an entropy of 33 bits : there are 2 degenerate states, or 1 in 2³² states, which is the same as a CRC32. Let's look at the source code to see if it is possible to enter a degenerate state...
t = X + Y + C; Y = X ^ data; C = t >> 16; X = t & 0xFFFF;
- The only way to get X=0 is to wrap-around (because there is no subtraction, as you remember), which means that C is set. Thus, even if the input data is equal to the previous X (to nullify Y), the state (0,0,0) can not be reached by external data, the worst it can do is (0,0,1).
- How can we get to the state (0xFFFF,0xFFFF,1) ? In order to have X=0xFFFF, the addends MUST add up to 0xFFFF. This implies that the sum can not wrap around thus C is cleared. If the whole sum is 0x1FFFF, it means that the state is already in the degenerate state.
So here was the proof that no input data can "get this algo stuck in a black hole". No extra software or hardware is required to keep the system in working state (as long as it is correctly initialised, unlike a CRC). If a string of 0s is received, up to 4GiB+64KiB can be processed without wraparound.
This property is possible because the data combination is performed in parallel, not in series with the addition, and it preserves the range of the allowed states. If the combination took place directly after the addition, the result could be controlled (X could be forced to FFFF while C is set, for example) and the whole state would become totally vulnerable.
The funny thing is that this parallel organisation was initially chosen for performance and scheduling, not safety. It is rare enough when all these constraints align and give such an elegant result !
We can consider that Y could be set to any arbitrary value by external data. In this version, data are combined by XOR for convenience because
- this operation uses fewer gates
- it does not interfere with the single carry flag that old processors use.
OTOH, adding (or subtracting) could increase the avalanche effect.