A project log for Pisano-carry Checksum algorithm

Add X to Y and Y to X, says the song. And carry on.

Yann Guidon / YGDESYann Guidon / YGDES 04/17/2021 at 01:570 Comments

(Previously posted in the main detail page of the project)

When you need to reduce a sequence of data into a single number, there are roughly 3 families of algorithms with varying trade-offs. Their names are sometimes used regardless of their actual use but I try to associate them correctly here:

In this project, we deal with the first, "dumb" class of reduction algorithms because checksums ensure the integrity of many low-level protocols. Their speed, size and simplicity dictate the performance of the systems, where cost is also a big constraint. Another attribute is the potential for parallelism. Checksums usually don't bother locating or correcting the detected alteration(s), SEC/DED or Reed-Solomon codes must be implemented for "forward error correction".

The most employed checksums are, in increasing order of complexity and integrity (slightly borrowed from Wikipedia) :

It's always a compromise. Adler seems to be a good compromise of speed and integrity, as it is used in zlib while CRC32 is used for the ZIP format. CRC32 was used a lot but its 1KB lookup table wastes room and is a bottleneck in the most recent CPUs.

Multiplicative checksums are a bit too heavy, with only a marginal sensitivity to alterations. They might be suitable for hash tables because their "confusion & diffusion" is higher but multiplies are still not free, and computing one per input datum is quite extreme. Multiplies can be emulated by "shift and add" operations but a barrel shifter (even used for a fixed amount) is still not free in a processor (though it is a simple rewiring of the bits when implemented in hardware).

The domain of checksums seems to have been frozen during the last years and I hadn't seen an algorithm that is as simple as Fletcher while being almost as good as CRC32.

Then I listened to a Piano Magic song yet another time...