
Fusion

A project log for PEAC Pisano with End-Around Carry algorithm

Add X to Y and Y to X, says the song. And carry on.

Yann Guidon / YGDES, 01/02/2022 at 17:50

As the scanner becomes more and more efficient, more and more data are generated and... stay on the hard disks. But the scanner is only the first half of the system: all these semi-arcs must eventually (ASAP) be joined together to rebuild the complete orbits and tally their characteristics. This is the purpose of the "fusion" program described here. I have already discussed the algorithm in these logs:

90. Post-processing
91. Post-processing: the strategy is coming
92. Post-processing: the strategy looks good

but here is a quick reminder, since some details have been confirmed since those logs were written.

The chosen algorithm is not obvious, so let's first look at the flaws of the trivial version. Ideally, the fusion program must be scalable and perform significant work on every pass, or it will take ages to process over 100GB of data, even at 100MB/s.

Let's imagine a program that streams the log file and writes a compacted log file. This program "consumes" a source file, gets the first entries and stores them in memory, waiting for later entries that match the ends of the entries held in memory. This approach creates many problems.

So the program uses a different approach, with 2 streams of input data, made of the log sorted with different orders. This doubles the required storage but simplifies things as well (and some operations could be performed "in-place" by reusing the space of used data).

To maximise efficiency and reduce development time, we enlist the sort program to reorder the binary p7k log file. This way, the fusion program does not need to duplicate data in memory or search for random entries. Each pass should cut the number of entries in half, which satisfies the scalability requirement: the whole process, even though it might be quite heavy, works in O(log(n)) passes, so w32 will use at most 32 of them.
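Assuming each record holds (start, end, length) fields — the real p7k layout surely differs — one fusion pass over the two pre-sorted streams can be sketched as a linear merge-join, which is why sorted inputs remove the need for random searches:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical semi-arc record; the actual p7k format differs. */
typedef struct { uint32_t start, end; uint64_t len; } arc_t;

/* One fusion pass: 'by_end' is sorted by .end, 'by_start' by .start.
   Each record of by_end is joined with the record of by_start whose
   start equals its end, producing a fused arc of summed length.
   Because both inputs are sorted, this is a single linear scan:
   no in-memory duplication, no random lookups. This sketch ignores
   the bookkeeping that prevents re-fusing already-consumed entries. */
static size_t fuse_pass(const arc_t *by_end, const arc_t *by_start,
                        size_t n, arc_t *out)
{
    size_t i = 0, j = 0, k = 0;
    while (i < n && j < n) {
        if (by_end[i].end < by_start[j].start)
            i++;
        else if (by_end[i].end > by_start[j].start)
            j++;
        else {
            out[k].start = by_end[i].start;
            out[k].end   = by_start[j].end;
            out[k].len   = by_end[i].len + by_start[j].len;
            k++; i++; j++;
        }
    }
    return k;   /* number of fused arcs produced by this pass */
}
```

Each pass doubles the span covered by one record, hence the O(log(n)) bound on the number of passes.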

The only required RAM storage is a "scoreboard" that flags the entries being merged. After a bit of thinking, though, this might not even be necessary, because the information "this entry is already processed" could be found/deduced from the available stream if we bother looking at the right place (hint: one index must be < the current index).
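As a sketch, such a scoreboard only needs one bit per entry; assuming hypothetical names and a flat index space (not the actual fusion code), it could look like this:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scoreboard: one bit per log entry, packed in
   64-bit words. For w32 and 2^32 entries this would still be
   2^32 / 8 = 512 MiB, which is why avoiding it is attractive. */
typedef struct {
    uint64_t *bits;
    uint64_t  size;   /* number of entries covered */
} scoreboard_t;

static scoreboard_t *sb_create(uint64_t size) {
    scoreboard_t *sb = malloc(sizeof *sb);
    sb->size = size;
    sb->bits = calloc((size + 63) / 64, sizeof(uint64_t));
    return sb;
}

/* Flag entry i as "already merged". */
static void sb_set(scoreboard_t *sb, uint64_t i) {
    sb->bits[i >> 6] |= 1ULL << (i & 63);
}

/* Return 1 if entry i was already merged, 0 otherwise. */
static int sb_test(const scoreboard_t *sb, uint64_t i) {
    return (int)((sb->bits[i >> 6] >> (i & 63)) & 1);
}
```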

Anyway, remember one of the conditions for an orbit to be "maximal": the sum of the lengths of all the semi-arcs must equal a precise value. We can already compute this with an adaptation of a simple tool that is already written, dump7ck.c, though in practice its runtime is sub-optimal. So as a preliminary check, I can adapt dump7ck.c by 1) speeding it up and 2) accumulating all the semi-arc lengths. This will help discard widths whose orbits have no 0-crossing.
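The preliminary check amounts to a simple accumulation; in this sketch the expected total for a maximal orbit of width w is supplied by the caller (the exact formula is not reproduced here):

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the preliminary check: sum all semi-arc lengths
   gathered from the log and compare against the expected total
   for a maximal orbit. If the sums differ, the width can be
   discarded without running the full fusion. */
static int check_total(const uint64_t *lengths, size_t n,
                       uint64_t expected)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += lengths[i];
    return sum == expected;
}
```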

Then the program can be adapted to read 2 input streams for combination.

__________________________________________________________

Update: dump7ck.tgz is ready and stops hammering the kernel with small read requests.
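Avoiding tiny per-record read() calls usually means buffering; here is a minimal, hypothetical sketch (the record and buffer sizes are illustrative, not the actual dump7ck values):

```c
#include <stdint.h>
#include <stdio.h>

/* Sketch: read the log in large chunks instead of one record per
   syscall, so the kernel is invoked thousands of times less often.
   Sizes below are illustrative placeholders. */
#define REC_SIZE  10           /* hypothetical record size in bytes */
#define CHUNK     (1 << 20)    /* ~1 MiB of records per fread() */

static uint64_t count_records(FILE *f)
{
    static uint8_t buf[CHUNK];
    uint64_t total = 0;
    size_t n;
    /* fread() returns the number of complete records obtained */
    while ((n = fread(buf, REC_SIZE, CHUNK / REC_SIZE, f)) > 0)
        total += n;   /* process buf[0 .. n*REC_SIZE) here */
    return total;
}
```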
