So lately I started tinkering with #YAMS and things started to get weird because the most important place where I would need a good sorting algo would be here, to manage the merging of the trajectories to rebuild the whole orbit(s). Previously I used the traditional sort utility which implies a specific type of encoding. See
94. New packing: p7k
95. New packing at work
But if I have my own sorting infrastructure, not only I can simplify/compact the records, but I could also exploit some parallelism here and there, eventually avoiding sort completely.
For now the goal is to reach w32 and this determines the format:
- uint32_t Xstart : range from 0 to 0xFFFFFFFF
- uint32_t Xend: range from 0 to 0xFFFFFFFF
- uint32_t Arcs: range from 1 to 0xFFFFFFFF
- uint64_t Len: range from 1 to 0xFFFFFFFFFFFFFFFD
- uint8_t Parity flag : 0 or 1
Total : 21 bytes per record, down from 26, so the final log for w32 would be about 90GB. The last byte has very little entropy but it is required to check that w26 is perfect, not just maximal.
Maybe the record could be reduced a bit by using variable-sized Arcs and Len fields, because they are pretty low at the start and increase in size with each pass of fusion. Xstart and Xend remain fixed (to ease sorting). I'm fine with variable-sized records because random access is not required here.
Arcs ranges from 1 to 4 bytes (2 bits), Len from 1 to 8 bytes (3 bits) so there would be 1 bit left in a byte to signal the parity flag. So the minimum size of a record would be 11 bytes. The 2 remaining bits could help reduce the size further: with 3 bits, one can encode either the number of bytes, or a direct value from 1 to 4.
The fusion program should be able to perform the sorts only, so partial logs can be preprocessed for fusion. The input processing would likely be decoupled from the output processing (usint pthread ?) to keep the whole program fast, despite all the I/O stalls. The sum of all the lengths should always be computed, to catch any flaw during processing.
For w32, the Len field should be capable to hold more than 64 bits... so 4 bits are required to encode more than 4 bytes, because the orbit could be perfect...