Even smarter C-ving, towards a reverse sieve

While the computers hum and replace my room heater to compute even more perfect and maximal orbits, I was pondering on the behaviour and failures of the sieve generator presented in 149. Tighter C-ving.

If you run the code, step through it with verbose=1. This will display the internal "scoreboard" that shadows the histogram. It's such a sight of indescribable beauty...

round 0: base=11  5  hits=3
***___
***_#_
***_##
{11, 3},

round 1: base=19  9  hits=5
****___*__
****_#_*__
****_##*__
****_##*#_
****_##*##
{19, 4},

round 2: base=29  14  hits=7
*****__*______*
*****_#*______*
*****_#*_#____*
*****_#*_##___*
*****_#*_##__#*
*****_#*###__#*
*****_#*###_##*
*****_#*######*
{29, 5},

round 3: base=31  15  hits=9
******_*______*_
******_*#_____*_
******_*#____#*_
******_*#__#_#*_
******_*##_#_#*_
******#*##_#_#*_
******#*####_#*_
******#*####_#*#
{31, 12},

round 4: base=41  20  hits=11
******_*____*_*___*__
******_*_#__*_*___*__
******_*_#__*_*#__*__
******_*_#__*_*#__*#_
******_*##__*_*#__*#_
******_*##_#*_*#__*#_
******_*##_#*_*#_#*#_
******_*####*_*#_#*#_
******_*####*_*###*#_
******_*####*#*###*#_
******_*####*#*###*##
{41, 6},

round 5: base=59  29  hits=13
********____*_*___*____*______
********____*_*__#*____*______
********____*_*__#*____*#_____
********____*_*__#*____*#__#__
********____*_*__#*____*#__##_
********____*_*__#*__#_*#__##_
********____*_*_##*__#_*#__##_
********#___*_*_##*__#_*#__##_
********#___*_*_##*_##_*#__##_
********#___*_*_##*_###*#__##_
********#___*_*_##*_###*#_###_
********##__*_*_##*_###*#_###_
********##_#*_*_##*_###*#_###_
********##_#*_*_##*_###*#_####
********####*_*_##*_###*#_####
********####*_*_##*####*#_####
********####*_*###*####*#_####
********####*#*###*####*#_####
{59, 25},

round 6: base=61  30  hits=15
********____*_*___*____*_*_____
********____*_*#__*____*_*_____
********____*_*#__*___#*_*_____
********____*_*#__*___#*_*__#__
********____*_*#__*_#_#*_*__#__
********#___*_*#__*_#_#*_*__#__
********#___*_*##_*_#_#*_*__#__
********#___*_*##_*_#_#*_*#_#__
********#___*_*##_*_#_#*#*#_#__
********#__#*_*##_*_#_#*#*#_#__
********#_##*_*##_*_#_#*#*#_#__
********#_##*#*##_*_#_#*#*#_#__
********#_##*#*##_*##_#*#*#_#__
********#_##*#*##_*####*#*#_#__
********####*#*##_*####*#*#_#__
********####*#*##_*####*#*#_#_#
********####*#*##_*####*#*###_#
********####*#*##_*####*#*#####
{61, 17},

You get the idea.

In the first stages, the "early exit" trick stops the loop over the scanned orbits, when the histogram has only one empty bit left. The program is written in C and the scans are already in the constant's memory area so it's not a matter of speed at this point, though it's great for comfort as it reduces the effect of O(n²) execution.

The "problem" is when there are 2 or more empty slots when the scanlist is exhausted. At first, there is one prime that does not "work" and it is skipped, because better leave it at that, rather than polluting the scoreboard with uncertain data, that will later create false misses. Oh and the format is clear : only one offset is allowed per prime base, not "n probable cases" (though I keep the idea for later).

And then this missing entry increases the number of empty bins: as the scan list becomes insufficient, the missing entries from previous bases increase the number of holes and more holes subsist. It creates a sort of avalanche.

I'll get the full scan of gPEAC soon up to mod=500000 (that's 1/4th of the effort to reach 1M) but even then, the avalanche problem will persist. The earlier and harder it is quenched, the more efficient it becomes.

I was thinking about using the results of the bases above the failing entry but we have already determined that even though they are shuffled, ALL the entries of the scorboard will eventually get checked, so there is no way to get a "probable" hint from higher entries. This is not how a sieve works.

Let's say you have only 2 (or 3 ? or 4? or n ?) entries left. The challenge is to find which one is the actual "good" one. There is no choice, we have to find a "proof", and it's clear : we have to run the computation. The result is necessarily: one of them is P/M or it is not. So we have to run the orbit() code on carefully selected numbers.

In a sequential context, we only have to check n-1 orbits, and there is a chance of 1/n to hit the longer P/M orbit first. So if there are 2 empty slots, the first candidate has a "coin toss" chance" of taking the user to a looooong ride, because at the current speeds, at mod=500000, the scan takes us to an orbit of >250 billion steps, which is fortunately cut short to 125B iterations because we now exit early. Let's say on step takes 4 clock cycles, a current CPU takes about 2 minutes to "prove" the given orbit.
In a parallel context, we know that the "bad orbits" don't last as much. So we can run n parallel scans, wait until n-1 processes have exited and the remaining operation can be safely stopped : it must be the desired operation.

Which opens a new battlefront in the search of validation of ultralarge moduli, if we take this process in reverse...

The key now is to wisely select the moduli to "prove", to increase efficiency. The higher the values, the longer it takes. So the proof must start just after the already scanned range (currently 131540 but it's increasing fast). Let's name this the limit L. From there we get a corrected offset, L % base_prime. This gives us an adjusted/aligned value A to add the candidates to: A = L + base_prime - (L % base_prime).

So we start orbit() on L+C1, L+C2, L=C3 in parallel. When there is only one still running, it gets cancelled and its modulus is added to the list of known orbits. The histogram is updated accordingly and the building of the sieve can resume. Hopefully these key moduli would help push the avalanche further a little bit. If this was practically scalable, it could be turned into a "reverse sieve" that borrows ideas from sieve wheels that I have already studied a decade ago.

-o-O-o-

Looking back at my previous assertions, I may have been a bit dismissive of my previous idea of back-checking the incomplete histograms. Indeed because of the unicity of the slots.

-o-O-o-

Apart from all that, I'm worried about the rather low selectivity of the sieve: at about mod=400K, it keeps about 24% of the numbers, from which about 35% are "hits". In the 390K-400K range there are only 852 P/M orbits but I still fail to see what causes so many false starts.

I know that the lower bound for the sieve is above 1/5, plus a little bit from the next bases but the impact of the next ones are vanishingly small. They can still help with humongous moduli like w32 and the current sieve still lets proven bad moduli pass. I think the sieve has still a LOT of progress ahead of it.

...

The new gPEAC scanner

Boosted C-ving

Discussions

Become a Hackaday.io Member