Performance tweak for ulong n_is_prime #2579
Merged
fredrik-johansson merged 5 commits into flintlib:main on Feb 15, 2026
Conversation
Collaborator
Looks good, thanks!
I've modified the semiprime check algorithm to run in approximately 0.75× the time of the current version. Unfortunately, the function accounts for only a small fraction of the total primality-testing time, so the observed speedup is only about 0.96× for n_is_prime_odd_no_trial, and it is completely lost in the variance of the trial division for n_is_prime.
This should close out #2504. I don't see any way to further reduce the hashtable size while staying under 1/5 the cost of a Fermat test. The semiprime check rapidly loses efficiency as k increases, because k-semiprimes become both rarer and have fewer liars. Further reducing the table to 49 KiB or 32 KiB would therefore require so many checks (50+) that BPSW would become more efficient.
The next step is to replace the trial division with prime-inverse multiplication, which I'm holding off on until I have evaluated the factorisation algorithms and the other places where we can use prime inverses mod 2^64, so that we can have a more "unified" codebase.