Generating Random Large Primes

March 31, 2017

Sometimes you need to have a large prime, for testing, cryptography, or some other purpose. I’m talking about primes of several hundred to a few thousand digits that are certified — proven to be prime — rather than probable primes according to a Miller-Rabin or other probabilistic test. Henry Pocklington’s Criterion, which dates to 1914, gives us a way to find such primes quickly. Paulo Ribenboim explains it thus:

Let p be an odd prime, and let k be a positive integer, such that p does not divide k and 1 < k < 2(p + 1). Let N = 2kp + 1. Then the following conditions are equivalent:

  1. N is a prime.
  2. There exists an integer a, 1 < a < N, such that a(N−1)/2 ≡ −1 (mod N) and gcd(ak + 1, N) = 1.

This gives us an algorithm for generating large certified primes. Choose p a certified prime. Choose 1 ≤ k < 2 × p at random; we ignore the last two possibilities for k, so we don’t have to worry about k being a multiple of p. Compute N. For each a ∈ {2, 3}, test the conditions for primality. If you don’t find a prime, go back and choose a different random k. Once you have a prime N, “ratchet up” and restart the process with the new certified prime N as the p of the next step. Continue until N is big enough.

Your task is to write a program that generates random large primes. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Advertisement

Pages: 1 2

18 Responses to “Generating Random Large Primes”

  1. Milbrae said

    Simplistic Python v3.x

    from __future__ import division
    from random import randrange
    
    def gcd(m, n):
        while n:
            r = m % n
            m = n
            n = r
        return m
    
    def simplePrimeTrial(n):
        if n < 5:
            return (n == 2) or (n == 3)
    
        if (n % 2 == 0) or (n % 3 == 0) or (n % 7 == 0):
            return False
    
        d = 7
        while d * d <= n:
            if n % d == 0:
                return False
            d += 2
    
        return True
    
    #
    if __name__ == "__main__":
    
        prime = False
    
        while not prime:
            # choose odd prime p
            p = randrange(10**6, 10**9) | 1
            while not simplePrimeTrial(p):
                p += 2
    
            # choose 1 < k < 2(p + 1)
            k = 0
            while (k % p == 0):
                k = randrange(2, 2 * (p + 1))
    
            # get N
            N = 2 * k * p + 1
    
            # conditions
            for a in [2, 3]:
                if pow(a, (N - 1) // 2, N) == (N - 1):
                    if gcd(a**k, N) == 1:
                        prime = True
                        break
    
            if prime:
                break
    
        print (p, k, N, prime)
    
  2. Milbrae said

    Forgot to mention a possible result:

    805103465 393971070 634374947133515101 True
    
  3. Milbrae said

    The trial division routine is completely out of order. Sorry! Here’s the corrected procedure:

    def simplePrimeTrial(n):
        if n < 6:
            return (n == 2) or (n == 3) or (n == 5)
    
        if (n % 2 == 0) or (n % 3 == 0) or (n % 5 == 0):
            return False
    
        d = 7
        while d * d <= n:
            if n % d == 0:
                return False
            d += 2
    
        return True
    
  4. programmingpraxis said

    Here is my favorite simple primality checker:

    def isPrime(n):
        wheel = [1,2,2,4,2,4,2,4,6,2,6]
        w, f, fs = 0, 2, []
        while f*f <= n:
            if n % f == 0:
                return False
            w, f = w + 1, f + wheel[w]
            if w == 11: w = 3
        return True

    It uses a prime wheel, so it should be about twice as fast as yours.

  5. Milbrae said

    @programmingpraxis: Awesome! Thanks you!

    Also, I’vve found another bug in my source. The conditions check should read like this:

    # conditions
            for a in [2, 3]:
                if pow(a, (N - 1) // 2, N) == (N - 1):
                    if gcd(a**k + 1, N) == 1:
                        prime = True
                        break
    

    The GCD bit still bothers me. If k and N are large and of different size, this will take forever and possibly raise a MemoryError.

  6. programmingpraxis said

    Your gcd is fine, and will execute quickly; Knuth guarantees it. Most Python programmers would write it like this:

    def gcd(m, n):
        while n <> 0:
            m, n = n, m % n
        return m
  7. programmingpraxis said

    @Millbrae: By the way, I blindly copied the prime checker from my wheel factorization program, and it does too much work. Here’s a corrected version:

    def isPrime(n):
        wheel = [1,2,2,4,2,4,2,4,6,2,6]
        w, f = 0, 2
        while f * f <= n:
            if n % f == 0:
                return False
            w, f = w+1, f + wheel[w]
            if w == 11: w = 3
        return True

    And here is the corresponding function that factors integers, which is what I blindly copied from:

    def factors(n):
        wheel = [1,2,2,4,2,4,2,4,6,2,6]
        w, f, fs = 0, 2, []
        while f * f <= n:
            while n % f == 0:
                fs.append(f)
                n /= f
            w, f = w + 1, f + wheel[w]
            if w == 11: w = 3
        fs.append(n)
        return fs

    All these functions have appeared on Programming Praxis previously.

  8. Milbrae said

    Thanks again, PP!

    Yet there has to be some faster way to get GCD(a**k + 1, N) checked – if k and N are large. Python and PyPy take forever to finish for number 10**12 < n < 10**18.
    [Quote from the description]Henry Pocklington’s Criterion, which dates to 1914, gives us a way to find such primes quickly.[/Quote]
    … quickly … Quickly … QUICKLY

  9. programmingpraxis said

    @Millbrae: If that gcd isn’t fast enough, you might want to look at this.

  10. Paul said

    @Millbrae: If you do gcd(pow(a, k+1, N), N) there are no memory problems and it runs a lot faster.

  11. Milbrae said

    Thanks a lot, Paul! I’ve been thinking abuot something like this:

    def euclid_gcd_a_power_k_plus_1_n(a, k, n):
        op = (pow(a, k, n) + 1) % n
        while op:
            r = n % op
            n = op
            op = r
        return n
    

    Haven’t tested it though…

  12. Milbrae said

    Latest code in Python:

    from __future__ import division
    from random import randrange
    from math import sqrt
    from time import clock
    
    def gcd(a, b):
        while b:
            r = a % b
            a = b
            b = r
        return a
    
    def binary_gcd(a, b):
        shift = 0
    
        while a and b:
            if (a & 1):
                if (b & 1):
                    a, b = (a - b, b) if a > b else (a, b - a)
                else:
                    b >>= 1
            elif (b & 1):
                a >>= 1
            else:
                shift += 1
                a >>= 1
                b >>= 1
    
        return (a or b) << shift
    
    def euclid_gcd_a_power_k_plus_1_n(a, k, n):
        op = (pow(a, k, n) + 1) % n
        while op:
            r = n % op
            n = op
            op = r
        return n
    
    def isPrime1(n):
        wheel = [1, 2, 2, 4, 2, 4, 2, 4, 6, 2, 6]
        w, f = 0, 2
        while f * f <= n:
            if n % f == 0:
                return False
            w, f = w + 1, f + wheel[w]
            if w == 11: w = 3
        return True
    
    def isPrime2(n):
        if n < 5:
            return (n == 2) or (n == 3)
    
        if (not (n & 1)) or (n % 3 == 0):
            return False
    
        limit = int(sqrt(n))
        d = 5
        i = 2
        while d <= limit:
            if n % d == 0: return False
            d += i
            i = 6 - i
    
        return True
    
    #
    if __name__ == "__main__":
    
        prime = False
    
        t = clock()
    
        while not prime:
            # choose odd prime p
            p = randrange(10**15, 10**18) | 1
            while not isPrime1(p):
                p += 2
    
            # choose 1 < k < 2(p + 1)
            k = 0
            while (k % p == 0):
                k = randrange(2, 2 * (p + 1))
    
            # get N
            N = 2 * k * p + 1
    
            # intermediate display
            print ("p = %d / k = %d" % (p, k))
    
            # conditions
            for a in [2, 3]:
                if pow(a, (N - 1) // 2, N) == (N - 1):
                    #if euclid_gcd_a_power_k_plus_1_n(a, k, N) == 1:
                    if gcd(pow(a, k + 1, N), N) == 1:
                        prime = True
                        break
    
            if prime:
                break
    
        print ()
        print ("--> p = %d, k = %d, N = %d, prime = %s" % (p, k, N, prime))
        print ("%f seconds" % (clock() - t))
    

    Sample output:

    [...]
    --> p = 119053509950371751, k = 166652094649708243, N = 39681033617258670310871283078086987, prime = True
    362.305565 seconds
    

    I’ve tested N at https://www.alpertron.com.ar/ECM.HTM. Everything’s turned out fine.

  13. Milbrae said

    Additionally, some numbers. If you manually set p = 4089548364052918515677660474680192167456016670586738083090737083455827, the above code will output…

    --> p = 4089548364052918515677660474680192167456016670586738083090737083455827, k = 3954952269115446666063958491090918288547649049955194620953816304862234, N = 32347937164136905687977400799521711490238478133100320456741932622470671464996525396091254589705022471721754863572714447927950109639719075037, prime = True
    

    N is then a 140-digit prime ;)

  14. programmingpraxis said

    @Milbrae: It should not take six minutes to generate a random 35-digit prime. You seem to misunderstand the algorithm. Start with a small prime, and use it to generate a larger prime. Then recursively use the larger prime that you just generated to generate an even larger prime. And so on. Continue the “ratcheting” process until the end result is the desired size. My Scheme implementation generates a random 35-digit prime in 15ms, which is four orders of magnitude faster, and generates a 140-digit prime in a little bit less than a second:

    > (time (rand-prime #e1e34 #e1e35 #t))
    (607 1022 2)
    (1240709 147645 2)
    (366368960611 227126727978 2)
    (166424366512554387349117 133785462153 2)
    (time (rand-prime 10000000000000000000000000000000000 ...))
        no collections
        15 ms elapsed cpu time
        21 ms elapsed real time
        2632920 bytes allocated
    44530321574804691176098788802937803
    > (time (rand-prime #e1e139 #e1e140 #t))
    (4957 2898 2)
    (28730773 19790703 3)
    (1137204390806839 1366720362160076 3)
    (3108480793707103200437027119529 768273308507447641982667587584 3)
    (4776325647626426015624444714829864966096790270290693688655873 16807863103533598
    20583032452649851040365352536001193141338112 3)
    (1605596552464028483036164591830457063463238223076044024630019667995991481919686
    4066663901322728680984541663162851815063553 2679361547405450752 3)
    (time (rand-prime 10000000000000000000000000000000000000000000000000000000000000
    000000000000000000000000000000000000000000000000000000000000000000000000000000 .
    ..))
        20 collections
        968 ms elapsed cpu time, including 0 ms collecting
        981 ms elapsed real time, including 1 ms collecting
        83927816 bytes allocated, including 84370688 bytes reclaimed
    86039473266377526955299780382198183524627420317908841567482428065070282459178371
    593009228697984643733130344190244139690296472557146983283713
  15. Milbrae said

    Well, then maybe this will suffice…

    from __future__ import division
    from random import randrange
    from time import clock
    
    def gcd(a, b):
        while b:
            r = a % b
            a = b
            b = r
        return a
    
    def isPrime(n):
        wheel = [1, 2, 2, 4, 2, 4, 2, 4, 6, 2, 6]
        w, f = 0, 2
        while f * f <= n:
            if n % f == 0:
                return False
            w, f = w + 1, f + wheel[w]
            if w == 11: w = 3
        return True
    
    def generate_large_prime(seed = 127, length = 30):
        p = seed | 1
        while not isPrime(p):
            p += 2
    
        cnt = 1
        while len(str(p)) < length:
            k = 0
            while k % p == 0:
                k = randrange(2, 2 * (p + 1))
    
            N = 2 * k * p + 1
    
            for a in [2, 3]:
                if pow(a, (N - 1) // 2, N) == (N - 1):
                    if gcd(pow(a, k + 1, N), N) == 1:
                        print ("%3d. %d" % (cnt, N))
                        cnt += 1
                        p = N
                        break
    
        return None
    
    #
    if __name__ == "__main__":
    
        t = clock()
        generate_large_prime(1000, 200)
        print ("\n%f seconds" % (clock() - t))
    

    Sample output:

      1. 2149171
      2. 9834713954551
      3. 104610237726493021241396621
      4. 34200932046364083964639294647733262301861498990032657
      5. 396323208934679755427233521579881191053730646550572161988585492317517817803
    0298810631751649309406392290359
      6. 28909194903683721538230094312229715957688421171095500854838580090229909584920332496482513814271432996604585263726247834081029062687390514758431875294893376647359646225385355346599037139194196446020843723285340689
    
    0.393686 seconds
    
  16. programmingpraxis said

    @Milbrae: Much faster. But the result is 212 digits, not the requested 200. Look at my computation of k-lo and k-hi.

  17. Milbrae said

    Or slightly changed, showing only the generated prime (and its length)

    [...]
    def generate_large_prime(seed = 127, length = 30):
        p = seed | 1
        while not isPrime(p):
            p += 2
    
        while len(str(p)) < length:
            k = 0
            while k % p == 0:
                k = randrange(2, 2 * (p + 1))
    
            N = 2 * k * p + 1
    
            for a in [2, 3]:
                if pow(a, (N - 1) // 2, N) == (N - 1):
                    if gcd(pow(a, k + 1, N), N) == 1:
                        p = N
                        break
    
        return N
    
    #
    if __name__ == "__main__":
    
        t = clock()
        lp = generate_large_prime(100, 150)
        print ("%d (%d digits)" % (lp, len(str(lp))))
        print ("\n%f seconds" % (clock() - t))
    

    Sample output:

    49147198556389931870735198516291966016405112299420103881855365916653844771408104584372776813914932321850903847461556401822236286680702008668299053716265038489973603779363979922476355643397827803871258431258822366365509 (218 digits)
    
    0.206664 seconds
    
  18. […] studied Pocklington’s Criterion, which lets us quickly find large random primes, in a previous exercise. That algorithm generates a certified prime — a number that is proven to be prime — rather than […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: