Counting Primes Using Legendre’s Formula

July 22, 2011

We will need functions to compute the nth prime, for the calculation of φ, and to compute π(x) for small values of x, to terminate the recursion in the formula for π. We do both by using the sieve of Eratosthenes to create a vector of the primes up to some convenient limit (the vector has an additional nonsense value in the 0th position so we don’t have to keep subtracting 1 whenever we index the vector), then just index the vector to compute the nth prime and use binary search to compute π:

(define ps (list->vector (cons -1 (primes #e1e5))))

(define (p n) (vector-ref ps n))

(define (pi n)
  (let loop ((lo 1) (hi (- (vector-length ps) 1)))
    (let ((mid (quotient (+ lo hi) 2)))
      (cond ((<= hi lo) mid)
            ((< n (p mid)) (loop lo (- mid 1)))
            ((< (p mid) n) (loop (+ mid 1) hi))
            (else mid)))))

The φ function is computed according to the recurrence equation. We memoize the function because small results appear again and again in the recursive calculations:

(define phi
  (let ((memo (make-hash (lambda (x) (+ (* 1000 (car x)) (cadr x))) equal? #f 9973)))
    (lambda args
      (let ((x (car args)) (a (cadr args)) (t (memo 'lookup args)))
        (cond (t t) ; return memoized value
              ((= a 1) (let ((t (quotient (+ x 1) 2))) (memo 'insert args t) t))
              (else (let ((t (- (phi x (- a 1)) (phi (quotient x (p a)) (- a 1)))))
                      (memo 'insert args t) t)))))))

Now the calculation of π is simple:

(define (prime-pi n)
  (let ((a (pi (isqrt n))))
    (+ (phi n a) a -1)))

Here is Legendre’s calculation:

> (prime-pi 1000000)
78498

We used hash tables and the isqrt function from the Standard Prelude, and primes from a previous exercise. You can run the program at http://programmingpraxis.codepad.org/sNCV5Y0d.

Pages: 1 2

9 Responses to “Counting Primes Using Legendre’s Formula”

  1. Graham said

    Even with memoization, my Python solution didn’t finish quickly enough for me. Interestingly, (or not, as the case may be)
    the following naive Haskell implementation finishes much more quickly (especially when compiled via GHC).

    import Data.List (foldl')
    import Data.Numbers.Primes (primes)
    
    sieve :: Int -> [Int] -> [Int]
    sieve p = filter (\x -> x `mod` p /= 0)
    
    phi :: Int -> Int -> Int
    phi x a = length $ foldl' (flip sieve) [1..x] (take a primes)
    
    isqrt :: Int -> Int
    isqrt = floor . sqrt . fromIntegral
    
    primePi :: Int -> Int
    primePi 1 = 0
    primePi x = phi x a + a - 1 where a = primePi $ isqrt x
    
    main :: IO ()
    main = print $ primePi 1000000
    

    I guess the next step is to get good enough at Haskell to figure out a way to cut the number of phi calls.

  2. programmingpraxis said

    The reference solution finds pi(10^6) in 72ms on my machine, and pi(10^7) in about ten times that, which is still less than a second. What kind of times are you seeing?

    Sieving by trial division to get a list of the primes is ugly. The fastest, prettiest prime generator that I know in Haskell is by Melissa O’Neill: http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf.

  3. Graham said

    The reference solution finds pi(10^6) in about 0.3 seconds and pi(10^7) in about 1.6 (under Petite Chez Scheme). Running the naive Haskell above with GHC’s runhaskell command (so, intepreted) takes almost 9.6 seconds for pi(10^6), while compiled with gch --make -O2 take about 0.5 seconds. My Haskell interpreted chokes when going for pi(10^7), but manages to finish in about 11 seconds when compiled. Thanks for the O’Neill paper; I think I’d come across it before, but hadn’t checked it out in depth.

  4. Mike said

    Python 3.2

    from bisect import bisect
    from functools import lru_cache
    from math import sqrt
    from utils import primesTo
    
    prime = list(primesTo(10000))
    
    @lru_cache(maxsize=None)
    def phi(x, a):
        if a == 1:
            return (x+1)//2
        else:
            return phi(x, a-1) - phi(x//prime[a - 1], a - 1)
    
    
    def pi(n):
        ''' return number of primes <= n
    
        >>> pi(10)
        4
        >>> pi(1000)
        168
        >>> pi(1000000)
        78498
        '''
            
        if n < prime[-1]:
            return bisect(prime, n)
        
        else:
            a = pi(int(sqrt(n)))
            return phi(n, a) + a - 1
    
    ##if __name__ == "__main__":
    ##    import doctest
    ##    doctest.testmod()
    
  5. Graham said

    @Mike, your solution seems to run into the same problem my Python work did, even with Python 3.2: trying pi(10^7) causes the interpreter to crash after exceeding the recursion limit. Any thoughts?

  6. Mike said

    @Graham,

    I had trouble with the stack limit as well.

    Evaluating the code for pi() with n = 1e7, a = 966. So, pi(1e7) = phi(1e7, 966) + 966 – 1. phi(1e7, 966) depends on phi(1e7,965), which depends on phi(1e7,964), … down to phi(1e7, 1). That’s 966 stack frames, which is close to the stack overflow limit.

    I suspect the python shell uses a few stack frames, but probably not enough to push pi(1e7) over the limit.

    I believe the culprit is the lru cache decorator for the phi() function. Looking at the source code in functools.py, when there is a cache miss, the wrapped function calls the original code for phi(). So I think that on a cache miss, two frames get pushed on the stack (one for the call to the wrapped function and one for the call to the original function). The original function may the recurse and call the wrapped function, etc.

    I have not tested my hypothesis. However, if I pre-populate the cache by calling pi() in a loop with gradually increasing arguments, I can calculate pi() for larger arguments without hitting the stack limit.

  7. programmingpraxis said

    Scheme has no trouble with stack overflow because stack frames are stored on the heap, not on the stack.

    I don’t know enough about Python to help. Here are some random thoughts on the matter. I don’t know if any of them are useful.

    Setting a higher stack overflow limit only helps for a little while. I don’t know if that is possible in Python, but in any event it is not a serious solution.

    Does Python have some kind of lazy evaluation method? If it does, and if it uses the heap instead of the stack, maybe you could arrange to use that instead of the normal recursion stack.

    You could build your own stack out of cons pairs stored on the heap, and make it however big you want, instead of using Python’s built-in recursion stack.

    You don’t necessarily need to use the recurrence equation to calculate Legendre’s sum. See the MathWorld article for Legendre’s Formula. Specifically, equation (1) may be useful.

  8. Mike said

    Here is a Python 3 version of phi(x,a) using a separate stack.

    I optimized the code a bit.
    –If x < prime[a], the second term in the recurrence equation is zero, and the first term in the recurrence equation is basically doing a linear search through the primes to find the largest prime that divides x. The bisect() call replaces the linear search with a binary search to find the largest prime less than x.
    –As stated in the problem, the recurrence terminates when a == 1. The recurrence can also be terminated when x == 1, where phi(1,a) is +1 or -1 depending on the sign the phi term would have (the variable s in the code).

    from bisect import bisect
    from math import sqrt
    from utils import primesTo

    # pad prime[] so that prime[n] is the nth prime number
    prime = [0] + primesTo(100000)

    def phi(x, a):
    if a == 0:
    return x

    stack = [(1,x,a)]
    value = 0
    while stack:
    s, x, a = stack.pop()

    if x == 1:
    value += s

    else:
    if x < prime[a]:
    a = bisect(prime, x, 0, a) - 1

    if a == 1:
    value += s * ((x + 1)//2)

    else:
    stack.extend([(s, x, a-1), (-s, x//prime[a], a-1)])

    return value

    def pi(x):
    if x < 2:
    return 0

    else:
    a = pi(int(sqrt(x)))
    return phi(x, a) + a - 1

    There is no caching of intermediate results, so it is not very fast:

    _n_______time (s)_____pi____
    10^0 0.0000083810 0
    10^1 0.0000455365 4
    10^2 0.0001030857 25
    10^3 0.0007869715 168
    10^4 0.0064016516 1229
    10^5 0.0681326817 9592
    10^6 0.2781906639 78498
    10^7 2.4777696607 664579
    10^8 25.5805003150 5761455
    10^9 262.7440423294 50847534

  9. Mike said

    With formatting:

    from bisect import bisect
    from math import sqrt
    from utils import primesTo
    
    # pad prime[] so that prime[n] is the nth prime number
    prime = [0] + primesTo(100000)
    
    def phi(x, a):
        if a == 0:
            return x
    
        stack = [(1,x,a)]
        value = 0
        while stack:
            s, x, a = stack.pop()
    
            if x == 1:
                value += s
    
            else:
                if x < prime[a]:
                    a = bisect(prime, x, 0, a) – 1
    
                if a == 1:
                    value += s * ((x + 1)//2)
    
                else:
                    stack.extend([(s, x, a-1), (-s, x//prime[a], a-1)])
    
        return value
    
    def pi(x):
        if x < 2:
            return 0
    
        else:
            a = pi(int(sqrt(x)))
            return phi(x, a) + a – 1
    
    _n_______time (s)_____pi____
    10^0   0.0000083810   0
    10^1   0.0000455365   4
    10^2   0.0001030857   25
    10^3   0.0007869715   168
    10^4   0.0064016516   1229
    10^5   0.0681326817   9592
    10^6   0.2781906639   78498
    10^7   2.4777696607   664579
    10^8  25.5805003150   5761455
    10^9 262.7440423294   50847534
    

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 196 other followers