Counting Primes Using Legendre’s Formula

July 22, 2011

The prime-counting function π(n) computes the number of primes not greater than n, and has been a matter of fascination to mathematicians for centuries. The Prime Number Theorem tells us that π(n) is approximately n / log(n); Carl Fredrich Gauss proposed the prime number theorem in 1792 when he was only fifteen years old, and Jacques Hadamard and Charles-Jean de la Valle Poussin both proved it, independently, in 1896. The first person to make a serious attempt at the calculation, except for trivial attempts that simply enumerated the primes and counted them, was the French mathematician Adrien-Marie Legendre at the end of the eighteenth century; his formula was correct, but he erroneously reported that π(106) = 78526, whereas the correct value is 78498. We will recreate Legendre’s calculation in today’s exercise.

Legendre’s formula works in two parts. First, he defined a function φ(x, a) that counts all the numbers from 1 to x inclusive that are not stricken by sieving with the first a primes. For instance φ(50, 3) = 14, because the numbers 1, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47 and 49 less than or equal to 50 are not divisible by the first three primes 2, 3, or 5. The φ function can be computed by the recurrence equation φ(x, a) = φ(x, a−1) − φ(x/pa, a−1), where φ(x, 1) is the number of odd integers less than or equal to x and pa is the ath prime number. The second part of Legendre’s formula uses the φ function to compute the value of π recursively: π(x) = φ(x, a) + a − 1, where a = π(⌊√x⌋).

Your task is to write φ and π functions and recreate Legendre’s calculation of π(1000000). When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Advertisement

Pages: 1 2

9 Responses to “Counting Primes Using Legendre’s Formula”

  1. Graham said

    Even with memoization, my Python solution didn’t finish quickly enough for me. Interestingly, (or not, as the case may be)
    the following naive Haskell implementation finishes much more quickly (especially when compiled via GHC).

    import Data.List (foldl')
    import Data.Numbers.Primes (primes)
    
    sieve :: Int -> [Int] -> [Int]
    sieve p = filter (\x -> x `mod` p /= 0)
    
    phi :: Int -> Int -> Int
    phi x a = length $ foldl' (flip sieve) [1..x] (take a primes)
    
    isqrt :: Int -> Int
    isqrt = floor . sqrt . fromIntegral
    
    primePi :: Int -> Int
    primePi 1 = 0
    primePi x = phi x a + a - 1 where a = primePi $ isqrt x
    
    main :: IO ()
    main = print $ primePi 1000000
    

    I guess the next step is to get good enough at Haskell to figure out a way to cut the number of phi calls.

  2. programmingpraxis said

    The reference solution finds pi(10^6) in 72ms on my machine, and pi(10^7) in about ten times that, which is still less than a second. What kind of times are you seeing?

    Sieving by trial division to get a list of the primes is ugly. The fastest, prettiest prime generator that I know in Haskell is by Melissa O’Neill: http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf.

  3. Graham said

    The reference solution finds pi(10^6) in about 0.3 seconds and pi(10^7) in about 1.6 (under Petite Chez Scheme). Running the naive Haskell above with GHC’s runhaskell command (so, intepreted) takes almost 9.6 seconds for pi(10^6), while compiled with gch --make -O2 take about 0.5 seconds. My Haskell interpreted chokes when going for pi(10^7), but manages to finish in about 11 seconds when compiled. Thanks for the O’Neill paper; I think I’d come across it before, but hadn’t checked it out in depth.

  4. Mike said

    Python 3.2

    from bisect import bisect
    from functools import lru_cache
    from math import sqrt
    from utils import primesTo
    
    prime = list(primesTo(10000))
    
    @lru_cache(maxsize=None)
    def phi(x, a):
        if a == 1:
            return (x+1)//2
        else:
            return phi(x, a-1) - phi(x//prime[a - 1], a - 1)
    
    def pi(n):
        ''' return number of primes <= n
    
        >>> pi(10)
        4
        >>> pi(1000)
        168
        >>> pi(1000000)
        78498
        '''
    
        if n < prime[-1]:
            return bisect(prime, n)
    
        else:
            a = pi(int(sqrt(n)))
            return phi(n, a) + a - 1
    
    ##if __name__ == "__main__":
    ##    import doctest
    ##    doctest.testmod()
    
  5. Graham said

    @Mike, your solution seems to run into the same problem my Python work did, even with Python 3.2: trying pi(10^7) causes the interpreter to crash after exceeding the recursion limit. Any thoughts?

  6. Mike said

    @Graham,

    I had trouble with the stack limit as well.

    Evaluating the code for pi() with n = 1e7, a = 966. So, pi(1e7) = phi(1e7, 966) + 966 – 1. phi(1e7, 966) depends on phi(1e7,965), which depends on phi(1e7,964), … down to phi(1e7, 1). That’s 966 stack frames, which is close to the stack overflow limit.

    I suspect the python shell uses a few stack frames, but probably not enough to push pi(1e7) over the limit.

    I believe the culprit is the lru cache decorator for the phi() function. Looking at the source code in functools.py, when there is a cache miss, the wrapped function calls the original code for phi(). So I think that on a cache miss, two frames get pushed on the stack (one for the call to the wrapped function and one for the call to the original function). The original function may the recurse and call the wrapped function, etc.

    I have not tested my hypothesis. However, if I pre-populate the cache by calling pi() in a loop with gradually increasing arguments, I can calculate pi() for larger arguments without hitting the stack limit.

  7. programmingpraxis said

    Scheme has no trouble with stack overflow because stack frames are stored on the heap, not on the stack.

    I don’t know enough about Python to help. Here are some random thoughts on the matter. I don’t know if any of them are useful.

    Setting a higher stack overflow limit only helps for a little while. I don’t know if that is possible in Python, but in any event it is not a serious solution.

    Does Python have some kind of lazy evaluation method? If it does, and if it uses the heap instead of the stack, maybe you could arrange to use that instead of the normal recursion stack.

    You could build your own stack out of cons pairs stored on the heap, and make it however big you want, instead of using Python’s built-in recursion stack.

    You don’t necessarily need to use the recurrence equation to calculate Legendre’s sum. See the MathWorld article for Legendre’s Formula. Specifically, equation (1) may be useful.

  8. Mike said

    Here is a Python 3 version of phi(x,a) using a separate stack.

    I optimized the code a bit.
    –If x < prime[a], the second term in the recurrence equation is zero, and the first term in the recurrence equation is basically doing a linear search through the primes to find the largest prime that divides x. The bisect() call replaces the linear search with a binary search to find the largest prime less than x.
    –As stated in the problem, the recurrence terminates when a == 1. The recurrence can also be terminated when x == 1, where phi(1,a) is +1 or -1 depending on the sign the phi term would have (the variable s in the code).

    from bisect import bisect
    from math import sqrt
    from utils import primesTo

    # pad prime[] so that prime[n] is the nth prime number
    prime = [0] + primesTo(100000)

    def phi(x, a):
    if a == 0:
    return x

    stack = [(1,x,a)]
    value = 0
    while stack:
    s, x, a = stack.pop()

    if x == 1:
    value += s

    else:
    if x < prime[a]:
    a = bisect(prime, x, 0, a) - 1

    if a == 1:
    value += s * ((x + 1)//2)

    else:
    stack.extend([(s, x, a-1), (-s, x//prime[a], a-1)])

    return value

    def pi(x):
    if x < 2:
    return 0

    else:
    a = pi(int(sqrt(x)))
    return phi(x, a) + a - 1

    There is no caching of intermediate results, so it is not very fast:

    _n_______time (s)_____pi____
    10^0 0.0000083810 0
    10^1 0.0000455365 4
    10^2 0.0001030857 25
    10^3 0.0007869715 168
    10^4 0.0064016516 1229
    10^5 0.0681326817 9592
    10^6 0.2781906639 78498
    10^7 2.4777696607 664579
    10^8 25.5805003150 5761455
    10^9 262.7440423294 50847534

  9. Mike said

    With formatting:

    from bisect import bisect
    from math import sqrt
    from utils import primesTo
    
    # pad prime[] so that prime[n] is the nth prime number
    prime = [0] + primesTo(100000)
    
    def phi(x, a):
        if a == 0:
            return x
    
        stack = [(1,x,a)]
        value = 0
        while stack:
            s, x, a = stack.pop()
    
            if x == 1:
                value += s
    
            else:
                if x < prime[a]:
                    a = bisect(prime, x, 0, a) – 1
    
                if a == 1:
                    value += s * ((x + 1)//2)
    
                else:
                    stack.extend([(s, x, a-1), (-s, x//prime[a], a-1)])
    
        return value
    
    def pi(x):
        if x < 2:
            return 0
    
        else:
            a = pi(int(sqrt(x)))
            return phi(x, a) + a – 1
    
    _n_______time (s)_____pi____
    10^0   0.0000083810   0
    10^1   0.0000455365   4
    10^2   0.0001030857   25
    10^3   0.0007869715   168
    10^4   0.0064016516   1229
    10^5   0.0681326817   9592
    10^6   0.2781906639   78498
    10^7   2.4777696607   664579
    10^8  25.5805003150   5761455
    10^9 262.7440423294   50847534
    

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 136 other followers