## Collatz Primes

### May 1, 2015

I’ll write the solution in Python because that was the language of the original question on Stack Overflow. Let’s begin with a function that determines if a number is prime; we will use the Miller-Rabin algorithm, which is faster than trial division for the size of numbers that we will be dealing with:

from random import randint def isPrime(n, k=5): # miller-rabin if n < 2: return False for p in [2,3,5,7,11,13,17,19,23,29]: if n % p == 0: return n == p s, d = 0, n-1 while d % 2 == 0: s, d = s+1, d/2 for i in range(k): x = pow(randint(2, n-1), d, n) if x == 1 or x == n-1: continue for r in range(1, s): x = (x * x) % n if x == 1: return False if x == n-1: break else: return False return True

Since we want to find the first number that satisfies the criteria, our strategy will be to start from 2 and work our way up, storing results as we go. We prime the cache (that’s a bad pun, sorry) with the prime counts in the Collatz sequences from 0 to 2:

primeCount = [0,0,1]

Function `pCount(n)`

counts the primes in the Collatz sequence for *n*. As soon as the current value of the sequence *k* drops below *n*, we look up the result in the cache. Until then, we test each odd number in the Collatz sequence for primality and increment the prime count *p* if appropriate. When we have the prime count for *n*, we add it to the cache and return it.

def pCount(n): k, p = n, 0 while k > 0: if k < n: t = p + primeCount[k] primeCount.append(t) return t elif k % 2 == 0: k = k / 2 elif isPrime(k): p = p + 1 k = 3*k + 1 else: k = 3*k + 1

Now it’s just a matter of computing the prime counts for each *n*, in order starting from 3, stopping when the prime count exceeds 65:

n = 3 t = pCount(n) while t < 65: n = n + 1 t = pCount(n)

That doesn’t take long, less than a minute on my computer. Here’s the result:

>>> print n 1805311

There are 67 primes in the result. If you want to see them, here is a simple function that prints the Collatz sequence for a given *n*:

def collatz(n): cs = [] while n != 1: cs.append(n) n = 3*n+1 if n&1 else n//2 cs.append(1) return cs

And here’s the list of primes:

>>> filter(isPrime,collatz(n)) [2707967, 4061951, 6092927, 9139391, 30845447, 46268171, 52051693, 14639539, 1158011, 1737017, 1648811, 2473217, 880361, 660271, 2228417, 352543, 1784753, 47059, 70589, 59561, 100511, 150767, 508841, 381631, 1932011, 2898017, 2173513, 2445203, 2063141, 386839, 580259, 122399, 275399, 348553, 248141, 93053, 117773, 6211, 1747, 2621, 983, 2213, 1579, 1777, 47, 71, 107, 137, 103, 233, 263, 593, 167, 251, 283, 479, 719, 1619, 911, 1367, 577, 433, 61, 23, 53, 5, 2]

You can read more about it at A078350, or run the program at http://ideone.com/Nauozg.

In Python..

Solution performance can be increased by creating a cached collatz(n) function. For example, if we have previously computed collatz(6) already, a consecutive call for collatz(12) would obtain the result from cache.

Result:

1805311 68

1993215 71

2139627 69

2407081 68

2707967 68

2989823 71

In the following Python code, I import an isprime() function from my personal library. This uses BPSW for numbers above 3825123056546413051, Miller-Rabin in various increments for numbers below that, and has a few prefilters including trial division, Fermat’s test, and Euler’s test.

Brute force:

On my machine, this finishes in about 15 minutes.

Caching:

This finishes in 27 seconds.

(The answer is 1805311, which initiates a sequence of length 402 containing 67 primes.)

I use a sieve to determine if a number, n, is prime where n < 2×10**9. If n is greater than that then I use an isprime method. Since most of the numbers in the Collatz sequence are

< 2×10**9, the program executes much faster, about 2 sec.