## Sexy Primes

### August 23, 2019

Today’s exercise was inspired by this post at Stack Overflow asking for the count of consecutive prime pairs with a difference of 6 between one and two billion; I modified the task to count sexy primes. The difference is that two primes can be sexy but not consecutive; for instance, 5 and 11 form a sexy pair, but they are not consecutive because 7 is prime. We’ll write several solutions for this task, using Python because that poster at Stack Overflow asked for Python.

Our first solution identifies sexy primes p by using a primality test to check both p and p + 6 for primality:

```def isSexy(n):
return isPrime(n) and isPrime(n+6)```

Then we check each odd number from one billion to two billion:

```counter = 0
for n in xrange(1000000001, 2000000000, 2):
if isSexy(n): counter += 1```
`print counter`

That takes an estimated two hours on my machine, determined by running it from 1 billion to 1.1 billion and multiplying by 10. We need something better.

Our second solution uses our standard prime generator to generate the primes from one billion to two billion, checking for each prime p if p – 6 is on the list:

```counter = 0
ps = primeGen(1000000000)
p2 = next(ps); p1 = next(ps); p = next(ps)
while p < 2000000000:
if p - p2 == 6 or p - p1 == 6:
counter += 1
p2 = p1; p1 = p; p = next(ps)```
`print counter`

That took about 8.5 minutes and produced the correct result.

Our third solution uses a segmented sieve to make a list of primes from one billion to two billion, then scans the list counting the sexy primes:

```counter = 0
ps = primes(1000000000, 2000000000)
if ps[1] - ps[0] == 6: counter += 1```
```for i in xrange(2,len(ps)):
if ps[i] - ps[i-2] == 6 or ps[i] - ps[i-1] == 6:
counter += 1```
`print counter`

That took about four minutes and produced the correct result. Almost all of the time went to sieving. I suspect that a lot of that time was spent actually storing the primes; there are over 47,374,753 primes between one billion and two billion, which requires a lot of space, and a lot of time to manipulate all that space. Among other things, this shows that our prime generator is not outlandishly slow.

Our fourth and final solution requires bespoke code rather than using the libraries as did our first three solutions. The idea is to sieve each of the polynomials 6n + 1 and 6n − 1 separately, scan for adjacent pairs, and combine the counts.

This is a little bit tricky, so let’s look at an example: sieve 6n + 1 on the range 100 to 200 using the primes 5, 7, 11 and 13, which are the sieving primes less than 200 (excluding 2 and 3, which divide 6). The sieve has 17 elements 103, 109, 115, 121, 127, 133, 139, 145, 151, 157, 163, 169, 175, 181, 187, 193, 199. The least multiple of 5 in the list is 115, so we strike 115, 145 and 175 (every 5th item) from the sieve. The least multiple of 7 in the list is 133, so we strike 133 and 175 (every 7th item) from the sieve. The least multiple of 11 in the list is 121, so we strike 121 and 187 (every 11th item) from the list. And the least multiple of 13 in the list is 169, so we strike it from the list (it’s in the middle of a 17-item list, and has no other multiples in the list). The primes that remain in the list are 103, 109, 127, 139, 151, 157, 163, 181, 193, and 199; of those, 103, 151, 157 and 193 are sexy.

The trick is finding the offset in the sieve of the first multiple of the prime. The formula is (lo + p) / -6 (mod p), where lo is the first element in the sieve (103 in the example above) and p is the prime; -6 comes from the gap between successive elements of the sieve. In modular arithmetic, division is undefined, so we can’t just divide by -6; instead, we find the modular inverse. And since modular inverse is undefined for negative numbers, we first convert -6 to its equivalent mod p. Thus, for our four sieving primes, the offsets into the sieve are:

```((103 + 5) * inverse(-6 % 5, 5)) % 5 = 2 ==> points to 115
((103 + 7) * inverse(-6 % 7, 7)) % 7 = 5 ==> points to 133
((103 + 11) * inverse(-6 % 11, 11)) % 11 = 3 ==> points to 121
((103 + 13) * inverse(-6 % 13, 13)) % 13 = 11 ==> points to 169```

Sieving sieve 6n – 1 works the same way, except that lo points to 101 instead of 103; the sieve contains 101, 107, 113, 119, 125, 131, 137, 143, 149, 155, 161, 167, 173, 179, 185, 191, 197:

```((101 + 5) * inverse(-6 % 5, 5)) % 5 = 4 ==> points to 125
((101 + 7) * inverse(-6 % 7, 7)) % 7 = 3 ==> points to 119
((101 + 11) * inverse(-6 % 11, 11)) % 11 = 7 ==> points to 143
((101 + 13) * inverse(-6 % 13, 13)) % 13 = 7 ==> points to 143```

After sieving, the numbers that remain in the sieve are 101, 107, 113, 131, 137, 149, 167, 173, 179, 191, 197, of which 101, 107, 131, 167, 173 and 191 are sexy, so there are 10 sexy primes between 100 and 200.

Here’s the code:

```start = time()
counter = 0
ps = primes(isqrt(2000000000))[2:]
size = (2000000000-1000000000)/6+1```
```# sieve on 6n-1
lo = 1000000000/6*6+5
sieve = [True] * size
for p in ps:
q = ((lo+p)*inverse(-6%p,p))%p
for i in xrange(q,size,p):
sieve[i] = False```
```for i in xrange(1,size):
if sieve[i] and sieve[i-1]:
counter += 1

# sieve on 6n+1
lo += 2
sieve = [True] * size
for i in xrange(0,size):
sieve[i] = True
for p in ps:
q = ((lo+p)*inverse(-6%p,p))%p
for i in xrange(q,size,p):
sieve[i] = False
for i in xrange(1,size):
if sieve[i] and sieve[i-1]:
counter += 1
print counter
print time() - start
print "Done"
That took 3:21 and produced the correct result. There are 5924680 sexy primes between one billion and two billion.

__ATA.cmd.push(function() {
__ATA.initVideoSlot('atatags-370373-61abf78b9e893', {
sectionId: '370373',
});
});

__ATA.cmd.push(function() {
__ATA.initDynamicSlot({
id: 'atatags-26942-61abf78b9e8dd',
location: 120,
formFactor: '001',
label: {
},
creative: {
},
privacySettings: {
text: 'Privacy',

}
}
});
});

Related
Pages: 1 2		```

### 9 Responses to “Sexy Primes”

1. Paul said

In Python. The lazy prime generator is based on a segmented sieve and a little too much code to copy here. Thee deque keeps the tested prime in D[0] and has further 3 next primes to test against.
This takes some 90 secs on my (old) PC and reproduces the answer of ProgrammingPraxis.

```from collections import deque

start = 10**9
stop = 2 * 10**9
pp = lazy_prime(start)
D = deque((next(pp), next(pp), next(pp), next(pp)), maxlen=4)
cnt = 0
while D[0] < stop:
if D[0] + 6 in D:
cnt += 1
D.append(next(pp))
print(cnt)
```
2. Bill Wood said

Rust version, takes about 45 secs on my PC:

```/* author Bill Wood */
fn sqrt(n: usize) -> usize {
let (mut r1, mut r) = (n, n + 1);
while r1 < r {
r = r1;
r1 = (r1 + n/r1) >> 1
}
r
}

fn prime_sieve(begin: usize, end: usize) -> Vec<usize> {
let mut sieve = vec![false; end + 1];
let sn = sqrt(end);
for i in 2..=sn {
if !sieve[i] {
let mut j = i*i;
while j <= end {
sieve[j] = true;
j += i;
}
}
}

let mut result = vec![];
for i in begin..=end {
if !sieve[i] {
result.push(i);
}
}
result
}

fn main() {
let primes = prime_sieve(1_000_000_000, 2_000_000_000);
let mut count = 0;
for (i, x) in primes.iter().enumerate() {
for &y in &primes[i + 1..] {
if y > x + 6 {
break;
}
if y == x + 6 {
count += 1;
}
}
}
println!("{} of {} primes are sexy", count, primes.len());
}
```

Output:

```\$ rustc -C opt-level=3 sexy-primes.rs
\$ time ./sexy-primes
5924680 of 47374753 primes are sexy

real    0m44.919s
user    0m36.688s
sys     0m3.969s
```
3. Daniel said

Here’s a sieve-based C solution.

```#include <stdio.h>
#include <stdlib.h>

int main(void) {
int N = 2000000006;
char* array = calloc(N + 1, 1);
for (int i = 2; i * i <= N; ++i) {
if (array[i]) continue;
for (int j = i * i; j <= N; j += i) {
array[j] = 1;
}
}
int sum = 0;
for (int i = 1000000000; i <= 2000000000; ++i) {
sum += !array[i] && !array[i + 6];
}
printf("%d\n", sum);
return EXIT_SUCCESS;
}
```

Output:

```5924680
```
4. Daniel said

I forgot to call free :-(
(my preference is to explicitly clean up, even if the memory is reclaimed by the OS)

5. Daniel said

Here’s the C++ version of my code above. It uses vector instead of a char array, to reduce the program’s memory usage from 2GB to 250MB, since it can use a single bit for each boolean instead of a byte. Additionally, the execution time on my laptop decreases from 22 seconds for the C version to 17 seconds for the C++ version (both compiled with -O3).

```#include <cstdlib>
#include <iostream>
#include <vector>

using namespace std;

int main(void) {
int N = 2000000006;
vector<bool> array(N + 1, false);
for (int i = 2; i * i <= N; ++i) {
if (array[i]) continue;
for (int j = i * i; j <= N; j += i) {
array[j] = 1;
}
}
int sum = 0;
for (int i = 1000000000; i <= 2000000000; ++i) {
sum += !array[i] && !array[i + 6];
}
printf("%d\n", sum);
return EXIT_SUCCESS;
}
```

Output:

```5924680
```
6. chaw said

Here’s a solution in C that may not quite be in the spirit of the
question since it uses the excellent primesieve library to do all the
heavy lifting but, hey, it works and it’s fast, and it isn’t my
homework. Runs in about half a second on my very modest
computer. Bam!

``` #include <primesieve.h> #include <inttypes.h> #include <stdio.h> static int sexy_count = 0; void record_prime(uint64_t p) { static uint64_t buf[] = {0, 0, 0}; static short cur = 0; uint64_t q = p - 6; if (q == buf[0] || q == buf[1] || q == buf[2]) sexy_count++; buf[cur] = p; cur = (cur + 1) % 3; } int main() { primesieve_callback_primes(1000000000, 2000000000, &record_prime); printf("%u\n", sexy_count); return 0; } ```

``` < ```

```pre class="src src-sh">time ./sexy-primes ```

Output:

``` 5924680 real 0m0.563s user 0m0.560s sys 0m0.000s ```

7. Bill Wood said

Daniel inspired me to re-write the Rust version with a bit vector (home-rolled) instead of a bool vector. His C++ version and this Rust version both run in 25 secs on my home PC. The Rust bool version above runs 3 secs slower (28 secs) on my home PC (timing of 45 secs above was on my work PC).

```/* author Bill Wood */
fn sqrt(n: usize) -> usize {
let (mut r1, mut r) = (n, n + 1);
while r1 < r {
r = r1;
r1 = (r1 + n/r1) >> 1
}
r
}

struct Bvec {
v: Vec<u8>
}

impl Bvec {
fn new(s: usize, val: u8) -> Self {
Bvec { v: vec![val; s/8 + if s%8 == 0 { 0 } else { 1 } ] }
}
#[inline(always)]
fn get(&self, index: usize) -> u8 {
self.v[index/8] & 1 << index%8
}
#[inline(always)]
fn set(&mut self, index: usize, val: u8) {
self.v[index/8] |= val << index%8;
}
}

fn prime_sieve(begin: usize, end: usize) -> Vec<usize> {
let mut sieve = Bvec::new(end + 1, 0);
let sn = sqrt(end);
for i in 2..=sn {
if sieve.get(i) == 0 {
let mut j = i*i;
while j <= end {
sieve.set(j, 1);
j += i;
}
}
}

let mut result = vec![];
for i in begin..=end {
if sieve.get(i) == 0 {
result.push(i);
}
}
result
}

fn main() {
let primes = prime_sieve(1_000_000_000, 2_000_000_000);
let mut count = 0;
for (i, x) in primes.iter().enumerate() {
for &y in &primes[i + 1..] {
if y > x + 6 {
break;
} else if y == x + 6 {
count += 1;
}
}
}
println!("{} of {} primes are sexy", count, primes.len());
}
```

Rust output:

```\$ time ./sexy-primes-bit
5924680 of 47374753 primes are sexy

real    0m25.076s
user    0m24.872s
sys     0m0.200s
```

Daniel’s C++ version:

```\$ time ~/c/try/sexy-primes
5924680

real    0m25.032s
user    0m24.941s
sys     0m0.090s
```
8. matthew said

Here’s a basic segmented sieve solution, using a vector (ie. a bitmap representation) and working in segments of 128k entries. Takes a little under 3 seconds on my laptop, versus 28 seconds for Daniel’s C++ solution above. Not competitive with Chaw’s impressively fast primesieve solution yet though, I must have a look at what that is doing:

```#include <vector>
#include <iostream>
#include <algorithm>
#include <math.h>

std::vector<int> getoddprimes(int max) {
std::vector<int> primes;
max /= 2;
std::vector<bool> a(max);
for (int i = 1; i < max; i++) {
if (a[i] == 0) {
int p = 2*i+1;
primes.push_back(p);
for (int j = p*p/2; j < max; j += p) {
a[j] = 1;
}
}
}
return primes;
}

int main() {
int start = 1000000000, end = 2*start+6, block = 128*1024;
int primemax = sqrt(end); // Maybe should use isqrt.
std::vector<int> primes = getoddprimes(primemax+1);
int nprimes = primes.size();

start /= 2; end /= 2; // Just look at odd numbers
std::vector<bool> a(end-start);
std::vector<int> offsets(nprimes); // Keeps track of sieve position
for (int i = 0; i < nprimes; i++) {
int p = primes[i];
int offset = p*p/2;
if (offset < start) {
offset += (start-offset+p-1)/p*p; // Move into region of interest
}
offsets[i] = offset;
}
int blockstart = start, sexystart = blockstart+3, count = 0;
while (blockstart != end) {
int blockend = std::min(end, blockstart + block);
for (int i = 0; i < nprimes; i++) {
int p = primes[i];
int offset = offsets[i]; // Faster to cache in register
while (offset < blockend) {
a[offset-start] = 1; // Cross off multiple of p
offset += p;
}
offsets[i] = offset;
}
for (int i = sexystart; i < blockend; i++) {
count += a[i-start-3] == 0 && a[i-start] == 0; // Look backwards
}
sexystart = blockstart = blockend;
}
std::cout << count << "\n";
}
```
9. Bill Wood said

This Rust one runs in under 0.6 seconds on my Google Pixelbook with Crostini Linux. It’s faster than the previous Rust one above because:
1) it only looks at odd numbers
2) it calculates the primes in segments which is faster because the segments fit in the CPU cache
3) removed bit vectors as ultimately they were slower
4) it uses rayon to parallelize computing the segment sieves and find the sexy primes

```/* author Bill Wood */

use rayon::prelude::*;
use std::io::Write;

fn main() {
let mut args = std::env::args()
.skip(1)
.map(|a| a
.split('_')
.fold("".to_string(), |a, s| a + s)
.parse::<usize>());

if args.len() <= 3 {
if let (Ok(mut low), Ok(mut high), Ok(seg_size_kb), ) = (
args.next().unwrap_or(Ok(1_000_000_000)),
args.next().unwrap_or(Ok(2_000_000_000)),
args.next().unwrap_or(Ok(128)),) {
println!("S15: Finding sexy primes with args {} {} {}", low, high, seg_size_kb, );

low = if low < 3 { 3 } else { low | 1 };   // lowest odd >= low
high = (high - 1) | 1;  // highest odd <= high

// find sieving primes
let sieve_primes = simple_sieve(high);

// prepare for parallel segment sieve
let candidates_per_seg = seg_size_kb*1024*2;
// generate a Vec of (low, high) segment tuples
let segs: Vec<(usize, usize)> = { low..=high }
.step_by(candidates_per_seg)
.map(|seg_low| {
let seg_high = seg_low + candidates_per_seg - 2;
(seg_low, if seg_high > high { high } else { seg_high } + 6)
})
.collect();

// do parallel segment sieve
let (sexy_count, nprimes) =
segs.into_par_iter()
.map(|(low, high)| {
// sieve this segment
let sieve_seg = seg_sieve(low, high, &sieve_primes);
// find sexy primes in this segment
let (mut sexy_count, mut nprimes) = (0_usize, 0_usize);
for p in 0..sieve_seg.len() - 6/2 {
if unsafe { !sieve_seg.get_unchecked(p) } {
nprimes += 1;
if unsafe { !sieve_seg.get_unchecked(p + 6/2) } {
sexy_count += 1;
}
}
}
(sexy_count, nprimes)
})
.reduce(|| (0, 0),
|a, b| (a.0 + b.0, a.1 + b.1));

println!("{} of {} primes are sexy", sexy_count, nprimes);
return;
}
}

writeln!(std::io::stderr(), "Usage: sexy_primes [low [high [seg_size_kb]]]").unwrap();
std::process::exit(1);
}

fn simple_sieve(mut high: usize) -> Vec<u32> {
// x where x: odd and (x+2)^2 > high and x^2 < high
high = (sqrt(high) - 1) | 1;
let mut pvec = vec![false; (high + 1)/2];
let mut primes = vec![];
let n = high >> 1;
let mut i = 3;
while i*i <= high {
if unsafe { !pvec.get_unchecked(i >> 1) } {
primes.push(i as u32);
let mut j = (i*i) >> 1;
while j <= n {
unsafe { *pvec.get_unchecked_mut(j) = true };
j += i;
}
}
i += 2;
}
i >>= 1;
while i <= n {
if unsafe { !pvec.get_unchecked(i) } {
primes.push(((i << 1) | 1) as u32);
}
i += 1;
}
primes
}

fn seg_sieve(low: usize, high: usize, sieve_primes: &Vec<u32>) -> Vec<bool> {
let n = (high + 2 - low) >> 1;
let mut sieve_seg = vec![false; n];
for p in sieve_primes {
let p = *p as usize;
let mut j = p*p;
if j > high {
break;
}
if j < low {
// lowest odd multiple of p which is >= low
j = p*(((low - 1)/p + 1) | 1);
}
j = (j - low) >> 1;
while j < n {
unsafe { *sieve_seg.get_unchecked_mut(j) = true };
j += p;
}
}
sieve_seg
}

fn sqrt(n: usize) -> usize {
let (mut r1, mut r) = (n, n + 1);
while r1 < r {
r = r1;
r1 = (r1 + n/r1) >> 1
}
r
}
```
```\$ time ./sexy-primes
S15: Finding sexy primes with args 1000000000 2000000000 128
5924680 of 47374753 primes are sexy

real    0m0.588s
user    0m2.298s
sys     0m0.005s
```