Fenderbender’s Square Reckoner

November 29, 2016

In many of our programs involving prime numbers and number theory, we need to be able to determine if a number n is a perfect square. One way to do that is to determine the integer square root of the number, using Newton’s method, then multiply to determine if the original number is a square. But that’s slow. In a previous exercise, we used a method devised by Henri Cohen to calculate the quadratic residues of n to various moduli, which can quickly determine that some n cannot be perfect squares.

Over at Mersenne Forum, fenderbender extends Cohen’s idea to make a ridiculously fast square predicate: he precalculates multiple moduli to reduce the operation from big integers to 32-bit integers, chooses the moduli after extensive testing, and tests the quadratic residues using a 64-bit bloom filter. The result is impressive. Where Cohen eliminates the expensive square root calculation in 99% of cases, fenderbender eliminates the expensive square root calculation in 99.92% of cases, and does it faster than Cohen. Go read fenderbender‘s explanation to see a beautiful combination of number theory, wonky programming, and sheer artistry.

Your task is to implement fenderbender‘s square predicate. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

10 Comments »

10 Responses to “Fenderbender’s Square Reckoner”

Paul said
November 29, 2016 at 3:24 PM
@programmingpraxis: I converted the code to Python. It works about twice as fast compared to a method where you just use the 2 last digits for a first screening. Maybe I found an example where the method does not work. 152415787532388367504942236884722755800955129 is a square, but fenderbender’s method tells me not; it fails on the test using mod19. Could you check with your code?
programmingpraxis said
November 29, 2016 at 3:35 PM
@Paul: Works fine for me: http://ideone.com/sqY3zf.
Paul said
November 29, 2016 at 3:44 PM
@programmingparaxis: thanks. Apparently some checking to do.
Ernie said
November 29, 2016 at 3:51 PM
Does anyone know why fenderbender chose mod x with x = 128? I have tried other x. By trial and error, 5! (= 120) gives a rejection rate of 85% and 6! a rejection rate of 93.3%.
Would either of these be more appropriate?
programmingpraxis said
November 29, 2016 at 3:56 PM
@Ernie: I suspect it is due to the computational efficiency of bit-shifting rather than performing a big-integer modulo.

Paul said

November 29, 2016 at 5:03 PM

An interesting method. In Python it seems to be more effective to use sets i.s.o. Bloom filters. The version with sets is about a factor of 1.7 times faster. For the sets version I used mod64 i.s.o. mod128, because I could not find the quadratic residues for mod128.

FACTOR = 63*25*11*17*23*31*19

def is_square(n):
    m = n % 128
    if m*0x8bc40d7d & m*0xa1e2f5d1 & 0x14020a:
        return False
    largeMod = n % FACTOR
    m = largeMod % 63
    if m*0x3d491df7 & m*0xc824a9f9 & 0x10f14008:
        return False
    m = largeMod % 25
    if m*0x1929fc1b & m*0x4c9ea3b2 & 0x51001005:
        return False
    m = 0xd10d829a * (largeMod % 31)
    if m & (m+0x672a5354) & 0x21025115:
        return False
    m = largeMod % 23
    if m*0x7bd28629 & m*0xe7180889 & 0xf8300:
        return False
    m = largeMod % 19
    if m*0x1b8bead3 & m*0x4d75a124 & 0x4280082b:
        return False
    m = largeMod % 17
    if m*0x6736f323 & m*0x9b1d499 & 0xc0000300:
        return False
    m = largeMod % 11
    if m*0xabf1a3a7 & m*0x2612bf93 & 0x45854000:
        return False
    return isqrt(n) ** 2 == n

d64 = set((0, 1, 4, 9, 16, 17, 25, 33, 36, 41, 49, 57))
d63 = set((0, 1, 4, 7, 9, 16, 18, 22, 25, 28, 36, 37, 43, 46, 49, 58))
d25 = set((0, 1, 4, 6, 9, 11, 14, 16, 19, 21, 24))
d31 = set((0, 1, 2, 4, 5, 7, 8, 9, 10, 14, 16, 18, 19, 20, 25, 28))
d23 = set((0, 1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18))
d19 = set((0, 1, 4, 5, 6, 7, 9, 11, 16, 17))
d17 = set((0, 1, 2, 4, 8, 9, 13, 15, 16))
d11 = set((0, 1, 3, 4, 5, 9))

def is_square2(n):
    if n % 64 not in d64:
        return False
    if n % 63 not in d63:
        return False
    if n % 25 not in d25:
        return False
    if n % 31 not in d31:
        return False
    if n % 23 not in d23:
        return False
    if n % 19 not in d19:
        return False
    if n % 17 not in d17:
        return False
    if n % 11 not in d11:
        return False
    return isqrt(n) ** 2 == n

programmingpraxis said
November 29, 2016 at 7:27 PM
@Paul: It’s easy to compute the quadratic residues of a modulus, let’s say n = 11. For each number from 0 to n−1, square the number and take it mod n; thus, (0*0)%11=0, (1*1)%11=1, . . . , (5*5)%11=3, . . . , (10*10)%11=1. Those numbers are the quadratic residues of 11: 0, 1, 3, 4, 5, and 9. For the quadratic residues of 128 I get
```
> (unique =
    (sort <
      (map (lambda (n) (modulo (* n n) 128))
           (range 128))))
(0 1 4 9 16 17 25 33 36 41 49 57 64 65 68 73 81 89 97 100
 105 113 121)
```
Paul said
November 29, 2016 at 8:17 PM
@programmingpraxis: thanks, I found out in the meantime (and get the same numbers as you). Using mod128 did not give significant different results from using mod64. I am still a little working on the problem, what would be the best mod’s to use.

Milbrae said

March 7, 2017 at 5:44 PM

I’ve implemented 4 square finding routines and timed them.

import time
import math

d64 = set((0, 1, 4, 9, 16, 17, 25, 33, 36, 41, 49, 57))
d63 = set((0, 1, 4, 7, 9, 16, 18, 22, 25, 28, 36, 37, 43, 46, 49, 58))
d25 = set((0, 1, 4, 6, 9, 11, 14, 16, 19, 21, 24))
d31 = set((0, 1, 2, 4, 5, 7, 8, 9, 10, 14, 16, 18, 19, 20, 25, 28))
d23 = set((0, 1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18))
d19 = set((0, 1, 4, 5, 6, 7, 9, 11, 16, 17))
d17 = set((0, 1, 2, 4, 8, 9, 13, 15, 16))
d11 = set((0, 1, 3, 4, 5, 9))

FACTOR = 63*25*11*17*23*31*19

def is_square1(n):
    if n % 64 not in d64:
        return False
    if n % 63 not in d63:
        return False
    if n % 25 not in d25:
        return False
    if n % 31 not in d31:
        return False
    if n % 23 not in d23:
        return False
    if n % 19 not in d19:
        return False
    if n % 17 not in d17:
        return False
    if n % 11 not in d11:
        return False
    #return isqrt(n) ** 2 == n
    return int(math.sqrt(n))**2 == n

def is_square2(n):
    m = n % 128
    if m*0x8bc40d7d & m*0xa1e2f5d1 & 0x14020a:
        return False
    largeMod = n % FACTOR
    m = largeMod % 63
    if m*0x3d491df7 & m*0xc824a9f9 & 0x10f14008:
        return False
    m = largeMod % 25
    if m*0x1929fc1b & m*0x4c9ea3b2 & 0x51001005:
        return False
    m = 0xd10d829a * (largeMod % 31)
    if m & (m+0x672a5354) & 0x21025115:
        return False
    m = largeMod % 23
    if m*0x7bd28629 & m*0xe7180889 & 0xf8300:
        return False
    m = largeMod % 19
    if m*0x1b8bead3 & m*0x4d75a124 & 0x4280082b:
        return False
    m = largeMod % 17
    if m*0x6736f323 & m*0x9b1d499 & 0xc0000300:
        return False
    m = largeMod % 11
    if m*0xabf1a3a7 & m*0x2612bf93 & 0x45854000:
        return False
    #return isqrt(n) ** 2 == n
    return int(math.sqrt(n))**2 == n

def is_square3(n):
    h = n & 0xf
    if h > 9: return False

    if (h != 2) or (h != 3) or (h != 5) or (h != 6) or (h != 7) or (h != 8):
        return int(math.sqrt(n))**2 == n

def is_square4(n):
    return int(math.sqrt(n))**2 == n

#
if __name__ == "__main__":

    N = 10**7

    t = time.clock()
    for i in range(2, N+1):
        r = is_square1(i)
    print (time.clock() - t)

    t = time.clock()
    for i in range(2, N+1):
        r = is_square2(i)
    print (time.clock() - t)

    t = time.clock()
    for i in range(2, N+1):
        r = is_square3(i)
    print (time.clock() - t)

    t = time.clock()
    for i in range(2, N+1):
        r = is_square4(i)
    print (time.clock() - t)

> Timings (Standard Python):
3.6116340733525862
6.418930000824364
9.941192323010673
11.628690812502668

> Timings (Pypy):
0.3549592833256139
2.8981952367599213
0.24083208547439217
0.14977518674845003

As you can see it seems to matter with which interpreter you run your script. Feel free to correct me if any of that source code is wrong.

programmingpraxis said

August 18, 2017 at 6:21 PM

To make the function more general purpose, it is useful to handle negative n properly (it should return false) and n = 0 (it should return 0).

(define (square? n)
  (if (negative? n) #f (if (zero? n) 0
    (let ((m (modulo n 128)))
      (if (positive? (bitwise-and (* m #x8bc40d7d) (* m #xa1e2f5d1) #x14020a)) #f
        (let ((large-mod (modulo n 3989930175))) ; (* 63 25 11 17 19 23 31)
          (and (let ((m (modulo large-mod 63)))
                 (zero? (bitwise-and (* m #x3d491df7) (* m #xc824a9f9) #x10f14008)))
               (let ((m (modulo large-mod 25)))
                 (zero? (bitwise-and (* m #x1929fc1b) (* m #x4c9ea3b2) #x51001005)))
               (let ((m (* #xd10d829a (modulo large-mod 31))))
                 (zero? (bitwise-and m (+ m #x672a5354) #x21025115)))
               (let ((m (modulo large-mod 23)))
                 (zero? (bitwise-and (* m #x7bd28629) (* m #xe7180889) #xf8300)))
               (let ((m (modulo large-mod 19)))
                 (zero? (bitwise-and (* m #x1b8bead3) (* m #x4d75a124) #x4280082b)))
               (let ((m (modulo large-mod 17)))
                 (zero? (bitwise-and (* m #x6736f323) (* m #x9b1d499) #xc0000300)))
               (let ((m (modulo large-mod 11)))
                 (zero? (bitwise-and (* m #xabf1a3a7) (* m #x2612bf93) #x45854000)))
               (let ((root (isqrt n))) (if (= (* root root) n) root #f)))))))))

Programming Praxis