Fenderbender’s Square Reckoner

November 29, 2016

In many of our programs involving prime numbers and number theory, we need to be able to determine if a number n is a perfect square. One way to do that is to determine the integer square root of the number, using Newton’s method, then multiply to determine if the original number is a square. But that’s slow. In a previous exercise, we used a method devised by Henri Cohen to calculate the quadratic residues of n to various moduli, which can quickly determine that some n cannot be perfect squares.

Over at Mersenne Forum, fenderbender extends Cohen’s idea to make a ridiculously fast square predicate: he precalculates multiple moduli to reduce the operation from big integers to 32-bit integers, chooses the moduli after extensive testing, and tests the quadratic residues using a 64-bit bloom filter. The result is impressive. Where Cohen eliminates the expensive square root calculation in 99% of cases, fenderbender eliminates the expensive square root calculation in 99.92% of cases, and does it faster than Cohen. Go read fenderbender‘s explanation to see a beautiful combination of number theory, wonky programming, and sheer artistry.

Your task is to implement fenderbender‘s square predicate. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Advertisements

Pages: 1 2

9 Responses to “Fenderbender’s Square Reckoner”

  1. Paul said

    @programmingpraxis: I converted the code to Python. It works about twice as fast compared to a method where you just use the 2 last digits for a first screening. Maybe I found an example where the method does not work. 152415787532388367504942236884722755800955129 is a square, but fenderbender’s method tells me not; it fails on the test using mod19. Could you check with your code?

  2. programmingpraxis said

    @Paul: Works fine for me: http://ideone.com/sqY3zf.

  3. Paul said

    @programmingparaxis: thanks. Apparently some checking to do.

  4. Ernie said

    Does anyone know why fenderbender chose mod x with x = 128? I have tried other x. By trial and error, 5! (= 120) gives a rejection rate of 85% and 6! a rejection rate of 93.3%.
    Would either of these be more appropriate?

  5. programmingpraxis said

    @Ernie: I suspect it is due to the computational efficiency of bit-shifting rather than performing a big-integer modulo.

  6. Paul said

    An interesting method. In Python it seems to be more effective to use sets i.s.o. Bloom filters. The version with sets is about a factor of 1.7 times faster. For the sets version I used mod64 i.s.o. mod128, because I could not find the quadratic residues for mod128.

    FACTOR = 63*25*11*17*23*31*19
    
    def is_square(n):
        m = n % 128
        if m*0x8bc40d7d & m*0xa1e2f5d1 & 0x14020a:
            return False
        largeMod = n % FACTOR
        m = largeMod % 63
        if m*0x3d491df7 & m*0xc824a9f9 & 0x10f14008:
            return False
        m = largeMod % 25
        if m*0x1929fc1b & m*0x4c9ea3b2 & 0x51001005:
            return False
        m = 0xd10d829a * (largeMod % 31)
        if m & (m+0x672a5354) & 0x21025115:
            return False
        m = largeMod % 23
        if m*0x7bd28629 & m*0xe7180889 & 0xf8300:
            return False
        m = largeMod % 19
        if m*0x1b8bead3 & m*0x4d75a124 & 0x4280082b:
            return False
        m = largeMod % 17
        if m*0x6736f323 & m*0x9b1d499 & 0xc0000300:
            return False
        m = largeMod % 11
        if m*0xabf1a3a7 & m*0x2612bf93 & 0x45854000:
            return False
        return isqrt(n) ** 2 == n
    
    d64 = set((0, 1, 4, 9, 16, 17, 25, 33, 36, 41, 49, 57))
    d63 = set((0, 1, 4, 7, 9, 16, 18, 22, 25, 28, 36, 37, 43, 46, 49, 58))
    d25 = set((0, 1, 4, 6, 9, 11, 14, 16, 19, 21, 24))
    d31 = set((0, 1, 2, 4, 5, 7, 8, 9, 10, 14, 16, 18, 19, 20, 25, 28))
    d23 = set((0, 1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18))
    d19 = set((0, 1, 4, 5, 6, 7, 9, 11, 16, 17))
    d17 = set((0, 1, 2, 4, 8, 9, 13, 15, 16))
    d11 = set((0, 1, 3, 4, 5, 9))
    
    def is_square2(n):
        if n % 64 not in d64:
            return False
        if n % 63 not in d63:
            return False
        if n % 25 not in d25:
            return False
        if n % 31 not in d31:
            return False
        if n % 23 not in d23:
            return False
        if n % 19 not in d19:
            return False
        if n % 17 not in d17:
            return False
        if n % 11 not in d11:
            return False
        return isqrt(n) ** 2 == n
    
  7. programmingpraxis said

    @Paul: It’s easy to compute the quadratic residues of a modulus, let’s say n = 11. For each number from 0 to n−1, square the number and take it mod n; thus, (0*0)%11=0, (1*1)%11=1, . . . , (5*5)%11=3, . . . , (10*10)%11=1. Those numbers are the quadratic residues of 11: 0, 1, 3, 4, 5, and 9. For the quadratic residues of 128 I get

    > (unique =
        (sort <
          (map (lambda (n) (modulo (* n n) 128))
               (range 128))))
    (0 1 4 9 16 17 25 33 36 41 49 57 64 65 68 73 81 89 97 100
     105 113 121)
  8. Paul said

    @programmingpraxis: thanks, I found out in the meantime (and get the same numbers as you). Using mod128 did not give significant different results from using mod64. I am still a little working on the problem, what would be the best mod’s to use.

  9. Milbrae said

    I’ve implemented 4 square finding routines and timed them.

    import time
    import math
    
    d64 = set((0, 1, 4, 9, 16, 17, 25, 33, 36, 41, 49, 57))
    d63 = set((0, 1, 4, 7, 9, 16, 18, 22, 25, 28, 36, 37, 43, 46, 49, 58))
    d25 = set((0, 1, 4, 6, 9, 11, 14, 16, 19, 21, 24))
    d31 = set((0, 1, 2, 4, 5, 7, 8, 9, 10, 14, 16, 18, 19, 20, 25, 28))
    d23 = set((0, 1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18))
    d19 = set((0, 1, 4, 5, 6, 7, 9, 11, 16, 17))
    d17 = set((0, 1, 2, 4, 8, 9, 13, 15, 16))
    d11 = set((0, 1, 3, 4, 5, 9))
    
    FACTOR = 63*25*11*17*23*31*19
    
    def is_square1(n):
        if n % 64 not in d64:
            return False
        if n % 63 not in d63:
            return False
        if n % 25 not in d25:
            return False
        if n % 31 not in d31:
            return False
        if n % 23 not in d23:
            return False
        if n % 19 not in d19:
            return False
        if n % 17 not in d17:
            return False
        if n % 11 not in d11:
            return False
        #return isqrt(n) ** 2 == n
        return int(math.sqrt(n))**2 == n
    
    def is_square2(n):
        m = n % 128
        if m*0x8bc40d7d & m*0xa1e2f5d1 & 0x14020a:
            return False
        largeMod = n % FACTOR
        m = largeMod % 63
        if m*0x3d491df7 & m*0xc824a9f9 & 0x10f14008:
            return False
        m = largeMod % 25
        if m*0x1929fc1b & m*0x4c9ea3b2 & 0x51001005:
            return False
        m = 0xd10d829a * (largeMod % 31)
        if m & (m+0x672a5354) & 0x21025115:
            return False
        m = largeMod % 23
        if m*0x7bd28629 & m*0xe7180889 & 0xf8300:
            return False
        m = largeMod % 19
        if m*0x1b8bead3 & m*0x4d75a124 & 0x4280082b:
            return False
        m = largeMod % 17
        if m*0x6736f323 & m*0x9b1d499 & 0xc0000300:
            return False
        m = largeMod % 11
        if m*0xabf1a3a7 & m*0x2612bf93 & 0x45854000:
            return False
        #return isqrt(n) ** 2 == n
        return int(math.sqrt(n))**2 == n
    
    def is_square3(n):
        h = n & 0xf
        if h > 9: return False
    
        if (h != 2) or (h != 3) or (h != 5) or (h != 6) or (h != 7) or (h != 8):
            return int(math.sqrt(n))**2 == n
    
    def is_square4(n):
        return int(math.sqrt(n))**2 == n
    
    #
    if __name__ == "__main__":
    
        N = 10**7
    
        t = time.clock()
        for i in range(2, N+1):
            r = is_square1(i)
        print (time.clock() - t)
    
        t = time.clock()
        for i in range(2, N+1):
            r = is_square2(i)
        print (time.clock() - t)
    
        t = time.clock()
        for i in range(2, N+1):
            r = is_square3(i)
        print (time.clock() - t)
    
        t = time.clock()
        for i in range(2, N+1):
            r = is_square4(i)
        print (time.clock() - t)
    

    > Timings (Standard Python):
    3.6116340733525862
    6.418930000824364
    9.941192323010673
    11.628690812502668

    > Timings (Pypy):
    0.3549592833256139
    2.8981952367599213
    0.24083208547439217
    0.14977518674845003

    As you can see it seems to matter with which interpreter you run your script. Feel free to correct me if any of that source code is wrong.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: