Trailing Zero-Bits
July 7, 2020
The easiest solution is to loop through the trailing bits:
(define (zerobits n)
(let loop ((n n) (b 0))
(if (odd? n) b (loop (/ n 2) (+ b 1)))))
That takes time O(log n) to iterate through the bits of the number. A little trick using some bit-hackery and a lookup table gives us an O(1) solution:
(define (zerobits n)
(vector-ref (vector 32 0 1 26 2 23 27 0 3 16 24 30 28 11 0 13
4 7 17 0 25 22 31 15 29 10 12 6 0 21 14 9 5 20 8 19 18)
(modulo (bitwise-and n (- n)) 37)))
Who thinks up things like this? You can run the program at https://ideone.com/8hx3c7.
Cool little drill. Here is my take on it using Julia 1.4.1: https://pastebin.com/GbYGLYKV
First approach: create the bin string and work with that. Easier for figuring out what’s going on.
Second approach: just solve the task and forget about binary representation. Faster but opaque.
If you are just going to do fixed size integers, then the looping solution is constant time as well. For integers of any length, represented as bignums, I don’t think you can do better than some sort of loop over the the bits of the number.
The array solution is quite cute though (37 presumably is the lowest modulus m such that 2^i mod m is different for i in 0..31).
Python. Simple bit-testing.
def trail(n):
“”” “””
z = 0
while (n & 1) == 0:
z += 1
n >>= 1
return z
n = [18, 48]
for k in n:
print (“%d = %d” % (k, trail(k)))
[/code]
Sample output:
18 = 1
48 = 4[/code]
Again with proper indentation
def trail(n): """ """ z = 0 while (n & 1) == 0: z += 1 n >>= 1 return z n = [18, 48] for k in n: print ("%d = %d" % (k, trail(k)))I’ve been playing with Compiler Explorer lately. So I wrote three implementations in C and explored them there.
Link.
Then I wrapped a test harness around them to see how they performed. On my machine, tz2 and tz3 are about twice as fast as tz, wth a slight edge to tz3. You can see from Compiler Explorer that the compiler unrolled the loops in tz1 and tz3.
#include <assert.h> #include <stdio.h> #include <stdlib.h> #include <sys/time.h> extern unsigned tz(unsigned); extern unsigned tz2(unsigned); extern unsigned tz3(unsigned); unsigned stupid_tz(unsigned num) { for (unsigned i = 0; i < 32; i++) if (num & (1LL << i)) return i; return 32; } void test(unsigned (*tzf)(unsigned), const char *label) { printf("testing %s\n", label); // printf("%s(0) => %u\n", label, (*tzf)(0)); assert((*tzf)(0) == 32); assert((*tzf)(1) == 0); assert((*tzf)(18) == 1); assert((*tzf)(48) == 4); unsigned num = 0; for (int i = 0; i < 10000000; i++) { unsigned expected = stupid_tz(num); unsigned actual = (*tzf)(num); if (actual != expected) { fprintf(stderr, "%#x: expected %u, got %u\n", num, expected, actual); abort(); } num += 1537; } } double dt(struct timeval *before, struct timeval *after) { double dsec = after->tv_sec - before->tv_sec; double dusec = after->tv_usec - before->tv_usec; return dsec + dusec / 1000000.0; } void perf(unsigned (*tzf)(unsigned), const char *label) { (void)tzf; (void)label; struct timeval before, after; unsigned nums[] = {0, 1, 0x10, 0x100, 0x1000, 0x10000, 0x100000, 0x1000000, 0x10000000, 0x80000000}; size_t n_nums = (&nums)[1] - nums; printf("perf %s\n", label); for (size_t i = 0; i < n_nums; i++) { unsigned num = nums[i]; gettimeofday(&before, NULL); for (int i = 0; i < 1000000; i++) (void)(*tzf)(num); gettimeofday(&after, NULL); printf("%s(%#x): %g nsec\n", label, num, 1000 * dt(&before, &after)); } printf("\n"); } int main() { setbuf(stdout, NULL); test(stupid_tz, "stupid_tz"); test(tz, "tz"); test(tz2, "tz2"); test(tz3, "tz3"); perf(tz, "tz"); perf(tz2, "tz2"); perf(tz3, "tz3"); return 0; }@kernelbob: how does this compare?
unsigned tz4(unsigned num) { return __builtin_ctz(num); }Or another nice one:
uint32_t tz5(uint32_t n) { n = (n&-n)-1; n = ((n >> 1)&0x55555555) + (n&0x55555555); n = ((n >> 2)&0x33333333) + (n&0x33333333); n = ((n >> 4)&0x0f0f0f0f) + (n&0x0f0f0f0f); n = ((n >> 8)&0x00ff00ff) + (n&0x00ff00ff); n = ((n >> 16) + n) & 0xffff; return n; }The rabbit hole always goes deeper…
After I posted, I found and read Our Gracious Host’s table lookup method, and then matthew asked about the __builtin_ctz function. __builtin_ctz is undefined for input zero, so I wrote two versions.
tz4 calls __builtin_ctz, and can’t be used for zero. (It generates a run-time error with clang’s “-fsanitize=undefined” flag.)
tz4z checks for zero explicitly, then calls __builtin_ctz on nonzero inputs.
tz5 uses table lookup mod 37.
Compiler Explorer.
Test and microbenchmark run. __builtin_ctz, with or without zero handling is fastest; table lookup is faster than any of my solutions.
A python solution:
def trailing_zero_bits(n): return bin(n)[::-1].find("1")Doesn’t work with zero.
Here’s another Haskell version having a few variants of the function. The first three work on arbitrary sized numbers (because popCount does).
Here’s a binary search solution in C.
#include <assert.h> #include <stdint.h> #include <stdio.h> #include <stdlib.h> int ntz(uint64_t x) { assert(x > 0); int n = 0; if (!(x & 0x00000000ffffffff)) {n = 32; x >>= 32;} if (!(x & 0x000000000000ffff)) {n += 16; x >>= 16;} if (!(x & 0x00000000000000ff)) {n += 8; x >>= 8;} if (!(x & 0x000000000000000f)) {n += 4; x >>= 4;} if (!(x & 0x0000000000000003)) {n += 2; x >>= 2;} if (!(x & 0x0000000000000001)) {n += 1;} return n; } int main(int argc, char* argv[]) { assert(argc == 2); uint64_t x = strtoull(argv[1], NULL, 10); printf("%d\n", ntz(x)); return EXIT_SUCCESS; }Example:
function binary(n) { str = ""; pwr = int(log(n)/log(2)); for (i = pwr; i > -1; i--) { flg = ((i == pwr)?1:0); str = flg str; if (flg == 1) { n = n - (2^pwr); pwr = int(log(n)/log(2)) } }; return str } { print $0 " --> " index(binary($0),1)-1 }$ echo 18 | awk -f /tmp/binary.awk
18 –> 1
$ echo 48 | awk -f /tmp/binary.awk
48 –> 4
Simple solution in C. In this the number of trailing zeros for 0 is the number of bits in an int.
int solve(int n){ int ret = 0; while(!(n % 2) && ret < 8 * sizeof(int)){ // If the last bit is 0 n >>= 1; // shift bits to the right ret++; // and increase the return value by 1. } return ret; }function checkTrailingZeros($n){
// Take the number and convert to binary
$binary = decbin($n);
}
Oops, sorry for the bad paste. Will do it the right way next time round