## String Search: Boyer-Moore

### August 28, 2009

The two previous exercises discussed the brute-force and Knuth-Morris-Pratt algoritms for searching strings. Today we discuss the Boyer-Moore string search algorithm, invented by Bob Boyer and J Strother Moore in 1977, in a variant devised by Nigel Horspool.

The Boyer-Moore algorithm is a “backwards” version of the Knuth-Morris-Pratt algorithm. It looks at the last character of the pattern first, working its way right-to-left until it finds a mis-match, when it slides the pattern right along the search string for a skip size based on the current character.

Consider the pattern ABABAC, the same pattern used in the prior exercise. The skip array is:

A    1
B    2
C    0
else 6

If the current character of the search string isn't in the pattern, you can skip all the way past the current pattern. If the current character of the search string is C, the last character of the pattern, the pattern doesn't move, and the comparison shifts to the next character to the left. If the current character of the search string is A, the next-to-last character of the pattern, slide the pattern one character to the right and restart at the end of the pattern. And if the current character of the search string is B, the second-to-last character of the pattern, slide the pattern two characters to the right and restart at the end of the pattern.

Your task is to write a function that performs string searching using the Horspool variant of the Boyer-Moore algorithm. When you are finished, you are welcome to read or run a suggested solution, or to post your solution or discuss the exercise in the comments below.

Pages: 1 2

### 4 Responses to “String Search: Boyer-Moore”

1. Connochaetes said
Maxchar = 256

def preprocess(pattern)
length    = pattern.length
skip 	  = Array.new(Maxchar, length)
index     = -1
pattern.each_byte  do |b|
skip[b] = length - (index += 1) - 1
end
skip
end

def horspool_search(str, pattern)
m, n = pattern.length, str.length
skip = preprocess(pattern)
k, j = m-1, m-1
while k < n
i = k
while j>=0 and str[i] == pattern[j]
i -=1; j -= 1
end
return i+1 if j == -1
k += skip[str[k]]
end
nil
end

2. [...] Praxis – String Search: Boyer-Moore By Remco Niemeijer In yesterday’s Programming Praxis problem we have to implement a more efficient string search algorithm than the [...]

3. Remco Niemeijer said

import Data.Map (findWithDefault, fromList, (!))

horspool :: Ord a => [a] -> Maybe Int -> [a] -> Maybe Int
horspool pat skip xs = f (lp - 1 + maybe 0 id skip) p' where
(lp, lxs, p') = (length pat, length xs, reverse pat)
t = fromList \$ zip pat [lp - 1, lp - 2..]
m = fromList \$ zip [0..] xs
f n []     = Just (n + 1)
f n (p:ps) | n >= lxs   = Nothing
| p == m ! n = f (n - 1) ps
| otherwise  = f (n + findWithDefault lp (m ! n) t) p'

4. Vikas Tandi said

Here is my code in C

#include <limits.h>

/* Boyer–Moore–Horspool algorithm */
int string_index_3(char *s, int str_size, char *p, int pattern_size)
{
int i, j;

/* sanity check */
if(!s || !p || pattern_size <= 0)
return -1;

/* initialize shift table */
for(i = 0; i <= UCHAR_MAX; i++)

/* build shift table */
for(i = pattern_size-2; i >= 0; i--)
bad_char_shift[p[i]] = pattern_size - i - 1;

/* search pattern */
for(i = 0; i < str_size;)
{
/* check the substring from right to left */
for(j = pattern_size -1; p[j] == s[i+j]; j--)
if(j == 0)
return i+1;

/* move right using skip table */