String Search: Boyer-Moore

August 28, 2009

The two previous exercises discussed the brute-force and Knuth-Morris-Pratt algorithms for searching strings. Today we discuss the Boyer-Moore string search algorithm, invented by Bob Boyer and J Strother Moore in 1977, in a variant devised by Nigel Horspool.

The Boyer-Moore algorithm is a “backwards” version of the Knuth-Morris-Pratt algorithm. It looks at the last character of the pattern first and works its way right-to-left until it finds a mismatch, at which point it slides the pattern right along the search string by a skip size based on the current character of the search string.

Consider the pattern ABABAC, the same pattern used in the prior exercise. The skip array is:

```
A    1
B    2
C    0
else 6
```
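The table can be derived mechanically: each character maps to the distance between its rightmost occurrence and the end of the pattern, and any character not in the pattern maps to the pattern length. The following is a minimal sketch in Python; the function name `skip_table` and the use of a dictionary with a default are illustrative choices, not part of the original exercise.

```python
def skip_table(pattern):
    """Map each pattern character to its distance from the end of the pattern.

    Later occurrences overwrite earlier ones, so the rightmost occurrence wins.
    Characters absent from the pattern are not stored; callers are assumed to
    fall back to len(pattern) for them.
    """
    last = len(pattern) - 1
    return {ch: last - i for i, ch in enumerate(pattern)}


table = skip_table("ABABAC")
print(table)                            # {'A': 1, 'B': 2, 'C': 0}
print(table.get("X", len("ABABAC")))    # 6
```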

If the current character of the search string isn't in the pattern, you can skip all the way past the current pattern. If the current character of the search string is C, the last character of the pattern, the pattern doesn't move, and the comparison shifts to the next character to the left. If the current character of the search string is A, the next-to-last character of the pattern, slide the pattern one character to the right and restart at the end of the pattern. And if the current character of the search string is B, whose rightmost occurrence is the third-to-last character of the pattern, slide the pattern two characters to the right and restart at the end of the pattern.

Your task is to write a function that performs string searching using the Horspool variant of the Boyer-Moore algorithm. When you are finished, you are welcome to read or run a suggested solution, or to post your solution or discuss the exercise in the comments below.
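For comparison, here is a minimal sketch of a Horspool-style search in Python. It is not the suggested solution, and it makes one assumption that differs from the skip array shown above: following the usual textbook formulation, the shift table is built from every pattern character except the last one, so for ABABAC the character C falls through to the default of 6 rather than 0. A C under the end of the pattern still matches and moves the comparison one character to the left, but after a mismatch further left the pattern always advances by at least one position. The names `horspool_search`, `shift`, and `end` are illustrative.

```python
def horspool_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or None."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Shift table over pattern[:-1]; the rightmost occurrence of each character wins.
    # Assumption: the last pattern character is excluded so every shift is >= 1.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    end = m - 1                      # text index aligned with the pattern's last character
    while end < n:
        i, j = end, m - 1
        # Compare right-to-left until a mismatch or the whole pattern has matched.
        while j >= 0 and text[i] == pattern[j]:
            i -= 1
            j -= 1
        if j < 0:
            return i + 1             # match starts one past the last index examined
        # Slide right according to the text character under the pattern's end.
        end += shift.get(text[end], m)
    return None


print(horspool_search("ABABABABAC", "ABABAC"))   # 4
print(horspool_search("ABABABAB", "ABABAC"))     # None
```

Using the text character under the end of the pattern to pick the shift, whatever position the mismatch occurred at, is what keeps the table to a single lookup per alignment.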
Posted by programmingpraxis. Filed in Exercises.
4 Responses to “String Search: Boyer-Moore”

Connochaetes said August 28, 2009 at 1:32 PM

```
Maxchar = 256

def preprocess(pattern)
  length = pattern.length
  skip = Array.new(Maxchar, length)
  index = -1
  pattern.each_byte do |b|
    skip[b] = length - (index += 1) - 1
  end
  skip
end

def horspool_search(str, pattern)
  m, n = pattern.length, str.length
  skip = preprocess(pattern)
  k, j = m-1, m-1
  while k < n
    i = k
    while j>=0 and str[i] == pattern[j]
      i -=1; j -= 1
    end
    return i+1 if j == -1
    k += skip[str[k]]
  end
  nil
end
```

Programming Praxis – String Search: Boyer-Moore « Bonsai Code said August 29, 2009 at 8:18 PM

[…] Praxis – String Search: Boyer-Moore By Remco Niemeijer In yesterday’s Programming Praxis problem we have to implement a more efficient string search algorithm than the […]

Remco Niemeijer said August 29, 2009 at 8:18 PM

My Haskell solution (see http://bonsaicode.wordpress.com/2009/08/29/programming-praxis-string-search-boyer-moore/ for a version with comments):

```
import Data.Map (findWithDefault, fromList, (!))

horspool :: Ord a => [a] -> Maybe Int -> [a] -> Maybe Int
horspool pat skip xs = f (lp - 1 + maybe 0 id skip) p' where
    (lp, lxs, p') = (length pat, length xs, reverse pat)
    t = fromList $ zip pat [lp - 1, lp - 2..]
    m = fromList $ zip [0..] xs
    f n []     = Just (n + 1)
    f n (p:ps) | n >= lxs   = Nothing
               | p == m ! n = f (n - 1) ps
               | otherwise  = f (n + findWithDefault lp (m ! n) t) p'
```

Vikas Tandi said May 9, 2011 at 6:28 AM

Here is my code in C

```
#include <limits.h>

/* Boyer–Moore–Horspool algorithm */
int string_index_3(char *s, int str_size, char *p, int pattern_size)
{
    int i, j;
    int bad_char_shift[UCHAR_MAX+1];

    /* sanity check */
    if(!s || !p || pattern_size <= 0)
        return -1;

    /* initialize shift table */
    for(i = 0; i <= UCHAR_MAX; i++)
        bad_char_shift[i] = pattern_size;

    /* build shift table */
    for(i = pattern_size-2; i >= 0; i--)
        if(bad_char_shift[p[i]] == pattern_size)
            bad_char_shift[p[i]] = pattern_size - i - 1;

    /* search pattern */
    for(i = 0; i < str_size;)
    {
        /* check the substring from right to left */
        for(j = pattern_size -1; p[j] == s[i+j]; j--)
            if(j == 0)
                return i+1;

        /* move right using skip table */
        i += bad_char_shift[s[i+pattern_size-1]];
    }
    return -1;
}
```