Longest Duplicated Substring
December 14, 2010
In a previous exercise, we looked at the problem of finding the longest palindrome in a string. In today’s exercise, we look at a similar problem, finding the longest duplicated substring in a string.
The algorithm for this task is well known. First build a suffix list that has all the substring suffices of the input string. For instance, if the input is “banana”, the suffix list is “a”, “na”, “ana”, “nana”, “anana”, “banana”. Then sort the suffix list, bringing suffices with common beginnings together. Finally, scan the sorted suffix list, computing the common lengths of adjacent pairs, and report the longest.
Your task is to write a program to find the longest duplicated substring within a string. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
[…] today’s Programming Praxis exercise, our task is to implement the algorithm to find the longest duplicated […]
My Haskell solution (see http://bonsaicode.wordpress.com/2010/12/14/programming-praxis-longest-duplicated-substring/ for a version with comments):
Mine in F#
Correction,
need solution in Core Java
My Python version. It turns out to be very similar to Remco’s Haskell code. The combination of starmap and pairwise corresponds to “mapAdjacent” or and the ‘suffices’ generator expression corresponds to “tail”.
a little bit late:
my commented solution in OCaml (http://www.gleocadie.net/?p=543&lang=en) using OCaml Batteries