Maximum Sum Subsequence

December 3, 2010

We’ll categorize today’s exercise as part of our on-going series of interview questions, but it’s really more than that:

Given a sequence of integers, both positive and negative, find the contiguous subsequence with the maximum sum. For instance, given the sequence 31, -41, 59, 26, -53, 58, 97, -93, -23, 84, the maximum sum subsequence is 59, 26, -53, 58, 97, which sums to 187.

Your task is to write a series of functions that takes a sequence (use either a list or an array, whichever is convenient for your programming language) and returns the sum of the maximum sum subsequence; you should find solutions with time complexities of O(n3), O(n2), O(n log n), and O(n). When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

15 Responses to “Maximum Sum Subsequence”

  1. […] maximale — tel qu’exposé dans Programming Pearls de Jon Bentley et dans un exercice de Programming Praxis — consiste à trouver, parmi les sommes de tous les segments (sous-séquences continues) […]

  2. […] Praxis – Maximum Sum Subsequence By Remco Niemeijer In today’s Programming Praxis exercise, we have to implement four different ways of solving the problem of […]

  3. My Haskell solution (see http://bonsaicode.wordpress.com/2010/12/03/programming-praxis-maximum-sum-subsequence/ for a version with comments):

    import Data.List
    
    maxSum1 :: (Ord a, Num a) => [a] -> a
    maxSum1 xs = maximum . map sum $ inits =<< tails xs
    
    maxSum2 :: (Ord a, Num a) => [a] -> a
    maxSum2 xs = maximum $ scanl (+) 0 =<< tails xs
    
    maxSum3 :: (Ord a, Num a) => [a] -> a
    maxSum3 = fst . head . until (null . tail) f . map (\x -> (x, [x])) where
        f ((lm, l):(rm, r):xs) = (maximum [maximum (scanl (+) 0 r) +
            maximum (scanr (+) 0 l), lm, rm], l ++ r) : f xs
        f xs = xs
    
    maxSum4 :: (Ord a, Num a) => [a] -> a
    maxSum4 = snd . foldl (\(here, sofar) x -> let m = max 0 $ here + x 
                                               in (m, max m sofar)) (0, 0)
    
  4. www said

    “if he isn’t at least interested enough in his profession to read Bentley, he isn’t interested enough in his profession to work for me”

    Why not just make that the opening criteria, then? Ask if they’ve read the book, ask how many chapters are there, ask when it was published. If the candidate doesn’t know these, stop the interview immediately.

    Are there any books (or combination of books) that are acceptable substitutes? What if the candidate read Code Complete three times? Does reading Norvig through override the necessity to have read Bentley?

  5. programmingpraxis said

    I seem to have struck a nerve; Remco had much the same comment on his blog.

    I just did a Google search for maximum sum subsequence bentley; there are 4500 results. There are several references to Bentley’s original book, of course. There are articles on Springer, ACM, IEEE, and Elsevier referring to Bentley’s book. There are course materials at multiple colleges (I stopped counting after half a dozen) that refer to Bentley’s book. TopCoder and UVA give the exercise. Several blogs mention Bentley’s book, including LtU. I conclude that the O(n) algorithm is part of the computing folklore.

    It is not unreasonable to expect a candidate to at least know of the algorithm, even if he can’t recall it exactly. A single such failing wouldn’t knock out a candidate, but it would make me ask another question from the computing folklore. Multiple such failures likely would knock out a candidate.

    Actually, I am more interested to find out what projects a candidate has worked on, what role he played in those projects, and how does that previous work match up to the job for which I am hiring. These “interview questions” play a much smaller role. But I still think a candidate not versed in the programming folklore will be less productive than one who is, and thus less suitable for hiring.

  6. Aaron said

    It was pretty obvious that there was a linear solution, so why would I waste the time on the other solutions, except for intellectual (academic) exercise? Why would I waste my time memorizing obscure algorithms when I know I can figure out the solution on my own or in a reference? Doesn’t seem like a good way to interview, unless you’re looking for Ph.d candidates. As Einstein said, “Never memorize what you can look up in books.”

  7. Carl said

    Oh, dear, Shaun G. That’s a bit clumsy and slow. Try this instead:


    $seq = array( 31, -41, 59, 26, -53, 58, 97, -93, -23, 84 );

    foreach( $seq as $num )
    {
    $current += $num;
    $highest = ( $highest < $current ? $current : $highest );
    $current = ( $current < 0 ? 0 : $current );
    }

  8. Mayson Lancaster said

    Your statement of the problem varies subtly, and somewhat ambiguously, from Bentley’s: his problem states that one of the valid subsequences is the empty sequence, which cleanly disposes of the edge case where all entries are negative.

  9. programmingpraxis said

    My comment about hiring caused quite a stir at Hacker News. As the good folks there pointed out, I may have spoken un-artfully, but there is an important point here that is being missed. I bet that every civil engineer, if asked at an interview, could explain the basic design of Galloping Gertie and the flaws that caused it to fail. I bet that every architect, if asked at an interview, could explain the basic design of the Twin Towers, and discuss why it was less resilient to heavy lateral impacts than other skyscrapers. I bet that every journalist, if asked at an interview, could quote some of the advice of Strunk and White.

    As a broad statement, certainly not always true, computer programmers seem to be less knowledgeable of their professional culture than other professions. There is no equivalent to Architectural Digest or Journal of Accountancy or Morbidity and Mortality for programmers, so programmers have to work harder to keep up with their profession. If you can demonstrate that at an interview, you have a leg up on the candidate who can’t.

    Do you have to know the linear solution to the maximum sum subsequence problem to get hired? Of course not. But if you know it, and the other guy doesn’t, then all other things being equal, you get hired, not him.

    I should also have mentioned that there must be some context to the programming questions being asked at an interview. There are probably better questions to ask a PHP web developer.

  10. Kevin said

    I don’t know if this is right but for some reason I cannot think of any other way to do this except linearly.
    I figured that I could generate all subsets and check their sums via a recursive function but I couldn’t figure out how to do that haha.
    As for the O(n*log n) algorithm…I couldn’t even think of a way to do that.

    using namespace std;
    
    int linear(vector<int> seq)
    {
    	int current = seq[0],
    		prev = seq[0],
    		max = seq[0];
    
    	for(int i=1; i<seq.size(); i++)
    	{
    		prev = current;
    		current += seq[i];
    		if(current < 0)
    			current = 0;
    		if(prev > max)
    			max = prev;
    		else if(current > max)
    			max = current;
    	}
    	return max;
    }
    
  11. F. Carr said

    AI have enjoyed reading the various problems discussed here at programmingpraxis. But a blogger who cannot see any way “forward” — except to re-entrench their ingrained biases and errors — after being caught out making such a ridiculous overstatement? There are many blogs where readers can fruitfully invest their time: multiple such failures on the part of one particular blogger, unable to learn a valuable lesson, are likely to knock them out of the marketplace of ideas.

  12. While I do have Bentley’s book on my shelf, it’s been long enough that I didn’t remember this specific example (perhaps that is a good indication that I should reread it again). I’d certainly expect that any reasonable programmer could come up with an implementation that worked: my own version was the obvious cubic time algorithm. A moment’s further thought reduced that a quadratic algorithm by changing storing the array of sums, allowing you to calculate the total sums between any two points as the difference between two values from the sum table. I’d expect any good programmer to be able to come up with that. Bentley’s linear time algorithm is actually way subtle, and it isn’t clear to me that it actually answers the question you asked: you asked to identify the sequence with maximal sum. All of your code seems to return the maximal sum. Furthermore, you forgot the stipulation that Bentley makes: that when all the entries are negative, you should return the maximal sum of 0. The maximal sum is well defined even if all the entries are negative: the obvious quadratic algorithms will return the proper answer, Bentley’s linear code does not, and gains some traction in simplicity from this stipulation.

  13. Jebb said

    Slightly disappointed nobody posted a Python solution, I was hoping I could find a more elegant example than what I came up with (verbose as I’ve made it, I might as well have written it in C).

    I couldn’t come up with the O(n log n) solution, even though I correctly figured out it had to be some sort of recursive, divide-and-conquer approach…

    def cubic(L):
    	'''Cubic solution: two loops, and the summing of the list
    	on top'''
    	if L == None:
    		return None
    	max_subsum = L[0]
    	max_subset = L[:1]
    	for j in range(1, len(L)):
    		for i in range(j):
    			s = sum(L[i:j])
    			if s > max_subsum:
    				max_subsum = s
    				max_subset = L[i:j]
    	return (max_subsum, max_subset)
    
    def quadratic(L):
    	'''Avoid using the O(n) 'sum' function within the double loop,
    	use a single addition instead'''
    	if L == None:
    		return None
    	max_subsum = L[0]
    	max_subset = L[:1]
    	for i in range(len(L)):
    		subsum = 0
    		for j in range(i,len(L)):
    			subsum += L[j]
    			if subsum > max_subsum:
    				max_subsum = subsum
    				max_subset = L[i:j+1]
    	return (max_subsum, max_subset)
    
    def linear(L):
    	'''The problem can also be seen as looking for a minimum and a maximum
    	of the function f(n)=sum(L[0:n]), with the minimum occuring at a lower
    	index than the maximum; we want to maximise f(n+m) - f(n)'''
    	if L == None:
    		return None
    	list_sum = 0
    	relevant_min, relevant_max = 0, 0
    	relevant_min_index, relevant_max_index = 0, 0
    	current_min, current_max = relevant_min, relevant_max
    	current_min_index, current_max_index = relevant_min_index, relevant_max_index
    	for i in range(len(L)):
    		list_sum += L[i]
    		if list_sum > current_max:
    			current_max = list_sum
    			current_max_index = i
    		if list_sum < current_min:
    			if relevant_max - relevant_min < current_max - current_min:
    				# Found a new best subset
    				relevant_max, relevant_min = current_max, current_min
    				relevant_max_index, relevant_min_index = current_max_index, current_min_index
    			# Now start looking for the next subset
    			current_min, current_max = list_sum, list_sum
    			current_max_index, current_min_index = i, i
    	# Now in case we finished the loop while the last subset was also the relevant one
    	if relevant_max - relevant_min < current_max - current_min:
    		relevant_max, relevant_min = current_max, current_min
    		relevant_max_index, relevant_min_index = current_max_index, current_min_index
    	return (relevant_max - relevant_min, L[relevant_min_index + 1: relevant_max_index + 1])
    
  14. […] Here is my attempt at Programming Praxis’ Maximum Sum Subsequence problem […]

Leave a comment