External Sorting

November 20, 2015

Sorting the lines of a file that fits in memory is a simple matter of loading the file and calling the sort function. When the file doesn’t fit in memory, however, it must be sorted in pieces, then combined into a single output, which is harder.

Your task is to write a program that sorts the lines of a file that is too large to fit into memory into ascending ascii order. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Advertisement

Pages: 1 2

2 Responses to “External Sorting”

  1. matthew said

    Don’t have time to do actual implementation, but could use memory mapped file with suitable cache-oblivious sorting algorithm (https://en.wikipedia.org/wiki/Funnelsort seems to be state of the art, or at least, some kind of n-way mergesort).

  2. John Cowan said

    See Knuth on replacement selection, where you notice if the incoming line happens to be greater than the last line output to the current temp file, and if so write it out too. This is a huge win on almost-sorted files, a very common case.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: