Problem Statement:

Your goal in this assignment is to implement a fast sorting routine for large files containing 32-bit floating-point numbers. Your program should run on Linux. You should submit a file called supersort.cpp, which will be compiled with g++ using suppport for threads, since the machine that we will be using for benchmarking has a dual-core pentium processor with a 500GB disk (you should assume the machine has much less memory than the size of the input files, e.g., 128MB of RAM). The file to be sorted consists of a set of 32-bit floating-point numbers in binary format.

Your supersort will be tested on a variety of inputs. The fastest code in the class that is able to sort all the inputs correctly will receive 100 points. Other codes will be given fractional grading based on its relative speed to the fastest code, and the among of correctly sorted output.

Here is how your code will be called: supersort inputfile sortedoutputfile
Here is some reading material that might be useful in developing your sorting code: The following websites have interesting advice (and code) on in-core sorting:
  1. (http://codercorner.com/RadixSortRevisited.htm)
  2. (http://www.stereopsis.com/radix.html)


Our Solution: ( Siddharth was my teammate) :

Any external memory sorting program has to have 2 components:
  1. Creating Sorted Runs
  2. Merging Sorted Runs

The key is to be able to decide how to parallelize to obtain the maximum speedup on the dual core processor. In our solution we consider these 2 components independently and parallelize each part as much as possible. Here i briefly describe some of the design decisions in each component.

  1. Creating Sorted Runs (Salient Points):
    1. Decide optimal buffer size (we have only 128 mb available in all).
    2. Decide internal sort algorithm.(we used radix sort)
    3. Implement a buffer manager to take care of buffer allocations and freeing (this is necessary to efficiently manage memory across multiple threads)
    4. A single thread for reading input file and filling up the free buffers
    5. Multiple sorter threads for sorting the filled buffers
  2. Merging Sorted Runs (Salient Points):
    1. K Way Merge using a Heap.
    2. Dynamically Setting size of input Buffer and output buffer Sizes to minimize disk I/O (Refer: Knuth- Vol. 2 Sorting and Searching)
    3. A single pass through all files.
    4. K way buffering of input files. We mirror the input buffers (use a buffering factor ) to prevent the merger thread from waiting on disk-I/O
    Essentially, we use a 2 threaded design for the merge:
    1. Reader Thread : responsible for filling up input buffers.
    2. Merge /Write Thread : responsible for Merging data in Input buffers and writing them to output.
Results:
We timed 23 min 50 secs to sort 3 files of sizes 1GB, 2GB and 4GB respectively to come in 3rd, just shy of second place by about 30 secs.