Due: This lab is due before your next lab begins
In this lab you will investigate the performance of various sorting algorithms on several datasets, evaluate them using Big-O notation, and produce tables and graphs of actual measured execution times. You will be using the book’s template-based implementation of the various sorting algorithms, however all arrays that you sort will contain double precision floating point numbers.
Code Starting Point
• Starting Code (gzipped tar file)
This tar file has implementations of the five sorting algorithms we have studied so far: selection sort, insertion sort, bubble sort, mergesort, and quicksort. (Note that the mergeSort implementation here generalizes the book’s implementation in a small, but very important way by dynamically allocating the temporary array needed for the merge operation. Your code will therefore call: mergeSort(anArray, n) instead of mergeSort(anArray, 0, n-1) as is shown in the book.)
Building Your Sort Algorithm Tester
Your main program will receive from the command line (via argv) the size of an array to be generated and sorted, the initial order of data, and the algorithm to be used to perform the sort:
lab7 <data_size> <random|ascending|descending> <selection|insertion|bubble|merge|quick>
lab7 50000 random insertion
Your main program will then pass this information into your Executive class. Your Executive class will do a #include for each of the five files in the given “Starting Code” tar file mentioned above. It will then do the rest of the work described here.
Generating an array to be sorted
Given the data size and the data requirement, you are to dynamically allocate an array of double of the specified size and populate it with the appropriate data. For “ascending”, set array[i] = 0.001*static_cast<double>(i); for “descending”, set array[i] = 0.001*static_cast<double>(SIZE – i – 1), where SIZE is the size of the array. Finally, for “random”, set each element of the array to a unique random number, x, in the range 0.0 ≤ x ≤ 100000.0.
When populating your array with random numbers, I suggest you use the drand48 function defined in <stdlib.h>. You should then also call srand48 with a constant at the start of your program to make sure you get the same sequence of random numbers each time you execute the program.
Performing and timing the sort
Call the specified algorithm, passing it the array you generated. Record the time taken to sort the data; your lab GTA will discuss how to do this. Be sure you do not include in this recorded time any I/O operations, time taken to generate the array to be sorted, or any other extraneous operation. Time only the sorting algorithm itself by inserting the appropriate call immediately before and immediately after the call to the sorting algorithm. Format and print a message to the console reporting the size of the array, the sorting algorithm used, the initial array condition (sorted, reverse sorted, or random), and the total time taken to perform the sort.
For each sorting algorithm and initial array condition, get times for at least 8 different array sizes, none of which should be less than 50000. For quicksort and mergesort (as well as bubble sort and insertion sort for the “ascending” case), record times for much larger sizes – at least 500000 and 1000000. Record these results in a spread sheet. Create five sets of three graphs each; each set is for one of the algorithms (bubble, insertion, etc.), and the three graphs within each set are for “random”, “ascending” and “descending”, respectively.
For each of the cases (random, ascending, descending) and each of the sorting algorithms:
1. State whether the times you observed demonstrate O(N), O(N*log(N)), or O(N2) growth rates. Justify your statement; try to be as quantitative as possible.
2. Predict the time required to sort an array of size 10,000,000. Justify your answer.
Grades will be assigned according to the following criteria:
• General construction of your main program, including array generation and interactive prompts: 10%
• Correct use of timing facilities used gather statistics: 10%
• Data gathered on the sorts (completeness, reasonableness, etc.): 25%
• Spreadsheets and graphs: 30%
• Characterizations and justifications for observed growth rates: 15%
• Predictions of time required for array of size 10,000,000: 10%
Create a tarball with your source code in the usual way. Include in the same directory as your source code a report that includes the tables, graphs, and statements mentioned above. Your report must be a PDF file, and it must be named: YourLastName_Report.pdf. We will not accept .doc, .docx, .odf, .txt, or any other format!
Email this tarball to your TA. The email subject line must look like “[EECS 268] SubmissionName” as follows:
[EECS 268] Lab 07
Note that the subject should be exactly like the line above. Do not leave out any of the spaces, or the bracket characters (“[” and “]”). In the body of your email, include your name and student ID.
EECS 268 Laboratory 7