Asymptotic Analysis Project

COT 4400

1 Overview

This project asks you to evaluate three sorting algorithms we have discussed

in class on various inputs and relate their theoretical run time to their empirical (actual) behavior. These three algorithms are InsertionSort, MergeSort,

and QuickSort. In the interests of time, you have been provided with code

that implements all three algorithms and can run them on inputs of varying

sizes and types.

2 Running the provided program

Along with this document, you have been provided with source code and

a Makefile for the various sorting algorithms and input types to test in this

project. This program will run the specified algorithm (InsertionSort, MergeSort, or QuickSort) on a sorted, random, or constant input array of a given

size. I expect that you can use an IDE or the Linux make utility in order

to compile the code. Once compiled, the program should be run from the

command line, as it expects three arguments. If fewer than three arguments

are specified, the program will assume default variables for any unspecified

arguments.

The first of these three arguments represents the size of the input array. The program only accepts sizes between 1 and 1,000,000,000, inclusive

(default 10,000). The second argument represents the sorting algorithm you

wish to run. Valid algorithms include InsertionSort, MergeSort, and QuickSort (default MergeSort), or you can abbreviate the algorithms by their first

letter (‘i,’ ‘m,’, or ‘q’). The last argument represents the type of input to

1

sort. Valid input types are sorted, random, and constant (or ‘s,’ ‘r’, ‘c’; default ‘r’), where ‘random’ is an unsorted array, ‘sorted’ is a sorted array (in

increasing order), and ‘constant’ is an array where every entry is identical.

In order to improve the timing stability, the algorithm runs the requested

sort three times and reports the median of the three timing results to you.

In order to get the most accurate timing results, there should be no other

processes running on the machine at the same time, but this may not always

be possible, especially if you are running the program on a lab machine.

3 Project report

Your project report should be divided into two parts, Results and Analysis.

For the Results section, you will need to prepare a data file describing the

performance of your algorithms, and in the Analysis portion, you will need

to prepare a table describing their complexity, as well as a response to these

results.

3.1 Results

Run each of the three sorting algorithms on constant, sorted, and random

arrays that are powers of 10. For each of the twelve cases, you should record

the following:

1. nmin: the smallest power of 10 that takes 20 milliseconds or more to

run;

2. tmin: the time to sort an array of size nmin;

3. nmax: the largest power of 10 that takes 10 minutes (600 seconds) or

less to run, or 1,000,000,000 if no input took more than 10 minutes;

4. tmax: the time required to sort nmax elements.

Note, you may need to ensure that you have sufficient stack space before

testing QuickSort to ensure that you do not run out (“stack overflow”). You

can run ulimit -s unlimited in Linux before executing the algorithm to

increase the available stack space. This parameter can also be changed in

many IDEs on Mac and Windows; consult the appropriate documentation if

this becomes a problem.

2

If you are having trouble getting good timing results (e.g., one power-often gives good timing results, but the next is too long and the previous too

quick), you may try using other input sizes, such as 3.2 · 10x

, 1.8 · 10x

, or

5.6·10x

. You may also use different systems to perform the tests for different

algorithm/input combinations, but obviously, you’ll want all of the runs for a

single experiment to be on the same machine, as a timing ratio from different

machines is virtually meaningless.

You should enter your results into a comma-separated value (CSV) file.

The CSV file should contain 5 columns and 10 rows. Your first column

should label the 9 different experiments, while the first row label the experiment variables (nmin, tmin, nmax, and tmax). Your row labels should include

the algorithm name (InsertionSort, MergeSort, or Quicksort) and input type

(Sorted, Random, or Constant). You may abbreviate these labels as I, M, Q,

and S, R, C. (For example, MS represents your MergeSort result on a sorted

array.) An example table appears below. You may use Excel (or any other

software) to prepare your data.

nmin tmin nmax tmax

IC

IS

IR

MC

MS

MR

QC

QS

QR

3.2 Analysis

In this section, you will estimate the complexity of the four algorithms by

comparing the ratio between tmin and tmax to ratios representing the complexity of the algorithm. Specifically, you should compute f(nmin)/f(nmax)

for f1(n) = n, f2(n) = n ln(n), and f3(n) = n

2

. You should round to the

nearest integer when computing these ratios. Finally, you should label each

experiment according to the ratio tmax/tmin most resembles.

For example, if one of your experiments resulting in nmin = 100 and

nmax = 10, 000, 000, your ratios would be f1(nmax)/f1(nmin) = 100, 000,

f2(nmax)/f2(nmin) = 350, 000, and f3(nmax)/f3(nmin) = 10, 000, 000, 000. You

would then label the algorithm based on which of these three ratios tmax/tmin

is most closest to.

3

As part of your report, you should create a chart that includes the computed ratios as well as the behavior of the algorithm (Linear, n lg n, or

Quadratic), across all 9 experiments. An example chart appears below:

tmax/tmin f1 ratio f2 ratio f3 ratio Behavior

IC

IS

IR

MC

MS

MR

QC

QS

QR

You should then write a summary of (1) how your results compare to

the theoretical analysis for the three algorithms (below), and (2) why your

results make sense or are surprising. You should spend more time explaining

your results when they are unusual or unexpected.

Best-case Average-case Worst-case

complexity complexity complexity

InsertionSort Ω(n) Θ(n

2

) O(n

2

)

MergeSort Θ(n lg n) Θ(n lg n) Θ(n lg n)

QuickSort Ω(n lg n) Θ(n lg n) O(n

2

)

4 Submission

For this project, you should submit a zip archive containing (1) a CSV file

containing your results (described in Section 3.1), and (2) your tables and

analysis (described in Section 3.2), in PDF format.

Note: This is an individual project. You are not allowed to submit work

that has been pulled from the Internet, nor work that has been done by your

peers. Your submitted materials will be analyzed for plagiarism. Project 1

will be evaluated out of 50 points:

4

5 Grading

Data file containing results 15 points

Table with ratios 15 points

Analysis 20 points

Requirements for each portion of the grade are described in Sections 3.1

and 3.2.

5

Sale!

# Asymptotic Analysis Project

$30.00

## Reviews

There are no reviews yet.