COMP3011: Design and Analysis of Algorithms

Assignment 2

1. Write a Java, Python, C, or C++ program that outputs a longest common subsequence of three

input sequences X, Y , and Z of symbols from a fixed alphabet. For example, if the input is:

X = hN, I, N, Ei,

Y = hS, I, N, G, I, N, Gi,

Z = hN, I, N, J, A, Si

then the output should be hN, I, Ni. For this input, the optimal solution is unique. For inputs where

the optimal solution is not unique, your program may output any longest common subsequence.

(If you want to use another programming language, get the instructors’ approval first.) (4 marks)

2. In the report, briefly explain how your program works and analyze its time and space complexity.

Also, show a small example input created by you and the corresponding output. (2 marks)

3. When all three input sequences have the same length n, what is the largest n for which your program

manages to compute a solution in less than one minute? Is the execution time or running out of

memory the main issue here? Put this “extreme” example in “Max99999999Z.txt”. (1 mark)

4. A binary sequence is a sequence over the alphabet {0, 1}. For example, h0, 1, 0, 1, 1, 0, 0, 0i is a

binary sequence. Give an example of three binary sequences X, Y , Z, each of length 8, such

that the length of a longest common subsequence of X and W, where W is a longest common

subsequence of Y and Z, is different from the length of a longest common subsequence of X, Y , Z.

You may use your program to find such an example. (2 marks)

5. For 10 different values of n selected by you, use your program to compute A(n), defined as follows.

A(n) is the average length of a longest common subsequence of three randomly generated binary

sequences X, Y , Z of length n, where all symbols in the sequences are chosen independently and

uniformly at random from {0, 1} and the average is taken over 10 independent runs. In other

words, you need to run your program 10 times for each n and therefore 100 times in total.

Present your results in a table with ten rows and three columns: one column for n, one column

for A(n), and one column for the ratio A(n)/n. According to your experiments, does A(n)/n seem

converge to any constant, and if so, what is its value? (3 marks)

6. This question concerns the case of two input sequences. Prove that every algorithm that outputs all

longest common subsequences of two input sequences has a worst-case running time that is exponential. To do so, show how to define, for every positive integer n, two length-n sequences Xn, Yn

having Ω(c

n) different longest common subsequences, where c is a constant. You are allowed to use

an alphabet of size n, i.e., the symbols in Xn, Yn can come from a set {a1, a2, . . . , an}. (3 marks)

## Reviews

There are no reviews yet.