## Description

Bonus Project

### Vector Norms

A vector norm assigns a size to a vector, in such a way that scalar multiples do what we expect, and the triangle inequality is satisfied. There are four common vector norms in $n$ dimensions:

- The $\ell^1$ vector norm: $\|x\|_1 = \sum_{i=1}^n |x_i|$
- The $\ell^2$ (or "Euclidean") vector norm: $\|x\|_2 = \sqrt{\sum_{i=1}^n |x_i|^2}$
- The $\ell^p$ vector norm: $\|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$
- The $\ell^\infty$ vector norm: $\|x\|_\infty = \max_{1 \le i \le n} |x_i|$

To compute the norm of a vector x in Matlab:

- $\ell^1$ norm: norm(x,1)
- $\ell^2$ norm: norm(x,2), or simply norm(x)
- $\ell^p$ norm: norm(x,p)
- $\ell^\infty$ norm: norm(x,inf)

(Recall that inf is the Matlab name corresponding to $\infty$.)
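As a quick sanity check, the four commands can be compared against hand computation. A minimal sketch; the vector v is an illustrative choice whose norms are easy to verify by eye:

```matlab
v = [ 3; -4; 0 ];        % illustrative vector

n1   = norm(v,1);        % |3| + |-4| + |0| = 7
n2   = norm(v,2);        % sqrt(9 + 16) = 5; identical to norm(v)
ninf = norm(v,inf);      % max(|3|, |-4|, |0|) = 4

fprintf('1-norm = %g, 2-norm = %g, inf-norm = %g\n', n1, n2, ninf);
```

Note that the ordering $\|v\|_\infty \le \|v\|_2 \le \|v\|_1$ holds here, as it does for every vector.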

Exercise 1: For each of the following column vectors:

x1 = [ 4; 6; 7 ]

x2 = [ 7; 5; 6 ]

x3 = [ 1; 5; 4 ]

compute the $\ell^1$, $\ell^2$, and $\ell^\infty$ vector norms, using the appropriate Matlab commands. Be sure your answers are reasonable.

L1 L2 L Infinity

x1 __________ __________ __________

x2 __________ __________ __________

x3 __________ __________ __________

### Matrix Norms

A matrix norm assigns a size to a matrix, again in such a way that scalar multiples do what we expect, and the triangle inequality is satisfied. However, what's more important is that we want to be able to mix matrix and vector norms in various computations. So we are going to be very interested in whether a matrix norm is compatible with a particular vector norm, that is, when it is safe to say: $\|Ax\| \le \|A\|\,\|x\|$.

There are four common matrix norms and one "almost" norm:

- The $\ell^1$ or "max column sum" matrix norm: $\|A\|_1 = \max_j \sum_{i=1}^n |A_{ij}|$. Remark: This is not the same as the $\ell^1$ norm of the vector of dimension $n^2$ whose components are the same as $A$.
- The $\ell^2$ matrix norm: $\|A\|_2 = \max \sqrt{\lambda}$, where $\lambda$ is a (necessarily real and non-negative) eigenvalue of $A^T A$, or $\|A\|_2 = \max \sigma$, where $\sigma$ is a singular value of $A$.
- The $\ell^\infty$ or "max row sum" matrix norm: $\|A\|_\infty = \max_i \sum_{j=1}^n |A_{ij}|$. Remark: This is not the same as the $\ell^\infty$ norm of the vector of dimension $n^2$ whose components are the same as $A$. There is no $\ell^p$ matrix norm in Matlab for general $p$.
- The "Frobenius" matrix norm: $\|A\|_{\text{fro}} = \sqrt{\sum_{i,j} |A_{ij}|^2}$. Remark: This is the same as the $\ell^2$ norm of the vector of dimension $n^2$ whose components are the same as $A$.
- The spectral radius (not a norm): $\rho(A) = \max |\lambda|$ (only defined for a square matrix), where $\lambda$ is a (possibly complex) eigenvalue of $A$.

To compute the norm of a matrix A in Matlab:

- $\ell^1$ norm: norm(A,1)
- $\ell^2$ norm: norm(A,2), or simply norm(A)
- $\ell^\infty$ norm: norm(A,inf)
- Frobenius norm: norm(A,'fro')

See below for computation of $\rho(A)$ (the spectral radius of $A$).
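Matlab has no built-in command for the spectral radius, so it is usually computed directly from the eigenvalues. A sketch, using an arbitrary illustrative matrix:

```matlab
A = [ 4 1; 2 3 ];          % illustrative matrix

n1   = norm(A,1);          % max column sum = max(6, 4) = 6
ninf = norm(A,inf);        % max row sum = max(5, 5) = 5
nfro = norm(A,'fro');      % sqrt(16 + 1 + 4 + 9) = sqrt(30)
n2   = norm(A);            % same as norm(A,2)

rho = max(abs(eig(A)));    % spectral radius; eig may return complex values
```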

### Compatible Matrix Norms

A matrix can be identified with a linear operator, and the norm of a linear operator is usually defined in the following way:

$$\|A\| = \max_{x \ne 0} \frac{\|Ax\|}{\|x\|}$$

(It would be more precise to use $\sup$ rather than $\max$ here, but the surface of a sphere in finite-dimensional space is a compact set, so the supremum is attained, and the maximum is correct.) A matrix norm defined in this way is said to be "vector-bound" to the given vector norm.

In order for a matrix norm to be consistent with the linear operator norm, you need to be able to say the following:

$$\|Ax\| \le \|A\|\,\|x\| \qquad (1)$$

but this expression is not necessarily true for an arbitrarily chosen pair of matrix and vector norms. When it is true, then the two are "compatible".

If a matrix norm is vector-bound to a particular vector norm, then the two norms are

guaranteed to be compatible. Thus, for any vector norm, there is always at least one matrix

norm that we can use. But that vector-bound matrix norm is not always the only choice. In particular, the $\ell^2$ matrix norm is difficult (time-consuming) to compute, but there is a simple alternative.

Note that:

- The $\ell^1$, $\ell^2$ and $\ell^\infty$ matrix norms can be shown to be vector-bound to the corresponding vector norms and hence are guaranteed to be compatible with them;
- The Frobenius matrix norm is not vector-bound to the $\ell^2$ vector norm, but is compatible with it; the Frobenius norm is much faster to compute than the $\ell^2$ matrix norm (see Exercise 6 below).
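The Frobenius norm's compatibility with the $\ell^2$ vector norm can be illustrated numerically by checking that $\|Ax\|_2 \le \|A\|_{\text{fro}}\|x\|_2$ on random data. A sketch (an illustration, not a proof):

```matlab
% Check Frobenius/2-norm compatibility on randomly generated data.
A = rand(10,10);
x = rand(10,1);
ratio = norm(A*x,2) / ( norm(A,'fro') * norm(x,2) );
fprintf('ratio = %g (compatibility requires ratio <= 1)\n', ratio);
```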

Exercise 2: Consider each of the following column vectors:

x1 = [ 4; 6; 7 ];

x2 = [ 7; 5; 6 ];

x3 = [ 1; 5; 4 ];

and each of the following matrices

A1 = [38 37 80

53 49 49

23 85 46];

A2 = [77 89 78

6 34 10

65 36 26];

Check that the compatibility condition (1) holds for each case in the following table by computing the ratios

$$r_{p,q}(A,x) = \frac{\|Ax\|_q}{\|A\|_p \, \|x\|_q}$$

and filling in the following table. The third column should contain the letter "S" if the compatibility condition is satisfied (the ratio is never larger than 1) and the letter "U" if it is unsatisfied.

Suggestion: I recommend you write a script m-file for this exercise. If you do, please

include it with your summary.

Matrix norm (p)  Vector norm (q)  S/U  r(A1,x1)  r(A1,x2)  r(A1,x3)  r(A2,x1)  r(A2,x2)  r(A2,x3)

1 1 ___ _________ __________ __________ __________ __________ __________

1 2 ___ _________ __________ __________ __________ __________ __________

1 inf ___ _________ __________ __________ __________ __________ __________

2 1 ___ _________ __________ __________ __________ __________ __________

2 2 ___ _________ __________ __________ __________ __________ __________

2 inf ___ _________ __________ __________ __________ __________ __________

inf 1 ___ _________ __________ __________ __________ __________ __________

inf 2 ___ _________ __________ __________ __________ __________ __________

inf inf ___ _________ __________ __________ __________ __________ __________
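A script along the following lines is one way to generate the entries; the helper r is a hypothetical anonymous function computing the ratio $\|Ax\|_q / (\|A\|_p \|x\|_q)$ described above:

```matlab
% Script sketch for Exercise 2: print the compatibility ratios.
x1 = [ 4; 6; 7 ];  x2 = [ 7; 5; 6 ];  x3 = [ 1; 5; 4 ];
A1 = [38 37 80; 53 49 49; 23 85 46];
A2 = [77 89 78;  6 34 10; 65 36 26];

% ratio from compatibility condition (1), with matrix norm p, vector norm q
r = @(A,x,p,q) norm(A*x,q) / ( norm(A,p) * norm(x,q) );

for p = [1 2 inf]
  for q = [1 2 inf]
    fprintf('p=%3g q=%3g: %8.4f %8.4f %8.4f %8.4f %8.4f %8.4f\n', ...
      p, q, r(A1,x1,p,q), r(A1,x2,p,q), r(A1,x3,p,q), ...
            r(A2,x1,p,q), r(A2,x2,p,q), r(A2,x3,p,q));
  end
end
```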

### Types of Errors

A natural assumption to make is that the term "error" refers always to the difference between the computed and exact "answers." We are going to have to discuss several kinds of error, so let's refer to this first error as "solution error" or "forward error." Suppose we want to solve a linear system of the form $Ax = b$ (with exact solution $x$) and we computed $x_{\text{approx}}$. We define the solution error as $e = x - x_{\text{approx}}$.

Usually the solution error cannot be computed, and we would like a substitute whose behavior is acceptably close to that of the solution error. One example would be during an iterative solution process, where we would like to monitor the progress of the iteration but we do not yet have the solution and cannot compute the solution error. In this case, we are interested in the "residual error" or "backward error," which is defined by

$$r = b - b_{\text{approx}} = b - A x_{\text{approx}}$$

where, for convenience, we have defined the variable $b_{\text{approx}}$ to equal $A x_{\text{approx}}$. Another way of looking at the residual error is to see that it's telling us the difference between the right hand side that would "work" for $x_{\text{approx}}$ versus the right hand side we have.

If we think of the right hand side as being a target, and our solution procedure as determining

how we should aim an arrow so that we hit this target, then

- The solution error is telling us how badly we aimed our arrow;
- The residual error is telling us how much we would have to move the target in order for our badly aimed arrow to be a bullseye.

There are problems for which the solution error is huge and the residual error tiny, and all the

other possible combinations also occur.
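A tiny example makes the distinction concrete: for a nearly singular matrix, a large solution error can coexist with a tiny residual error. A sketch (the matrix and vectors are chosen purely for illustration):

```matlab
A = [1, 1; 1, 1+1.e-12];   % illustrative nearly singular matrix
b = [2; 2];
xTrue   = [2; 0];          % the exact solution of A*x = b
xApprox = [1; 1];          % a badly aimed arrow

solutionError = norm(xTrue - xApprox, 2)   % large: about 1.4
residualError = norm(b - A*xApprox, 2)     % tiny: about 1e-12
```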

Exercise 3: For each of the following cases, compute the Euclidean norm (two norm)

of the solution error and residual error and characterize each as “large” or “small.”

(For the purpose of this exercise, take “large” to mean greater than 1 and “small” to

mean smaller than 0.01.) In each case, you must find the true solution first (Pay

attention! In each case you can solve in your head for the true solution, xTrue.), then

compare it with the approximate solution xApprox.

1. A=[1,1;1,(1-1.e-12)], b=[0;0], xApprox=[1;-1]

2. A=[1,1;1,(1-1.e-12)], b=[1;1], xApprox=[1.00001;0]

3. A=[1,1;1,(1-1.e-12)], b=[1;1], xApprox=[100;100]

4. A=[1.e+12,-1.e+12;1,1], b=[0;2], xApprox=[1.001;1]

Case  ||Residual||  large/small   xTrue   ||Error||  large/small

1 ________ _______ ______ _______ ________

2 ________ _______ ______ _______ ________

3 ________ _______ ______ _______ ________

4 ________ _______ ______ _______ ________

### The linear system "problem"

The linear system "problem" can be posed in the following way. Find an $n$-vector $x$ that satisfies the matrix equation

$$Ax = b \qquad (1)$$

where $A$ is an $m \times n$ matrix and $b$ is an $m$-vector. In Matlab, both $x$ and $b$ are column vectors.

You probably already know that there is "usually" a solution if the matrix is square (that is, if $m = n$). We will concentrate on this case for now. But you may

wonder if there is an intelligent response to this problem for the cases of a square

but singular matrix, or a rectangular system, whether overdetermined or

underdetermined.

At one time or another, you have probably been introduced to several algorithms

for producing a solution to the linear system problem, including Cramer’s rule

(using determinants), constructing the inverse matrix, Gauß-Jordan elimination and

Gauß factorization. We will see that it is usually not a good idea to construct the

inverse matrix and we will focus on Gauß factorization for solution of linear

systems.

We will be concerned about several topics:

- Efficiency: what algorithms produce a result with less work?
- Accuracy: what algorithms produce an answer that is likely to be more accurate?
- Difficulty: what makes a problem difficult or impossible to solve?
- Special cases: how do we solve problems that are big? symmetric? banded? singular? rectangular?

### The inverse matrix

A classic result from linear algebra is the following:

Theorem: The linear system problem (1) is uniquely solvable for arbitrary $b$ if and only if the inverse matrix $A^{-1}$ exists. In this case the solution can be expressed as $x = A^{-1}b$.

So what’s the catch? There are a few:

- Computing the inverse takes a lot of time, more time than is necessary if you only want to know the solution of a particular system.
- In cases when the given matrix actually has very many zero entries, as is the case with the dif2 matrix, computing the inverse is enormously more difficult and time consuming than merely solving the system. And it is quite often true that the inverse matrix requires far more storage than the matrix itself. The second difference matrix, for example, is an "M-matrix," and it can be proved that all the entries in its inverse are positive numbers. You only need about $3n$ numbers to store the dif2 matrix because you can often avoid storing the zero entries, but you need $n^2$ numbers to store its inverse.
- Sometimes the inverse is so inaccurate that it is not worth the trouble to multiply by the inverse to get the solution.
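The accuracy point can be seen by solving the same system both ways and comparing residuals. A sketch using the built-in hilb function (a notoriously ill-conditioned matrix, discussed later in this lab); Matlab may warn that the matrix is close to singular:

```matlab
% Compare x = inv(A)*b with x = A\b on an ill-conditioned system.
n = 12;
A = hilb(n);               % Hilbert matrix: badly ill-conditioned
b = A * ones(n,1);         % right side chosen so the true solution is all ones

xInv   = inv(A) * b;       % solve via the inverse matrix
xSlash = A \ b;            % solve via Gauss factorization (backslash)

fprintf('residual via inverse:   %g\n', norm(b - A*xInv));
fprintf('residual via backslash: %g\n', norm(b - A*xSlash));
```

Typically the backslash residual is markedly smaller, which is one reason the inverse is avoided in practice.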

In the following exercises, you will see that constructing the inverse matrix takes time proportional to $n^3$ (for an $n \times n$ matrix), as does simply solving the system, but that solving the system is several times faster than constructing the inverse matrix. You will also see that matrices with special formats, such as tridiagonal matrices, can be solved very efficiently. And you will see that even simple matrices can be difficult to invert numerically.

You will be measuring elapsed time in order to see how long these calculations

take. The problem sizes are designed to take a modest amount of time (less than 6

minutes) on newer computers.

WARNING: Most newer computers have more than one processor (core). By

default, Matlab will use all the processors available by assigning one “thread” to

each available processor. This procedure will mess up our timings, so you should

tell Matlab to use only a single thread. To do this, you should start up Matlab with

the command line

matlab -singleCompThread

If you do not know how to start up Matlab from the command line, use the

following command at the beginning of your session.

maxNumCompThreads(1);

Matlab will warn you that this command will disappear in the future, but it should

work fine now.

Exercise 4: The Matlab command inv(A) computes an approximation for the

inverse of a matrix A. Do not worry for now just how it does it, but be

assured that it uses one of the most efficient and reliable methods available.

Matlab provides the commands tic and toc to measure computational time.

The tic command starts the stopwatch, and the toc command stops it, either

printing the elapsed time or returning its value as in the

expression elapsedTime=toc;. The times are in seconds.

(Note: tic and toc measure elapsed time. When a computer is doing more

than one thing, the cputime function can be used to measure how much

computer time is being used for this task alone.)
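The timing pattern used throughout these exercises looks like the following sketch (the work being timed here is an arbitrary illustration):

```matlab
tic;                       % start the stopwatch
A = rand(500);             % illustrative work to be timed
Ainv = inv(A);
elapsedTime = toc;         % stop it and capture the elapsed seconds
fprintf('inversion took %g seconds\n', elapsedTime);
```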

1. Copy the following code to a Matlab function m-file named exer4.m and modify it to produce information for the table below. Be sure to add comments and your name and the date. Include the file with your summary.

       function elapsedTime=exer4(n)
       % elapsedTime=exer4(n)
       % comments

       % your name and the date

       if mod(n,2)==0
         error('Please use only odd values for n');
       end

       A = magic(n);   % only odd n yield invertible matrices
       b = ones(n,1);  % the right side vector doesn't change the time
       tic;
       Ainv = ???      % compute the inverse matrix
       xSolution = ??? % compute the solution
       elapsedTime=toc;

2. You have seen in lecture that the time required to invert an $n \times n$ matrix should be proportional to $n^3$. Fill in the following table, where the column entitled "ratio" should contain the ratio of the time for n divided by the time for the preceding value of n. (Note: timings are not precise! Furthermore, the first time a function is used results in Matlab reading the m-file and "compiling it," and that takes considerable time. The last row of this table may take several minutes!)

   Remark: Compute the first line of the table twice and use the second value. The first time a function is called involves substantial overhead that later calls do not require.

       Time to compute inverse matrices
           n      time        ratio
         161   _________
         321   _________   _________
         641   _________   _________
        1281   _________   _________
        2561   _________   _________
        5121   _________   _________
       10241   _________   _________

3. Are these solution times roughly proportional to $n^3$?

Exercise 5: Matlab provides a special operator, the backslash (\) operator,

that is designed to solve a linear system without computing the inverse. It is

used in the following way, for a matrix A and column vector b.

x=A\b;

It may help you to remember this operator if you think of the A as being

“underneath” the slash. The effect of this operator is to find the solution of

the system of equations A*x=b.

The backslash looks strange, but there is a method to this madness. You

might wonder why you can't just write x=b/A. This would put the column vector b to the left of the matrix A, and that is the wrong order of operations.

1. Copy exer4.m to exer5.m and replace the inverse matrix computation and solution with an expression using the Matlab "\" command, and fill in the following table.

       Time to compute solutions
           n      time        ratio
         161   _________
         321   _________   _________
         641   _________   _________
        1281   _________   _________
        2561   _________   _________
        5121   _________   _________
       10241   _________   _________

2. Are these solution times roughly proportional to $n^3$?

3. Compare the times for the inverse and solution and fill in the following table.

       Comparison of times
           n    (time for inverse)/(time for solution)
         161   _________
         321   _________
         641   _________
        1281   _________
        2561   _________
        5121   _________
       10241   _________

4. Theory shows that computation of a matrix inverse should take approximately three times as long as computation of the solution. Are your results consistent with theory?

You should be getting the message: “You should never compute an inverse

when all you want to do is solve a system.”

Warning: the “\” symbol in Matlab will work when the matrix A is not square. In

fact, sometimes it will work when A actually is a vector. The results are not usually

what you expect and no error message is given. Be careful of this potential for error

when using “\”. A similar warning is true for the / (division) symbol because

Matlab will try to “interpret” it if you use it with matrices or vectors and you will

get “answers” that are not what you intend.

### Gauß factorization

The standard method for computing the inverse of a matrix is Gaußian elimination. You already know this algorithm, perhaps by another name such as row reduction. First, we will prefer to think of the process as having two separable parts, the "factorization" (forward elimination) and "back substitution" steps.

We now turn to writing code for the Gauß factorization. The discussion in this lab

assumes that you are familiar with Gaußian elimination (sometimes called “row

reduction”), and is entirely self-contained. Because, however, interchanging rows is

a source of confusion, we will assume at first that no row interchanges are

necessary.

The idea is that we are going to use the row reduction operations to turn a given matrix $A$ into an upper triangular matrix $U$ (all of whose elements below the main diagonal are zero) and at the same time keep track of the multipliers in the lower triangular matrix $L$. These matrices will be built in such a way that $A$ could be reconstructed from the product $LU$ at any time during the factorization procedure.

Alert: The fact that $A = LU$ even before the computation is completed can be a powerful debugging technique. If you get the wrong answer at the end of a computation, you can test each step of the computation to see where it has gone wrong. Usually you will find an error in the first step, where the error is easiest to understand and fix.

Let's do a simple example first. Consider a $2 \times 2$ matrix $A$ whose first column is $\begin{pmatrix} 2 \\ 1 \end{pmatrix}$. There is only one step to perform in the row reduction process: turn the 1 in the lower left into a 0 by subtracting half the first row from the second. Convince yourself that this process can be written as

$$\begin{pmatrix} 1 & 0 \\ -\frac{1}{2} & 1 \end{pmatrix} A = U \qquad (2)$$

where $U$ is the upper triangular result of the elimination. If you want to write $A = LU$, you need to have $L$ be the inverse of the first matrix on the left in (2). Thus

$$L = \begin{pmatrix} 1 & 0 \\ \frac{1}{2} & 1 \end{pmatrix}$$

and $U$ are matrices that describe the row reduction step ($L$) and its result ($U$) in such a way that the original matrix $A$ can be recovered as $A = LU$.
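To make this concrete in Matlab, here is the single elimination step carried out on an assumed example matrix: only the first column (2, 1) matters for the step, so [2 1; 1 2] is used purely for illustration.

```matlab
A = [2 1; 1 2];            % assumed example; only the first column matters here

L = [1 0;  1/2 1];         % records the multiplier 1/2
U = [1 0; -1/2 1] * A;     % subtract half of row 1 from row 2

% U is now upper triangular, and A is recovered as the product L*U:
disp(U);
disp(L*U - A);             % the zero matrix
```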

### The Hilbert matrix

The Hilbert matrix is related to the interpolation problem on the interval [0,1]. The matrix is given by the formula $A_{ij} = 1/(i+j-1)$. For example, with $n = 4$ the Hilbert matrix is:

$$A = \begin{pmatrix} 1 & \frac12 & \frac13 & \frac14 \\ \frac12 & \frac13 & \frac14 & \frac15 \\ \frac13 & \frac14 & \frac15 & \frac16 \\ \frac14 & \frac15 & \frac16 & \frac17 \end{pmatrix}$$

The Hilbert matrix arises in interpolation and approximation contexts because it happens that $A_{ij} = \int_0^1 x^{i-1} x^{j-1} \, dx$. The Hilbert matrix is at once nice because its inverse has integer elements and also not nice because it is difficult to compute the inverse accurately using the usual formulæ to invert a matrix.
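Matlab's built-in hilb and invhilb functions (introduced below) make both properties easy to see; a sketch:

```matlab
n = 5;
H = hilb(n);               % the n by n Hilbert matrix
Hinv = invhilb(n);         % its exact inverse, which has integer entries

disp(Hinv);                              % all entries are integers
fprintf('%g\n', norm(H*Hinv - eye(n)));  % essentially zero for small n

% By contrast, inv(hilb(n)) becomes badly inaccurate as n grows.
```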

Now, suppose you have a 5 by 5 Hilbert matrix (switching to Matlab notation) A,

and you wish to perform row reduction steps to turn all the entries in the first

column below the first row into zero, and keep track of the operations in the

matrix L and the result in U. Convince yourself that the following code does the

trick.

Matlab has special functions for the Hilbert matrix and its inverse,

called hilb(n) and invhilb(n), and we will be using these functions in this lab.

n=5;
Jcol=1;
A=hilb(5);
L=eye(n); % square n by n identity matrix
U=A;
for Irow=Jcol+1:n
  % compute Irow multiplier and save in L(Irow,Jcol)
  L(Irow,Jcol)=U(Irow,Jcol)/U(Jcol,Jcol);
  % multiply row "Jcol" by L(Irow,Jcol) and subtract from row "Irow"
  % This vector statement could be replaced with a loop
  U(Irow,Jcol:n)=U(Irow,Jcol:n)-L(Irow,Jcol)*U(Jcol,Jcol:n);
end

Exercise 6:

1. Use the above code to compute the first row reduction steps for the matrix A=hilb(5). Print the resulting matrix U and include it in the summary report. Check that all entries in the first column in the second through last rows are zero or roundoff. (If any are non-zero, you have an error somewhere. Find the error before proceeding.)
2. Multiply L*U and confirm for yourself that it equals A. An easy way to do this is to compute norm(L*U-A,'fro')/norm(A,'fro'). (As you have seen, the Frobenius norm is faster to compute than the default 2-norm, especially for large matrices.)
3. Use the code given above as a model for an m-file called gauss_lu.m that performs Gaußian reduction without pivoting. The function should start out

       function [L,U]=gauss_lu(A)
       % function [L,U]=gauss_lu(A)
       % performs an LU factorization of the matrix A using
       % Gaussian reduction.
       % A is the matrix to be factored.
       % L is a lower triangular factor with 1's on the diagonal
       % U is an upper triangular factor.
       % A = L * U

       % your name and the date

4. Use your routine to find L and U for the Hilbert matrix of order 5. Include L and U in your summary.
5. What is U(5,5), to at least four significant figures?
6. Verify that L is lower triangular (has zeros above the diagonal) and U is upper triangular (has zeros below the diagonal).
7. Confirm that L*U recovers the original matrix.
8. The Matlab expression R=rand(100,100) will generate a $100 \times 100$ matrix of random entries. Do not print this matrix; it has 10,000 numbers in it. Use gauss_lu to find its factors LR and UR.
   - Use norms to confirm that LR*UR=R, without printing the matrices.
   - Use tril and triu to confirm that LR is lower triangular and UR is upper triangular, without printing the matrices.

It turns out that omitting pivoting leaves the function you just wrote vulnerable to

failure.

Exercise 7:

1. Compute L1 and U1 for the matrix

       A1 = [ -2  1  0  0  0
               1  0  1 -2  0
               0  0  0  1 -2
               1 -2  1  0  0
               0  1 -2  1  0 ]

   The method should fail for this matrix, producing matrices with infinity (inf) and "Not a Number" (NaN) values in them.
2. On which step of the decomposition (values of Irow and Jcol) does the method fail? Why does it fail?
3. What is the determinant of A1? What is the condition number (cond(A1))? Is the matrix singular or ill-conditioned?

In Gaußian elimination it is entirely possible to end up with a zero on the diagonal,

as the previous exercise demonstrates. Since you cannot divide by zero, the

procedure fails. It is also possible to end up with small numbers on the diagonal

and big numbers off the diagonal. In this case, the algorithm doesn’t fail, but

roundoff errors can overwhelm the procedure and result in wrong answers. One

solution to these difficulties is to switch rows and columns around so that the

largest remaining entry in the matrix ends up on the diagonal. Instead of this “full

pivoting” strategy, it is almost as effective to switch the rows only, instead of both

rows and columns. This is called "partial pivoting."
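Matlab's built-in lu function implements this partial pivoting strategy, returning a permutation matrix P that records the row interchanges so that L*U = P*A. Applied to the matrix from Exercise 7, a sketch:

```matlab
A1 = [ -2  1  0  0  0
        1  0  1 -2  0
        0  0  0  1 -2
        1 -2  1  0  0
        0  1 -2  1  0 ];

[L1, U1, P] = lu(A1);      % partial pivoting: L1*U1 = P*A1

fprintf('%g\n', norm(L1*U1 - P*A1, 'fro'));  % essentially zero
% With pivoting, no inf or NaN entries appear in L1 or U1,
% because every multiplier has magnitude at most 1.
```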