CSC2503—Foundations of Computer Vision

Assignment 1: Robust Estimation

The goal of this part of the assignment is to gain experience using robust estimation to extract simple

image models from noisy image measurements.

To begin: Download the starter code A1handout.zip from the course homepage.

What to submit. Write a short report addressing each of the itemized questions below. Pack your

PDF report and the completed Matlab files for each question into a file called A1.zip, and submit

the zip file electronically.

How to submit. Create a tarfile called assign1.tar.gz and use the following command on a

Teaching Labs (a.k.a CDF) computer to submit it electronically:

> submit -c csc2503h -a assign1 assign1.tar.gz

Late policy. There will be a 10% marks deduction for each day late, for up to three days. Deductions

begin 24 hours after the due date and time (ie. 2pm on October 12th). Assignments submitted more

than 72 hours late will not be accepted.

Robustly Fitting Circles. A geneticist wishes to automatically count cells in microscope images,

such as the one depicted in Figure 1 (left). The geneticist is especially interested in cells at different

stages of cell division. Here we consider an application of robust estimation that will get us started

on a solution for the geneticist. Cells are actually elliptical but we’ll begin by fitting circles.

Figure 1: Cell image (left), foreground pixels (middle), and fitted cells (right, in red).

The main script in the handout code, called findCellScript.m, reads in a microscope image of

cells (Figure 1 (left)). We have already preprocessed the cell images to detect foreground cell pixels

(Figure 1 (middle)). We have also labelled the connected sets of foreground pixels, which are read

in as a second image. Finally, the script computes Canny edgels from the foreground mask (we will

briefly discuss this edge detection method later in the course; knowing and/or understanding it is

not necessary for this assignment). The code then considers each connected component individually,

1

looking for circular cells (see Figure 1 (right)).

Your job is to fit circles to the edgels obtained from the boundaries of the connected components.

Each fitted circle should correspond to one “cell” including, perhaps, a small circle for a cell bud

just being born, by splitting off from another cell (see Figure 1 (right)). This is a vague definition

of what constitutes a “cell”. However, if you study Figure 1 (right) (e.g., by enlarging the electronic

copy), it should be intuitively clear what is meant by a “cell”. Part of your job in this assignment is

to decide on how to operationally define our intuitive notion of one cell.

We aim to fit circles using edgel measurements. Let the k

th edgel be denoted by (~xk, ~nk), where ~xk

is the measured image position and ~nk is the measured edgel normal. The edgel normal points in

the direction of increasing brightness of the foreground regions Figure 1 (middle); i.e., they point

towards the interior of the cells.

Because cells occur in close proximity, a single connected foreground component will often contain

more than one cell. It is therefore natural to formulate circle fitting as a robust estimation problem.

That is, a set of edgel measurements {~xk, ~nk}

N

k=1 from a single connected component will often

correspond to multiple cells. When estimating the parameters of one circle (cell) we can view the

measurements from any other circle (cells) as outliers. To this end we will use the Geman-McLure

(GM) estimator,

ρ(e, σg) = e

2

σ

2

g + e

2

. (1)

In what follows we are going to take a greedy approach to this problem, finding one circle at a time.

The approach comprises four main steps: 1) quickly generating a set of plausible circle hypotheses,

2) choosing one among them as an initial guess for optimization, 3) robust estimation to optimize

the fit, and 4) the model update, for which we decide whether or not the optimized fit is sufficient

and which edgel measurements are owned by the circle.

Below there are several questions, including some derivations and some programming. To simplify

the programming, the assignment handout includes a partial implementation of the general strategy

outlined above, i.e., findCellScript.m. For the programming parts of the assignment you need to

complete several Matlab m-functions, as described below. You may also make small changes to the

handout code.

1. Noise model: To estimate the parameters of a circle, we’ll focus on the use of the edgel position

measurements ~xk. The key constraint on position measurements is that the distance from the

center to any position on the circle must equal the radius. For measurement ~xk, and a circle with

centre ~xc and radius r, we can write the error as

ek = ||~xk − ~xc||2 − r

2 = ~a T

~xk + b + ~x T

k

~xk . (2)

where ~a ≡ −2~xc and b ≡ ~x T

c ~xc − r

2

. The change in parameterization, from ~xc and r to ~a and b

is convenient because ek is linear in ~a and b. Of course it is straightforward to find the circle’s

centre ~xc and radius r from ~a and b.

To properly formulate an estimator we should understand the nature of the expected errors, ek. To

that end, let’s assume that the observed positions (measurements) are equal to the true positions,

denoted ˆ~xk, plus independent, isotropic Gaussian noise ~m ∼ N (0, σ2

I); i.e.,

~xk = ˆ~xk + ~m , where p(~m) = 1

2πσ2

e

−~mT ~m/2σ

2

. (3)

2

With this measurement noise model, what is the distribution of the errors ek? How does it depend

on the circle parameters ~xc and r?

2. LS estimator: Suppose we approximate the distribution of the errors, ek, as a mean-zero Gaussian (Normal) distribution. In other words, assume that the only source of error on the RHS of

Eqn. (2) is additive, mean-zero Gaussian noise. What is a reasonable choice for the variance of

the Gaussian noise?

Assuming that you are given a set of independent measurements, {~xk}

N

k=1, all of which arise from

the same circle, a natural way to derive an estimator is to formulate the negative log likelihood

of the measurements (eg. see Sections 4.1 and 4.4 of Prince’s book). The maximum likelihood

estimator is found by minimizing the negative log likelihood with respect to the unknowns ~a and

b. Beginning with this likelihood, derive a least-squares estimator.

3. Circle proposals. The first step of our greedy strategy is to generate a set of circle proposals

for a given set of edgel measurements {(~xk, ~nk)}

N

k=1. For this step you may find it helpful to use

both the position measurements and the normal measurements.

This requires that you complete the m-function getProposals.m (in the circleFit subdirectory).

When finished this function should return a P × 3 array named circles. Each row of circles

corresponds to the parameters (xc, yc, r) of a circle, where ~xc = (xc, yc)

T denotes the image

position of the center of the circle, and r denotes the radius of the circle. Your implementation of

getProposals should attempt to efficiently produce up to numGuesses proposals. The idea is to

quickly find initial guesses from which you can select one to be subsequently refined with the robust

estimator. Since this will be run numerous times, it is essential that this step require minimal

computational effort. In your write up, clearly explain the algorithm you used to automatically

generate circle proposals, along with the reasons why you chose it over other possibilities.

If you have trouble with this step, you can begin by writing getProposals so that it simply

generates proposals from moused-in points (say a center point plus one point on the circle). You

won’t get any marks for this, but it will help you get started on other parts of the problem.

4. Circle selection. The next step in findCellScript.m is to select the best circle from the

proposed circles in the first step (Question 3 above). For example, a good circle might be close

to many edgels in the data. Implement your selection process in the function bestProposal, so

that it chooses a promising circle from the list of proposals. In your write up, clearly describe

how you select the best circle (i.e., be specific about what you mean by “best”), and explain your

reasons for this choice.

5. Robust fitting. Beginning with the best circle from the second step (Question 4) as an initial

guess, the next step is to optimize the fit. To this end, derive an IRLS algorithm for estimating

the circle parameters (~a, b) which (locally) minimize the objective function

O(~a, b) = X

k

ρ(ek(~a, b), σg) . (4)

Here ek is as defined in (2). In your write-up, include the derivation of the these equations for ~a

and b, and clearly describe your IRLS algorithm.

Complete the function fitCircleRobust to implement this IRLS algorithm for robustly fitting

a circle to your edgel data. In order to test your code, you can initially set demoRobustConv to

true. This displays how your code behaves for each initial guess provided by getProposals in the

second step above. Experiment with different values of the robust estimator’s scale parameter σg

(sigmaGM in the code). Ideally, we would choose σ

2

g

to be equal to the variance of the errors ek for

3

inlier measurements. In practice, describe is a suitable way to choose σg for your implementation.

What goes wrong when σg is much too large or much to small?

Before continuing on, set demoRobustConv back to false.

6. Model update. Finally, given the robustly fit circle, decide if it should be kept in the model

(see the function isGoodCircle). If it is decided that it should be kept, then the handout code

greedily removes the edgels with sufficiently high weights from the data. In your write up, explain

how you would choose a good threshold to be applied to the weights. Also, describe on what basis

your isGoodCircle function decides to keep a new circle, and justify your choice.

These four steps (Questions 3-6) are then repeated to fit additional circles. The cell finder is now

complete.

7. Brief evaluation. Your completed code should now provide a very reasonable first cut at solving

our geneticist’s problem. It should be expected to miss cells that have been significantly cropped

due to the side of the images. Also, small buds on some cells might be missed (see Figure 1).

Other than these two “failure modes”, your algorithm should work well. What other situations

are observed to cause your algorithm to fail to find a plausible set of circles? Briefly describe

what you might do to try to fix these remaining problems.

Academic Honesty. Cheating on assignments has very serious repercussions for the students

involved, far beyond simply getting a zero on their assignment. This is especially so for graduate

students.

This assignment is strictly individual work. It should not be discussed with anyone, even at the

level of ideas. The TAs are very experienced in detecting signs of collaboration between students

and there is a zero tolerance on cheating. Even minor infractions will be identified and reported

to the department and the Dean’s office.

The course’s assignments cover widely-used vision techniques, some of which have been used in prior

instalments of this course. Answers to some of the written questions—and even code for some of the

assignments—may be just a mouse click (or google search) away. You must resist the temptation to

search for, download and/or adapt someone else’s solutions and submit them as your own without

proper attribution. Accidentally stumbling upon a solution is no excuse either: the minute you

suspect a webpage describes even a partial solution to this assignment, either stop reading it or cite

it in the report you submit. Simply put: if an existing solution is easy for you to find, it is just as

easy for us to find it as well. In fact, you can be sure we already know about it!!

Clearly, web searches related to generic Matlab commands and/or generic concepts from calculus or

probability theory are not out of bounds. However, if you have any doubt about whether or not you

should cite material that you read online (or somewhere else) as you were working on the assignment,

the safest thing to do is to just cite it. In such a case, your grade will depend on how much of the

solution is your own work and how much of it came from the cited sources. But regardless of the

grade you get, you will not be in violation of the University’s Academic Integrity policies.

4

CSC2503—Foundations of Computer Vision

# Assignment 1: Robust Estimation

Original price was: $40.00.$30.00Current price is: $30.00.

## Reviews

There are no reviews yet.