## Description

EECS 442 Computer Vision: Homework 4

Instructions

• The submission includes two parts:

1. To Gradescope: a PDF file as your write-up. Please read the grading checklist for each part before you submit.

You might like to combine several files into one submission. Here is an example online tool for combining multiple PDF files: https://combinepdf.com/. Please mark where each question is on Gradescope.

2. To Canvas: a zip file including all your code.

• The write-up must be an electronic version. No handwriting, including for plotting questions. LaTeX is recommended but not mandatory.

## 1 Optimization and Fitting [20 pts]

For Problem 1, you will work on the same problem as you did in HW3 1.2.6. Instead of using RANSAC to fit a homography, you need to fit an affine transformation

y = Sx + t

by gradient descent.

1. (15 pts) Implement the Linear layer and L2 loss in layers.py. In forward propagation, you'll store all inputs in cache and save it for backward propagation. The functions are shown below.

• Linear layer:

y = Wx + b

• L2 loss layer:

L(x, label) = (1/N) Σ_{i=1}^{N} (x_i − label_i)^2

2. (5 pts) Implement gradient descent in fitting.py; report the hyperparameters you choose and the results you get. Include the figure in your report as well.
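As a rough illustration of how these pieces fit together, here is a minimal numpy sketch of a Linear layer with a cache, an L2 loss, and a gradient descent loop fitting y = Sx + t on synthetic data. The function names, cache convention, learning rate, and iteration count are illustrative assumptions, not the skeleton's actual interface.

```python
import numpy as np

def linear_forward(x, W, b):
    """y = Wx + b. Cache the inputs for the backward pass."""
    y = x @ W.T + b              # x: (N, D), W: (M, D), b: (M,)
    cache = (x, W, b)
    return y, cache

def linear_backward(dy, cache):
    """Given dL/dy, return gradients w.r.t. x, W, and b."""
    x, W, b = cache
    dx = dy @ W                  # (N, D)
    dW = dy.T @ x                # (M, D)
    db = dy.sum(axis=0)          # (M,)
    return dx, dW, db

def l2_loss(x, label):
    """L = (1/N) * sum_i (x_i - label_i)^2, and its gradient dL/dx."""
    N = x.shape[0]
    diff = x - label
    return np.sum(diff ** 2) / N, 2.0 * diff / N

# Gradient descent on the affine fit y = Sx + t (here S plays the role of W, t of b).
rng = np.random.default_rng(0)
S_true = np.array([[2.0, 0.5], [-0.3, 1.2]])
t_true = np.array([0.1, -0.4])
x = rng.normal(size=(100, 2))
y = x @ S_true.T + t_true

W = rng.normal(scale=0.1, size=(2, 2))
b = np.zeros(2)
lr = 0.1                         # learning rate: a hyperparameter to tune
for step in range(500):
    pred, cache = linear_forward(x, W, b)
    loss, dpred = l2_loss(pred, y)
    _, dW, db = linear_backward(dpred, cache)
    W -= lr * dW
    b -= lr * db
```

On this noiseless synthetic data the loop recovers S and t almost exactly; on the real keypoint data you will need to tune the learning rate and iteration count yourself.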


## 2 Softmax Classifier with One Layer Neural Network [40 pts]

For Problem 2 and Problem 3, you will implement a softmax classifier from scratch to classify images. You cannot use any deep learning libraries such as PyTorch in this part.

Implement the ReLU layer and softmax layer in layers.py. These functions are shown below:

• ReLU layer:

y = x if x ≥ 0, and y = 0 otherwise.

• Softmax layer:

y_i = e^(x_i) / Σ_{j=1}^{N} e^(x_j)

Note: when you exponentiate even large-ish numbers in your softmax layer, the result can be quite large and numpy will return inf. To prevent this, you can calculate the softmax layer with max subtraction:

y_i = e^(x_i − max(x)) / Σ_{j=1}^{N} e^(x_j − max(x))

It's also not hard to see that these two softmax equations are in fact equivalent, since

e^(x_i − max(x)) / Σ_{j=1}^{N} e^(x_j − max(x)) = (e^(x_i) / e^(max(x))) / (Σ_{j=1}^{N} e^(x_j) / e^(max(x))) = e^(x_i) / Σ_{j=1}^{N} e^(x_j)
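A minimal numpy sketch of the ReLU layer and the max-subtraction softmax; the function names and the last-axis batching convention are illustrative assumptions, so match them to your own layers.py signatures.

```python
import numpy as np

def relu_forward(x):
    """y = x for x >= 0, else 0. Cache the input for the backward pass."""
    y = np.maximum(x, 0)
    cache = x
    return y, cache

def relu_backward(dy, cache):
    """Gradient passes through only where the input was nonnegative."""
    x = cache
    return dy * (x >= 0)

def softmax(x):
    """Numerically stable softmax over the last axis."""
    shifted = x - np.max(x, axis=-1, keepdims=True)   # max subtraction
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

scores = np.array([1000.0, 1001.0, 1002.0])
print(softmax(scores))   # finite probabilities; a naive exp() here would overflow to inf
```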

In softmax.py, you need to implement your network. You're only allowed to use the layer functions you implement; however, you're encouraged to implement additional features in layers.py and use them here. Filling in all the TODOs in the skeleton code will be sufficient.

After making sure your network works, you need to train it on the CIFAR-10 [1] dataset, which is available at https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz. CIFAR-10 has 10 classes, 50000 training images, and 10000 test images. You need to split the training images into a training set and a validation set yourself. After decompressing the downloaded dataset, you can use the provided Python interface to read the CIFAR-10 dataset in train.py, although you're free to modify it. Preprocess your images and tune the model hyperparameters in train.py.
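As one illustration of the preprocessing and train/validation split described above: the sketch below assumes the raw CIFAR-10 row format of 3072 values per image (here replaced by stand-in random data), and the 90/10 split ratio is an arbitrary choice, not a requirement.

```python
import numpy as np

# Stand-in for the raw arrays read from the CIFAR-10 batches:
# train_x would be shape (50000, 3072) uint8, train_y the matching labels.
rng = np.random.default_rng(0)
train_x = rng.integers(0, 256, size=(1000, 3072)).astype(np.uint8)
train_y = rng.integers(0, 10, size=1000)

# Normalize to zero mean and unit standard deviation per pixel...
x = train_x.astype(np.float64)
x = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)
# ...or, alternatively, simply scale to [0, 1]:
#   x = train_x.astype(np.float64) / 255.0

# Hold out part of the training images as a validation set.
n_val = len(x) // 10
perm = rng.permutation(len(x))
val_x, val_y = x[perm[:n_val]], train_y[perm[:n_val]]
tr_x, tr_y = x[perm[n_val:]], train_y[perm[n_val:]]
```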


Figure 1: Example images from the CIFAR-10 dataset [1]

This task is open-ended. However, here are some suggestions:

• When working on the backward functions in layers.py, you need to calculate the gradients of the objective function. To check your implementation, you can make use of

∂f(…, x_i, …)/∂x_i = lim_{δ→0} [f(…, x_i + δ, …) − f(…, x_i, …)] / δ

where f : [x_1, …, x_n] ∈ R^n ↦ R. So, you can derive an approximation to the gradients by setting some small value δ, and compare the gradient your implementation gets with the numerical value of the approximation.

• Image preprocessing is important: you can either normalize the images to zero mean and unit standard deviation, or simply scale them to the range [0, 1].

• Training neural networks can be hard and slow. Start early!
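The finite-difference check suggested above can be sketched as follows. This version uses a centered difference, which for smooth f is more accurate than the one-sided form in the limit definition; the simple f used here (with known gradient 2x) is a stand-in for your own layer functions.

```python
import numpy as np

def numerical_gradient(f, x, delta=1e-5):
    """Approximate df/dx_i for scalar-valued f with a centered finite difference."""
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'])
    for _ in it:
        idx = it.multi_index
        orig = x[idx]
        x[idx] = orig + delta
        f_plus = f(x)
        x[idx] = orig - delta
        f_minus = f(x)
        x[idx] = orig                            # restore the original value
        grad[idx] = (f_plus - f_minus) / (2 * delta)
    return grad

# Example check: f(x) = sum(x**2) has analytic gradient df/dx = 2x.
x = np.random.default_rng(0).normal(size=(3, 4))
analytic = 2 * x
numeric = numerical_gradient(lambda v: np.sum(v ** 2), x)
rel_err = np.max(np.abs(analytic - numeric)
                 / (np.abs(analytic) + np.abs(numeric) + 1e-12))
print(rel_err)   # tiny: the centered difference is exact for quadratics up to rounding
```

To check a backward pass, replace the lambda with (for example) the loss computed through your layer, and compare its output against the dx your backward function returns.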

Grading checklist:

1. For all layers, we will have a series of tests that automatically test whether your functions are written

correctly. We won’t use edge cases. You do not need to include any discussion of your code for part

one unless you have decided to implement some extra features.

2. Your report should detail the architecture you used to train on CIFAR-10. Include information on the hyperparameters chosen for training and a plot showing both training and validation accuracy across iterations.

3. Report the accuracy on the test set of CIFAR-10. You should only evaluate your model on the test set once; all hyperparameter tuning should be done on the validation set. We expect you to achieve 40% accuracy on the test set.

4. Include a discussion of the parameters you chose for experimentation, along with your results and analysis. In the discussion, we expect to see plots and data instead of a general description.


## 3 Softmax Classifier with Hidden Layers [20 pts]

Continue to work on softmax.py: add a hidden layer of dimension N to your neural network if hidden_dim is set to a positive integer N; there is no hidden layer if hidden_dim=None. Use ReLU as your activation function.

Use this model to do the classification as you did in Problem 2. We expect you to achieve 50% accuracy on the test set. Also include the hidden dimension in your report. The grading checklist is the same as the one for Problem 2.
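One way to structure the branching forward pass is sketched below. The parameter dictionary and helper shapes here are assumptions for illustration (your softmax.py skeleton prescribes its own interface), and the weight scale is an arbitrary initialization choice.

```python
import numpy as np

def forward(x, params, hidden_dim=None):
    """Class scores for a softmax classifier with an optional ReLU hidden layer.

    params holds 'W1'/'b1' and, when hidden_dim is a positive integer,
    also 'W2'/'b2' -- mirroring the hidden_dim=None / hidden_dim=N switch.
    """
    if hidden_dim is None:
        scores = x @ params['W1'].T + params['b1']            # single Linear layer
    else:
        h = np.maximum(x @ params['W1'].T + params['b1'], 0)  # Linear -> ReLU
        scores = h @ params['W2'].T + params['b2']            # hidden -> classes
    return scores

# Shapes for CIFAR-10: D = 3*32*32 = 3072 inputs, C = 10 classes.
rng = np.random.default_rng(0)
D, N_hidden, C = 3072, 128, 10
params = {
    'W1': rng.normal(scale=1e-3, size=(N_hidden, D)),
    'b1': np.zeros(N_hidden),
    'W2': rng.normal(scale=1e-3, size=(C, N_hidden)),
    'b2': np.zeros(C),
}
scores = forward(rng.normal(size=(4, D)), params, hidden_dim=N_hidden)
print(scores.shape)  # (4, 10)
```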

Once you finish the training, save the model with the highest test accuracy for the next problem; we've already provided save/load functions for you in softmax.py.

## 4 Fooling Images [20 pts]

Fooling images are a good way to see where the model works and where it fails; they are closely related to another important field of machine learning: adversarial attacks and robustness. In the following example, a few spots on the image make the model misclassify this ship image as an airplane, although it still looks like a ship.

Figure 2: Original ship image (left) and the fooling image classified as airplane (right).

In this part, you will use gradient ascent to generate a fooling image from a correctly classified image from CIFAR-10.

1. (5 pts) Finish the remaining code in softmax.py: return the gradient of the loss with respect to the input if the return_dx flag is True.

2. (15 pts) Gradient ascent is similar to gradient descent: you'll fix the parameters of the model, compute the gradient of the classification score with respect to the input image, and update the image iteratively. In this problem, you will instead compute the gradient of the loss with respect to the input image for a chosen fooling class; in this way, you're doing gradient descent to minimize the loss but gradient ascent to maximize the classification score of that fooling class.

You will implement the following steps in fooling_image.py:

• Load the trained model you got from the last problem.

• Load an image that is correctly classified by your model, choose a different class as the fooling class, fix the model parameters, and compute the gradient of the loss with respect to your input image.

• Update your input image with the gradient; repeat this process until your model classifies the input image as the fooling class.

Include the original image, the fooling image, and their difference in your report; you can magnify the difference if it's too small. Comment on the robustness of your model.
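The update loop above can be sketched as follows. The "model" here is a hypothetical stand-in (a single random linear layer, so the gradient of a class score with respect to the input is just a row of W); in fooling_image.py you would instead load your saved network and use the dx it returns when return_dx is True, and the step size is a hyperparameter to tune.

```python
import numpy as np

# Hypothetical stand-in for a trained model: one linear layer over a
# flattened 32x32x3 image. Replace with your loaded network.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(10, 3072))
b = np.zeros(10)

def scores(x):
    return W @ x + b

def grad_score_wrt_x(class_idx):
    # For a linear model, d(score_c)/dx is simply row c of W.
    # For your network, this comes from the backward pass through the input.
    return W[class_idx]

x = rng.normal(size=3072)                        # stand-in "correctly classified" image
fooling_class = (np.argmax(scores(x)) + 1) % 10  # any class other than the prediction

lr = 0.5                                         # step size: a hyperparameter to tune
steps = 0
while np.argmax(scores(x)) != fooling_class and steps < 1000:
    x = x + lr * grad_score_wrt_x(fooling_class) # gradient ASCENT on the fooling score
    steps += 1
```

Keeping a copy of the original image before the loop lets you visualize the (magnified) difference afterwards, as the report asks.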

## References

[1] Alex Krizhevsky. Learning Multiple Layers of Features from Tiny Images. 2009.