BME646 and ECE60146: Homework 7

1 Introduction

Got pizza? The goal for this HW is to create your own pizza-generating
Generative Adversarial Networks (GANs). The learning objectives are:

1. Understand how the generator and discriminator networks are trained

to compete against each other in a minimax game.

2. Experiment with two different GAN learning criteria: Binary Cross-Entropy (BCE) and the Wasserstein distance.

3. Evaluate your generated images qualitatively, and quantitatively using

the Fréchet Inception Distance (FID).

2 Getting Ready for This Homework

Before embarking on this homework, do the following:

1. Go through Slides 28 through 44 of the Week 9 slide deck on Semantic
Segmentation [3] and develop a good understanding of what is meant by
Transpose Convolution.

2. Also go through Slides 43 through 51 of the same set of Week 9

slides to fully understand the relationship between the Kernel Size,

Padding, and Output Size for Transpose Convolution. Make sure you

understand the example shown on Slide 44 in which a 4-channel 1 × 1

noise vector is expanded into a 2-channel 4 × 4 noise image. This

example is foundational to designing the Generator side of a GAN.

3. Understand the GAN material on Slides 60 through 77 of the Week 11
slide deck on “Generative Adversarial Networks” [2]. For additional
depth, you may wish to read the original GAN paper by Goodfellow
et al. [5]:

https://arxiv.org/pdf/1406.2661.pdf

4. When you are learning about a new type of neural network,
experimenting with an implementation by varying its parameters and
seeing how that affects the results can often give you deep insights in a
short time. If you believe in that philosophy, execute the following
script in the ExamplesAdversarialLearning directory of DLStudio:

python dcgan_DG1.py

It uses the PurdueShapes5GAN dataset that is described on Slides 56

through 61 of the Week 11 slides. Instructions for downloading this

dataset are on the main DLStudio webpage.

5. To understand the Wasserstein distance, first read the explanation
on Slides 38 through 43 of the Week 11 slides. Make sure you
understand the 1-Lipschitz condition for imposing smoothness
constraints on the distance function. Next, go over Slides 92 through 97
to understand how to create the Critic part of a Critic-Generator pair
for estimating the Wasserstein distance. Finally, look over Slide 100
to see how the 1-Lipschitz condition is actually implemented in code.
To experiment with the Wasserstein GAN in DLStudio yourself, execute
the script wgan_CG1.py, or wgan_CG2.py if you are also interested in
applying the gradient penalty. For a good alternative reference on how
a Wasserstein GAN with gradient penalty can be implemented, you can
read the networks.py file from [1].
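As a companion to the Transpose Convolution material in items 1 and 2, the relationship between Kernel Size, Padding, and Output Size can be sketched in a few lines of plain Python. The function below follows the output-size formula documented for torch.nn.ConvTranspose2d; the function name and the particular kernel, stride, and padding values are illustrative assumptions and may differ from the settings used on the slides.

```python
def conv_transpose_out_size(in_size, kernel_size, stride=1, padding=0,
                            output_padding=0, dilation=1):
    """Spatial output size of a transposed convolution along one dimension.

    Mirrors the formula documented for torch.nn.ConvTranspose2d.
    """
    return ((in_size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# One setting (an assumption) that expands a 1 x 1 input into a 4 x 4 output:
# kernel_size=4, stride=1, padding=0.
first = conv_transpose_out_size(1, kernel_size=4, stride=1, padding=0)

# A hypothetical generator stack: each further layer doubles the spatial size
# with kernel_size=4, stride=2, padding=1.
sizes = [first]
for _ in range(4):
    sizes.append(conv_transpose_out_size(sizes[-1], kernel_size=4,
                                         stride=2, padding=1))

print(sizes)  # [4, 8, 16, 32, 64]
```

Under these assumed settings, five transposed-convolution layers take a 1 × 1 noise vector to a 64 × 64 image. The number of channels per layer is a separate design choice that this formula does not constrain.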

3 Programming Tasks

3.1 Building and Training Your GAN

Here are the steps for making your own (fake) pizza:

1. Before starting, make sure you have downloaded the provided pizza

dataset from BrightSpace along with this handout. The dataset contains 8k+ images for training and 1k images for evaluation, all resized

to 64 × 64. Example images are shown in Figure 1.

Figure 1: Real pizzas.

2. Your first task in this homework is to conjure up your own generator and discriminator networks. Just like the previous homeworks,

you have total freedom on how you design your networks. The only

network-building requirement is that your generator has to be able to

generate RGB pizza images of size 64 × 64 from random noise vectors

and must do so while utilizing transposed convolutions.

3. Subsequently, you’ll need to write your own adversarial training logic.

You can refer to Slides 64 through 69 of the Week 11 slides to familiarize yourself with how it can be done. For this HW, we ask you to

experiment with two different adversarial learning criteria: the Binary

Cross-Entropy (BCE) loss as originally used by Goodfellow et al. in

[5], and the Wasserstein distance as introduced by Arjovsky et al. in

[4]. You should train two GANs in total, one with the BCE loss and

another with the Wasserstein distance. In the rest of this handout, we

shall call them the BCE-GAN and the W-GAN, respectively.

Note that for the purpose of this HW, we do not require you to enforce
the 1-Lipschitz constraint on your Critic for the W-GAN. To enforce
that constraint, the authors of the original W-GAN used the heuristic
of weight clipping (as mentioned on Slide 92), which has since been
generally replaced by the Gradient Penalty [6] (Slides 106 through 108).

4. In your report, plot the adversarial losses over training iterations for

both the generator and the discriminator in the same figure. Note that

you should make two separate figures for BCE-GAN and W-GAN.
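The two adversarial learning criteria from step 3 can be sketched as loss functions in PyTorch. This is a minimal sketch rather than the method from the slides: the function names are hypothetical, the BCE variants assume the discriminator outputs raw logits (no final sigmoid), and the generator uses the common non-saturating form of the BCE loss. No 1-Lipschitz enforcement is included, consistent with the note in step 3.

```python
import torch
import torch.nn.functional as F


def bce_discriminator_loss(real_logits, fake_logits):
    # The discriminator should label real images 1 and fake images 0.
    real_loss = F.binary_cross_entropy_with_logits(
        real_logits, torch.ones_like(real_logits))
    fake_loss = F.binary_cross_entropy_with_logits(
        fake_logits, torch.zeros_like(fake_logits))
    return real_loss + fake_loss


def bce_generator_loss(fake_logits):
    # Non-saturating form (an assumption): the generator is rewarded when
    # the discriminator labels its fakes as real.
    return F.binary_cross_entropy_with_logits(
        fake_logits, torch.ones_like(fake_logits))


def wgan_critic_loss(real_scores, fake_scores):
    # The critic maximizes E[C(real)] - E[C(fake)], so we minimize the negative.
    return fake_scores.mean() - real_scores.mean()


def wgan_generator_loss(fake_scores):
    # The generator tries to raise the critic's score on its fakes.
    return -fake_scores.mean()
```

In a training loop, the fake images would typically be detached from the generator's graph before the discriminator or critic update, and the values returned by these functions at each iteration are what would be recorded for the loss plots requested in step 4.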

3.2 Evaluating Your GAN

Here are the steps for evaluating your GANs:

1. First, you should generate 1k images of fake pizza from randomly

sampled noise vectors using your trained generator (BCE-GAN or W-GAN).

2. To evaluate your generated images quantitatively, you will use the
Fréchet Inception Distance (FID). Originally proposed in [7], the FID
is a widely used metric for measuring both the quality and the
diversity of GAN-generated images. More specifically, it does so by
measuring how close the distribution of the fake images is to the
distribution of the real images. To calculate the FID, one first
encodes the set of real images into feature vectors using a pretrained
Inception network, and then models the resulting distribution of
feature vectors using a multivariate Gaussian distribution. The same is
carried out for the set of fake images. Once that is done, the FID is
simply the Fréchet distance between the two multivariate Gaussians.

3. For this homework, you will be using the pytorch-fid package [8] for

calculating the FIDs. To install the package, use the command:

pip3 install pytorch-fid

Once installed, you can use the pytorch-fid package in a Python

script as follows:

    import torch
    from pytorch_fid.fid_score import (
        calculate_activation_statistics,
        calculate_frechet_distance,
    )
    from pytorch_fid.inception import InceptionV3

    # Run the Inception network on the GPU when one is available.
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # The trailing ... stands for the remaining image paths.
    real_paths = ['/real/0.jpg', '/real/1.jpg', ...]
    fake_paths = ['/fake/0.jpg', '/fake/1.jpg', ...]
    dims = 2048  # dimensionality of the Inception features used for the FID
    block_idx = InceptionV3.BLOCK_INDEX_BY_DIM[dims]
    model = InceptionV3([block_idx]).to(device)
    # Mean and covariance of the Inception features of each image set.
    m1, s1 = calculate_activation_statistics(
        real_paths, model, device=device)
    m2, s2 = calculate_activation_statistics(
        fake_paths, model, device=device)
    fid_value = calculate_frechet_distance(m1, s1, m2, s2)
    print(f'FID: {fid_value:.2f}')

4. In your report, for qualitative evaluation, display a 4 × 4 image grid,

similar to what is shown in Figure 1, showcasing images randomly

generated by your BCE-GAN. Also display the same with images by

your W-GAN. You might find the functions in torchvision.utils

really helpful here. For quantitative evaluation, present the FID values

for both GAN variants. Finally, include a paragraph discussing your

results: BCE-GAN vs. W-GAN, which is better?
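To make the FID description in step 2 concrete, the Fréchet distance between two Gaussians with means μ1, μ2 and covariances Σ1, Σ2 is ||μ1 − μ2||² + Tr(Σ1 + Σ2 − 2(Σ1Σ2)^{1/2}). The NumPy sketch below is only an illustration of that formula, not the pytorch-fid implementation. The function names are hypothetical, and the trace of the matrix square root is computed from the eigenvalues of Σ1Σ2, which assumes both covariances are positive semi-definite.

```python
import numpy as np


def gaussian_stats(features):
    """Fit a multivariate Gaussian to an (N, D) array of feature vectors."""
    mu = features.mean(axis=0)
    sigma = np.cov(features, rowvar=False)
    return mu, sigma


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Tr((S1 S2)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of S1 S2, which are real and non-negative for PSD inputs.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    # Clip tiny negative values caused by floating-point error.
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2)
                 - 2.0 * trace_sqrt)
```

Here the feature arrays would be the Inception activations of the real and fake image sets. For the numbers in your report, use the pytorch-fid package as instructed in step 3.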

4 Submission Instructions

Include a typed report explaining how you solved the given programming
tasks.

1. Your PDF must include:

• The figures and descriptions as mentioned in Sec. 3.

• Your source code. Make sure that your source code files are
adequately commented and cleaned up.

2. Turn in a zipped file that includes (a) a typed, self-contained PDF
report with source code and results and (b) source code files (only .py
files are accepted). Rename your .zip file as hw7 <First Name><Last
Name>.zip and follow the same file naming convention for your PDF
report too.

3. Make sure your submission zip file is under 10MB. Compress

your figures if needed.

4. Do NOT submit your network weights.

5. For all homeworks, you are encouraged to use .ipynb for development

and the report. If you use .ipynb, please convert it to .py and submit

that as source code.

6. You can resubmit a homework assignment as many times as you want

up to the deadline. Each submission will overwrite any previous

submission. If you are submitting late, do it only once on

BrightSpace. Otherwise, we cannot guarantee that your latest submission will be pulled for grading and will not accept related regrade

requests.

7. The sample solutions from previous years are for reference only. Your

code and final report must be your own work.

8. To help us provide better feedback, make sure to number your
figures.

References

[1] pytorch-CycleGAN-and-pix2pix. URL https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix.

[2] Generative Adversarial Networks for Data Modeling. URL https://engineering.purdue.edu/DeepLearn/pdf-kak/GAN.pdf.

[3] Encoder-Decoder Architectures for Semantic Segmentation of Images. URL https://engineering.purdue.edu/DeepLearn/pdf-kak/SemanticSeg.pdf.

[4] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein generative adversarial networks. In International Conference on Machine
Learning, pages 214–223. PMLR, 2017.

[5] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David

Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks. Communications of the ACM, 63(11):139–

144, 2020.

[6] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin,
and Aaron C Courville. Improved training of Wasserstein GANs. Advances
in Neural Information Processing Systems, 30, 2017.

[7] Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard
Nessler, and Sepp Hochreiter. GANs trained by a two time-scale update
rule converge to a local Nash equilibrium. Advances in Neural Information
Processing Systems, 30, 2017.

[8] Maximilian Seitzer. pytorch-fid: FID Score for PyTorch. https://github.com/mseitzer/pytorch-fid, August 2020. Version 0.3.0.
