Assignment #4 Logistic Regression



Machine Learning
COMP 5630 / COMP 6630 / COMP 6630 – D01

Submission Instructions
This assignment is due Tuesday, September 22, 2022, at 11:59pm. Please submit your solutions via
Canvas. You should submit your assignment as a typeset PDF.
Please do not include scanned or photographed equations, as they are difficult for us to grade.
Late Submission Policy
The late submission policy for assignments will be as follows unless otherwise specified:
1. 75% credit within 0-48 hours after the submission deadline.
2. 50% credit within 48-96 hours after the submission deadline.
3. 0% credit more than 96 hours after the submission deadline.
1 General Questions About Logistic Regression [60 pts]
1. [10 Points] Explain why logistic regression is a discriminative classifier (as opposed to a
generative classifier such as Naive Bayes).
2. [10 Points] Recall the prediction rule for logistic regression: if $p(y^j = 1 \mid x^j) > p(y^j = 0 \mid x^j)$,
then predict 1, otherwise predict 0. What does the decision boundary of logistic regression
look like? Justify your answer (e.g., try to write out the decision boundary as a function of
$w_0, w_1, w_2$ and $x^j_1, x^j_2$).
3. In this question, we will derive the logistic regression algorithm (the M(C)LE and its gradient). For simplicity, we assume the dataset is two-dimensional. Given a training set
$\{(x^i, y^i);\ i = 1, \dots, n\}$, where $x^i \in \mathbb{R}^2$ is a feature vector and $y^i \in \{0, 1\}$ is a binary label, we
want to find the parameters $\hat{w}$ that maximize the likelihood for the training set, assuming a
parametric model of the form
$$p(y = 1 \mid x; w) = \frac{1}{1 + \exp\left(-(w_0 + w_1 x_1 + w_2 x_2)\right)}.$$
(a) [20 Points] Below, we give a derivation of the conditional log likelihood. In this
derivation, provide a short justification for why each line follows from the previous one.
Next, we will derive the gradient of the previous expression with respect to $w_0, w_1, w_2$,
i.e., $\frac{\partial l(w)}{\partial w_i}$, where $l(w)$ denotes the log likelihood from part 1. We will perform a few steps
of the derivation, and then ask you to do one step at the end. If we take the derivative
of Expression 8 with respect to $w_i$ for $i \in \{1, 2\}$, we get the following expression:
The blue expression is linear in $w_i$, so it can be simplified to $\sum_{j=1}^{n} y^j x_i^j$. For the red
expression, we use the chain rule as follows (first we consider a single $j \in [1, n]$).
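(b) [20 Points] Now, use Equation 13 (and the previous discussion) to show that overall,
Expression 9, i.e., $\frac{\partial l(w)}{\partial w_i}$, is equal to
$$\sum_{j=1}^{n} x_i^j \left( y^j - p(y^j = 1 \mid x^j; w) \right).$$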
Hint: does Expression 13 look like a familiar probability?
Since the log likelihood is concave, it is easy to optimize using gradient ascent. The final
algorithm is as follows. We pick a step size η, and then perform the following iterations
until the change is < ϵ:
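As an illustration of these iterations, here is a minimal Python sketch. It assumes the standard update $w \leftarrow w + \eta \nabla l(w)$ with the gradient from part (b), and it reads "the change" as the largest absolute change in any single weight. The function names, the `max_iters` safety cap, and the toy data are made up for this example and are not part of the starter code.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def log_likelihood(w, X, y):
    """Conditional log likelihood l(w) for w = [w0, w1, w2]."""
    total = 0.0
    for (x1, x2), yj in zip(X, y):
        z = w[0] + w[1] * x1 + w[2] * x2
        # y*z - log(1 + exp(z)), with the log term written stably
        total += yj * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return total


def gradient(w, X, y):
    """Gradient of l(w): sum over j of x^j * (y^j - p(y^j = 1 | x^j; w))."""
    g = [0.0, 0.0, 0.0]
    for (x1, x2), yj in zip(X, y):
        r = yj - sigmoid(w[0] + w[1] * x1 + w[2] * x2)
        g[0] += r  # the bias w0 acts like a weight on a constant feature of 1
        g[1] += r * x1
        g[2] += r * x2
    return g


def gradient_ascent(X, y, eta=0.05, eps=1e-6, max_iters=10000):
    """Repeat w <- w + eta * gradient until the largest weight change is < eps."""
    w = [0.0, 0.0, 0.0]
    for _ in range(max_iters):
        step = [eta * gi for gi in gradient(w, X, y)]
        w = [wi + si for wi, si in zip(w, step)]
        if max(abs(si) for si in step) < eps:
            break
    return w


def predict(w, x):
    """Predict 1 when p(y = 1 | x; w) > p(y = 0 | x; w), i.e. when p > 0.5."""
    p = sigmoid(w[0] + w[1] * x[0] + w[2] * x[1])
    return 1 if p > 0.5 else 0


# Made-up toy data (not linearly separable), purely for illustration.
X_toy = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (2.0, 2.0),
         (3.0, 2.5), (1.0, 1.0), (2.0, 1.5)]
y_toy = [0, 0, 0, 1, 1, 1, 0]
w_hat = gradient_ascent(X_toy, y_toy)
```

On linearly separable data the weights can grow without bound and the stopping test may never fire, which is why this sketch also caps the number of iterations.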
2 Logistic Regression Implementation [40 pts]
In this assignment you will implement simple linear classifiers and run them on the
Mushroom dataset, a simple categorical binary classification dataset. Please note that the
labels in the dataset are 0/1, as opposed to -1/1 as in the lectures, so you may have to change
either the labels or the derivations of parameter update rules accordingly.
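If you choose to change the labels rather than the derivations, the mapping between the two conventions is a one-liner in each direction. The sketch below uses plain Python lists and hypothetical helper names; the starter code may store labels differently (for example as NumPy arrays).

```python
def to_plus_minus(labels01):
    """Map labels in {0, 1} to {-1, +1}."""
    return [2 * y - 1 for y in labels01]


def to_zero_one(labels_pm):
    """Map labels in {-1, +1} back to {0, 1}."""
    return [(y + 1) // 2 for y in labels_pm]


y01 = [0, 1, 1, 0]
y_pm = to_plus_minus(y01)
```

Whichever convention you pick, use it consistently in both the update rule and the accuracy computation.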
The goal of this assignment is to help you understand the fundamentals of the classic logistic
regression method and become familiar with scientific computing tools in Python. You will also
get experience in hyperparameter tuning and using proper train/validation/test data splits.
Download the starter code from the package (“Logistic”) provided to you.
The top-level notebook (“Logistic.ipynb”) will guide you through all of the steps. Setup instructions are below. The format of this assignment is inspired by the Stanford CS231n assignments,
and we have borrowed some of their data loading and instructions in our assignment IPython notebook.
None of the parts of this assignment require the use of a machine with a GPU. You should be
able to complete the assignment using your local machine.
Environment Setup (Local): You will need a Python environment set up with the appropriate packages.
IPython: The assignment is given to you in the Logistic.ipynb file. Ensure that IPython is
installed. You may then navigate to the assignment directory
in the terminal and start a local IPython server using the jupyter notebook command.
Reporting: Describe the hyperparameter tuning you tried for learning rate and number of
epochs. Report the optimal hyperparameter setting you found in the list below. Also report your
training, validation, and testing accuracy with your optimal hyperparameter setting.
• Optimal hyperparameters:
• Training accuracy:
• Validation accuracy:
• Test accuracy:
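One possible way to organize the tuning is a small grid search that selects the learning rate and number of epochs by validation accuracy alone and evaluates on the test split only once at the end. The sketch below is an assumption-laden illustration: `train`, `accuracy`, and `grid_search` are hypothetical names, the candidate values are arbitrary, and the data is a synthetic stand-in rather than the Mushroom dataset.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train(X, y, lr, epochs):
    """Batch gradient ascent on the mean log likelihood; w[0] is the bias."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        g = [0.0] * len(w)
        for xj, yj in zip(X, y):
            r = yj - sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], xj)))
            g[0] += r
            for i, xi in enumerate(xj, start=1):
                g[i] += r * xi
        w = [wi + lr * gi / n for wi, gi in zip(w, g)]
    return w


def accuracy(w, X, y):
    """Fraction of examples whose thresholded prediction matches the 0/1 label."""
    hits = 0
    for xj, yj in zip(X, y):
        p = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], xj)))
        hits += int((p > 0.5) == (yj == 1))
    return hits / len(X)


def grid_search(train_set, val_set, learning_rates, epoch_counts):
    """Return (best_val_acc, lr, epochs), choosing by validation accuracy only."""
    best = None
    for lr in learning_rates:
        for epochs in epoch_counts:
            w = train(*train_set, lr, epochs)
            val_acc = accuracy(w, *val_set)
            if best is None or val_acc > best[0]:
                best = (val_acc, lr, epochs)
    return best


# Synthetic stand-in data (NOT the Mushroom dataset): label is 1 when x1 + x2 > 0.
random.seed(0)
X_all = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(300)]
y_all = [1 if x1 + x2 > 0 else 0 for x1, x2 in X_all]
train_set = (X_all[:180], y_all[:180])
val_set = (X_all[180:240], y_all[180:240])
test_set = (X_all[240:], y_all[240:])

best_val_acc, best_lr, best_epochs = grid_search(
    train_set, val_set, learning_rates=[0.01, 0.1, 1.0], epoch_counts=[10, 50]
)
w_best = train(*train_set, best_lr, best_epochs)
test_acc = accuracy(w_best, *test_set)  # evaluated once, after tuning is finished
```

Keeping the test split out of the selection loop is what makes the reported test accuracy an honest estimate of performance on unseen data.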
Also, create two plots as follows:
1. Fix the optimal learning rate, then create a plot where the x-axis varies the number of epochs and
the y-axis shows the training, validation, and testing accuracy.
2. Fix the optimal number of epochs, then create a plot where the x-axis varies the learning rate and
the y-axis shows the training, validation, and testing accuracy.
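Both plots follow the same pattern: sweep one hyperparameter, record the three accuracies at each value, and draw one line per split. A rough sketch of that pattern is given below, assuming matplotlib is available in your environment. The helper names and the `evaluate` callback (which should train with the swept value and return the training, validation, and test accuracy) are hypothetical and not part of the starter code.

```python
def collect_curves(values, evaluate):
    """Call evaluate(v) -> (train_acc, val_acc, test_acc) for each swept value."""
    curves = {"train": [], "validation": [], "test": []}
    for v in values:
        train_acc, val_acc, test_acc = evaluate(v)
        curves["train"].append(train_acc)
        curves["validation"].append(val_acc)
        curves["test"].append(test_acc)
    return curves


def plot_curves(values, curves, xlabel, log_x=False):
    """Draw one accuracy line per split against the swept hyperparameter."""
    # Imported lazily so collect_curves can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for split, accs in curves.items():
        ax.plot(values, accs, marker="o", label=split)
    if log_x:
        ax.set_xscale("log")  # learning rates are often swept on a log grid
    ax.set_xlabel(xlabel)
    ax.set_ylabel("accuracy")
    ax.legend()
    return fig, ax
```

For the first plot, `values` would be the epoch counts and `evaluate` would train with the fixed optimal learning rate; for the second, `values` would be the learning rates (with `log_x=True` if they span several orders of magnitude) and `evaluate` would train with the fixed optimal number of epochs.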
Finally, submit “” on Canvas as well.
Disclaimers: This assignment re-uses some materials from the publicly available website of the
CMU Introduction to Machine Learning course, 10-315, Spring 2019. I personally thank Prof.
Maria-Florina Balcan for sharing her teaching materials publicly. This assignment is used exclusively
for instructional purposes.
We also thank the University of Illinois at Urbana-Champaign for developing this assignment.
Credit goes to Justin Lizama, Daniel Gonzales,
and Weilin Zhang.
