Assignment 2: gradient descent algorithm





Assignment 2
75 marks
Remember to comment your code well in order to get the marks reserved for comments.
Q1. This question builds on the gradient descent question I asked in the mid-sem exam.
I have shown you above the completed version of the gradient descent algorithm. Now,
(a) Use this function to find minima for (i) x^2 + 3x + 4 and (ii) x^4 - 3x^2 + 2x. [5 points]
(b) Write a gradient function to calculate gradients for a linear regression y = ax + b [10 points]
(c) Generate artificial data for this regression according to the following protocol
and use gradient descent to find the optimal parameters relating X with y. If you do this correctly,
you should get {a,b} ~ {0.3, 2}. [10 points]
(d) Implement minibatch stochastic gradient descent using the code base you have developed so far.
[15 points]
(e) Does SGD do better or worse in terms of time performance on our data? Is there an optimal
minibatch size that works best? Quantify and interpret your findings. [10 points]
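One possible starting point for parts (a), (b) and (d) is sketched below. It is not the completed algorithm referred to in Q1, which is not reproduced here. The function names, the learning rates, the stopping tolerance, and the choice of mean squared error as the regression loss are all assumptions made for illustration. The data at the bottom is a noise-free placeholder on the line y = 0.3x + 2, not the protocol from part (c).

```python
import random

def gradient_descent(grad, x0, lr=0.01, n_steps=10000, tol=1e-10):
    """Plain 1-D gradient descent: step against the gradient until steps get tiny."""
    x = x0
    for _ in range(n_steps):
        step = lr * grad(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Part (a): derivatives worked out by hand.
def grad_f1(x):
    return 2 * x + 3  # d/dx (x^2 + 3x + 4)

def grad_f2(x):
    return 4 * x**3 - 6 * x + 2  # d/dx (x^4 - 3x^2 + 2x)

def linreg_gradient(a, b, xs, ys):
    """Part (b): gradient of the assumed loss mean((a*x + b - y)^2) w.r.t. (a, b)."""
    n = len(xs)
    grad_a = sum(2.0 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (a * x + b - y) for x, y in zip(xs, ys)) / n
    return grad_a, grad_b

def minibatch_sgd(xs, ys, batch_size=20, lr=0.01, n_epochs=200, seed=0):
    """Part (d): one parameter update per shuffled minibatch."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    a, b = 0.0, 0.0
    for _ in range(n_epochs):
        rng.shuffle(indices)  # new batch composition every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_a, grad_b = linreg_gradient(
                a, b, [xs[i] for i in batch], [ys[i] for i in batch]
            )
            a -= lr * grad_a
            b -= lr * grad_b
    return a, b

# f1 has a single minimum; f2 has two, so the answer depends on the start point.
print(gradient_descent(grad_f1, x0=0.0))
print(gradient_descent(grad_f2, x0=2.0))
print(gradient_descent(grad_f2, x0=-2.0))

# Placeholder data (NOT the assignment's protocol): noise-free points on y = 0.3x + 2.
xs = [0.05 * i for i in range(200)]
ys = [0.3 * x + 2.0 for x in xs]
print(minibatch_sgd(xs, ys))
```

With these settings the second polynomial settles near x = 1 when started from x0 = 2 and near x = -(1 + sqrt(3))/2 when started from x0 = -2, so it is worth trying several starting points. Setting batch_size=len(xs) turns minibatch_sgd into full-batch gradient descent, which gives a baseline for the timing comparison in part (e).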
Q2. Surprise! This problem too builds on a problem that I asked in the mid-sem exam. Consider
again this Bayesian network
and calculate
(i) the probability that someone has both a cold and a fever [5 points]
(ii) the probability that someone who has a cough has a cold. [10 points]
Show your work, not just the final answer.
Q3. Derive the MLE for the parameters of a k-sided multinomial distribution. [10 points]
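The usual route for Q3 is a Lagrange-multiplier argument, outlined below as a sketch. The notation is an assumption, not taken from the assignment: n_i is the number of times side i was observed, N is the total number of observations, and theta_i is the probability of side i.

```latex
\begin{align*}
\ell(\theta) &= \log \frac{N!}{\prod_{i=1}^{k} n_i!} + \sum_{i=1}^{k} n_i \log \theta_i,
  \qquad \text{subject to } \sum_{i=1}^{k} \theta_i = 1, \\
\mathcal{L}(\theta, \lambda) &= \sum_{i=1}^{k} n_i \log \theta_i
  + \lambda \Bigl(1 - \sum_{i=1}^{k} \theta_i\Bigr), \\
\frac{\partial \mathcal{L}}{\partial \theta_i} &= \frac{n_i}{\theta_i} - \lambda = 0
  \;\Longrightarrow\; \theta_i = \frac{n_i}{\lambda}, \\
\lambda &= \sum_{i=1}^{k} n_i = N
  \qquad \Bigl(\text{from } \sum_{i=1}^{k} \theta_i = 1\Bigr), \\
\hat{\theta}_i &= \frac{n_i}{N}.
\end{align*}
```

The constant multinomial coefficient drops out of the Lagrangian because it does not depend on theta. A complete answer should also argue that this stationary point is a maximum, for example by noting that the log-likelihood is concave in theta.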