## Description

Assignment 2

75 marks

Remember to comment your code well in order to get the marks reserved for comments.

Q1. This question builds on the gradient descent question I asked in the mid-sem exam.

I have shown you above the completed version of the gradient descent algorithm. Now,

(a) Use this function to find minima for (i) $x^2 + 3x + 4$ and (ii) $x^4 - 3x^2 + 2x$. [5 points]
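As a rough illustration of part (a), the sketch below runs plain gradient descent on both polynomials. The function `gradient_descent`, its parameters (`lr`, `n_steps`, `tol`), and the starting points are hypothetical stand-ins, not the course's own function, and the derivatives are worked out by hand. The quartic has two minima, so the result depends on where the search starts.

```python
def gradient_descent(grad, x0, lr=0.01, n_steps=10_000, tol=1e-10):
    """Plain 1-D gradient descent (hypothetical stand-in for the course function)."""
    x = x0
    for _ in range(n_steps):
        step = lr * grad(x)   # move against the gradient, scaled by the learning rate
        x -= step
        if abs(step) < tol:   # stop once the updates become negligible
            break
    return x

# (i) f(x) = x^2 + 3x + 4, so f'(x) = 2x + 3
min_i = gradient_descent(lambda x: 2 * x + 3, x0=0.0)

# (ii) f(x) = x^4 - 3x^2 + 2x, so f'(x) = 4x^3 - 6x + 2
quartic_grad = lambda x: 4 * x**3 - 6 * x + 2
min_ii_left = gradient_descent(quartic_grad, x0=-2.0)   # start left of both minima
min_ii_right = gradient_descent(quartic_grad, x0=2.0)   # start right of both minima

print(min_i, min_ii_left, min_ii_right)
```

Under these assumed settings the quadratic run settles near x = -1.5, while the two quartic runs settle near x = (-1 - √3)/2 and x = 1 respectively; comparing the function values at those points shows which one is the global minimum.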

(b) Write a gradient function to calculate gradients for a linear regression y = ax + b [10 points]
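One possible shape for such a gradient function is sketched below. It assumes the loss is the mean squared error, which the question does not specify, and the name `linreg_gradient` and its argument order are illustrative choices.

```python
import numpy as np

def linreg_gradient(a, b, X, y):
    """Gradient of the mean squared error of y_hat = a*X + b with respect to (a, b).

    Assumes an MSE loss: L(a, b) = mean((a*X + b - y)^2).
    """
    residual = a * X + b - y              # prediction error for each sample
    grad_a = 2.0 * np.mean(residual * X)  # dL/da
    grad_b = 2.0 * np.mean(residual)      # dL/db
    return grad_a, grad_b

# Tiny hand-checkable example (made-up numbers, not the assignment data)
X_small = np.array([0.0, 1.0, 2.0, 3.0])
y_small = np.array([1.0, 3.0, 2.0, 5.0])
ga, gb = linreg_gradient(0.5, 1.0, X_small, y_small)
```

A finite-difference check against the loss itself is a quick way to confirm that a gradient function like this one is correct.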

(c) Generate artificial data for this regression according to the following protocol

and use gradient descent to find the optimal parameters relating X with y. If you do this correctly, you should get {a, b} ≈ {0.3, 2}. [10 points]

(d) Implement minibatch stochastic gradient descent using the code base you have developed so far. [15 points]
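A minimal sketch of minibatch SGD for the linear regression is given below, assuming an MSE loss. The names `linreg_gradient` and `minibatch_sgd`, the hyperparameters, and the demo data are all assumptions made for illustration. In particular, the demo data is not the assignment's data protocol; it is simply noisy points drawn around y = 0.3x + 2 so that the sketch can run on its own.

```python
import numpy as np

def linreg_gradient(a, b, X, y):
    """MSE gradient of y_hat = a*X + b with respect to (a, b) (assumed loss)."""
    residual = a * X + b - y
    return 2.0 * np.mean(residual * X), 2.0 * np.mean(residual)

def minibatch_sgd(X, y, batch_size=32, lr=0.01, n_epochs=200, seed=0):
    """Minibatch SGD: each update uses the gradient of one random batch."""
    rng = np.random.default_rng(seed)
    a, b = 0.0, 0.0
    n = len(X)
    for _ in range(n_epochs):
        order = rng.permutation(n)              # reshuffle the data every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            grad_a, grad_b = linreg_gradient(a, b, X[idx], y[idx])
            a -= lr * grad_a
            b -= lr * grad_b
    return a, b

# Illustrative data only (NOT the assignment's protocol)
rng = np.random.default_rng(42)
X_demo = rng.uniform(0.0, 10.0, size=1000)
y_demo = 0.3 * X_demo + 2.0 + rng.normal(0.0, 0.1, size=1000)

a_hat, b_hat = minibatch_sgd(X_demo, y_demo)
print(a_hat, b_hat)
```

Setting `batch_size` to `len(X)` recovers full-batch gradient descent and setting it to 1 gives pure SGD, which makes the same function convenient for comparing batch sizes.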

(e) Does SGD do better or worse in terms of time performance on our data? Is there an optimal minibatch size that works best? Quantify and interpret your findings. [10 points]

Q2. Surprise! This problem too builds on a problem that I asked in the mid-sem exam. Consider again this Bayesian network

and calculate

(i) the probability that someone has both a cold and a fever [5 points]

(ii) the probability that someone who has a cough has a cold. [10 points]

Show your work, not just the final answer.

Q3. Derive the MLE for the parameters of a k-sided multinomial distribution. [10 points]
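One common route to this result uses a Lagrange multiplier to enforce that the probabilities sum to one. The outline below is a sketch under assumed notation, none of which comes from the assignment: $n_i$ is the number of times side $i$ was observed, $N = \sum_i n_i$ is the total number of trials, $\theta_i$ is the probability of side $i$, and $\lambda$ is the multiplier.

$$
\begin{aligned}
L(\theta) &= \frac{N!}{\prod_{i=1}^{k} n_i!} \prod_{i=1}^{k} \theta_i^{n_i}, \qquad \sum_{i=1}^{k} \theta_i = 1 \\
\ell(\theta) &= \log L(\theta) = \text{const} + \sum_{i=1}^{k} n_i \log \theta_i \\
\mathcal{J}(\theta, \lambda) &= \sum_{i=1}^{k} n_i \log \theta_i + \lambda \Bigl(1 - \sum_{i=1}^{k} \theta_i\Bigr) \\
\frac{\partial \mathcal{J}}{\partial \theta_i} &= \frac{n_i}{\theta_i} - \lambda = 0 \;\Rightarrow\; \theta_i = \frac{n_i}{\lambda} \\
\sum_{i=1}^{k} \frac{n_i}{\lambda} &= 1 \;\Rightarrow\; \lambda = N \;\Rightarrow\; \hat{\theta}_i = \frac{n_i}{N}
\end{aligned}
$$

Because the log-likelihood is concave in $\theta$, this stationary point is the maximum: the estimate for each side is its observed relative frequency.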