HOMEWORK 6

Instructions:

• Please submit your answers in a single PDF file, preferably made using LaTeX. No need to submit the LaTeX code.

• See Piazza post @114 for some recommendations on how to write your answers.

• Submit code for the programming exercises. You can use any programming language you like, but we highly recommend Python and PyTorch.

• Submit all the material on time.

1 Kernel SVM [30 points]

Consider the following kernel function defined over z, z′ ∈ Z:

k(z, z′) = 1 if z = z′, and 0 otherwise.

1. (10 pts) Prove that for any integer m > 0 and any z1, . . . , zm ∈ Z, the m × m kernel matrix K = [Kij] is positive semi-definite, where Kij = k(zi, zj) for i, j = 1, . . . , m. (Let us assume that for i ≠ j, we have zi ≠ zj.) Hint: an m × m matrix K is positive semi-definite if v⊤Kv ≥ 0 for all v ∈ Rᵐ.
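As a numerical sanity check (not a proof), the hint can be explored directly: build the Gram matrix for a few distinct inputs and inspect its eigenvalues. A minimal sketch in NumPy, using integers as stand-in elements of Z:

```python
import numpy as np

def k(z, zp):
    # The kernel from above: 1 if the inputs are equal, 0 otherwise.
    return 1.0 if z == zp else 0.0

# Distinct inputs z_1, ..., z_m (stand-ins; Z could be any set).
zs = [0, 1, 2, 3, 4]
m = len(zs)

# Build the m x m Gram matrix K_ij = k(z_i, z_j).
K = np.array([[k(zi, zj) for zj in zs] for zi in zs])

# PSD is equivalent to all eigenvalues of the symmetric K being non-negative.
eigvals = np.linalg.eigvalsh(K)
print(eigvals.min() >= -1e-12)  # numerically non-negative
```

This only checks one instance; the graded answer still needs the general argument via v⊤Kv.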

2. (10 pts) Given a training set (z1, y1), . . . , (zn, yn) with binary labels, the dual SVM problem with the above kernel k will have parameters α1, . . . , αn, b ∈ R. (Assume that for i ≠ j, we have zi ≠ zj.) The predictor for input z takes the form

f(z) = Σ_{i=1}^n αi yi k(zi, z) + b.

Recall that the label prediction is sgn(f(z)). Prove that there exist α1, . . . , αn, b such that f correctly separates the training set. In other words, k induces a feature space rich enough that any training set can be linearly separated in it.

3. (10 pts) How does this f predict on an input z that is not in the training set?

Comment: One useful property of kernel functions is that the input space Z does not need to be a vector space; in other words, z does not need to be a feature vector. For all we know, Z could be the set of turkeys in the world. As long as we can compute k(z, z′), kernel SVM works on turkeys.

2 Game of Classifiers (70 pts)

2.1 Implementation (10 pts)

Implement the following models in the programming language of your choice. Include slack variables in your SVM implementation if needed. You may use the autograd features of PyTorch, TensorFlow, etc., or derive the gradients on your own. (But do not use the built-in SVM, Kernel SVM, or Logistic Regression models from libraries.)

• Implement Linear SVM (without kernels).

• Implement Kernel SVM, with options for linear, RBF, and polynomial kernels. Keep the kernel parameters tunable (e.g., do not fix the degree of the polynomial kernel; keep it as a variable and experiment with different values of it). Is Linear SVM a special case of Kernel SVM?

• Implement Logistic Regression with and without kernels (use the same kernels as above).
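As a starting-point sketch (not a reference implementation), the linear SVM with slack can be trained by subgradient descent on the regularized hinge loss; `lam`, `lr`, and `epochs` below are illustrative hyperparameters you would tune on the validation set:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on
        (1/n) * sum_i max(0, 1 - y_i (w.x_i + b)) + lam * ||w||^2,
    the soft-margin (slack) formulation. Labels y must be in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1  # points violating the margin contribute a subgradient
        grad_w = 2 * lam * w - (y[active, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    return np.sign(X @ w + b)
```

For the kernel variants, the same loss can be rewritten in terms of the Gram matrix instead of w; plugging in the linear kernel then recovers this model, which bears on the special-case question above.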


Homework 6 CS 760 Machine Learning

2.2 Evaluation on Synthetic Dataset

2.2.1 Synthetic Dataset-1 (20 pts)

Generate a 2-D dataset as follows. Let µ = 2.5 and let I2 be the 2 × 2 identity matrix. Generate points for the positive and negative classes from N([µ, 0], I2) and N([−µ, 0], I2), respectively. Generate 750 points per class (1500 in total). Randomly create train, validation, and test splits of 1000, 250, and 250 points, respectively. Do the following with this dataset:

• (5 pts) Train your Linear SVM and Logistic Regression models; report decision boundaries and test accuracies.

• (5 pts) Show decision boundaries with K-NN and Naive Bayes classifiers. (You can use library implementations or implement from scratch. Tune the hyper-parameters using the validation set.)

• (5 pts) Repeat the process, varying µ from 1.0 to 2.4 with a step size of 0.2. For each value of µ, obtain the test accuracies of the models and plot them (µ on the x-axis, test accuracy on the y-axis). (You will have one curve for each of the 4 classifiers mentioned above.)

• (5 pts) What are your conclusions from this exercise?
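The generation and splitting described above can be sketched as follows (the seed is an arbitrary choice for reproducibility):

```python
import numpy as np

def make_dataset(mu, n_per_class=750, seed=0):
    """Two Gaussian classes in 2-D: N([mu,0], I2) vs N([-mu,0], I2),
    shuffled and split into 1000 / 250 / 250 points."""
    rng = np.random.default_rng(seed)
    I2 = np.eye(2)
    X_pos = rng.multivariate_normal([mu, 0.0], I2, n_per_class)
    X_neg = rng.multivariate_normal([-mu, 0.0], I2, n_per_class)
    X = np.vstack([X_pos, X_neg])
    y = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])
    idx = rng.permutation(len(X))  # shuffle before splitting
    X, y = X[idx], y[idx]
    return ((X[:1000], y[:1000]),
            (X[1000:1250], y[1000:1250]),
            (X[1250:], y[1250:]))

train, val, test = make_dataset(mu=2.5)
```

Calling `make_dataset` with each µ in {1.0, 1.2, . . . , 2.4} gives the datasets needed for the accuracy-vs-µ plot.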

2.2.2 Synthetic Dataset-2 (20 pts)

Generate 1500 data points from the 2-D circles dataset of sklearn (sklearn.datasets.make_circles). Randomly create train, validation, and test splits of 1000, 250, and 250 points, respectively. Evaluate the above classifiers in this setting.

• (5 pts) Show decision boundaries for the Linear SVM and Logistic Regression classifiers.

• (5 pts) Show decision boundaries for Kernel SVM and Kernel Logistic Regression (use RBF and polynomial kernels). Try different values of the hyperparameters and report results with whichever works best.

• (5 pts) Train the Neural Network from HW4 and K-NN classifiers on this dataset and show decision boundaries. (You can use library implementations for these classifiers.)

• (5 pts) What are your conclusions from this exercise?
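Generating and splitting this dataset can be sketched as below; the `noise` and `factor` values are illustrative choices, not prescribed by the assignment:

```python
import numpy as np
from sklearn.datasets import make_circles

# noise controls Gaussian jitter; factor is the inner/outer radius ratio.
X, y = make_circles(n_samples=1500, noise=0.1, factor=0.5, random_state=0)

# Shuffle, then split 1000 / 250 / 250 as in Dataset-1.
rng = np.random.default_rng(0)
idx = rng.permutation(1500)
X, y = X[idx], y[idx]
X_train, y_train = X[:1000], y[:1000]
X_val, y_val = X[1000:1250], y[1000:1250]
X_test, y_test = X[1250:], y[1250:]
```

This dataset is not linearly separable, which is what makes the kernel and neural-network comparisons here informative.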

2.3 Evaluation on Real Dataset (20 pts)

Let's put all this to some real use. For this problem, use the Wisconsin Breast Cancer dataset. You can download it via the sklearn library (sklearn.datasets.load_breast_cancer).

• (10 pts) Do all the points of Section 2.2.2 on this dataset. Since this is high-dimensional data, you don't have to show the decision boundaries; just report test accuracies for these classifiers and discuss your findings.

• (10 pts) In addition, you also want to figure out the important features that determine the class. Which regularization will you use for this? Upgrade your SVM and Kernel SVM implementations to include this regularization. Discuss the important features that you obtain by running your regularized SVM on this dataset. (You might have to normalize this dataset before training any classifier.)
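For the normalization note above, a common choice is to z-score each feature using statistics computed on the training split only, so that no test information leaks into preprocessing. A sketch (the function name and split names are placeholders):

```python
import numpy as np

def standardize(X_train, *rest):
    """Z-score features using training-set statistics only.
    Returns the standardized training split followed by the other splits."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0) + 1e-12  # guard against constant features
    return tuple((X - mean) / std for X in (X_train, *rest))

# Example: X_tr, X_val, X_te = standardize(X_tr, X_val, X_te)
```

Standardizing also makes the magnitudes of the learned weights comparable across features, which helps when reading off feature importance from the regularized SVM.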
