*Bounty: 50*

I have a problem, which I have attached as an image.

**What I understand**

The *error function* is given by: `e(y, ŷ) = 0` if `y*a*(x - b) >= 1`, and `e(y, ŷ) = 1 - y*a*(x - b)` if `y*a*(x - b) < 1`.
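As a sanity check on my reading of the definition, here is a minimal Python sketch (the function name `error` is just my own helper, not from the problem):

```python
def error(y, x, a, b):
    """Error as I understand it: 0 if y*a*(x - b) >= 1, else 1 - y*a*(x - b)."""
    margin = y * a * (x - b)
    return 0.0 if margin >= 1 else 1.0 - margin

# Point 1 from the dataset, x = 1.2, y = -1, at (a, b) = (1, 3):
# margin = (-1)*(1)*(1.2 - 3) = +1.8 >= 1, so the error is 0.
print(error(-1, 1.2, 1, 3))  # -> 0.0
```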

At the current gradient-descent iteration t, the point is `(a, b) = (1, 3)`.

By definition, the gradient of `Ein(a, b)` should be the vector of partial derivatives of `Ein(a, b)` with respect to `a` and `b` (the weight vector `w` corresponds to `(a, b)`, as I understand it).

**My doubt**

*Please note that when I say "sum over N points", I mean the summation symbol Σ taken over the N points.*

- I am not sure how to calculate the partial derivative in this case. When we are told to fix `a` and vary `b`, does that mean differentiating only with respect to `b`? That would give `gradient(Ein) = 1/N * (sum over N points of y*a)`, but this removes the dependence on `x`, and that is why I doubt my approach.
- The final expression whose derivative needs to be taken is `1/N * (sum over the N misclassified points of 1 - y*a*(x - b))`. But as per the dataset, no point is misclassified, since with `(a, b) = (1, 3)` the error function is 0 for every point. For point 1, where `x = 1.2` and `y = -1`, we have `y*a*(x - b) = (-1)*(1)*(1.2 - 3) = +1.8 >= 1`, which means `e(y_1, h(x_1)) = 0`.
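To check my own hand-computed partials, here is a small Python sketch (the helpers `error` and `grad` are my own names, and this assumes my reading of the error function). For a misclassified point (`y*a*(x - b) < 1`) the partials work out to `de/da = -y*(x - b)` and `de/db = y*a`; I compare them against a central finite-difference estimate:

```python
def error(y, x, a, b):
    """0 if y*a*(x - b) >= 1, else 1 - y*a*(x - b)."""
    margin = y * a * (x - b)
    return 0.0 if margin >= 1 else 1.0 - margin

def grad(y, x, a, b):
    """Per-point gradient (de/da, de/db) of the error above."""
    if y * a * (x - b) >= 1:
        return (0.0, 0.0)            # correctly classified: zero gradient
    return (-y * (x - b), y * a)     # note de/db has no x in it

# Finite-difference check on a misclassified point, e.g. y = +1, x = 1.2, (a, b) = (1, 3):
h = 1e-6
y, x, a, b = 1, 1.2, 1, 3
num_da = (error(y, x, a + h, b) - error(y, x, a - h, b)) / (2 * h)
num_db = (error(y, x, a, b + h) - error(y, x, a, b - h)) / (2 * h)
print(grad(y, x, a, b), (num_da, num_db))  # the two pairs should agree (~ (1.8, 1))
```

This at least confirms that the `b`-partial really does lose its dependence on `x`, while the `a`-partial keeps it.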

So what should be the answer to this question?

I hope my doubt about the question is clear to the audience, but in any case please let me know if there is anything I have failed to explain.

I would really appreciate it if any of you could help me with this. I have read many sources but am still not sure how to approach this problem.

Thanks & Regards,

Akshit Bhatia