*Bounty: 100*

I’m studying the difference between regularization in RKHS regression and in ordinary linear (ridge) regression, but I have a hard time grasping the crucial difference between the two.

Given input-output pairs $(x_i,y_i)$, I want to estimate a function $f(\cdot)$ as follows

\begin{equation}f(x)\approx u(x)=\sum_{i=1}^n \alpha_i K(x,x_i),\end{equation}

where $K(\cdot,\cdot)$ is a kernel function. The coefficients $\alpha_i$ can be found either by solving

\begin{equation}
\min_{\alpha\in\mathbb{R}^{n}} \frac{1}{n}\|Y-K\alpha\|_{\mathbb{R}^{n}}^{2}+\lambda \alpha^{T}K\alpha,
\end{equation}

where, with some abuse of notation, the $(i,j)$’th entry of the kernel matrix $K$ is $K(x_{i},x_{j})$. This gives

\begin{equation}
\alpha^*=(K+\lambda nI)^{-1}Y.
\end{equation}
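As a sanity check, this closed form is easy to evaluate numerically. Below is a minimal sketch with a Gaussian (RBF) kernel; the data, bandwidth, and $\lambda$ are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
x = rng.uniform(-3, 3, size=n)
Y = np.sin(x) + 0.1 * rng.standard_normal(n)

# Gaussian kernel matrix, K[i, j] = K(x_i, x_j) (illustrative bandwidth = 1)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)

lam = 0.1
# KRR solution: alpha* = (K + lambda * n * I)^{-1} Y
alpha = np.linalg.solve(K + lam * n * np.eye(n), Y)

# Fitted values at the training points: u(x_i) = sum_j alpha_j K(x_i, x_j)
u = K @ alpha
```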
Alternatively, we could treat the problem as a normal ridge regression/linear regression problem:
\begin{equation}
\min_{\alpha\in\mathbb{R}^{n}} \frac{1}{n}\|Y-K\alpha\|_{\mathbb{R}^{n}}^{2}+\lambda \alpha^{T}\alpha,
\end{equation}
with solution
\begin{equation}
\alpha^*=(K^{T}K+\lambda nI)^{-1}K^{T}Y.
\end{equation}
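The two closed forms can also be compared side by side numerically. A small sketch (again with a toy Gaussian kernel; all data are made up) solving both linear systems, which generically produce different coefficient vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
x = rng.uniform(-3, 3, size=n)
Y = np.sin(x) + 0.1 * rng.standard_normal(n)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
lam = 0.1

# RKHS penalty lambda * alpha^T K alpha:
alpha_rkhs = np.linalg.solve(K + lam * n * np.eye(n), Y)

# Euclidean penalty lambda * alpha^T alpha (ordinary ridge on the columns of K):
alpha_ridge = np.linalg.solve(K.T @ K + lam * n * np.eye(n), K.T @ Y)

# The coefficient vectors, and hence the fitted functions, do not coincide
print(np.linalg.norm(alpha_rkhs - alpha_ridge))
```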

What would be the crucial difference between these two approaches and their solutions?