# StackBounty: #pca #kernel-trick Kernel function for use in Kernel-PCA given a known piecewise-linear true data generating process

Bounty: 50

If I know that a multivariate dataset has a piecewise-linear data generating process with known knots (or breakpoints), then what is the appropriate kernel function to use in Kernel-PCA?

For example, given $n = 1, \dots, N$ observations and $j = 1, \dots, J$ variables, assume the true data generating process:
\begin{equation}
X_{n,j} = \alpha_{1,j} F_n \, I_{F_n \leq 0} + \alpha_{2,j} F_n \, I_{F_n > 0} + e_{n,j}
\end{equation}

where $I$ is an indicator function. That is, the true data generating process is piecewise linear in a single factor $F_n$, with a knot at $0$ and slopes $\alpha_{1,j}, \alpha_{2,j}$. Assuming I want to use Kernel-PCA, is there a kernel function that is known to be most efficient for this structure?
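
For concreteness, here is a minimal simulation of this DGP (the sample size, slopes, and noise scale are arbitrary illustrative choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
N, J = 500, 4                       # observations and variables (arbitrary)

F = rng.normal(size=N)              # single latent factor
alpha1 = rng.normal(size=J)         # slopes on the F <= 0 branch
alpha2 = rng.normal(size=J)         # slopes on the F > 0 branch
e = 0.1 * rng.normal(size=(N, J))   # idiosyncratic noise

# F * I(F <= 0) = min(F, 0) and F * I(F > 0) = max(F, 0), so:
X = np.outer(np.minimum(F, 0), alpha1) + np.outer(np.maximum(F, 0), alpha2) + e
```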

My guess is that the answer is probably related to the standard hinge function used in piecewise-linear regression, i.e. $f(x) = \max(x, 0)$, and so the appropriate kernel function for Kernel-PCA might be some combination of the inner products of the centred variables and their hinge functions. But how would I derive this?
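
To make that guess concrete, here is a sketch of the construction I have in mind (my own attempt, not an established result): apply the two hinge maps $\max(x, 0)$ and $\min(x, 0)$ elementwise to the centred variables, take the kernel to be the inner product of the stacked hinge features, and pass the resulting Gram matrix to Kernel-PCA through scikit-learn's precomputed-kernel interface.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

def hinge_kernel(X):
    """Gram matrix k(x, y) = <phi(x), phi(y)> for the feature map
    phi(x) = (max(x, 0), min(x, 0)) applied elementwise to centred data.
    This is the guessed kernel from the question, not a derived optimum."""
    Xc = X - X.mean(axis=0)              # centre each variable
    Phi = np.hstack([np.maximum(Xc, 0),  # hinge features, knot at 0
                     np.minimum(Xc, 0)])
    return Phi @ Phi.T

K = hinge_kernel(X)  # X as simulated above
scores = KernelPCA(n_components=2, kernel="precomputed").fit_transform(K)
```

If I understand the kernel trick correctly, because this feature map is explicit and finite-dimensional, Kernel-PCA with this kernel is equivalent to ordinary PCA on the stacked hinge features; part of my question is whether that equivalence already makes it the "right" kernel given the known knot.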

More generally, is there a standard methodology for converting a known true data generating process into the appropriate corresponding kernel function?
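
Stated explicitly so it can be corrected, the recipe I am hoping exists would run roughly as follows: if the DGP is linear in known basis functions $h_1, \dots, h_M$ (here $h_1(x) = \max(x, 0)$ and $h_2(x) = \min(x, 0)$, shifted to any known knot), apply the same basis elementwise to the observed variables and use the induced inner-product kernel
\begin{equation}
k(x, y) = \sum_{j=1}^{J} \sum_{m=1}^{M} h_m(x_j) \, h_m(y_j) = \langle \phi(x), \phi(y) \rangle,
\end{equation}
which is exactly the Gram matrix computed in the sketch above. Whether this is a standard way to encode a known DGP into a kernel, and whether it is efficient in any formal sense, is what I am asking.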

Any pointers to good reading on this material would also be greatly appreciated.
