#StackBounty: #pca #kernel-trick Kernel function for use in Kernel-PCA given a known piecewise linear true data generating process

Bounty: 50

If I know that a multivariate dataset has a piecewise-linear data generating process with known knots (breakpoints), what is the appropriate kernel function to use in Kernel-PCA?

For example, given $n = 1, …, N$ observations and $j = 1, …, J$ variables, assume the true data generating process:
\begin{equation}
X_{n,j} = \alpha_{1,j} F_{n} I_{F_n \leq 0} + \alpha_{2,j} F_{n} I_{F_n > 0} + e_{n,j}
\end{equation}

where $I$ is an indicator function. That is, the true data generating process is piecewise linear in a single factor $F_n$, with a knot at $0$ and segment slopes $\alpha_{1,j}, \alpha_{2,j}$. If I want to use Kernel-PCA on such data, is there a kernel function known to be most efficient for this process?
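To make the setup concrete, here is a small simulation of this data generating process (sizes, slope distributions, and noise scale are arbitrary choices of mine, not part of the question):

```python
import numpy as np

rng = np.random.default_rng(0)
N, J = 500, 4                      # illustrative numbers of observations and variables
F = rng.normal(size=N)             # latent factor F_n
alpha1 = rng.normal(size=J)        # slopes alpha_{1,j} for F_n <= 0
alpha2 = rng.normal(size=J)        # slopes alpha_{2,j} for F_n > 0
E = 0.1 * rng.normal(size=(N, J))  # noise e_{n,j}

# X_{n,j} = alpha_{1,j} F_n I[F_n <= 0] + alpha_{2,j} F_n I[F_n > 0] + e_{n,j}
X = np.where(F[:, None] <= 0, F[:, None] * alpha1, F[:, None] * alpha2) + E
```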

My guess is that the answer is related to the standard hinge function from piecewise-linear regression, $f(x) = \max(x, 0)$, so the appropriate kernel function for Kernel-PCA might be some combination of inner products of the centred variables and their hinge transforms. But how would I derive this?
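To make this guess concrete, here is a sketch: an explicit hinge feature map applied per variable, the kernel it induces (positive semi-definite by construction, since it is a Gram matrix of explicit features), and a standard kernel-PCA projection. The function names and the double-centring convention are my own, not from any reference:

```python
import numpy as np

def hinge_features(X):
    # Explicit feature map: the two hinge functions max(x, 0) and max(-x, 0)
    # for each variable, so the map is linear on each side of the knot at 0.
    return np.hstack([np.maximum(X, 0.0), np.maximum(-X, 0.0)])

def hinge_kernel(X, Y):
    # k(x, y) = <phi(x), phi(y)> with the hinge feature map above.
    return hinge_features(X) @ hinge_features(Y).T

def kernel_pca_scores(K, n_components=1):
    # Standard kernel PCA on a precomputed Gram matrix: double-centre K,
    # eigendecompose, and return scores sqrt(lambda_i) * v_i.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))
```

The same Gram matrix could equally be passed to `sklearn.decomposition.KernelPCA(kernel="precomputed")`; the NumPy version is shown only to keep the derivation visible.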

More generally, is there a standard methodology for converting a known true data generating process into the appropriate corresponding kernel function?

Any pointers to good reading on this material would also be greatly appreciated.

