#StackBounty: #pca #eigenvalues #high-dimensional #random-matrix PCA: inference on the proportion of explained variance, in a large p s…

Bounty: 50

I am interested in doing inference on the proportion of total variance explained by the first principal component, for a PCA based on the correlation matrix R. I want to know the (asymptotic) distribution of

$$\lambda^R_1/\sum_i\lambda^R_i=\lambda^R_1/\operatorname{tr}(R)=\lambda^R_1/p$$

where $\lambda^R_i$ is the $i$th eigenvalue of $R$, the sample correlation matrix, and $p$ is the number of variables.
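To make the statistic concrete, here is a minimal sketch of how it could be computed with NumPy. The function name `first_pc_share`, the seed, and the sample sizes are my own illustrative choices, not part of any library.

```python
import numpy as np

def first_pc_share(X):
    """Share of total variance explained by the first PC of the correlation matrix.

    X is an (N, p) array: rows are observations, columns are variables.
    """
    R = np.corrcoef(X, rowvar=False)   # p x p sample correlation matrix
    eigvals = np.linalg.eigvalsh(R)    # eigenvalues in ascending order (R is symmetric)
    p = R.shape[0]
    return eigvals[-1] / p             # tr(R) = p, so dividing by p equals dividing by the sum

# Illustrative data only: N = 200 observations of p = 10 independent normals.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
share = first_pc_share(X)
print(f"lambda_1 / p = {share:.3f}")
```

Since the eigenvalues of $R$ average to 1, the statistic always lies between $1/p$ and 1.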

What is the distribution of this statistic, and are there methods available to form confidence intervals? I found surprisingly few references for this. I am particularly interested in a high-dimensional setup, where $p\to\infty$ and $N\to\infty$ but $p/N\to c$, not necessarily the classical case where $p$ is fixed and $p/N\to 0$. From random matrix theory and the Marchenko-Pastur law, we know that the first eigenvalue will be biased upwards, but I am still unclear how this is going to affect the distribution of $\lambda^R_1/p$ as $p\to\infty$.
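To illustrate the upward bias I mean, here is a rough Monte Carlo sketch, assuming i.i.d. standard normal data so that the population correlation matrix is the identity and every population eigenvalue is 1. The helper `simulate_top_share` and the values of `N`, `p`, and `reps` are hypothetical choices for illustration. As far as I understand, in this null case the largest sample eigenvalue should sit near the Marchenko-Pastur upper edge $(1+\sqrt{c})^2$ rather than near 1.

```python
import numpy as np

def simulate_top_share(N, p, reps, rng):
    """Monte Carlo draws of lambda_1 / p for i.i.d. standard normal data."""
    shares = np.empty(reps)
    for r in range(reps):
        X = rng.standard_normal((N, p))        # population correlation = identity
        R = np.corrcoef(X, rowvar=False)       # sample correlation matrix
        shares[r] = np.linalg.eigvalsh(R)[-1] / p
    return shares

rng = np.random.default_rng(1)                 # arbitrary seed
N, p, reps = 200, 100, 200                     # illustrative sizes, c = p / N = 0.5
shares = simulate_top_share(N, p, reps, rng)

c = p / N
mp_edge = (1 + np.sqrt(c)) ** 2                # Marchenko-Pastur upper edge for unit variances

print(f"mean lambda_1          : {shares.mean() * p:.3f}")
print(f"Marchenko-Pastur edge  : {mp_edge:.3f}")
print(f"population lambda_1    : 1.000")
print(f"mean lambda_1 / p      : {shares.mean():.4f}")
print(f"sd of lambda_1 / p     : {shares.std(ddof=1):.4f}")
```

This only covers the null case with no dominant component. What I am really after is the behaviour when the first population eigenvalue is large, which this sketch does not address.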

Thanks!

