# #StackBounty: #machine-learning #covariance-matrix How to factorise a covariance matrix with one observation per row-column combination?

### Bounty: 500

I have $$N$$ correlated random variables. I assume that these random variables are given by the following expression:

$$\tilde{x}_i = \alpha_i + \beta_i \cdot \tilde{m} + \gamma_i \cdot \tilde{\varepsilon_i},$$

where $$\tilde{m}$$ is a "global" random variable and $$\tilde{\varepsilon_i}$$ are "variable-specific" random variables (as can be seen from the absence and presence of the index $$i$$, respectively). Both $$\tilde{m}$$ and $$\tilde{\varepsilon_i}$$ are assumed to have mean zero and standard deviation one. The $$\tilde{\varepsilon_i}$$ are also assumed to be independent. As a consequence, the covariance matrix should be given by the following expression:

$$C_{ij} = \beta_i \cdot \beta_j + \delta_{ij} \cdot \gamma_i \cdot \gamma_j,$$

where $$\delta_{ij}$$ is the Kronecker delta.
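As a sanity check of this formula, here is a minimal simulation sketch. It assumes, in addition to what is stated above, that $$\tilde{m}$$ is independent of every $$\tilde{\varepsilon_i}$$ and that both are Gaussian (the Gaussian choice is only for convenience). The helper names `model_covariance` and `simulate` and the parameter values are made up for illustration.

```python
import numpy as np


def model_covariance(beta, gamma):
    """Covariance implied by the model: beta_i * beta_j + delta_ij * gamma_i * gamma_j."""
    beta = np.asarray(beta, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    return np.outer(beta, beta) + np.diag(gamma ** 2)


def simulate(alpha, beta, gamma, n_samples, rng):
    """Draw samples of x_i = alpha_i + beta_i * m + gamma_i * eps_i."""
    m = rng.standard_normal((n_samples, 1))            # one global draw per sample
    eps = rng.standard_normal((n_samples, len(beta)))  # independent noise per variable
    return alpha + beta * m + gamma * eps


rng = np.random.default_rng(0)
alpha = np.array([0.1, -0.2, 0.3])   # illustrative values, not from the question
beta = np.array([0.5, 1.0, -0.8])
gamma = np.array([1.0, 0.5, 1.5])

x = simulate(alpha, beta, gamma, 200_000, rng)
sample_cov = np.cov(x, rowvar=False)  # columns are variables
max_abs_error = np.max(np.abs(sample_cov - model_covariance(beta, gamma)))
print(max_abs_error)
```

With this many samples the largest entry-wise difference between the sample covariance and the model covariance should be small, and it shrinks further as `n_samples` grows.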

Now suppose that each random variable comes with a single number (a feature $$f_i$$) that determines the values of $$\alpha_i$$, $$\beta_i$$ and $$\gamma_i$$:

$$\alpha_i = \alpha(f_i),$$

$$\beta_i = \beta(f_i),$$

$$\gamma_i = \gamma(f_i),$$

where $$\alpha$$, $$\beta$$ and $$\gamma$$ are some "universal" functions (the same for all $$N$$ random variables).

Using the available observations of $$x_i$$, I can calculate the covariance matrix $$C_{ij}$$ and try to find functions $$\beta$$ and $$\gamma$$ that approximate it well:

$$C_{ij} = C(f_i, f_j) = \beta(f_i) \cdot \beta(f_j) + \delta_{ij} \cdot \gamma(f_i) \cdot \gamma(f_j).$$

So far, no problems. The problem is that neither the features $$f_i$$ nor the number of random variables is constant over time.

For example, at the first time step I might have 3 random variables with the following feature values: $$f_1 = 1.3, f_2 = 4.5, f_3 = 0.3$$, together with the corresponding observations of the random variables: $$x_1 = 1.0, x_2 = -0.5, x_3 = 4.0$$. At the second step I might have 5 random variables, with 5 new feature values $$f_i$$ and 5 new observations $$x_i$$. How can I find the functions $$\beta(f)$$ and $$\gamma(f)$$ in this case? Or, in other words, suppose I have one candidate pair of functions ($$\beta_1(f)$$, $$\gamma_1(f)$$) and another pair ($$\beta_2(f)$$, $$\gamma_2(f)$$). How can I determine which pair of functions approximates my data set better?
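To make "approximates better" concrete, one possible criterion (an assumption on my part, not the only option) is the total Gaussian log-likelihood: at each time step the observations are scored against a covariance matrix built from that step's own features, so the number of variables can change freely between steps. The sketch below assumes the variables are jointly Gaussian and that $$\alpha(f)$$ is also given; the function names and the synthetic candidate pairs are hypothetical.

```python
import numpy as np


def step_log_likelihood(x, f, alpha, beta, gamma):
    """Gaussian log-likelihood of one time step with features f and observations x."""
    x = np.asarray(x, dtype=float)
    f = np.asarray(f, dtype=float)
    b, g = beta(f), gamma(f)
    cov = np.outer(b, b) + np.diag(g ** 2)       # C(f_i, f_j) for this step only
    resid = x - alpha(f)
    _, logdet = np.linalg.slogdet(cov)
    quad = resid @ np.linalg.solve(cov, resid)
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + quad)


def total_log_likelihood(steps, alpha, beta, gamma):
    """Sum over time steps; each step is a (features, observations) pair."""
    return sum(step_log_likelihood(x, f, alpha, beta, gamma) for f, x in steps)


# Synthetic data from a known pair, purely to illustrate the comparison.
zero_alpha = lambda f: np.zeros_like(f)
true_beta = lambda f: 0.5 * f
true_gamma = lambda f: 1.0 + 0.1 * f
other_beta = lambda f: 0.1 * np.ones_like(f)
other_gamma = lambda f: np.ones_like(f)

rng = np.random.default_rng(0)
steps = []
for _ in range(500):
    n = rng.integers(3, 7)                       # 3 to 6 variables at this step
    f = rng.uniform(0.0, 5.0, size=n)
    x = true_beta(f) * rng.standard_normal() + true_gamma(f) * rng.standard_normal(n)
    steps.append((f, x))

ll_true = total_log_likelihood(steps, zero_alpha, true_beta, true_gamma)
ll_other = total_log_likelihood(steps, zero_alpha, other_beta, other_gamma)
print(ll_true > ll_other)
```

The pair with the higher total log-likelihood fits the data better under this criterion; scoring on time steps that were not used for fitting gives a fairer comparison. Note that the covariance is unchanged when $$\beta$$ is replaced by $$-\beta$$, so the sign of $$\beta$$ cannot be recovered from the covariance alone.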
