# #StackBounty: #regression #mathematical-statistics #multivariate-analysis #least-squares #covariance Robust Covariance in Multivariate …

### Bounty: 50

Assume we are in the OLS setting with $$y = X\beta + \epsilon$$, where $$y$$ is a response vector and $$X$$ is a covariate matrix. We can form two types of covariance estimates:

The homoskedastic covariance
$$cov(\hat{\beta}) = \hat{\sigma}^2 (X'X)^{-1}, \qquad \hat{\sigma}^2 = \frac{e'e}{n - p},$$
and the robust (sandwich) covariance
$$cov(\hat{\beta}) = (X'X)^{-1} X' \operatorname{diag}(e^2) X (X'X)^{-1}.$$
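To make the vector-response case concrete, here is a quick NumPy sketch of both estimators on simulated data (the robust version is plain HC0, with no small-sample correction; the simulated design and coefficients are my own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
e = y - X @ beta_hat                      # residual vector

# Homoskedastic: sigma^2 (X'X)^{-1}, with sigma^2 = e'e / (n - p)
cov_homo = (e @ e / (n - p)) * XtX_inv

# Robust (HC0): (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
meat = X.T @ np.diag(e**2) @ X
cov_robust = XtX_inv @ meat @ XtX_inv

print(np.sqrt(np.diag(cov_homo)))         # homoskedastic standard errors
print(np.sqrt(np.diag(cov_robust)))       # robust standard errors
```

With homoskedastic simulated errors the two sets of standard errors should be close; they diverge when the error variance depends on $$X$$.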

I'm looking for help deriving these covariances when $$Y$$ is a response matrix and $$E$$ is a residual matrix. There is a fairly detailed derivation on slide 49 here, but I think some steps are missing.

For the homoskedastic case, each column of $$E$$ is assumed to have a covariance structure of $$\sigma_{kk} I$$, which is the usual structure for a single vector response. Each row of $$E$$ is also assumed to be i.i.d. with covariance $$\Sigma$$.

The derivation starts by collapsing the $$Y$$ and $$E$$ matrices back into vectors, stacking the rows of $$E$$ (i.e., taking $$vec(E^\top)$$). In this structure $$Var(vec(E)) = I \otimes \Sigma$$.
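A tiny NumPy sketch of this structure (the dimensions $$n = 3$$, $$q = 2$$ and the particular $$\Sigma$$ are my own choices, and `vec` here stacks the rows of $$E$$):

```python
import numpy as np

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])    # q = 2 responses, cross-response covariance
n = 3                             # 3 observations

# Var(vec(E)) when rows of E are i.i.d. with covariance Sigma:
# block diagonal with Sigma repeated n times
V = np.kron(np.eye(n), Sigma)
print(V)
```

Each $$q \times q$$ diagonal block is $$\Sigma$$, and all off-diagonal blocks are zero because distinct rows (observations) are independent.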

First question: I understand that the Kronecker product produces a block-diagonal matrix with $$\Sigma$$ on the block diagonal, but where did $$\sigma_{kk}$$ go? Is it intentional that the $$\sigma_{kk}$$ values are pooled together so that the covariance is constant along the diagonal, similar to the vector-response case?

Using $$I \otimes \Sigma$$, the author gives a derivation of $$cov(\hat{\beta})$$ on slide 66:

$$\begin{align} cov(\hat{\beta}) &= ((X'X)^{-1} X' \otimes I)\, (I \otimes \Sigma)\, (X (X'X)^{-1} \otimes I) \\ &= (X'X)^{-1} \otimes \Sigma \end{align}$$

The first line looks like a standard sandwich estimator. The second line is an elegant reduction thanks to the identity matrices and the mixed-product property of the Kronecker product.
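The reduction on the second line can be verified numerically; this sketch checks the two sides agree on random data (the dimensions and $$\Sigma$$ are my own choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 10, 3, 2
X = rng.normal(size=(n, p))
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

XtX_inv = np.linalg.inv(X.T @ X)

# Sandwich form: ((X'X)^{-1} X' kron I) (I kron Sigma) (X (X'X)^{-1} kron I)
bread_L = np.kron(XtX_inv @ X.T, np.eye(q))
meat = np.kron(np.eye(n), Sigma)
bread_R = np.kron(X @ XtX_inv, np.eye(q))
lhs = bread_L @ meat @ bread_R

# Reduced form: (X'X)^{-1} kron Sigma, via the mixed-product property
rhs = np.kron(XtX_inv, Sigma)

print(np.allclose(lhs, rhs))  # True: the mixed-product property holds
```

The key identity is $$(A \otimes B)(C \otimes D) = AC \otimes BD$$, applied twice: the identity factors absorb $$\Sigma$$ and then cancel.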

Second question: What is the extension for robust covariances?
I imagine we need to revisit the meat of the sandwich estimator, $$(I \otimes \Sigma)$$, which comes from the homoskedasticity assumption for each response in the $$Y$$ matrix. With robust covariances, we should instead say that each column of $$E$$ has variance $$\operatorname{diag}(e_k^2)$$. We can retain the second assumption that the rows of $$E$$ are i.i.d. Since the columns of $$E$$ no longer follow the pattern $$\text{scalar} \times I$$, I don't believe $$Var(vec(E))$$ factors into a Kronecker product as it did before. Perhaps $$Var(vec(E))$$ is some diagonal matrix, $$D$$?
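To illustrate the hypothesized $$D$$ (under the asker's assumption that all residuals are uncorrelated, so $$D$$ is diagonal with each squared residual; the ordering below matches stacking the rows of $$E$$):

```python
import numpy as np

rng = np.random.default_rng(2)
n, q = 4, 2
E = rng.normal(size=(n, q))          # residual matrix

# Hypothesized robust meat: diagonal matrix of squared residuals,
# ordered to match vec(E^T) (NumPy's ravel is row-major, i.e., stacks rows)
D = np.diag((E**2).ravel())

# Unlike I kron Sigma, the q x q diagonal blocks differ across observations,
# so D is not of the form I kron M for any single matrix M
block0 = D[:q, :q]
block1 = D[q:2*q, q:2*q]
print(np.allclose(block0, block1))   # False for generic residuals
```

This makes the obstruction concrete: because the diagonal blocks vary with the observation, $$D$$ has no $$I \otimes M$$ factorization, which is exactly what blocked the mixed-product reduction below.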

Revisiting the sandwich-like estimator, is the extension for robust covariance

$$\begin{align} cov(\hat{\beta}) &= ((X'X)^{-1} X' \otimes I)\, D\, (X (X'X)^{-1} \otimes I) \\ &= \;? \end{align}$$

This product doesn't seem to reduce; we cannot invoke the mixed-product property because $$D$$ is not a scalar multiple of $$I$$, nor a Kronecker product itself.

The first question is connected to this second question. In the homoskedastic case, $$\sigma_{kk}$$ disappeared, allowing $$Var(vec(E))$$ to take the form $$I \otimes \Sigma$$. But if the diagonal of $$Var(vec(E))$$ were not constant, it would have the same structure as the robust case ($$Var(vec(E))$$ is some diagonal matrix $$D$$). So what allowed $$\sigma_{kk}$$ to disappear, and is there a similar trick for the robust case that would allow the $$D$$ matrix to factor?