#StackBounty: #regression #mathematical-statistics #multivariate-analysis #least-squares #covariance Robust Covariance in Multivariate …

Bounty: 50

Assume we are in the OLS setting with $y = X\beta + \epsilon$. When $y$ is a response vector and $X$ is the covariate matrix, we can form two types of covariance estimates:

The homoskedastic covariance
$\operatorname{cov}(\hat{\beta}) = (X'X)^{-1}\,\hat{\sigma}^2$ with $\hat{\sigma}^2 = e'e/(n-p)$, and the robust (sandwich) covariance
$\operatorname{cov}(\hat{\beta}) = (X'X)^{-1} X' \operatorname{diag}(e^2)\, X (X'X)^{-1}$, where $e$ is the residual vector.
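As a point of reference, here is a minimal numpy sketch of the two vector-response estimators on made-up data; it simply transcribes the formulas above and is not any particular package's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3                              # made-up sample size and number of covariates

X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.normal(size=n) * (1.0 + np.abs(X[:, 0]))   # heteroskedastic noise

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
e = y - X @ beta_hat                       # residual vector

# Homoskedastic covariance: sigma^2 (X'X)^{-1}, with sigma^2 = e'e / (n - p)
cov_homo = XtX_inv * (e @ e) / (n - p)

# Robust (HC0) sandwich: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
cov_robust = XtX_inv @ (X.T * e**2) @ X @ XtX_inv

print(np.diag(cov_homo), np.diag(cov_robust))
```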

I’m looking for help on how to derive these covariances when $Y$ is a response matrix, and $E$ is a residual matrix. There is a fairly detailed derivation on slide 49 here, but I think there are some steps missing.

For the homoskedastic case, each column of $E$ is assumed to have a covariance structure of $\sigma_{kk} I$, which is the usual structure for a single vector response. Each row of $E$ is also assumed to be i.i.d. with covariance $\Sigma$.

The derivation starts by collapsing the $Y$ and $E$ matrices back into vectors. In this structure, $\operatorname{Var}(\operatorname{vec}(E)) = \Sigma \otimes I$.

First question: I understand the Kronecker product produces a block-diagonal matrix with $\Sigma$ on the block diagonal, but where did $\sigma_{kk}$ go? Is it intentional that the $\sigma_{kk}$ values are pooled together so that the covariance is constant along the diagonal, similar to the vector response case?

Using $\Sigma \otimes I$, the author gives a derivation of $\operatorname{cov}(\hat{\beta})$ on slide 66:

\begin{align}
\operatorname{cov}(\hat{\beta}) &= \left((X'X)^{-1} X' \otimes I\right)\left(I \otimes \Sigma\right)\left(X (X'X)^{-1} \otimes I\right) \\
&= (X'X)^{-1} \otimes \Sigma.
\end{align}

The first line looks like a standard sandwich estimator. The second line is an elegant reduction that follows from the identity matrices and the mixed-product property of the Kronecker product.
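As a sanity check on that reduction, a small numpy sketch with made-up dimensions and an arbitrary $\Sigma$ confirms numerically that the sandwich collapses to $(X'X)^{-1} \otimes \Sigma$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 8, 3, 2                                # observations, covariates, responses (made-up sizes)

X = rng.normal(size=(n, p))
A = rng.normal(size=(q, q))
Sigma = A @ A.T                                  # arbitrary q x q positive-definite covariance

XtX_inv = np.linalg.inv(X.T @ X)
bread_left = np.kron(XtX_inv @ X.T, np.eye(q))   # ((X'X)^{-1} X') ⊗ I
meat = np.kron(np.eye(n), Sigma)                 # I ⊗ Σ  (rows of E i.i.d. with covariance Σ)
bread_right = np.kron(X @ XtX_inv, np.eye(q))    # (X (X'X)^{-1}) ⊗ I

sandwich = bread_left @ meat @ bread_right
reduced = np.kron(XtX_inv, Sigma)                # (X'X)^{-1} ⊗ Σ

print(np.allclose(sandwich, reduced))            # True: the mixed-product property applies
```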

Second question: What is the extension for robust covariances?
I imagine we need to revisit the meat of the sandwich estimator, $(I \otimes \Sigma)$, which comes from the homoskedastic assumption for each response in the $Y$ matrix. If we use robust covariances, we should instead say that each column of $E$ has covariance $\operatorname{diag}(e_k^2)$. We can retain the second assumption, that the rows of $E$ are independent. Since the different columns of $E$ no longer have covariance of the form $\text{scalar} \times I$, I don't believe $\operatorname{Var}(\operatorname{vec}(E))$ factors into a Kronecker product as it did before. Perhaps $\operatorname{Var}(\operatorname{vec}(E))$ is some diagonal matrix $D$?

Revisiting the sandwich-like estimator, is the extension for robust covariance

\begin{align}
\operatorname{cov}(\hat{\beta}) &= \left((X'X)^{-1} X' \otimes I\right) D \left(X (X'X)^{-1} \otimes I\right) \\
&= \;?
\end{align}

This product doesn't seem to reduce; we cannot invoke the mixed-product property because $D$ does not factor as a Kronecker product the way $I \otimes \Sigma$ did.

The first question is connected to this second question. In the homoskedastic case, $\sigma_{kk}$ seemed to disappear, allowing $\operatorname{Var}(\operatorname{vec}(E))$ to take the form $\Sigma \otimes I$. But if the diagonal of $\operatorname{Var}(\operatorname{vec}(E))$ were not constant, it would have the same structure as the robust case ($\operatorname{Var}(\operatorname{vec}(E))$ is some diagonal matrix $D$). So what allowed $\sigma_{kk}$ to disappear, and is there a similar trick in the robust case that would let the $D$ matrix factor?
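To make the obstacle concrete, here is a small numpy sketch on made-up data that builds the proposed diagonal $D$ from squared residuals (stacked observation by observation, so the ordering matches the $\otimes I$ bread above), evaluates the sandwich, and compares it to a single-Kronecker candidate built from the pooled residual covariance; in general the two do not agree, which is exactly the non-factorization described above:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 50, 3, 2                          # made-up sizes

X = rng.normal(size=(n, p))
B = rng.normal(size=(p, q))
# heteroskedastic errors: scale grows with the first covariate
E_true = rng.normal(size=(n, q)) * (1.0 + np.abs(X[:, [0]]))
Y = X @ B + E_true

XtX_inv = np.linalg.inv(X.T @ X)
E = Y - X @ (XtX_inv @ X.T @ Y)             # residual matrix, n x q

bread_left = np.kron(XtX_inv @ X.T, np.eye(q))
bread_right = np.kron(X @ XtX_inv, np.eye(q))

# The proposed diagonal D: squared residuals, stacked row by row (observation-major)
D = np.diag((E ** 2).ravel())

robust = bread_left @ D @ bread_right       # pq x pq robust covariance of vec(beta-hat)

# A candidate single-Kronecker form using the pooled residual covariance
Sigma_hat = (E.T @ E) / n
candidate = np.kron(XtX_inv, Sigma_hat)

print(np.allclose(robust, candidate))       # generally False: the sandwich does not factor
```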

Thank you for your help.



#StackBounty: #least-squares #measurement-error Measurement error in one indep variable in OLS with multiple regression

Bounty: 50

Suppose I regress (with OLS) $y$ on $x_1$ and $x_2$. Suppose I have an i.i.d. sample of size $n$, and that $x_1$ is observed with error while $y$ and $x_2$ are observed without error. What is the probability limit of the estimated coefficient on $x_1$?

Let us suppose for tractability that the measurement error in $x_1$ is "classical". That is, the measurement error is normally distributed with mean 0 and is uncorrelated with $x_2$ and with the error term.
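A quick Monte Carlo sketch (all parameter values are made up) can be used to check any candidate expression for the probability limit; the comparison at the end uses the usual classical errors-in-variables attenuation factor, computed from the variance of $x_1$ that remains after partialling out $x_2$, which is what the standard argument suggests applies here:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000                                  # large n to approximate the probability limit
beta1, beta2 = 1.5, -0.7                     # made-up true coefficients

x2 = rng.normal(size=n)
x1_true = 0.6 * x2 + rng.normal(size=n)      # true x1, correlated with x2
u = rng.normal(scale=0.8, size=n)            # classical measurement error on x1
x1_obs = x1_true + u
y = beta1 * x1_true + beta2 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1_obs, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
print("OLS estimate of beta1:", b[1])

# Candidate plim: beta1 * var(x1|x2) / (var(x1|x2) + var(u)),
# where var(x1|x2) is the residual variance of true x1 after projecting on x2.
slope = np.cov(x1_true, x2)[0, 1] / np.var(x2)
resid = x1_true - slope * x2
attenuation = np.var(resid) / (np.var(resid) + 0.8 ** 2)
print("Candidate plim:       ", beta1 * attenuation)
```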




#StackBounty: #regression #multiple-regression #least-squares #mse What's the MSE of $hat{Y}$ in ordinary least squares?

Bounty: 100

Suppose I have the following model: $$Y = \mu + \epsilon = X\beta + \epsilon,$$ where $Y$ is $n \times 1$, $X$ is $n \times p$, $\beta$ is $p \times 1$, and $\epsilon$ is $n \times 1$. I assume that $\epsilon$ has independent entries with mean $0$ and covariance matrix $\sigma^2 I$.

In OLS, the fitted values are $\hat{Y} = HY$, where $H = X(X^TX)^{-1}X^T$ is the $n \times n$ hat matrix. I want to find the MSE of $\hat{Y}$.

By the bias-variance decomposition, I know that

\begin{align}
MSE(\hat{Y}) &= \operatorname{bias}^2(\hat{Y}) + \operatorname{var}(\hat{Y}) \\
&= (E[HY] - \mu)^T(E[HY] - \mu) + \operatorname{var}(HY) \\
&= (H\mu - \mu)^T(H\mu - \mu) + \sigma^2 H \\
&= 0 + \sigma^2 H
\end{align}

I'm confused by the dimensions in the last step. The $\operatorname{bias}^2$ term is a scalar, but $\operatorname{var}(\hat{Y})$ is an $n \times n$ matrix. How can one add a scalar to an $n \times n$ matrix when $n \neq 1$?
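For what it's worth, a small numpy sketch on made-up data of the objects appearing in this decomposition may help locate where the scalar and matrix versions live: the bias vector is $0$, $\operatorname{var}(\hat{Y}) = \sigma^2 H$ is $n \times n$, and the scalar expected squared error $E\lVert\hat{Y}-\mu\rVert^2$ is its trace, $\sigma^2\operatorname{tr}(H) = \sigma^2 p$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma2 = 20, 4, 2.0                   # made-up sizes and noise variance

X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
mu = X @ beta

H = X @ np.linalg.inv(X.T @ X) @ X.T        # n x n hat matrix

bias = H @ mu - mu                          # E[H Y] - mu = H mu - mu
var_Yhat = sigma2 * H                       # var(H Y) = sigma^2 H

print(np.allclose(bias, 0))                 # True: the fitted values are unbiased for mu
print(var_Yhat.shape)                       # (n, n): the variance is a matrix
# The scalar E||Y_hat - mu||^2 is the trace of the variance matrix:
print(np.trace(var_Yhat), sigma2 * p)       # both equal sigma^2 * p
```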


