# #StackBounty: #regression #sampling #error #consistency #identification Basic Questions about regression formula, sampling variability,…

### Bounty: 50

Let's say I run the simple regression $$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$. Assume $$\mathrm{cov}(\epsilon, x) = 0$$.

This yields the familiar formula, often written in terms of covariances, for the slope estimator:

$$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})\, y_i}{\sum_i (x_i - \bar{x})^2}$$

and then plugging in the assumed true DGP for $$y$$, we get:

$$\hat{\beta}_1 = \beta_1 + \frac{\sum_i (x_i - \bar{x})\, \epsilon_i}{\sum_i (x_i - \bar{x})^2}$$
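As a quick numerical sanity check (my own sketch, not from the post; all variable names and parameter values are made up), the decomposition above is an exact algebraic identity, not an approximation, and can be verified on simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta0, beta1 = 500, 1.0, 2.0

x = rng.normal(size=n)
eps = rng.normal(size=n)            # drawn independently of x, so cov(eps, x) = 0
y = beta0 + beta1 * x + eps

xc = x - x.mean()
beta1_hat = (xc * y).sum() / (xc ** 2).sum()            # OLS slope formula
decomposed = beta1 + (xc * eps).sum() / (xc ** 2).sum()  # beta1 + error term

print(beta1_hat, decomposed)        # identical up to floating point
```

The two quantities agree to machine precision because substituting $$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$ into the numerator kills the intercept term ($$\sum_i (x_i - \bar{x}) = 0$$) and leaves exactly $$\beta_1 \sum_i (x_i-\bar{x})^2 + \sum_i (x_i-\bar{x})\epsilon_i$$.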

With this, I have a few questions.

1. Is this now a statement not about the population, but about the particular draw of $$\epsilon_i$$'s we happened to get in this sample? That is, is the numerator of the second term (up to a factor of $$n$$) the *sample* covariance between $$\epsilon$$ and $$x$$? If so, can I think of each random sample as a given draw of $$\epsilon_i$$'s, with that draw driving the sampling variability of the estimator?
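To make the sampling-variability reading concrete, here is a small Monte Carlo sketch (my own illustration, with made-up parameters): hold one design $$x$$ fixed, redraw the $$\epsilon_i$$'s many times, and watch $$\hat{\beta}_1$$ vary around $$\beta_1$$ purely because of the error draws. Conditional on $$x$$, its standard deviation should match the textbook formula $$\sigma / \sqrt{\sum_i (x_i-\bar{x})^2}$$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta0, beta1, sigma = 200, 1.0, 2.0, 1.0

x = rng.normal(size=n)              # one fixed design, reused across samples
xc = x - x.mean()
sxx = (xc ** 2).sum()

draws = []
for _ in range(20_000):
    eps = rng.normal(scale=sigma, size=n)   # a fresh draw of the errors
    y = beta0 + beta1 * x + eps
    draws.append((xc * y).sum() / sxx)      # OLS slope for this draw

draws = np.array(draws)
print(draws.mean())                         # ~ beta1: unbiased given this x
print(draws.std(), sigma / np.sqrt(sxx))    # matches the conditional sd formula
```

Each pass through the loop is "one sample" in the sense of the question: $$x$$ never changes, so all of the spread in `draws` comes from the $$\epsilon_i$$'s.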

2. Taking probability limits, $$\mathrm{cov}(\epsilon, x) = 0$$ seems to be sufficient for consistency of the estimator. However, is zero covariance alone not sufficient for unbiasedness? Is mean independence of $$\epsilon$$ and $$x$$ (i.e. $$E[\epsilon \mid x] = 0$$) needed for the finite-sample property?
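A sketch of the distinction (my own construction, not from the post): take $$x \sim N(0,1)$$ and $$\epsilon = x^2 - 1$$. Then $$\mathrm{cov}(\epsilon, x) = E[x^3] = 0$$, so OLS is still consistent, even though $$E[\epsilon \mid x] = x^2 - 1 \neq 0$$, which is exactly the condition the finite-sample unbiasedness argument (conditioning on $$x$$) relies on.

```python
import numpy as np

rng = np.random.default_rng(2)
beta0, beta1 = 1.0, 2.0

def slope(n):
    x = rng.normal(size=n)
    eps = x ** 2 - 1        # cov(eps, x) = E[x^3] = 0, but E[eps | x] != 0
    y = beta0 + beta1 * x + eps
    xc = x - x.mean()
    return (xc * y).sum() / (xc ** 2).sum()

s = slope(1_000_000)
print(s)                    # close to beta1 = 2: consistency survives
```

Zero covariance pins down the probability limit, but unbiasedness is a statement about $$E[\hat{\beta}_1 \mid x]$$ at every fixed $$n$$, and that conditional expectation needs $$E[\epsilon_i \mid x] = 0$$ to collapse to $$\beta_1$$.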

3. Also a question about thinking about 'identification'. If I think of the model above as the causal model, and I can say my OLS is consistent, does that mean I have 'identified' the true $$\beta_1$$? Conversely, can I think of the model as not identified when $$\mathrm{cov}(\epsilon, x) \neq 0$$, in which case $$\hat{\beta}_1$$ converges in probability to the true $$\beta_1$$ plus some other term, so I fail to isolate the underlying parameter?
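The "some other term" in the identification question can be made explicit with a minimal endogeneity simulation (my own sketch, with made-up numbers): when $$\mathrm{cov}(\epsilon, x) = c \neq 0$$, the OLS slope converges to $$\beta_1 + c/\mathrm{var}(x)$$, so the data alone cannot separate $$\beta_1$$ from the contamination term.

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta0, beta1 = 1_000_000, 1.0, 2.0

x = rng.normal(size=n)
u = rng.normal(size=n)
eps = 0.5 * x + u                   # endogeneity: cov(eps, x) = 0.5, var(x) = 1
y = beta0 + beta1 * x + eps

xc = x - x.mean()
beta1_hat = (xc * y).sum() / (xc ** 2).sum()
print(beta1_hat)                    # ~ beta1 + cov(eps, x) / var(x) = 2.5
```

The estimator converges cleanly, just to the wrong number: with these parameters the probability limit is 2.5, not 2, and no amount of data fixes that. That is the sense in which $$\beta_1$$ is not identified by OLS when the orthogonality condition fails.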

