#StackBounty: #least-squares #asymptotics #average Asymptotic dist of an average involving OLS coefs?

Bounty: 50

Suppose that we have an iid sample of size $n$, i.e., the random vector $(Y_{i}, X_{1i}, X_{2i}, X_{3i})$ is iid for $i = 1, \ldots, n$. And suppose the following relationship is true:

$$
Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i} X_{2i} + \epsilon_i
$$

Suppose for simplicity that $X_{1i}$ and $X_{2i}$ are uniformly distributed from 0 to 1, and are correlated. Let’s assume further that $\epsilon_i$ is normally distributed and independent of $X_{1i}$ and $X_{2i}$.

Let the OLS estimators be $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3$.

Let $Z_i$ be

$$
Z_i = 1\hat{\beta}_0 + 2X_{1i}\hat{\beta}_1 + 3X_{2i}\hat{\beta}_2 + 4\hat{\beta}_3 X_{1i} X_{2i}
$$

How do I find the asymptotic distribution of $\bar{Z}=\frac{1}{n}\sum_{i=1}^n Z_i$?

I cannot apply a CLT directly, since the $Z_i$ are correlated with each other through the $\hat{\beta}$. In addition to a solution for this particular case, any reference to theory I can study related to this would be helpful. I do not have advanced knowledge of statistical theory.

I would like to derive a non-degenerate asymptotic distribution, i.e., something like that of $\sqrt{n}(\bar{Z} - E(Z_i))$.
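A minimal Monte Carlo sketch of one way to check a candidate answer numerically. Everything concrete here is an assumption that is not in the question: the true coefficients `BETA`, the error standard deviation `SIGMA`, and the mixture used to make two Uniform(0,1) variables correlated. The sketch writes $\bar{Z} = \bar{W}'\hat{\beta}$ with $W_i = (1, 2X_{1i}, 3X_{2i}, 4X_{1i}X_{2i})'$, centers it at $E(W_i)'\beta$, and compares the simulated variance of $\sqrt{n}(\bar{Z} - E(W_i)'\beta)$ with a delta-method candidate $\sigma^2 \mu_W' Q^{-1} \mu_W + \beta' \Sigma_W \beta$, where $Q = E(x_i x_i')$ is the second-moment matrix of the regressors. The candidate formula relies on $\epsilon_i$ being independent of the regressors and should be treated as something to verify, not as a settled result.

```python
import numpy as np

BETA = np.array([1.0, 2.0, -1.0, 0.5])  # hypothetical true coefficients
SIGMA = 1.0                             # hypothetical error standard deviation
C = np.array([1.0, 2.0, 3.0, 4.0])      # the weights 1, 2, 3, 4 appearing in Z_i


def draw_x(n, rng):
    """Correlated Uniform(0,1) pair: X2 copies X1 with probability 1/2 (an assumption)."""
    x1 = rng.uniform(size=n)
    fresh = rng.uniform(size=n)
    x2 = np.where(rng.uniform(size=n) < 0.5, x1, fresh)
    return x1, x2


def design(x1, x2):
    """Regressor matrix with columns 1, X1, X2, X1*X2."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])


def delta_method_variance(rng, m=400_000):
    """Approximate mu_W and the candidate asymptotic variance from one large sample."""
    X = design(*draw_x(m, rng))
    W = X * C
    mu_w = W.mean(axis=0)
    Q = X.T @ X / m
    sigma_w = np.cov(W, rowvar=False)
    var = SIGMA**2 * mu_w @ np.linalg.solve(Q, mu_w) + BETA @ sigma_w @ BETA
    return mu_w, var


def scaled_errors(n, reps, rng, mu_w):
    """Simulate sqrt(n) * (Z_bar - mu_W' beta) over `reps` independent samples."""
    target = mu_w @ BETA
    out = np.empty(reps)
    for r in range(reps):
        X = design(*draw_x(n, rng))
        y = X @ BETA + SIGMA * rng.normal(size=n)
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        z_bar = (X * C).mean(axis=0) @ beta_hat  # equals the mean of Z_i
        out[r] = np.sqrt(n) * (z_bar - target)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu_w, v = delta_method_variance(rng)
    errs = scaled_errors(n=400, reps=1500, rng=rng, mu_w=mu_w)
    print("simulated variance:", errs.var())
    print("delta-method candidate:", v)
```

The decomposition behind the candidate is $\bar{W}'\hat{\beta} - \mu_W'\beta = \bar{W}'(\hat{\beta} - \beta) + (\bar{W} - \mu_W)'\beta$: each piece has its own CLT, and the two are asymptotically uncorrelated when the errors are independent of the regressors.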



#StackBounty: #average #smoothing #binning Smoothing binned averages

Bounty: 200

I am trying to smooth some binned data. I have a discrete variable X, which might best be thought of as time, and a continuous variable Y. I want to know the average Y value for each value of X, which is straightforward. However, if some specific X values have few associated Y values, the averages for those bins have high statistical error. I would like to smooth the averages by using the support from their adjacent bins.

An example might be illustrative. Let X be days and Y be the sale price at a retail store. If I want to know the average sale price trend over time, this can be easily calculated. However, days on which only one or two items were sold could cause the plot to fluctuate wildly. I would like to reduce the statistical error on each day by incorporating the adjacent few days. I assume there are a number of ways to do this, so please let me know if there is a standard one.

Please note I do not want to interpolate. I just want to control the statistical fluctuations. Also, the most recent day is likely the most important. Since this day only has bins on one side of it, I suspect many methods would be biased there. Is there a way to add an error bar to the adjusted average in a meaningful way?
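A minimal sketch of one possible approach, not a claim about what the standard method is: a one-sided (trailing) kernel average, so that every day, including the most recent one, borrows only from itself and earlier days. The function name `trailing_smooth`, the exponential kernel, and the `half_life` parameter are all hypothetical choices. The error bar assumes independent observations with a common variance, estimated by pooling the within-day scatter. It measures statistical noise only, so any bias from pooling days whose true averages differ is not included.

```python
import numpy as np


def trailing_smooth(days, values, half_life=2.0):
    """Trailing kernel-weighted daily means with a standard error.

    days      : numeric day index for each observation
    values    : observed value for each observation
    half_life : lag (in days) at which the kernel weight drops to 1/2 (hypothetical knob)
    """
    days = np.asarray(days)
    values = np.asarray(values, dtype=float)

    uniq, inv = np.unique(days, return_inverse=True)  # sorted distinct days
    counts = np.bincount(inv)                         # observations per day
    sums = np.bincount(inv, weights=values)           # sum of values per day
    means = sums / counts                             # raw binned averages

    # Pooled within-day variance (assumes a common variance across days).
    resid = values - means[inv]
    dof = len(values) - len(uniq)
    sigma2 = (resid**2).sum() / dof if dof > 0 else np.nan

    smoothed = np.empty(len(uniq))
    se = np.empty(len(uniq))
    for i, t in enumerate(uniq):
        lag = t - uniq[: i + 1]             # only the current and earlier days
        k = 0.5 ** (lag / half_life)        # kernel weight per observation
        n_past = counts[: i + 1]
        wsum = (k * n_past).sum()
        smoothed[i] = (k * sums[: i + 1]).sum() / wsum
        # Variance of a weighted mean: sigma^2 * sum(w^2) / (sum w)^2
        se[i] = np.sqrt(sigma2 * (k**2 * n_past).sum()) / wsum
    return uniq, means, smoothed, se


if __name__ == "__main__":
    days = [0, 0, 0, 1, 2, 2, 3]
    prices = [10, 12, 11, 30, 9, 13, 12]
    for d, m, s, e in zip(*trailing_smooth(days, prices)):
        print(d, round(m, 2), round(s, 2), round(e, 2))
```

Because only observed days appear in the output, nothing is interpolated. A shorter `half_life` follows the raw averages more closely, while a longer one trades more smoothing for more potential bias when the underlying trend moves.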

