So I have a question about the unbiasedness of the OLS estimator. It is unbiased when $E\{\epsilon|X\} = 0$ (together with some other assumptions), where $X$ is the regressor. Is it still unbiased if I relax this requirement and instead assume $E\{\epsilon\}=0$ and $E\{X\epsilon\} = 0$? I read a paper where this is referred to as weak orthogonality. However, I do not know whether these two conditions, $E\{\epsilon\}=0$ and $E\{X\epsilon\} = 0$, lead to unbiasedness or only to consistency.
-
Are you considering $X$ to be random, or fixed? If $X$ is fixed then $E(\epsilon) = 0 \Rightarrow E(X \epsilon) = X E(\epsilon) = 0$. – Andrew M Oct 28 '15 at 23:15
-
No, $X$ is random. I am not sure if these conditions lead to consistency or unbiasedness. – StellaLee Oct 28 '15 at 23:23
-
Do you have an alternate model in mind besides $Y = X\beta + \epsilon$ for the true distribution of $Y|X$? Are we omitting a predictor or otherwise mis-specifying the model? Or are you defining the estimand to be $E( (X^T X)^{-1} X^T y)$ (i.e., the population OLS estimate for covariates $X$)? – Andrew M Oct 28 '15 at 23:40
-
Yes, the estimator should be $(X^TX)^{-1}X^Ty$. No, we are not omitting a predictor. I am curious as to when OLS is unbiased: is $E\{\epsilon|X\}=0$ sufficient or not? – StellaLee Oct 29 '15 at 00:11
-
Because I think the conditions $E\{\epsilon\} = 0$ and $E\{X\epsilon\} = 0$ are weaker conditions, I don't know whether they still lead to unbiasedness or just to consistency... – StellaLee Oct 29 '15 at 00:13
1 Answer
The OLS estimate is defined by $$ \hat \beta = \left(X^T X \right)^{-1}X^Ty $$ and we wish to find out when $E(\hat \beta) = \beta$ under the model $Y = X \beta + \epsilon$ with both $X$ and $\epsilon$ drawn from distribution $P$.
$$ E (\hat \beta) = E \left( \left( X^T X \right)^{-1}X^T \left( X \beta + \epsilon \right) \right) $$
$$ = E \left( \left( X^T X \right)^{-1} \left(X^T X \right) \beta + \left( X^T X \right)^{-1}X^T \epsilon \right) $$
$$ = E \left( \beta \right) + E \left( \left( X^T X \right)^{-1}X^T \epsilon \right) $$ so we need to kill off the second term somehow. Clearly $E(\epsilon |X)=0$ would do it: by the law of iterated expectations, $E \left( \left( X^T X \right)^{-1}X^T \epsilon \right) = E \left( \left( X^T X \right)^{-1}X^T \, E(\epsilon|X) \right) = 0$. More generally, it suffices to assume directly that the second term has expectation zero, which is to say that the Moore-Penrose pseudoinverse of $X$ applied to $\epsilon$ has zero expectation.
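As a quick numerical sanity check (a minimal simulation sketch, not part of the original answer; the dimensions, seed, and replication count are my own choices), drawing $\epsilon$ independently of $X$ forces $E(\epsilon|X)=0$, and the leftover term indeed averages to zero:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 20, 2, 50_000

terms = np.empty((reps, p))
for r in range(reps):
    X = rng.standard_normal((n, p))   # continuous random regressors
    eps = rng.standard_normal(n)      # independent of X, hence E(eps | X) = 0
    # the "leftover" term (X^T X)^{-1} X^T eps from the derivation above
    terms[r] = np.linalg.solve(X.T @ X, X.T @ eps)

# Monte Carlo estimate of E[(X^T X)^{-1} X^T eps]; should be close to (0, 0)
print(terms.mean(axis=0))
```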

-
Your derivation seems to be claiming that $E \left( \left( X^T X \right)^{-1}X^T \epsilon \right)=E(\epsilon)$, which is not true. – Christoph Hanck Oct 29 '15 at 07:06
-
Not only is that assertion untrue, the conclusion is false as well. The weaker conditions do not imply the OLS estimator is unbiased. (In fact, additional conditions are needed just to make sure the OLS estimator is a well defined random variable! In particular, $X$ cannot be discrete.) – whuber Oct 29 '15 at 16:49
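To make this concrete, here is a hedged counterexample sketch (my own construction, not from the thread): take $x \sim N(0,1)$ and $\epsilon = x^3 - 3x$. Then $E(\epsilon) = 0$ and $E(x\epsilon) = E(x^4) - 3E(x^2) = 3 - 3 = 0$, yet $E(\epsilon|x) = x^3 - 3x \neq 0$, and the no-intercept OLS slope is biased in finite samples while the bias vanishes as $n$ grows, i.e. consistency without unbiasedness.

```python
import numpy as np

rng = np.random.default_rng(1)
beta, reps = 1.0, 20_000

for n in (1, 2, 5, 20, 100):
    x = rng.standard_normal((reps, n))
    eps = x**3 - 3 * x                 # E(eps) = 0 and E(x * eps) = 0 both hold
    y = beta * x + eps                 # but E(eps | x) = x^3 - 3x != 0
    bhat = (x * y).sum(axis=1) / (x**2).sum(axis=1)  # no-intercept OLS slope
    # empirical bias: exactly -2 at n = 1 (it equals E[x^2 - 3]), shrinking toward 0 with n
    print(n, bhat.mean() - beta)
```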
-
@whuber: Interesting regarding the conditions for $\hat \beta$ to exist as a random variable. I would have guessed that the only condition would need to be that $X$ has full column rank almost surely. Do you have a reference? – Andrew M Oct 29 '15 at 17:26
-
@AndrewM When $X$ is discrete, there is a positive chance that all the $X_i$ are identical, in which case the OLS estimator is undefined. You need to assume $X$ is continuous in order to avoid this possibility. – whuber Oct 29 '15 at 19:09
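whuber's point about discrete $X$ is easy to check numerically (a toy sketch of my own, not from the thread): with a Bernoulli regressor and $n = 3$, all observations coincide with probability $2 \times (1/2)^3 = 1/4$, and on that event the intercept column and the regressor column are collinear, so $X^T X$ is singular and $\hat \beta$ is undefined.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 3, 100_000

x = rng.integers(0, 2, size=(reps, n))   # discrete (Bernoulli) regressor
# fraction of samples where all x_i are identical, making [1, x] rank deficient
all_equal = (x.min(axis=1) == x.max(axis=1)).mean()
print(all_equal)                         # approx. 2 * 0.5**3 = 0.25
```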