What is the difference between linear regression with and without an intercept? Why and when should you use each?
1 Answer
Take a simple linear regression specification $Y_i = \alpha + \beta X_i + U_i$ with the usual assumptions.
The OLS estimator of $\beta$, $\hat\beta$, converges in probability to $Cov(X_i, Y_i) / Var(X_i)$. Substituting in the specification for $Y_i$, this expression equals $\beta$.
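Spelling out the substitution (a small sketch, using only the exogeneity condition $Cov(X_i, U_i) = 0$ from the usual assumptions):

$$\frac{Cov(X_i, Y_i)}{Var(X_i)} = \frac{Cov(X_i,\, \alpha + \beta X_i + U_i)}{Var(X_i)} = \beta + \frac{Cov(X_i, U_i)}{Var(X_i)} = \beta.$$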
Not including the intercept is equivalent to assuming $\alpha = 0$. The OLS estimator in this case converges to $E(X_i Y_i) / E(X_i^2)$ which is not equal to $\beta$ unless the residual term is mean zero, that is, $E(U_i)=0$.
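For comparison, writing out the same substitution for the no-intercept estimator, keeping the true $\alpha$ in place and again using $Cov(X_i, U_i) = 0$:

$$\frac{E(X_i Y_i)}{E(X_i^2)} = \frac{E\big(X_i(\alpha + \beta X_i + U_i)\big)}{E(X_i^2)} = \beta + \frac{E(X_i)\,\big(\alpha + E(U_i)\big)}{E(X_i^2)},$$

so, apart from the edge case $E(X_i) = 0$, the no-intercept slope misses $\beta$ whenever the combined residual $\alpha + U_i$ has nonzero mean.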
Including the intercept normalizes the residual to have mean zero, so I am hard-pressed to give you an example of a case where you should not include it.
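A minimal simulation sketch of the difference (the parameter values and numpy-only setup are illustrative assumptions): the true $\alpha$ is nonzero, so the fit with an intercept recovers $\beta$ while the fit through the origin does not.

```python
import numpy as np

# Illustrative simulation: OLS slope with vs. without an intercept
# when the true intercept alpha is nonzero (assumed values below).
rng = np.random.default_rng(0)

n = 100_000
alpha, beta = 2.0, 0.5                        # assumed true parameters
x = rng.normal(loc=1.0, scale=1.0, size=n)    # E(X) != 0, so the bias is visible
u = rng.normal(loc=0.0, scale=1.0, size=n)    # mean-zero error, independent of x
y = alpha + beta * x + u

# With intercept: regress y on [1, x]
X_with = np.column_stack([np.ones(n), x])
slope_with = np.linalg.lstsq(X_with, y, rcond=None)[0][1]

# Without intercept: regress y on x alone (regression through the origin)
slope_without = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]

print(f"with intercept:    {slope_with:.3f}")    # ~0.5 (consistent for beta)
print(f"without intercept: {slope_without:.3f}")  # ~1.5 = beta + E(X)*alpha / E(X^2)
```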

Student
- This might be a little confusing, but let me express my quibble with your answer. There is an identification problem: either we assume $\alpha\neq 0$ and $\mathbb{E}(U)=0$, or $\alpha=0$ and $\mathbb{E}(U)\neq0$; otherwise, we cannot separately identify $\alpha$ and $\mathbb{E}(U)$. Normally, $\mathbb{E}(U)=0$ is assumed. With this out of the way, the question is, should we assume $\alpha=0$? If we do and the assumption holds, $\hat\beta$ converges to $\beta$. If we do and the assumption fails, this does not happen. But the reason is the failed assumption on $\alpha$, not on $\mathbb{E}(U)$. – Richard Hardy Nov 07 '19 at 15:42
- (Not sure if this is relevant but: the fact that the mean of errors will not be exactly zero in a finite sample does not affect the asymptotic results.) – Richard Hardy Nov 07 '19 at 15:44
- @Richard Hardy, I absolutely agree with you. It would have helped if I’d gone into more detail with my answer. The important assumption is $E(U_i \mid X_i) = E(U_i)$ (mean independence), and if we include the intercept, OLS will not be consistent for $\alpha$ but it _will_ be for $\beta$. If we don’t include the intercept, then in effect we are assuming $E(U_i)=-\alpha$. If this assumption is not satisfied, interceptless OLS will also not be consistent for $\beta$. – Student Nov 07 '19 at 16:08
- Regarding the penultimate sentence, since we already have the assumption $\mathbb{E}(U)=0$, this implies the assumption $\alpha=0$. If $\alpha$ is included, OLS should still be consistent regardless of the true value of $\alpha$, given the assumption $\mathbb{E}(U)=0$. By the way, a very similar argument can also be extended from the intercept to other variables and is known as omitted variable bias. The difference is that the column of ones is uncorrelated with any other variable, while other variables can be correlated, and this has some consequences. – Richard Hardy Nov 07 '19 at 16:36
- Right but we don’t need to assume $E(U_i)=0$ as long as we have the intercept. Mean independence is enough for $\hat\beta\to_p\beta$. – Student Nov 07 '19 at 16:44
- As I tried to explain in my first comment, $\mathbb{E}(U)=0$ is an identifying assumption; it is not restrictive by itself. I think that only in its light does it make sense to consider the assumption $\alpha=0$, so I suggest keeping the $\mathbb{E}(U)=0$ assumption in all considerations (as is common in regression analysis). Errors that have a nonzero expected value can always be redefined by shifting them by their population mean, which I think is why the $\mathbb{E}(U)=0$ assumption is so ubiquitous. – Richard Hardy Nov 07 '19 at 16:48
- $E(U_i)=0$ is an identifying assumption for $\alpha$ but not for $\beta$. $E(U_i\mid X_i)=E(U_i)$ is sufficient to identify $\beta$ because it implies $Cov(X_i,U_i)=0$, which is all we need. I don’t think it is worthwhile to argue about this any longer. You are making good points, but they don’t invalidate mine. We are approaching the answer from different starting points. – Student Nov 07 '19 at 17:32
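To illustrate the point made in the last comments with a minimal sketch (assumed parameter values): when $E(U) = c \neq 0$ but $U$ is mean-independent of $X$, OLS with an intercept still recovers $\beta$, and the fitted intercept converges to $\alpha + c$.

```python
import numpy as np

# Illustrative check: with a nonzero error mean c and an intercept included,
# the OLS slope is still consistent for beta; the intercept estimate absorbs c.
rng = np.random.default_rng(1)

n = 100_000
alpha, beta, c = 2.0, 0.5, 3.0                  # assumed true values; c = E(U)
x = rng.normal(loc=1.0, scale=1.0, size=n)
u = c + rng.normal(loc=0.0, scale=1.0, size=n)  # E(U) = c, independent of x
y = alpha + beta * x + u

X = np.column_stack([np.ones(n), x])
intercept_hat, beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

print(f"beta_hat:      {beta_hat:.3f}")       # ~0.5 (still consistent)
print(f"intercept_hat: {intercept_hat:.3f}")  # ~5.0 = alpha + c
```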