How does being BLUE matter in linear regression for the coefficients? What does a heteroscedasticity-consistent and autocorrelation-consistent (HAC) dispersion matrix take care of in this regard?


- The Gauss-Markov theorem is probably more of historical interest today. Knowing that your estimator is "best linear unbiased" is of little import when there are much better non-linear estimators! – kjetil b halvorsen Mar 17 '17 at 16:43
- Tell that to an economist. – generic_user Mar 17 '17 at 16:50
- @kjetilbhalvorsen your comment is largely beside the point: if the model is actually linear, then why use a non-linear model? You gain nothing. The theorem is extremely useful, and is probably one of the reasons OLS is one of the most used estimators in modern applied work. Whether that is a good or bad thing is a different question. – Repmat Mar 17 '17 at 18:18
- @Repmat: Why does a linear model force you to use a linear estimator? – kjetil b halvorsen Mar 17 '17 at 18:21
- @generic_user: Economists could use some modern statistics! Oh, maybe they do, but they can forget Gauss-Markov. – kjetil b halvorsen Mar 17 '17 at 18:22
- @kjetilbhalvorsen it doesn't, but nobody cares... The class of linear estimators is extremely wide and quite general. Linear estimators even handle non-linear models... – Repmat Mar 17 '17 at 18:38
- @Repmat: Yes, tradition is strong... The main problem is maybe that more robust methods have not developed the same machinery for inference and checking of assumptions... with errors far from normally distributed, linear estimators can be very suboptimal. – kjetil b halvorsen Mar 17 '17 at 18:46
- @kjetilbhalvorsen: Is it a linear model if the estimators are not linear functions of the response variables? – Michael Hardy Mar 17 '17 at 20:01
- @Michael Hardy: Yes, of course, that is possible. A linear model usually means that the expectation is a linear function of the predictors. It says nothing about how the parameters are estimated. If you estimate the $\beta$s in some linear model with, say, some robust M-estimator, the estimator is a non-linear function of the data, but the model is still linear. See also my http://stats.stackexchange.com/questions/120776/why-should-we-use-t-errors-instead-of-normal-errors/120787#120787 – kjetil b halvorsen Mar 17 '17 at 20:04
- @kjetilbhalvorsen: I was hasty. One definition is that the expected value of the response variable is a linear function of the unobservable but estimable parameters. – Michael Hardy Mar 17 '17 at 20:34
- I didn't receive an answer to my question. – Shreyo Mallik Mar 18 '17 at 08:45
- @kjetilbhalvorsen could you provide a reference (possibly open access) to these non-linear estimators of the linear regression model you are referring to? – DeltaIV Mar 18 '17 at 11:28
- @DeltaIV: I added some refs in my answer, but not open access. Just google the web for robust statistics, or look up MASS (the book, Venables & Ripley). – kjetil b halvorsen Mar 19 '17 at 18:12
- @kjetilbhalvorsen thanks to the Robust.pdf by Ripley, now I understand what you were referring to. Then I have a small correction for your answer – see my comment there. – DeltaIV Mar 19 '17 at 19:25
1 Answer
"How does being BLUE matter in Linear Regression for the coefficients?" (I leave the other question about robust standard errors for others)
BLUE means "best linear unbiased estimator". The Gauss-Markov theorem states that for linear models with uncorrelated errors and constant variance, the BLUE is the ordinary least squares (OLS) estimator, within the class of all linear estimators. That might have been comforting in times when limited computing power made some non-linear estimators close to impossible to compute, and when even least squares estimation could be a significant effort (read: man-hours)!
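To make "linear" concrete (a standard statement, spelled out here for clarity): a linear estimator of $\beta$ is any estimator of the form $\tilde\beta = Ay$, where the matrix $A$ may depend on the design matrix $X$ but not on the response $y$. OLS is one such estimator:
$$\hat\beta_{\text{OLS}} = (X^\top X)^{-1} X^\top y, \qquad \operatorname{Var}(\hat\beta_{\text{OLS}}) = \sigma^2 (X^\top X)^{-1},$$
and Gauss-Markov says that for any other linear unbiased estimator $\tilde\beta = Ay$, the difference $\operatorname{Var}(\tilde\beta) - \operatorname{Var}(\hat\beta_{\text{OLS}})$ is positive semi-definite.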
The ordinary least squares estimator also happens to be the maximum likelihood estimator when the errors (in addition to the assumptions above) have Gaussian distributions. Knowing that OLS is BLUE then makes it comforting to think that the Gaussian assumption is of little import (even when the errors are clearly non-normal).
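The connection is immediate (again, a standard derivation included for clarity): with $y = X\beta + \varepsilon$ and $\varepsilon \sim N(0, \sigma^2 I)$, the log-likelihood is
$$\ell(\beta) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\,\lVert y - X\beta \rVert^2,$$
so maximizing it over $\beta$ is exactly minimizing the residual sum of squares $\lVert y - X\beta \rVert^2$, i.e. doing least squares.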
The situation today is different: computing non-linear estimators is cheap. One of the commenters above asked what these non-linear estimators of $\beta$ in the linear model are. There are many. One traditional route is to use least squares after analysing residuals and influence and removing outliers; the result is a non-linear function of the original, complete data. Modern robust estimators provide many more examples; see for instance the good book https://www.amazon.com/Robust-Statistics-Methods-Ricardo-Maronna/dp/0470010924/ref=sr_1_1?s=books&ie=UTF8&qid=1489942791&sr=1-1&keywords=maronna+robust or see my answer at Why should we use t errors instead of normal errors?
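As a minimal sketch of what such a non-linear estimator looks like in practice, here is an illustrative R snippet (the simulated data and tuning choices are mine, not from any of the references) using the Huber M-estimator `rlm` from the `MASS` package mentioned above:

```r
library(MASS)  # provides rlm, a robust M-estimator for linear models

set.seed(42)
n <- 100
x <- runif(n)
y <- 1 + 2 * x + rnorm(n)   # a genuinely linear model with Gaussian errors
y[1:5] <- y[1:5] + 10       # contaminate a few observations with outliers

fit_ols <- lm(y ~ x)                     # linear estimator: BLUE under the classical assumptions
fit_rob <- rlm(y ~ x, psi = psi.huber)   # M-estimator: a non-linear function of y

coef(fit_ols)   # distorted by the contaminated points
coef(fit_rob)   # downweights the outliers; close to the true (1, 2)
```

Because the implicit weights in the M-estimation depend on the residuals, which in turn depend on $y$, the resulting $\hat\beta$ is not of the form $Ay$ with $A$ fixed, so it falls outside the class the Gauss-Markov theorem speaks about.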
Also, while variance is easy to interpret in the context of Gaussian distributions, it is not at all clear that it is as meaningful for non-Gaussian distributions, so in such cases it is less clear that minimizing variance is a good criterion. See What is the difference between finite and infinite variance.
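To illustrate that point with a small made-up simulation (my own setup, not from the references): with $t_2$ errors, which have infinite variance, the sampling spread of the OLS slope stays large while the robust slope is far more stable:

```r
library(MASS)

set.seed(1)
n <- 200; reps <- 500
x <- runif(n)
slope_ols <- slope_rob <- numeric(reps)
for (i in seq_len(reps)) {
  y <- 1 + 2 * x + rt(n, df = 2)             # heavy-tailed errors: infinite variance
  slope_ols[i] <- coef(lm(y ~ x))[2]
  slope_rob[i] <- coef(rlm(y ~ x, maxit = 50))[2]
}
# Compare the spread of the two sampling distributions of the slope
# (using the MAD, since "variance" is exactly what is in question here):
c(ols = mad(slope_ols), robust = mad(slope_rob))
```

In runs like this the robust M-estimator typically shows a much smaller spread, which is the sense in which "best" within the linear class can still be quite poor overall.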
Quoting Tukey (referenced in the book by Maronna above):
It is perfectly proper to use both classical and robust/resistant methods routinely, and only worry when they differ enough to matter. But when they differ, you should think hard.
These robust/resistant methods will usually be non-linear. One good reference (shorter than the book above) is a document by Brian Ripley: http://web.archive.org/web/20160611192739/http://www.stats.ox.ac.uk/pub/StatMeth/Robust.pdf. A quote from its first paragraph:
The classical books on this subject are Hampel et al. (1986); Huber (1981), with somewhat simpler (but partial) introductions by Rousseeuw & Leroy (1987); Staudte & Sheather (1990). The dates reflect the development of the subject: it had tremendous growth for about two decades from 1964, but failed to win over the mainstream. I think it is an important area that is used a lot less than it ought to be.
I hope the above justifies my conclusion that the Gauss-Markov theorem today is mostly of historical interest. What holds back more routine use of robust/non-linear methods is probably that implementations often lack post-estimation inferential machinery, ready to use as easily as least squares in, say, R. (Maybe that is changing; I haven't looked into this for a long time.)

- @Shreyo Mallik: Did you read my answer (and the comment thread below the question)? I am just trying to argue that being BLUE without normality does not mean much! Maybe I was not clear enough? – kjetil b halvorsen Mar 19 '17 at 17:41
- @kjetilbhalvorsen: Were you talking about this document: http://web.archive.org/web/20160611192739/http://www.stats.ox.ac.uk/pub/StatMeth/Robust.pdf – iugrina Mar 19 '17 at 18:04
- @iugrina: Yes, thank you! I will incorporate that link into the question! – kjetil b halvorsen Mar 19 '17 at 18:05
- Actually there is no lack of valid and tested implementations of robust or resistant regression methods, including tools for inference. See the `rlm` and `lqs` functions in the R package `MASS`, among others: examples and some theory [here](http://users.stat.umn.edu/~sandy/courses/8053/handouts/robust.pdf). I've very rarely used these tools, but thanks to your answer now I understand much more about them (+1). – DeltaIV Mar 19 '17 at 19:32
- @DeltaIV: Thanks! I will look at it and maybe add some of it as an addendum to the answer. – kjetil b halvorsen Mar 19 '17 at 19:54