
Suppose

$$ \mathbf{y} = \mathbf{X} \mathbf{b} + \mathbf{e} \, , \\ \mathbf{e} \sim \mathcal{N}(0,\mathbf{I}_P) \, . $$

We know that $\mathbf{\hat{b}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}$ is the best linear unbiased estimator (BLUE).
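
For concreteness, here is a minimal simulation of this setup (NumPy; the dimensions $n = 100$, $p = 3$ and the seed are arbitrary choices of mine, not part of the model):

```python
import numpy as np

# Simulate y = X b + e with e ~ N(0, I) and check the OLS formula.
# n = 100 observations, p = 3 coefficients -- arbitrary illustrative choices.
rng = np.random.default_rng(0)
n, p = 100, 3
X = rng.standard_normal((n, p))
b = np.array([1.0, -2.0, 0.5])
y = X @ b + rng.standard_normal(n)

# b_hat = (X^T X)^{-1} X^T y, computed via a linear solve for stability
b_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(b_hat)                                 # close to b
print(np.linalg.lstsq(X, y, rcond=None)[0])  # same numbers
```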

Is it also the uniformly minimum-variance unbiased estimator (UMVUE)? I can only find a single source (page 6) that claims this, so I'm unsure.

In the case $\mathbf{X}=\mathbf{1}$ it is true: $\mathbf{\hat{b}}$ reduces to the sample mean, which is known to be the UMVUE of a Gaussian mean.
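
A quick numerical check of this special case (again just a sketch; the sample size is arbitrary):

```python
import numpy as np

# Special case X = 1 (a column of ones): (X^T X)^{-1} X^T y
# collapses to (1/n) * sum(y), i.e. the sample mean.
rng = np.random.default_rng(1)
y = 2.0 + rng.standard_normal(50)      # b = 2, X = 1, e ~ N(0, I)
X = np.ones((50, 1))
b_hat = np.linalg.solve(X.T @ X, X.T @ y)
assert np.isclose(b_hat[0], y.mean())  # OLS estimate equals the sample mean
```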

But other related results, like Stein's example, make me cautious.

And if it's true, then why isn't it more famous?

Patrick
  • The second U in UMVUE stands, like the U in BLUE, for "unbiased". Stein's estimator gives up on unbiasedness through shrinkage in order to achieve an overall lower risk. – Christoph Hanck May 15 '18 at 11:26
  • Yup, I'm aware. I just brought it up as a *related* case. – Patrick May 15 '18 at 11:29
  • Another source: http://www.stat.wisc.edu/~doksum/STAT709/n709-36.pdf – amoeba May 15 '18 at 12:48
  • Thanks @amoeba . I saw that one, but it talks about $l^\tau b$ (what's $l$?) and presumes that it is estimable (?), and from the start of the proof the linearity seems imposed. Maybe I'm not understanding it well, but it does not seem to answer my question. – Patrick May 15 '18 at 12:59
  • You don't need to care about $l$, just look at the proof of (i) until the last line or so. They show that $\hat\beta$ is a function of a complete sufficient statistic, from which UMVUE follows by https://en.wikipedia.org/wiki/Lehmann%E2%80%93Scheff%C3%A9_theorem. Nothing is imposed here. Anyway, I just wanted to give this link. +1, good question. – amoeba May 15 '18 at 13:10
  • https://stats.stackexchange.com/questions/288674/are-there-unbiased-non-linear-estimators-with-lower-variance-than-the-ols-estim#comment551719_288674 – Cagdas Ozgenc May 15 '18 at 13:23
  • Thanks @CagdasOzgenc, that's brilliant! While I now better understand amoeba's reference, I think I prefer the more direct proof in your reference [http://www.econ.ohio-state.edu/dejong/note5.pdf, page 17]. Please write an answer if you wish. And then, can somebody tell me: why is the BLUE result so much more famous than the UMVUE result? – Patrick May 15 '18 at 14:00
  • No problem. Stein's example doesn't apply here because we consider only unbiased estimators; Stein's example and other shrinkage techniques introduce bias in exchange for lower variance. I think BLUE is more advertised because the basic regression setting assumes spherical errors, not necessarily Gaussian errors (the Gaussian being only one spherical distribution) or some other predefined distribution under which a non-linear estimator would beat the BLUE: https://en.wikipedia.org/wiki/Gauss%E2%80%93Markov_theorem. Basically, if the errors are spherical but not Gaussian, OLS will be BLUE but not necessarily UMVUE. – Cagdas Ozgenc May 15 '18 at 15:08
  • Thanks @CagdasOzgenc, although I was thinking more of the question "Why is imposing linearity favoured (in popularity) over assuming Gaussianity?" I'm guessing the reason is historical, but IMHO it's an omission for textbooks not to present the UMVUE result more equitably... – Patrick May 15 '18 at 15:23
  • Does this answer your question? [Prove that OLS estimator of the intercept has minimum variance](https://stats.stackexchange.com/questions/175373/prove-that-ols-estimator-of-the-intercept-has-minimum-variance) – StubbornAtom May 04 '20 at 20:00
  • This is a great question and I am baffled that it is still not answered. I am in the same boat: I have found bits and pieces that mention, between the lines, that if the noise is Gaussian then OLS is also the MVUE, but no further details. If anyone can give a concise answer on the relationship between OLS, BLUE and MVUE, that would be very helpful. – divB Oct 20 '20 at 06:44
  • BTW, the references to the Lehmann–Scheffé theorem are helpful, but a great answer would discuss why only the Gaussian case gives the MVUE. What is the MVUE when $\mathbf{e}$ is, say, Rayleigh? – divB Oct 20 '20 at 06:50
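
To summarize the argument that amoeba and Cagdas Ozgenc point to above (my paraphrase; it assumes $\mathbf{X}$ has full column rank and, as in the question, known unit error variance): with $\mathbf{e} \sim \mathcal{N}(0,\mathbf{I})$, the density of $\mathbf{y}$ is

$$ f(\mathbf{y};\mathbf{b}) \propto \exp\!\left( -\tfrac{1}{2}\,\|\mathbf{y}-\mathbf{X}\mathbf{b}\|^2 \right) = \exp\!\left( \mathbf{b}^T \mathbf{X}^T \mathbf{y} \right) \exp\!\left( -\tfrac{1}{2}\,\|\mathbf{X}\mathbf{b}\|^2 - \tfrac{1}{2}\,\|\mathbf{y}\|^2 \right) \, , $$

a full-rank exponential family in which $T = \mathbf{X}^T \mathbf{y}$ is a complete sufficient statistic for $\mathbf{b}$. Since $\mathbf{\hat{b}} = (\mathbf{X}^T\mathbf{X})^{-1} T$ is unbiased and a function of $T$ alone, the Lehmann–Scheffé theorem makes it the UMVUE. Completeness of $T$ hinges on Gaussianity; as noted in the comments, for other spherical error distributions OLS remains the BLUE but need not be the UMVUE.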

0 Answers