
I suspect the answer is yes, but I'd appreciate help in proving it (we already know the estimator is BLUE, so it should probably hold).

For context: inverse-variance weighting applies when we have several estimators $y_i$ of the same parameter (say, $\mu$), each with a known variance $\sigma_i^2$. In that case, the minimum-variance combination is the following weighted estimator:

$$ \hat{y} = \frac{\sum_i y_i / \sigma_i^2}{\sum_i 1/\sigma_i^2}$$
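As a minimal numeric illustration in R (the estimates and standard deviations below are made up), this is just a weighted mean with weights $1/\sigma_i^2$:

```r
# Minimal illustration of inverse-variance weighting (made-up numbers)
y     <- c(10.2, 9.7, 10.5)   # independent estimates of the same parameter mu
sigma <- c(0.5, 1.0, 2.0)     # their known standard deviations

# sum(y / sigma^2) / sum(1 / sigma^2), i.e. a weighted mean with weights 1/sigma^2
y_hat <- weighted.mean(y, w = 1 / sigma^2)
y_hat
```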

At the same time, in simple linear regression, the slope estimator can be read as a weighted average of per-point slope estimates, as follows:

$$ \hat \beta = \frac{ \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y}) }{ \sum_{i=1}^n (x_i - \bar{x})^2 } = \frac{ \sum_{i=1}^n (x_i - \bar{x})^2\frac{(y_i - \bar{y})}{(x_i - \bar{x})} }{ \sum_{i=1}^n (x_i - \bar{x})^2 } = \sum_{i=1}^n \frac{ (x_i - \bar{x})^2}{ \sum_{i=1}^n (x_i - \bar{x})^2 } \frac{(y_i - \bar{y})}{(x_i - \bar{x})} $$

That is, each observation's slope estimate $\frac{y_i - \bar{y}}{x_i - \bar{x}}$ gets the weight $w_i = \frac{ (x_i - \bar{x})^2}{ \sum_{i=1}^n (x_i - \bar{x})^2 }$.
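As a quick numeric sanity check of this identity (a sketch with simulated data; the seed, sample size, and true coefficients are arbitrary), the slope returned by `lm()` equals the weighted mean of the per-point slopes $\frac{y_i - \bar y}{x_i - \bar x}$ with the weights $w_i$:

```r
set.seed(1)
n <- 50
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)    # arbitrary "true" intercept and slope

beta_lm <- coef(lm(y ~ x))[["x"]]                   # OLS slope

w       <- (x - mean(x))^2 / sum((x - mean(x))^2)   # the weights w_i
slope_i <- (y - mean(y)) / (x - mean(x))            # per-point slope estimates
beta_w  <- sum(w * slope_i)                         # weighted mean of the slopes

all.equal(beta_lm, beta_w)   # TRUE
```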

Hence the question: is this true (under the usual normality assumptions), and how can it be shown exactly?

Asymptotic result

If we let the sample size go to infinity, then $\bar y \rightarrow \mu$, so $\bar y$ can be treated as fixed. Hence:

$$\operatorname{Var}\left(\frac{y_i - \bar{y}}{x_i - \bar{x}}\right) = \frac{1}{(x_i - \bar{x})^2}\operatorname{Var}(y_i - \mu) = \frac{\sigma^2}{(x_i - \bar{x})^2}$$
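As a rough simulation check of this variance claim (a sketch only; the fixed design, $\sigma$, and sample size are arbitrary), we can hold the $x_i$ fixed, redraw $y$ many times, and compare the empirical variance of each per-point slope to $\sigma^2/(x_i - \bar x)^2$:

```r
set.seed(2)
n     <- 500          # fairly large n, so that ybar is nearly constant
x     <- rnorm(n)     # fixed design
sigma <- 2
reps  <- 5000

# n x reps matrix: each column holds the per-point slopes for one simulated y
slopes <- replicate(reps, {
  y <- 1 + 0.5 * x + rnorm(n, sd = sigma)   # arbitrary true coefficients
  (y - mean(y)) / (x - mean(x))
})

emp_var    <- apply(slopes, 1, var)         # empirical variance per point
approx_var <- sigma^2 / (x - mean(x))^2     # the asymptotic formula above

summary(emp_var / approx_var)   # ratios should concentrate around 1
```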

This means we can think of the real weight as:

$$w_i = \frac{ (x_i - \bar{x})^2 \frac{1}{\sigma^2} }{ \sum_{i=1}^n (x_i - \bar{x})^2 \frac{1}{\sigma^2}} = \frac{ (x_i - \bar{x})^2 }{ \sum_{i=1}^n (x_i - \bar{x})^2 } $$

So this shows that, as $n \to \infty$, the slope estimator is indeed an inverse-variance weighted estimator. But I'm not sure how to do it for finite samples, since there is a dependency structure between $y_i$ and $\bar y$. It probably requires going through the multivariate case (see here), but I'd love some help on how to do it.
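To make that dependence concrete (again just a simulation sketch, with a small fixed design and arbitrary parameters), the covariance between two of the per-point slope estimators is generally not zero in a finite sample:

```r
set.seed(3)
n     <- 10           # small sample, fixed design
x     <- rnorm(n)
sigma <- 1
reps  <- 20000

slopes <- replicate(reps, {
  y <- 1 + 0.5 * x + rnorm(n, sd = sigma)
  (y - mean(y)) / (x - mean(x))
})

# Covariance between, e.g., the 1st and 2nd per-point slope estimators
cov(slopes[1, ], slopes[2, ])   # typically clearly nonzero
```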

Thanks :)

Tal Galili
  • Since $\sum (x_i-\overline x)(y_i-\overline y)=\sum (x_i-\overline x)y_i$, it seems $\hat\beta=\sum w_i y_i$ with $w_i=\frac{x_i-\overline x}{\sum (x_i-\overline x)^2}$. – StubbornAtom Jan 02 '22 at 20:55
  • @StubbornAtom - the elegant point about using $\frac{y_i-\bar y}{x_i-\bar x}$, is that it is an estimator of the slope by a single point (given the mean). So that the slope $\hat \beta$ gets this lovely interpretation of a weighted mean of many estimators of the slope, so that the weight is that of inverse variance. And this nice description is what I want to see proof of. Makes sense? – Tal Galili Jan 05 '22 at 09:40
  • It's not evident what you mean by "inverse-variance weighting estimator." Your analysis clearly shows $\hat\beta$ is a weighted linear combination of the responses. This continues to hold for any OLS multiple regression (just inspect the usual matrix formula: it represents a linear combination of $\mathbf y.$) But the weights just as clearly are *not* inverse variances of the quantities they are multiplying. Could you explain your terminology, then? – whuber Jan 05 '22 at 20:57
  • Hey @whuber ! You wrote: "Your analysis clearly shows $\hat\beta$ is a weighted linear combination of the responses". That is true, but it's the less interesting part. The more interesting part is that it's a weighted linear combination of the estimation of the slope from each data point. I.e.: having a linear combination of $y_i$ is nice. Having a linear combination of $\frac{y_i- \bar y}{x_i- \bar x}$ is VERY COOL. WDYT? – Tal Galili Jan 09 '22 at 15:01
  • Also, the asymptotic result I wrote shows that the weights ARE inverse variance of the quantities I care about. But only asymptotically. To be exact probably requires something more complex, which is why I'm asking about it here. – Tal Galili Jan 09 '22 at 15:03
  • Re very cool: not at all, I'm afraid. Literally every linear combination of the residuals $e_i=y_i-\bar y$ can be so expressed. This is due to the requirement that all the $x_i-\bar x$ are nonzero along with the rules for multiplication of fractions, because given arbitrary coefficients $\omega_i,$ you can always write $$\sum \omega_i e_i = \sum \left(\omega_i (x_i-\bar x)\right) \frac{e_i}{x_i-\bar x}.$$ I still don't understand what you mean by "inverse variance" or even what the asymptotics might be. – whuber Jan 09 '22 at 18:33
  • Allow me to try explaining this again. Let's take a simplified case: a linear regression that always goes through the (0,0) point (lm(y~x+0)). Let's say that we have only 2 points for fitting this linear regression. Then the thing we care to estimate is only the slope. In such a case, we could have said we wanted to estimate the slope using the following method: estimate the slope using only the first point $\beta_{(1)} = \frac{y_1}{x_1}$, and then again using only the second point $\beta_{(2)} = \frac{y_2}{x_2}$. – Tal Galili Jan 10 '22 at 05:44
  • Now our estimator could simply be the average of the two points: $\beta' = \frac{\beta_{(1)} + \beta_{(2)}}{2} $. Is this a good estimator or a bad estimator? If we had known the variance of $\beta_{(1)}$ (say, $v_1$) and $\beta_{(2)}$ (say, $v_2$) , it would have been better (i.e.: lower variance) to estimate the slope using: $\beta'' = \frac{\frac{1}{v_1}\beta_{(1)} + \frac{1}{v_2}\beta_{(2)}}{\frac{1}{v_1} + \frac{1}{v_2}} $. What I'm suspecting is that using $\beta''$ is actually the same as using our regular regression equations. And I want to prove that's the case. – Tal Galili Jan 10 '22 at 05:47
  • If I am following you, it appears that the "variance" to which you refer is *not* the variance of the $y_i;$ it is *not* the variance of the $y_i-\bar y;$ it is *not* the variance of any of the explanatory variables $x;$ but it *is* the variance of some particular estimator. Please clarify which meaning of "variance" you intend. – whuber Jan 10 '22 at 14:59
  • Indeed. I mean the variance of the $y_i/x_i$ (for a fixed intercept of 0), and subtracting the means otherwise. – Tal Galili Jan 10 '22 at 18:22
  • Thank you--now I see what you're getting at. The inverse variances are $\omega_i=n/(n-1)\times(x_i-\bar x)^2/\sigma^2$ and (therefore) $\hat\beta$ is the $\omega$ weighted mean of the slopes $(y_i-\bar y)/(x_i-\bar x)$ (the common factor of $n\sigma^2/(n-1)$ drops out). But what is there left to show? – whuber Jan 11 '22 at 22:49
  • @whuber great, glad we're on the same page :). Things to show: (1) deriving the exact variance $V(\frac{y_i - \bar y}{x_i - \bar x})$. (2) what is $cov(\frac{y_i - \bar y}{x_i - \bar x}, \frac{y_j - \bar y}{x_j - \bar x})$? (3) if the cov from 2 is not 0 (which I expect it is not), how to plug it into the generalized inverse variance formula. – Tal Galili Jan 12 '22 at 08:48
  • (1) is an elementary exercise in computing (co)variances, because it's a linear combination of the $y_i.$ (2) Ditto. (3) What do you mean by a "generalized inverse variance formula"? – whuber Jan 12 '22 at 16:03
  • @whuber first, could you please leave some answer? I'd be happy to give you the bounty, and you're welcome to update the answer later. – Tal Galili Jan 13 '22 at 14:57
  • @whuber, second. Regarding 1 and 2, I agree these should be simple. The trick is with 3. The formula of inverse variance weights is based on estimators that are independent (!), while the estimators I mentioned are not independent. So it should probably be used with this formula: https://en.wikipedia.org/wiki/Inverse-variance_weighting#Multivariate_Case But I'm not sure how to use it exactly. – Tal Galili Jan 13 '22 at 14:59
  • I see. I haven't time to write things out, but it seems like you wish to minimize the variance of a linear combination of estimators while keeping that combination unbiased. That (fortunately) is a straightforward optimization problem: the unbiased condition is a linear constraint on the coefficients and the variance (computed with the covariance matrix) is a quadratic function. Solve it by introducing a Lagrange multiplier. – whuber Jan 13 '22 at 15:02
  • @whuber, I'm patient - if you'd get to write it in the future, I'd be happy to read it :) – Tal Galili Jan 13 '22 at 15:32
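Following up on the two-point, through-origin example discussed in the comments above (a sketch only; the data are made up): for `lm(y ~ x + 0)` the per-point slopes are $y_i/x_i$ with variances $\sigma^2/x_i^2$, so the inverse-variance weights are proportional to $x_i^2$ and the weighted average reproduces the OLS slope exactly, even with just two points.

```r
x <- c(1, 3)          # two made-up points
y <- c(2.1, 5.7)

beta_lm <- coef(lm(y ~ x + 0))[["x"]]   # OLS slope through the origin

slope_i <- y / x       # per-point slope estimates
w       <- x^2         # inverse variances up to the common factor 1/sigma^2
beta_iv <- sum(w * slope_i) / sum(w)

all.equal(beta_lm, beta_iv)   # TRUE
```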
