
I'm estimating a simple OLS regression model of the form $y = X\beta + u$.

After estimating the model, I need to form a weighted combination of coefficients (e.g. $w_1 \beta_1 + w_2 \beta_2$) and estimate a standard error for the combined statistic. What's the right way to calculate the standard error of this weighted sum of coefficients?

I've gotten this far: I have plenty of cases, so it's safe to say that the asymptotic normality assumption is satisfied. Let's call $s_1$ and $s_2$ the standard errors of $\beta_1$ and $\beta_2$, respectively. If the $\beta$'s were independent estimates, we could use the usual formula for the variance of a sum of independent normals to say that the variance of $w_1\beta_1 + w_2\beta_2$ is $w_1^2 s_1^2 + w_2^2 s_2^2$. But unless I'm deeply mistaken, $\beta_1$ and $\beta_2$ aren't independent. Is there a simple way to fold the variance-covariance matrix of the coefficient estimates into the calculation to solve this problem?

Abe
  • See answer to this question: http://stats.stackexchange.com/questions/10439/confidence-interval-for-difference-of-means-in-regression – mark999 Nov 16 '12 at 05:05

3 Answers


You need to add a third term: $2 \, w_{1} w_{2} \operatorname{Cov}(\hat{\beta}_{1},\hat{\beta}_{2})$. You can find the estimated covariance in the off-diagonal entries of the variance-covariance matrix of the coefficient estimates. If you have an intercept and 2 regressors, that would (typically) be either V[2,3] or V[3,2], since $\operatorname{Cov}(\hat{\beta}_{1},\hat{\beta}_{2})=\operatorname{Cov}(\hat{\beta}_{2},\hat{\beta}_{1})$.
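
A minimal sketch of this calculation in Python, assuming statsmodels is available; the data and weights here are made up purely for illustration:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: intercept plus two regressors.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=500)

fit = sm.OLS(y, sm.add_constant(X)).fit()
V = np.asarray(fit.cov_params())   # estimated variance-covariance matrix of the coefficients

w1, w2 = 0.4, 0.6                  # illustrative weights
# Indices 1 and 2 below because index 0 is the intercept.
var_sum = (w1**2 * V[1, 1] + w2**2 * V[2, 2]
           + 2 * w1 * w2 * V[1, 2])
se_sum = np.sqrt(var_sum)          # standard error of w1*b1 + w2*b2
```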

dimitriy

So, we start with $Y = X\beta + \epsilon$ as the regression, with $\beta$ being the column vector of coefficients to be estimated and $\epsilon \sim N(0,\sigma^2 I)$. The maximum likelihood estimate $\widehat{\beta}$ of $\beta$ is well-known to be $\widehat{\beta} = (X^{\top} X)^{-1} X^{\top} Y$. Note that $\text{Var}(\widehat{\beta})$ is known to be $\sigma^2 (X^{\top}X)^{-1}$.

Your question is generalised by asking what the variance of some quantity $w^{\top}\widehat{\beta}$ is, where $w$ is a vector of the same size as $\beta$. The answer is

$\begin{align} \text{Var}(w^{\top} \widehat{\beta}) &= w^{\top} \text{Var}(\widehat{\beta})\, w\\ &= \sigma^2 w^{\top} (X^{\top}X)^{-1} w \end{align}$

Even more generally, you can ask what the variance-covariance matrix of some vector $W \widehat{\beta}$ is where $W$ is some weight matrix. The answer is practically the same:

$\begin{align} \text{Var}(W \widehat{\beta}) &= W \text{Var}(\widehat{\beta})\, W^{\top}\\ &= \sigma^2 W (X^{\top}X)^{-1} W^{\top} \end{align}$

In fact, the above result is used to derive $\text{Var}( \widehat{\beta})$ in the first place!
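
Here is a numerical sketch of the two formulas above in plain numpy; all of the data, weights, and dimensions are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # intercept + 2 regressors
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)                 # (X'X)^{-1} X'y
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - k)                         # estimate of sigma^2
V = sigma2_hat * np.linalg.inv(X.T @ X)                      # Var(beta_hat)

w = np.array([0.0, 0.4, 0.6])          # scalar combination w'beta_hat (intercept gets weight 0)
var_scalar = w @ V @ w                 # sigma^2 * w'(X'X)^{-1} w

W = np.array([[0.0, 1.0, 1.0],
              [0.0, 1.0, -1.0]])       # weight matrix: sum and difference of the two slopes
V_W = W @ V @ W.T                      # 2x2 variance-covariance matrix of W beta_hat
```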

P.S. A mistake on your part is to ask for the variance of $w_1\beta_1 + w_2\beta_2$: this is zero, since $\beta_1$ and $\beta_2$ are unknown constants to be estimated. The question ought to have asked for the variance of $w_1\widehat{\beta}_1 + w_2\widehat{\beta}_2$.

P.P.S. @Sam Livingstone: there was no need to appeal to asymptotic results, which are only approximate in general; since all the distributions in the question are Gaussian, we can derive exact distributions for finite samples.

queenbee

Look up the delta method. Essentially you have a function $g(\boldsymbol{\beta}) = w_1\beta_1 + w_2\beta_2$. The variance of $g(\hat{\boldsymbol{\beta}})$ is asymptotically: \begin{equation} Var(g(\hat{\boldsymbol{\beta}})) \approx [\nabla g(\boldsymbol{\beta})]^T Var(\hat{\boldsymbol{\beta}})[\nabla g(\boldsymbol{\beta})], \end{equation} where $Var(\hat{\boldsymbol{\beta}})$ is your covariance matrix for $\hat{\boldsymbol{\beta}}$ (given by the inverse of the Fisher information; see http://en.wikipedia.org/wiki/Fisher_information if you're unsure of this).
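
A small sketch of this in Python with made-up numbers: because $g$ is linear, its gradient is simply the weight vector, so the delta method reproduces $w^T Var(\hat{\boldsymbol{\beta}}) w$ exactly rather than only approximately.

```python
import numpy as np

def delta_method_var(grad, V):
    """Approximate Var(g(beta_hat)) from the gradient of g at beta_hat
    and the covariance matrix V of beta_hat."""
    grad = np.asarray(grad)
    return grad @ V @ grad

# Hypothetical covariance matrix of (beta1_hat, beta2_hat) and weights.
V = np.array([[0.04, 0.01],
              [0.01, 0.09]])
w = np.array([0.4, 0.6])          # gradient of g(beta) = w1*beta1 + w2*beta2 is just w

var_g = delta_method_var(w, V)    # equals w1^2*V11 + w2^2*V22 + 2*w1*w2*V12
se_g = np.sqrt(var_g)
```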

Sam Livingstone