Essentially after performing regression on three variables,

$$ y = a_0 + a_1 \cdot x_1 + a_2 \cdot x_2 + a_3 \cdot x_3 $$

I want to find the variance of $a_1+a_2$ to get a confidence interval (CI). Logically, I think I can do

$$\text{Var}(a_1+a_2)=\text{Var}(a_1)+\text{Var}(a_2)+2 \cdot \text{Cov}(a_1,a_2)$$

and calculate the covariance of two normals, because from the model results I'd know the mean and variance of $a_1$ and $a_2$, and they are asymptotically normally distributed.

  1. I'm stuck on how to get the covariance of two normal random variables. Any guidance?
  2. Is there simple code to calculate this in Python or R?
Igor F.
datalover
  • Have a look at this post: https://stats.stackexchange.com/questions/104704/are-estimates-of-regression-coefficients-uncorrelated – passerby51 Dec 30 '20 at 07:39
  • For more threads about this topic, see the hits for [this site search](https://stats.stackexchange.com/search?q=variance+vcov+coeff*). – whuber Dec 30 '20 at 14:31

1 Answer

You can use vcov(model) in R to get the estimated covariance matrix of the coefficient estimates.

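    # Simulate three regressors and a response (illustrative data only)
    a <- rnorm(100)
    b <- rnorm(100, 1, 1)
    c <- rnorm(100, 2, 2)
    y <- rnorm(100, 3, 1)
    # Fit the linear model (lm adds an intercept by default)
    m1 <- lm(y ~ a + b + c)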

Suppose the model is $y = \beta_0 + \beta_1 \cdot a + \beta_2 \cdot b + \beta_3 \cdot c+\epsilon$, where $a, b, c$ are the regressors and $\beta_0$ is the intercept that lm includes by default. The code above fits this model; calling vcov(m1) then returns the estimated variance-covariance matrix of the coefficient estimates.

    > vcov(m1)
                  (Intercept)             a             b             c
    (Intercept)  0.0236168925  0.0008928804 -0.0072752173 -0.0048195656
    a            0.0008928804  0.0089417637 -0.0007706158 -0.0005058700
    b           -0.0072752173 -0.0007706158  0.0084035744  0.0002730054
    c           -0.0048195656 -0.0005058700  0.0002730054  0.0022051924
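
For readers wondering where this matrix comes from, here is a sketch under the standard OLS assumptions (uncorrelated errors with constant variance $\sigma^2$). The symbols $X$ (design matrix including the intercept column), $\hat\beta$ (vector of coefficient estimates), $n$ (number of observations), $p$ (number of coefficients) and $w$ (a weight vector) are notation introduced here, not taken from the post.

$$\widehat{\text{Cov}}(\hat\beta) = \hat\sigma^2 \, (X^\top X)^{-1}, \qquad \hat\sigma^2 = \frac{\text{RSS}}{n-p}$$

$$\widehat{\text{Var}}(w^\top \hat\beta) = w^\top \, \widehat{\text{Cov}}(\hat\beta) \, w$$

The off-diagonal entry for two coefficients is their estimated covariance. It depends on the design matrix $X$, so it cannot be recovered from the means and variances of the two coefficients alone: two normal variables with given means and variances can have any covariance consistent with a correlation between $-1$ and $1$.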

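Then you can use the ordinary formula to get the CI: take the point estimate of the sum, plus or minus the critical value times the square root of its variance. The variances are on the diagonal of this matrix and the covariances are off the diagonal.

btw, note the factor of 2: $\text{Var}[X+Y] = \text{Var}[X] + \text{Var}[Y] + 2 \cdot \text{Cov}[X,Y]$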
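
A minimal sketch of that calculation in R, assuming simulated data like the answer's. The names fit, w, x1, x2, x3 and the seed are chosen here for illustration, and the 95% level and the use of the t distribution are assumptions rather than something the post specifies.

```r
# Illustrative simulated data, mirroring the answer's setup (not real data)
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n, 1, 1)
x3 <- rnorm(n, 2, 2)
y  <- rnorm(n, 3, 1)
fit <- lm(y ~ x1 + x2 + x3)

V <- vcov(fit)                      # estimated covariance matrix of the coefficients
w <- c(0, 1, 1, 0)                  # weights selecting the x1 and x2 coefficients (intercept comes first)

est     <- sum(w * coef(fit))       # point estimate of the sum of the two coefficients
var_sum <- drop(t(w) %*% V %*% w)   # Var(b1) + Var(b2) + 2 * Cov(b1, b2)
se      <- sqrt(var_sum)

# 95% CI (assumed level), using the t distribution with the residual degrees of freedom
ci <- est + c(-1, 1) * qt(0.975, df.residual(fit)) * se
print(ci)
```

For the matrix printed above, the same calculation for a and b gives 0.0089417637 + 0.0084035744 + 2 × (−0.0007706158) ≈ 0.0158, i.e. a standard error of about 0.126.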

Igor F.
Leafstar
  • These variables are estimates of coefficients, not data sets. Is there a formula for this, so I can prove it rather than doing it empirically? – datalover Dec 30 '20 at 04:42
  • Yes, I can do that, but like I said, I'd like a more theoretical answer for the covariance of two normal random variables, not empirical results. – datalover Dec 30 '20 at 09:52
  • @datalover The distinction you're making is not at all clear. Regression coefficients are, by their very nature, an empirical result. – Sycorax Dec 30 '20 at 14:24
  • What I meant is that I want a formula for cov(a1, a2) in terms of the means and variances of the estimated coefficients, rather than generating data and inferring it from a limited sample. Does this make sense? – datalover Dec 31 '20 at 06:10