I'm linearly regressing some response y onto some predictor x, and I want to know the value of x at which y = 0.

I can think of two ways to do this. Let me illustrate with some sample data:

x <- 1:10
y <- 20 - 2 * x + rnorm(10)

I can either linearly regress y onto x and solve the equation explicitly:

- coef(lm(y ~ x))[1] / coef(lm(y ~ x))[2]
(Intercept) 
   10.29915 

Or I can try to be clever and observe that my problem is equivalent to regressing x onto y and predicting x for y = 0:

coef(lm(x ~ y))[1]
(Intercept) 
   10.19658

However, this doesn't give quite the same result. So which approach is correct?

lindelof

1 Answer

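The first approach is correct: regressing $y$ on $x$ is not equivalent to regressing $x$ on $y$, because the error term sits on a different variable in each model. The fit of $y$ on $x$ minimizes squared residuals in $y$, whereas the fit of $x$ on $y$ minimizes squared residuals in $x$, so the two fitted lines coincide only when the points are perfectly collinear. See here for a thorough explanation.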
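A minimal sketch of the difference, reusing the simulated data from the question. The seed and the variable names (`fit_yx`, `fit_xy`, `root_yx`, `root_xy`) are my own choices for illustration, not part of the original post. It compares the two estimates of the zero crossing and checks that the two slopes are not reciprocals of each other: their product equals the squared correlation, which is below 1 whenever there is noise.

```r
set.seed(1)                      # arbitrary seed, assumed here only for reproducibility
x <- 1:10
y <- 20 - 2 * x + rnorm(10)

fit_yx <- lm(y ~ x)              # minimizes squared residuals in y
fit_xy <- lm(x ~ y)              # minimizes squared residuals in x

# Zero crossing of the y ~ x line: solve intercept + slope * x = 0
root_yx <- unname(-coef(fit_yx)[1] / coef(fit_yx)[2])

# Predicted x at y = 0 from the x ~ y fit: simply its intercept
root_xy <- unname(coef(fit_xy)[1])

# Product of the two slopes versus the squared correlation
slope_product <- unname(coef(fit_yx)[2] * coef(fit_xy)[2])
r_squared <- cor(x, y)^2

print(c(root_yx = root_yx, root_xy = root_xy))
print(c(slope_product = slope_product, r_squared = r_squared))
```

If the two regressions described the same line, the slope product would be exactly 1. The gap between the two roots shrinks as the correlation approaches 1 in absolute value.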

Scortchi - Reinstate Monica