5

I have two lines in an x-y plot:

  1. observed migration distance (y) against predicted migration distance (x): $$Y=0.95X+0.31,$$ where 0.95 is the slope and 0.31 the intercept (p-value = 0.001)

  2. 1:1 line (slope: 1 and intercept: 0)

The null hypothesis is

$$y=ax+b$$ $$a_1=a_2=1\quad\text{and}\quad b_1=b_2=0$$

Please kindly advise what test to use to test the above hypothesis.

chl
  • 50,972
  • 18
  • 205
  • 364
Elaine Kuo
  • 121
  • 2
  • 5
  • This looks like standard bookwork. If this is for some subject could you add the self-study tag please? – Glen_b Apr 29 '13 at 00:26

2 Answers

5

In this particular case, one of your lines has a known intercept and slope (intercept 0, slope 1), so you don't need to fit a larger interaction model; you can simply test jointly whether the other model is consistent with the population intercept and slope being 0 and 1 respectively.

This is a standard thing for a linear model.

It's slightly easier to regress y-x on x and, in that regression, test whether both the intercept and the slope are 0.

The RSS for the reduced model is the sum of (y-x)^2, and the RSS for the full model can be extracted from the anova of the linear regression, so you can perform an F test by hand. But if you're working in R you can do this kind of thing:

 nullm <- lm((y-x) ~ 0)  # null model: y - x = 0, no free parameters
 fullm <- lm((y-x) ~ x)  # full model: y - x = a + b*x
 anova(nullm, fullm)     # F test of H0: a = b = 0

The model "nullm" is the LS model $y = 0 + 1 x + \varepsilon$, i.e. $y - x = \varepsilon$, with no free parameters.

The model "fullm" is just the LS model with two parameters, but it has to have the same LHS as "nullm" to go into anova, so it looks unconventional. The function anova then calculates the F-test for the improvement of the full model over the null, which adds two parameters and reduces the residual sum of squares by the SS explained by the full model. This acts as a test of the null ($\text{H}_0: \alpha=\beta=0$) against the alternative that at least one of the two is not 0.
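For illustration, the same F test can be carried out by hand. The following Python sketch uses simulated data (the sample size, coefficients, and variable names here are assumptions for the example, not the poster's data): it compares the zero-parameter null model for $y - x$ against the two-parameter full model.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(0)
n = 50
x = rng.uniform(0, 10, n)
y = 0.95 * x + 0.31 + rng.normal(0, 0.5, n)  # simulated data for illustration

d = y - x  # work with y - x so the null model has no free parameters

# Null model: d = 0 (no parameters), so RSS0 is just the sum of squares of d
rss0 = np.sum(d**2)

# Full model: d = a + b*x + e (two parameters), fit by least squares
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, d, rcond=None)
rss1 = np.sum((d - X @ beta) ** 2)

# F statistic with (2, n-2) degrees of freedom, as in anova(nullm, fullm)
F = ((rss0 - rss1) / 2) / (rss1 / (n - 2))
p = f.sf(F, 2, n - 2)
print(F, p)
```

This is exactly the comparison `anova(nullm, fullm)` performs: the reduction in RSS from adding the two parameters, scaled by the full model's residual variance.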

However, in this case you can already see that the hypothesis will be rejected: the intercept alone is already very different from 0 (p = 0.001), so there's probably no need to go through the whole procedure; the result will be rejection at typical significance levels.

Glen_b
  • 257,508
  • 32
  • 553
  • 939
  • Thank you. However, I don't understand well the slope part. Please kindly explain why using F test of the fullm and nullm can prove that the regression line of $Y=0.95X+0.31$ is significantly different from $Y=X$ – Elaine Kuo Apr 29 '13 at 04:39
  • Further, please kindly advise how to regress y-x on x, since y-x is a matrix rather than a variable – Elaine Kuo Apr 29 '13 at 04:44
  • How is $x$ a matrix? Your post says it's a single variable (predicted migration distance). You seem to have left something very important out of your question. If $x$ is really a matrix, *how the heck are you writing* "y = 0.95 x + 0.31"? – Glen_b Apr 29 '13 at 05:17
  • y is whatever you defined y to be in the equation Y=0.95X+0.31, and x is whatever you defined x to be in the equation Y=0.95X+0.31. The definitions are *yours*, not mine. I can't say what to do in R code if your algebra is inconsistent with your code, as it seems it must be. I suggest you post a reproducible example of fitting a model if you want my R code to correspond to yours. I'm not a mind-reader. – Glen_b Apr 29 '13 at 05:36
  • wait I got it. In `nullm – Elaine Kuo Apr 29 '13 at 05:47
  • A flaw detected: even if the null hypothesis is rejected (at least one of the intercept and the slope coefficient differs), it may mean only that the intercept is not zero. Therefore, whether the slope coefficient differs still cannot be directly proved. – Elaine Kuo Apr 29 '13 at 06:05
  • 3
    That's not a flaw in the answer - what you were given is exactly what you asked for. If you wanted something different than a test of the null hypothesis you specified, you should not have asked for a test of that null hypothesis. [Besides that, you never *prove a null hypothesis*.] – Glen_b Apr 29 '13 at 08:38
  • @Glen_b - would you be so kind and also have a look at [the generalized version of this question, with two estimated models](http://stats.stackexchange.com/questions/151916/are-two-linear-regression-models-significantly-different)? – Robert Pollak May 12 '15 at 09:31
3

Just estimate both lines in a single model using an interaction effect, and test whether the interaction effect and the main effect of group are jointly 0.
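As an illustrative sketch (with simulated data and a hypothetical group indicator `g`, not the poster's data), the joint F test of the group main effect and the interaction can be done by comparing a common-line fit against the full interaction model:

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(1)
n = 40
x = np.tile(rng.uniform(0, 10, n // 2), 2)
g = np.repeat([0, 1], n // 2)  # group indicator: which line each point belongs to
y = np.where(g == 0, 1.0 * x, 0.95 * x + 0.31) + rng.normal(0, 0.5, n)

# Full model: y = b0 + b1*x + b2*g + b3*(x*g) + e
# b2 is the intercept difference (main effect), b3 the slope difference (interaction)
X_full = np.column_stack([np.ones(n), x, g, x * g])
# Reduced model under H0 (b2 = b3 = 0): one common line for both groups
X_red = np.column_stack([np.ones(n), x])

def rss(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

rss_full, rss_red = rss(X_full, y), rss(X_red, y)

# Joint F test of H0: b2 = b3 = 0 (same intercept and slope in both groups)
F = ((rss_red - rss_full) / 2) / (rss_full / (n - 4))
p = f.sf(F, 2, n - 4)
print(F, p)
```

Note this general two-group approach assumes both lines are estimated from data; in the poster's case the 1:1 line is known exactly, which is why the other answer's simpler direct test also applies.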

Maarten Buis
  • 19,189
  • 29
  • 59
  • err shouldn't it be [...]using *two* interaction *effects* and test whether the *two* interaction *effects* are *jointly* 'equal to 0' ? – user603 Apr 28 '13 at 14:59
  • 2
    no, the interaction effect measures the difference between slopes. There are only two slopes, so only one difference, so only one interaction effect. The difference between intercepts is captured by the main effect of the indicator variable of group. – Maarten Buis Apr 29 '13 at 07:33
  • thanks, I think I'd misread the question. That said, shouldn't we test for both interaction terms (the one on the intercept and the one on the slopes) to be jointly 0? – user603 Apr 29 '13 at 08:01
  • 1
    The difference between us is only terminology: to me there is no such thing as an interaction term on the intercept; that is just a main effect. Other than that, you are right: the null hypothesis should be that both the main effect and the interaction term are 0 – Maarten Buis Apr 29 '13 at 08:14