
Assume that I have $p$ independent variables:

$$X_{1}, X_{2}, \ldots, X_{p}$$

I wish to test the hypothesis

$$H_{0}\!: b_{1}=b_{2}=0,$$

where $b_j$ denotes the coefficient on $X_j$ in the linear regression of the response on $X_1, \ldots, X_p$.

To carry out the partial $F$-test (a Wald-type test), I need to fit a full model with all $p$ variables and a reduced model with only $X_3, X_4, \ldots, X_p$, and compare them with an $F$-statistic.
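For reference, assuming $n$ observations and an intercept in both models, the statistic I have in mind is

$$F=\frac{\left(\mathrm{RSS}_{\text{reduced}}-\mathrm{RSS}_{\text{full}}\right)/2}{\mathrm{RSS}_{\text{full}}/(n-p-1)},$$

which under $H_0$ follows an $F_{2,\,n-p-1}$ distribution (the 2 in the numerator being the number of restrictions tested).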

My question is: why? Why can't I just fit a model with only these two variables and look at the overall $F$-test for that model? What is the difference between fitting a model with these two variables and using its overall $F$-test, and running the partial $F$-test with two models that involve the other variables, which are of no interest for this hypothesis?


1 Answer


The $F$-test for the model taken as a whole can itself be understood as a partial $F$-test (one that tests all of the variables simultaneously), so I can see what you mean.
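To see this, note that for a model with $k$ predictors the overall $F$-test is exactly the nested comparison against the intercept-only model, whose residual sum of squares is the total sum of squares:

$$F_{\text{overall}}=\frac{(\mathrm{TSS}-\mathrm{RSS})/k}{\mathrm{RSS}/(n-k-1)}.$$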

The distinction is that one test (the nested version) assesses the two variables in the context of your other covariates, $X_3, \ldots, X_p$, whereas the overall $F$-test of the single two-variable model assesses them without those covariates. Put another way, the former controls for the other variables, whereas the latter ignores them.¹ If the other variables are at all relevant, the former test should have greater statistical power. Moreover, if your data are observational, the two tests concern the variables on different margins (conditional versus marginal associations), so the true answers can genuinely differ, irrespective of whether either test yields a correct decision. The simulation sketch below illustrates this.

¹ It may help you to read my answer here: Is there a difference between 'controlling for' and 'ignoring' other variables in multiple regression?
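As a concrete illustration, here is a minimal simulation sketch in Python (hypothetical data using statsmodels; the data-generating numbers are arbitrary). Here $X_1$ and $X_2$ have no effect of their own but are correlated with $X_3$, which drives $y$: the partial test controlling for $X_3$ is typically non-significant, while the overall test of the two-variable model, which ignores $X_3$, is typically highly significant.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 500

# X1 and X2 have no effect of their own; X3 drives y and is
# correlated with both X1 and X2 (confounding the marginal test).
x3 = rng.normal(size=n)
x1 = 0.8 * x3 + rng.normal(size=n)
x2 = 0.8 * x3 + rng.normal(size=n)
y = 2.0 * x3 + rng.normal(size=n)

X_full = sm.add_constant(np.column_stack([x1, x2, x3]))        # X1, X2, X3
X_reduced = sm.add_constant(x3)                                # covariates only
X_marginal = sm.add_constant(np.column_stack([x1, x2]))        # X1, X2 alone

full = sm.OLS(y, X_full).fit()
reduced = sm.OLS(y, X_reduced).fit()
marginal = sm.OLS(y, X_marginal).fit()

# Partial (nested) F-test: X1, X2 *controlling for* X3.
f_partial, p_partial, df_diff = full.compare_f_test(reduced)

# Overall F-test of the two-variable model: X1, X2 *ignoring* X3.
print(f"partial F = {f_partial:.2f}, p = {p_partial:.3f}")          # typically non-significant
print(f"overall F = {marginal.fvalue:.2f}, p = {marginal.f_pvalue:.3g}")  # typically highly significant
```

The two tests disagree here precisely because they answer different questions: conditional on $X_3$, the variables $X_1$ and $X_2$ carry no information about $y$, but marginally they do, via their correlation with $X_3$.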
