
I have 2 non-nested models which I would like to compare. Both models are based on the same dataset but use different predictors.

Model1: predictors A + B
Model2: predictors B + C

I know there are multiple tests available to select the "best" model:

1) jtest (Davidson-MacKinnon J test)
2) coxtest (Cox test)
3) encomptest (Davidson & MacKinnon encompassing test)

All of these tests are documented in R for the comparison of non-nested models. However, which test is preferred?

If I understand the tests correctly, all of them indicate that Model1 is the better model.

> coxtest(Model1,Model2)
Cox test

Model 1: group ~ A + B
Model 2: group ~ C + B
                Estimate Std. Error  z value Pr(>|z|)    
fitted(M1) ~ M2  -3.0809     3.1646  -0.9735   0.3303    
fitted(M2) ~ M1 -31.1339     2.0889 -14.9043   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


> jtest(Model1,Model2)
J test

Model 1: group ~ A + B
Model 2: group ~ C + B
                Estimate Std. Error t value  Pr(>|t|)    
M1 + fitted(M2)  0.18681    0.21166  0.8826    0.3786    
M2 + fitted(M1)  0.93740    0.13155  7.1257 2.149e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


> encomptest(Model1,Model2, data=data)
Encompassing test
Model 1: group ~ A + B
Model 2: group ~ C + B
Model E: group ~ A + B + C
          Res.Df Df       F   Pr(>F)    
M1 vs. ME    188 -1  1.2402   0.2669    
M2 vs. ME    188 -1 24.3536 1.76e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

2 Answers


For your particular problem, the answer would be: "None of these tests is appropriate."

The help pages for these 3 functions indicate that they all are expecting standard linear regression models (from lm() in R), implicitly with a continuous outcome variable.

In your case, however, your outcome variable is a choice between 2 groups, not a continuous outcome variable. Thus you should be using logistic regression instead of linear regression. Finding the "best" linear regression, as you seem to be attempting, could be far from finding the best model for predicting group membership.
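As a minimal sketch of what a logistic fit could look like in R: the data frame `dat` below is simulated purely for illustration, and the variable names `A`, `B`, and `group` (coded 0/1) are assumptions borrowed from the question, not the asker's actual data.

```r
# Simulated stand-in data (hypothetical; replace with the real data frame)
set.seed(42)
n <- 200
dat <- data.frame(A = rnorm(n), B = rnorm(n))
dat$group <- rbinom(n, 1, plogis(0.8 * dat$A + 0.5 * dat$B))

# Logistic regression: binary outcome modelled with a binomial family
fit <- glm(group ~ A + B, family = binomial, data = dat)

# Coefficient table on the log-odds scale
summary(fit)$coefficients

# Predicted probabilities of membership in the group coded 1
p_hat <- predict(fit, type = "response")
range(p_hat)
```

Unlike fitted values from `lm()`, the predicted probabilities from such a model always lie between 0 and 1.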

There is considerable discussion on this site about the best ways to compare non-nested generalized linear models (like logistic regression). This page provides arguments for using the Akaike Information Criterion (AIC) to choose among non-nested models, while further providing a link to an informed difference of opinion.

The standard stats package in R has 2 functions to determine the AIC from models (AIC() and extractAIC()); this page shows a way to use them for comparing 2 models. Just be careful to stick with one or the other as the outputs from the 2 functions differ by additive constants.
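A comparison along these lines might be sketched as follows. The data are simulated for illustration only, and the names `dat`, `m1`, and `m2` are hypothetical; the formulas mirror the two models in the question.

```r
# Simulated stand-in data (hypothetical; replace with the real data frame)
set.seed(1)
n <- 200
dat <- data.frame(A = rnorm(n), B = rnorm(n), C = rnorm(n))
dat$group <- rbinom(n, 1, plogis(0.8 * dat$A + 0.5 * dat$B))

# Two non-nested logistic regressions fitted to the same observations
m1 <- glm(group ~ A + B, family = binomial, data = dat)
m2 <- glm(group ~ C + B, family = binomial, data = dat)

# AIC() accepts several fitted models and returns a data frame
# with one row per model and the columns df and AIC
aic_tab <- AIC(m1, m2)
aic_tab

# Lower AIC is preferred; this picks the row name of the smallest value
rownames(aic_tab)[which.min(aic_tab$AIC)]
```

Both models must be fitted to exactly the same observations for the AIC values to be comparable, so rows with missing values in any of the predictors should be handled before fitting.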

EdM
  • I do perform logistic regression, however I was struggling a little to find the correct statistical methods to verify the models. – Nick Vanhuizen Jan 24 '19 at 08:06

If your only goal is to choose the model with the best fit, then any one of these tests is valid, since they all work on the same principle. If you are asking which test is preferred for your type of data, it is best to refer to the literature in your field.

Also, going by the p-values alone, only the rows that test Model 2 are statistically significant in this case, which means Model 2 is rejected against the alternative while Model 1 is not.

AJoshi