I was a little surprised by the high McFadden R^2 reported by the "mlogit" R package for this simple model:

f = mFormula(mode ~ log(cost) | 1 | 1)

"mlogit" gives the following output:

Coefficients :
               Estimate Std. Error t-value  Pr(>|t|)    
2:(intercept) -4.002328   0.103865 -38.534 < 2.2e-16 ***
3:(intercept) -2.028449   0.021986 -92.260 < 2.2e-16 ***
log(cost)     -1.781669   0.076317 -23.346 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Log-Likelihood: -9138.7
McFadden R^2:  0.62162 
Likelihood ratio test : chisq = 30026 (p.value = < 2.22e-16)

To compute the McFadden R^2 "by hand", I need to estimate the null model, i.e. mFormula(mode ~ 1), for which I obtain:

Coefficients : 
               Estimate Std.Error t-value  Pr(>|t|)    
(Intercept):2 -2.558537  0.026906 -95.091 < 2.2e-16 ***
(Intercept):3 -2.715562  0.028951 -93.797 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Log-Likelihood: -10212, df = 2
AIC:  20427 

If I am not mistaken, given the model's log-likelihood of -9138.7 and the null model's log-likelihood of -10212, the McFadden R^2 is 1 - (-9138.7 / -10212) ≈ 0.105, which is very different from the reported 0.62162.
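
As a sanity check on the arithmetic, here is a minimal sketch that uses only the numbers printed in the two outputs above. It assumes the likelihood ratio statistic is the usual 2 * (LL_model - LL_null), which lets me back out the null log-likelihood that the reported chisq corresponds to. The variable names are mine, not part of "mlogit".

```r
# Values copied from the two outputs above.
ll_model <- -9138.7   # log-likelihood of mode ~ log(cost)
ll_null  <- -10212    # log-likelihood of my intercept-only fit
lr_chisq <- 30026     # likelihood ratio statistic printed by mlogit

# McFadden R^2 by hand, using my own null model.
r2_by_hand <- 1 - ll_model / ll_null

# Assumption: lr_chisq = 2 * (ll_model - ll_null_used), so the null
# log-likelihood behind the reported test can be recovered from it.
ll_null_implied <- ll_model - lr_chisq / 2
r2_implied <- 1 - ll_model / ll_null_implied

print(round(c(by_hand = r2_by_hand, implied = r2_implied), 3))
print(ll_null_implied)
```

The implied null log-likelihood is about -24152, and plugging it into the formula gives roughly 0.622, in line with the reported 0.62162. So the reported R^2 and the LR test appear to share a null log-likelihood that is much lower than the -10212 I obtained.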

Note that my data are aggregated, so the discussion in "How to calculate pseudo R2 when using logistic regression on aggregated data files?" might be relevant here, since I fit a weighted logit.

Did I miss something, or is the R^2 provided by the "mlogit" package inaccurate when computed for weighted models?