I have used the mlogit package and I am trying to summarize the results from my model. I also have a question about the reference level, which I will get to in a moment.
redata.full <- mlogit(no.C~ 1| WR+age+age2+BP+noC.1yr, data=redata, reflevel="0", na.action=na.fail)
no.C = number of offspring
WR = risk
age + age2 = a non-linear age effect (production decreases as an individual ages)
BP = browsing pressure
noC.1yr = number of offspring produced the year before
I recognize that my data are ordinal in nature, but I'm following the methods of other people who have used the reference-based (multinomial) approach rather than ordinal logistic regression. However, I am still shaky on the justification, other than citing the other person and saying "he did it too!" If anyone has a suggestion, I would appreciate it.
My results for this model are:
Call:
mlogit(formula = no.C ~ 1 | WR + age + age2 + BP + noC.1yr, data = redata,
na.action = na.fail, reflevel = "0", method = "nr", print.level = 0)
Frequencies of alternatives:
0 1 2
0.233766 0.675325 0.090909
nr method
5 iterations, 0h:0m:0s
g'(-H)^-1g = 2.16E-07
gradient close to zero
Coefficients :
Estimate Std. Error t-value Pr(>|t|)
1:(intercept) -0.281226 1.225763 -0.2294 0.81854
2:(intercept) -0.605312 1.997179 -0.3031 0.76183
1:WR 0.847273 0.518854 1.6330 0.10248
2:WR 1.347976 0.689916 1.9538 0.05072 .
1:age 0.314075 0.275486 1.1401 0.25425
2:age -0.422368 0.395240 -1.0686 0.28523
1:age2 -0.018998 0.014446 -1.3151 0.18847
2:age2 0.022572 0.018949 1.1912 0.23359
1:BP -0.143720 0.173585 -0.8280 0.40770
2:BP -0.074553 0.331108 -0.2252 0.82185
1:noC.1yr 0.574304 0.377821 1.5200 0.12850
2:noC.1yr 1.251673 0.626033 1.9994 0.04557 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log-Likelihood: -116.6
McFadden R^2: 0.079844
Likelihood ratio test : chisq = 20.236 (p.value = 0.0271)
exp(cbind(OddsRatio = coef(redata.full), ci))
OddsRatio 2.5 % 97.5 %
1:(intercept) 0.7548580 0.06831155 8.341351
2:(intercept) 0.5459038 0.01089217 27.360107
1:WR 2.3332750 0.84394900 6.450831
2:WR 3.8496270 0.99577472 14.882511
1:age 1.3689929 0.79782462 2.349065
2:age 0.6554925 0.30209181 1.422317
1:age2 0.9811815 0.95379086 1.009359
2:age2 1.0228284 0.98553735 1.061530
1:BP 0.8661299 0.61634947 1.217136
2:BP 0.9281585 0.48504538 1.776078
1:noC.1yr 1.7758933 0.84686698 3.724076
2:noC.1yr 3.4961862 1.02497823 11.925441
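The code that builds `ci` is not shown above. Assuming it holds Wald limits on the log-odds scale (estimate plus or minus the normal quantile times the standard error), the printed intervals can be reproduced from the coefficient table. For the `2:WR` row:

$$
\begin{aligned}
\widehat{\mathrm{OR}} &= \exp(\hat\beta) = \exp(1.347976) \approx 3.850, \\
95\%\ \mathrm{CI} &= \exp\left(\hat\beta \pm z_{0.975}\,\mathrm{SE}\right) = \exp\left(1.347976 \pm 1.960 \times 0.689916\right) \approx (0.996,\ 14.88).
\end{aligned}
$$

This agrees with the `2:WR` row of the table, and shows that the interval just includes 1.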
I would like confirmation of my interpretations: The model fits better than the null model, according to the likelihood ratio test (chisq = 20.236, p = 0.0271).
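For reference, the two fit summaries in the output are linked through the log-likelihood of the null model. That value is not printed, so the figure for $\ln L_0$ below is back-calculated from the reported statistic, assuming an intercept-only null, and should be treated as approximate:

$$
\begin{aligned}
\chi^2_{LR} &= 2\left(\ln L - \ln L_0\right) = 20.236 \;\Rightarrow\; \ln L_0 \approx -116.6 - 10.118 = -126.72, \\
R^2_{\mathrm{McF}} &= 1 - \frac{\ln L}{\ln L_0} \approx 1 - \frac{-116.6}{-126.72} \approx 0.080.
\end{aligned}
$$

Under that assumption the test has 10 degrees of freedom (12 coefficients minus the 2 intercepts kept in the null model).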
Question: How do I test how well the model actually fits (i.e., goodness of fit)? A Hosmer-Lemeshow test? I've read warnings that McFadden's pseudo R^2 is not really applicable to multinomial regressions. I found a Hosmer-Lemeshow test in the ResourceSelection package, and it says my model is NOT doing well at all. Now what?
Interpretation: WR and noC.1yr are the only variables that come out as marginally significant, and only for the contrast between the reference level of 0 and production of 2 calves. For these variables, the contrast between 0 and 1 is not significant.
Question: I've been trying to find in the vignette what the t-value is. Is it just a t-test? How would I report an estimate as significant? For example: "The estimated odds ratio for WR, for 2 offspring being produced versus 0, was 3.85 (95% CI = 1.00-14.88), which was significant (t = 1.95, P = 0.05)"
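As a check on the printed output, the "t-value" column appears to be the estimate divided by its standard error (a Wald statistic). The calculation below uses the `2:WR` row. That the p-value comes from a standard normal distribution is an inference from the numbers matching, not something the output states:

$$
z = \frac{\hat\beta}{\mathrm{SE}} = \frac{1.347976}{0.689916} \approx 1.954, \qquad p = 2\left(1 - \Phi(|z|)\right) \approx 0.051.
$$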
Returning to my question about the reference level: when I run this exact same model with my other options (1 or 2 offspring) as the reference, I get completely different results for which variables are significant. If I use 2 as the reference level, then age, WR, and noC.1yr are significant. If I use 1, then only age is significant. So which one should I use? I have read that you should pick the one most relevant to your hypothesis, but in this case I could motivate any of the 3 levels.
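One point that may help frame this: in a multinomial logit, changing the reference level re-expresses the same fitted model rather than fitting a different one. The log-likelihood and fitted probabilities are unchanged, and each coefficient relative to a new reference is a difference of the coefficients already estimated. A sketch using the printed estimates (reference 0) for the contrast of 2 versus 1:

$$
\begin{aligned}
\hat\beta_{2 \mid 1} &= \hat\beta_{2 \mid 0} - \hat\beta_{1 \mid 0}, \\
\text{WR:}\quad & 1.347976 - 0.847273 = 0.500703, \\
\text{age:}\quad & -0.422368 - 0.314075 = -0.736443.
\end{aligned}
$$

The standard errors of these differences depend on the covariance between the two estimates, which the summary does not print, so the significance of each contrast cannot be read off the table above.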