I am using the package nnet to fit multinomial regression models using multinom().
When I fit the model using an independent variable whose mean is much greater than its variance, the confidence intervals get narrower and the p-values get smaller. This is unexpected behavior to me.
As a simple univariate example, I modified the example given on the multinom() help page. I will illustrate the problem using the confint.multinom() function from the same package, since p-values are not natively supported by nnet.
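Since nnet does not report p-values itself, here is a minimal sketch of how Wald p-values could be derived from the summary() of a multinom fit. It assumes the usual normal approximation (estimate divided by standard error) and builds its own copy of the data (the name bwt_local is mine, not from the help page), so it is an illustration rather than anything the package provides.

```r
library(MASS)   # provides the birthwt data
library(nnet)   # provides multinom()

# Local copy with the response as a factor (bwt_local is a hypothetical name).
bwt_local <- transform(birthwt, low = factor(low))

# trace = FALSE suppresses the iteration log printed during fitting.
fit <- multinom(low ~ age, data = bwt_local, trace = FALSE)

# Wald z statistics and two-sided p-values under a normal approximation.
s <- summary(fit)
z <- s$coefficients / s$standard.errors
p <- 2 * pnorm(abs(z), lower.tail = FALSE)
p
```

These p-values rely on the same standard errors as confint(), so they inherit whatever problem affects the intervals.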
library(MASS)
library(nnet)
example(birthwt)
######## Base Line glm logistic Regression #############
glm_model <- glm(formula = low ~ age, family = binomial, data = bwt)
confint(glm_model)
                 2.5 %      97.5 %
(Intercept) -1.0336270 1.847399861
age         -0.1150799 0.008986436
######## Base Line multinomial logistic Regression ###########
multinom_model <- multinom(formula = low ~ age, data = bwt)
confint(multinom_model)
# weights: 3 (2 variable)
initial value 131.004817
final value 115.955980
converged
                 2.5 %    97.5 %
(Intercept) -1.0494438 1.8205227
age         -0.1129621 0.0105759
######## Transformed glm logistic Regression ###########
glm_model_mod <- glm(formula = low ~ I(age+200), family = binomial, data = bwt)
confint(glm_model_mod)
                  2.5 %       97.5 %
(Intercept)  -2.8001518 24.832982980
I(age + 200) -0.1150799  0.008986436
######## Transformed multinomial logistic Regression ###########
multinom_model_mod <- multinom(low ~ I(age+200), data=bwt)
confint(multinom_model_mod)
# weights: 3 (2 variable)
initial value 131.004817
iter 10 value 115.984705
final value 115.955988
converged
                   2.5 %      97.5 %
(Intercept)  10.64489882 10.64491130
I(age + 200) -0.05267766 -0.04989515
At first, the confidence intervals from multinom() are a bit more conservative than those from glm(). But after adding 200 to the independent variable, the multinom() intervals become extremely narrow, and even the coefficient for age changes (-0.05119 -> -0.05128). The age coefficient from the ordinary logistic regression is, as expected, unaffected, since shifting the predictor by a constant should only change the intercept.
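One way to probe this is to compare the standard errors for the raw, shifted, and centered predictor side by side. This is a rough sketch that assumes the narrowing is a numerical artifact of the fit rather than a modelling issue. The variable names (bwt_local, age_shift, age_center) and the tighter reltol/maxit settings are my own choices for illustration, and results may differ between nnet versions.

```r
library(MASS)   # provides the birthwt data
library(nnet)   # provides multinom()

# Local copy with shifted and centered versions of age (hypothetical names).
bwt_local <- transform(birthwt,
                       low        = factor(low),
                       age_shift  = age + 200,
                       age_center = age - mean(age))

# Helper: standard errors of a multinom fit, without names so rows can be stacked.
se <- function(fit) unname(summary(fit)$standard.errors)

fit_raw    <- multinom(low ~ age,        data = bwt_local, trace = FALSE)
fit_shift  <- multinom(low ~ age_shift,  data = bwt_local, trace = FALSE)
fit_center <- multinom(low ~ age_center, data = bwt_local, trace = FALSE)

# Same shifted model with stricter convergence settings passed on to nnet().
fit_tight  <- multinom(low ~ age_shift, data = bwt_local, trace = FALSE,
                       reltol = 1e-12, maxit = 1000)

# Reference: glm standard errors for the shifted predictor.
glm_shift <- glm(low ~ age_shift, family = binomial, data = bwt_local)

se_table <- rbind(raw         = se(fit_raw),
                  shifted     = se(fit_shift),
                  centered    = se(fit_center),
                  shift_tight = se(fit_tight),
                  glm_shifted = unname(sqrt(diag(vcov(glm_shift)))))
colnames(se_table) <- c("intercept", "slope")
se_table
```

If the slope standard error in the shifted row differs clearly from the raw, centered, and glm rows, that would point to the optimizer or its Hessian rather than to a violated model assumption.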
The same behavior also occurs in genuinely multinomial models with a three-level dependent variable and in models with multiple independent variables.
Am I violating model assumptions? Or am I misunderstanding how multinomial regression works?