I fitted a multinomial probit model with one dependent categorical variable Y (levels 1, 2, 3) and two explanatory variables X1 and X2.
I used the mlogit package in R like this:
library(mlogit)
df <- read.csv("https://gitlab.com/cristiandavidarteaga/rtraining/raw/b40daf27a52bf01ce58d0ea32c5e4854f5b23836/mlogit_2var/data.csv", header = TRUE)
d <- mlogit.data(df, shape = "wide", choice = "y")
myprobit <- mlogit(y ~ 0 | x1 + x2, data = d, probit = TRUE)
summary(myprobit)
This gives me the following output:
Frequencies of alternatives:
1 2 3
0.509 0.128 0.363
bfgs method
21 iterations, 0h:0m:34s
g'(-H)^-1g = 9.56E-08
gradient close to zero
Coefficients :
Estimate Std. Error t-value Pr(>|t|)
2:(intercept) -10.7685665 0.9330425 -11.5413 <2e-16 ***
3:(intercept) -11.4357413 1.0913296 -10.4787 <2e-16 ***
2:x1 0.1097622 0.0093004 11.8019 <2e-16 ***
3:x1 0.1094478 0.0094566 11.5737 <2e-16 ***
2:x2 0.1010603 0.0100107 10.0952 <2e-16 ***
3:x2 0.1150660 0.0116610 9.8676 <2e-16 ***
2.3 0.9781048 0.0471720 20.7348 <2e-16 ***
3.3 0.0676135 0.0521005 1.2978 0.1944
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log-Likelihood: -199.84
McFadden R^2: 0.79498
Likelihood ratio test : chisq = 1549.8 (p.value = < 2.22e-16)
I can't find a clear explanation of how to use these coefficients to predict outcomes for new data.
For example, given these coefficients, how can I manually predict (by hand, not using R) the outcome (1, 2, or 3) for x1 = 26 and x2 = 55?
Do I need to use the covariance matrix to do this?
I know R or Stata can do this, but for my research it is important that I understand the calculation myself, since I need to write a custom implementation of the probit model.
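To make the question concrete, here is a rough sketch in R of what I think the manual calculation would look like. I am assuming that alternative 1 is the reference (its utility normalized to 0), that the estimated coefficients define linear systematic utilities, and that the 2.3 and 3.3 terms are the free elements of the Cholesky factor of the covariance of the error differences, with the probabilities then obtained by simulation; these assumptions may well be wrong, which is exactly what I would like confirmed or corrected.
## Systematic utilities for x1 = 26, x2 = 55, relative to alternative 1 (normalized to 0)
V2 <- -10.7685665 + 0.1097622 * 26 + 0.1010603 * 55
V3 <- -11.4357413 + 0.1094478 * 26 + 0.1150660 * 55

## Assumption: 2.3 and 3.3 are the free elements of the Cholesky factor L of the
## covariance of the error differences (eps2 - eps1, eps3 - eps1), with L[1, 1] fixed at 1
L <- matrix(c(1,         0,
              0.9781048, 0.0676135), nrow = 2, byrow = TRUE)

## Simulate the error differences and count which alternative has the highest utility
set.seed(1)
nsim <- 100000
eps  <- matrix(rnorm(2 * nsim), ncol = 2) %*% t(L)   # rows ~ N(0, L %*% t(L))
U    <- cbind(0, V2 + eps[, 1], V3 + eps[, 2])       # utilities in differenced form
table(factor(max.col(U), levels = 1:3)) / nsim       # simulated choice probabilities
If this is roughly right, the prediction would simply be the alternative with the highest simulated probability; if not, I would appreciate knowing which step is wrong.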