Following the incredible demonstration in Statalist by Jeff Pitblado on how to calculate - using the Delta Method - the Standard Errors for Average Marginal Effects of a Logit Model.
Q: What would be the formula to calculate the SEs for the AMEs of a multinomial logit using the Delta Method?
In particular, for the marginal effect (or better, the average partial effects) of a $k$ predictor, taking the derivative w.r.t. to $x_{ik}$ of
$$ Pr\{Y_i=j\} = P_{ij}=\frac{e^{x_i'\beta_j}}{\sum_r e^{x_i'\beta_r}} $$
will give us
$$ \frac{\partial P_{ij}}{\partial x_{ik}} = P_{ij} \left(\beta_{jk}-\sum_r P_{ir}\beta_{rk}\right) $$
which is straightforward how to compute.
If I get it right, the formula for the SEs of this AME will be
$$ SE\left(\frac{\partial P_{ij}}{\partial x_{ik}}\right) = J \hat{V} J' $$
where $\hat{V}$ is the Variance-Covariance matrix of the multinomial logit parameter estimates (and usually returned after estimation in most software like Stata or R) and $J$ is the Jacobian matrix, which is the gap in my calculations.
For the logit model, we can calculate the Jacobian in the following way in Stata:
webuse margex, clear
logit outcome i.treatment distance, nofvlabel
predict p, pr
gen dpdxb = p*(1-p)
gen dpdx = dpdxb*_b[distance]
gen d2pdxb2 = p*(1-p)*(1-p) - p*p*(1-p)
matrix vecaccum Jac = d2pdxb2 0b.treatment 1.treatment distance
matrix Jac = Jac*_b[distance]/e(N)
sum dpdxb
matrix Jac[1,3] = Jac[1,3] + r(mean)
mat list Jac
or in `R' like this:
library(MASS)
data(birthwt)
m <- glm(low~ smoke + age, binomial(link="logit"), data = birthwt)
p <- m$fitted.values
dpdxb <- p*(1-p)
dpdx <- dpdxb*coef(m)[3]
d2pdxb2 <- p*(1-p)*(1-p) - p*p*(1-p)
Jac <- t(d2pdxb2)%*%cbind((birthwt$smoke==0)*0, (birthwt$smoke==1)*1, birthwt$age, 1)*coef(m)[3]/length(p)
Jac[1,3] <- Jac[1,3]+mean(dpdxb)
Jac
but what would be the way to calculate the Jacobian for the multinomial logit motel?