A question on Multinomial Regression

Question

Assume an experiment with 6 outcomes, dubbed A, B, C, D and E. For outcome A, there are Na subject, for B there are Nb subject and so on.

Now assume we fit a multinomial regression using a Bayesian approach,

y ~ Mult($\theta$)

Where $\theta_i$ the probability of observing outcome $i$, and

$\theta_i = \frac{exp(\phi_i)} {\sum exp(\phi)}$

with $\phi_i$ being a linear combination of the $J$ covariates,

$\phi_i = \beta_0 + \sum_j^J \beta_j\cdot x_j$

And there are simple non-informative $\mathcal{N}(0,1)$ priors on the regression coefficients.

Also note that in this model I have not chosen a reference category.

The general interpretation of such a model could be, what is the relative risk (ratio?) of A compared to B? That would be given by,

RR(A, B) = $\frac{exp(\phi_A)}{exp(\phi_B)}$

If anything is unclear, wrong or imprecise in the model, please let me know! Even if it does not answer the actual questions.

Now, my questions are:

~~1. Is there an actual need to choose a reference category? Or is it purely to make the maximum-likelihood estimation easier? Does it really matter in a Bayesian setting?~~

When comparing two groups, does one get an estimate of the Relative Risk or Relative Risk Ratio?
What if I wanted to compute, say, the RR of belonging to outcome A or B compared to C? Or what about the RR of belonging to A or B, compared to C or D?

A previous answer on SO states that the ratio is a Relative Risk, but I have read some papers reporting it as a Relative Risk Ratio.

Thanks in advance!

edit 1: I have managed to answer one of the questions myself by re-reading the literature.

Yes! Otherwise the estimate is not unique, due to the constraint that all probabilities must sum to one. That is, $\sum_i \theta_i = 1$. See also Wikipedia

I am still not sure about the other two questions, though.

edit 2 Fitting the model with Stan, it is next to impossible to get it to converge. Inspecting the data more closely, it seems that there is a very large number of zeros. The exact model I am trying to fit is,

$\theta_i = \beta_0 + \beta_{gender} + \beta_{age} + \beta_{age^2}$

Gender is a discretised variable, and age is taken as the midpoint of a group in 5 year intervals. This means that there are 20 combinations in total, and a lot of these observations are zero. That is, for each combination, maybe 2-3 of each outcome has a zero value. I am thinking this is what is causing trouble, and I have been looking into a zero-inflated Poisson (ZIP) model instead. However, from what I have read so far, the ZIP model only handles discrete variables, meaning I would not be able to model the quadratic term, $age^2$.

Any pointers on how to account for zero inflation in multinomial models would be much appreciated!

A question on Multinomial Regression

0 Answers0