I have been thinking about how to model proportions for a problem with hierarchical structure.

In this problem, I have observations of users over multiple days, where each observation is a vector of the proportions of that day's time spent on different activities.

Each user belongs to a company, and each company belongs to a cluster.

The goal of the model is to predict the proportion of time a user will spend on each activity in their next observation.

An additional requirement is that we will sometimes need to make this prediction for new users. In that case we would like the prediction to be based on the other users in the same company. Similarly, we will need to make predictions for new users in companies where we have no data on other users, and would like those predictions to be based on the cluster level.
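To make the fallback requirement concrete, here is a naive baseline sketch, not the hierarchical model itself. It averages the observed proportion vectors at each level and returns the most specific average available. The function names (`fit_level_means`, `predict`) and the toy data are made up for illustration.

```python
import numpy as np
from collections import defaultdict

def fit_level_means(observations):
    """Average observed proportion vectors at user, company, and cluster level.

    observations: list of (cluster, company, user, proportions) tuples,
    where proportions is a sequence that sums to 1.
    """
    groups = defaultdict(list)
    for cluster, company, user, props in observations:
        p = np.asarray(props, dtype=float)
        groups[("user", cluster, company, user)].append(p)
        groups[("company", cluster, company)].append(p)
        groups[("cluster", cluster)].append(p)
    return {key: np.mean(vals, axis=0) for key, vals in groups.items()}

def predict(means, cluster, company, user):
    """Return the most specific available mean: user, then company, then cluster."""
    for key in (("user", cluster, company, user),
                ("company", cluster, company),
                ("cluster", cluster)):
        if key in means:
            return means[key]
    raise KeyError(f"no data for cluster {cluster!r}")

# Toy data (invented for illustration): 3 activities.
observations = [
    ("cl1", "coA", "u1", [0.6, 0.3, 0.1]),
    ("cl1", "coA", "u1", [0.4, 0.5, 0.1]),
    ("cl1", "coA", "u2", [0.2, 0.2, 0.6]),
    ("cl1", "coB", "u3", [0.1, 0.8, 0.1]),
]
means = fit_level_means(observations)
known_user = predict(means, "cl1", "coA", "u1")      # user-level mean
new_user = predict(means, "cl1", "coA", "u_new")     # falls back to company coA
new_company = predict(means, "cl1", "coC", "u_new")  # falls back to cluster cl1
```

This baseline switches abruptly between levels and weights every observation equally, so heavy users dominate their company's average. A hierarchical model would instead shrink user-level estimates toward the company and cluster levels in proportion to how much data each user has.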

I was looking for some sort of hierarchical multinomial-Dirichlet model that might fit this problem and didn't find one. I took a stab at it with the following, but I am not sure whether it makes sense to chain Dirichlet priors like this.

$$ \begin{eqnarray} \boldsymbol{y_{u,t}} &=& (y_1, y_2,...,y_k),\;\sum y_i = 1\\ \boldsymbol{y_{u, t}} &\sim& Multinomial(\boldsymbol\theta_{u,co,cl})\\ \boldsymbol{\theta_{u,co,cl}} &\sim& Dirichlet(\boldsymbol\alpha_{u,co,cl})\\ \boldsymbol{\alpha_{u,co,cl}} &\sim& Dirichlet(\boldsymbol\alpha_{co,cl})\\ \boldsymbol{\alpha_{co,cl}} &\sim& Dirichlet(\boldsymbol\alpha_{cl})\\ \boldsymbol{\alpha_{cl}} &\sim& Dirichlet(\boldsymbol\alpha)\\ \end{eqnarray} $$

I guess an alternative approach is a hierarchical GLM with cluster-, company-, and user-level intercepts.
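As a rough sketch of what the intercept-based alternative could look like, the snippet below adds cluster, company, and user intercepts on a linear scale and maps the sum to proportions with a softmax. The softmax link, the number of activities, and all numeric values are assumptions for illustration. Nothing here is fitted, and in a real model the intercepts would be estimated, typically with one activity fixed as the reference category.

```python
import numpy as np

def softmax(z):
    """Map a real-valued vector to proportions that sum to 1."""
    z = z - np.max(z)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
k = 3  # number of activities (assumed)

# Hypothetical intercepts on the linear (log-odds-like) scale, one vector per level.
global_intercept = np.array([0.5, 0.0, -0.5])
cluster_effect = {"cl1": rng.normal(0.0, 1.0, k)}
company_effect = {("cl1", "coA"): rng.normal(0.0, 0.5, k)}
user_effect = {("cl1", "coA", "u1"): rng.normal(0.0, 0.25, k)}

def expected_proportions(cluster, company, user):
    """Sum the intercepts that exist; an unseen level contributes zero."""
    eta = global_intercept.copy()
    eta += cluster_effect.get(cluster, 0.0)
    eta += company_effect.get((cluster, company), 0.0)
    eta += user_effect.get((cluster, company, user), 0.0)
    return softmax(eta)

p_known = expected_proportions("cl1", "coA", "u1")        # all three levels
p_new_user = expected_proportions("cl1", "coA", "u_new")  # cluster + company only
p_new_cluster = expected_proportions("cl9", "coZ", "u_new")  # global intercept only
```

Setting an unseen level's effect to zero corresponds to using the center of a zero-mean random effect as a point prediction. That is what gives the company-level fallback for new users and the cluster-level fallback for new companies.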

I'm quite confused as to how to approach this problem. Any suggestions?

Jeff
  • See [this page](https://stats.stackexchange.com/a/151800/28500) for the close relationship between hierarchical Bayesian and frequentist mixed models. See [this question](https://stats.stackexchange.com/q/477313/28500) and its linked reference for an example of how such a model might be structured. Ways to code random effects in R are illustrated on this [cheat sheet](https://stats.stackexchange.com/questions/13166/rs-lmer-cheat-sheet) and its links. I can't help with the multinomial-dirichlet model. – EdM Aug 05 '20 at 18:44
  • After looking at this question on [hyper priors for dirichlet distributions](https://stats.stackexchange.com/questions/44144/multinomial-dirichlet-model-with-hyperprior-distribution-on-the-concentration-pa), I have realized that the above model doesn't make sense. – Jeff Aug 06 '20 at 17:11

0 Answers