I fail to understand how to define the prior distribution for a multinomial regression.
- In what units should the prior be specified, given that the response doesn't really have units (just categories)? Should it be a probability, or perhaps log-odds?
- Do the dimensions of the prior depend on the number of levels in the categorical response variable?
- What is the meaning of the variance-covariance matrix for a categorical response variable?
Here is a subset of my data (the variable names and outcomes have been renamed). A working MCMCglmm example on these data would be awesome!
df=read.table(text="y x1 x2
1 yellow 106.00 6.190476
2 yellow 120.00 5.254762
3 yellow 57.00 6.202381
4 yellow 115.33 5.652381
5 yellow 175.00 6.154762
6 yellow 74.00 8.285714
7 yellow 104.67 3.766667
8 yellow 95.50 7.976190
9 yellow 108.00 8.792857
10 yellow 121.33 7.935714
11 yellow 66.67 6.969048
12 yellow 30.00 7.333333
13 yellow 45.00 6.811905
14 yellow 70.00 7.550000
15 yellow 48.00 7.316667
16 yellow 211.00 4.650000
17 yellow 69.00 8.369048
18 yellow 110.50 6.621429
19 yellow 203.00 6.095238
20 yellow 75.33 8.211905
21 yellow 207.33 6.211905
22 yellow 54.00 7.961905
23 yellow 74.00 7.019048
24 yellow 113.00 4.221429
25 yellow 23.00 7.942857
26 yellow 80.00 7.511905
27 yellow 257.00 7.878571
28 yellow 211.00 7.754762
29 yellow 99.00 8.016667
30 yellow 120.00 7.728571
31 yellow 222.50 5.840476
32 yellow 44.00 4.209524
33 yellow 63.00 6.614286
34 yellow 57.00 8.669048
35 yellow 223.33 7.033333
36 yellow 128.00 6.754762
37 yellow 128.00 5.561905
38 yellow 121.00 7.471429
39 yellow 70.00 7.445238
40 yellow 85.67 5.261905
41 yellow 113.33 8.509524
42 yellow 82.00 6.697619
43 red 207.33 4.180952
44 red 167.67 5.302381
45 red 366.50 7.102381
46 red 230.00 4.942857
47 red 201.00 5.754762
48 red 226.00 9.076190
49 red 193.33 7.066667
50 red 170.00 7.314286
51 red 361.33 7.502381
52 blue 154.00 4.342857
53 red 199.33 6.361905
54 blue 97.00 7.750000
55 blue 82.33 6.209524
56 blue 55.67 5.321429
57 blue 47.50 5.911905
58 blue 15.67 7.185714
59 blue 96.50 6.452381
60 blue 202.33 8.576190
61 blue 157.00 6.669048
62 blue 117.33 5.828571
63 blue 105.67 8.485714
64 blue 108.67 5.714286
65 blue 296.67 5.852381
66 blue 206.50 6.826190
67 blue 88.50 6.178571
68 blue 163.00 7.833333
69 blue 151.50 8.983333")
And here is an MCMCglmm call for which the default priors lead to an error message:
library(MCMCglmm)
set.seed(12)
m = MCMCglmm(y ~ -1 + trait:(x1) + trait:(x2) , rcov = ~ us(trait):units,
data = df, family = "categorical", verbose = TRUE, burnin = 8000,
nitt = 40000, thin = 50)
ill-conditioned G/R structure (CN = 24007848728601288.000000):
use proper priors if you haven't or rescale data if you have