I have circular data: multiple human participants were each shown a color from a color wheel, asked to remember it over a "retention interval", then asked to report it back by clicking on a color wheel. Each participant was tested many times in each of two conditions (short retention interval vs. long retention interval). I have written and evaluated code that, for a given participant and condition, models their data (represented as degrees of error between stimulus and report) as a mixture of a von Mises distribution and a uniform distribution, estimating the mixing proportion of the von Mises as well as its concentration. (The code also optionally permits estimating the location of the von Mises, if one is unwilling to assume unbiased error, but for now I am willing to make that assumption for this data.) However, the code was really hacked together from help that others provided me (see the author's note in the previously linked paper), and I don't understand the EM algorithm well enough to extend it to achieve either of the following:
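For concreteness, here is a stripped-down base-R sketch of the kind of single-participant, single-condition EM I'm working from (my actual code differs in details; `A1inv` here is the standard Best & Fisher approximation to the inverse of I1/I0):

```r
# EM for a von Mises + uniform mixture on the circle, with the von Mises
# location held fixed (mu = pi, i.e. assuming unbiased error after recoding).
# x: vector of response errors in radians on [0, 2*pi).
em_vm_unif <- function(x, mu = pi, max_iter = 200, tol = 1e-8) {
  dvm <- function(theta, mu, k) exp(k * cos(theta - mu)) / (2 * pi * besselI(k, 0))
  A1inv <- function(R) {  # Best & Fisher approximation for the ML concentration
    if (R < 0.53) 2 * R + R^3 + 5 * R^5 / 6
    else if (R < 0.85) -0.4 + 1.39 * R + 0.43 / (1 - R)
    else 1 / (R^3 - 4 * R^2 + 3 * R)
  }
  rho <- 0.5; k <- 1; ll_old <- -Inf  # crude starting values
  for (iter in 1:max_iter) {
    # E-step: posterior probability that each trial came from the von Mises
    num <- rho * dvm(x, mu, k)
    den <- num + (1 - rho) / (2 * pi)
    r <- num / den
    # M-step: update mixing proportion and concentration (mu held fixed)
    rho <- mean(r)
    R <- sum(r * cos(x - mu)) / sum(r)  # weighted resultant length toward mu
    k <- A1inv(max(R, 0))
    ll <- sum(log(den))
    if (ll - ll_old < tol) break
    ll_old <- ll
  }
  list(rho = rho, k = k, logLik = ll)
}
```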
1) Permit estimation of data from multiple conditions from the same participant while enforcing a shared value of one or more parameters. This would allow me to evaluate, within each participant, the degree to which various hypotheses about the differences between conditions (no difference, difference in all parameters, difference in just the mixing proportion, difference in just the von Mises concentration) are supported by the data. I'd likely want to compare these hypotheses using AIC-corrected (or, if AIC is inappropriate here, cross-validated) likelihoods.
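To make the comparison step concrete, the bookkeeping I have in mind is just the following (parameter counts assume the von Mises location is fixed; the constrained fitting functions that would supply the log-likelihoods are exactly what I'm missing):

```r
# AIC for a fitted model; logLik is the maximized log-likelihood
aic <- function(logLik, n_par) 2 * n_par - 2 * logLik

# parameter counts for the four hypotheses with 2 conditions
# (von Mises location fixed, so only rho and k are estimated)
n_par <- c(
  no_difference = 2,  # one shared rho, one shared k
  rho_differs   = 3,  # rho per condition, shared k
  k_differs     = 3,  # k per condition, shared rho
  all_differ    = 4   # rho and k per condition
)
```

Given the maximized log-likelihood from each constrained fit, the hypothesis with the smallest AIC would be the best-supported within that participant.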
2) Implement (1) plus a hierarchical strategy that permits simultaneous estimation of parameters across the entire set of observed participants. I don't fully understand the concept of shrinkage yet, but I gather that it might usefully apply here, making (2) a more powerful approach than simply applying (1) to each participant individually and aggregating the results.
Can anyone provide assistance with (1) or (1)&(2)?
--
Following a request to clarify the model, here is my attempt. I'm more familiar with representing models algorithmically than symbolically, so I'll use R code. The data from the ith participant in the jth condition is assumed to be generated by a process like:
library(CircStats)
n_vm = round( n_ij * rho_ij ) #integer count of von-Mises trials
data_ij = c(
rvm( n_vm , pi , k_ij ) #memory-based responses, centered on the target
, runif( n_ij - n_vm , 0 , 2*pi ) #guesses, uniform on the circle
)
data_ij = data_ij[ sample(n_ij) ] #shuffle to make mixture identity latent
where n_ij is the number of observations made (determined by the experiment, so not a to-be-estimated parameter), rho_ij is the proportion of observations sampled from the von Mises, and k_ij is the concentration of the von Mises.
With 2 or more conditions measured within each participant,
rho_ij = rho_i + conditionEffectOnRho_i * condition_j
k_ij = k_i + conditionEffectOnK_i * condition_j
where conditionEffectOnRho and conditionEffectOnK are vectors supplying the effect on rho and k, respectively, in each condition.
I don't have strong a priori expectations for the distribution of rho_i and k_i, nor am I sure that it is appropriate to restrict the variance of condition effects across participants to zero (indeed, the covariance of the effects might be of theoretical interest), hence the _i subscript associated with conditionEffectOnRho and conditionEffectOnK.
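Putting the pieces together, the full generative process I'm assuming could be simulated like this (the particular population values are placeholders, chosen only to keep rho in (0,1) and k positive; I make no claim that they're realistic):

```r
library(CircStats)
set.seed(1)
n_participants = 5
n_ij = 200                         #trials per participant per condition (fixed by design)
condition = c(0, 1)                #0/1 coding: short vs. long retention interval
sim = list()
for (i in 1:n_participants) {
  rho_i = runif(1, 0.7, 0.9)       #baseline mixing proportion for participant i
  k_i = runif(1, 5, 15)            #baseline concentration for participant i
  effectRho_i = runif(1, -0.2, 0)  #per-participant condition effects
  effectK_i = runif(1, -3, 0)
  for (j in seq_along(condition)) {
    rho_ij = rho_i + effectRho_i * condition[j]
    k_ij = k_i + effectK_i * condition[j]
    n_vm = round(n_ij * rho_ij)
    x = c(rvm(n_vm, pi, k_ij), runif(n_ij - n_vm, 0, 2*pi))
    sim[[paste0("p", i, "_c", j)]] = x[sample(n_ij)]
  }
}
```

The hierarchical question in (2) then amounts to estimating the population distributions that the per-participant draws above come from, rather than fitting each data_ij in isolation.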