
I have a bunch of 2D distributions/histograms that plot variable X against variable Y, across several conditions. The histograms look something like:

[image: example 2D histogram of X against Y]

For each condition (each data summary as displayed above), I would like to calculate some measure of "statistical significance" using permutation testing. But I'm not sure how.

Someone suggested doing something like: compute a relevant DV for each distribution (which one?!), then randomly shuffle the data and recompute the DV. Do that 10,000 times to obtain a distribution of the DV under shuffling, then compute the p-value for the original DV on the basis of that distribution.

Can anyone suggest/clarify how this can be done?

ran8
z8080
    You might want to take a look at [this thesis](http://edgwiki.wdfiles.com/local--files/output/mcmillan_phd_2008.pdf), in particular chapters 4 and 7. I don't think it uses permutation tests, but what you're interested in seems similar to what some people working on speech use to analyse visual articulatory data. – Ian_Fin Jul 05 '16 at 14:47

1 Answer


Consider the more general case of comparing two multidimensional distributions. That question has already been asked. There I suggest the entropy vignette of R's np package, which works on categorical variables. You should be able to generate a solid distribution just by sampling a lot. For example:

sample(duration, size = 2000, replace = TRUE)
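Note that the call above resamples with replacement. The shuffle-and-recompute procedure described in the question instead permutes the data without replacement. Here is a minimal sketch of that procedure for a single condition, under some loud assumptions: the function name `perm_test_2d`, its arguments, and the choice of statistic (the chi-square statistic of the binned 2D histogram) are all illustrative and not taken from the question or from the np package. Any other measure of dependence between X and Y could be swapped in for `stat`.

```r
# Permutation test of association between x and y within one condition.
# ASSUMPTION: the statistic is the chi-square statistic of the binned
# 2D histogram; this is one possible choice, not the only sensible one.
perm_test_2d <- function(x, y, n_perm = 10000, n_bins = 10) {
  # Bin once with fixed edges so every permutation uses the same grid
  x_bin <- cut(x, breaks = seq(min(x), max(x), length.out = n_bins + 1),
               include.lowest = TRUE)
  y_bin <- cut(y, breaks = seq(min(y), max(y), length.out = n_bins + 1),
               include.lowest = TRUE)

  stat <- function(xb, yb) {
    observed <- table(xb, yb)  # the 2D histogram as a count table
    expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
    keep <- expected > 0       # skip cells whose row or column is empty
    sum((observed[keep] - expected[keep])^2 / expected[keep])
  }

  observed_stat <- stat(x_bin, y_bin)

  # Shuffling y breaks the x-y pairing but leaves both marginals unchanged
  null_stats <- replicate(n_perm, stat(x_bin, sample(y_bin)))

  # Adding 1 to numerator and denominator keeps the p-value above zero
  p_value <- (sum(null_stats >= observed_stat) + 1) / (n_perm + 1)

  list(statistic = observed_stat, p_value = p_value)
}

# Usage on simulated data (hypothetical; replace with one condition's x and y)
set.seed(1)
x <- rnorm(500)
y <- 0.5 * x + rnorm(500)
result <- perm_test_2d(x, y, n_perm = 2000)
result$p_value
```

This tests whether X and Y are associated within one condition. Comparing two conditions against each other would need a different shuffle (permuting the condition labels across observations) and a statistic that measures the distance between the two histograms.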
ran8