It seems to me that Ellis could be referring to as many as three distinct ideas here. First he says something about creating "simulated data generated by a model under the null hypothesis of no relation." I would call this a form of parametric bootstrapping. Then he says that this would be "probably based on resampling the times between each event (eg between each yawn) to create a new set of time stamps for hypothetical null model events." But to be clear, doing this is not "creating simulated data"; if I understand correctly, we would instead be resampling from our actually observed data. This latter procedure is either a permutation test or nonparametric bootstrapping, depending on how the resampling takes place.
I guess I should say a few more words about parametric bootstrapping, permutation tests, and nonparametric bootstrapping.
Usually parametric bootstrapping is done by simulating from the model actually estimated from the data, not from a hypothetical model that is just like the estimated model except that the null hypothesis is assumed true, as Ellis seems to suggest at first. By "simulating data" I mean something like the following: my model states that my data come from two groups, each normally distributed with means $\mu_1$ and $\mu_2$, respectively, and common standard deviation $\sigma$, so I generate many datasets that satisfy this and use the distribution of test statistics computed from these simulated datasets as my sampling distribution. Note that I am creating these data using something like `rnorm()` in R, not directly reusing my observed data. Now, one could certainly run this procedure under the null hypothesis of, say, no difference in group means--we would just assume $\mu_1=\mu_2$ in all the simulated datasets, contrary to what we actually observed--and in this way obtain a bootstrapped p-value (rather than a bootstrapped confidence interval, which is what the traditional version affords you). I would still call this a way of obtaining a p-value via parametric bootstrapping.
A permutation test, on the other hand, involves shuffling your observed data over and over in a way that is consistent with the null hypothesis. For example, if the null hypothesis implies that group assignment makes no difference to the group means, you can randomly shuffle the group labels among all your observations many times and record the mean difference you get from each shuffling. Then you check where your actual observed statistic lies within the distribution of test statistics computed from these shuffled datasets. Note that there is a finite (but usually large) number of ways you can shuffle your actually observed data.
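Continuing the same made-up two-group example, a permutation test might look like the sketch below; again, `x1` and `x2` are hypothetical data, and the label-shuffling is the only thing that matters here.

```r
# Minimal sketch: permutation test for a difference in group means.
set.seed(1)
x1 <- rnorm(30, mean = 5.0, sd = 2)   # same hypothetical data as the sketch above
x2 <- rnorm(30, mean = 6.5, sd = 2)

y     <- c(x1, x2)
group <- rep(c("A", "B"), times = c(length(x1), length(x2)))
obs_diff <- mean(y[group == "B"]) - mean(y[group == "A"])

# Shuffle the group labels many times and recompute the statistic each time
B <- 10000
perm_diffs <- replicate(B, {
  shuffled <- sample(group)            # random reassignment of labels to observations
  mean(y[shuffled == "B"]) - mean(y[shuffled == "A"])
})

# Where does the observed statistic fall within the permutation distribution?
p_value <- mean(abs(perm_diffs) >= abs(obs_diff))
p_value
```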
Finally, nonparametric bootstrapping is very similar to the permutation test, but we resample the observed data with replacement to try to get closer to the infinite "population" of values that our data might have been drawn from. There are many, many more ways to resample from your data with replacement than there are to shuffle your data (although the number is technically finite in practice as well). Again, as with parametric bootstrapping, this is usually done not under the null hypothesis but under the model implied by the observed data, yielding confidence intervals around the observed test statistic rather than p-values. But one could certainly imagine doing this under the null hypothesis as Ellis suggests and obtaining p-values that way. As an example of nonparametric bootstrapping in the traditional fashion (i.e., not under the null hypothesis), take the same difference-in-group-means example from the parametric bootstrapping paragraph: we would resample the observations within each group with replacement many times, without mixing observations between groups (unlike in the permutation test), and build up the sampling distribution of group mean differences that results.
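Here is a minimal sketch of that traditional (non-null) nonparametric bootstrap, again with the same hypothetical `x1` and `x2`; the percentile interval at the end is just one common way to summarize the bootstrap distribution.

```r
# Minimal sketch: nonparametric bootstrap confidence interval for the
# difference in group means, resampling within each group (not mixing groups).
set.seed(1)
x1 <- rnorm(30, mean = 5.0, sd = 2)   # same hypothetical data as the sketches above
x2 <- rnorm(30, mean = 6.5, sd = 2)
obs_diff <- mean(x2) - mean(x1)

B <- 10000
boot_diffs <- replicate(B, {
  res1 <- sample(x1, replace = TRUE)   # resample group 1 with replacement
  res2 <- sample(x2, replace = TRUE)   # resample group 2 with replacement
  mean(res2) - mean(res1)
})

# Percentile 95% confidence interval around the observed difference
obs_diff
quantile(boot_diffs, probs = c(0.025, 0.975))
```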