
I have two heavily skewed samples and am trying to use bootstrapping to compare their means using the t-statistic.

What is the correct procedure to do it?


The process I am using

I am concerned about the appropriateness of using the standard error of the original/observed data in the final step when I know that this is not normally distributed.

Here are my steps (a rough code sketch follows the list):

  • Bootstrap: randomly sample with replacement from each group (N = 1000 bootstrap samples)
  • Calculate the t-statistic for each bootstrap sample to create a t-distribution: $$ T(b) = \frac{(\overline{X}_{b1}-\overline{X}_{b2})-(\overline{X}_1-\overline{X}_2) }{\sqrt{ \sigma^2_{X_{b1}}/n + \sigma^2_{X_{b2}}/n }} $$
  • Estimate t confidence intervals by getting $\alpha/2$ and $1-\alpha/2$ percentiles of t-distribution
  • Get confidence intervals via:

    $$ CI_L = (\overline{X}_1-\overline{X}_2) - T_{CI_L}\cdot SE_{original} $$ $$ CI_U = (\overline{X}_1-\overline{X}_2) + T_{CI_U}\cdot SE_{original} $$ where $$ SE_{original} = \sqrt{ \sigma^2_{X_1}/n + \sigma^2_{X_2}/n } $$

  • Look at where the confidence interval falls to determine whether there is a significant difference in means (i.e. whether the interval excludes zero)
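
Here is a minimal sketch of the steps above in Python, assuming two 1-D NumPy arrays `x1` and `x2` (hypothetical names; `B` and `alpha` are just illustrative defaults). It is written in the usual studentized-bootstrap form, where the upper t-percentile gives the lower limit and vice versa:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_t_ci(x1, x2, B=1000, alpha=0.05):
    """Bootstrap-t (studentized bootstrap) CI for the difference in means."""
    n1, n2 = len(x1), len(x2)
    diff = x1.mean() - x2.mean()
    # SE from the original/observed data, as used in the final step above
    se = np.sqrt(x1.var(ddof=1) / n1 + x2.var(ddof=1) / n2)

    t_stats = np.empty(B)
    for b in range(B):
        xb1 = rng.choice(x1, size=n1, replace=True)
        xb2 = rng.choice(x2, size=n2, replace=True)
        se_b = np.sqrt(xb1.var(ddof=1) / n1 + xb2.var(ddof=1) / n2)
        t_stats[b] = ((xb1.mean() - xb2.mean()) - diff) / se_b

    # alpha/2 and 1-alpha/2 percentiles of the bootstrap t-distribution
    t_lo, t_hi = np.percentile(t_stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])

    # The difference is "significant" at level alpha if this interval excludes zero
    return diff - t_hi * se, diff - t_lo * se
```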

I have also looked at the Wilcoxon rank-sum test, but it does not give very reasonable results because the distribution is so heavily skewed (e.g. the 75th percentile equals the 95th percentile). For this reason I would like to explore the bootstrapped t-test further.

So my questions are:

  1. Is this an appropriate methodology?
  2. Is it appropriate to use the SE of the observed data when I know it is heavily skewed?

Possible duplicate: What method is preferred, a bootstrapping test or a nonparametric rank-based test?

1 Answer


I would just do a regular bootstrap test (a code sketch follows these steps):

  • compute the t-statistic in your data and store it
  • change the data so that the null hypothesis is true. In this case, subtract the group 1 mean from the group 1 observations and add the overall mean, and do the same for group 2; that way the means in both groups will equal the overall mean.
  • take bootstrap samples from this dataset, probably on the order of 20,000.
  • compute the t-statistic in each of these bootstrap samples. The distribution of these t-statistics is the bootstrap estimate of the sampling distribution of the t-statistic in your skewed data when the null hypothesis is true.
  • The proportion of bootstrap t-statistics that are larger than or equal to your observed t-statistic is your estimate of the $p$-value. You can do a bit better by taking $($the number of bootstrap t-statistics that are larger than or equal to the observed t-statistic $+1)$ divided by $($the number of bootstrap samples $+1)$. However, the difference is going to be small when the number of bootstrap samples is large.
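
A minimal sketch of this test in Python, assuming two 1-D NumPy arrays `x1` and `x2` (hypothetical names) and the unequal-variance t-statistic from the question; `B = 20000` follows the suggestion above:

```python
import numpy as np

rng = np.random.default_rng(0)

def t_stat(a, b):
    """Welch-style t-statistic for a difference in means."""
    return (a.mean() - b.mean()) / np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))

def bootstrap_test(x1, x2, B=20000):
    t_obs = t_stat(x1, x2)

    # Shift each group so the null hypothesis is true: both groups
    # keep their shape but share the overall mean.
    grand_mean = np.concatenate([x1, x2]).mean()
    z1 = x1 - x1.mean() + grand_mean
    z2 = x2 - x2.mean() + grand_mean

    t_boot = np.empty(B)
    for b in range(B):
        s1 = rng.choice(z1, size=len(z1), replace=True)
        s2 = rng.choice(z2, size=len(z2), replace=True)
        t_boot[b] = t_stat(s1, s2)

    # One-sided p-value with the +1 correction mentioned above
    return (np.sum(t_boot >= t_obs) + 1) / (B + 1)
```

For a two-sided test you would compare absolute values of the t-statistics instead.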

You can read more on that in:

  • Chapter 4 of A.C. Davison and D.V. Hinkley (1997) Bootstrap Methods and their Application. Cambridge: Cambridge University Press.

  • Chapter 16 of Bradley Efron and Robert J. Tibshirani (1993) An Introduction to the Bootstrap. Boca Raton: Chapman & Hall/CRC.

  • Wikipedia entry on bootstrap hypothesis testing.

  • This is essentially what I'm doing, but looking at the proportion of times the original/observed t-statistic is >= the bootstrapped t-statistic. Is it OK to do a t-test on heavily skewed data in the first instance, though? This is one of the reasons why I want to bootstrap. – CatsLoveJazz Apr 04 '14 at 15:25
  • Technically, for the bootstrap test you just need a test statistic, so that is not a problem. Substantively, a t-test compares means, and in skewed data medians are often more meaningful than means. So a test comparing medians instead of means may make more sense. However, that depends on your null hypothesis, which is your choice and your choice alone. – Maarten Buis Apr 04 '14 at 15:35
  • Ok thanks, it is the mean we want to test as all our other output has been in this form. – CatsLoveJazz Apr 04 '14 at 15:38