
Here is an experiment I did:

  1. I bootstrapped a sample $S$ and stored the results as empirical distribution under the name $S_1$.
  2. Then I bootstrapped the same sample $S$ $i=10000$ times in a row and compared each resulting empirical distribution $S_i$ with $S_1$ using the Kolmogorov-Smirnov test.
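
A minimal sketch of this procedure, assuming Python with `numpy` for the resampling and `scipy.stats.ks_2samp` for the test (the original was done in Java), and a standard normal sample standing in for $S$:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
S = rng.normal(size=1000)                      # stand-in for the original sample S

# Step 1: one bootstrap resample of S, kept as the reference S_1
S1 = rng.choice(S, size=len(S), replace=True)

# Step 2: resample S repeatedly and KS-test each S_i against S_1
pvals, dvals = [], []
for _ in range(10000):
    Si = rng.choice(S, size=len(S), replace=True)
    D, p = ks_2samp(S1, Si)
    dvals.append(D)
    pvals.append(p)

print(min(pvals), max(pvals))   # p spreads over essentially (0, 1)
print(min(dvals), max(dvals))   # D fluctuates from resample to resample
```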

Results from the experiment: The comparisons return different $p$-values (from $0.01$ to $0.99$) and different $D$ values (from $0.02$ to $0.06$).

Is that expected? If I bootstrap the same sample 1000 times, isn't it expected that all 1000 empirical distributions will be from the same distribution?

If yes, then should I try to establish the distribution of the empirical distributions ($S_1, \ldots, S_i$)?

For instance: Three empirical distributions $S_1$, $S_2$, $S_3$ bootstrapped from the same initial sample $S$:

S1: 1,2,3,4,5,6
S2: 1,3,4,5,6,7
S3: 2,4,5,6,7,8

If I add them up I get:

1,1,2,2,3,3,4,4,4,5,5,5,6,6,6,7,7,8
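
For reference, this "adding up" is just pooling the resamples into one sorted multiset; a quick hypothetical Python check (not part of the original workflow):

```python
S1 = [1, 2, 3, 4, 5, 6]
S2 = [1, 3, 4, 5, 6, 7]
S3 = [2, 4, 5, 6, 7, 8]

pooled = sorted(S1 + S2 + S3)   # concatenate, then sort
print(pooled)
# [1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 8]
```
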
  • What's the objective? Did you look at the Lilliefors test and the paper that describes it? – Aksakal Dec 10 '14 at 13:56
  • The objective is to be able to KS-test empirical distributions bootstrapped from different samples. Hence I am first trying the KS test on empirical distributions bootstrapped from one and the same underlying sample. As of now the KS test does not show that all the empirical distributions come from the same underlying distribution, which is strange. – Dimitar Bakardzhiev Dec 10 '14 at 14:15
  • That's the point of bootstrapping. They are not supposed to. Look at test stats and compare with critical values. – Aksakal Dec 10 '14 at 14:17
  • What do you mean by "they are not supposed to"? If I bootstrap the same sample 1000 times, isn't it expected that all 1000 empirical distributions will be from the same distribution? – Dimitar Bakardzhiev Dec 10 '14 at 15:55
  • It depends on how you bootstrap. That's why I asked if you looked at how it's done properly, like in Lilliefors' paper, for instance. – Aksakal Dec 10 '14 at 16:09
  • I bootstrap using Java. I don't need to test for normality just if the bootstrapped empirical distributions are drawn from the same distribution. – Dimitar Bakardzhiev Dec 10 '14 at 16:12
  • "If I bootstrap the same sample 1000 times isn't it expected that all 1000 empirical distributions to be from the same distribution?" **The bootstrap *samples* do not have the same distribution!** (If they did why would you bother bootstrapping?) However, they are *drawn* from the same distribution. Consider drawing a sample from the distribution ${1,4,5}$: you get ${1,1,4}$ the first sample, but ${1,5,5}$ the second sample: each bootstrap (i.e. each sample, for size $n=3$) has a different (technically *possibly* different) distribution, but each sample is *drawn from* the same distribution. – Alexis Sep 21 '21 at 21:30

3 Answers


The thing to recognize here is that all of your bootstrap samples come from the same population. That is, the null hypothesis obtains here. Bear in mind that under the null hypothesis, the $p$-value is uniformly distributed on $(0, 1)$. So it sounds like everything worked fine (although I don't know if that is what you were trying to do).
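
A quick way to see this uniformity is to histogram the p-values from many bootstrap comparisons. Below is a hypothetical Python sketch (not gung's code); note that ties among resampled values make the uniformity only approximate:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
S = rng.normal(size=1000)                      # stand-in for the original sample
S1 = rng.choice(S, size=len(S), replace=True)  # the reference bootstrap sample

# KS p-values from many bootstrap comparisons against S1
pvals = np.array([ks_2samp(S1, rng.choice(S, size=len(S), replace=True)).pvalue
                  for _ in range(2000)])

# Under the null, the p-values should spread roughly evenly over (0, 1)
counts, _ = np.histogram(pvals, bins=10, range=(0, 1))
print(counts)   # each bin should hold roughly 200 of the 2000 p-values
```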

gung - Reinstate Monica
  • My objective is to use the KS test to check whether bootstrapped empirical distributions coming from different populations are actually drawn from the same underlying distribution. Hence the first thing I tried was to test this with a single population, and it looks like the null hypothesis doesn't hold. – Dimitar Bakardzhiev Dec 10 '14 at 16:09
  • I don't see why you think the null doesn't hold. You state that the p-values range from .01 to .99. That's what they are supposed to do under the null. – gung - Reinstate Monica Dec 10 '14 at 16:12
  • Isn't it the case that the p-value should be > 0.1 in order to have no presumption against the null hypothesis? – Dimitar Bakardzhiev Dec 10 '14 at 16:14
  • No. Where did you get "0.1"? This is your significance level, which *you* set. What if *I want* 0.05? How would your bootstrapping *know* that I want 0.05? You'll be getting different p-values every time. – Aksakal Dec 10 '14 at 16:20
  • @DimitarBakardzhiev, Aksakal is right. The p-value is not 'supposed' to be > 0.1 when the null holds. It is supposed to take all values in $(0, 1)$ equally often. – gung - Reinstate Monica Dec 10 '14 at 16:22

I think I understand your problem now. You alluded to your assumption that the KS test should somehow show that all bootstrapped samples come from the original sample. However, consider this: what does it mean to show that they're from the same distribution?

It usually means that the p-value is above some significance level $\alpha$. If the bootstrapping is done properly, you'll get a p-value that is sometimes above and sometimes below $\alpha$. Build the distribution of the test statistics you get from running the KS test on the bootstrapped samples, and observe the p-values for various critical values; they should match the theoretical values for which the KS test was designed.
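
As a hypothetical illustration of this check (not Aksakal's code), one can compare the empirical rejection rate of the KS statistic against the standard asymptotic two-sample critical value $D_\alpha = c(\alpha)\sqrt{(n+m)/(nm)}$ with $c(\alpha)=\sqrt{-\ln(\alpha/2)/2}$:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
S = rng.normal(size=1000)                      # stand-in for the original sample
n = len(S)
S1 = rng.choice(S, size=n, replace=True)

# Distribution of the two-sample KS statistic over many bootstrap comparisons
D = np.array([ks_2samp(S1, rng.choice(S, size=n, replace=True)).statistic
              for _ in range(2000)])

alpha = 0.05
c_alpha = np.sqrt(-np.log(alpha / 2) / 2)      # ~1.358 for alpha = 0.05
D_crit = c_alpha * np.sqrt((n + n) / (n * n))  # two samples of equal size n

# Under the null, the empirical rejection rate should be in the vicinity of alpha
print((D > D_crit).mean())
```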

Aksakal
  • Yes, my expectation is that the KS test should show that all bootstrapped samples come from the original sample. Note that the bootstrapped samples I KS-test are all of the same size (1000). If I build the distribution of KS-test statistics, how will this help me when I start checking whether bootstrapped samples from two different samples are actually drawn from the same underlying distribution? Do you have a paper to recommend? – Dimitar Bakardzhiev Dec 10 '14 at 16:32
  • [This](http://lya.fciencias.unam.mx/rfuentes/K-S_Lilliefors.pdf) is the paper I referred to. It's about the KS test. Also see the discussion here, with a code example: http://stats.stackexchange.com/questions/126539/testing-whether-data-follows-t-distribution/126552#126552 – Aksakal Dec 10 '14 at 16:36

It seems to me that you could apply permutation testing to check whether your samples come from the same distribution. It is similar to bootstrapping in the sense that you run a Monte Carlo simulation: under the null hypothesis you can permute observations between the two sets, and then you (repeatedly) compute the statistic of interest on the permuted sets. This is a really nice website explaining the logic behind permutation tests: https://www.jwilber.me/permutationtest/
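
A minimal sketch of such a permutation test, assuming Python and using the two-sample KS statistic as the statistic of interest (`ks_permutation_test` is an illustrative name, not from this answer):

```python
import numpy as np
from scipy.stats import ks_2samp

def ks_permutation_test(x, y, n_perm=5000, seed=0):
    """Permutation p-value for the two-sample KS statistic."""
    rng = np.random.default_rng(seed)
    observed = ks_2samp(x, y).statistic
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                    # relabel observations under the null
        d = ks_2samp(pooled[:len(x)], pooled[len(x):]).statistic
        count += d >= observed
    return (count + 1) / (n_perm + 1)          # add-one correction

# Example: two samples actually drawn from the same distribution
rng = np.random.default_rng(1)
x, y = rng.normal(size=200), rng.normal(size=200)
print(ks_permutation_test(x, y))               # typically not significant
```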

Ggjj11