Probability Given Datapoint Does Not Appear in a Bootstrap Sample?

Question

I received this question for a Statistics / Machine Learning assignment and I'd like to know if any of you know the proper answer.

If we have n data points, what is the probability that a given data point does not appear in a bootstrap sample?

Sounds simple enough, right? I'm reading Introduction to Statistical Learning to try to find the solution, but I would definitely appreciate some help.

This question appears to be off-topic because it is about statistics and has been flagged for migration to Cross Validated. — Thomas, Mar 03 '14 at 09:41

score 0 · Answer 1 · answered Mar 03 '14 at 00:46

Well, the probability of something not happening in n independent trials is (1 - P1)^n, where P1 is the probability that it happens in one trial. What is the probability of selecting a given value in a single draw? If you substitute that in, does it look like an expression you know involving Euler's number e?
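
Filling in that hint, the calculation might run roughly as follows. This is a sketch that assumes the bootstrap sample consists of n independent draws with replacement from the n data points (the usual definition; the question itself does not state the sample size). The symbol P1 is taken from the answer above.

```latex
\begin{align*}
P_1 &= \frac{1}{n}
  && \text{(a given point is picked in one draw)} \\
P(\text{point never picked in } n \text{ draws}) &= (1 - P_1)^n
  = \left(1 - \frac{1}{n}\right)^{n} \\
\lim_{n \to \infty} \left(1 - \frac{1}{n}\right)^{n} &= e^{-1} \approx 0.368
\end{align*}
```

Under that assumption, for large n roughly 37% of the data points would be expected to be missing from any one bootstrap sample.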

Robert Dodier

I don't know the probability of finding a given data point out of n in a bootstrap sample; the question gives no numerical values. I believe it is simply one trial with n data points, but I do not know how to find the probability that one of those n data points will be in the bootstrap sample. I believe the formula follows directly from how bootstrap sampling works, but I can't work it out from other bootstrapping formulas. – Mar 03 '14 at 01:13

score 0 · Answer 2 · answered Mar 06 '14 at 03:55

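The probability of picking a given point in one draw from n data points is 1/n. A bootstrap sample makes n such draws with replacement, so the probability that the point never appears is (1-1/n)**n.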
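
One way to sanity-check that expression is to compare it with a quick simulation. The following is a minimal sketch, not a definitive implementation: the function names `prob_absent` and `simulate_absent`, the trial count, and the seed are all illustrative choices, and it assumes the bootstrap sample has the same size n as the data set.

```python
import math
import random


def prob_absent(n: int) -> float:
    """Closed form (1 - 1/n)**n: chance a fixed point is missed by n draws."""
    return (1 - 1 / n) ** n


def simulate_absent(n: int, trials: int = 10_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability (illustrative helper)."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        # One bootstrap sample: n draws with replacement from indices 0..n-1.
        sample = {rng.randrange(n) for _ in range(n)}
        # Track index 0; by symmetry any fixed index gives the same answer.
        if 0 not in sample:
            misses += 1
    return misses / trials


if __name__ == "__main__":
    for n in (5, 50, 200):
        print(n, round(prob_absent(n), 4), round(simulate_absent(n), 4))
    print("1/e =", round(math.exp(-1), 4))
```

The simulated fraction should sit close to the closed-form value for each n, and both should drift toward 1/e as n grows.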
