Say I am tasked with estimating the median income of graduates of a given MBA program, and all I have is a sample of 1000 salaries, no more and no less. I could take the median of this sample, but a single point estimate tells me nothing about how uncertain that estimate is.
Does it make sense to draw bootstrap samples (resampling with replacement) from the pool of 1000 salaries, collect their medians, and examine the resulting distribution? The mean and standard deviation of these bootstrapped medians might give me a better idea of where the true median lies and how confident I can be in it.
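To make the scheme concrete, here is a minimal sketch of what I have in mind, in Python with NumPy. The salaries are synthetic placeholders (a log-normal draw), not real data. The function name `bootstrap_medians`, the 2000 resamples, the choice to resample at the full sample size, and the percentile interval are all my own assumptions, not something I know to be the right choice.

```python
import numpy as np

def bootstrap_medians(salaries, n_resamples=2000, seed=0):
    """Return the median of each bootstrap resample (drawn with replacement)."""
    rng = np.random.default_rng(seed)
    salaries = np.asarray(salaries, dtype=float)
    n = salaries.size  # assumption: each resample is as large as the original sample
    # One row of random indices per resample; indices may repeat (with replacement).
    idx = rng.integers(0, n, size=(n_resamples, n))
    return np.median(salaries[idx], axis=1)

# Synthetic stand-in for the 1000 observed salaries (NOT real data).
data_rng = np.random.default_rng(42)
salaries = data_rng.lognormal(mean=11.5, sigma=0.5, size=1000)

medians = bootstrap_medians(salaries)

point_estimate = np.median(salaries)                    # plain sample median
std_error = medians.std(ddof=1)                         # spread of the bootstrapped medians
ci_low, ci_high = np.percentile(medians, [2.5, 97.5])   # simple percentile interval

print(f"sample median: {point_estimate:.0f}")
print(f"bootstrap std. error: {std_error:.0f}")
print(f"95% percentile interval: [{ci_low:.0f}, {ci_high:.0f}]")
```

Here the standard deviation of `medians` plays the role of a standard error for the sample median, and the 2.5th and 97.5th percentiles give a rough interval around it.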
A few questions...
- Is this a principled approach?
- How should I choose n, the size of each bootstrap sample?
- Any other rules for doing this sort of analysis that I should be aware of?