Question
Suppose I have a sample of size $N$, with observations $y_i$, $i = 1, \dots, N$, and some statistic computed from them, say $S = f(y_1, \dots, y_N)$.
I would now like to estimate the variability of $S$ at a given sample size $n$, call it $V_n(S)$, where $n \leq N$.
For example, I would like to know which $N^* \ll N$ I can choose so as to still achieve some required level of precision.
Idea
If I wanted to estimate the variability at the full sample size $N$, i.e. $V_N(S)$, for a general statistic, I could use the bootstrap or the jackknife.
But how would I estimate it for smaller values of $n$?
One direction I had was to repeatedly draw $B$ (bootstrap) samples of size $n$ from the $N$ observations, compute $S$ on each, and take the empirical variance of those $B$ values as an estimate of $V_n(S)$.
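Concretely, something like this minimal sketch in Python/NumPy (the name `subsample_variance` and the defaults, e.g. $B = 2000$ and a normal toy sample, are just my own placeholders):

```python
import numpy as np

def subsample_variance(y, stat, n, B=2000, replace=True, seed=None):
    """Draw B subsamples of size n from y, apply stat to each,
    and return the empirical variance of the B statistic values."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    vals = [stat(rng.choice(y, size=n, replace=replace)) for _ in range(B)]
    return np.var(vals, ddof=1)

# Toy usage: variability of the median at n = 50, given N = 1000 observations
rng = np.random.default_rng(0)
y = rng.standard_normal(1000)
print(subsample_variance(y, np.median, n=50, seed=1))
```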
But this raises several questions for me:
- What do I gain from using samples with replacement? Why not take $B$ samples, each of size $n$, without replacement? (For $n = N-1$ this would essentially be the jackknife.) What is a good way to decide whether there is a "substantial" benefit to sampling with replacement here? (See the first sketch after this list.)
- If the statistic were something like the mean, we would know the exact formula for the standard error: we would just estimate the variance once and divide it by $n$ for each sample size we care about. Wouldn't that also be better in the general case, if we knew how $V_n(S)$ depends on $n$? That is, why not estimate $V_N(S)$ on the full sample and then multiply it by $\frac{N}{n}$ to get the relationship we care about? Is there a known characterization of which statistics have variance that depends linearly on $1/n$ (e.g., the variance of the mean, the variance of the variance), versus ones that do not (such as the variance of the max)? (See the second sketch below.)
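On the first question, here is a small numeric check I can run for the mean, where the answer is known in closed form: sampling without replacement shrinks the variance by the finite-population correction $(N-n)/(N-1)$ relative to sampling with replacement. (The sketch is my own; for a general statistic the correction presumably need not be this simple.)

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(1000)
N, B = len(y), 5000

def subsample_var(stat, n, replace):
    vals = [stat(rng.choice(y, size=n, replace=replace)) for _ in range(B)]
    return np.var(vals, ddof=1)

for n in (20, 100, 500):
    with_r = subsample_var(np.mean, n, replace=True)
    without_r = subsample_var(np.mean, n, replace=False)
    # For the mean, the without-replacement variance should be roughly the
    # with-replacement variance times the correction (N - n) / (N - 1).
    print(n, round(with_r, 5), round(without_r, 5),
          round(with_r * (N - n) / (N - 1), 5))
```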
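And on the second question, a crude empirical diagnostic I considered: estimate $V_n(S)$ at several values of $n$ and fit the slope of $\log V_n$ against $\log n$; a slope near $-1$ is consistent with the $1/n$ scaling, while a statistic like the max should show a different slope. (Again my own sketch, not a formal test.)

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(2000)

def v_n(stat, n, B=3000):
    return np.var([stat(rng.choice(y, size=n)) for _ in range(B)], ddof=1)

ns = np.array([25, 50, 100, 200, 400])
for stat in (np.mean, np.max):
    v = [v_n(stat, n) for n in ns]
    slope = np.polyfit(np.log(ns), np.log(v), 1)[0]
    # A slope close to -1 is consistent with V_n(S) ~ c / n.
    print(stat.__name__, round(slope, 2))
```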
Any thoughts?