  • At what point exactly is a sample size considered "really large"?
  • In hypothesis testing, do small differences become consistently significant when n is 100,000?
  • What about 100,001?
  • Is there a definitive cutoff between a large and a small sample size?

An example in R or a published article would also be helpful!

Archana David
Nate
  • You might be interested in this post: https://stats.stackexchange.com/questions/2516/are-large-data-sets-inappropriate-for-hypothesis-testing – Pitouille Nov 22 '21 at 15:37
  • As a general proposition, questions in statistics that use vague and unquantitative terms such as "really large" are considered ... vague and unquantitative; and therefore tend not to have general answers. – whuber Nov 22 '21 at 15:40
  • With _any_ sample size you need to have a clear idea what difference is of practical importance. // If you have used that difference in a power and sample size computation (and made the right assumptions) then sample size should be just the right size to have a good chance of detecting that difference. // If you have data on essentially the entire population of interest, you should be describing the population, not testing hypotheses about it. – BruceET Nov 22 '21 at 15:40
  • If one considers it a problem when a hypothesis test has enough power to reject small differences, I tend to believe that one is misapplying frequentist statistics. – Dave Nov 22 '21 at 15:42
  • @Pitouille, I did see that one – thank you. But I'd like to know: at what point do I have to start suspecting that I'm seeing statistical significance because of the sample size rather than a real difference? – Nate Nov 22 '21 at 15:46
  • @Nate I wouldn't say any sample is too large. What you should do is look at more information, first of all a confidence interval. If the confidence interval does not contain zero (so the test is significant) but contains only effect sizes so small as to be practically meaningless, you haven't found anything of *practical* significance. (The wording "real difference" is misleading, as even a very small difference may be real but not meaningful; see the R sketch after these comments.) – Christian Hennig Nov 22 '21 at 16:41
  • Got it, thank you! – Nate Nov 22 '21 at 16:47
  • In pondering this question, I found it helpful to consider that in many areas of chemistry and physics, interesting properties of a sample can't even be measured until it contains about $10^{15}$ times as many molecules as your tentative limit of $100,000.$ – whuber Nov 22 '21 at 20:45
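
To make the comment thread concrete, here is a minimal R sketch. The true difference of 0.01 SD and the group size of 100,000 are arbitrary illustrative choices: with a sample this large, a practically negligible difference is routinely declared significant, but the confidence interval shows just how small it is.

    set.seed(1)
    n <- 100000                  # the sample size from the question
    x <- rnorm(n, mean = 0)      # group 1: true mean 0, SD 1
    y <- rnorm(n, mean = 0.01)   # group 2: true mean shifted by a tiny 0.01 SD

    tt <- t.test(x, y)
    tt$p.value    # often below 0.05: the tiny difference is 'significant'
    tt$conf.int   # but the CI covers only practically negligible differences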

1 Answer


A sample is too large when it costs too much. Too much time, too much money, too much effort, too many graduate students, or too many slithy toves. From the point of view of determining the properties of the statistical population being sampled, more is better.

Some of the worry about 'overly large' samples stems from a low p-value being taken as a sign of importance or real-world 'significance'. Avoid that by paying attention to the magnitude of the observed effect: always report the effect size, and always scale it against real-world considerations.
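
As a sketch in base R (the group means, the SD of 15, and the sample size are made up for illustration), a standardized effect size such as Cohen's d can be reported next to the p-value:

    set.seed(2)
    x <- rnorm(100000, mean = 100.0, sd = 15)   # e.g. test scores in group A
    y <- rnorm(100000, mean = 100.2, sd = 15)   # group B, shifted by 0.2 points

    tt <- t.test(x, y)

    # Pooled-SD Cohen's d: the mean difference in standard-deviation units
    pooled_sd <- sqrt((var(x) + var(y)) / 2)
    d <- (mean(y) - mean(x)) / pooled_sd

    tt$p.value   # may well fall below 0.05 at this sample size
    d            # around 0.013 SD: trivially small on any practical scale

A reader can then judge for themselves whether a 0.2-point shift on a scale with an SD of 15 matters, whatever the p-value says.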

Other concerns about 'overly large' samples come from the idea that, in a repeated-testing (sequential testing) situation, sampling until the p-value falls below some arbitrary threshold will, in theory, always eventually stop with a low p-value. In practice that is not true, because real-world sampling is constrained by time, money, effort, and so on. And even if you did stop with a false-positive low p-value under such a protocol, you would usually be protected from a real-world mistake by attending to the effect size, as in the previous paragraph.
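
That optional-stopping behaviour is easy to simulate. The following sketch (the batch size, cap, and number of runs are arbitrary choices) samples under a true null hypothesis and tests after every batch, so every 'significant' stop is a false positive:

    set.seed(3)

    # Keep adding observations under H0 (true mean 0) and run a t-test
    # after each batch, stopping as soon as p < 0.05 (optional stopping).
    sample_until_significant <- function(batch = 100, max_n = 10000) {
      x <- numeric(0)
      while (length(x) < max_n) {
        x <- c(x, rnorm(batch))
        if (t.test(x, mu = 0)$p.value < 0.05) {
          return(length(x))   # stopped early on a false positive
        }
      }
      NA                      # resource cap reached without 'significance'
    }

    stops <- replicate(200, sample_until_significant())
    mean(!is.na(stops))       # well above the nominal 5% false-positive rate

The stopping rule 'works' far more often than 5% of the time, but only because the cap stands in for the real-world constraints mentioned above; with unlimited resources it would, in theory, always stop eventually.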

Michael Lew