Let's say I have a sample of size 800,000. All the tests say it's normally distributed, and a fitted distribution looks very nice on the data. I take a subsample of it (say 10,000 records). Can I say the subsample is normally distributed too? My intuition says it is, since the subsample is also drawn from a normal distribution. I take subsamples many times, so I would rather not run the tests every time. Thank you in advance!
-
Did you subsample randomly and uniformly? If so, then *a fortiori* your subsample was drawn randomly from a Normal distribution. – whuber Jan 30 '20 at 14:21
-
Related: https://stats.stackexchange.com/questions/442016/if-a-sample-is-not-normally-distributed-can-a-subset-of-the-sample-be-normal – Tim Jan 30 '20 at 14:25
-
@whuber, yes, the subsample was sampled randomly with replacement. That's what I thought, thanks! – ThePhysicist92 Jan 30 '20 at 14:30
-
Well, if you are truly nitpicky, then sampling with replacement will give you a non-normal distribution, because you have a nonzero probability of seeing the exact same observation twice, and such an event has zero probability for continuous distributions. But for practical purposes, you can probably disregard this. – Stephan Kolassa Jan 30 '20 at 14:37
-
Random subsampling without replacement would be much better from a theoretical perspective. – Davide ND Jan 30 '20 at 14:41
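To make the with/without replacement point in the comments concrete, here is a minimal sketch using only Python's standard library. It assumes a simulated standard normal population of 800,000 values standing in for the real data; the helper names `subsample` and `count_repeats` and the seeds are illustrative choices, not anything from the question.

```python
import random


def subsample(data, k, replace, seed=0):
    """Draw k items uniformly at random from data, with or without replacement."""
    rng = random.Random(seed)
    if replace:
        # With replacement: the same record can be drawn more than once.
        return rng.choices(data, k=k)
    # Without replacement: each record is drawn at most once.
    return rng.sample(data, k)


def count_repeats(values):
    """Number of draws whose value already appeared earlier in the list."""
    return len(values) - len(set(values))


# Assumption: simulated N(0, 1) data in place of the real 800,000-record sample.
rng = random.Random(42)
population = [rng.gauss(0.0, 1.0) for _ in range(800_000)]

with_repl = subsample(population, 10_000, replace=True)
without_repl = subsample(population, 10_000, replace=False)

print("repeated values, with replacement:   ", count_repeats(with_repl))
print("repeated values, without replacement:", count_repeats(without_repl))
```

Drawing 10,000 records with replacement from 800,000 should typically produce a few dozen exact repeats, which is the tie effect mentioned above; drawing without replacement can only repeat a value if the original data already contains ties.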