
I've seen a couple of similar questions on Stack Exchange but did not find their answers enlightening. Perhaps my confusion traces back to mixing up the Bernoulli and Binomial distributions. (Sidebar: is np(1-p) the population variance of a Binomial distribution, or is it just the sample variance?)
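To make the sidebar concrete, here is a small check I put together. It computes the variance of a Binomial(n, p) variable directly from its probability mass function, with no sampling involved, and compares it to np(1-p). The function name and the values n = 20, p = 0.3 are my own illustrative choices. If I've set this up correctly, the two numbers agree, which suggests np(1-p) is a property of the distribution itself rather than of any particular sample.

```python
from math import comb


def binomial_variance_from_pmf(n: int, p: float) -> float:
    """Variance of Binomial(n, p), summed directly over the pmf (no sampling)."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * w for k, w in enumerate(pmf))
    return sum((k - mean) ** 2 * w for k, w in enumerate(pmf))


# Illustrative values only, not taken from any real experiment.
n, p = 20, 0.3
exact = binomial_variance_from_pmf(n, p)
formula = n * p * (1 - p)
print(exact, formula)
```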

It's my understanding that, when dealing with sample means, the t-test is preferred when the population variance is unknown (so the sample variance must be used instead).

Additionally, the t-test is appropriate when samples are drawn from normal (or perhaps approximately normal?) distributions. In the case of proportions, the distribution is defined on the range [0, 1], so the assumption of normality does not hold.

However, due to the central limit theorem, sample means (here, sample proportions) drawn from a binomial distribution will be approximately normal when n is large enough, so the z-test can be used. But the t-distribution also converges to the normal distribution as n, and with it the degrees of freedom, grows large. And in the case of proportions, n is usually very high (certainly above 30 for commercial A/B tests, anyway).
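For reference, this is the kind of z-test I have in mind, written as a minimal sketch using only the standard library. It is the usual pooled two-sample test of proportions as I understand it; the function name and the conversion counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x_a: int, n_a: int, x_b: int, n_b: int):
    """Pooled two-sided z-test of H0: p_a == p_b. Returns (z, p_value)."""
    p_a, p_b = x_a / n_a, x_b / n_b
    # Under H0 both groups share one proportion, so the successes are pooled.
    p_pool = (x_a + x_b) / (n_a + n_b)
    # The standard error is computed from the pooled proportion alone.
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical A/B test: 200/5000 conversions vs 250/5000 conversions.
z, p_value = two_proportion_z_test(x_a=200, n_a=5000, x_b=250, n_b=5000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

Note that no separately estimated variance appears anywhere: the standard error is built entirely from the estimated proportion and the sample sizes.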

So it would seem that the justification is more practical than theoretical: the t-test is just a pain to use, and since n is so high, why not just use a z-test? However, in one of the questions, user whuber addressed this specifically in a comment, and I didn't understand his argument.
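To put a rough number on "n is so high", here is a sketch that measures the largest gap between the t density and the standard normal density over a grid on [-4, 4]. The t density is written out from its textbook formula using `lgamma` so that large degrees of freedom do not overflow; the helper names, the grid, and the chosen degrees of freedom are my own assumptions.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t with df degrees of freedom, via log-gamma."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def max_density_gap(df: float, grid_points: int = 801) -> float:
    """Largest |t pdf - normal pdf| over an evenly spaced grid on [-4, 4]."""
    std_normal = NormalDist()
    xs = [-4 + 8 * i / (grid_points - 1) for i in range(grid_points)]
    return max(abs(t_pdf(x, df) - std_normal.pdf(x)) for x in xs)


# Illustrative degrees of freedom: small, the "n = 30" rule of thumb, and large.
gaps = {df: max_density_gap(df) for df in (5, 30, 1000)}
for df, gap in gaps.items():
    print(df, gap)
```

The gap shrinks as the degrees of freedom grow, which is the practical argument: at A/B-test sample sizes the two tests should give nearly identical answers.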

two samples, hypothesis test of proportions t or z test

https://math.stackexchange.com/questions/680587/why-isnt-a-t-test-used-when-comparing-two-proportions#:~:text=For%20paired%20data%20where%20you,pointed%20out%2C%20it%20is%20not.

jbuddy_13
  • "The reason you can use a z-test with proportion data is because the standard deviation of a proportion is a function of the proportion itself. Thus, once you have estimated the proportion in your sample, you don't have an extra source of uncertainty that you have to take into account." (from the second answer) this actually clears some up for me as the t-dist is *very* concerned with degrees of freedom (DF) and I guess you could say that with proportions, the number of DF is unstable; thus better to use the z-test. – jbuddy_13 Aug 25 '21 at 20:26
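A quick way to see the quoted point, as a minimal sketch with made-up 0/1 data: for binary outcomes, the sample variance computed directly from the data matches n/(n-1) · p̂(1-p̂), so it carries no information beyond the estimated proportion and n. The helper name below is hypothetical.

```python
from statistics import mean, variance


def sample_variance_from_proportion(p_hat: float, n: int) -> float:
    """Unbiased sample variance of 0/1 data, written in terms of p_hat only."""
    return n / (n - 1) * p_hat * (1 - p_hat)


# Made-up binary outcomes: 30 successes out of 100 trials.
data = [1.0] * 30 + [0.0] * 70
p_hat = mean(data)
direct = variance(data)  # usual n-1 sample variance from the raw data
implied = sample_variance_from_proportion(p_hat, len(data))
print(direct, implied)
```

With continuous data the variance would be a second quantity to estimate on top of the mean, which is the extra uncertainty the t-distribution accounts for; here it comes for free once p̂ is known.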
