Why do we use a z-test for proportions rather than a t-test? I found a similar question here, but I could not follow what its answer is trying to convey. It would be a great help if someone could explain the reason in simpler terms.

learnToCode

1 Answer

As this answer says in detail, the assumptions underlying the t-test only strictly hold when the individual data values are sampled from a normal distribution.

Proportions are limited to values between 0 and 1, while values drawn from a normal distribution can be any real number. And unlike sampling from a normal distribution, where the sample mean and sample variance are independent, once you know the proportion you already have information about the variance. So proportions don't meet the assumptions needed for a t-test to be strictly valid.
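To make the mean/variance link concrete, here is a minimal sketch, assuming each observation is coded as 0 (failure) or 1 (success). In that case the usual sample variance is an exact function of the observed proportion, so the two cannot be independent. The function name `implied_variance` and the values of `n` and `p_true` are illustrative choices, not anything from the question.

```python
import random
import statistics

def implied_variance(p_hat, n):
    """Sample variance (n - 1 denominator) implied by the proportion alone,
    valid when every observation is 0 or 1."""
    return n / (n - 1) * p_hat * (1 - p_hat)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
n = 20                   # illustrative sample size
p_true = 0.3             # illustrative true success probability

# Simulate n independent 0/1 trials.
data = [1.0 if rng.random() < p_true else 0.0 for _ in range(n)]

p_hat = statistics.mean(data)    # observed proportion
s2 = statistics.variance(data)   # ordinary sample variance
implied = implied_variance(p_hat, n)

# The two variance figures agree up to floating-point rounding.
print(p_hat, s2, implied)
```

For normally distributed data there is no such formula: knowing the sample mean tells you nothing about the sample variance.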

As you collect more and more observations, however, the distribution of the sample mean (here, the sample proportion) comes close to a normal distribution in most practical applications. The z-test is based directly on the normal distribution. So although the z-test might not be exact with very few observations, it doesn't take very many observations for it to be a very good approximation.
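As a rough illustration of the normal approximation described above, here is a sketch of a one-sample, two-sided z-test for a proportion using only the standard library. The function name and the example counts (60 successes in 100 trials against a null value of 0.5) are made up for illustration, and the result is only trustworthy when the sample is large enough for the approximation to hold.

```python
import math

def one_sample_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: true proportion == p0 (normal approximation)."""
    p_hat = successes / n
    # Under H0 the variance is fixed by p0, so no separate variance estimate is needed.
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: 60 successes out of 100 trials, null proportion 0.5.
z, p_value = one_sample_proportion_z_test(60, 100, 0.5)
print(z, p_value)
```

Note that the standard error comes from the hypothesized proportion rather than from a separately estimated variance, which is the reason the normal (z) reference distribution is used instead of the t distribution.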

EdM
  • How do we know the variance once we know the proportion? I didn't get that part. – learnToCode May 02 '20 at 03:50
  • @learnToCode that was an oversimplification; I edited a bit. If you have a probability _p_ of success in each of _n_ independent trials, you are working with a [binomial distribution](https://en.wikipedia.org/wiki/Binomial_distribution). The mean number of successes is _np_ and the variance of the number of successes is known to be _np(1-p)_. So if you know the true probability of success and the number of trials, you already know the variance. (continued...) – EdM May 02 '20 at 14:43
  • @learnToCode You estimate the true probability _p_ from the fraction of successes in your sample of trials. The t-test assumes that the mean and variance of your _sample_ are independent. If your sample size is only 1 or 2 then examples on [this page](https://stats.stackexchange.com/q/320936/28500) show that the sample mean exactly predicts the sample variance for a binomial distribution. For larger sample sizes the relationship isn't so strict, but the mean and variance still aren't independent (as they would be for sampling from a normal distribution) so the t-test assumptions aren't met. – EdM May 02 '20 at 14:54
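The binomial facts quoted in the comments can be written out as follows. This is a sketch under the stated assumption of _n_ independent trials with a common success probability _p_; the symbols $X$ (number of successes) and $\hat{p}$ (sample proportion) are notation introduced here for illustration.

```latex
X \sim \mathrm{Binomial}(n, p), \qquad
\mathbb{E}[X] = np, \qquad
\operatorname{Var}(X) = np(1-p)

\hat{p} = \frac{X}{n}, \qquad
\mathbb{E}[\hat{p}] = p, \qquad
\operatorname{Var}(\hat{p}) = \frac{\operatorname{Var}(X)}{n^{2}} = \frac{p(1-p)}{n}
```

So once _p_ and _n_ are fixed, the variance of the proportion is fixed too, with no free variance parameter left to estimate.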