Does alpha (the significance level) represent the probability of making a Type I error?

Question

P-values do not represent the probability of making a Type I error. This is well known. We know that a p-value is the probability of seeing an effect as extreme in the universe of the null hypothesis.

However, I have read that alpha is the highest probability of making a Type I error that we're willing to accept.

This doesn't sit well with me. How can we compare a p-value (which DOES NOT represent the probability of making a Type I error) with an alpha value (which represents the probability of making a Type I error)?

score 4 · Answer 1 · answered Mar 25 '16 at 04:52

It may be easier to understand the relationship between the probability of a Type I error and the p-value if we think about the alternative way of conducting hypothesis tests via a rejection region (which is equivalent to rejecting based on a p-value).

To keep this simple, suppose we are considering the one-sided hypothesis test $H_0: \mu \le \mu_0$ vs $H_1: \mu>\mu_0$ in the case where our data consist of the single observation $X$, which has a $Normal(\mu,1)$ distribution. Consequently, we will use $X$ as our test statistic, which at the boundary of the null hypothesis ($\mu = \mu_0$) has a $Normal(\mu_0,1)$ distribution.

When we want to conduct this test via a rejection region, we basically say that we will reject the null hypothesis if our test statistic is too big, and to make this an $\alpha$-level test we want the probability of rejecting when the null hypothesis is true to be $\alpha$. We calibrate at the boundary $\mu = \mu_0$, because that is where the rejection probability is largest among all $\mu \le \mu_0$. This implies that for some $k$,
\begin{align*} P(\hbox{reject} \mid H_0 \hbox{ is true}) &= \alpha\\ \Rightarrow P(X>k \mid \mu = \mu_0) &=\alpha\\ \Rightarrow P(X-\mu_0 >k-\mu_0 \mid \mu = \mu_0) &= \alpha\\ \Rightarrow P(Z >k-\mu_0) &= \alpha, \end{align*}
where $Z$ is a standard normal random variable (i.e. $Z \sim Normal(0,1)$). In order for $P(Z >k-\mu_0) = \alpha$, $k$ must be selected such that $k-\mu_0$ is the $100(1-\alpha)$th percentile of the $N(0,1)$ distribution, which is usually denoted $Z_\alpha$. This in turn makes $k = Z_{\alpha} + \mu_0$, the $100(1-\alpha)$th percentile of $N(\mu_0,1)$. So, if we reject when $X-\mu_0 >Z_\alpha$ (or equivalently when $X >Z_\alpha+\mu_0$), then our test will be an $\alpha$-level test.
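As a quick numerical check of this construction, here is a minimal Python sketch using only the standard library. The function names `critical_value` and `rejection_probability`, and the values $\mu_0 = 0$ and $\alpha = 0.05$, are illustrative assumptions and not part of the derivation itself.

```python
from statistics import NormalDist


def critical_value(mu0: float, alpha: float) -> float:
    """Return k such that P(X > k | mu = mu0) = alpha for X ~ Normal(mu0, 1)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # 100(1 - alpha)th percentile of N(0, 1)
    return mu0 + z_alpha


def rejection_probability(mu: float, k: float) -> float:
    """Return P(X > k) when X ~ Normal(mu, 1)."""
    return 1 - NormalDist(mu, 1).cdf(k)


# Hypothetical values chosen only for illustration.
mu0, alpha = 0.0, 0.05
k = critical_value(mu0, alpha)
print(round(k, 3))                              # 1.645
print(round(rejection_probability(mu0, k), 3))  # 0.05
```

With these values, the rejection rule $X > k$ has rejection probability $\alpha$ when $\mu = \mu_0$, which is the defining property of the test.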

We can see how this informs our understanding of the p-value by looking at its definition. Let $X_{obs}$ represent our observed data (as compared to $X$, which is a random variable that represents the possible values before we actually observe the data). Then in this case, the probability of seeing data "as extreme" under the null hypothesis is
\begin{align*} \hbox{p-value} &= P(X\ge X_{obs} \mid \mu =\mu_0) \\ &=P(X-\mu_0\ge X_{obs}-\mu_0 \mid \mu =\mu_0) \\ &=P(Z\ge X_{obs} -\mu_0). \end{align*}
If we think about this probability as a function of $X_{obs}$, then we can ask, "what value of $X_{obs}$ would make our p-value equal to $\alpha$?", the threshold below which we would reject the null hypothesis. Given our last line, we see that if $X_{obs} -\mu_0 = Z_{\alpha}$ (which implies that $X_{obs} = Z_\alpha+\mu_0$), then our p-value will be $\alpha$. Because the p-value decreases as $X_{obs}$ increases, a p-value less than $\alpha$ implies that $X_{obs} >Z_\alpha+\mu_0$, and this is identical to our test conducted via a rejection region!
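The equivalence of the two decision rules can be checked numerically. The sketch below is only an illustration under assumed values ($\mu_0 = 0$, $\alpha = 0.05$, and an arbitrary grid of observations); the helper names are hypothetical.

```python
from statistics import NormalDist


def p_value(x_obs: float, mu0: float) -> float:
    """One-sided p-value P(Z >= x_obs - mu0) for a single Normal(mu, 1) observation."""
    return 1 - NormalDist().cdf(x_obs - mu0)


def rejects_by_region(x_obs: float, mu0: float, alpha: float) -> bool:
    """Reject when x_obs exceeds the critical value Z_alpha + mu0."""
    return x_obs > mu0 + NormalDist().inv_cdf(1 - alpha)


def rejects_by_p_value(x_obs: float, mu0: float, alpha: float) -> bool:
    """Reject when the p-value falls below alpha."""
    return p_value(x_obs, mu0) < alpha


# Hypothetical values: observations from -3.0 to 4.0 in steps of 0.1.
mu0, alpha = 0.0, 0.05
grid = [i / 10 for i in range(-30, 41)]
agree = all(
    rejects_by_region(x, mu0, alpha) == rejects_by_p_value(x, mu0, alpha)
    for x in grid
)
print(agree)  # True
```

Both rules make the same decision at every point on the grid, as the algebra predicts.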

So, when we compare the p-value to our significance level, it is because of this equivalence to the test via a rejection region, and because we constructed that test by requiring that the probability of a Type I error be less than or equal to $\alpha$.
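To see why the requirement reads "less than or equal to $\alpha$", the following sketch evaluates the rejection probability at several values of $\mu$ inside the composite null $\mu \le \mu_0$. The specific values of $\mu$, $\mu_0 = 0$, and $\alpha = 0.05$ are assumptions chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical setup: H0 is mu <= 0, tested at alpha = 0.05 with one Normal(mu, 1) draw.
mu0, alpha = 0.0, 0.05
k = mu0 + NormalDist().inv_cdf(1 - alpha)  # critical value Z_alpha + mu0

# Type I error probability P(X > k) at several true means inside the null.
null_means = [-3.0, -1.0, -0.5, 0.0]
probs = [1 - NormalDist(mu, 1).cdf(k) for mu in null_means]

for mu, prob in zip(null_means, probs):
    print(f"mu = {mu:5.1f}  P(reject) = {prob:.6f}")
```

The probability of a Type I error grows as $\mu$ approaches $\mu_0$ and reaches $\alpha$ only at the boundary, so $\alpha$ acts as an upper bound over the whole null hypothesis.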

I hope that makes things clearer!

Kelvin · Answer 2 · 2016-03-25T02:07:16.580

1

Alpha = probability of rejecting the null hypothesis given that the null hypothesis is true, so if the null hypothesis is true, then alpha is the probability of making a type I error by incorrectly rejecting the null.

However, if the null hypothesis is false, then the probability of making a type I error is zero, because we can't make a type I error by rejecting the null when it is false.

Now if the null hypothesis may or may not be false (i.e., we don't know), then the probability of making a type I error must lie between these two extremes of alpha and zero. So in that sense, yes, alpha is the highest probability that we're willing to accept making a type I error BEFORE we perform the test.

BUT, note that this isn't the same as the probability of having made a type I error given that we rejected the null AFTER performing the test, which is the false discovery rate and lies between 0 (if the null is false) and 1 (if the null is true).
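The distinction can be made concrete with a small sketch. It assumes, purely for illustration, a setting in which a fraction `pi0` of tested null hypotheses are true and tests of false nulls reject with probability `power`. Neither quantity appears above, and treating the truth of the null as random is an extra modelling assumption beyond the usual frequentist setup.

```python
def false_discovery_rate(pi0: float, alpha: float, power: float) -> float:
    """P(null is true | null was rejected) under the assumed mixture.

    pi0   : assumed fraction of tested nulls that are true (hypothetical)
    alpha : rejection probability when the null is true
    power : rejection probability when the null is false (hypothetical)
    """
    false_rejections = pi0 * alpha
    true_rejections = (1 - pi0) * power
    return false_rejections / (false_rejections + true_rejections)


alpha, power = 0.05, 0.8  # illustrative values only
print(false_discovery_rate(0.0, alpha, power))            # 0.0 (null always false)
print(false_discovery_rate(1.0, alpha, power))            # 1.0 (null always true)
print(round(false_discovery_rate(0.5, alpha, power), 3))  # 0.059
```

Under these assumptions the post-rejection probability depends on `pi0` and `power` as well as on $\alpha$, which is why it is not the same quantity as the pre-test Type I error rate.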

edited Mar 25 '16 at 02:07

answered Mar 25 '16 at 01:45

Kelvin


(1) The O.P. gets it right (at least initially): for many composite hypotheses, $\alpha$ is *not* the probability of rejecting the null conditional on the null being true. It is only the *supremum* of those probabilities (the "highest probability"). (2) Your logic departs from the traditional account by supposing there is some intermediate probability. It seems like it's drifting into a vague Bayes-like analysis. For instance, when the null hypothesis is simple, then $\alpha$ really is the chance of a type I error when the null is true and otherwise the chance is zero: it is *never* in between. – whuber Apr 14 '16 at 15:41
If alpha is the p-value below which the null is rejected, and the p-value is the probability that a data point is at least as extreme given the null is true, then why isn't alpha the probability that the null is rejected given that the null is true? If you consider the CDF of the null, then alpha does indeed represent the probability that it will be rejected if it is true. I know this is not the normal way it is phrased, but that is the logical consequence, is it not? – Kelvin Apr 14 '16 at 19:38
The "null hypothesis" typically is a set of distributions, not a single distribution. For instance, in the classical Normal theory setting the set of distributions is $\mathcal{N}(\mu,\sigma^2)$ and the "null hypothesis" $H_0$ might be the subset of these for which $\mu \le 0$. The critical region $\mathcal{R}$ is constructed so that *no matter what value $\mu$ might have in $H_0$,* the chance of the test statistic being in $\mathcal{R}$ does not exceed $\alpha$. Consider the case where (say) the true value of $\mu$ is $-100$: there will be practically no chance of rejection. – whuber Apr 14 '16 at 22:01
Isn't a "set of distributions" just a single combined (mixed) distribution of distributions given that we don't know which individual component distribution applies under uncertainty? Why consider them separately? – Kelvin Apr 14 '16 at 22:37
They *must* be considered separately in this situation because the distribution of the sample statistic is not the same for every member of the null hypothesis. For more information about this, look at the theory of UMP tests. – whuber Apr 14 '16 at 22:48
Ok, thanks, that stuff is way beyond me and my intuition. Should I delete my answer? I don't mind, if it is wrong or misleading. How should I do that? – Kelvin Apr 14 '16 at 23:15
