Background
It seems the p-value is not easy to understand, and few people are able to explain it in a simple, intuitive manner. Even after watching YouTube videos and reading articles, I am still not sure what a p-value is.
To be clear, everyone I spoke with at METRICS could tell me the technical definition of a p-value — the probability of getting results at least as extreme as the ones you observed, given that the null hypothesis is correct — but almost no one could translate that into something easy to understand.
It’s not their fault, said Steven Goodman, co-director of METRICS. Even after spending his “entire career” thinking about p-values, he said he could tell me the definition, “but I cannot tell you what it means, and almost nobody can.” Scientists regularly get it wrong, and so do most textbooks, he said. When Goodman speaks to large audiences of scientists, he often presents correct and incorrect definitions of the p-value, and they “very confidently” raise their hand for the wrong answer. “Almost all of them think it gives some direct information about how likely they are to be wrong, and that’s definitely not what a p-value does,” Goodman said.
Objective
To build my understanding of the p-value by trial and error, I would like feedback on what, if anything, is fundamentally wrong in my understanding below.
Criteria $\alpha$ for Highly Unlikely
It is subjective, but we can regard a 2.5% chance of an event happening as "highly unlikely" in the directional (one-tailed) situation, and likewise 5% in the non-directional (two-tailed) situation. We then use this as the criterion $\alpha$ to decide whether an event is an extreme case.
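To see how the two thresholds relate, here is a minimal sketch that converts each $\alpha$ into a cutoff on a standard normal curve. The normal shape and the helper name `upper_cutoff` are assumptions made for illustration; nothing in the post fixes the shape of the distribution.

```python
from statistics import NormalDist


def upper_cutoff(alpha: float, two_tailed: bool) -> float:
    """Return the z-score above which a result counts as "extreme".

    Assumes a standard normal distribution (an illustrative assumption).
    For a two-tailed criterion, alpha is split evenly between both tails.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


one_tailed = upper_cutoff(0.025, two_tailed=False)
two_tailed = upper_cutoff(0.05, two_tailed=True)
print(f"one-tailed, alpha = 2.5%: z > {one_tailed:.3f}")
print(f"two-tailed, alpha = 5%:   |z| > {two_tailed:.3f}")
```

Under this assumption both criteria put the same 2.5% area in the upper tail, so they lead to the same upper cutoff.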
p-value
Suppose there is a distribution $D$ of the sample means of the words cats speak, scored as 0 for myao, -1 for nyau, and 1 for bau. The area of $D$ is normalized to 1 so that a probability can be calculated as the size of an area in $D$.
The probability that a sample mean satisfies $\overline {x} \ge 0.05$ would be $P(\overline {x} \ge 0.05 | D)$. This is the p-value, and it is 1.27%, obtained by calculating the corresponding area in $D$.
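The area calculation can be sketched as follows, assuming $D$ is a normal distribution centered at 0. The post does not give the spread of $D$, so `ASSUMED_STANDARD_ERROR` is a hypothetical value picked only so that the tail area lands near the 1.27% quoted above; the function name is also illustrative.

```python
from statistics import NormalDist

# Hypothetical spread of D; the post does not state it. Chosen only so the
# upper-tail area at 0.05 comes out close to the 1.27% quoted in the text.
ASSUMED_STANDARD_ERROR = 0.02236


def upper_tail_p_value(sample_mean: float,
                       center: float = 0.0,
                       standard_error: float = ASSUMED_STANDARD_ERROR) -> float:
    """Area of D at or above sample_mean, i.e. P(x_bar >= sample_mean | D).

    D is modeled as a normal distribution (an assumption, not from the post).
    """
    d = NormalDist(mu=center, sigma=standard_error)
    return 1.0 - d.cdf(sample_mean)


p_value = upper_tail_p_value(0.05)
print(f"P(x_bar >= 0.05 | D) = {p_value:.2%}")
```

Note that nothing in this calculation mentions a hypothesis: it only needs $D$ and the observed sample mean.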
I would like to clarify that calculating a p-value as a probability is one thing, and comparing the p-value with $\alpha$ to test a hypothesis is another. The p-value is calculated from $D$ and the sample mean $\overline {x}$. Whether or not we use it to test a hypothesis is a different matter.
The articles and YouTube videos I saw always started with the null hypothesis, but how to calculate a p-value can be explained regardless of whether it is used in hypothesis testing.
Using p-value for Hypothesis Testing
Now suppose we have discovered a new island and found a species that speaks the words myao, nyau and bau. So we put forward the hypothesis $H_0$ that they are cats. We let them speak and collect the words they say, and the sample mean is 0.05. The p-value is 1.27%.
As we established, a probability of less than $\alpha$ (2.5%) marks an extreme case. Since 1.27% is below that threshold, we will say they are not cats (reject $H_0$).
$\alpha$ as False Negative Rate in Hypothesis Testing
Even if we take samples from real cats, there is a chance that all or most of them say bau. Then the p-value for the sample mean will be $< \alpha$ and we will say they are not cats, which is a false negative. Hence $\alpha$ is the false negative rate we accept.
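This claim can be checked with a small simulation: draw many samples from real cats and count how often the test wrongly says "not cats". The sketch below rests on assumptions that are not in the post: cats say each of the three words with equal probability, each sample has 100 words, and the distribution of the sample mean is approximated by a normal curve. In standard hypothesis-testing terminology, rejecting a true $H_0$ is called a Type I error (often described as a false positive, with "reject $H_0$" as the positive outcome).

```python
import random
from statistics import NormalDist

WORD_SCORES = (-1, 0, 1)  # nyau, myao, bau


def one_tailed_p_value(sample_mean: float, n_words: int) -> float:
    """Upper-tail p-value under H0, using a normal approximation.

    Assumes each word is equally likely, so one word has mean 0 and
    variance 2/3, and the mean of n_words words has variance (2/3) / n_words.
    """
    standard_error = (2 / 3 / n_words) ** 0.5
    return 1.0 - NormalDist(0.0, standard_error).cdf(sample_mean)


def rejection_rate(alpha: float = 0.025, n_words: int = 100,
                   n_trials: int = 5000, seed: int = 0) -> float:
    """Fraction of samples drawn from real cats that the test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        words = rng.choices(WORD_SCORES, k=n_words)
        sample_mean = sum(words) / n_words
        if one_tailed_p_value(sample_mean, n_words) < alpha:
            rejections += 1
    return rejections / n_trials


if __name__ == "__main__":
    print(f"share of true-cat samples rejected: {rejection_rate():.3f}")
```

Under these assumptions the rejected share should come out close to $\alpha$, though not exactly equal to it, because the sample mean can only take discrete values and the normal curve is an approximation.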