What am I missing in this derivation?

Question

School is closed due to the ongoing pandemic, and I am interested in applying Bayes' theorem to COVID-19.

Here is what I thought. The total population of the U.S. is approximately 327,200,000. $P(tested)$ is the fraction of people in the U.S. who have been tested for COVID-19, which is $\frac{103,945}{327,200,000} = 0.00031768$. $P(tested | infected)$ refers to a person who is known to be infected and is then tested; I took this to be almost 100% because the test method is assumed to indicate accurately whether the result is positive or not. $P(infected | tested)$ is the fraction of tested people who were found to be infected, which is $\frac{14,250}{103,945} = 0.137091731$.

I am looking for $P(infected)$, the fraction of the population that is actually infected.

Then I applied Bayes' theorem:

$$ \because P(infected | tested) = P(tested | infected)*P(infected)/P(tested) $$

$$ \therefore 0.137091731= 1*P(infected)/0.00031768$$

Then $P(infected) = 0.0000435513$

So the number of people actually infected, whether tested or not, is 0.0000435513 times the population of the U.S., which is 14249.98536, or about 14 thousand people. This is very close to the number the CDC released, which is 15,219. (https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html)
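The arithmetic above can be reproduced with a short Python sketch. The counts are the ones quoted in this question; the variable names are only illustrative, and `p_tested_given_infected = 1.0` is the assumption made above, not a measured value.

```python
# Figures quoted in the question (U.S., March 2020).
population = 327_200_000
tested = 103_945
positive = 14_250

p_tested = tested / population                 # P(tested)
p_infected_given_tested = positive / tested    # P(infected | tested)
p_tested_given_infected = 1.0                  # assumption made in the question

# Bayes' theorem rearranged for P(infected):
# P(infected) = P(infected | tested) * P(tested) / P(tested | infected)
p_infected = p_infected_given_tested * p_tested / p_tested_given_infected
estimated_infected = p_infected * population

print(f"P(tested)            = {p_tested:.8f}")
print(f"P(infected | tested) = {p_infected_given_tested:.9f}")
print(f"P(infected)          = {p_infected:.10f}")
print(f"estimated infected   = {estimated_infected:.2f}")
```

Written this way, the number of people tested cancels: the product is `positive / population`, so multiplying by the population returns the 14,250 positive tests, up to rounding.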

But I still feel that there is something wrong with this conclusion. If so, what is wrong? And if Bayes' formula is not what I should use, what is the correct way to make the prediction?

What do you believe is incorrect? I think I disagree with you that $P(T\vert I)=1$, but what do you think is incorrect? — Dave, Mar 23 '20 at 17:50
Perhaps see [this](https://stats.stackexchange.com/questions/455129/trying-to-estimate-disease-prevalence-from-fragmentary-test-results/455133#455133). If the test is "gold standard" so that sensitivity and specificity are both 1, and there is widespread random testing, then you could get a reasonable estimate of the number infected in the US. — BruceET, Mar 23 '20 at 18:37
Hi, @Dave. I don't know, maybe it is too good to be true. My concern is with $P(T|I)$: is it, given that a person is infected, the probability that he or she **will take the test**, or is it, given that a person is infected, the probability that he or she will **test positive**? — YiLuo, Mar 23 '20 at 19:32
Hi, @BruceET. I read your link; that is an interesting analysis. Pardon me for not being able to fully follow your idea in that answer, as I am not in the statistics domain. Correct me if I am wrong: the analysis in your answer uses n = 11500 as the population, so the result $\hat \pi$ will be the fraction of actual patients among those who were tested (i.e. not including those who are infected but have not taken the test). — YiLuo, Mar 23 '20 at 20:13
I took it to mean the probability of conducting the test. If you want to incorporate the probability of having the disease given a positive test, that can be done (though you’ll have to know or make an assumption about the false alarm rate: how many people test positive who don’t have corona). — Dave, Mar 23 '20 at 20:19

score 0 · Answer 1 · answered Mar 24 '20 at 15:58

Many thanks to @Dave in the comments. I think this is it: I misunderstood the meaning of $P(T|I)$. It could be, given that a person is infected, the probability that he or she will take the test; or it could be, given that a person is infected, the probability that he or she will test positive. I took the latter.
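To see why the reading of $P(T|I)$ matters, here is a minimal sketch under stated assumptions. It takes $P(T|I)$ to mean the probability that an infected person gets tested, and it treats every positive test as a true infection (a perfect test). The function name `estimate_infected` and the values 0.5 and 0.1 are hypothetical, chosen only for illustration; they are not estimates of the real testing rate.

```python
def estimate_infected(positive_tests: int, p_tested_given_infected: float) -> float:
    """Estimated total infections, assuming a perfect test.

    From Bayes' theorem, P(infected) = P(infected | tested) * P(tested) / P(tested | infected).
    Multiplying by the population, the tested count cancels and leaves
    positive_tests / P(tested | infected).
    """
    if not 0.0 < p_tested_given_infected <= 1.0:
        raise ValueError("p_tested_given_infected must be in (0, 1]")
    return positive_tests / p_tested_given_infected


positive = 14_250  # positive tests quoted in the question

# 1.0 is the question's assumption; 0.5 and 0.1 are hypothetical alternatives.
for p in (1.0, 0.5, 0.1):
    print(f"P(tested | infected) = {p:.1f} -> estimated infected = {estimate_infected(positive, p):,.0f}")
```

With $P(T|I) = 1$ the estimate can only return the confirmed count, which would explain why the result looked too good to be true. Any smaller value scales the estimate up, so the answer depends entirely on a quantity the data in the question does not provide.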
