Is this a correct way to continually update a probability using Bayes' Theorem?

Question

Let's say I'm trying to find out the probability that someone's favorite ice cream flavor is vanilla.

I know that the person also enjoys horror movies.

I want to find out the probability that the person's favorite ice cream is vanilla given that they enjoy horror movies.

I know the following things:

$5\%$ of people choose vanilla as their favorite ice cream flavor. (This is my $P(A)$.)
$10\%$ of people whose favorite is vanilla ice cream also love horror movies. (This is my $P(B|A)$.)
$1\%$ of people whose favorite is not vanilla ice cream also love horror movies. (This is my $P(B|\lnot A)$.)

So, I calculate it like this: $$P(A|B)=\frac{0.05\times0.1}{(0.05 \times 0.1)+(0.01 \times(1-0.05))}$$ I find that $P(A|B) = 0.3448$ (rounded to the nearest ten-thousandth). There is a $34.48\%$ chance that a horror movie fan's favorite ice cream flavor is vanilla.
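The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `bayes_update` and its parameter names are illustrative choices, not something from the original post.

```python
def bayes_update(prior, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) given P(A), P(B | A) and P(B | not A)."""
    numerator = prior * p_b_given_a
    # Total probability of the evidence B over both cases (A and not A).
    evidence = numerator + (1 - prior) * p_b_given_not_a
    return numerator / evidence


# Numbers from the question: 5% vanilla, 10% vs 1% love horror movies.
posterior = bayes_update(prior=0.05, p_b_given_a=0.10, p_b_given_not_a=0.01)
print(round(posterior, 4))  # 0.3448
```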

But then I learn that the person has seen a horror movie in the past 30 days. Here's what I know:

$34.48\%$ is the updated posterior probability that vanilla is the person's favorite ice cream flavor -- the $P(A)$ in this next problem.
$20\%$ of people whose favorite is vanilla ice cream have seen a horror movie in the past 30 days.
$5\%$ of people whose favorite is not vanilla ice cream have seen a horror movie in the past 30 days.

This gives: $$\frac{0.3448\times0.2}{(0.3448\times0.2)+(0.05\times(1-0.3448))} = 0.6779$$ when rounded.

So now I believe there is a $67.79\%$ chance that the horror movie fan's favorite ice cream flavor is vanilla, given that they've seen a horror movie in the past 30 days.

But wait, there is another thing. I also learned that the person owns a cat.

Here's what I know:

$67.79\%$ is the updated posterior probability that vanilla is the person's favorite ice cream flavor -- the $P(A)$ in this next problem.
$40\%$ of people whose favorite is vanilla ice cream also own cats.
$10\%$ of people whose favorite is not vanilla ice cream also own cats.

This gives: $$\frac{0.6779\times0.4}{(0.6779\times 0.4)+(0.1\times(1-0.6779))} = 0.8938$$ when rounded.
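For reference, the whole chain of updates can be reproduced in a short loop. This is a sketch of the arithmetic only, with illustrative names (`bayes_update`, `evidence`, `history`) that are not from the post. It does not by itself show that chaining the updates is valid.

```python
def bayes_update(prior, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) given P(A), P(B | A) and P(B | not A)."""
    numerator = prior * p_b_given_a
    return numerator / (numerator + (1 - prior) * p_b_given_not_a)


# (P(B | vanilla), P(B | not vanilla)) for each fact, in the order learned.
evidence = [
    (0.10, 0.01),  # loves horror movies
    (0.20, 0.05),  # saw a horror movie in the past 30 days
    (0.40, 0.10),  # owns a cat
]

belief = 0.05  # starting prior P(vanilla)
history = []
for p_b_given_a, p_b_given_not_a in evidence:
    # Yesterday's posterior becomes today's prior.
    belief = bayes_update(belief, p_b_given_a, p_b_given_not_a)
    history.append(belief)

print([round(p, 4) for p in history])
```

Carrying full precision between steps gives roughly 0.3448, 0.6780 and 0.8939. The values 0.6779 and 0.8938 above differ in the last digit because each intermediate posterior was rounded before being reused.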

My question basically boils down to this: Am I correctly updating probability using Bayes' theorem? Am I getting anything else wrong in my methods?

love = favorite? you're not posting degrees of loving. if you love it, it is your favorite. clarify if needed. — generic_user, Dec 19 '12 at 09:10
Good point. I changed "love" to "favorite." It's not grammatically correct, but it's less wordy than saying "choose vanilla for their favorite ice cream flavor." I hope that clears things up. — user1626730, Dec 19 '12 at 14:16

score 7 · Accepted Answer · answered Dec 19 '12 at 14:49

This is not correct. Sequential updating of this type only works when the information you are receiving sequentially is independent (e.g. iid observations of a random variable). If the observations are not independent of one another, as in this case, you need to consider their joint probability distribution. The correct way to update would be to go back to the prior, find the joint probability that someone loves horror movies, has seen a horror movie in the last 30 days, and owns a cat, given that they do or do not choose vanilla as their favorite ice cream flavor, and then update in a single step.

Updating sequentially like this when your data are not independent will rapidly drive your posterior probability much higher or lower than it ought to be.
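To make the difference concrete, here is a sketch comparing the two approaches for the first two facts. The prior and the marginal likelihoods come from the question. The joint likelihoods `P_BOTH_GIVEN_VANILLA` and `P_BOTH_GIVEN_OTHER` are invented purely for illustration (the question does not supply them), chosen so that in both groups 80% of horror movie lovers have seen one recently.

```python
PRIOR = 0.05  # P(vanilla), from the question

# Marginal likelihoods from the question: (P(B | vanilla), P(B | not vanilla)).
LOVES_HORROR = (0.10, 0.01)
SAW_RECENTLY = (0.20, 0.05)

# HYPOTHETICAL joint likelihoods, NOT given in the question:
# P(loves horror AND saw one recently | vanilla), and the same for non-vanilla.
P_BOTH_GIVEN_VANILLA = 0.08
P_BOTH_GIVEN_OTHER = 0.008


def bayes_update(prior, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) given P(A), P(B | A) and P(B | not A)."""
    numerator = prior * p_b_given_a
    return numerator / (numerator + (1 - prior) * p_b_given_not_a)


# Sequential: chain two updates as if the facts were conditionally independent.
sequential = bayes_update(bayes_update(PRIOR, *LOVES_HORROR), *SAW_RECENTLY)

# Joint: a single update from the original prior using the joint likelihoods.
joint = bayes_update(PRIOR, P_BOTH_GIVEN_VANILLA, P_BOTH_GIVEN_OTHER)

print(f"sequential: {sequential:.4f}, joint: {joint:.4f}")
```

Under these assumed numbers the sequential result is about 0.68 while the joint result stays near 0.34. Once you know the person loves horror movies, learning that they saw one recently tells you nothing further about their ice cream preference, yet the sequential method counts it as fresh evidence.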

Jonathan Christensen

What do you mean by "when the information you are receiving sequentially is independent"? If you mean "independent of the event you're trying to predict," do you know how I can tell if the info I'm getting is independent? – user1626730 Dec 19 '12 at 15:19
Conditionally independent given the event you are trying to predict. If they were independent of the event you're trying to predict then they wouldn't do you any good. As for how you can tell--you have to think about what your data is. In this case, whether someone has watched a horror film in the last 30 days is clearly not independent of whether they love horror films. – Jonathan Christensen Dec 19 '12 at 15:32
When you say "conditionally independent," I'm guessing you mean that the P(B)s (i.e., horror-movie-loving, cat-ownership) aren't related to one another? If so, wouldn't the cat-ownership variable be independent of the horror-movie-loving? – user1626730 Dec 19 '12 at 15:47
Yes, you can make an argument that cat-ownership is independent of horror-movie-loving. It's not, necessarily, though--e.g., maybe women are both more likely to love cats and less likely to love horror movies. – Jonathan Christensen Dec 19 '12 at 16:18
Hm, I'm not quite sure what you mean by adding in that bit about women and cats. Could you explain further, please? – user1626730 Dec 19 '12 at 16:29
Suppose, in a hypothetical world, only women love cats and only men love horror movies. Then even though loving cats and loving horror movies seem to have nothing to do with each other, they are not independent. – Jonathan Christensen Dec 19 '12 at 16:44
I think I am starting to understand. In your hypothetical scenario, if all cat-lovers are women and all horror movie fans are men, then it is impossible to be both a cat-lover and a horror movie fan at the same time. And if I did my Bayesian updating the way I did in my original question, then I would fail to take into account that very crucial fact.... Instead, I should find out what percentage of vanilla ice cream favoritists are cat-owning horror movie fans. That way, I'm more likely to minimize the effect of a lurking variable. – user1626730 Dec 19 '12 at 17:11
`percentage of vanilla ice cream favoritists are cat-owning horror movie fans` would be my new P(A), I'm guessing? – user1626730 Dec 19 '12 at 17:22
P(A) is still the prior probability that someone favors vanilla ice cream. `percentage of vanilla ice cream favoritists who are cat-owning horror movie fans` will be your new P(B|A). Otherwise your understanding is correct. In the hypothetical case, if you assumed that cat-ownership and horror-movie-fanship were independent, then you would be updating twice using what is essentially the same information, which will make your answer more extreme than it should be. – Jonathan Christensen Dec 19 '12 at 17:25
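
The "updating twice on the same information" effect is easy to see in odds form, where each update multiplies the odds by a likelihood ratio. The sketch below uses the horror-movie numbers from the question and considers the extreme case where the second fact is an exact repeat of the first. The helper names `to_odds` and `to_prob` are illustrative.

```python
def to_odds(p):
    """Convert a probability to odds in favor."""
    return p / (1 - p)


def to_prob(odds):
    """Convert odds in favor back to a probability."""
    return odds / (1 + odds)


prior_odds = to_odds(0.05)        # P(vanilla) = 5%, from the question
likelihood_ratio = 0.10 / 0.01    # P(B | A) / P(B | not A) for loving horror

# Counting the evidence once, as it should be.
once = to_prob(prior_odds * likelihood_ratio)

# Counting the very same evidence twice squares the likelihood ratio.
twice = to_prob(prior_odds * likelihood_ratio ** 2)

print(f"once: {once:.4f}, twice: {twice:.4f}")
```

Counting the fact once gives about 0.34; counting it twice pushes the posterior to about 0.84, even though nothing new was learned.
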
Whoops, meant to type P(B|A) instead of P(A) for that one. So, the only remaining question I have is still on figuring out if the pieces of data I'm looking at (each P(B)) are independent of one another. I know you mentioned that seeing a horror movie in the past 30 days and being a fan of horror movies are, of course, very related to each other -- and thus, not independent. That example seems obvious, though I wonder if there are other guidelines for seeing if my P(B)'s are independent of one another... – user1626730 Dec 19 '12 at 17:32
...For instance, what if I don't know of any relation between cat ownership and horror movie fanship? Or, what if it is possible to be both a cat owner and horror movie fan? – user1626730 Dec 19 '12 at 17:35
If you can make a strong argument for why you think two things should be independent then you can probably go ahead and treat them as independent. Otherwise, you're better off using the joint distribution. If they are "close" to being independent then it won't make a huge difference; the more dependent they are, the more wrong you will be if you assume independence. – Jonathan Christensen Dec 19 '12 at 17:40
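
If you happen to have counts for both facts within a group, one rough check is to compare the observed joint frequency with the product of the two marginal frequencies; under independence they should be close. The sketch below uses made-up counts for 1000 vanilla fans (hypothetical, not data from this thread), and the function name `independence_gap` is an illustrative choice. For conditional independence the same check would have to be repeated in the non-vanilla group.

```python
def independence_gap(n_both, n_first_only, n_second_only, n_neither):
    """Return (P(B1 and B2), P(B1) * P(B2)) within a single group."""
    total = n_both + n_first_only + n_second_only + n_neither
    p_first = (n_both + n_first_only) / total
    p_second = (n_both + n_second_only) / total
    p_joint = n_both / total
    return p_joint, p_first * p_second


# HYPOTHETICAL counts among 1000 vanilla fans (illustration only):
# loves horror and saw one recently, loves only, saw only, neither.
p_joint, p_product = independence_gap(80, 20, 120, 780)
print(f"joint: {p_joint:.3f}, product of marginals: {p_product:.3f}")
```

Here the joint frequency (0.08) is four times the product of the marginals (0.02), so treating the two facts as independent within this group would be a poor assumption.
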
Right then. In that case, thanks for taking the time to fully explain all that to me. I'll be doing my best to make sure my predicting variables are as independent from one another as possible. – user1626730 Dec 19 '12 at 17:42
