I understand conditional probability whereby I can use known or unknown variables. In terms of the variables from this question, for instance, I can safely get the favourable probability by using the formula

$$ P(a,b \mid c) = P(a \mid c,b)\,P(c \mid b) $$

Where

  • event a = goals conceded so far by a team
  • event b = goals scored so far by a team
  • event c = total wins by a team

These events are used in the context of each of the competing sides we seek to compare.

The equation above would then be given as

goals conceded so far by team A × team A total wins ... (8.8)

goals conceded so far by team B × team B total wins ... (7.7)

Then I intend to find the probability of a loss for team A using $P(a|b) = P(a)$, which would evaluate to

1 - 8.8 = -7.8

which is obviously impossible. How do I correlate independent events when all prior probabilities have already been found? Or is there another formula I should use instead?

I Want Answers
  • What are events $a$, $b$ and $c$? – LmnICE Aug 02 '17 at 04:27
  • @LmnICE I've added those details to the original post so it's easily accessible to anyone else trying to help. Thank you – I Want Answers Aug 02 '17 at 06:23
  • Your previous question was put on hold as unclear. This one is also unclear. What is your data? How do you know that it's independent? How did you calculate those values? – Tim Aug 02 '17 at 06:42
  • @Tim I clearly posted the data in that question. I don't know what format you expect the data to be in, but for your convenience, I'm [reposting it here](https://pastebin.com/JeeYWbLP). I know they are independent because Team B's results or wins do not affect team A's wins or losses. Those variables still stand for team A even if they were playing against another team. I calculated the values using the RHS of Bayes' formula $P(a \mid z,b)\,P(z \mid b)$. The variables I used are in my original post. Please don't block this question as well – I Want Answers Aug 02 '17 at 06:50
  • Your data does not contain any information about conditional probabilities so where did you get the conditional probabilities? – Tim Aug 02 '17 at 06:57
  • The conditional probabilities stem from the row "team A/B total wins". The other rows rely or submit to that row i.e. the eventual outcome of how many goals they score or concede is channeled and reflected in their total percentage win – I Want Answers Aug 02 '17 at 07:08
  • This has nothing to do with conditional probabilities! – Tim Aug 02 '17 at 07:59
  • @Tim What has it got something to do with? Help me friend. We're here to guide each other in the right direction right? I thought they are conditional since it's those other rows that dictate what the eventual percentage total win is – I Want Answers Aug 02 '17 at 08:06
  • You have probabilities of 8.8 and 7.7. These are impossible. Probabilities are between 0 and 1. – Peter Flom Aug 02 '17 at 11:13
  • As stated by others, your question is confusing. Clearly you want to find out what are the chances of a loss for Team A. So you need to think about how your data (number of goals scored and conceded, number of wins and percentage of home wins) relates to the probability of a loss. For example, clearly the more goals a team has scored, the lower the probability of a loss, but what is the exact relationship between those two quantities? In other words, you need a model. – LmnICE Aug 02 '17 at 18:19

1 Answer

You seem to be misunderstanding the notation that you are using.

$$ P(A,B \mid C) = P(A \mid C,B)\;P(C \mid B) $$

in plain English, the left-hand side is the probability that $A = x$ and $B = y$ given that $C = z$, i.e.

$$ P(A=x,B=y\,|\,C=z) $$

So this is the probability that the number of "goals conceded so far by a team" is equal to $x$ and, at the same time, the "number of goals scored so far by a team" is $y$, given that the "total wins by a team" is $z$. The same holds for $P(A \mid C,B)$ and $P(C \mid B)$: they should be probabilities, not "numbers of goals" or other raw values.
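
A side note on the identity itself, added as a reminder rather than as part of the original argument: the standard chain rule for two events given a third conditions both factors on $C$, which differs from the right-hand side quoted above. It follows directly from the definition of conditional probability, assuming $P(B,C) > 0$:

$$ P(A,B \mid C) = \frac{P(A,B,C)}{P(C)} = \frac{P(A,B,C)}{P(B,C)} \cdot \frac{P(B,C)}{P(C)} = P(A \mid B,C)\;P(B \mid C) $$

Every factor here is a probability between 0 and 1, so the product can never come out as a value such as 8.8.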

So your data should consist of a table with different values $x,y,z$ and accompanying probabilities. What you present as your "data" has nothing to do with it. You do not have information about any of these probabilities.
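
To make the idea of "a table with values $x,y,z$ and accompanying probabilities" concrete, here is a minimal Python sketch. The joint table and the helper names (`joint`, `prob`, `p_ab_given_c`, and so on) are invented for illustration and are not derived from the question's dataset; the sketch only shows how conditional probabilities would be read off such a table if one were available.

```python
from fractions import Fraction

# Hypothetical joint distribution P(A=x, B=y, C=z). The numbers are invented
# for illustration only; they do NOT come from the question's data.
# Key: (goals conceded x, goals scored y, wins z) -> probability
joint = {
    (0, 1, 1): Fraction(3, 10),
    (0, 0, 1): Fraction(1, 10),
    (1, 1, 1): Fraction(1, 10),
    (1, 1, 0): Fraction(1, 5),
    (1, 0, 0): Fraction(3, 10),
}
# A valid probability table must sum to exactly 1.
assert sum(joint.values()) == 1


def prob(a=None, b=None, c=None):
    """Probability of the fixed coordinates; None means 'any value'."""
    return sum(
        p
        for (x, y, z), p in joint.items()
        if (a is None or x == a)
        and (b is None or y == b)
        and (c is None or z == c)
    )


def p_ab_given_c(a, b, c):
    """P(A=a, B=b | C=c), assuming P(C=c) > 0."""
    return prob(a, b, c) / prob(c=c)


def p_a_given_bc(a, b, c):
    """P(A=a | B=b, C=c), assuming P(B=b, C=c) > 0."""
    return prob(a, b, c) / prob(b=b, c=c)


def p_b_given_c(b, c):
    """P(B=b | C=c), assuming P(C=c) > 0."""
    return prob(b=b, c=c) / prob(c=c)


# Chain rule check: P(A,B | C) = P(A | B,C) * P(B | C)
lhs = p_ab_given_c(1, 1, 1)
rhs = p_a_given_bc(1, 1, 1) * p_b_given_c(1, 1)
print(lhs, rhs)
```

Using `Fraction` keeps the arithmetic exact, so both sides of the chain rule can be compared with `==` instead of a floating-point tolerance.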

You definitely should check the Wikipedia article on conditional probabilities and refer to some probability handbook before proceeding further.

Tim
  • But it does, and I said so in my original post: event a = goals conceded so far by a team (team A or B); event b = goals scored so far by a team (team A or B); event c = total wins by a team (team A or B). The values are substituted in both equations. – I Want Answers Aug 02 '17 at 08:30
  • How else do you propose I go about finding probabilities when I'm not given the total number of outcomes but my current sample data? – I Want Answers Aug 02 '17 at 08:31
  • @IWantAnswers But the number of goals is **not** the same as the probability of scoring that number of goals! Bayes' theorem can be applied to *probabilities*, not to arbitrary numbers. If you don't have the data on probabilities, then you don't have the data, period. You cannot use Bayes' theorem here. – Tim Aug 02 '17 at 08:59
  • OK. My data source has no business with probabilities, if they even know of its existence. That's why I tried deriving it by myself using the row with percentage values to fill the probability spot in Bayes' equation. What concept or formula do you suggest I use to find any probabilities or arrive at any certainty, with the dataset I've shown? – I Want Answers Aug 02 '17 at 09:16
  • Probabilities cannot be "filled". Either you have data or not. Your question is unclear (as was the previous one) since you assume the readers know things about your data and your problem that are not stated in your question. What you are showing us is very limited data, and to understand it we would need to know more about it (as stated in the comments to the other question). You want to calculate from it something that cannot be calculated, at least not with the information you provided in the question. – Tim Aug 02 '17 at 09:20
  • All rows as I have received (unadulterated) in the dataset can be found [here](https://pastebin.com/knCeh0VS). All the rows that end with the number '1' represent the opposing/away/visiting team e.g wins1 is the same as 'wins for the visiting team'. What probabilities can you derive from that? NOTE: I don't intend using all those columns at the same time, just the ones I know will influence the eventual win/loss for either team. They are also the same rows I posted in my original question – I Want Answers Aug 02 '17 at 09:53
  • Or does the dataset present an impasse instead of a solution, as discussed in [these answers](https://stats.stackexchange.com/questions/222179/how-to-know-that-your-machine-learning-problem-is-hopeless)? – I Want Answers Aug 02 '17 at 10:23
  • I don't know what your data is about, so I cannot help. You should probably start with some kind of probability & statistics handbook and you should try to define your problem more precisely. – Tim Aug 02 '17 at 12:18
  • I said the data is about football or soccer. The rows define statistics about two teams participating in a match or fixture. The rows contain various information about the pair against other teams in the league up till the moment the data is being served. You keep complaining about my vagueness and lack of clarity. I don't know what language you want me to explain in. I've posted the entire data and repeated myself over and over to the point where it seems I'm spamming the forum. Is it that you don't understand how football works? Or you don't get the row names? Or you don't want to help? – I Want Answers Aug 02 '17 at 19:45
  • You are being **rude**. Your question was about applying Bayes' theorem to non-probabilities. My answer was simple: you can't do this. I answered your question. Now you post your data and ask "how to analyze this". Such a question is broad and unanswerable without context (what is the data [not only what are the column names]; what are the aims of your analysis etc.). Please check: https://stats.meta.stackexchange.com/questions/1479/how-to-ask-a-good-question-on-crossvalidated – Tim Aug 02 '17 at 20:39
  • The aim of my analysis is to find any possible probability among the data I sent. I'm analyzing to find all likely outcomes in a match between those two teams based on their record or data or statistics against other teams in the league – I Want Answers Aug 02 '17 at 21:22
  • Probabilities of what exactly? It sounds like a logistic regression problem. Are you familiar with logistic regression? If not, then you should start with some kind of statistics handbook. – Tim Aug 02 '17 at 21:42
  • Probabilities as basic as 'team A has a 38% chance of beating team B' head to head, based on the facts that though both teams have conceded the same number of goals against other teams, team A has scored more goals against other teams thereby having a greater percentage of (row) 'wins' – I Want Answers Aug 02 '17 at 22:39
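
The logistic regression suggested in the comments above can be sketched as follows. This is only an illustration under strong assumptions: the training rows in `matches` are invented, the single predictor (goal difference so far) is a hypothetical choice, and the plain gradient-descent fit stands in for whatever statistics package one would normally use. It needs per-match outcomes, which the question's summary rows do not provide.

```python
import math

# Hypothetical training data, invented for illustration only:
# (goal difference of a team before the match, 1 if that team won else 0)
matches = [
    (-8, 0), (-5, 0), (-3, 0), (-2, 0), (-1, 1),
    (0, 0), (2, 1), (4, 1), (6, 1), (9, 1),
]


def sigmoid(t):
    """Logistic function; maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def fit_logistic(rows, lr=0.05, steps=5000):
    """Fit P(win | x) = sigmoid(w * x + b) by batch gradient descent
    on the average negative log-likelihood."""
    w, b = 0.0, 0.0
    n = len(rows)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in rows:
            err = sigmoid(w * x + b) - y  # prediction error for this row
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


w, b = fit_logistic(matches)


def win_probability(goal_diff):
    """Model-based estimate of the win probability for a given goal difference."""
    return sigmoid(w * goal_diff + b)


for gd in (-5, 0, 5):
    print(gd, round(win_probability(gd), 3))
```

Unlike multiplying raw counts, the fitted model always returns values strictly between 0 and 1, and a statement such as "team A has a 38% chance of beating team B" would correspond to evaluating `win_probability` at that team's predictor value.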