I am a computer scientist, so I have some background in maths (however limited). I am reading about the posterior distribution here: http://en.wikipedia.org/wiki/Posterior_distribution .
It says there:
> The posterior probability is the probability of the parameters θ given the evidence X: p(θ|X). It contrasts with the likelihood function, which is the probability of the evidence given the parameters: p(X|θ).
My questions are: firstly, can you provide a very simple example to help me understand these concepts better? And secondly, in machine learning, isn't what we want the probability p(Class|x)?
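To make my question concrete, here is a minimal sketch (in Python, with made-up numbers for a toy coin example; the two-hypothesis setup and the probabilities are my own assumptions, not from the article) of how I currently understand the two quantities:

```python
# Toy example: two hypotheses about a coin, theta in {"fair", "biased"}.
# We observe the evidence x = 3 heads in a row.

priors = {"fair": 0.5, "biased": 0.5}    # p(theta): assumed equal before seeing data
p_heads = {"fair": 0.5, "biased": 0.9}   # per-flip probability of heads under each theta

heads_observed = 3

# Likelihood p(x | theta): probability of the evidence given each parameter value.
# Note it is a function of theta and need not sum to 1 over theta.
likelihood = {theta: p_heads[theta] ** heads_observed for theta in priors}

# Posterior p(theta | x) via Bayes' rule: prior times likelihood, normalised by p(x).
unnormalised = {theta: priors[theta] * likelihood[theta] for theta in priors}
evidence = sum(unnormalised.values())    # p(x), the marginal probability of the data
posterior = {theta: unnormalised[theta] / evidence for theta in unnormalised}

print("likelihood p(x|theta):", likelihood)
print("posterior  p(theta|x):", posterior)
```

Is this the right way to think about it, i.e. the likelihood fixes the data and varies θ, while the posterior is a proper probability distribution over θ after normalising?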
Thanks a lot