I am trying to figure out how learning works when using probabilistic programming languages. To get a hold on this way of thinking, I am approaching it from several directions.
I understand modelling with neural networks, and I understand how learning works in that context. Now I am trying to figure out the analogue in Bayesian reasoning.
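To make the comparison concrete, this is the kind of learning step I mean on the neural-network side (a toy sketch of my own; none of the names come from any particular library):

```python
import numpy as np

# Toy single-layer softmax classifier: 4 inputs -> 3 classes.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))        # weight matrix
x = rng.normal(size=4)             # input vector
y = np.array([0.0, 1.0, 0.0])      # one-hot target (a categorical distribution)

logits = W @ x
p = np.exp(logits - logits.max())
p /= p.sum()                       # softmax output: again a categorical distribution

# Backpropagation for the cross-entropy loss: dL/dW = (p - y) x^T,
# followed by one gradient-descent step on the weights.
grad = np.outer(p - y, x)
W -= 0.1 * grad
```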
I understand the following:
- the input and output vectors of a neural network correspond to distributions (in particular, categorical distributions)
- the weight matrices correspond to inference, mapping a prior distribution to a posterior distribution
- learning algorithms, such as backpropagation, correspond to... what? (My best guess is in the sketch after this list.)
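My best guess at the Bayesian counterpart is something like the following conjugate-update toy (again a sketch of my own, not taken from any particular probabilistic programming language), where "learning" appears to be the move from a prior over a parameter to a posterior:

```python
# Beta-Bernoulli model: infer a coin's bias from observed flips.
alpha, beta = 1.0, 1.0             # Beta(1, 1) prior over the bias
data = [1, 0, 1, 1, 0, 1]          # observed coin flips

# Conjugate update: the posterior is again a Beta distribution,
# with the counts of heads and tails added to the prior's parameters.
heads = sum(data)
alpha += heads
beta += len(data) - heads

posterior_mean = alpha / (alpha + beta)
print(f"posterior: Beta({alpha:.0f}, {beta:.0f}), mean {posterior_mean:.3f}")
```

But this only updates the parameter of a fixed, hand-written model; it does not tell me where the inference procedure itself comes from, which makes me suspect I am missing something.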
So, my question is: what does learning correspond to in probability-theoretic terminology? More specifically: how does one learn the inference functions themselves?
I might have overlooked something quite simple, maybe even trivial; if so, please forgive the question.