I have been reading James V. Stone's very nice books "Bayes' Rule" and "Information Theory". I want to figure out which sections of the books I did not understand and thus need to re-read. The following notes, which I wrote down, seem to contradict one another:
1. The MLE always corresponds to the uniform prior (the MAP under a uniform prior is the MLE; see the derivation after this list).
2. Sometimes a (proper) uniform prior is not possible (when the parameter lacks an upper or lower bound).
3. Non-Bayesian analysis, which uses the MLE instead of the MAP, essentially sidesteps or ignores the issue of modeling prior information and thus effectively assumes that there is none.
4. Non-informative (also called reference) priors correspond to maximizing the expected Kullback-Leibler divergence between posterior and prior, or equivalently the mutual information between the parameter $\theta$ and the random variable $X$ (see the formula after this list).
5. Sometimes the reference prior is not uniform; it can be a Jeffreys prior instead.
6. Bayesian inference always uses the MAP and non-Bayesian inference always uses the MLE.
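For reference, here is the derivation behind note 1 as I understand it (assuming the parameter space $\Theta$ is bounded so that the uniform prior $p(\theta) = c$ is proper):
$$
\hat{\theta}_{\text{MAP}} = \arg\max_{\theta \in \Theta} p(\theta \mid x) = \arg\max_{\theta \in \Theta} \frac{p(x \mid \theta)\, p(\theta)}{p(x)} = \arg\max_{\theta \in \Theta} p(x \mid \theta) = \hat{\theta}_{\text{MLE}},
$$
since $p(\theta)$ is constant on $\Theta$ and $p(x)$ does not depend on $\theta$.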
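Similarly, the formula I wrote down for note 4 (my notation, which may not match the books exactly) is that the reference prior maximizes the mutual information between $\theta$ and $X$, i.e. the expected KL divergence between posterior and prior:
$$
\pi^{*} = \arg\max_{\pi} I(\theta; X) = \arg\max_{\pi} \mathbb{E}_{X}\!\left[ D_{\mathrm{KL}}\big(p(\theta \mid X) \,\|\, \pi(\theta)\big) \right],
$$
whereas the Jeffreys prior mentioned in note 5 is $\pi(\theta) \propto \sqrt{\det \mathcal{I}(\theta)}$, where $\mathcal{I}(\theta)$ is the Fisher information.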
Question: Which of the above is wrong?
Even if non-Bayesian analysis does not reduce to "always use the MLE", is MLE estimation always a special case of Bayesian inference?
If so, under which circumstances is it a special case (uniform or reference priors)?
Based on the answers to questions [1][2][3][4] on CrossValidated, it seems that note 1 above is correct.
The consensus on a previous question I asked seems to be that non-Bayesian analysis cannot be reduced to a special case of Bayesian analysis. Therefore my guess is that note 6 above is incorrect.