The max entropy philosophy says that, given some constraints on the prior, we should choose the prior that has maximum entropy subject to those constraints.
I know that the Beta($\alpha, \beta$) distribution is the max entropy distribution on $[0,1]$ subject to the constraints $\mathbb{E}[ \ln(x) ] = \psi(\alpha) - \psi(\alpha + \beta)$ and $\mathbb{E}[ \ln(1 - x) ] = \psi(\beta) - \psi(\alpha + \beta)$, where $\psi$ is the digamma function. (Reference: https://en.wikipedia.org/wiki/Maximum_entropy_probability_distribution#Other_examples )
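For concreteness, here is a quick numerical sanity check of those two identities (a Python/NumPy sketch; the values $\alpha = 2$, $\beta = 5$ are arbitrary, just for illustration):

```python
import numpy as np
from scipy.special import digamma

# Arbitrary illustrative parameters, not tied to any particular model.
a, b = 2.0, 5.0
rng = np.random.default_rng(0)
x = rng.beta(a, b, size=1_000_000)

# Monte Carlo estimates of E[ln X] and E[ln(1 - X)] against the digamma identities.
print(np.mean(np.log(x)),    digamma(a) - digamma(a + b))   # ~ psi(a) - psi(a+b)
print(np.mean(np.log1p(-x)), digamma(b) - digamma(a + b))   # ~ psi(b) - psi(a+b)
```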
My question is: why might these constraints be reasonable descriptions of a state of knowledge about the world?
The constraint that the domain is $[0,1]$ is clear if we view this as a prior on a binomial or geometric success probability; is there some way to interpret the other two constraints from that angle?
Thoughts: by the law of large numbers, $\mathbb{E}[ \ln(x) ]$ is approximated by $\ln \left( \sqrt[n]{ \prod_{i = 1}^n X_i } \right)$, where the $X_i$ are iid Beta($\alpha, \beta$); so constraining $\mathbb{E}[ \ln(x) ]$ is like saying that we have knowledge about the geometric mean of a large sample. (And similarly, the other constraint corresponds to knowledge about the geometric mean of the $1 - X_i$.)
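A small sketch of that approximation: the geometric mean of a large Beta($\alpha, \beta$) sample should settle at $e^{\psi(\alpha) - \psi(\alpha + \beta)}$, the value the constraint pins down (again with the arbitrary choice $\alpha = 2$, $\beta = 5$):

```python
import numpy as np
from scipy.special import digamma

a, b = 2.0, 5.0                                  # same arbitrary parameters as above
target = np.exp(digamma(a) - digamma(a + b))     # value the constraint pins down
rng = np.random.default_rng(1)

# Geometric mean of increasingly large Beta(a, b) samples vs. the target.
for n in (100, 10_000, 1_000_000):
    x = rng.beta(a, b, size=n)
    geo_mean = np.exp(np.mean(np.log(x)))
    print(n, geo_mean, target)
```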
However, it is unclear to me why this is a natural piece of knowledge to have -- unlike, say, the max entropy justification for the Gaussian, where it seems natural to have prior beliefs about the mean and variance. Perhaps this bias towards accepting mean and variance constraints as natural is just due to exposure to certain kinds of datasets / generative models over others? What would be a good example that makes the constraints behind the Beta distribution's max entropy characterization feel natural?