Why is the prior distribution of $\theta$ not conditioned on $X$ (the observations) in the formula below?
$$P(\theta|X, y)=\frac{P(y|X, \theta)P(\theta)}{P(y|X)}$$
In my understanding, the correct formula should be:
$$P(\theta | X, y) = \frac{P(y| X, \theta) P(\theta| X)}{P(y|X)}$$
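To spell out where this second form comes from, here is a sketch of my derivation. It uses only the definition of conditional probability, with everything conditioned on $X$. This is my own working, not something taken from the source of the first formula:

$$\begin{aligned}
P(\theta | X, y) &= \frac{P(y, \theta | X)}{P(y | X)} \\
&= \frac{P(y | X, \theta)\, P(\theta | X)}{P(y | X)}
\end{aligned}$$

As far as I can tell, this reduces to the first formula only under the additional assumption that $P(\theta | X) = P(\theta)$, i.e. that $\theta$ is independent of $X$ a priori, and I do not see that assumption stated anywhere.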
But I think I am missing something.