I've read here that generative models have fewer degrees of freedom than discriminative ones, so they are more robust and less prone to overfitting. I would like to understand this statement with a simple example.
Suppose I want to predict a binary Y with a binary X.
With a discriminative model, I predict P(Y|X) directly and the model has 2 degrees of freedom: once P(Y=0|X=0) and P(Y=0|X=1) are estimated, the other two probabilities, P(Y=1|X=0) and P(Y=1|X=1), are determined.
With a generative model, I have to estimate P(X|Y) and P(Y). It appears to me that the model has three degrees of freedom, though: for example, once I have estimated P(X=0|Y=0), P(X=0|Y=1) and P(Y=0), all the other parameters (P(X=1|Y=0), P(X=1|Y=1) and P(Y=1)) are determined, since each conditional distribution and the prior must sum to 1.
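To make the counting concrete, here is a minimal Python sketch of the two parameterizations for binary X and Y. The function names (`discriminative_table`, `generative_tables`, `posterior`) are hypothetical, chosen only for this illustration. The sketch assumes the free parameters are the ones listed above, and uses exact fractions so the sum-to-one constraints are easy to check.

```python
from fractions import Fraction


def discriminative_table(p_y0_given_x0, p_y0_given_x1):
    """Build the full table P(Y=y | X=x) from its 2 free parameters.

    Keys are (y, x). The two remaining entries follow from normalization.
    """
    return {
        (0, 0): p_y0_given_x0,
        (1, 0): 1 - p_y0_given_x0,
        (0, 1): p_y0_given_x1,
        (1, 1): 1 - p_y0_given_x1,
    }


def generative_tables(p_x0_given_y0, p_x0_given_y1, p_y0):
    """Build P(X=x | Y=y) and P(Y=y) from 3 free parameters.

    Keys of the conditional table are (x, y).
    """
    p_x_given_y = {
        (0, 0): p_x0_given_y0,
        (1, 0): 1 - p_x0_given_y0,
        (0, 1): p_x0_given_y1,
        (1, 1): 1 - p_x0_given_y1,
    }
    p_y = {0: p_y0, 1: 1 - p_y0}
    return p_x_given_y, p_y


def posterior(p_x_given_y, p_y):
    """Apply Bayes' rule to get P(Y=y | X=x), keyed by (y, x)."""
    table = {}
    for x in (0, 1):
        # P(X=x) = sum over y of P(X=x | Y=y) * P(Y=y)
        evidence = sum(p_x_given_y[(x, y)] * p_y[y] for y in (0, 1))
        for y in (0, 1):
            table[(y, x)] = p_x_given_y[(x, y)] * p_y[y] / evidence
    return table


# Example values, picked arbitrarily for illustration.
p_x_given_y, p_y = generative_tables(Fraction(1, 2), Fraction(1, 4), Fraction(1, 2))
post = posterior(p_x_given_y, p_y)
print(post[(0, 0)], post[(0, 1)])  # 2/3 2/5
```

With these example values, the three generative parameters yield P(Y=0|X=0) = 2/3 and P(Y=0|X=1) = 2/5, which are exactly the two numbers the discriminative table takes as its free parameters.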
Am I wrong? Or is the statement that generative models have fewer degrees of freedom than discriminative ones false?