Assume that I have a rating system with $R$ rating classes, where class $i$ has a probability of default $p_{i0}$, $i=1,2,\dots,R$. The probability of default is the probability that a company in that class has a payment problem within one year.
I have a set of companies that I have to partition over these classes, using one model or another.
Let's say that I classified $N$ companies last year and each rating class contains $N_i$ companies where $\sum_{i=1}^R N_i=N$. Today, one year later, I count the observed number of defaults in each class and I find $d_i$.
I want to test whether my system is well calibrated, i.e. whether these observed numbers of defaults are in line with my a priori fixed probabilities $p_{i0}$.
I define my test statistic as $X^2=\sum_{i=1}^R \frac{(d_i-N_ip_{i0})^2}{N_i p_{i0}}$. If the expected cell counts $N_ip_{i0}$ are not too low and if the defaults in different rating classes are independent, then this test statistic follows a $\chi^2$ distribution, but what is the number of degrees of freedom? Is it $R$ (because I have a sum of $R$ independent squared standard normal random variables) or is it $R-1$, as in Pearson's $\chi^2$ test? If the latter is the case, where do I lose one degree of freedom? (A reference is fine also.)
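
To make the setup concrete, here is a minimal sketch of how the statistic above could be computed and simulated under the null hypothesis. The class sizes and default probabilities are hypothetical values chosen only for illustration, and the function names `calibration_statistic` and `simulate_null` are my own. It assumes defaults within each class are independent draws, so that $d_i \sim \text{Binomial}(N_i, p_{i0})$, which is an assumption added here and not part of the setup above.

```python
import numpy as np


def calibration_statistic(d, n, p0):
    """X^2 = sum_i (d_i - N_i p_i0)^2 / (N_i p_i0), summed over the last axis."""
    d = np.asarray(d, dtype=float)
    n = np.asarray(n, dtype=float)
    p0 = np.asarray(p0, dtype=float)
    expected = n * p0  # expected number of defaults per class
    return np.sum((d - expected) ** 2 / expected, axis=-1)


def simulate_null(n, p0, n_sims=20000, seed=0):
    """Draw default counts under the null and return the simulated X^2 values.

    Assumption: d_i ~ Binomial(N_i, p_i0), independently across classes.
    """
    rng = np.random.default_rng(seed)
    d = rng.binomial(n, p0, size=(n_sims, len(n)))
    return calibration_statistic(d, n, p0)


# Hypothetical portfolio with R = 3 rating classes (illustrative numbers only).
n = np.array([2000, 1500, 1000])
p0 = np.array([0.01, 0.02, 0.05])

sims = simulate_null(n, p0)
print("R =", len(n))
print("simulated mean of X^2:", sims.mean())
```

Since a $\chi^2$ variable with $k$ degrees of freedom has mean $k$, comparing the simulated mean with $R$ and $R-1$ is one rough way to probe the question empirically.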