First off, I am aware that stepwise regression has some problems, as described for instance here ;) I mention this so that the discussion does not drift towards whether stepwise is an appropriate technique or not.
Let me now describe my problem.
Financial institutions have to estimate customers' default risk; i.e. the probability that a customer will not pay back his debt in full. Typically, this is done using logistic regression.
When there is a lot of (internal) information about customers not paying back their debt in full, the target variable, Y, is a binary variable; e.g. Y = 1 if the customer did not pay the debt back in full and Y = 0 if the customer did.
When there is no or hardly any (internal) data about customers not paying back their debt in full, the target variable, Y, can be the rank of an internal default risk rating denoting default risk/creditworthiness. E.g. a financial institution could use ratings similar to S&P's, like AAA, AA, A, BBB, ..., which convey the default risk ranks 1, 2, 3, 4, ... These ranks are then the Y's. In that case, we are in an ordinal logistic regression set-up.
In the ordinal logistic regression case, some financial institutions proceed as follows:
- Estimate logit(Pr(Y <= j|X)) = alpha_j + X' * beta, where j is the rank of the rating, alpha_j is a rating-specific intercept, beta is a column vector of coefficients and X is the column vector of a customer's covariates.
- Drop the alpha_j, which are rating specific, and use X' * beta as a scoring function. The resulting customer scores reflect a default risk ranking.
- Determine a mapping function that maps the scores to the ratings.
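To make the three steps concrete, here is a minimal Python sketch. It assumes the cumulative logit (proportional odds) form logit(Pr(Y <= j|X)) = alpha_j + X' * beta; the coefficients, intercepts, cutoffs and customers are all made-up illustrative values, and the cutoff-based mapping is only one hypothetical choice of mapping function, not something a particular institution or library prescribes.

```python
import math

def score(x, beta):
    """Step 2: linear predictor x'beta with the intercepts alpha_j dropped."""
    return sum(xi * bi for xi, bi in zip(x, beta))

def cumulative_probs(x, alphas, beta):
    """Step 1 (already fitted): Pr(Y <= j | x) for each j,
    from logit(Pr(Y <= j | x)) = alpha_j + x'beta."""
    s = score(x, beta)
    return [1.0 / (1.0 + math.exp(-(a + s))) for a in alphas]

def map_to_rating(s, cutoffs, ratings):
    """Step 3: hypothetical mapping. Cutoffs are in descending order;
    a score at or above cutoffs[k] receives ratings[k]."""
    for cut, rating in zip(cutoffs, ratings):
        if s >= cut:
            return rating
    return ratings[-1]

# Hypothetical fitted values (not estimated from any data).
beta = [0.8, -1.2]
alphas = [-2.0, 0.0, 2.0]            # increasing in j, for ranks 1..3 of 4
ratings = ["AAA", "AA", "A", "BBB"]  # ranks 1, 2, 3, 4
cutoffs = [1.5, 0.0, -1.5]           # hypothetical score thresholds

customers = {"c1": [2.0, -0.5], "c2": [0.5, 0.5], "c3": [-1.0, 1.0]}
scores = {name: score(x, beta) for name, x in customers.items()}
assigned = {name: map_to_rating(s, cutoffs, ratings) for name, s in scores.items()}
```

With this sign convention a higher score raises Pr(Y <= j|X), so a higher score corresponds to a better rating (a lower rank number). Dropping the alpha_j does not change the ordering of customers, because the intercepts shift every customer's linear predictor by the same amount for a given j.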
The purpose of the scoring function is thus to properly rank customers in terms of default risk. In this context, I was wondering what a proper stopping rule would be when selecting the covariates for the model via stepwise regression.
I am currently using the fastbw() function from the rms package. Initially I used the AIC as the stopping criterion, but I am wondering whether this is appropriate. The AIC is based on the likelihood function, which measures goodness of fit rather than the model's ranking capability. Would a p-value based stopping rule be more appropriate?
Edit: if a p-value based stopping rule is not appropriate, as one of the commenters below suggests, what would be the best stopping rule, knowing that only ranking is important?
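To illustrate what I mean by "ranking capability" as opposed to goodness of fit, here is a small Python sketch of a rank concordance index and the corresponding Somers' Dxy between predicted scores and observed rating ranks. This is only my assumption of how ranking quality could be quantified; the data are made up, and nothing here is part of fastbw(). The point is that any strictly increasing transformation of the scores leaves the concordance unchanged, whereas a likelihood-based criterion such as the AIC depends on the fitted probabilities themselves.

```python
from itertools import combinations

def concordance(pred, y):
    """Return (c, dxy) for predictions `pred` against observed ranks `y`.

    Only pairs with different observed ranks are compared. A pair is
    concordant when the predictions are ordered the same way as the ranks;
    pairs tied on the prediction count as half concordant.
    """
    conc = disc = tied = 0
    for (p1, y1), (p2, y2) in combinations(zip(pred, y), 2):
        if y1 == y2:
            continue  # uninformative pair: same observed rank
        d = (p1 - p2) * (y1 - y2)
        if d > 0:
            conc += 1
        elif d < 0:
            disc += 1
        else:
            tied += 1
    n = conc + disc + tied
    c = (conc + 0.5 * tied) / n
    return c, 2.0 * c - 1.0

# Made-up example: observed rating ranks and two sets of predictions,
# where higher predictions are meant to indicate higher (worse) ranks.
y = [1, 1, 2, 3, 4, 4]
pred_a = [0.1, 0.4, 0.3, 0.6, 0.9, 0.8]
pred_b = [p ** 3 for p in pred_a]  # strictly increasing transform of pred_a

c_a, dxy_a = concordance(pred_a, y)
c_b, dxy_b = concordance(pred_b, y)
```

Here pred_a and pred_b give identical concordance because they order the customers identically, even though they would imply different likelihoods if read as probabilities.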