Say I have an independent variable with the following relationship to the binary dependent variable, DV:
______________________________________________________
|verx_s  |     # Recs|      % Recs|    # DV|    DV Rt|
|________|___________|____________|________|_________|
|0       |     75,700|     6.4467%|     941|   1.243%|
|1       |    277,129|    23.6009%|   1,471|  0.5308%|
|2       |     51,662|     4.3996%|     219|  0.4239%|
|3       |    769,737|    65.5526%|   2,269|  0.2948%|
|All     |  1,174,228|   100.0000%|   4,900|  0.4173%|
|________|___________|____________|________|_________|
It's common practice at my company to recode each value of verx_s to its corresponding DV Rt and treat the result as a continuous variable in a logistic regression model. Confidence intervals are not important; all we care about is whether the model validates on an out-of-time sample. Is there anything inherently wrong with taking this shortcut?
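To make the recoding concrete, here is a minimal Python sketch of what I mean. The counts are copied from the table above; the names `counts`, `dv_rate`, `recode` and the `sample` values are made up for illustration and are not our production code.

```python
# Counts copied from the table above: verx_s level -> (# Recs, # DV)
counts = {
    0: (75_700, 941),
    1: (277_129, 1_471),
    2: (51_662, 219),
    3: (769_737, 2_269),
}

# DV rate per level, computed on the development sample only
dv_rate = {level: events / recs for level, (recs, events) in counts.items()}


def recode(values, mapping):
    """Replace each verx_s level with the DV rate observed for that level."""
    return [mapping[v] for v in values]


# Hypothetical verx_s values for five records
sample = [0, 3, 1, 3, 2]
verx_s_rate = recode(sample, dv_rate)

for level, rate in sorted(dv_rate.items()):
    print(f"verx_s={level}: DV rate = {rate:.4%}")
```

The recoded column (`verx_s_rate` here) is what then enters the logistic regression as a single continuous predictor. The same mapping, estimated on the development sample, is applied unchanged to the out-of-time sample.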
It should also be mentioned that in most cases the independent variable is crafted so that it makes intuitive sense to our customers. The ordering of the target mean across its levels is therefore important, which is why we can't use simple dummy variables.