Incorporate an external estimate of probability as predictor in a logistic regression model

Question

I am predicting a binary outcome (e.g., credit default) with logistic regression.

For each observation, in addition to my own observed predictors, I have obtained a probability estimate from an external source (e.g., a likelihood of default from a black-box algorithm). I'd like to test whether incorporating this estimate improves my model.

The approach I am considering is to include this estimate as a predictor in the model and do a $\chi^2$ test for the improvement. Because I am expecting a linear relationship between the estimate and the underlying probability of the outcome, I suppose that rather than $x$ I should include the term transformed as $\ln\frac{x}{1-x}$.

Is this transformation appropriate, and is the overall approach recommended?

This unaccepted [answer](http://stats.stackexchange.com/a/199417/39632) seems to support this approach in the context of a prior distribution. However, in this case I don't have the distribution, just a black-box estimate, and I'm not sure whether that changes anything. — C8H10N4O2, Jun 16 '16 at 13:55

kjetil b halvorsen · Accepted Answer · 2018-05-18T14:00:12.367

It should be OK to include a score from an external source as a predictor. This can be especially useful if the people producing the external score have access to information that is not available to you. Remember that the linear predictor in logistic regression is on the log-odds scale, so if the external score is meant to be interpreted as a probability, such as a default probability, you could first transform it to log odds via $\log \left( \text{score}/(1-\text{score}) \right)$. Alternatively, if the score varies over a large range, you could model it with a spline term.
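As a rough illustration of the log-odds idea, here is a minimal sketch in Python on simulated data. Everything in it is an assumption made for the example: the data-generating process, the names `x1`, `z` and `score`, and the helper functions `logit`, `fit_logistic` and `lr_test_1df`. It fits the model with and without the transformed score using a small Newton-Raphson routine and compares the two with a likelihood-ratio $\chi^2$ test on 1 degree of freedom. The scores are clipped away from 0 and 1 so the transform stays finite.

```python
import math
import numpy as np

def logit(p, eps=1e-6):
    """Map probabilities to log odds, clipping away from exactly 0 and 1."""
    p = np.clip(p, eps, 1 - eps)
    return np.log(p / (1 - p))

def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson.

    Returns the coefficient vector and the maximized log-likelihood.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1 / (1 + np.exp(-X @ beta))             # fitted probabilities
        grad = X.T @ (y - mu)                        # score vector
        hess = X.T @ (X * (mu * (1 - mu))[:, None])  # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    eta = X @ beta
    loglik = np.sum(y * eta - np.logaddexp(0, eta))
    return beta, loglik

def lr_test_1df(loglik_small, loglik_big):
    """Likelihood-ratio test for one added parameter (chi-square, 1 df)."""
    stat = 2 * (loglik_big - loglik_small)
    # For 1 df, P(chi2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2))
    return stat, p_value

# Simulated data (purely illustrative, not from the question).
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)   # a predictor you observe yourself
z = rng.normal(size=n)    # information only the external source has
true_eta = -1.0 + 0.8 * x1 + 1.0 * z
y = rng.binomial(1, 1 / (1 + np.exp(-true_eta))).astype(float)

# Hypothetical black-box probability estimate: a noisy version of the truth.
score = 1 / (1 + np.exp(-(true_eta + rng.normal(scale=0.5, size=n))))

X_small = np.column_stack([np.ones(n), x1])         # your own model
X_big = np.column_stack([X_small, logit(score)])    # plus log odds of score

beta_small, ll_small = fit_logistic(X_small, y)
beta_big, ll_big = fit_logistic(X_big, y)
stat, p_value = lr_test_1df(ll_small, ll_big)

print("coefficient on logit(score):", beta_big[-1])
print("LR statistic:", stat, "p-value:", p_value)
```

If the external score were a perfectly calibrated probability that already captured everything in your own predictors, you would expect its log-odds coefficient to be near 1 and the other coefficients near 0. In practice a regression package would replace the hand-written fitting routine; it is spelled out here only to keep the example self-contained.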
