Logistic regression prediction: changing interpretation with changing prior

Question

The data include 3 equally sized subsets A, B and C, belonging to two classes:

A belongs to class 1.
B and C belong to class 2.

The prior probabilities of an observation coming from class 1 and class 2 are thus 0.33 and 0.67.

Next, a logistic regression model is fitted on all 3 subsets.
The predicted value of this model is the probability of an observation belonging to class 2 given its predictor values.

In reality, I know for sure that I will never have observations belonging to subset C. So the observations will always originate from either subset A or B, and since both subsets are equally sized, I can assume that the prior probability of a new observation being from class 1 or class 2 changes to 0.5.

My questions are:

Given the knowledge that all observations are from either A or B but not C, can the predicted values of the logistic regression model fitted on all 3 subsets still be interpreted as the probability of being in class 2?
Are these probabilities biased because of the changed prior probability of being in class 1 and 2?
If so, how can I correct for this?
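
To make the third question concrete, here is a minimal sketch of the prior-shift adjustment that is commonly applied to predicted probabilities: move the predicted log-odds by the difference between the new and the training prior log-odds. The helper name `adjust_probability` is hypothetical, and the sketch assumes that the distribution of the predictors within each class is the same in the training data and in future data, which here would mean that B and C look alike in terms of their predictors. If C differs from B, this adjustment is not exact, and refitting the model on A and B only would be the more direct route.

```python
import math


def adjust_probability(p_train, prior_train, prior_new):
    """Rescale a predicted class-2 probability to a new class-2 prior.

    Hypothetical helper, not from any library. Assumes the distribution of
    the predictors within each class is unchanged between training and use.
    """
    # Log-odds predicted by the model fitted under the training prior.
    logit = math.log(p_train / (1.0 - p_train))
    # Swap the training prior log-odds for the new prior log-odds.
    logit += math.log(prior_new / (1.0 - prior_new))
    logit -= math.log(prior_train / (1.0 - prior_train))
    return 1.0 / (1.0 + math.exp(-logit))


# Class 2 made up 2/3 of the training data (B and C); it is expected to
# make up 1/2 of future observations (B only, against A).
for p in (0.5, 2.0 / 3.0, 0.9):
    print(p, adjust_probability(p, prior_train=2.0 / 3.0, prior_new=0.5))
```

Under these assumptions, a prediction equal to the training prior (2/3) maps to the new prior (0.5), and every other prediction is pulled towards class 1 by the same shift in log-odds. The input must lie strictly between 0 and 1.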

As long as you specify the outcome variable the same, where 1=belong to class 2, and 0=belong to class 1, for both models, I believe the interpretation will be the same. — robin.datadrivers, Feb 22 '15 at 23:17
Linked: http://stats.stackexchange.com/questions/6067/does-an-unbalanced-sample-matter-when-doing-logistic-regression/6072#6072 — Zhubarb, Feb 23 '15 at 09:29
@Zhubarb Did I understand correctly from the link above, that the probabilities are indeed biased and a possible solution would be to perform weighted logistic regression? — statastic, Feb 23 '15 at 10:22
It's not clear to me -- how did your priors arise? Please explain the second one in detail. — Glen_b, Mar 05 '15 at 02:52
