Strange outcomes in binary logistic regression in SPSS

Question

I ran a binary logistic regression in SPSS 23 and got some strange results for the last six variables, NOACprev through No_Prev_treatment. Their B estimates are very large, the S.E. is extremely high, the Wald statistic is 0, and no 95% C.I. is reported.

Does anyone know what has gone wrong?

Excuse me, I've edited the question. It's about the last several variables, they have, amongst others, no 95%-C.I., a Wald of 0 and extremely high S.E. outcomes — Joris Komen, May 03 '16 at 10:59

score 5 · Accepted Answer · edited Apr 13 '17 at 12:44

mdewey already gave a good answer. However, given that SPSS did give you parameter estimates, I suspect you don't have full separation but, more probably, multicollinearity (often called simply "collinearity"): some of your predictors carry almost the same information, which commonly leads to large parameter estimates of opposite signs (which you have) and large standard errors (which you also have). I suggest reading up on multicollinearity.

mdewey already addressed how to detect separation: this occurs if one predictor (or a set of predictors) allows a perfect fit to your binary target variable. (Multi-)collinearity is present when some subset of your predictors carries almost the same information. This is a property of your predictors alone, not of the dependent variable (in particular, the concept is the same for OLS and for logistic regression, unlike separation, which is pretty much intrinsic to logistic regression). Collinearity is commonly detected using Variance Inflation Factors (VIFs), although there are alternatives.
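To make the VIF idea concrete, here is a minimal sketch for the simplest case of just two predictors, where the VIF reduces to 1 / (1 - r²) with r the Pearson correlation between them. The variable names and data are hypothetical, and the general case (regressing each predictor on all the others) is not shown. A common rule of thumb flags VIFs above roughly 5 or 10, but that threshold is a convention, not a law.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r_squared = pearson_r(x1, x2) ** 2
    if r_squared >= 1.0:
        return math.inf  # perfectly collinear predictors
    return 1.0 / (1.0 - r_squared)

# Hypothetical data: age in years, and almost the same information in months.
age_years = [50, 61, 47, 72, 66, 58, 80, 69]
age_months = [601, 733, 563, 865, 790, 697, 961, 829]
# A hypothetical predictor that is only weakly related to age.
unrelated = [3, 1, 4, 1, 5, 9, 2, 6]

vif_collinear = vif_two_predictors(age_years, age_months)
vif_unrelated = vif_two_predictors(age_years, unrelated)
print(f"VIF, near-duplicate predictors: {vif_collinear:.1f}")
print(f"VIF, weakly related predictors: {vif_unrelated:.2f}")
```

Because the VIF depends only on the predictors, the outcome variable never enters this calculation, which is exactly the distinction from separation drawn above.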

How you should address separation or collinearity depends on your science. If you have separation, you may actually be quite happy, since you have a perfectly fitting model! In the case of collinearity, you may want to simply delete one or more of the collinear predictors, or transform them via a Principal Components Analysis (PCA), retaining only the first principal component(s). Or you may want to look at this earlier question with some excellent suggestions. In either case, I'd suggest looking at whether the original or the modified model predicts well on a new sample. (If you don't have a new sample, you may want to perform cross-validation.)

Incidentally, you don't get a confidence interval for numerical reasons. SPSS tries to take the parameter estimate, add 1.96 times the standard error, and exponentiate the result. Unfortunately, $e^{5000+}$ won't really fit into the table window...
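As a rough illustration of that calculation, the sketch below computes a 95% interval for the odds ratio as exp(B ± 1.96·S.E.) in ordinary double-precision arithmetic. The numbers are hypothetical and this says nothing about how SPSS is implemented internally; it only shows that, with a standard error in the thousands, the upper limit exceeds what a double can hold while the lower limit underflows to zero.

```python
import math

def odds_ratio_ci(b, se, z=1.96):
    """Return (lower, upper) limits of the odds-ratio CI, exp(b -/+ z*se).

    A limit that is too large for a double is returned as None.
    """
    def safe_exp(value):
        try:
            return math.exp(value)
        except OverflowError:
            return None  # result does not fit in double precision

    return safe_exp(b - z * se), safe_exp(b + z * se)

# A well-behaved coefficient (hypothetical values).
ok_lower, ok_upper = odds_ratio_ci(b=0.5, se=0.2)
print("well-behaved:", ok_lower, ok_upper)

# A coefficient like the problematic ones: huge standard error (hypothetical values).
bad_lower, bad_upper = odds_ratio_ci(b=20.0, se=5000.0)
print("problematic: ", bad_lower, bad_upper)
```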

If you use default algorithms - they generally have fixed number of iterations. Sometimes they can't get to "infinity" in the number of iterations. — probabilityislogic, May 03 '16 at 11:12
Thanks for the help, after deleting 2 predictors the problem was solved. — Joris Komen, May 03 '16 at 11:11
No problem. I really like this answer. Thanks for taking the time to improve it. — Silverfish, May 03 '16 at 15:11
So the problem was solved by @Joris by deleting the two best predictors of the outcome. — mdewey, May 04 '16 at 09:50

score 3 · Answer 2 · answered May 03 '16 at 11:00

You almost certainly have separation here. If you tabulate the outcome by your suspect predictors you will find that (a) if the predictor is binary, there is only one level of your outcome for one level of the predictor, or (b) if your predictor is continuous, then for a range of values above (or below) a cut-off you only have one level of the outcome. What you do next depends on the underlying science of the problem. You can get finite estimates using Firth's method, but I do not know whether it is available in SPSS (which I do not use).
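The tabulation check described above can be sketched in a few lines. This is only an illustration with made-up data and hypothetical function names; the continuous check covers the strict case where a single cut-off splits the two outcome levels completely, which is narrower than the one-sided situation in (b).

```python
from collections import defaultdict

def outcome_levels_by_predictor(predictor, outcome):
    """Map each predictor level to the set of outcome levels observed with it."""
    table = defaultdict(set)
    for x, y in zip(predictor, outcome):
        table[x].add(y)
    return dict(table)

def categorical_predictor_separates(predictor, outcome):
    """True if some predictor level occurs with only one outcome level (case a)."""
    table = outcome_levels_by_predictor(predictor, outcome)
    return any(len(levels) == 1 for levels in table.values())

def cutoff_separates_completely(predictor, outcome):
    """True if one cut-off on a continuous predictor splits outcome 0 from 1."""
    x0 = [x for x, y in zip(predictor, outcome) if y == 0]
    x1 = [x for x, y in zip(predictor, outcome) if y == 1]
    if not x0 or not x1:
        return False  # only one outcome level present, nothing to separate
    return max(x0) < min(x1) or max(x1) < min(x0)

# Hypothetical binary predictor: everyone with value 1 has outcome 1.
suspect = [0, 0, 0, 0, 1, 1, 1, 1]
outcome = [0, 1, 0, 1, 1, 1, 1, 1]
print(outcome_levels_by_predictor(suspect, outcome))
print(categorical_predictor_separates(suspect, outcome))

# Hypothetical continuous predictor: outcome flips from 0 to 1 above 3.0.
dose = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
response = [0, 0, 0, 1, 1, 1]
print(cutoff_separates_completely(dose, response))
```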

score 0 · Answer 3 · answered May 04 '16 at 14:22

One other postscript: logistic regression does have problems when there is separation. Two alternatives would be to use penalized logistic regression, which is available as the STATS FIRTHLOG extension command, or to use DISCRIMINANT, which works even when there is separation.
