Multiple regression with 2 correlated variables

Question

I have a data set with a dichotomous outcome variable (surgery result good/bad) and two MR scan markers that are continuous (measurements on the scan).

Now if you have a large measurement in one marker, you probably also have a large measurement in the other marker.

When I regress the outcome on one marker at a time, neither marker is significant. When I include both markers in the same model, the result is highly significant: a larger measurement in either marker predicts a good surgery outcome.

Does this mean that a single marker cannot predict surgery results on its own, but that the two markers together can, when both are enlarged?

Note that the model does run without multicollinearity problems.

Just to check: with a dichotomous outcome measure, is this being treated as a logistic regression? — EdM, Jan 02 '20 at 16:23

score 1 · Accepted Answer · answered Jan 02 '20 at 16:13

Without seeing the data, I will take a stab at this:

The first part of your question suggests that the two markers are strongly correlated. The second part indicates that neither marker alone gives strong evidence about the surgery result, but that the two together are significant. This sounds like a multicollinearity problem.

Multicollinearity causes two basic types of problems:

The coefficient estimates can become unstable, changing substantially with small changes in the data or the model.
The estimates can lose precision (inflated standard errors), leading to p-values that are not meaningful.

Based on what you presented, you may want to try some type of penalized likelihood regression, such as LASSO or elastic net. Penalization does not remove the correlation between the markers, but it shrinks the coefficients and reduces the effect of multicollinearity on the estimates. After that, if both markers remain in your regression model, then your prediction of a good versus a bad result will depend on both markers, not just one.
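As a rough illustration of the elastic net suggestion, here is a minimal sketch in Python using scikit-learn. Everything in it is an assumption made for demonstration: the data are simulated (no data were posted), the names `marker_a`, `marker_b`, and `good_result` are hypothetical, and the penalty settings (`C=1.0`, `l1_ratio=0.5`) are arbitrary rather than tuned.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-ins for the two MR markers (NOT the OP's data).
# By construction the population correlation between them is 0.8.
marker_a = rng.normal(size=n)
marker_b = 0.8 * marker_a + 0.6 * rng.normal(size=n)
X = np.column_stack([marker_a, marker_b])

# Hypothetical outcome model: the log-odds of a good result rise with both markers.
log_odds = 0.7 * marker_a + 0.7 * marker_b
good_result = rng.binomial(1, 1.0 / (1.0 + np.exp(-log_odds)))

# Elastic net logistic regression; the markers are standardized first so that
# the penalty treats them on the same scale.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(
        penalty="elasticnet", solver="saga", l1_ratio=0.5, C=1.0, max_iter=5000
    ),
)
model.fit(X, good_result)

coefs = model.named_steps["logisticregression"].coef_.ravel()
print("sample correlation:", np.corrcoef(marker_a, marker_b)[0, 1])
print("penalized coefficients (standardized scale):", coefs)
```

In practice the penalty strength would be chosen by cross-validation (for example with `LogisticRegressionCV`) rather than fixed as above. Note also that penalized coefficients do not come with the usual p-values.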

akash87

Since the OP states "Note that the model does run without multicollinearity problems," shouldn't we consider explanations that do *not* rely on multicollinearity? – whuber Jan 02 '20 at 16:21
The OP also states "Now if you have a large measurement in one marker, you probably also have a large measurement in the other marker," though. If that is the case, then there is some ambiguity with regard to multicollinearity. I see your point, though. – akash87 Jan 02 '20 at 16:24
With only 2 variables, you won't have multicollinearity by conventional criteria (i.e., VIF > 10) unless the correlation is about .95 or higher. You can have the OP's stated situation with correlations much lower than that. – gung - Reinstate Monica Jan 02 '20 at 20:40
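
The arithmetic behind that last comment can be checked with a short Python sketch. With exactly two predictors, the variance inflation factor of each one reduces to 1 / (1 - r^2), where r is the correlation between them. The function names below are hypothetical helpers, and the cutoff of 10 is the conventional rule of thumb mentioned in the comment, not a hard threshold.

```python
import math


def vif_two_predictors(r: float) -> float:
    """VIF of either predictor when the model has exactly two predictors
    whose correlation is r."""
    return 1.0 / (1.0 - r ** 2)


def correlation_at_vif(vif: float) -> float:
    """Absolute correlation between two predictors that yields the given VIF."""
    return math.sqrt(1.0 - 1.0 / vif)


for r in (0.5, 0.8, 0.9, 0.95):
    print(f"r = {r:.2f} -> VIF = {vif_two_predictors(r):.2f}")

print(f"VIF reaches 10 at |r| = {correlation_at_vif(10.0):.4f}")
```

A correlation of 0.8, for instance, gives a VIF below 3, well under the usual cutoff.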
