I am trying to fit a simple linear model with OLS. I have, say, 4 predictors, and the number of data points is much larger than 4, so overfitting should not be an issue.

1) My independent variables are strongly correlated with each other.
2) They are all positively correlated with my y.

My question: after fitting the OLS model, I see that the coefficients on some independent variables are negative.

I was expecting all positive signs, given the positive correlations. What could cause this? What is a remedy? Would non-negative least squares be appropriate?
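A minimal sketch that reproduces the situation described above on synthetic data. Everything here is an assumption for illustration, not the actual data set: the four predictors share one common factor `z`, and the response is generated with a hypothetical `true_beta` whose last entry is negative. Each predictor is still positively correlated with `y` on its own, yet the OLS fit recovers the negative partial effect.

```python
import numpy as np

# Synthetic data (assumed for illustration): four predictors that share
# a common factor z, so they are strongly correlated with each other.
rng = np.random.default_rng(0)
n = 1000
z = rng.normal(size=n)
X = z[:, None] + 0.3 * rng.normal(size=(n, 4))

# Hypothetical true coefficients; the last one is negative.
true_beta = np.array([1.0, 1.0, 1.0, -0.5])
y = X @ true_beta + 0.1 * rng.normal(size=n)

# Marginal (pairwise) correlation of each predictor with y.
corr_with_y = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(4)])

# OLS fit with an intercept column; coef[0] is the intercept.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

print("correlation of each predictor with y:", np.round(corr_with_y, 3))
print("OLS coefficients (intercept first):  ", np.round(coef, 3))
```

The marginal correlation of a predictor with `y` mixes its own effect with the effects of the predictors it is correlated with, while an OLS coefficient measures the effect with the other predictors held fixed, so the two can have different signs.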

kjetil b halvorsen
ganesh
  • It sounds like you are suffering from multicollinearity. This happens when your independent variables are highly correlated. – Akavall Oct 10 '12 at 18:53
  • There are a lot of threads on CV already that discuss this issue. You might start by reading these: [regression-coefficients-that-flip-sign-after-including-other-predictors](http://stats.stackexchange.com/questions/1580/) & [coefficients-change-signs](http://stats.stackexchange.com/questions/31841/), but there are others, if you search for them. You may also want to search under the [tag:multicollinearity] tag. If you still have a question after reading these, come back & clarify what you still need to know. – gung - Reinstate Monica Oct 10 '12 at 18:55
  • Sure, thanks a lot for the responses. Can you give me a heuristic for when I should be cautious, that is, at what level of correlation? Is anything greater than 0.6 too high? – ganesh Oct 10 '12 at 19:16
  • Hi Ganesh, welcome to the site. I changed your +ve'ly to 'positively'. Many readers of this site have English as a second (or third, or fourth!) language and non-standard abbreviations may be hard for them to follow. – Peter Flom Oct 10 '12 at 19:22
  • Collinearity is *not* the same as correlation; you can have high correlation without problematic collinearity and you can have low correlation and high collinearity. To test for collinearity, use condition indexes, not correlation. The links provided by @gung have details. – Peter Flom Oct 10 '12 at 19:24
  • Thanks, I will be mindful in the future. Sorry, I am a newbie; I would really appreciate it if you could elaborate. Wikipedia implies that collinearity and correlation go hand in hand: http://en.wikipedia.org/wiki/Multicollinearity. How can there be a case with low correlation but high collinearity? – ganesh Oct 10 '12 at 19:28
  • Don't worry about being a newbie, @ganesh, almost everyone asking questions here is. But this issue has been covered many times already, in depth, on this site. Why don't you start by reading through the linked threads, searching around the site & reading a few more as your ideas become clearer, & then, **if** you still have any residual confusion, come back here & update your Q w/ what you still need to know. – gung - Reinstate Monica Oct 10 '12 at 19:58
  • Absolutely Gung, thanks a lot, I will bug you guys once I am through :) – ganesh Oct 10 '12 at 20:20
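The point raised in the comments, that low pairwise correlation can coexist with high collinearity, can be sketched on synthetic data. The setup is an assumption for illustration: ten independent predictors plus an eleventh that is almost exactly their sum. The condition indexes are computed in one common convention (intercept column included, columns scaled to unit length, ratio of the largest singular value to each singular value); other conventions exist.

```python
import numpy as np

# Synthetic data (assumed for illustration): ten independent predictors
# and one extra predictor that is almost exactly their sum.
rng = np.random.default_rng(1)
n, k = 2000, 10
Z = rng.normal(size=(n, k))
x_sum = Z.sum(axis=1) + 0.01 * rng.normal(size=n)
X = np.column_stack([Z, x_sum])

# Largest absolute pairwise correlation among the eleven predictors.
R = np.corrcoef(X, rowvar=False)
max_abs_corr = np.abs(R[np.triu_indices_from(R, k=1)]).max()

# Condition indexes: add an intercept column, scale every column to
# unit length, then divide the largest singular value by each one.
A = np.column_stack([np.ones(n), X])
A_scaled = A / np.linalg.norm(A, axis=0)
s = np.linalg.svd(A_scaled, compute_uv=False)
condition_indexes = s.max() / s

print("largest |pairwise correlation|:", round(float(max_abs_corr), 3))
print("largest condition index:       ", round(float(condition_indexes.max()), 1))
```

No single pair of predictors is strongly correlated here, because the near-exact linear dependence involves all eleven columns at once, and a correlation matrix only looks at two columns at a time. A commonly cited rule of thumb treats a condition index above roughly 30 as a sign of potentially harmful collinearity; it is a guideline, not a hard cutoff.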

0 Answers