
In a regression, there is an independent variable x; say x is a positive number indicating a number of months. Is it appropriate to include both the original value (X) and its log transformation (log X) as independent variables in the same regression? Can there still be a multicollinearity issue even if the VIF check is passed?

If it is appropriate, what is the rationale behind it? I understand that X and X squared can both be included to capture nonlinear effects, but I have never seen X and log X used together before.

kjetil b halvorsen
Bruce
  • Maybe a dup: https://stats.stackexchange.com/questions/277316/including-both-transformed-and-original-data-untransformed-in-a-multivariable – kjetil b halvorsen Jan 28 '20 at 21:48
  • Thanks, it is a little different: that question is about X^2 and X, while my case is X and log X. I guess one possible explanation is that collinearity checks only concern linear relationships among the IVs, while log X and X are nonlinearly related – Bruce Jan 28 '20 at 22:24
  • It's fine. The easiest way to think about it is that it is just a more flexible non-linear representation of the effect. They can't be perfectly collinear unless you have fewer than 3 distinct values of the exposure. – AdamO Jan 28 '20 at 22:51
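The point about perfect collinearity in the last comment can be checked numerically. The following is a minimal sketch, not part of the original thread: the helper `design_rank` and the example month values are made up for illustration. It builds the design matrix with an intercept, x, and log(x), and reports its rank.

```python
import numpy as np

def design_rank(x):
    """Rank of the design matrix [1, x, log(x)] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones_like(x), x, np.log(x)])
    return np.linalg.matrix_rank(X)

# Illustrative month values only (assumed, not from the question).
two_values = [3, 3, 12, 12, 3, 12]    # only two distinct values of x
three_values = [3, 6, 12, 12, 3, 6]   # three distinct values of x

rank_two = design_rank(two_values)      # columns are linearly dependent
rank_three = design_rank(three_values)  # full column rank

print(rank_two, rank_three)
```

With only two distinct values the three columns cannot be linearly independent, so x and log(x) are perfectly collinear given the intercept; a third distinct value is enough to break the exact dependence.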

1 Answer


There is no problem in principle, but you should have a reason for doing it (not just blindly throw it in). You should have told us your application, with some context, to get a better answer. As for multicollinearity: since logs are locally linear, if the $x$ variable you are transforming varies over only a small range, then $x$ and $\log x$ would be nearly collinear and there would be a problem. But in that case there would be no point in including both anyway. Since you say the VIF does not indicate any problems, that is not the case here.
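To make the range argument concrete, here is a small sketch that is not from the original answer. It assumes x and log(x) are the only two predictors, in which case both share the same VIF, $1/(1-r^2)$, where $r$ is their sample correlation. The helper `vif_x_logx` and the two month ranges are hypothetical choices for illustration.

```python
import numpy as np

def vif_x_logx(x):
    """VIF shared by x and log(x) when they are the only two predictors.

    With exactly two predictors, VIF = 1 / (1 - r^2), where r is the
    sample correlation between them (hypothetical helper).
    """
    x = np.asarray(x, dtype=float)
    r = np.corrcoef(x, np.log(x))[0, 1]
    return 1.0 / (1.0 - r**2)

# Illustrative ranges only (assumed, not from the question).
wide = np.arange(1, 121)      # months 1..120: log(x) bends noticeably
narrow = np.arange(100, 121)  # months 100..120: log(x) is almost linear

vif_wide = vif_x_logx(wide)
vif_narrow = vif_x_logx(narrow)

print(f"VIF over 1..120 months:   {vif_wide:.1f}")
print(f"VIF over 100..120 months: {vif_narrow:.1f}")
```

Over the narrow range the VIF comes out far larger than over the wide range, which matches the argument: when x barely varies, log(x) is close to a linear function of x and adds almost no separate information. In a model with additional covariates, the VIF would have to be computed from the full design matrix rather than from this two-variable shortcut.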

kjetil b halvorsen