
I'm working with a dataset where the dependent variable is continuous (the sale price of houses), and there are a couple dozen features I'm using to predict the sale price with a linear regression model. These features include binary dummy variables, categorical variables, and continuous variables, all on different scales.

The dependent variable (sale price) is skewed, so I've instead created a new feature that is log(salePrice), which makes the distribution roughly symmetric. I had planned on using scikit-learn's StandardScaler class on the explanatory features. Does it make sense to use two different preprocessing techniques, or should I simply take the log of all the explanatory features as I do with the dependent variable?
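For concreteness, here is a minimal sketch of the setup I have in mind (toy data; the columns and values are made up):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

# Toy stand-in for the real data (hypothetical columns):
# continuous features on very different scales plus a 0/1 dummy.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(1500, 400, 200),   # square footage
    rng.normal(0.3, 0.1, 200),    # lot size in acres
    rng.integers(0, 2, 200),      # dummy: has garage
])
sale_price = np.exp(12 + 0.0005 * X[:, 0] + rng.normal(0, 0.2, 200))

y_log = np.log(sale_price)                    # log the skewed target
X_scaled = StandardScaler().fit_transform(X)  # standardize all features

model = LinearRegression().fit(X_scaled, y_log)
```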

asked by idclark (edited by kjetil b halvorsen)

1 Answer


There is no reason to require that the predictor variables be transformed in the same way as the $Y$-variable. Depending on the nature of the variables, such a requirement may not even make sense. In your case, for example, some of the explanatory variables are dummies, and transforming a 0/1 dummy accomplishes nothing useful. Scale differences between the $Y$-variable and the predictors are taken care of by the estimation algorithm: the regression coefficients absorb them.
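As a concrete illustration of mixing transformations, here is a minimal scikit-learn sketch (toy data, made-up column names): it standardizes only the continuous predictors, passes the dummy through untouched, and log-transforms only the target.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data; column names are hypothetical stand-ins for the real features.
rng = np.random.default_rng(1)
X = pd.DataFrame({
    "sqft": rng.normal(1500, 400, 300),       # continuous, large scale
    "lot_acres": rng.normal(0.3, 0.1, 300),   # continuous, small scale
    "has_garage": rng.integers(0, 2, 300),    # 0/1 dummy: leave as-is
})
y = np.exp(11 + 0.0006 * X["sqft"] + rng.normal(0, 0.25, 300))

# Standardize only the continuous predictors; the dummy passes through.
preprocess = ColumnTransformer(
    [("scale", StandardScaler(), ["sqft", "lot_acres"])],
    remainder="passthrough",
)

# Fit OLS on log(y); predictions are mapped back to the price scale.
model = TransformedTargetRegressor(
    regressor=Pipeline([("prep", preprocess), ("ols", LinearRegression())]),
    func=np.log,
    inverse_func=np.exp,
)
model.fit(X, y)
print(model.predict(X.head()))  # predictions in dollars, not log-dollars
```

A convenient side effect of TransformedTargetRegressor is that it inverts the log at prediction time, so you never have to exponentiate predictions by hand.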

For more information on reasons to transform, or not to transform, see these excellent answers:
Why not log-transform all variables that are not of main interest?
Pitfalls to avoid when transforming data?

An answer to an almost identical question: Analysing log and square-root transformed variables

answered by kjetil b halvorsen