Related reading:

Background:

I am comparing the effectiveness of various linear regression methods from machine learning, such as sklearn.linear_model.Ridge, sklearn.linear_model.Lasso, and sklearn.svm.SVR.

Question:

The linked questions above discuss various reasons to standardize, center, or leave unchanged the predictor variables in regression settings. If I standardize the X matrix, do I then have to standardize the y array? If I center the X matrix, do I have to center the y array?

For either of those situations, would failing to standardize/center give me incorrect results?

kjetil b halvorsen
kingledion
  • No, standardizing $X$ (or some of the predictors) does not force you to standardize $y$ as well. Ask yourself: why do I standardize? Then see what the standardization is doing. Otherwise see: https://stats.stackexchange.com/questions/244507/what-algorithms-need-feature-scaling-beside-from-svm/252625#252625 – kjetil b halvorsen May 09 '17 at 20:02
  • So I know that for simple linear regression you do not need to standardize y (or x, for that matter). I am asking about various other methods, like ridge, lasso, and SVR. It is not clear to me that the argument you linked applies to those methods. – kingledion May 09 '17 at 20:10
  • They apply. Ridge and lasso are not invariant to rescaling of the predictors, so they need standardization. But they only need it for $X$, not $y$ (though standardizing $y$ does no harm). I do not know about SVR, but the same principles apply. – kjetil b halvorsen May 09 '17 at 20:13

1 Answer

No: standardizing the predictors $X$ does not force you to standardize the response $y$. Ask yourself "Why do I standardize?" and look at what the standardization is doing. Some answers to that can be found at: What algorithms need feature scaling, beside from SVM? As to the additional question in the comments: the arguments in my answer linked above also apply to ridge and lasso. The arguments for standardizing $X$ in those cases do not apply to $y$ (you can standardize $y$ too if you want; it does no harm, but it can complicate interpretation). The same principles apply to SVR, but I do not know the answer in that case.
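For ridge, the point can be checked numerically. The following is only a minimal sketch, not a general proof: the synthetic data, the penalty alpha = 10, and all variable names are assumptions made for illustration. It fits sklearn.linear_model.Ridge on standardized $X$ once with raw $y$ and once with standardized $y$ (mapping predictions back to the original units), and then, for contrast, on unstandardized $X$.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic data (illustrative only): three predictors on very different scales.
X = rng.normal(size=(200, 3)) * np.array([1.0, 10.0, 100.0])
y = 50.0 + X @ np.array([2.0, -0.3, 0.05]) + rng.normal(scale=0.5, size=200)

# Standardize the predictors; this is the step ridge is sensitive to.
X_std = StandardScaler().fit_transform(X)

alpha = 10.0  # arbitrary penalty strength chosen for this sketch

# Fit 1: response left on its original scale.
pred_raw = Ridge(alpha=alpha).fit(X_std, y).predict(X_std)

# Fit 2: response standardized, predictions mapped back to original units.
y_mean, y_sd = y.mean(), y.std()
y_std = (y - y_mean) / y_sd
pred_back = Ridge(alpha=alpha).fit(X_std, y_std).predict(X_std) * y_sd + y_mean

# Standardizing y should not change the ridge predictions (same alpha).
same_when_y_scaled = np.allclose(pred_raw, pred_back)

# Contrast: leaving X unstandardized changes the ridge fit.
pred_unscaled_X = Ridge(alpha=alpha).fit(X, y).predict(X)
differs_when_X_unscaled = not np.allclose(pred_raw, pred_unscaled_X)

print(same_when_y_scaled, differs_when_X_unscaled)
```

In this setup the first flag should be true because the ridge loss and penalty are both quadratic, so rescaling $y$ only rescales the coefficients. With scikit-learn's Lasso the L1 penalty scales differently, so matching the two fits would also require dividing alpha by the standard deviation of $y$; the model family is unchanged, only the meaning of a given alpha value.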

kjetil b halvorsen