
I've been writing some simple machine learning algorithms and looking at time-series data. In doing so, I have come across feature scaling: rescaling and standardizing, as they are referred to on Wikipedia.

For something like k-Nearest Neighbors, I imagine rescaling to be the best scaling to use so that there's a completely level playing field between the variables' influence. Is that true?

When would you apply standardizing to data instead? What are the advantages of standardizing and of rescaling?
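To make the question concrete, here is a small sketch (the data and feature names are made up for illustration) of the two scalings and why they matter for k-NN: with raw features, the one with the largest magnitude dominates the Euclidean distance.

```python
import numpy as np

# Hypothetical toy data: two features on very different scales.
# Feature 0: income in dollars; feature 1: age in years.
X = np.array([[50_000.0, 25.0],
              [52_000.0, 60.0],
              [90_000.0, 26.0]])

def rescale(X):
    """Min-max rescaling: map each feature to [0, 1]."""
    mn, mx = X.min(axis=0), X.max(axis=0)
    return (X - mn) / (mx - mn)

def standardize(X):
    """Z-score standardizing: zero mean, unit variance per feature."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def dists(X):
    """Euclidean distances from the first row to the remaining rows."""
    return np.linalg.norm(X[1:] - X[0], axis=1)

print(dists(X))            # raw: income differences swamp age differences
print(dists(rescale(X)))   # rescaled: age differences now count too
print(dists(standardize(X)))
```

On the raw data, the third point (very different income, nearly the same age) looks far from the first while the second (similar income, very different age) looks close; after either scaling, both features contribute comparably, which is the "level playing field" for k-NN.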

Noel Evans
  • So .... paraphrasing the linked article, is the answer, you don't often need to do feature scaling. Do rescaling when the magnitude of one predictor differs greatly from another. Do standardizing when doing something like linear regression because the Beta for x^0 will be skewed? – Noel Evans Oct 23 '14 at 16:14
  • For a lot of machine learning algorithms (& even things like PCA), you typically want to scale your variables so that they are on "a completely level playing field", as you say. Even that can depend on what you're doing, see, eg, @cbeleites' answer [here](http://stats.stackexchange.com/a/86372/7290). As for the arguments amongst [centering, scaling & normalizing](http://stats.stackexchange.com/a/70555/7290), the linked thread discusses that thoroughly. Regarding your specific question, in standard regression the betas are not typically skewed & linear transformations wouldn't unskew them. – gung - Reinstate Monica Oct 23 '14 at 16:29
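The point in the last comment, that linear transformations don't change what the regression fits, can be checked directly. This is a hedged sketch with simulated data (the variable names and numbers are invented): standardizing a predictor rescales its coefficient by the predictor's standard deviation, but the fitted values are identical.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(170.0, 10.0, size=100)          # hypothetical predictor
y = 2.0 * x + rng.normal(0.0, 5.0, size=100)   # linear signal plus noise

def ols(x, y):
    """OLS fit of y = b0 + b1*x; returns (coefficients, fitted values)."""
    A = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, A @ beta

beta_raw, yhat_raw = ols(x, y)
z = (x - x.mean()) / x.std()                   # standardized predictor
beta_std, yhat_std = ols(z, y)

# Same fitted values; the slope is just rescaled by x.std().
print(np.allclose(yhat_raw, yhat_std))                   # True
print(np.allclose(beta_std[1], beta_raw[1] * x.std()))   # True
```

So standardizing in regression is about making coefficients comparable across predictors (and helping numerical conditioning), not about changing the model's predictions.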

0 Answers