I would like to know the correct way to normalize a DataFrame before applying PCA. I have found two options, and each one gives me different results:

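    import pandas as pd
    from sklearn import preprocessing
    # Option 1: rescale each column to the [0, 1] range
    min_max_scaler = preprocessing.MinMaxScaler()
    x_scaled = min_max_scaler.fit_transform(x)
    scaled_data = pd.DataFrame(x_scaled)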

or

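    from sklearn.preprocessing import StandardScaler
    # Option 2: center each column to mean 0 and scale it to unit variance
    scaler = StandardScaler()
    scaler.fit(df)
    scaled_data = scaler.transform(df)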
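
To make the difference between the two options concrete, here is a minimal sketch on a hypothetical toy DataFrame (the columns `a`, `b`, `c`, the random seed, and the injected outlier are all assumptions for illustration, not my real data). It runs PCA after each scaler and compares the per-column variances that PCA actually sees:

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical toy data: two correlated columns on very different scales,
# plus an unrelated column that contains a single large outlier.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
df = pd.DataFrame({
    "a": base + rng.normal(scale=0.5, size=200),
    "b": 100 * base + rng.normal(scale=50, size=200),
    "c": rng.normal(size=200),
})
df.loc[0, "c"] = 50.0  # one outlier stretches the range of column "c"

# Apply both scalers to the same data.
minmax_scaled = MinMaxScaler().fit_transform(df)
standard_scaled = StandardScaler().fit_transform(df)

# Per-column variance after scaling: this is what PCA weighs columns by.
var_minmax = minmax_scaled.var(axis=0)
var_standard = standard_scaled.var(axis=0)

# Share of variance captured by the first two components in each case.
ratio_minmax = PCA(n_components=2).fit(minmax_scaled).explained_variance_ratio_
ratio_standard = PCA(n_components=2).fit(standard_scaled).explained_variance_ratio_

print("variances after MinMaxScaler:  ", var_minmax)
print("variances after StandardScaler:", var_standard)
print("explained ratio (min-max):     ", ratio_minmax)
print("explained ratio (standard):    ", ratio_standard)
```

`StandardScaler` gives every column unit variance, so PCA on its output is PCA on the correlation matrix. `MinMaxScaler` only fixes each column's range to [0, 1], so the columns keep different variances, and a column with an outlier gets squeezed toward a narrow band. That is why the two options produce different components.
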
kjetil b halvorsen
Itzy Death
  • What is "correct" depends on your application. Sometimes PCA on raw data is enough; sometimes you need demeaning or a log-transformation. Scaling seems to be overkill. – Sergey Bushmanov Jun 02 '21 at 13:39
  • https://stats.stackexchange.com/questions/53 answers essentially the same question (albeit concerning a different form of normalization). – whuber Jun 03 '21 at 12:28
  • [Very many similar Qs](https://www.google.com/search?q=site%3Ahttps%3A%2F%2Fstats.stackexchange.com++pca++standardization&safe=off&client=ubuntu&hs=OVq&channel=fs&sxsrf=ALeKk00HTWemfM05nPOwtaI1vtEeMABCfw%3A1622724761881&ei=mdC4YLudNd6z5OUPxMKZqAk&oq=site%3Ahttps%3A%2F%2Fstats.stackexchange.com++pca++standardization&gs_lcp=Cgdnd3Mtd2l6EAM6BwgAEEcQsANQ5eUBWM_1AWCwgwJoAXACeACAAWuIAfoBkgEDMi4xmAEAoAEBqgEHZ3dzLXdpesgBCMABAQ&sclient=gws-wiz&ved=0ahUKEwj7zf_nwPvwAhXeGbkGHURhBpUQ4dUDCA0&uact=5) – kjetil b halvorsen Jun 03 '21 at 12:57

0 Answers