
I want to find out whether it is acceptable not to center the data in a PCA when working with stock returns. Centering would remove the trend from the dataset, which I believe contains valuable information.

The paper cited below examines the similarities between PCA on centred and uncentred data and concludes that the results are more similar than expected. However, I am more interested in finding out in which cases it is acceptable not to center: centering is a widely used practice, and I need to know when I can skip it (and when I must use it).

Just to clarify, by centering I am not referring only to demeaning but also to standardizing.

Cadima J, Jolliffe IT. 2009. On relationships between uncentred and column-centred principal component analysis. Pak. J. Stat. 25, 473–503
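
For reference, the algebraic link between the two analyses can be written as a standard identity. The notation is an assumption made here for illustration and is not copied from the paper: $x_i$ is the $i$-th row of the $n \times p$ data matrix $X$ written as a column vector, $\bar{x}$ is the vector of column means, and $S$ is the covariance matrix with divisor $n$.

$$
\frac{1}{n} X^\top X \;=\; \frac{1}{n}\sum_{i=1}^{n} x_i x_i^\top \;=\; \underbrace{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^\top}_{S} \;+\; \bar{x}\,\bar{x}^\top
$$

Uncentred PCA diagonalises the left-hand side and centred PCA diagonalises $S$, so the two differ only through the rank-one term $\bar{x}\,\bar{x}^\top$. When the means are small relative to the standard deviations, as is often the case for daily returns, that term is small and the two sets of components should be close.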

kjetil b halvorsen
Camalo
  • I have always worked under the assumption that failing to centre before PCA will usually make things worse. Scaling is really a question of whether the variables are comparable before scaling: if they are not, scaling is sensible, but if they are, you might lose useful information. https://stats.stackexchange.com/questions/385775/normalizing-vs-scaling-before-pca and https://stats.stackexchange.com/questions/89809/is-it-important-to-scale-data-before-clustering may be of interest – Henry Jun 18 '21 at 11:49
  • It seems like you have a time series; maybe add the tag [tag:time-series] – kjetil b halvorsen Jun 20 '21 at 01:36
  • With stock returns you usually get away without de-meaning because the data are already (almost) mean zero. Otherwise, you must de-mean. Whether you also scale depends on what you are trying to do. – Aksakal Jun 20 '21 at 02:48
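
The point made in the comments can be checked numerically. The following is a minimal sketch, not an analysis of real data: the one-factor model, the betas, the drift values and the helper names `first_pc` and `abs_cos` are all hypothetical choices made for illustration. It compares the leading principal direction with and without centering, first on simulated near-zero-mean returns and then on the same data shifted by a large constant offset.

```python
import numpy as np


def first_pc(X, center=True):
    """Leading eigenvector of the second-moment matrix of X (centred or not)."""
    Z = X - X.mean(axis=0) if center else X
    M = Z.T @ Z / len(Z)
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    return eigvecs[:, -1]


def abs_cos(u, v):
    """Absolute cosine between two vectors (the sign of a PC is arbitrary)."""
    return abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))


rng = np.random.default_rng(0)
n_days, n_assets = 2000, 5

# Hypothetical one-factor model for daily returns (all values are assumptions).
betas = np.array([0.8, 1.0, 1.2, 0.9, 1.1])
market = rng.normal(0.0, 0.01, size=n_days)
noise = rng.normal(0.0, 0.005, size=(n_days, n_assets))
drift = np.array([0.0006, -0.0002, 0.0004, 0.0001, -0.0003])  # small daily means
returns = market[:, None] * betas + noise + drift

# Case 1: means are tiny relative to volatility, as with typical daily returns.
sim_returns = abs_cos(first_pc(returns, True), first_pc(returns, False))

# Case 2: same data plus a large constant offset, so the means dominate.
shifted = returns + np.array([0.05, -0.05, 0.05, -0.05, 0.05])
sim_shifted = abs_cos(first_pc(shifted, True), first_pc(shifted, False))

# With large means, the uncentred first PC mostly points along the mean vector.
mean_alignment = abs_cos(first_pc(shifted, False), shifted.mean(axis=0))

print("centred vs uncentred PC1, near-zero means:", sim_returns)
print("centred vs uncentred PC1, large means:", sim_shifted)
print("uncentred PC1 vs mean vector, large means:", mean_alignment)
```

In this simulation the two first components nearly coincide when the means are small, while with large means the uncentred first component mainly describes the mean vector rather than the co-movement of the series. Whether real return data behave like the first case depends on the sampling frequency and on the size of the drift relative to volatility.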

0 Answers