I am running a PCA to construct a financial stress index from several variables that I expect to move together in a period of "financial stress". Following what I have read in different papers, I will take the coefficients (loadings) of the first principal component (if it has enough explanatory power), divide them by the first eigenvalue, and use the result as the weights of the different variables.
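A minimal sketch of the weighting scheme described above, assuming the PCA is run on the correlation matrix of z-scored inputs. The data are synthetic (one common factor plus noise), the function names are hypothetical, and power iteration stands in for a library eigen-solver only to keep the example dependency-free. Dividing the loadings by the first eigenvalue is taken from the description above as given.

```python
import math
import random


def zscore_columns(rows):
    """Standardize each column to mean 0 and (population) std 1."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def correlation_matrix(z_rows):
    """Correlation matrix of already standardized data: Z'Z / n."""
    n, k = len(z_rows), len(z_rows[0])
    return [[sum(r[i] * r[j] for r in z_rows) / n for j in range(k)] for i in range(k)]


def leading_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit-length eigenvector.

    A correlation matrix is positive semi-definite, so the dominant
    eigenvalue found here is also the largest one.
    """
    k = len(matrix)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    if sum(v) < 0:  # the sign of an eigenvector is arbitrary; fix it for readability
        v = [-x for x in v]
    return eigenvalue, v


# Synthetic stand-in for the input variables: one common factor plus noise.
random.seed(0)
rows = []
for _ in range(500):
    factor = random.gauss(0, 1)
    rows.append([b * factor + random.gauss(0, 0.5) for b in (1.0, 0.8, 0.6)])

z = zscore_columns(rows)
eigenvalue, loadings = leading_eigenpair(correlation_matrix(z))

# Share of total variance explained by the first component (trace = number of variables).
explained_share = eigenvalue / len(loadings)

# Weights as described in the text: loadings divided by the first eigenvalue.
weights = [c / eigenvalue for c in loadings]

# The index is the weighted sum of the standardized variables at each date.
index = [sum(w * v for w, v in zip(weights, row)) for row in z]

print("explained share:", round(explained_share, 3))
print("weights:", [round(w, 3) for w in weights])
```

The `explained_share` value is what the "enough explanatory power" check would look at before the weights are used.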

My input variables are time series such as the VIX Index, CDS spreads, ..., which all seem to be non-stationary. Now my questions are:

  1. Should I take first differences of all the variables in order to have stationary data?
  2. Should I then z-score the differenced data, (value - mean)/std, in order to have all variables in the same units?

Or should I run the PCA directly on the non-stationary time series? Or directly on the z-scores, without differencing?

None of the papers I have found explains how to deal with non-stationary time series ...
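To make the two preprocessing steps in the questions concrete, here is a small sketch of first differencing followed by z-scoring. The numbers are made up for illustration (not real market data), and the names `first_difference`, `zscore`, `vix_level` and `cds_spread` are hypothetical.

```python
import math


def first_difference(series):
    """x_t - x_{t-1}; the result is one observation shorter than the input."""
    return [b - a for a, b in zip(series, series[1:])]


def zscore(series):
    """(value - mean) / std, using the sample standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / (n - 1))
    return [(v - mean) / std for v in series]


# Hypothetical levels, invented for this example only.
vix_level = [15.0, 15.5, 14.8, 18.2, 22.0, 19.5, 17.0]
cds_spread = [80.0, 82.0, 81.0, 95.0, 120.0, 110.0, 100.0]

levels = {"vix": vix_level, "cds": cds_spread}

# Step 1: difference each series. Step 2: z-score the differences.
panel = {name: zscore(first_difference(values)) for name, values in levels.items()}

for name, values in panel.items():
    print(name, [round(v, 2) for v in values])
```

The resulting `panel` is what would be passed to the PCA under option 1 + 2; skipping `first_difference` gives the "z-score without differencing" variant.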

kjetil b halvorsen
  • Since you are working with time series data you may want to look into something more like independent components analysis. – Matt Barstead Feb 05 '18 at 16:57
  • What do you mean? I am going to use PCA since a few papers reported excellent results and we want to apply this method, so my question is not whether it is good or not but how to proceed in this analysis with time series. – PieroBerna Feb 05 '18 at 17:02
  • I will have a look at it for sure, but are you able to help me with my questions? Thanks! – PieroBerna Feb 05 '18 at 17:28
  • Check [here](https://stats.stackexchange.com/questions/158281/can-pca-be-applied-for-time-series-data) for further details. – bastian.abaleiv Feb 05 '18 at 17:59
  • Hi, I already saw that discussion, but there is not really an answer. It says you can apply it to time series... but what about non-stationarity? What about z-scoring the data? – PieroBerna Feb 05 '18 at 18:03
  • What papers are you referring to? It would be helpful to cite them so we can see more about the context of the problem. PCA can be applied to anything, so this is more of a modelling question than a PCA one. – A. G. Feb 05 '18 at 22:54
  • @MattBarstead ICA is not specifically designed for time series data, nor is PCA improper for it. – Firebug Nov 01 '18 at 22:05
  • PCA will minimize the variance within each group/signal. You may want to think about whether that will be useful for your task or not. E.g., stationarizing the series might help grouping overall more similar signals, while with non-stationary data you might find that large shocks or trends will dominate the components – runr Jun 20 '20 at 23:33
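The point about trends dominating can be illustrated with a small simulation. This is only a sketch under stated assumptions: the series are independent Gaussian random walks (so any co-movement is spurious), and the helper names are hypothetical. Since PCA on z-scored data works from the correlation matrix, inflated correlations in levels feed directly into the first component.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def random_walk(rng, length):
    """Cumulative sum of independent standard normal shocks."""
    level, path = 0.0, []
    for _ in range(length):
        level += rng.gauss(0, 1)
        path.append(level)
    return path


def diff(series):
    """First differences x_t - x_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


def mean_abs_corr(n_trials=200, length=300, seed=1):
    """Average |correlation| between two independent walks: levels vs differences."""
    rng = random.Random(seed)
    in_levels, in_diffs = 0.0, 0.0
    for _ in range(n_trials):
        a, b = random_walk(rng, length), random_walk(rng, length)
        in_levels += abs(pearson(a, b))
        in_diffs += abs(pearson(diff(a), diff(b)))
    return in_levels / n_trials, in_diffs / n_trials


levels_corr, diffs_corr = mean_abs_corr()
print("mean |corr| in levels:     ", round(levels_corr, 3))
print("mean |corr| in differences:", round(diffs_corr, 3))
```

Although the two walks are unrelated by construction, their levels tend to show sizeable sample correlations, while the differenced series do not.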

1 Answer

As far as I understand, there is no need to difference the series. In this paper the authors provide a very intuitive explanation of using PCA to capture intra-day variation without taking differences of any type. I know it's not the same data-generating process (DGP), but the analysis should be similar.

  • I went quickly through the paper but didn't find anything helpful. Is there nobody who could help? – PieroBerna Feb 05 '18 at 20:37
  • Just another question: for indices like the VIX, should I use the index itself or, for example, the daily change in %? – PieroBerna Feb 06 '18 at 16:52
  • @PieroBerna the index design is up to you. If you are interested in risk measures, risk (volatility) is measured as the temporal change, as you describe. The VIX series appears to be non-stationary, so differencing is needed to obtain a weakly stationary time series (the mean value function $\mu_{t}$ is constant and does not depend on time $t$, and the autocovariance function depends on $s$ and $t$ only through the lag $|s-t|$). – bastian.abaleiv Feb 07 '18 at 17:27
  • Yes, this is what would make sense (and what is done in linear regression, ...), but I wasn't sure whether for PCA we can also work with non-stationary time series. The literature is somewhat confusing on this topic. I still haven't found a clear answer. In the construction of other indices, like the St. Louis, Kansas City and Cleveland Fed Financial Stress Indexes, nothing is said about the stationarity or non-stationarity of the data (but everything else is very clear). They also used variables like an index value (MSCI World) divided by its 200 MA, which is also non-stationary. I am still confused. – PieroBerna Feb 08 '18 at 06:18
  • Here are a few links: [link](https://core.ac.uk/download/pdf/6592699.pdf) and [link](https://core.ac.uk/download/pdf/6592699.pdf) – PieroBerna Feb 08 '18 at 12:04
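
For reference, the weak stationarity condition mentioned in the comments can be written out explicitly. The random walk below is an illustrative assumption (with $\varepsilon_t$ i.i.d., mean zero and variance $\sigma^2$), not a claim about how the VIX or any other input actually behaves.

$$
\mathbb{E}[x_t] = \mu \quad \text{for all } t, \qquad \operatorname{Cov}(x_s, x_t) = \gamma(|s-t|) \quad \text{for all } s, t.
$$

$$
x_t = x_{t-1} + \varepsilon_t,\quad x_0 = 0 \;\Rightarrow\; \operatorname{Var}(x_t) = t\,\sigma^2, \qquad \Delta x_t = x_t - x_{t-1} = \varepsilon_t .
$$

In this example the level $x_t$ violates the second condition because its variance grows with $t$, while the first difference $\Delta x_t$ satisfies both.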