
The motivation for this question comes from finance. I have some market data (daily time series) for the prices of some securities, and I would like to generate synthetic versions of these that are statistically "similar" (in some sense) for testing trading strategies. Is there literature on this subject?

I was hoping there would be a way of manipulating the market data I have in a deterministic way (such as, say, taking the first differences between consecutive values and shuffling them around) rather than extracting statistical information from the series (e.g. autocorrelation) and then generating new random variables to build a new series.
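For concreteness, the kind of manipulation I have in mind looks something like the following sketch (Python/numpy purely for illustration, and the helper name is mine): it keeps the one-day changes of the original series but scrambles their order.

```python
import numpy as np

def shuffle_first_differences(prices, seed=None):
    """Permute the first differences of a price series and cumulate
    them back onto the original starting value."""
    rng = np.random.default_rng(seed)
    diffs = np.diff(prices)      # consecutive changes
    rng.shuffle(diffs)           # "swap these around"
    return np.concatenate(([prices[0]], prices[0] + np.cumsum(diffs)))

# Toy example
prices = np.array([100.0, 101.5, 100.8, 102.3, 103.0])
print(shuffle_first_differences(prices, seed=0))
```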

I am being deliberately vague about what I mean by "similar" as I don't know how realistic my question is and don't want to constrain it further.

mathman

3 Answers


Maybe you can take a Fourier transform or a wavelet transform, flip the signs of randomly selected components (or shift phases in Fourier space), and then re-assemble the series. Of course, there is also a certain amount of literature on how to bootstrap time series (block bootstrap, mostly), which may or may not be related to what you want to do.
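A minimal sketch of the phase-randomization idea (numpy only; the function name is just for illustration): keep the Fourier amplitudes of the series, draw new phases uniformly at random, and invert the transform. This roughly preserves the power spectrum (and hence the autocorrelation) of the original series, but implicitly treats it as a linear, stationary process.

```python
import numpy as np

def phase_randomized_surrogate(x, seed=None):
    """Surrogate series with the same Fourier amplitudes as x but
    uniformly random phases."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    X = np.fft.rfft(x)
    X_new = X.copy()
    # Randomize phases of the non-trivial frequencies; leave the DC
    # term (and, for even n, the Nyquist term) alone so the inverse
    # transform stays real.
    stop = len(X) - 1 if n % 2 == 0 else len(X)
    phases = rng.uniform(0.0, 2.0 * np.pi, stop - 1)
    X_new[1:stop] = np.abs(X[1:stop]) * np.exp(1j * phases)
    return np.fft.irfft(X_new, n=n)
```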

StasK
  • Thanks for the block bootstrapping reference. I'm investigating just now. What do you mean by "flip the signs of the randomly selected components"? – mathman Sep 07 '11 at 08:17
  • @mathman This is a way to randomly change the data while retaining its absolute spectrum. Think of the data as a superposition of oscillations of different frequencies. Randomly shift the phase of each oscillation and reassemble them: you get new data that have many properties of the old. – whuber Sep 30 '11 at 16:42

There are several papers under the label of "surrogate data" in the nonlinear data-analysis literature that deal with the question of how to generate data with "similar" properties to some reference data. These surrogates are then used to test whether there is additional (nonlinear/chaotic) structure in the original data that is not captured by the surrogate-generation technique.

There are many different papers on this issue; Theiler and colleagues worked on it, and they do use spectral methods based on Fourier and wavelet transforms.
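If it helps, here is a rough sketch of one standard technique from that literature, the amplitude-adjusted Fourier transform (AAFT) surrogate associated with Theiler and co-workers. The code and names below are mine, so treat it as an outline rather than a reference implementation; the aim is to preserve both the marginal distribution of the data and (approximately) its power spectrum.

```python
import numpy as np

def aaft_surrogate(x, seed=None):
    """Amplitude-adjusted Fourier transform surrogate (sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    ranks = np.argsort(np.argsort(x))
    # 1. Gaussianize: reorder Gaussian noise to follow the ranks of x.
    gauss = np.sort(rng.standard_normal(n))[ranks]
    # 2. Phase-randomize the Gaussianized series.
    X = np.fft.rfft(gauss)
    stop = len(X) - 1 if n % 2 == 0 else len(X)
    X[1:stop] = np.abs(X[1:stop]) * np.exp(
        1j * rng.uniform(0.0, 2.0 * np.pi, stop - 1))
    shuffled = np.fft.irfft(X, n=n)
    # 3. Rescale: reorder the original values to follow the ranks of
    #    the phase-randomized series.
    return np.sort(x)[np.argsort(np.argsort(shuffled))]
```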

thias

Not sure about the Fourier transform approach, never heard of that. On the other hand, if you can make a distributional assumption (e.g. that changes are multivariate normal), it is easy to simulate from a multivariate normal distribution by running a Cholesky decomposition on the sample covariance matrix of your data set. You simply take the resulting triangular matrix, multiply it by a vector of uncorrelated standard normal draws, and you get a vector of samples respecting the covariance structure observed in your data set.

For example, in finance we typically model log-returns (the log of the ratio of a day's price to the previous day's price) as normally distributed. So we build a data set of log-returns, compute the covariance matrix, take its Cholesky decomposition, and simulate paths by drawing standard normal random variables and multiplying them by the triangular matrix obtained from the decomposition.
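A rough sketch of that recipe in numpy (the function name is mine, and adding the sample mean as a drift term is my own embellishment; drop mu if you only want to match the covariance structure):

```python
import numpy as np

def simulate_price_path(prices, n_steps, seed=None):
    """Simulate one synthetic price path assuming i.i.d. multivariate
    normal log-returns with the sample mean and covariance of `prices`
    (an (n_days, n_assets) array of historical closes)."""
    rng = np.random.default_rng(seed)
    log_ret = np.diff(np.log(prices), axis=0)               # historical log-returns
    mu = log_ret.mean(axis=0)
    L = np.linalg.cholesky(np.cov(log_ret, rowvar=False))   # lower-triangular factor
    z = rng.standard_normal((n_steps, log_ret.shape[1]))    # uncorrelated N(0,1) draws
    sim_ret = mu + z @ L.T                                   # correlated normal log-returns
    return prices[-1] * np.exp(np.cumsum(sim_ret, axis=0))  # rebuild prices from the last close
```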

William