Test to distinguish periodic from almost periodic data

Question

Suppose I have some unknown function $f$ with domain $ℝ$, which I know to fulfill some reasonable conditions like continuity. I know the exact values of $f$ (because the data comes from a simulation) at some equidistant sampling points $t_i=t_0 + iΔt$ with $i∈\{1,…,n\}$, which I can assume to be sufficiently fine to capture all relevant aspects of $f$, e.g., I can assume that there is at most one local extremum of $f$ in between two sampling points. I am looking for a test that tells me whether my data complies with $f$ being exactly periodic, i.e., $∃τ: f(t+τ)=f(t) \,∀\,t$, with the period length being somewhat reasonable, for example $Δt < τ < n·Δt$ (but it’s conceivable that I can impose stronger constraints, if needed).

From another point of view, I have data $\{x_1, …, x_n\}$ and am looking for a test that answers the question whether a periodic function $f$ (fulfilling conditions as above) exists such that $f(t_i)=x_i \,∀\, i$.

The important point is that $f$ is at least very close to periodicity (it could be for example $f(t) := \sin(g(t)·t)$ or $f(t) := g(t)·\sin(t)$ with $g'(t) ≪ g(t_0)/Δt$) to the extent that changing one data point by a small amount may suffice to make the data comply with $f$ being exactly periodic. Thus standard tools for frequency analysis such as the Fourier transform or analysing zero crossings will not help much.

Note that the test I am looking for will likely not be probabilistic.

I have some ideas how to design such a test myself but want to avoid reinventing the wheel. So I am looking for an existing test.

Given that you have *data*, could you explain what you mean by the test not being "statistical"? What kind of test do you have in mind then? — whuber, Sep 13 '14 at 18:43
@whuber: The test does not need to involve probability at all, e.g., by making probabilistic assumptions about the data or returning a *p*-value. However, I just noted that this is not necessarily included in the definition of *statistical,* so I replaced this part. — Wrzlprmft, Sep 13 '14 at 18:50
"will likely not be probabilistic" is still equivalent to "not statistical" in my view. — tchakravarty, Sep 13 '14 at 19:05
By the way, you might want to start [here](http://www.jstor.org/stable/2287456) in case you _are_ looking for a statistical test of periodicity. — tchakravarty, Sep 13 '14 at 19:06
@fgnu: As I said, *probabilistic* is not necessarily included in the definition of *statistical* (see, e.g., [Wikipedia on this](http://en.wikipedia.org/wiki/Statistics)). Anyway, this does not matter, as my intention for that sentence is captured by *probabilistic* as well, if not better. I already looked in the citation environment of Siegel’s paper (but may have missed something). — Wrzlprmft, Sep 13 '14 at 19:15
I would like to suggest you clarify what you mean by a "test," then, given that you might not adopt any probability model. Obviously it would not be a hypothesis test. What other kind of test might it be? — whuber, Sep 13 '14 at 19:22
@Wrzlprmft: Do you maybe mean that you want to check ("test") if your data is exactly periodic up to machine precision? Or do you have some other precision in mind? What does "exactly" periodic mean for you? — amoeba, Sep 13 '14 at 19:26
@whuber: I (hopefully) clarified this. Whether it is a hypothesis test or not, depends on your definition of *hypothesis test.* It does test a hypothesis in some sense (“there is a periodic function such that …”), however it is not a probabilistic hypothesis. — Wrzlprmft, Sep 13 '14 at 20:15
@amoeba: Yes (to your first question). The main difficulty is that I do not have arbitrarily fine samples of the data. See also my previous comment. — Wrzlprmft, Sep 13 '14 at 20:19
What I am trying to find out is exactly how your test would work. It is not at all clear what you conceive of as being a "test" when you have no model--even a qualitative one--of how the data could possibly vary. — whuber, Sep 13 '14 at 21:10
@whuber: … varies from what? The data is exact. The only lacking information is due to the finite sampling. — Wrzlprmft, Sep 13 '14 at 21:39
How were the sampling points determined? Since you presumably don't know exactly what $f$ is, then if someone else were to sample $f$, wouldn't they use different "times" and therefore obtain different values? That's variability. Incidentally, there is no such thing as exact *data* unless you are performing a theoretical mathematical exercise, so it would be a good idea to explain how you have found the values of $f$. — whuber, Sep 13 '14 at 21:49
@whuber: It is simulated data (I added that to the question), so it’s exact up to floating-point precision (or at least close). In particular any inaccuracy is many orders of magnitude lower than, e.g., the standard deviation of the time series. If the test performs any arithmetic operations on the data, its numerical implementation will likely cause comparable inaccuracies. — Wrzlprmft, Sep 13 '14 at 22:07
As @whuber and amoeba are driving at, this question will remain difficult to answer until a satisfactory definition of *periodic* and/or *test* is supplied. Given $n$ arbitrary points sampled without error there are infinitely many continuous periodic functions (using the literal definition) that will fit the points. It's a simple exercise in interpolation. But this is obviously no more an answer to your question than the fact that a set of $n$ random predictors will perfectly fit $n$ points via linear regression. Hence, we wait with bated breath for your clarification. — cardinal, Sep 19 '14 at 23:41
@cardinal: I have never heard a definition of periodicity other than the one that I now gave in my question. Regarding your interpolation suggestion: That’s why I added the criterion that the sampling points are sufficiently fine to capture relevant aspects of $f$, such as the local extrema, as well as the criterion that the period length shall be somewhat reasonable (i.e., not extremely short or long). I could specify all those things, but this could make my requirements on the test overly specific. — Wrzlprmft, Sep 20 '14 at 08:33
I confess to remaining completely puzzled about the setting and the objectives. By denying any role for variability and failing to *quantify* the sense in which "very close" to periodicity is meant, this question seems to reduce to a purely mathematical problem and it seems to have a trivial answer: apply your periodicity criterion to each of the $n$ possible values of $\tau\in\{i\Delta t,i=1,2,\ldots,n\}$ to see whether it holds. (If you contemplate any other value of $\tau$ then @Cardinal's comment comes in with full force.) — whuber, Sep 22 '14 at 13:57
@whuber: And I fail to see what is so unclear (not that I would blame anybody for it). Your suggestion applies only if the period length $τ$ is a multiple of $Δt$. So, yes, I consider other values of $τ$, but I already tried to explain in my last comment why this does not make this an interpolation problem: The resulting function will either have an unreasonably long period length ($τ>n·Δt$) or the resulting $f$ will have multiple local extrema between sampling points (in which case the condition that the sampling is sufficiently fine to capture all local extrema of $f$ does not hold anymore). — Wrzlprmft, Sep 22 '14 at 14:20
For any $\tau$ that is not a rational multiple of $\Delta t$, the data you have can *always* be viewed as a sample of a continuous periodic function of period $\tau$ because you have no observations exactly an integral multiple of $\tau$ apart. This leads to @cardinal's observations, which amount to noting that this conclusion is too trivial to be useful but nevertheless you haven't provided any criteria to rule it out. — whuber, Sep 22 '14 at 16:05
@whuber: I disagree. The respective continuous periodic function has to heavily oscillate to match all the observations. In particular it will have several local extrema between two sampling points. This is ruled out by the assumption that my sampling points are sufficiently fine to capture all relevant aspects of the function. — Wrzlprmft, Sep 22 '14 at 17:28

Wrzlprmft · Accepted Answer · 2015-11-10T06:57:14.877

As I said, I had an idea how to do this. I implemented and refined it and wrote a paper about it, which is now published: Chaos 25, 113106 (2015); a preprint is on ArXiv.

The investigated criterion is almost the same as sketched in the question: Given data $x_1, \ldots, x_n$ sampled at time points $t_0 + Δt, \ldots, t_0 + nΔt$, the test decides whether there is a function $f: [t_0, t_0 + nΔt] → ℝ$ and a $τ ∈ [2Δt,(n-1)Δt]$ such that:

$f(t_0 + iΔt)=x_i\quad \forall i∈\{1,…,n\}$
$f(t+τ)=f(t) \quad∀t∈[t_0, t_0 + nΔt-τ]$
$f$ has no more local extrema than the sequence $x$, with the possible exception of at most one extremum close to the beginning and end of $f$ each.

The test can be modified to account for small errors, such as numerical errors of the simulation method.
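
For orientation only, here is a minimal Python sketch of the much simpler special case raised in the comments, where $τ$ is restricted to integer multiples of $Δt$. It is not the test from the paper, which also covers other values of $τ$ and the condition on local extrema. The function name `integer_periods` and the parameter `tol` are made up for this illustration; `tol` stands in for the allowance for small numerical errors.

```python
import numpy as np

def integer_periods(x, tol=0.0):
    """Return every shift k (in samples, 2 <= k <= n-1) for which
    x[i + k] equals x[i] within tol on the whole overlap.

    Simplification (assumption of this sketch): only periods that are
    integer multiples of the sampling interval are considered.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    return [
        k for k in range(2, n)
        # Compare the series with a copy of itself shifted by k samples.
        if np.all(np.abs(x[k:] - x[:-k]) <= tol)
    ]

# Hypothetical example data: a sine with 8 samples per period,
# and the same sine with a slowly growing amplitude.
i = np.arange(32)
periodic = np.sin(2 * np.pi * i / 8)
almost_periodic = (1 + 1e-6 * i) * periodic

print(integer_periods(periodic, tol=1e-12))
print(integer_periods(almost_periodic, tol=1e-12))
```

With `tol=0` the comparison is exact; a small positive `tol` allows for numerical errors of the simulation. An empty result only rules out periods that are integer multiples of $Δt$.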

I hope that my paper also answers why I was interested in such a test.

score -1 · Answer 2 · answered Sep 16 '14 at 14:00

Transform the data into frequency domain using the discrete Fourier transform (DFT). If the data is perfectly periodic, there will be exactly one frequency bin with a high value, and other bins will be zero (or near zero, see spectral leakage).

Note that the frequency resolution is given by $\frac{\text{sampling frequency}}{\text{Number of samples}}$. So this sets the limit for the detection precision.
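
A minimal sketch of this suggestion, assuming NumPy's FFT routines. The function name `dft_peak` and the example signal are made up for illustration. It returns the frequency of the strongest bin together with the resolution $\frac{\text{sampling frequency}}{\text{number of samples}}$.

```python
import numpy as np

def dft_peak(x, dt):
    """Return (peak frequency, frequency resolution, magnitude spectrum)
    of a real-valued series x sampled with interval dt."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Remove the mean so that the zero-frequency bin does not dominate.
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(n, d=dt)
    # Sampling frequency (1/dt) divided by the number of samples.
    resolution = 1.0 / (n * dt)
    return freqs[int(np.argmax(spectrum))], resolution, spectrum

# Hypothetical example: 1000 samples of a sine with frequency 1.
dt = 0.1
t = dt * np.arange(1000)
peak, resolution, _ = dft_peak(np.sin(2 * np.pi * t), dt)
print(peak, resolution)
```

The peak location by itself is a coarse criterion: in this example, a slightly amplitude-modulated signal such as `(1 + 1e-6 * t) * np.sin(2 * np.pi * t)` has its peak in the same bin.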

asa

As I already stated in the question, the Fourier transform (at least all by itself) is not even remotely precise enough to detect the differences I am interested in and will hardly detect any difference between $\sin(x)$ and $(1+εx)·\sin(x)$. Also, what you are claiming only holds for sinusoidal data. For any other data, the subharmonics will show up. – Wrzlprmft Sep 16 '14 at 14:16

score -2 · Answer 3 · answered Sep 16 '14 at 14:12

If you know the actual periodic signal, calculate

$\text{difference} = \Big|\text{theoretical data} - \text{measured data}\Big|$

Then sum the elements of $\text{difference}$. If the sum is above a threshold (chosen to allow for errors from floating-point arithmetic), the data is not periodic.
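
A minimal sketch of this comparison, assuming the reference signal is available at the same sampling points as the data. The function name `deviates_from_reference` and the threshold value in the example are made up for illustration.

```python
import numpy as np

def deviates_from_reference(measured, reference, threshold):
    """Return True if the summed absolute difference between the
    measured data and the known reference signal exceeds threshold."""
    measured = np.asarray(measured, dtype=float)
    reference = np.asarray(reference, dtype=float)
    difference = np.abs(reference - measured)
    return bool(difference.sum() > threshold)

# Hypothetical example: a sine as reference, and a copy of it with a
# slowly growing amplitude as "measured" data.
t = 0.1 * np.arange(100)
reference = np.sin(t)
measured = (1 + 1e-3 * t) * np.sin(t)
print(deviates_from_reference(measured, reference, threshold=1e-9))
```

The threshold has to be chosen by hand; it should be larger than the accumulated rounding error over all samples.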

asdsaj

Apart from the fact that I do not know the underlying signal, this has nothing to do with periodicity but would work whenever I know the underlying signal. – Wrzlprmft Sep 16 '14 at 14:23
