2

I noticed that R computes the acf(data,n) in order $O(n)$ time, not $O(n^2)$, so it cannot compute it brute force using double for loops. How does R calculate it this quick? What is the algorithm?

Mikkel Rev
  • 721
  • 3
  • 16
  • 2
    How do you so confidently state that it's $O(n)$ rather than say $O(n\log n)$? – Glen_b Aug 23 '17 at 22:53
  • Note that an explanation of how to see the code that performs R's `acf` function is given [here](https://stats.stackexchange.com/questions/254227/manual-calculation-of-acf-of-a-time-series-in-r-close-but-not-quite). I expected it would use a fast Fourier transform, but (at least judging from a quick glance) the `acf`-related parts of `filter.c` don't seem to be doing that. – Glen_b Aug 23 '17 at 22:58
  • 2
    @Glen_b In this question, $n$ is the maximum lag, not the data size. It is apparent that the algorithm, whatever it might be, will be $O(n)$, since the output contains $n+1$ numbers. However, the implicit constant is $N$ where $N$ is the length of the data--and this will dominate the asymptotics, of course (since $N \ge n$). Thus, **this question is misleading:** the algorithm *must* be at least $O(Nn) \ge O(n^2)$, not $O(n)$ or even $O(n\log(n))$. – whuber Aug 24 '17 at 14:49
  • $O(Nn) = O(n)$ by definition for all $N \in \mathbb{N}$ – Mikkel Rev Aug 24 '17 at 19:42
  • 1
    Marius, your comment makes no sense after what @whuber just explained. – Digio Aug 25 '17 at 09:07
  • 2
    Marius, your assertion is correct *when $N$ does not depend on $n$*. However, in your question--as I pointed out--necessarily $N$ is no less than $n$. Thus $N$ cannot be considered a constant (asymptotically in $n$). In this case it is no longer true that $O(Nn)=O(n)$: it follows that $O(Nn) \ge O(n^2)$. – whuber Aug 25 '17 at 12:40
  • I can change it to whatever you want. What would you like, $O(Nn)$? If you know the answer to the original problem, would you like to answer? – Mikkel Rev Aug 26 '17 at 09:19

0 Answers0