
I have a large number of time series that are independent of each other but share some seasonality patterns. I need to detect anomalies or changes (increased volume, change in mean) that appear in the individual time series. I also have some potential explanatory variables, but they seem to be rather weak predictors.

What would be the best approach for such data? I need something that scales to larger data. I would rather sacrifice accuracy for computational performance than the opposite, so I'm looking for something simple.

Tim
  • As a broad strategy, maybe estimate the seasonality first, add that as a (more informative) explanatory variable, then do your anomaly/change detection on each time series individually? – Danica Feb 05 '18 at 20:39
  • @Dougal good point, I can estimate this on the aggregate of the individual series. – Tim Feb 05 '18 at 20:45
  • How about, characterizing all the time series and then clustering? – Bussller Feb 07 '18 at 03:12
  • @RussellB I'm afraid I can't see how clustering could be used for anomaly detection in time series (it sounds like a [rather bad](https://stats.stackexchange.com/questions/182232/fit-mixture-of-distributions-to-your-time-series-data-in-r/182354#182354) idea), and I'm afraid that clustering in most cases doesn't scale well, so it does not meet my constraint. – Tim Feb 07 '18 at 09:31
  • I agree. Clustering can be used for grouping time series, which might help you with anomalies. Having said that, scalability could be an issue. Have you looked at Twitter's time-series anomaly detection? – Bussller Feb 07 '18 at 16:46
  • @RussellB this is what I'm considering at the moment, but I'm looking for possible alternatives – Tim Feb 07 '18 at 16:59
  • I meant that I don't think clustering is a good idea here – Tim Feb 07 '18 at 17:00
  • Yahoo's approach to the same problem could be a viable alternative. It's worth checking if you have not looked at it. – Bussller Feb 07 '18 at 17:02
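
The strategy suggested in the comments (estimate the shared seasonality on the aggregate, then check each series individually) could be sketched roughly as below. This is only a minimal illustration under stated assumptions, not a method proposed by anyone in the thread: it assumes the series are stacked in an array of shape `(n_series, n_time)`, that the seasonal period is known, and it uses a robust z-score (median and MAD) as the per-series detector. The function names, the threshold of 4 and the synthetic data are all hypothetical.

```python
import numpy as np

def robust_standardise(A):
    """Centre each row by its median and scale it by 1.4826 * MAD."""
    med = np.median(A, axis=1, keepdims=True)
    mad = 1.4826 * np.median(np.abs(A - med), axis=1, keepdims=True)
    mad[mad == 0] = 1.0  # guard against constant series
    return (A - med) / mad

def flag_anomalies(X, period, threshold=4.0):
    """X has shape (n_series, n_time); returns boolean flags and scores."""
    # Put all series on a comparable scale so large ones do not dominate.
    Z = robust_standardise(X)
    # Shared seasonal profile: one median per phase, pooled over all series.
    phase = np.arange(X.shape[1]) % period
    profile = np.array([np.median(Z[:, phase == p]) for p in range(period)])
    # Remove the shared seasonality, then score residuals series by series.
    resid = Z - profile[phase]
    scores = robust_standardise(resid)
    return np.abs(scores) > threshold, scores

# Synthetic data (assumed for illustration only): 50 series, weekly pattern.
rng = np.random.default_rng(0)
n_series, n_time, period = 50, 140, 7
pattern = np.sin(2 * np.pi * np.arange(period) / period)
level = rng.uniform(10, 1000, size=(n_series, 1))
scale = rng.uniform(1, 50, size=(n_series, 1))
noise = rng.normal(0, 0.3, size=(n_series, n_time))
X = level + scale * (pattern[np.arange(n_time) % period] + noise)
X[3, 100] += 10 * scale[3, 0]  # inject a single spike into series 3

flags, scores = flag_anomalies(X, period)
print(np.argwhere(flags))  # (series, time) pairs that were flagged
```

Everything here is a few medians per series, so the cost grows linearly with the amount of data. The sketch only targets point anomalies; a sustained change in mean would need something extra on the same residuals (for example a rolling median or a CUSUM-style check), which is not shown.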

1 Answer


Transfer function model identification can be used to develop useful equations that can then be classified or segmented.

"Things" can be simplified for expedience, if that is the goal. Doing this with freely available web resources is a little bit more tricky.

IrishStat