I'm trying to detect outliers in a time series that contains seasonality.
I tried Grubbs's test, but it confuses outliers with the high/low peaks of the seasonal pattern.
It would be very helpful if someone could share a solution to this problem.
The simple answer is: estimate the seasonality and then subtract it off. Periodic splines are one way to estimate seasonality, or you can consider an FFT. A challenge is that outliers distort these estimates, which makes the subsequent detection less sensitive, so one can consider trimming or using an EM algorithm to iteratively downweight outliers when estimating the seasonal effects.
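As a rough illustration of "estimate, subtract, then test the residuals", here is a minimal sketch. It does not use periodic splines or an FFT; it swaps in a simpler robust stand-in, the per-phase median, as the seasonal estimate, and flags residuals using a MAD-based score instead of Grubbs's test. The function name `seasonal_outliers`, the `threshold` of 3.5, and the synthetic weekly series are all assumptions made for the example, and the period is assumed to be known in advance.

```python
import numpy as np

def seasonal_outliers(y, period, threshold=3.5):
    """Return indices whose deseasonalized value is extreme in robust units.

    Hypothetical helper: assumes a known, fixed period and no trend.
    """
    y = np.asarray(y, dtype=float)
    phase = np.arange(len(y)) % period
    # Robust seasonal estimate: median of all observations sharing a phase.
    seasonal = np.array([np.median(y[phase == p]) for p in range(period)])
    resid = y - seasonal[phase]
    # Robust scale: median absolute deviation, rescaled to a Gaussian sigma.
    center = np.median(resid)
    scale = 1.4826 * np.median(np.abs(resid - center))
    if scale == 0:
        # Degenerate case: more than half of the residuals are identical.
        return np.flatnonzero(resid != center)
    return np.flatnonzero(np.abs(resid - center) / scale > threshold)

# Synthetic weekly pattern with one injected spike (made-up data).
rng = np.random.default_rng(0)
t = np.arange(7 * 20)
y = 10 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 1, t.size)
y[50] += 15
print(seasonal_outliers(y, period=7))
```

Because the median ignores a minority of extreme values within each phase, a single spike barely moves the seasonal estimate, which is the same motivation as the trimming or downweighting mentioned above. A series with trend or a changing seasonal pattern would need more than this.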
Not to complicate things BUT ...
1) If there is some ARIMA structure, you often have to peel that away before the anomalies can be identified; see http://docplayer.net/12080848-Outliers-level-shifts-and-variance-changes-in-time-series.html
As @AdamO once wisely reflected, in "Interrupted Time Series Analysis - ARIMAX for High Frequency Biological Data?", on determining the form of the ARIMA model: "The correlogram should be calculated from residuals using a model that controls for intervention administration, otherwise the intervention effects are taken to be Gaussian noise, underestimating the actual autoregressive effect."
In the same spirit, but upside down and backwards: "detecting the intervention effects has to be done using an appropriate ARIMA model, otherwise the interventions can be occluded/masked due to the presence of non-Gaussian noise."
2) If there are level shifts or trend changes, these may also have to be peeled away before anomalies can be identified.
3) If the error variance is non-constant, due to the need for a power transform or a weighted least squares adjustment, this may have to be peeled away before anomalies can be identified.
4) Consider the daily series
1,1,1,5,1,1,1
1,1,1,5,1,1,1
1,1,1,5,1,1,1
1,1,1,-5,1,1,1
1,1,1,-5,1,1,1
1,1,1,-5,1,1,1
where the day-4 effect changes at a particular point in time.
This often happens in daily data, and it presents an opportunity to detect changes in seasonal effects rather than flag them as anomalous.
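To make the day-4 example concrete, here is a minimal sketch that scans for a single change point in one day-of-week effect by least squares. This is only an illustration of the idea, not the procedure a full intervention-detection package would use; the helper name `best_split` is hypothetical, and it assumes at most one change and otherwise constant effects.

```python
import numpy as np

def best_split(x):
    """Return (k, cost) for the split x[:k] / x[k:] with the smallest
    total within-segment sum of squares. Hypothetical helper."""
    x = np.asarray(x, dtype=float)
    best_k, best_cost = None, np.inf
    for k in range(1, len(x)):
        left, right = x[:k], x[k:]
        cost = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost

# The six weeks from the example, one row per week.
weeks = np.array([
    [1, 1, 1, 5, 1, 1, 1],
    [1, 1, 1, 5, 1, 1, 1],
    [1, 1, 1, 5, 1, 1, 1],
    [1, 1, 1, -5, 1, 1, 1],
    [1, 1, 1, -5, 1, 1, 1],
    [1, 1, 1, -5, 1, 1, 1],
])
day4 = weeks[:, 3]                                  # the day-4 column
k, cost = best_split(day4)
no_split_cost = ((day4 - day4.mean()) ** 2).sum()   # one constant effect
print(k, cost, no_split_cost)                       # 3 0.0 150.0
```

With a single constant day-4 effect, every one of those six values sits 5 units from the mean and would look anomalous; allowing the effect to change after week 3 explains all of them exactly.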
So many opportunities to do it right ..... and just 1 way to do it rong ...