What am I trying to achieve?
I am trying to test whether there is a structural break in a time series of proportions at a known break date (21 Dec 2019). Below is a plot of the original time series (top panel) and its STL decomposition:
What approach am I taking?
- Apply logit transformation on the proportions
- Apply STL decomposition to remove seasonality from data.
- Use the seasonally-adjusted values from the STL decomposition to fit a linear regression model, which will take the form of an AR(p) or MA(q) model. Fit it three times:
  - Model on the entire time series
  - Model on the time series before the break date
  - Model on the time series after the break date
- Apply the Chow-test by computing the Chow test-statistic and generating the associated p-value.
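To make the steps above concrete, here is a minimal sketch of the Chow computation. It is written in Python using only the standard library, purely for illustration; the data are simulated (not my series), the helper names `logit`, `ols_rss` and `chow_f` are hypothetical, and the regression is a plain OLS fit on a time index with k = 2 parameters, so the AR/MA terms and the STL step are left out.

```python
import math
import random

def logit(p):
    """Map a proportion in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def ols_rss(x, y):
    """Residual sum of squares from regressing y on an intercept and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

def chow_f(x, y, break_idx, k=2):
    """Chow F statistic; break_idx is the first index of the second regime."""
    rss_pooled = ols_rss(x, y)
    rss_split = ols_rss(x[:break_idx], y[:break_idx]) + ols_rss(x[break_idx:], y[break_idx:])
    n = len(y)
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))

# Simulated proportions (assumption): linear trend on the logit scale,
# with and without a level shift at the midpoint.
rng = random.Random(0)
n, break_idx = 60, 30
t = list(range(n))
noise = [rng.gauss(0, 0.1) for _ in t]

def to_prop(z):
    return 1.0 / (1.0 + math.exp(-z))

p_break = [to_prop(0.5 + 0.01 * ti + (1.0 if ti >= break_idx else 0.0) + e) for ti, e in zip(t, noise)]
p_null = [to_prop(0.5 + 0.01 * ti + e) for ti, e in zip(t, noise)]

f_break = chow_f(t, [logit(p) for p in p_break], break_idx)
f_null = chow_f(t, [logit(p) for p in p_null], break_idx)
print(f_break, f_null)
```

Under the null of no break, and with independent, homoskedastic errors, the statistic is compared against an F distribution with (k, n - 2k) degrees of freedom to get the p-value. Autocorrelated residuals would violate that assumption, which is part of why I want AR/MA terms in the model.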
Here is the seasonally-adjusted time-series of the logit-transformed data:
Where am I unsure?
I want to apply ARIMA models (including the simpler AR(p) and MA(q) special cases) because they resemble linear regression models, which is what the Chow test requires.
These models require a stationary time series, so the series must be de-trended; the de-trending can be done as part of estimating an ARIMA model (through differencing).
However, if I remove the trend as well as the seasonality, I am left with a time series of stationary (random) errors for the Chow test. For instance, this is what the seasonally-adjusted time-series looks like after first-order differencing:
This is where I am confused. Can I still detect a structural break at the point where there is a sudden change in mean or trend or variance values, when the time-series of stationary errors has, by definition of stationarity, constant mean and variance?
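To illustrate the concern, here is a small Python sketch with made-up data (not my series). It takes first differences of two hypothetical series, one with a level shift and one with a slope change at the same point, and compares the mean of the differenced values before and after the break. The shift sizes and noise level are arbitrary assumptions.

```python
import random

rng = random.Random(1)
n, break_idx = 100, 50

# Hypothetical series 1: level shift of 2.0 starting at break_idx.
level_shift = [(2.0 if t >= break_idx else 0.0) + rng.gauss(0, 0.1) for t in range(n)]
# Hypothetical series 2: slope changes from 0.0 to 0.1 per step after break_idx.
slope_change = [0.1 * max(0, t - break_idx) + rng.gauss(0, 0.1) for t in range(n)]

def diff(series):
    """First-order differences: series[i + 1] - series[i]."""
    return [b - a for a, b in zip(series, series[1:])]

def mean(values):
    return sum(values) / len(values)

d_level = diff(level_shift)
d_slope = diff(slope_change)

# The jump in the level-shift series lands in a single difference, at index break_idx - 1.
print("level shift: mean before", mean(d_level[:break_idx - 1]),
      "spike", d_level[break_idx - 1],
      "mean after", mean(d_level[break_idx:]))
# The slope change shows up as a shift in the mean of the differences.
print("slope change: mean before", mean(d_slope[:break_idx - 1]),
      "mean after", mean(d_slope[break_idx:]))
```

In this toy case the level shift collapses into a single spike after differencing, while the slope change becomes a shift in the mean of the differenced series. So what a test on the differenced data can detect seems to depend on the kind of break.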
Therefore, what data should I model using AR(p) or MA(q)?
- The de-trended and de-seasonalised (stationary) time-series data? (remainder)
- The de-seasonalised, trend-only time-series data? (trend only)
- The seasonally-adjusted time-series data? (trend + remainder)
Have I taken different approaches?
As an alternative, I am considering modelling the seasonally-adjusted (trend + remainder) time series using a linear regression with autoregressive errors, by regressing the logit-transformed data on time and giving the errors an ARIMA structure:
$$y_t = \beta_0 + \beta_1 \cdot \text{time}_t + \Theta^{-1}(B) w_t$$
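If I go this way, one way I could write the break into the same model is with a dummy variable. This is only a sketch under my own assumptions: $D_t$ is a hypothetical indicator equal to 1 on or after the break date and 0 before it, and $\delta_0$, $\delta_1$ are the shifts in intercept and slope.

$$y_t = \beta_0 + \beta_1 \cdot \text{time}_t + \delta_0 D_t + \delta_1 D_t \cdot \text{time}_t + \Theta^{-1}(B) w_t$$

A joint test of $H_0: \delta_0 = \delta_1 = 0$ would then play the role of the Chow test, with the error structure handling the autocorrelation.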
Bonus question
Am I taking the right approach for testing for structural breaks, or do you recommend other approaches that can be implemented in R?
I am already aware of the strucchange package, so I would also like to know whether the data need to be stationary before being passed to it.