Choosing a changepoint detection algorithm

Question

I've been reading up on changepoint algorithms (dynamic programming, Bayesian Online Changepoint detection, Hidden Markov Models, etc.) and am looking to implement an algorithm that has a certain set of attributes, ideally in Python. I don't have a formal stats/math background, so it can be a bit tedious to read through the material and many of the algorithms have limitations, which are sometimes discovered halfway through a paper or blog post.

What I'm looking for:

Fast/online. If an algorithm is sufficiently fast, it's probably less important that it's online. My understanding is that some offline algorithms can be adapted to be online by refining the last "known" changepoint until another is found.
Allows for multiple changepoints, where the total number of changepoints is unknown before running the algorithm
Does not require specifying an exact number of classes for the segments resulting from the changepoints
Allows for multivariate data where the variables are evaluated jointly. Perhaps an alternative would be to perform multiple single variate analyses and attempt to merge changepoints that occur in different variables but near in time. Just a thought since I haven't specifically come across this, but a true multivariate solution would be ideal.
Accounts for changes in standard deviation and not just mean. This could perhaps be replicated by calculating the absolute distance from the latest mean in real time and passing that in as an additional variable. In my specific case, I expect the individual time steps and segment means to be +/- from 0, so simply using absolute value might make it easier and avoid the need for additional recursion.

And obviously something that's as exact as possible would be ideal.

Any suggestions?

score 1 · Accepted Answer · answered Jan 28 '21 at 13:04

1

The R package bcp seem to fulfill all of these (associated paper here). It returns the probability of change point at each index in your data, so you have to set a threshold yourself. This is a nice feature compared to many other packages.

For multivariate change point detection, it requires that the data is in a matrix format, i.e., that all outcome variables are observed simultaneously (or close enough that it's OK to pretend).

It does not model autocorrelation which is often an important feature of time series.

I have tried to make an overview of change point packages in R. Few match your problem but take a look at the list of not-yet-reviewed packages at the top of the page.

answered Jan 28 '21 at 13:04

Jonas Lindeløv

1,778
1
17
28

Excellent overview of the packages. I should have specified that I'm working in Python, but it looks like there are ways to run R packages in Python. The `bcp` package does look like a good option but I've also come across a number of Git repositories based on Bayesian Online Changepoint Detection for Python. Do you have any thoughts on PELT? That was another that has stood out so far, but I haven't gotten deep enough into it to see if it handles all of the requirements I mentioned. Not sure if it's part of one of the packages you listed. – SuperCodeBrah Jan 28 '21 at 15:52
Jonas maybe you could add changepoint.geo to your list of not-yet-reviewed? – adunaic Jan 28 '21 at 16:18
@adunaic Will do, thanks for the hint. – Jonas Lindeløv Jan 28 '21 at 20:30

score 1 · Answer 2 · answered Jan 28 '21 at 16:44

1

Another R package that meets these options (and can be implemented online) is changepoint.geo. It reduces the multivariate time series down to a bi-variate (angle and distance) series. These are designed to identify changes in mean and variance. Currently, all other projection-based (for dimension reduction and time efficiencies) approaches only consider changes in the mean behaviour. In contrast to BCP, geomcp uses PELT as the optimizer for the bivariate series and so it is an exact solution to the optimization problem.

As with BCP it doesn't model auto- or cross- correlations though.

Most changepoint algorithms are in R at the moment but you can easily use rpy2 to interface to them from Python. Although if you enjoy coding, the transformation is simple to translate to Python and the code for PELT is already available in Python in the ruptures package. I'd be happy to guide on which lines need changing.

answered Jan 28 '21 at 16:44

adunaic

1,170
7
14

I'm looking more at DP/PELT algorithms. The one thing that throws me off is the penalty term, which to me is less intuitive than trying to compare multiple groupings of time steps (I understand why it's used and it's probably necessary for minimizing complexity - just less intuitive to me). I'm also looking at a graph-based approach (https://arxiv.org/pdf/1209.1625.pdf), which seems straightforward but I can't find much for multiple changepoints. Are you aware of anything in Python or R? Side note: it's crazy how diverse but frequently limited the solutions are for this topic. – SuperCodeBrah Feb 04 '21 at 04:23
Also, the graph-based approach brings to mind a kind of center of mass approach where mass is accumulated as the total *weight* of observations (i.e., mean * n). Then, as the center of mass moves with new points being added to the segment, if a threshold of mass*distance is broken from any prior center of mass in the current segment, a new segment is established. This probably ends up being somewhat similar to BCP but might be simpler to implement (I still need to get into the weeds in terms of BCP implementation). – SuperCodeBrah Feb 04 '21 at 04:39
The paper you reference has an R package, gSeg. The authors have been communicative so you could reach out to them. https://cran.r-project.org/web/packages/gSeg/index.html – adunaic Feb 05 '21 at 08:24
The penalty controls the difference in test statistic that is required for a change. This is the same as the thinking as is required for all hypothesis testing and is directly linked to this in the single changepoint scenario. The intuition is that large values require more evidence to say a change exists and small values requires less evidence (so you get more changepoints). If small deviations in penalty drastically change the segmentation then that is not good. The CROPS algorithm allows you to efficiently find all changes in a penalty range if you prefer the "how many changes?" view. – adunaic Feb 05 '21 at 08:31

Choosing a changepoint detection algorithm

2 Answers2