How to choose Control groups for Causal Impact algorithm?

Question

I'm running an experiment and want to use the Causal Impact function to assess how well the intervention performs.

I have 10 different cities. I want to find the best method for choosing which cities form the test group (i.e. the intervention occurs) and which form the control group (i.e. no intervention).

So far I've run a k-means clustering algorithm to divide the data into 2 clusters. My question is: how do I use these clusters to divide the cities into control and test groups?

I've seen previous cases where people use one cluster as test and the other as control, but surely this would mean your test group would be very different from your control group.

Surely it would make more sense to simply split one of the clusters (from the k-means algorithm) into control and test groups so that it is guaranteed that they are going to be similar.

Is my thinking correct here?

score 2 · Answer 1 · answered May 21 '19 at 19:28

Take a look at the {MarketMatching} package. It aims to simplify the {CausalImpact} workflow by automating the selection of controls. The selection criteria are based on "similarity and distance" metrics: correlation, DTW (Dynamic Time Warping), or a hybrid of the two.

https://cran.r-project.org/web/packages/MarketMatching/index.html
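To make the correlation criterion concrete, here is a minimal sketch in plain Python. It is not the {MarketMatching} API (that is an R package, and it also offers DTW, which is omitted here). The function names `pearson` and `rank_controls`, the city labels, and the numbers are all made up for illustration. The idea is just to rank candidate control cities by how closely their pre-intervention series track the test city.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_controls(pre_period, test_city, top_n=3):
    """Rank the other cities by pre-period correlation with the test city."""
    target = pre_period[test_city]
    scores = {
        city: pearson(target, series)
        for city, series in pre_period.items()
        if city != test_city
    }
    # Highest correlation first; keep only the top_n candidates.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

# Hypothetical weekly metric for four cities, pre-intervention weeks only.
pre_period = {
    "A": [10, 12, 11, 14, 15, 17],
    "B": [20, 24, 22, 28, 30, 34],  # same shape as A at twice the level
    "C": [17, 15, 14, 11, 12, 10],  # A reversed, so the trend is opposite
    "D": [10, 13, 11, 13, 16, 16],  # roughly follows A
}

ranked = rank_controls(pre_period, "A", top_n=2)
for city, score in ranked:
    print(city, round(score, 3))
```

Note that correlation ignores level: city B is at twice the level of A yet ranks first. That is usually fine for this purpose, since the controls act as predictors of the test series rather than as like-for-like copies.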

score 1 · Answer 2 · answered Oct 24 '18 at 17:00

I think your concern about clustering is valid.

If the goal of the experiment is to estimate what would happen if the treatment policy were rolled out more widely, I think you want your treatment cities to be representative of whatever population you care about (maybe that's the N=10, or maybe some wider set of cities). That should be your main criterion. If you don't think the effects vary very much, or you only care about the effect in the cities you treat, that is less of a concern.

I am also worried that N=10 is not quite enough data for CausalImpact (CI), but that is an empirical matter that can be verified. You can run the CI analysis on pre-treatment data only, as if treatment had already started at some earlier date, and you should get an effect of roughly zero.
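The placebo check described above can be sketched as follows. This is only an illustration under loud assumptions: a one-variable least-squares fit stands in for CausalImpact's Bayesian structural time-series model, and the names `fit_line` and `placebo_effect`, the fake start index, and the data are all made up. The logic is the same: fit on the data before the fake start, predict after it, and check that the average gap is near zero.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

def placebo_effect(test, control, fake_start):
    """Mean gap between observed and predicted test values after fake_start.

    Both series should contain pre-intervention data only, so a well-behaved
    setup should return a value close to zero.
    """
    a, b = fit_line(control[:fake_start], test[:fake_start])
    predicted = [a + b * c for c in control[fake_start:]]
    gaps = [obs - pred for obs, pred in zip(test[fake_start:], predicted)]
    return sum(gaps) / len(gaps)

# Hypothetical pre-intervention series for one control and one test city.
control = [20, 22, 21, 25, 24, 27, 26, 29, 30, 28]
test = [c / 2 + 1 for c in control]  # tracks the control exactly

# No real intervention happened, so the estimated effect should be about 0.
null_effect = placebo_effect(test, control, fake_start=6)

# Sanity check in the other direction: add a lift of 3 after the fake start
# and confirm that the same procedure recovers it.
lifted = test[:6] + [t + 3 for t in test[6:]]
lift_effect = placebo_effect(lifted, control, fake_start=6)

print(null_effect, lift_effect)
```

With real, noisy data the placebo estimate will not be exactly zero. What matters is whether it is small relative to the effect size you hope to detect, ideally across several different fake start dates.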
