I'm running an experiment and want to use the Causal Impact function to assess how well the intervention performs.
I have 10 different cities. I want to find the best method for choosing which cities go in the test group (i.e. the intervention occurs) and which go in the control group (i.e. no intervention).
So far I've run a k-means clustering algorithm to divide the cities into 2 clusters. My question is: how do I use these clusters to divide the cities into control and test groups?
I've seen previous cases where people use one cluster as the test group and the other as the control group, but that would mean the test group is very different from the control group, since k-means puts dissimilar cities into different clusters.
It seems to make more sense to split one of the clusters (from the k-means algorithm) into control and test groups, so that the two groups should be similar to each other.
Is my thinking correct here?
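
To make the idea concrete, here is a minimal sketch of the within-cluster split I have in mind. The city names and cluster labels are made up for illustration, and assigning cities at random within the cluster is my own assumption, not something the Causal Impact function requires.

```python
import random

def split_cluster(cities, seed=0):
    """Randomly split one cluster's cities into (test, control) halves."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = sorted(cities)      # sort first so input order doesn't matter
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical k-means output: city -> cluster label (0 or 1)
cluster_labels = {
    "city_01": 0, "city_02": 0, "city_03": 0, "city_04": 0,
    "city_05": 0, "city_06": 0, "city_07": 1, "city_08": 1,
    "city_09": 1, "city_10": 1,
}

# Take only the cities in cluster 0 and split them into test and control
cluster_0 = [city for city, label in cluster_labels.items() if label == 0]
test_cities, control_cities = split_cluster(cluster_0)
```

The same function could be applied to each cluster in turn if I wanted every cluster represented in both groups.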