Using Google Causal Impact package to assess the significance of a planned intervention

Question

I am using the CausalImpact package in R to infer the causal effect of an intervention on data that are highly correlated and seasonal.

Specifically, I have 17 days of hourly data, with the intervention happening at the end of day 13. I have two control datasets which are not affected at all by the intervention (with linear correlations of 0.708 and 0.701) and the dataset that includes the intervention (aka "treated").

A piece of the data can be found here

My code is the following

library(CausalImpact)

days <- 4
daily.obser <- days * 24  # number of hourly observations in the post-intervention period
data.1 <- cbind(treated.signal.3n, the.control.3, the.control.2)
data.1 <- data.1[1:((length(bsl) + 1) + daily.obser), ]  # keep only the rows needed for the analysis

matplot(data.1, type = "l",col = c(2,4,9))
legend("bottomright", inset=.05, legend=c("Treated Zone", "Control Zone 1", "Control Zone 2"), pch=1, col=c(2,4,9), horiz=TRUE)

preperiod <- c(1, length(bsl))
postperiod <- c(length(bsl) + 1, length(bsl) + 1 + daily.obser)  # first and last row of the post-intervention period
prior.level.sd.level <- 0.01

imp.1 <- CausalImpact(data.1, pre.period = preperiod, post.period = postperiod,
         model.args = list(niter = 2500, nseasons = 17, season.duration = 24,
         dynamic.regression = FALSE, prior.level.sd = prior.level.sd.level, standardize.data = TRUE))

summary(imp.1)
plot(imp.1,c("original","pointwise"))
summary(imp.1,"report")
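For reference, here is a minimal sketch of how the pre- and post-period row indices follow from the layout described above (17 days of hourly data, intervention at the end of day 13). The variable names are illustrative only and are not the ones used in my script, where the pre-period length comes from length(bsl).

```r
# Assumed layout: 17 days of hourly observations, intervention at the end of day 13.
obs_per_day  <- 24
n_days       <- 17
last_pre_day <- 13

n_obs   <- n_days * obs_per_day        # total number of rows
pre_end <- last_pre_day * obs_per_day  # last pre-intervention row

pre_period  <- c(1, pre_end)           # first and last row before the intervention
post_period <- c(pre_end + 1, n_obs)   # first and last row after the intervention

print(pre_period)
print(post_period)
```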

My questions are:

I have read the paper, which at some point discusses the prior distribution for the variance. I do not understand what I should set my prior.level.sd parameter to, based on my data.

Another problem I am facing is the nseasons and season.duration arguments. When I specify them, the results say the intervention is insignificant (and the CIs become huge), whereas when I don't, the intervention is significant. Is nseasons supposed to be the number of days in the whole dataset or just in the pre-intervention period (e.g. 17 or 13)? What does specifying the seasonality truly mean? Can I skip this, based on the data?

Results with seasonality specification plots and numbers

Results without seasonality specification plots and numbers

(not providing cumulative since it is not useful in my case)

(You will notice that in the pre-intervention period the fit is not that good. Can I fix this somehow?)

I do not understand how I am supposed to specify whether I want to standardize the data or not.

Finally, I am thinking about static versus dynamic regression. I read in the paper that static regression is advised when the relationship between control and treated is stable. Can someone explain what is meant by stable?

You may find the paper here

score -1 · Answer 1 · answered Dec 25 '16 at 15:54

About seasonality. From the reference:

For example, if the data represent daily observations, use 7 for a day-of-week component. This interface currently only supports up to one seasonal component.

and

use model.args = list(nseasons = 7, season.duration = 24) to add a day-of-week component to data with hourly granularity

So I take it that in your case you would use nseasons=13, season.duration=24 as well.
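To make the two arguments more concrete, here is a rough sketch, not a definitive recipe. As I understand the underlying seasonal component, one full cycle spans nseasons * season.duration observations, so it may help to compare that cycle length with the amount of data available. The hour_of_day entry is my own assumption about how a daily cycle in hourly data could be written; the day_of_week entry is the example from the package reference. The fitted series is simulated and hypothetical, not your data.

```r
# Candidate seasonal settings; one full cycle spans
# nseasons * season.duration observations.
seasonal_specs <- list(
  hour_of_day = list(nseasons = 24, season.duration = 1),   # assumed: daily cycle in hourly data
  day_of_week = list(nseasons = 7,  season.duration = 24),  # example from the package reference
  as_asked    = list(nseasons = 17, season.duration = 24),  # setting used in the question
  as_answered = list(nseasons = 13, season.duration = 24)   # setting suggested above
)
cycle_length <- sapply(seasonal_specs, function(s) s$nseasons * s$season.duration)
print(cycle_length)

# Illustrative fit on simulated hourly data (hypothetical series).
if (requireNamespace("CausalImpact", quietly = TRUE)) {
  set.seed(1)
  n <- 17 * 24
  hour <- rep(0:23, times = 17)
  control <- 10 + 3 * sin(2 * pi * hour / 24) + rnorm(n, sd = 0.5)
  treated <- 5 + 2 * control + rnorm(n, sd = 0.5)
  treated[313:n] <- treated[313:n] + 2  # simulated effect after day 13
  sim <- cbind(treated, control)        # response must be the first column
  imp <- CausalImpact::CausalImpact(
    sim, pre.period = c(1, 312), post.period = c(313, n),
    model.args = c(list(niter = 1000), seasonal_specs$hour_of_day)
  )
  summary(imp)
}
```

With 17 (or 13) seasons of 24 observations each, a single cycle is as long as the whole dataset (or the whole pre-period), so the pattern never gets to repeat. That might be related to the very wide intervals you are seeing, though I have not verified this on your data.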

What is meant by stable: I take it to mean that the relationship between the covariates and the series you are tracking would have been maintained, but for the intervention. So if I were doing a study on some measure affecting house prices (y), using average pay as a covariate, and the UK leaving the EU happened during the intervention period, that might change the relationship between house prices and pay. The assumption that the relationship between the covariates and y was stable would then be broken.
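One informal way to get a feel for stability, and this is only a sketch of my own rather than anything the package prescribes, is to look at how the regression slope between a control and the treated series moves across rolling windows of the pre-intervention period. The data below are simulated with a constant slope of 2, and the 48-hour window is an arbitrary choice.

```r
# Simulated pre-period: 13 days of hourly data with a constant
# control-to-treated relationship (slope 2). Hypothetical data.
set.seed(42)
n_pre   <- 13 * 24
hour    <- rep(0:23, length.out = n_pre)
control <- 10 + 3 * sin(2 * pi * hour / 24) + rnorm(n_pre, sd = 1)
treated <- 5 + 2 * control + rnorm(n_pre, sd = 0.5)

# Slope of treated ~ control in each 48-hour rolling window.
window <- 48
starts <- seq_len(n_pre - window + 1)
rolling_slope <- vapply(starts, function(s) {
  idx <- s:(s + window - 1)
  unname(coef(lm(treated[idx] ~ control[idx]))[2])
}, numeric(1))

print(range(rolling_slope))
```

If the slopes stay in a narrow band, a static regression (dynamic.regression = FALSE) seems reasonable; a slope that drifts over the pre-period would be a hint that the relationship is not stable.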

As for prior.level.sd, I don't think it is something you can infer from your data. It's a parameter used by the algorithm, isn't it?
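If you are unsure, one thing you could try is a small sensitivity check: refit with a couple of values and see how much the interval changes. As far as I recall from the package reference, 0.01 is the default, meant for stable series, and 0.1 is mentioned as a safer choice when in doubt. The sketch below uses simulated, hypothetical data and assumes the fitted object exposes a summary table with AbsEffect.lower and AbsEffect.upper columns.

```r
# Two prior.level.sd values to compare (0.01 is the package default).
prior_sds <- c(stable = 0.01, cautious = 0.1)

if (requireNamespace("CausalImpact", quietly = TRUE)) {
  set.seed(7)
  n <- 17 * 24
  control <- 10 + as.numeric(arima.sim(list(ar = 0.8), n = n))
  treated <- 5 + 2 * control + rnorm(n, sd = 0.5)
  sim <- cbind(treated, control)  # response must be the first column
  for (sd_value in prior_sds) {
    imp <- CausalImpact::CausalImpact(
      sim, pre.period = c(1, 312), post.period = c(313, n),
      model.args = list(niter = 1000, prior.level.sd = sd_value)
    )
    # Width of the interval for the average absolute effect.
    ci <- imp$summary["Average", c("AbsEffect.lower", "AbsEffect.upper")]
    cat("prior.level.sd =", sd_value, " interval width =",
        ci$AbsEffect.upper - ci$AbsEffect.lower, "\n")
  }
}
```

If the conclusion flips between the two settings, I would treat the result as fragile rather than pick whichever value gives significance.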
