
I have seven variables that are stationary at first difference. I am trying to figure out how variable X influences variable Y, conditional on Z, W, A, B, and C.

Which would be more appropriate to use: Stata's cointreg package for Fully Modified OLS (FMOLS), or should I just stick with a VAR?

narujapica
  • Related question: ["Cointegration with stationary series at first difference"](https://stats.stackexchange.com/questions/345317/cointegration-with-stationary-series-at-first-difference). – Richard Hardy May 10 '18 at 08:50

1 Answer


Note that having all variables integrated of the same order (I(1), i.e. first-difference stationary) does not imply that there is a long-run relationship among them. The relationship can still be spurious, so you still need to test whether such a relationship exists. Different methods exist for this, for example the Engle-Granger residual-based test or Johansen's rank test. If no relationship exists, results are meaningless with any method you use.
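To make the spurious-regression point concrete, here is a minimal simulation sketch in plain Python (not Stata, and not part of any package mentioned here). It assumes two independent Gaussian random walks, so there is no true relationship, and counts how often a naive OLS t-test on the slope rejects at the nominal 5% level, first in levels and then in first differences. The function names, sample size, number of replications, and seed are all illustrative choices.

```python
import math
import random


def ols_slope_t(x, y):
    """t-statistic of the slope in y = a + b*x + e, estimated by plain OLS."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # conventional (non-robust) SE
    return b / se


def random_walk(n, rng):
    """Cumulated standard normal shocks: an I(1) series."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def diff(series):
    """First difference of a series."""
    return [b - a for a, b in zip(series, series[1:])]


def rejection_rates(reps=200, n=100, seed=0):
    """Share of replications with |t| > 1.96, in levels and in differences."""
    rng = random.Random(seed)
    lev = dif = 0
    for _ in range(reps):
        # x and y are generated independently: any "relationship" is spurious.
        x, y = random_walk(n, rng), random_walk(n, rng)
        lev += abs(ols_slope_t(x, y)) > 1.96
        dif += abs(ols_slope_t(diff(x), diff(y))) > 1.96
    return lev / reps, dif / reps


levels_rate, diff_rate = rejection_rates()
print(f"levels: {levels_rate:.2f}, first differences: {diff_rate:.2f}")
```

Under these assumptions the levels regression should reject far more often than 5%, while the regression in first differences should stay near the nominal level. That is why a cointegration test has to come before interpreting any regression in levels.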

Going to your question, cointreg is a powerful command: it computes robust standard errors for testing and reports the long-run cointegrating equation in its output, which for economists is a central piece of information when it comes to evaluating theory. If you find evidence of cointegration, your equation is ready to run, since it is balanced in the sense that all the variables are I(1). I see no reason why you would prefer a VAR, where robust standard errors are not assured.

I also point you to Stata's egranger package (here), which estimates an error-correction model (ECM) that includes both the long-run and the short-run relationships, giving you much more insight. In a multivariate context, you can estimate a vector ECM, or VECM, using the vec command. For documentation on this and related commands, see this document.
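For readers unfamiliar with the two-step logic behind a single-equation ECM, the following is a rough sketch in plain Python on simulated data. It is not the implementation of egranger or vec. The data-generating process (a long-run slope of 2 and an AR(1) deviation with coefficient 0.5), the helper `ols`, and all variable names are assumptions made for illustration only. The residual unit-root test that would normally sit between the two steps is omitted, because it needs cointegration-specific critical values rather than the usual Dickey-Fuller ones.

```python
import random


def ols(columns, y):
    """OLS coefficients of y on the given regressor columns (normal equations)."""
    k = len(columns)
    # Augmented matrix [X'X | X'y].
    a = [[sum(ci * cj for ci, cj in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(ci * yi for ci, yi in zip(columns[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(n=500, beta=2.0, rho=0.5, seed=1):
    """Simulate a cointegrated pair: y = beta*x + u, with x I(1) and u stationary."""
    rng = random.Random(seed)
    x, y, level, u = [], [], 0.0, 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)       # x is a random walk, so I(1)
        u = rho * u + rng.gauss(0.0, 1.0)  # stationary AR(1) deviation
        x.append(level)
        y.append(beta * level + u)         # y shares x's stochastic trend
    return x, y


x, y = simulate()
n = len(x)
ones = [1.0] * n

# Step 1: long-run equation, estimated on the LEVELS of the I(1) series.
const, beta_hat = ols([ones, x], y)
ec = [yi - const - beta_hat * xi for xi, yi in zip(x, y)]  # equilibrium error

# Step 2: short-run equation, dy_t on the lagged equilibrium error and dx_t.
dy = [y[t] - y[t - 1] for t in range(1, n)]
dx = [x[t] - x[t - 1] for t in range(1, n)]
_, alpha_hat, gamma_hat = ols([ones[1:], ec[:-1], dx], dy)

print(f"long-run slope: {beta_hat:.3f}")
print(f"adjustment speed: {alpha_hat:.3f}, short-run effect: {gamma_hat:.3f}")
```

In this simulated setup the adjustment coefficient should come out negative, meaning deviations from the long-run relationship are partly corrected in the next period. This is the mix of levels and first differences that an ECM uses.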

luchonacho
  • When running cointreg using all series, 1. for the I(0) series should the data at level be used? 2. for the I(1) series should the data at first difference be used? – narujapica May 10 '18 at 06:13
  • @narujapica You said **all** series are I(1). The long run equation uses the level of these variables. The ECM in `egranger` command uses a combination of level and first difference. Check the help file of that package, which has a bit of the theory behind it, in case you are not familiar. – luchonacho May 10 '18 at 08:23
  • @luchonacho, FYI, the same user has asked a related question just before this one, see my comment under the OP. – Richard Hardy May 10 '18 at 08:52
  • @RichardHardy Thanks. Not the best use of forum rules... – luchonacho May 10 '18 at 09:03
  • @luchonacho, I agree. Regarding your answer, I would include considerations on whether the series are cointegrated to begin with. Only after having established the existence of cointegration would I proceed with estimating VEC or VAR models (preferably VEC). – Richard Hardy May 10 '18 at 09:33
  • @RichardHardy Forgot that bit! It has been a long time since I ran a cointegration model. Thanks! – luchonacho May 10 '18 at 10:04
  • *If no relationship exists, results are meaningless with any method you use.* This sounds a little harsh and may be misunderstood or misleading. In the absence of cointegration, it is entirely valid to model the first differences with a VAR model. – Richard Hardy May 10 '18 at 10:20
  • @RichardHardy Well, for any spurious relationship you can run a VAR, but does that mean a meaningful theoretical relationship is being captured? I guess this is a methodological debate. Personally, I find black-box 1D VARs of no interest beyond short-term forecasting. But I guess others differ. – luchonacho May 10 '18 at 10:28
  • I cannot comment much on economic theory; my argument was purely statistical. Spurious relationships arise for integrated variables, not for their first differences (which are stationary). A VAR for noncointegrated I(1) variables in first differences is a statistically meaningful model with statistically meaningful parameters. It is legitimate in the context of the OP's objective, i.e. *to figure out how variable X influences variable Y, conditional on Z, W, A, B, and C*. (A corresponding spurious model for variables in levels is not meaningful even statistically, let alone economically.) – Richard Hardy May 10 '18 at 10:47