The regression form of the AR model that you specify departs from the assumptions of the classical regression model in that the explanatory variables are not deterministic (fixed). They are stochastic because they are lags of the dependent variable and hence depend on the disturbance term $\epsilon_i$. I think this is what you are referring to as a random design.
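For concreteness, this is the specification I understand the question to refer to (an AR model with $q$ lags, written without an intercept for simplicity):
$$
X_i = \beta_1 X_{i-1} + \beta_2 X_{i-2} + \cdots + \beta_q X_{i-q} + \epsilon_i \,, \qquad \epsilon_i \sim \mathrm{iid}(0, \sigma^2) \,.
$$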
To avoid confusion with the dependent variable and for simplicity, I will denote as $Z$ the whole set of explanatory variables $X_{i-1}$, $X_{i-2}$,..., $X_{i-q}$.
In the context of stochastic regressors, we may have three situations, the explanatory variables and the disturbance term $\epsilon_i$ are:
- independent of each other, $E(Z_i \epsilon_{i-k}) = 0$, $\forall i,k$;
- contemporaneously uncorrelated, $E(Z_i \epsilon_i) = 0$, $\forall i$;
- contemporaneously correlated, $E(Z_i \epsilon_i) \neq 0$ for some $i$.
In the context of the AR model we are in the second case (lagged versions of $X$ depend on lagged values of $\epsilon$, but not on the current value at time $i$ or at future times).
Your suggestion of conditioning on $Z$ is not appropriate here. To see why, note that the joint distribution of the dependent variable $X$ and the regressors $Z$ factors into the product of the conditional and the marginal distributions:
$$
f(X,Z;\vec{\beta},\sigma^2) = f(X|Z;\vec{\beta},\sigma^2) f(Z;\cdot) \,.
$$
If the marginal distribution did not depend on the parameters of the model, then we could get rid of it and work with the conditional distribution. However, since the explanatory variables follow the same AR model, the marginal distribution also depends on the parameters of the model and we actually have $f(Z;\vec{\beta},\sigma^2)$. Hence, we cannot ignore it, and things are not as easy as they would be in the other case.
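As a concrete illustration (taking, for simplicity, a stationary Gaussian AR(1) with coefficient $\beta$), the marginal distribution of a single regressor already involves the parameters of the model:
$$
X_{i-1} \sim N\!\left(0,\; \frac{\sigma^2}{1-\beta^2}\right) \,,
$$
so the factor $f(Z;\cdot)$ cannot be treated as free of $\vec{\beta}$ and $\sigma^2$.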
Although the explanatory variables are not independent of the innovations $\epsilon$, they are contemporaneously uncorrelated, i.e., at a given time $i$, $Z_{i-j}$ with $j>0$ depends on past values of $\epsilon$ but not on the current value $\epsilon_i$. In this case, the distribution of the test statistic is unknown in finite (small) samples. However, the classical results still apply asymptotically, and the $F$-test statistic (after the normalization discussed under Edit 2 below) asymptotically follows the $\chi^2$ distribution with $q-p$ degrees of freedom.
Note: be careful, the degrees of freedom are $(q-p)$, not the $(q-p,\, n-q)$ that you suggest; the latter would be the degrees of freedom of Snedecor's $F$-distribution to be used in finite samples when the classical assumptions hold.
Details (in addition to the reference given by @CristiánAntuña): the above holds under the classical assumptions plus the requirement that $\operatorname{plim}\, Z'Z/n = Q$ is a finite and positive definite matrix. The Mann and Wald theorem, Slutsky's theorem and Cramér's theorem can then be applied to show the above result.
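To sketch what those theorems deliver (the standard asymptotic result for predetermined regressors, under the conditions just stated):
$$
\sqrt{n}\,\bigl(\hat{\vec{\beta}} - \vec{\beta}\bigr) \;\xrightarrow{d}\; N\!\left(0,\; \sigma^2 Q^{-1}\right) \,,
$$
which is what justifies comparing the (suitably normalized) Wald/$F$-type statistic with the $\chi^2_{q-p}$ distribution in large samples.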
Edit 2
Note: when using the asymptotic test statistic, be aware that it needs to be normalized before comparing it with the $\chi^2$ distribution. In particular, the standard $F$-test statistic must be multiplied by the degrees of freedom in the numerator of the $F$ distribution. To see this, note that if a random variable $Y$ follows an $F$ distribution and $X_1$, $X_2$ denote $\chi^2$-distributed variables with $n_1$ and $n_2$ degrees of freedom respectively, then $Y$ can be written as $Y=\frac{X_1/n_1}{X_2/n_2}$, i.e., the ratio of the two $\chi^2$ variables, each divided by its degrees of freedom. As the degrees of freedom in the denominator of the $F$-test go to infinity, $X_2/n_2$ goes to one; hence we are left with $Y=X_1/n_1$, and it is $n_1\times Y$ (not $Y$) that follows the $\chi^2$ distribution.
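A quick numerical check of this normalization (an illustrative snippet I add here; the numerator degrees of freedom, 2, match the $q-p=2$ case simulated below):
# for a large denominator degrees of freedom, the F quantile times the
# numerator degrees of freedom approaches the chi-square quantile
df1 <- 2
df1 * qf(0.95, df1, 1e6) # approximately 5.99
qchisq(0.95, df1)        # 5.991465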
One issue still remains to be taken into account. As the explanatory variables
are lagged versions of the same variable, they are likely to be correlated
(more than likely if $X_t$ actually follows an AR structure). This will inflate the standard errors of the parameter estimates, which may in turn affect the test statistic.
Roughly, the consequences of collinearity are the same as those caused by a small sample size: there isn't enough information in the data to discern to which variable the influence of their common component should be attributed. As we have already mentioned, in this context we need a large sample size, so bootstrapping the test statistic would be a useful strategy that may kill two birds with one stone (a small sketch of this idea is given at the end of the answer).
Someone may ask: why would you want to do this test? Why care about the $F$-test in an AR model? I don't think you want to test, for example, an economic theory, as is done with econometric models. If the purpose is to choose the optimal lag order for the AR model, then the Akaike or Bayesian information criterion might be a better alternative. Anyway, I find this an interesting question and a nice example to discuss the classical regression model and variations that relax some of its assumptions.
Edit (revised)
For illustration, I show a small simulation exercise to assess the issues discussed above. According to this small experiment, the asymptotic test statistic performs well for a sample size of 100 observations. The empirical level found in this exercise was 0.047, close to the chosen nominal level of 5%. For a smaller sample size of 60 observations, similar results were obtained. The R code that replicates the results of the exercise is shown below.
Generate data (1000 series of length 100) from an AR(2) model. The same data set is used in both cases below.
set.seed(123)
xdata <- matrix(nrow = 100, ncol = 1000)
n <- nrow(xdata) # sample size
niter <- ncol(xdata) # number of iterations
for (i in seq_len(niter))
  xdata[,i] <- arima.sim(n = n, model = list(ar = c(0.6, -0.2)))
Compute the empirical level of the asymptotic $\chi^2$-test statistic for the null hypothesis that the AR coefficients of orders 3 and 4 are zero.
chisq.stats <- pvals <- rep(NA, niter)
resids <- matrix(nrow = n-4, ncol = niter)
coefs <- matrix(nrow = niter, ncol = 2)
for (i in seq_len(niter))
{
  x <- xdata[,i]
  # restricted model
  fit1 <- lm(x[5:n] ~ x[4:(n-1)] + x[3:(n-2)])
  # save the estimated coefficients and residuals to reuse them
  # later in the bootstrap case and save computations
  coefs[i,] <- coef(fit1)[-1]
  resids[,i] <- residuals(fit1)
  # unrestricted model
  fit2 <- lm(x[5:n] ~ x[4:(n-1)] + x[3:(n-2)] + x[2:(n-3)] + x[1:(n-4)])
  # chi-square test statistic (F-statistic times numerator df = 2) and p-value
  chisq.stats[i] <- 2 * anova(fit1, fit2)$F[2]
  pvals[i] <- pchisq(chisq.stats[i], df = 2, lower.tail = FALSE)
}
sum(pvals < 0.05) / niter
#[1] 0.047
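For completeness, here is a minimal sketch of the bootstrap idea mentioned earlier (my illustrative addition, reusing the objects created in the loops above; the number of replicates B and the choice of the first series are arbitrary): resample the restricted-model residuals, regenerate the series under the null AR(2) model with the saved coefficients, and recompute the statistic to approximate its null distribution.
# residual bootstrap of the test statistic for a single series
B <- 500 # number of bootstrap replicates (arbitrary choice)
boot.stats <- rep(NA, B)
for (b in seq_len(B))
{
  # resample the restricted-model residuals and rebuild a series
  # under the null AR(2) model with the estimated coefficients
  e <- sample(resids[,1], size = n, replace = TRUE)
  xb <- arima.sim(n = n, model = list(ar = coefs[1,]), innov = e)
  fit1b <- lm(xb[5:n] ~ xb[4:(n-1)] + xb[3:(n-2)])
  fit2b <- lm(xb[5:n] ~ xb[4:(n-1)] + xb[3:(n-2)] + xb[2:(n-3)] + xb[1:(n-4)])
  boot.stats[b] <- 2 * anova(fit1b, fit2b)$F[2]
}
# bootstrap p-value for the observed statistic of the first series
mean(boot.stats >= chisq.stats[1])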