
In relation to the previously posted question

The difference between the three Augmented Dickey–Fuller test (none, drift, trend)

and specifically to the details given in Graeme Walsh's (https://stats.stackexchange.com/users/24617/graeme-walsh) answer, I am still missing something about how the embed() function arranges data into a matrix. Consider the following dummy example:

> s <- 1:6
> embed(s,3)
     [,1] [,2] [,3]
[1,]    3    2    1
[2,]    4    3    2
[3,]    5    4    3
[4,]    6    5    4

Now consider the same piece of code shown in the related question linked above:

data(sunspots)
x           <- sunspots
alternative <- "stationary"
k           <- trunc((length(x) - 1)^(1/3))

k <- k + 1          # Number of lagged differenced terms
y <- diff(x)        # First differences
n <- length(y)      # Length of first differenced series
z <- embed(y, k)    # Used for creating lagged series

yt  <- z[, 1]       # First differences
xt1 <- x[k:n]       # Series in levels - the first k-1 observations are dropped
tt  <- k:n          # Time-trend
yt1 <- z[, 2:k]     # Lagged differenced series - there are k-1 of them

Hence, comparing the same row index of yt and yt1, it seems to me that yt <- z[, 1] stores observations "older" than the ones stored in yt1 <- z[, 2:k]. At the same time, for the regression with a constant and a time trend, the formula is:

                  yt ~ xt1 + 1 + tt + yt1

In other words, yt (also) depends on yt1.

Would it be possible to clarify this aspect? Thanks.

GiorgioG

1 Answer


I think I overlooked the timeline. Indexes in a time series run from the oldest observation (first index) to the newest one (last index). Hence, for each row index idx, z[idx, 1] (that is, yt[idx]) stores a more recent observation than z[idx, 2:k] (that is, yt1[idx, ]).
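To double-check this, here is a small sketch. It reuses the dummy series from the question, whose values equal their own time index, and then repeats the check on the differenced sunspots series. The column labels and the names s, zs and newest_first are mine, added only for illustration.

```r
# Dummy series: each value equals its time index, so a cell shows "when" it was observed.
s  <- 1:6
zs <- embed(s, 3)

# Label the columns by lag (labels are illustrative only).
colnames(zs) <- c("t", "t-1", "t-2")
print(zs)

# Within each row, column 1 holds the newest value and later columns hold older ones.
newest_first <- all(zs[, 1] > zs[, 2]) && all(zs[, 2] > zs[, 3])
print(newest_first)

# Same check on the series from the question.
x <- sunspots
k <- trunc((length(x) - 1)^(1/3)) + 1
y <- diff(x)
n <- length(y)
z <- embed(y, k)

# Column 1 is y at times k..n; column 2 is the same series shifted back by one step.
col1_is_current <- all(z[, 1] == y[k:n])
col2_is_lag1    <- all(z[, 2] == y[(k - 1):(n - 1)])
print(c(col1_is_current, col2_is_lag1))
```

So in the formula yt ~ xt1 + 1 + tt + yt1, the response yt is the current first difference and the columns of yt1 are its own past values, which is what a lagged regression needs.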

GiorgioG