
I am very new to time series analysis.

A random walk is defined as $Y_t=\phi Y_{t-1}+\varepsilon_t$, where $\phi=1$ and $\varepsilon_t$ is white noise. It is said that the process is non-stationary because its variance is not constant. However, the mean is constant.

I am having a hard time understanding the "mean is constant" part: when I plot a random walk process in R, I can see that the variance is clearly changing, but to me the mean also seems to be changing because there is a trend.

What exactly does it mean when they say a random walk has a constant mean?
There are mathematical derivations that prove it, but I would like to understand it more intuitively.

Really appreciate your help.

set.seed(1)

TT <- 100
# ww holds the white-noise increments; y starts as a copy of them
y <- ww <- rnorm(n = TT, mean = 0, sd = 1)
for (t in 2:TT) {
  y[t] <- y[t - 1] + ww[t]
}

[plot of a single random walk realisation]

koyamashinji

2 Answers


To see what is happening you need more than one realisation of the random walk, because the mean and variance are summaries of the distribution of the walk, not of any single realisation.

This code repeats your simulation to plot 20 random walks:

set.seed(1)

ys <- replicate(20, {
  TT <- 100
  y <- ww <- rnorm(n = TT, mean = 0, sd = 1)
  for (t in 2:TT) {
    y[t] <- y[t - 1] + ww[t]
  }
  y
})

matplot(1:100, ys, type = "l",
        col = rep(c("black", "grey"), c(1, 19)),
        lwd = rep(c(2, 1), c(1, 19)), lty = 1)

to give many random walks

Any single realisation of the random walk will randomly walk off up or down the graph. The entire cloud of possible random walks stays centered at zero and spreads out as time passes; some go up, some go down, some stay near the middle. The mean of the cloud stays at zero; the variance increases linearly with time.
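You can check both claims numerically rather than just visually. The following is a sketch (the replication count of 10,000 is an arbitrary choice, large enough for stable estimates) that computes the cross-sectional mean and variance of the cloud at a few time points:

```r
set.seed(1)
TT <- 100
R <- 10000
# Each column is one realisation; cumsum(rnorm(TT)) is equivalent to
# y[1] = ww[1], y[t] = y[t-1] + ww[t].
ys <- replicate(R, cumsum(rnorm(TT)))
# Mean of the cloud at selected times: stays near 0.
round(rowMeans(ys)[c(1, 50, 100)], 2)
# Variance of the cloud at selected times: grows roughly like t.
round(apply(ys, 1, var)[c(1, 50, 100)], 1)
```

The row means hover around zero at every time point, while the row variances come out close to 1, 50, and 100 respectively, matching $\text{Var}(Y_t) = t\sigma^2$ with $\sigma = 1$.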

Thomas Lumley

There is a difference between unconditional mean and conditional mean, as there is between unconditional variance and conditional variance.

Mean

For a random walk $$ Y_t=Y_{t-1}+\varepsilon_t $$ with $\varepsilon_t\sim i.i.d(0,\sigma_\varepsilon^2)$, the conditional mean is $$ \mathbb{E}(Y_{t+h}|Y_{t})=Y_t $$ for $h>0$. This means that given the last observed value $Y_t$, the conditional mean of the process after $h$ periods, $\mathbb{E}(Y_{t+h}|Y_{t})$, is that value, regardless of how much time $h$ has passed. If time starts at $t=0$, then we have the mean conditional on the initial value being $\mathbb{E}(Y_{h}|Y_{0})$. From this we can see that the conditional mean varies with the conditioning information but not the time differential $h$.

Meanwhile, assuming the walk starts at $Y_0=0$, the unconditional mean at any fixed time point $h$ is zero: $$ \mathbb{E}(Y_{h})=\mathbb{E}\left(\sum_{i=1}^h\varepsilon_i\right)=\sum_{i=1}^h\mathbb{E}(\varepsilon_i)=\sum_{i=1}^h 0=0. $$ Since it does not vary with $h$, we could say the mean of the process is zero.
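Both means can be illustrated by simulation. This is a sketch: the conditioning value $Y_t = 5$ and the horizon $h = 50$ are arbitrary choices for illustration.

```r
set.seed(2)
R <- 20000
h <- 50
# Sums of h i.i.d. increments, i.e. Y_h - Y_0, one per replication.
steps <- colSums(matrix(rnorm(h * R), nrow = h))
# Unconditional mean at time h, starting from Y_0 = 0: near 0.
round(mean(steps), 2)
# Conditional mean given Y_t = 5, h steps ahead: near 5, whatever h is.
y_t <- 5
round(mean(y_t + steps), 2)
```

The first average is close to zero and the second close to the conditioning value, as the two formulas above predict.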

Variance

The conditional variance is $$ \text{Var}(Y_{t+h}|Y_t)=h\sigma_\varepsilon^2. $$ For a fixed time differential $h$, the conditional variance does not increase over time (the fluctuations are not getting wilder), but for a fixed conditioning point it grows linearly with the time difference $h$. Thus, contrary to the conditional mean, the conditional variance does not vary with the conditioning information but does vary with (namely, grows linearly in) the time differential $h$.

Meanwhile, again taking $Y_0=0$, the unconditional variance at a fixed time point $h$ is $h$ times the variance of the increment term: $$ \text{Var}(Y_h)=\text{Var}\left(\sum_{i=1}^h\varepsilon_i\right)=\sum_{i=1}^h\text{Var}(\varepsilon_i)=\sum_{i=1}^h\sigma_\varepsilon^2=h \sigma_\varepsilon^2, $$ where the second equality uses the independence of the increments $\varepsilon_i$. Note that we can easily define the variance at a fixed time point, but it is not as simple otherwise. Without being very rigorous, one could say the variance is undefined for an undefined time point. (This is in contrast to the mean.)
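As a simulation sketch (with $\sigma_\varepsilon = 1$, so the variance at time $h$ should come out close to $h$; the horizons and replication count are arbitrary choices):

```r
set.seed(3)
R <- 20000
# Empirical variance of Y_h (with Y_0 = 0, sigma = 1): close to h.
for (h in c(10, 50, 100)) {
  inc <- colSums(matrix(rnorm(h * R), nrow = h))
  cat("h =", h, " empirical variance:", round(var(inc), 1), "\n")
}
```

The printed variances track the horizons 10, 50, and 100, which is the linear growth $\text{Var}(Y_h) = h\sigma_\varepsilon^2$ derived above.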

Richard Hardy
  • Indeed, the initial condition $Y_0 = 0$ nearly never makes sense in practice. So from a time-series point of view ($\neq$ maths) I would say that the unconditional expectation of a random walk does not exist and that the unconditional variance is infinite. As a general rule ARIMA models need "partially diffuse" initial conditions. – Yves Apr 29 '21 at 08:15
  • @Yves, I share your sentiment. We can talk about the mean and variance at a fixed time point but not without specifying one. I will make that clearer in my answer. – Richard Hardy Apr 29 '21 at 08:17
  • @Yves, I have edited my answer. I think we disagree somewhat, and I welcome constructive criticism; I want to get a better grasp of this myself. Note that in my answer, I presume the random walk starts at time $t=0$. – Richard Hardy Apr 29 '21 at 08:26
  • As a general rule when using a random walk we require to have at least one observation, for an integrated random walk we similarly need *two* observations. For a general ARIMA model, we need a suitable number of observations related to the differentiation order and a suitable observation design related to the differentiation operator (e.g. seasonal). Without these the expectation and variance do not make sense. Of course we can choose particular initial conditions but that is not what most time series software will do. – Yves Apr 29 '21 at 08:31
  • @Yves, we require this for what exactly? I imagine a machine that is generating i.i.d. epsilons, and we are adding them up. We want to describe the properties of the partial sum. That seems fully possible as long as we fix how many elements there are in the partial sum. (Without knowing the number, we would perhaps need to take the expectation over the possible number of elements, and that may or may not make sense as a model for a particular application.) For such tasks, knowing the initial element is not necessary. But for other tasks it may be. – Richard Hardy Apr 29 '21 at 08:39
  • I agree with you. In the OP the random walk is described as an AR(1) with $\phi = 1$. For $|\phi| <1$ the process is assumed to start from its stationary distribution and this has mean zero. An important point is that this distribution no longer exists for $\phi=1$. – Yves Apr 29 '21 at 08:54
  • @Yves, I think I get what you are saying, at least intuitively. Thank you for your illuminating comments. – Richard Hardy Apr 29 '21 at 09:02
  • A constructive comment following the downvote would be appreciated. – Richard Hardy Apr 29 '21 at 09:14
  • (I upvoted Richard Hardy's answer and did not downvote it later). I agree that explaining a downvote is to be encouraged. – Yves Apr 29 '21 at 09:30
  • Thank you so much for your help. Ok so I am trying to understand more intuitively the fact that unconditional variance is not constant. For example let's say the random walk is a realisation of a single coin toss result at t, t+1, ... , with heads being 1 and tails being -1. Isn't the unconditional variance constant because the unconditional probability distribution of a coin toss is always the same with both heads and tails being 50%? – koyamashinji Apr 29 '21 at 11:58
  • @koyamashinji, the distribution of one coin toss is always the same. The distributions of a sum of multiple tosses where the numbers of tosses differ are different. The distribution of the sum approaches a normal one as the number of tosses approaches infinity (hint: central limit theorem). A random walk is a sum of tosses, not a single toss. The variance grows linearly with the number of tosses. Also, we can meaningfully discuss variance of a fixed number of tosses but the discussion becomes problematic otherwise. – Richard Hardy Apr 29 '21 at 12:25
  • No-vote from me on this answer, though I `+1`'d the other. The other answer got a `+1` because it seemed to accurately address the issue asked about in the question in a simple, concise way. This answer's pretty well-developed, employing good formatting and a good bit of detail -- which would normally be strong points, making for an easy `+1`. [...] – Nat Apr 30 '21 at 05:20
  • However, the discussion may come off as tangential. That is, the OP's basically just confused about what set-of-values the mean is of; the other answer addressed the confusion in saying **"_because the mean and variance are summaries of the distribution of the walk, not of any single realisation._"**, which I guess is what really needed to be said. – Nat Apr 30 '21 at 05:54
  • @Nat, thank you for the comment. I do not mind an occasional downvote, but I would encourage everyone to downvote answers that are wrong rather than ones that are correct (especially if they also do address the point of the OP). I understood the confusion behind the OP's question differently than the other answerer did, and I stressed the points I think are the crucial ones. Thus, I respectively disagree of what had to be said. (And apparently I was correct in my understanding of the OP given which of the two answers got accepted.) – Richard Hardy Apr 30 '21 at 05:59
  • @RichardHardy: I definitely agree that it'd be good practice to explain a down-vote in a case like this where the reason for a down-vote would be unclear given the quality of the answer. – Nat Apr 30 '21 at 06:03
  • @Nat, makes sense. By the way, I said *respectively* but I of course meant *respectfully* :) – Richard Hardy Apr 30 '21 at 06:47
  • @RichardHardy I down-voted because this answer does not make a distinction between the system of equations $Y_t=Y_{t-1}+\epsilon_t$ for $t \in \mathbb{Z}$ we are trying to solve and stochastic processes $\{Y_t\}_{t\in\mathbb{Z}}$ that are solution(s) of these equations. For example, it is easy to verify that if $\{Y_t\}_{t\in\mathbb{Z}}$ is a solution, then the shifted process $\{Y_t+c\}_{t\in\mathbb{Z}}$ is also a solution. – Jarle Tufto May 03 '21 at 13:55
  • So all you can say is that the unconditional mean and covariance of a particular solution (the one you have written down) is given by the formula you give. Other solutions have different unconditional means and covariance functions. https://stats.stackexchange.com/a/494304/77222 is somewhat related to what I'm saying. – Jarle Tufto May 03 '21 at 13:55
  • @JarleTufto, I think I get part of your point. But are solutions of a form other than $Y_t+c$ possible? If not, how come they may have different covariance functions? If yes, could you provide at least one example? Also, I have defined the time index to be nonnegative while you are taking $t\in\mathbb{Z}$. This may make a difference. If we do not allow for negative time, does that not make my answer correct? – Richard Hardy May 03 '21 at 17:22