I have a data set made up of 30 observations, years 1980-2020. The second variable is the amount of cargo passing through a particular harbor on the west coast (in tons). I have attempted to fit an ARIMA model to this time series data, with the following result: ARIMA(0,1,0), which I have read is a random walk. What does this mean for the data? Can predictions be made for future values of cargo using a random walk model? When I predicted future values in STATA, I got a simple linear line with an upward trend. How does this differ from just using a standard linear regression model? I know that the Microsoft Excel function "trendline" is the linear regression model, but was told by my teacher that a "trendline" is not a future forecast. How does ARIMA differ from this "trendline"? Thanks!

Richard Hardy
amvoight
  • 1980-2020 is 41 years, so if you have only 30 observations, are some years missing? – Stephan Kolassa Feb 02 '21 at 19:43
  • lol apparently I can't count XD, Yes, I meant 40 years. – amvoight Feb 02 '21 at 22:08
  • Well it means the data is a random walk. I think the real question is what do you want to do. Predict? If so you might leave say the last decade out of the data and predict with the linear model and ARIMA (which has many of the assumptions of a linear model) and see which predicts better. Of course there is no way to tell if this will continue, realities change aka structural breaks. If it was easy anyone could do it. :) – user54285 Feb 03 '21 at 00:16

1 Answer

I have attempted to fit an ARIMA model to this time series data, with the following result: ARIMA(0,1,0), which I have read is a random walk. What does this mean for the data?

An ARIMA(0,1,0) process is one where the first differences $y_t-y_{t-1}$ (that's the 1) have no autoregressive (that's the first 0) and no moving average (that's the second 0) component. That is, this difference is white noise. As you say you have an upward trend, I would assume your software modeled an ARIMA process with an intercept $c$ (these are often reported inconspicuously), so your fitted process would look like this:

$$ y_t-y_{t-1} = c+\epsilon_t,\quad\epsilon_t\sim N(0,\sigma^2), $$

or

$$ y_t=y_{t-1}+c+\epsilon_t,\quad\epsilon_t\sim N(0,\sigma^2). $$

So in the opinion of your software, next year's observation is just this year's observation, plus an additive increment $c$, plus some random noise.
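To get a feel for what such a process looks like, here is a minimal simulation sketch in Python. The starting value, drift and noise level (100, 2 and 5) are made-up illustration values, not estimates from the cargo data, and the function name is hypothetical.

```python
import random


def simulate_random_walk_with_drift(y0, c, sigma, n_steps, seed=0):
    """Simulate y_t = y_{t-1} + c + eps_t with eps_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the path is reproducible
    path = [y0]
    for _ in range(n_steps):
        eps = rng.gauss(0.0, sigma)  # this period's random shock
        path.append(path[-1] + c + eps)
    return path


# Illustrative values only (assumed, not fitted to any real series).
path = simulate_random_walk_with_drift(y0=100.0, c=2.0, sigma=5.0, n_steps=40)

# The first differences are the drift plus white noise.
increments = [b - a for a, b in zip(path, path[1:])]
print(sum(increments) / len(increments))  # sample mean of increments, an estimate of c
```

Setting `sigma=0.0` removes the noise and leaves a perfectly straight line with slope `c`, which is exactly the shape of the point forecast.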

Can predictions be made for future values of cargo using a random walk model?

Sure they can! To forecast $\hat{y}_{t+1}$, take your known $y_t$ (the last historical observation), add the estimated intercept $\hat{c}$ (the "intercept" of the differenced series), and add the expected value of the error term, which is $0$. Your forecast is:

$$ \hat{y}_{t+1} = y_t+\hat{c}. $$

And then you iterate this. Your $k$-step-ahead forecast is simply

$$ \hat{y}_{t+k} = y_t+k\hat{c}. $$
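The two formulas above can be sketched in a few lines of Python. This assumes $\hat{c}$ is estimated as the sample mean of the increments, which may differ slightly from what your software reports; the series `cargo` and the function name are made up for illustration.

```python
def drift_forecast(y, horizon):
    """k-step-ahead forecasts y_t + k * c_hat for k = 1..horizon."""
    increments = [b - a for a, b in zip(y, y[1:])]
    c_hat = sum(increments) / len(increments)  # estimated drift (assumed: mean increment)
    return [y[-1] + k * c_hat for k in range(1, horizon + 1)]


# Hypothetical toy series, not the real harbor data.
cargo = [100.0, 104.0, 103.0, 110.0, 112.0]
print(drift_forecast(cargo, horizon=3))  # [115.0, 118.0, 121.0]
```

The increments here are 4, -1, 7 and 2, so $\hat{c}=3$ and each forecast adds another 3 to the last observed value of 112.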

When I predicted future values in STATA, I got a simple linear line with an upward trend. How does this differ from just using a standard linear regression model?

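Well, we have seen that your $k$-step-ahead forecast is indeed simply an upward-sloping line. In this particular situation, the point forecasts are therefore very similar to those from a standard linear regression on time, though not identical: the ARIMA forecast line starts at the last observation $y_t$ and rises by the average increment $\hat{c}$ per period, whereas a regression line is fitted through all observations. Another difference is that your ARIMA fitting process also considered other possible ARIMA(p,d,q) models, and discarded them in favor of the simple ARIMA(0,1,0) one.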
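As a rough illustration of how the two straight-line forecasts relate, the sketch below compares the random-walk-with-drift forecast against an ordinary least squares regression on a time index. The data and function names are invented for the example, and $\hat{c}$ is again assumed to be the mean increment.

```python
def drift_forecast(y, horizon):
    """Random walk with drift: start at the last value, add c_hat per step."""
    c_hat = (y[-1] - y[0]) / (len(y) - 1)  # equals the mean of the increments
    return [y[-1] + k * c_hat for k in range(1, horizon + 1)]


def ols_trend_forecast(y, horizon):
    """Least squares line through (0, y[0]), ..., (n-1, y[n-1]), extrapolated."""
    n = len(y)
    t = list(range(n))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    s_ty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return [intercept + slope * (n - 1 + k) for k in range(1, horizon + 1)]


# Hypothetical toy series, not the real harbor data.
cargo = [100.0, 104.0, 103.0, 110.0, 120.0]
print(drift_forecast(cargo, 2))      # starts from the last value, 120
print(ols_trend_forecast(cargo, 2))  # starts from the fitted line
```

Both outputs are straight lines in the horizon, but on this toy series they differ in slope (5 versus 4.6) and in starting level, because the drift forecast is anchored at the last observation while the regression line is fitted to all points.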

There is no shame in using a simple model, if it's the best your data can support. Also note that for the differenced data (which, given the upward trend in international shipping, it makes sense to consider), your model is just the overall mean, since $c$ will be estimated as the average of the increments. And you very often find that the simple mean outperforms more complicated ARMA models.
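To spell out the remark about $c$: assume $\hat{c}$ is the sample mean of the increments of a series $y_1,\dots,y_T$ (the index $T$ for the last observation is notation introduced here, and software that estimates by maximum likelihood may differ in small details). Then the sum telescopes:

$$ \hat{c} = \frac{1}{T-1}\sum_{t=2}^{T}(y_t-y_{t-1}) = \frac{y_T-y_1}{T-1}. $$

So under this assumption the estimated drift depends only on the first and the last observation, and the forecast simply extends the straight line joining those two points.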

In your particular situation, a simple trend should be fine for short-range forecasting. Of course, your data will be perturbed by macroeconomic conditions, but these are hard to forecast themselves, and modeling their impact on your series of interest is just as hard. Another aspect is port capacity, which may act as a constraint. For longer-term forecasts, you may want to dampen your trend.
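One possible way to dampen the trend, sketched here under assumptions: shrink each successive increment by a factor `phi` between 0 and 1, in the spirit of damped-trend exponential smoothing. The function name and the value `phi=0.9` are illustrative choices, not something your ARIMA output provides.

```python
def damped_drift_forecast(last_value, c_hat, horizon, phi=0.9):
    """Forecast last_value + c_hat * (phi + phi^2 + ... + phi^k) for k = 1..horizon.

    phi is a hypothetical damping parameter; phi = 1 gives the undamped line.
    """
    forecasts = []
    cumulative = 0.0
    for k in range(1, horizon + 1):
        cumulative += phi ** k  # each further step adds a smaller increment
        forecasts.append(last_value + c_hat * cumulative)
    return forecasts


# Illustrative numbers only.
print(damped_drift_forecast(100.0, 4.0, horizon=3, phi=0.5))  # [102.0, 103.0, 103.5]
print(damped_drift_forecast(100.0, 4.0, horizon=3, phi=1.0))  # [104.0, 108.0, 112.0]
```

With `phi < 1` the forecasts level off toward `last_value + c_hat * phi / (1 - phi)` instead of growing without bound, which may be more plausible if something like port capacity limits growth.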

I recommend the excellent free online forecasting textbook Forecasting: Principles and Practice (2nd ed.) by Athanasopoulos & Hyndman.

Stephan Kolassa