Forecasting with log transformed data and then taking the exponential of my predictions

Question

I am aware that a similar question was asked here: Interpreting forecast predictions of log transformed data.

But I would like a deeper comment/answer.

I have a time series and want to make a forecast. I found that taking the log of my series (all values are positive) helps with its variance, which increases over time. If the first difference of the log series is stationary and I forecast with an ARIMA(p,1,q), is it safe to exponentiate my predictions to get them back in levels? What problems are there with this procedure?

Here are two excellent treatments of the matter by Dave Giles: [Forecasting From Log-Linear Regressions](https://davegiles.blogspot.com/2013/08/forecasting-from-log-linear-regressions.html) and [More on Prediction From Log-Linear Regressions](https://davegiles.blogspot.com/2014/12/s.html). [This thread](https://stats.stackexchange.com/questions/140713) may also be relevant. — Richard Hardy, Nov 23 '21 at 06:51

Aksakal · Accepted Answer · 2021-11-23T16:00:21.097

If you are making a point forecast of the median, simply exponentiate the forecast. If you are forecasting the mean, there are roughly two ways of getting back to the original units: the simple exponential and the exponential with a variance adjustment:

$$\hat y_t=\hat y_{t-1}e^{\hat\Delta_t}$$

$$\hat y_t=\hat y_{t-1}e^{\hat\Delta_t+\hat\sigma_{\Delta}^2/2}$$

where $\Delta_t=\ln y_t-\ln y_{t-1}$ and hats denote estimates.
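As a minimal sketch of the two formulas above, the following Python snippet applies both back-transforms to a single one-step forecast. The function name `back_transform` and the numbers are hypothetical, chosen only for illustration; in practice $\hat\Delta_t$ and $\hat\sigma_{\Delta}^2$ would come from the fitted model.

```python
import math

def back_transform(prev_level, delta_hat, sigma2_hat=None):
    """One-step forecast in levels from a forecast of the log difference.

    Without sigma2_hat this is the plain exponential (median under normal
    errors); with it, the variance-adjusted version (mean under normal errors).
    """
    adjustment = 0.0 if sigma2_hat is None else sigma2_hat / 2.0
    return prev_level * math.exp(delta_hat + adjustment)

# Hypothetical inputs, for illustration only.
prev_level = 100.0   # last value in levels (or previous forecast)
delta_hat = 0.02     # forecast of ln(y_t) - ln(y_{t-1})
sigma2_hat = 0.01    # estimated variance of the log difference

median_forecast = back_transform(prev_level, delta_hat)
mean_forecast = back_transform(prev_level, delta_hat, sigma2_hat)
print(f"median: {median_forecast:.3f}, mean: {mean_forecast:.3f}")
```

The two results differ only by the factor $e^{\hat\sigma_{\Delta}^2/2}$, so the mean forecast is always at least as large as the median forecast.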

The variance adjustment comes from the formula for the mean of the lognormal distribution: if $\xi\sim\log\mathcal{N}(0,\sigma^2)$, then $E[\xi]=e^{\sigma^2/2}$. So, strictly, you shouldn't use it unless your errors are normal. Since you mentioned ARIMA, you are likely already assuming normal errors, because that is what standard packages do unless told otherwise.
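For readers who want the missing step, here is a short derivation of that mean, assuming $\ln\xi = z \sim \mathcal{N}(0,\sigma^2)$ (the symbol $z$ is introduced here only for the derivation). Completing the square in the exponent gives

$$E[\xi]=\int_{-\infty}^{\infty} e^{z}\,\frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-z^2/(2\sigma^2)}\,dz
= e^{\sigma^2/2}\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(z-\sigma^2)^2/(2\sigma^2)}\,dz
= e^{\sigma^2/2},$$

because the remaining integrand is a normal density and integrates to one. The median of $\xi$, by contrast, is $e^{0}=1$, which is why the plain exponential targets the median and the adjusted one targets the mean.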

If you knew the true variance $\sigma_{\Delta}^2$ of the process, the latter would be the only option. Unfortunately, you don't know the true variance, so you have to pick one of the two options above.

Some practitioners prefer the former because, with an estimated variance $\hat\sigma_{\Delta}^2$, the adjustment introduces estimation error of its own and is not worth the trouble. Fortunately, the variance is usually not large enough to matter, and both options give about the same answer.
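To get a feel for when the two options diverge, this small sketch prints the relative gap $e^{h\sigma^2/2}-1$ between the adjusted and unadjusted forecasts. The standard deviations are hypothetical, and the `horizon` argument rests on an extra assumption not made in the answer: log differences that are normal and serially uncorrelated (a random walk in logs), so the forecast variance of the log level $h$ steps ahead is $h\sigma^2$.

```python
import math

def adjustment_factor(sigma, horizon=1):
    """Ratio of the lognormal mean to the median, exp(h * sigma^2 / 2).

    Assumes normal, serially uncorrelated log differences, so the h-step
    forecast variance of the log level is h * sigma^2.
    """
    return math.exp(horizon * sigma ** 2 / 2.0)

# Hypothetical standard deviations of the log difference.
for sigma in (0.01, 0.05, 0.10, 0.30):
    one_step = adjustment_factor(sigma) - 1.0
    six_step = adjustment_factor(sigma, horizon=6) - 1.0
    print(f"sigma={sigma:.2f}: 1-step gap {one_step:.3%}, 6-step gap {six_step:.3%}")
```

For small $\sigma$ the factor is very close to one; it grows with both $\sigma$ and the horizon, so whether the adjustment matters depends on the volatility of the series and how far ahead you forecast.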

To me, the whole problem can be avoided by not making a point forecast and producing a distribution forecast instead. In other words, don't give your users one number; give them a distribution of the projections. If you are using a standard ARIMA package, the simplest way to do this is to use its built-in facilities. For example, the filter function does this in MATLAB: you supply random disturbances and get back random paths, from which you can construct neat fan charts like the ones in the Bank of England's Inflation Report.
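The path-simulation idea can be sketched in Python with the standard library alone. This is an illustration under strong assumptions, not a substitute for a package's own simulation facility: it uses the simplest possible model, a random walk with drift in logs (ARIMA(0,1,0)) with normal errors, treats the hypothetical `drift` and `sigma` as known, and therefore ignores parameter uncertainty. The function names are made up for this example.

```python
import math
import random
import statistics

def simulate_level_paths(last_level, drift, sigma, horizon, n_paths, seed=0):
    """Simulate future paths in levels from a random walk with drift in logs."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        log_level = math.log(last_level)
        path = []
        for _ in range(horizon):
            # Draw one normal disturbance per step and accumulate in logs.
            log_level += drift + rng.gauss(0.0, sigma)
            # Exponentiate each simulated value, not a summary of them.
            path.append(math.exp(log_level))
        paths.append(path)
    return paths

def fan_chart_bands(paths, percentiles=(5, 25, 50, 75, 95)):
    """Per-horizon percentiles of the simulated levels."""
    bands = []
    for h in range(len(paths[0])):
        values = [path[h] for path in paths]
        cuts = statistics.quantiles(values, n=100, method="inclusive")
        bands.append({p: cuts[p - 1] for p in percentiles})
    return bands

# Hypothetical inputs: last observed level, estimated drift and sigma.
paths = simulate_level_paths(last_level=100.0, drift=0.01, sigma=0.05,
                             horizon=6, n_paths=1000, seed=1)
bands = fan_chart_bands(paths)
for h, band in enumerate(bands, start=1):
    print(h, {p: round(v, 1) for p, v in band.items()})
```

Because each simulated path is exponentiated before the quantiles are taken, no mean-versus-median correction is needed: the bands are quantiles of the level itself, and they widen as the horizon grows.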

edited Nov 23 '21 at 16:00

answered Nov 22 '21 at 22:33

Aksakal


Thank you! Do you have any references on how to make a distribution of the projection (maybe in python)? – Mangostino Nov 23 '21 at 01:10
+1, although I'm not all that sure about "the variance is usually not large enough to matter"... – Stephan Kolassa Nov 23 '21 at 06:31
The variance adjustment you have included only works under an assumption of normality. Consider making that explicit. For more information, see links I have included under the OP. – Richard Hardy Nov 23 '21 at 06:52
@RichardHardy, I agree, but the OP uses ARIMA, which software always fits with normal errors by default – Aksakal Nov 23 '21 at 16:04
@Aksakal, the update looks good, thank you. – Richard Hardy Nov 23 '21 at 17:00
@Aksakal, I have a question. To obtain the distribution of my projections, I guess the first step is to exponentiate my estimated ŷ_t (which are in logs) to get back to the original units, and then obtain the distribution of my projections? – Mangostino Nov 24 '21 at 21:35
@Aksakal, and is it possible to obtain the distribution of my six-periods-ahead projections? – Mangostino Nov 24 '21 at 21:52
To obtain the distribution, it is easiest to generate a bunch of paths, let's say 100, which will allow you to show quantiles of the forecast – Aksakal Nov 24 '21 at 21:58
