
We need to forecast daily web page views. The data is a CSV file that starts on 1 January 2016 and ends on 7 February 2019. The total number of daily records is 1090.

The following step-by-step approach was used for forecasting:

  1. Convert the daily records into a time series with a frequency of 365.25.
  2. Clean the time series of any outliers.
  3. Split the time series into a training set (first 90%) and a test set (remaining 10%).
  4. Fit an auto.arima model to the training data.
  5. Check the residuals. The residual plots for the fitted data are given below.
  6. Run the Ljung-Box test on the residuals. Here is the result:

data: Residuals from ARIMA(3,1,4) with drift
Q* = 665.58, df = 722, p-value = 0.9342

Model df: 8. Total lags used: 730
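To make the numbers above easier to read, here is a minimal sketch of how a Ljung-Box statistic and its degrees of freedom are obtained. It is not the code that produced the output; the function names `autocorrelation`, `ljung_box_q` and `ljung_box_df` are illustrative assumptions, and the toy residuals are made up. The reported df = 722 is simply the 730 lags minus the 8 model degrees of freedom, which matches 3 AR terms, 4 MA terms and the drift.

```python
def autocorrelation(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(residuals, lags):
    """Ljung-Box statistic: Q = n (n + 2) * sum_{k=1..lags} r_k^2 / (n - k)."""
    n = len(residuals)
    total = sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, lags + 1)
    )
    return n * (n + 2) * total


def ljung_box_df(lags, model_df):
    """Degrees of freedom of the reference chi-square distribution."""
    return lags - model_df


# Degrees of freedom for the test reported above: 730 lags, 8 model parameters.
print(ljung_box_df(730, 8))

# Toy residuals (made up) that alternate in sign, so lag-1 correlation is strong.
toy_residuals = [1, -1] * 4
print(ljung_box_q(toy_residuals, 1))
```

The p-value is the upper-tail probability of Q under a chi-square distribution with those degrees of freedom. It is left out here because the Python standard library has no chi-square CDF.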

From the p-value (> 0.05) and the residual plots, it seems that the residuals are indistinguishable from white noise, i.e. no significant autocorrelation is left in them. Is this a good enough fit to forecast the future?

[Figure: residual plots for the fitted model]

Richard Hardy
webstat

1 Answer


Your whole approach assumes that there are no deterministic effects. Your residuals reflect this issue. What you should be looking for is a possible composite model incorporating both fixed effects and memory effects while dealing with anomalies. See "Simple method of forecasting number of guests given current and historical data".

You might also want to look at a similar question: "Model for forecasting daily page views of a web page in R".
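As a rough illustration of what a composite model means here, the sketch below combines a fixed effect (a mean for each day of the week) with a memory effect (an AR(1) on what is left over). This is only an assumption about one simple form such a model could take, not the method from the linked answers. The names `fit_composite` and `forecast_composite` are hypothetical, the two parts are estimated one after the other rather than jointly, and trend, holidays and anomaly handling are all left out.

```python
def fit_composite(y, period=7):
    """Fit day-of-week means (fixed effects) plus an AR(1) on the remainder."""
    # Fixed effects: average level for each position in the weekly cycle.
    effects = []
    for d in range(period):
        vals = y[d::period]
        effects.append(sum(vals) / len(vals))
    # Remainder after removing the deterministic part.
    resid = [v - effects[t % period] for t, v in enumerate(y)]
    # Memory effect: least-squares AR(1) coefficient, no intercept.
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(r * r for r in resid[:-1])
    phi = num / den if den else 0.0
    return effects, phi, resid


def forecast_composite(y, horizon, period=7):
    """Forecast = weekday effect + decaying carry-over of the last remainder."""
    effects, phi, resid = fit_composite(y, period)
    n = len(y)
    carry = resid[-1]
    forecasts = []
    for h in range(1, horizon + 1):
        carry = phi * carry  # AR(1) memory fades geometrically
        forecasts.append(effects[(n + h - 1) % period] + carry)
    return forecasts


# Made-up page views with a weekly pattern and some irregular variation.
views = [120, 135, 140, 138, 130, 90, 85,
         125, 139, 150, 141, 128, 95, 80,
         118, 131, 137, 142, 133, 88, 91]
print(forecast_composite(views, horizon=7))
```

If the residuals of the pure ARIMA fit still show a weekly or event-driven pattern, that is a sign the deterministic part should be modelled explicitly, along the lines sketched here.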

IrishStat