I am new to time-series data. I have a dataset with Year (2013-2018), Month, Mdate (1, 2, 3, 4, ..., 31), Day (Monday-Sunday), Time (0-24) and Hourly Counts. I see two seasonal patterns in the dataset. I need to train and test a regression model that predicts the Hourly Counts in 2018 from past records (2013-2017). Hourly Counts is the target variable, and the other attributes will be combined to form the features.

[Image: two boxplots]

I have drawn two boxplots. The left one shows the daily average hourly count and the right one shows the monthly total hourly count. How should I deal with the seasonal trends in the dataset? Thank you so much.

Decision tree regression must be used in this case, as per the requirements.

1 Answer

You don't want to remove the seasonalities; you want to account for them in forecasting.

You have a case of . Take a look at our other threads in that tag (perhaps in decreasing order of votes). The tag wiki has pointers to literature.


EDIT. Since you must use decision trees (why?), your best bet would be to include predictors that model your seasonalities. The trees should be able to account for interactions (the hourly pattern will likely be different on weekdays than on weekends) automatically.

  • Use harmonics for year-over-year seasonality. That is, transform the hour within a year into sine and cosine waves with one, two or three periods per year.
  • Use daily dummies for the day of the week.
  • Again, use harmonics for the day-over-day seasonality. That is, transform the hour of the day into sine and cosine waves with one or two periods per day.
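As a rough illustration of the three kinds of predictors listed above, here is a minimal Python sketch using only the standard library. The function name `seasonal_features`, the feature names, and the choice of three yearly and two daily harmonics are assumptions made for this example, as is approximating the year length by 365.25 days.

```python
import math
from datetime import datetime

HOURS_PER_DAY = 24
# Assumption: an average year length, so leap years are only approximated.
HOURS_PER_YEAR = 365.25 * 24


def seasonal_features(ts, yearly_orders=(1, 2, 3), daily_orders=(1, 2)):
    """Return a dict of seasonal predictors for one hourly timestamp.

    Hypothetical helper: harmonics for the position within the year,
    harmonics for the hour of the day, and one dummy per weekday.
    """
    # Hours elapsed since the start of the calendar year.
    start_of_year = datetime(ts.year, 1, 1)
    hour_of_year = (ts - start_of_year).total_seconds() / 3600.0

    features = {}

    # Sine/cosine pairs with k periods per year.
    for k in yearly_orders:
        angle = 2.0 * math.pi * k * hour_of_year / HOURS_PER_YEAR
        features[f"year_sin_{k}"] = math.sin(angle)
        features[f"year_cos_{k}"] = math.cos(angle)

    # Sine/cosine pairs with k periods per day.
    for k in daily_orders:
        angle = 2.0 * math.pi * k * ts.hour / HOURS_PER_DAY
        features[f"day_sin_{k}"] = math.sin(angle)
        features[f"day_cos_{k}"] = math.cos(angle)

    # Day-of-week dummies, Monday = 0 ... Sunday = 6.
    for d in range(7):
        features[f"dow_{d}"] = 1 if ts.weekday() == d else 0

    return features


# Example: 06:00 on Monday, 1 January 2018.
example = seasonal_features(datetime(2018, 1, 1, 6))
```

Applying this to every row gives a feature table that can be passed to a regression tree in place of the raw Month, Day and Time columns.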

In particular, don't use dummy coding or similar for months within the year or hours within the day; see here for the reason. I am less concerned about dummy coding for the days of the week, because there is usually indeed a sharp difference between weekdays and weekends.

Finally, if you can, I recommend looking at Random Forests as a straightforward generalization of decision trees.
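To make the comparison concrete, the following sketch fits a single tree and a Random Forest on the same seasonal predictors, training on 2013-2017 and testing on 2018. It assumes scikit-learn and NumPy are available. The data are synthetic and invented purely for illustration (no leap days, an arbitrary signal plus noise), and the hyperparameters are arbitrary choices, not recommendations.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: six 365-day years of hourly records.
hours_per_year = 365 * 24
n_hours = 6 * hours_per_year
hour_index = np.arange(n_hours)
year = 2013 + hour_index // hours_per_year
hour_of_day = hour_index % 24
day_of_week = (hour_index // 24) % 7
hour_of_year = hour_index % hours_per_year

# Predictors: one yearly harmonic, one daily harmonic, weekday dummies.
X = np.column_stack([
    np.sin(2 * np.pi * hour_of_year / hours_per_year),
    np.cos(2 * np.pi * hour_of_year / hours_per_year),
    np.sin(2 * np.pi * hour_of_day / 24),
    np.cos(2 * np.pi * hour_of_day / 24),
    np.eye(7)[day_of_week],
])

# Made-up target: the daily pattern is weaker on the last two days of the week.
weekend = day_of_week >= 5
y = (
    100
    + 20 * np.cos(2 * np.pi * hour_of_year / hours_per_year)
    + np.where(weekend, 10, 30) * np.sin(2 * np.pi * hour_of_day / 24)
    + rng.normal(0, 5, n_hours)
)

# Train on 2013-2017, hold out 2018.
train = year <= 2017
test = year == 2018

tree = DecisionTreeRegressor(min_samples_leaf=20, random_state=0)
tree.fit(X[train], y[train])

forest = RandomForestRegressor(
    n_estimators=30, min_samples_leaf=20, random_state=0
)
forest.fit(X[train], y[train])

# Mean absolute error on the held-out year.
mae_tree = np.mean(np.abs(tree.predict(X[test]) - y[test]))
mae_forest = np.mean(np.abs(forest.predict(X[test]) - y[test]))
print(f"tree MAE: {mae_tree:.2f}, forest MAE: {mae_forest:.2f}")
```

Splitting by year rather than at random keeps the evaluation honest: the model only ever sees records that precede the ones it is asked to predict.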

Stephan Kolassa