I am trying to predict the traffic volume for all the stations. My dataset has two response variables (both double): the total traffic volume and the net flow traffic volume. There are 6 explanatory variables: station id (factor), year (factor), month (factor), day (factor), hour (integer), and weekday (binary). Because traffic volume shows no obvious linear trend against year, month, or day, I treat these variables as categorical.
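To make the structure concrete, here is a minimal sketch of the layout in R. The values are made up, the column names (`station`, `total`, `netflow`, and so on) are hypothetical, and the 60 stations are chosen only so that the station factor has more than 53 levels; none of this is my actual data.

```r
# Toy data with made-up values; all names here are hypothetical.
set.seed(1)
n <- 200
station_ids <- sprintf("S%03d", 1:60)  # assumed: 60 stations, i.e. more than 53 levels

traffic <- data.frame(
  # Fix the levels explicitly so the factor keeps all 60 stations
  station = factor(sample(station_ids, n, replace = TRUE), levels = station_ids),
  year    = factor(sample(2013:2015, n, replace = TRUE)),
  month   = factor(sample(1:12, n, replace = TRUE)),
  day     = factor(sample(1:28, n, replace = TRUE)),
  hour    = sample(0:23, n, replace = TRUE),   # integer
  weekday = sample(0:1, n, replace = TRUE),    # binary indicator
  total   = runif(n, 0, 500),                  # response 1: total traffic volume
  netflow = runif(n, -100, 100)                # response 2: net flow traffic volume
)

str(traffic)              # column types
nlevels(traffic$station)  # number of station categories
```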
I would like to use the 6 input variables to predict the total traffic volume and the net flow separately for each station. A plain linear regression never finished computing, and random forest could not handle categorical predictors with more than 53 categories.
Could anyone point me in the right direction as to which models might be able to solve this problem in R?