I have a numeric data set in the following format. Y_deseasonal is a deseasonalized time-series variable covering 2016 through 2020. Each row represents one day:
Y_deseasonal x1 x2 x3
... .. .. ..
342 22 12 25
359 27 12 25
367 27 12 22
367 27 12 22
367 27 12 22
... .. .. ..
I want to model Y_deseasonal as a function of the Xs, and I plan to test several methods (multivariate regression, neural network, random forest, etc.). Before fitting the models, I am looking for a pragmatic way to account for the recency of the observations, giving more weight to the most recent ones when building the model.
I thought of thinning out the older observations with a decay effect, randomly dropping a larger share of rows the older the year. For example:
- for 2016 sample and drop 40% of observations
- for 2017 sample and drop 30% of observations
- for 2018 sample and drop 20% of observations
- for 2019 sample and drop 10% of observations
- for 2020 sample and drop 0% of observations
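To make the scheme listed above concrete, here is a minimal sketch in plain Python. It assumes each row carries its calendar date (the excerpt above does not show a date column, so that is an assumption), and the names `DROP_FRACTION` and `decay_subsample` are hypothetical, introduced only for illustration. The rows are dummy placeholders, not my real data.

```python
import random
from datetime import date, timedelta

# Drop fractions per year, copied from the list above.
DROP_FRACTION = {2016: 0.40, 2017: 0.30, 2018: 0.20, 2019: 0.10, 2020: 0.00}


def decay_subsample(rows, drop_fraction, seed=0):
    """Randomly drop a year-specific fraction of rows; keep the rest in order.

    `rows` is a list of (day, values) tuples, where `day` is a datetime.date.
    Years missing from `drop_fraction` are kept in full.
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible

    # Group row indices by calendar year.
    by_year = {}
    for i, (day, _) in enumerate(rows):
        by_year.setdefault(day.year, []).append(i)

    # Pick the indices to drop, year by year.
    dropped = set()
    for year, indices in by_year.items():
        n_drop = round(len(indices) * drop_fraction.get(year, 0.0))
        dropped.update(rng.sample(indices, n_drop))

    return [row for i, row in enumerate(rows) if i not in dropped]


# Placeholder data (assumption): one row per day, 2016-01-01 to 2020-12-31.
# The values are dummies standing in for (Y_deseasonal, x1, x2, x3).
start, end = date(2016, 1, 1), date(2020, 12, 31)
n_days = (end - start).days + 1
rows = [(start + timedelta(days=k), (0.0, 0, 0, 0)) for k in range(n_days)]

kept = decay_subsample(rows, DROP_FRACTION)

# Count how many rows survive in each year.
kept_per_year = {}
for day, _ in kept:
    kept_per_year[day.year] = kept_per_year.get(day.year, 0) + 1
print(kept_per_year)
```

The subsample depends on the random seed, so which individual days are discarded changes from run to run unless the seed is fixed.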
Is this a sound approach, or should I explore other options?