I am currently working with a large number of time series, many of which begin with a run of leading zeros. For example:
ts = [0, 0, 0, 0, 0, 0, 256, 129, 345, ...]
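For reference, stripping the leading zeros itself is simple. Here is a minimal sketch in plain Python; `strip_leading_zeros` is a hypothetical helper name, and the values are illustrative only since the real series is longer:

```python
from itertools import dropwhile

def strip_leading_zeros(series):
    """Return the series starting at its first non-zero observation."""
    # dropwhile only discards the initial run of zeros; later zeros are kept.
    return list(dropwhile(lambda x: x == 0, series))

ts = [0, 0, 0, 0, 0, 0, 256, 129, 345]  # illustrative values only
trimmed = strip_leading_zeros(ts)
```

Zeros that occur after the first non-zero observation are left untouched, so only the leading run is affected.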
All of the time series I am investigating are on a monthly cadence. My overall goal is to get the best forecast after fitting with Prophet, ARIMA, etc.
My first thought for the series with many leading zeros was to remove the zeros and then fit a model. However, this does not always yield the best results in terms of RMSE or MAPE. So my questions are:
- Is it customary to remove leading zeros from a time series?
- Are there any types of analysis or tests that can determine when to remove leading zeros from a time series?
I have done some searching online, but I could not find much information on this topic. Any comments or resources would be greatly appreciated.
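To make the comparison concrete, here is a rough sketch of scoring one holdout window with and without the leading zeros in the training data. Everything in it is an assumption for illustration: `mean_forecast` is a deliberately naive placeholder, not Prophet or ARIMA, `compare_trimming` is a hypothetical helper, and the series values are made up. Only RMSE is computed, because MAPE is undefined whenever the holdout contains a zero.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mean_forecast(train, horizon):
    """Placeholder model (assumption): forecast the training mean at every step."""
    level = sum(train) / len(train)
    return [level] * horizon

def compare_trimming(series, horizon):
    """Score the same holdout using the full and the zero-trimmed training data."""
    train, test = series[:-horizon], series[-horizon:]
    # Index of the first non-zero training observation (len(train) if none).
    first_nonzero = next((i for i, x in enumerate(train) if x != 0), len(train))
    trimmed = train[first_nonzero:] or train  # fall back if train is all zeros
    return {
        "full": rmse(test, mean_forecast(train, horizon)),
        "trimmed": rmse(test, mean_forecast(trimmed, horizon)),
    }

ts = [0, 0, 0, 0, 0, 0, 256, 129, 345, 210, 190, 305]  # illustrative values only
scores = compare_trimming(ts, horizon=3)
```

Both variants are scored on the same holdout, so the difference in error comes only from the training window. Swapping `mean_forecast` for a real model and repeating this over several rolling origins would give a less noisy answer than a single split.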