Handling missing data in a time series

Question

Consider an epidemic curve like the one below, or any other count-based time series data:

[Figure: epidemic curve]

Suppose it turns out, from digging into the records, retrospective data collection, or just a series of case reports that got sent via carrier pigeon rather than FedEx, that there is a group of cases that we know occurred somewhere in this timeline, but we don't know where. Is it possible to use multiple imputation, or some other probabilistic technique, to insert them into the time series?

Is there any reason to believe that these newly found records are anything but iid random variables? Or is it more likely that this 'series' of case reports is autocorrelated? — gregory_britten, Jun 04 '14 at 21:14
@gregory_britten For the moment, let's say that they're iid, though I would be interested in answers in both circumstances. — Fomite, Jun 04 '14 at 21:29

score 1 · Answer 1 · answered Jun 05 '14 at 05:41

You may want to see Honaker, J. and King, G. (2010). What to do about missing values in time-series cross-section data. American Journal of Political Science, 54(2):561–581, and the related R package Amelia II for multiple imputation with respect to time series. I am not quite sure about your application as to location within the time series, but their imputation model is pretty sophisticated and configurable... so perhaps.
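To make the general idea concrete, here is a minimal sketch in Python. It is not Amelia's imputation model, just plain multiple imputation of the unknown dates under the iid assumption from the comments. The function name `impute_undated_cases`, the example counts, and the choice to place each undated case with probability proportional to the observed count on that day are all assumptions made for illustration; a different placement distribution may well suit your data better.

```python
import random

def impute_undated_cases(counts, n_undated, n_imputations=5, seed=0):
    """Return n_imputations completed copies of `counts`, each with
    n_undated extra cases placed at randomly drawn time points."""
    rng = random.Random(seed)
    # Assumption: an undated case falls on day t with probability
    # proportional to the observed count on day t (uniform if all are zero).
    weights = counts if sum(counts) > 0 else [1] * len(counts)
    completed = []
    for _ in range(n_imputations):
        series = list(counts)
        # Draw one day index per undated case, independently (iid).
        for t in rng.choices(range(len(counts)), weights=weights, k=n_undated):
            series[t] += 1
        completed.append(series)
    return completed

observed = [0, 2, 5, 9, 6, 3, 1]  # hypothetical daily case counts
draws = impute_undated_cases(observed, n_undated=4)
for series in draws:
    print(series)
```

Each completed series would then be analysed separately and the results pooled, as in any multiple imputation workflow. Note that with these weights a day with zero observed cases never receives an undated case, which may or may not be reasonable for your records.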

score 1 · Answer 2 · answered May 23 '16 at 23:34

No, this is not a task for imputation algorithms, since there are no missing values in the time series.

What is described are measurement errors.

So instead of searching for "imputation", searching for the term "measurement error correction" should give better results.

Steffen Moritz

