
Consider an epidemic curve like the one below, or any other count-based time series data:

[Figure: epidemic curve]

Suppose that, after digging into the records, collecting data retrospectively, or receiving a series of case reports that were sent via carrier pigeon rather than FedEx, we learn of a group of cases that we know occurred somewhere in this timeline, but not where. Is it possible to use multiple imputation, or some other probabilistic technique, to insert them into the time series?

Alexis
Fomite
  • Is there any reason to believe that these newly found records are anything but iid random variables? Or is it more likely that this 'series' of case reports is autocorrelated? – gregory_britten Jun 04 '14 at 21:14
  • @gregory_britten For the moment, let's say that they're iid, though I would be interested in answers in both circumstances. – Fomite Jun 04 '14 at 21:29

2 Answers


You may want to see Honaker, J. and King, G. (2010), "What to do about missing values in time-series cross-section data," American Journal of Political Science, 54(2):561–581, and the related R package Amelia II for multiple imputation in time series. I am not sure whether it fits your application, where the unknown is the location of cases within the time series, but their imputation model is sophisticated and configurable, so perhaps it does.
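
For what it's worth, here is a rough Python sketch of the general multiple-imputation idea from the question. It is not Amelia's algorithm. It assumes the unplaced cases are iid (as stated in the comments) and that each one falls in a given period with probability proportional to the count already observed there, plus a small constant so empty periods remain possible. The function name, the example counts, and the 0.5 smoothing constant are all hypothetical choices for illustration.

```python
import numpy as np

def allocate_unplaced_cases(observed, n_unplaced, n_imputations=5, seed=0):
    """Return several completed series, each with n_unplaced cases added.

    Assumption (not from the source): unplaced cases are iid and land in
    period t with probability proportional to observed[t] + 0.5.
    """
    rng = np.random.default_rng(seed)
    observed = np.asarray(observed, dtype=int)
    weights = observed + 0.5           # smoothing keeps empty periods possible
    probs = weights / weights.sum()
    # One multinomial draw per imputation: shape (n_imputations, n_periods)
    draws = rng.multinomial(n_unplaced, probs, size=n_imputations)
    return observed + draws            # broadcasts over imputations

# Hypothetical weekly case counts and 10 cases with unknown dates
observed = [2, 5, 9, 14, 11, 6, 3]
completed = allocate_unplaced_cases(observed, n_unplaced=10)

# Each row is one plausible completed epidemic curve; analyse each
# separately and pool the results, as in ordinary multiple imputation.
print(completed)
print(completed.mean(axis=0))
```

If the unplaced cases are autocorrelated rather than iid, the allocation probabilities would need to come from a time-series model instead of this simple proportional rule.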

Alexis

No, this is not a task for imputation algorithms, since there are no missing values in the time series.

What you describe are measurement errors.

So instead of searching for "imputation", the term "measurement error correction" should give better results.

Steffen Moritz