1

I am trying to impute missing values by fitting a higher-degree polynomial.

I have a highly autocorrelated time series, meaning each value at time t must be close to the value at t-1. There is some noise and there are missing values that I'm trying to fix. I am not sure how to classify sequences of missing data, or how long the sequences before and after the gap (the run of missing values) used for fitting should be.

[Image: time series with a gap of missing values]

In the picture you can see a gap of missing values which could apparently be recovered with a 3rd- or 4th-degree polynomial fitted to some values before and after the gap. Is there an established approach to this?
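To make the idea in the question concrete, here is a minimal sketch of a local polynomial fit across each gap, using only NumPy. The function name `fill_gap_polyfit` and the `window` and `degree` parameters are hypothetical choices for illustration; how many neighbouring points to use is exactly the open question above, so the defaults are placeholders rather than recommendations.

```python
import numpy as np

def fill_gap_polyfit(y, window=10, degree=3):
    """Fill each run of NaNs with a polynomial fitted to nearby observed points.

    `window` (points taken on each side of the gap) and `degree` are
    illustrative assumptions, not tuned values.
    """
    y = np.asarray(y, dtype=float)
    out = y.copy()
    missing = np.isnan(y)
    n = len(y)
    i = 0
    while i < n:
        if not missing[i]:
            i += 1
            continue
        # Find the end of this run of missing values: the gap is [i, j).
        j = i
        while j < n and missing[j]:
            j += 1
        # Collect observed points within `window` steps of the gap.
        lo, hi = max(0, i - window), min(n, j + window)
        idx = np.arange(lo, hi)
        obs = idx[~missing[lo:hi]]
        # Only fit when there are more points than coefficients to estimate.
        if len(obs) > degree:
            centre = (i + j - 1) / 2.0  # centre x to keep the fit well conditioned
            coeffs = np.polyfit(obs - centre, y[obs], degree)
            out[i:j] = np.polyval(coeffs, np.arange(i, j) - centre)
        i = j
    return out
```

Gaps with too few observed neighbours are left as NaN in this sketch. A larger `window` averages out more noise but assumes the polynomial shape holds over a longer stretch of the series.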

  • You may be interested in [this post](https://stats.stackexchange.com/questions/104565/how-to-use-auto-arima-to-impute-missing-values/). – javlacalle Jan 05 '19 at 14:09
  • 1
    @javlacalle thanks for the interesting reference, although I managed to impute the missing values using cubic splines, and they do the job perfectly! – Nemanja Boskovic Jan 09 '19 at 09:33
  • 1
    If you want an interpolation technique which includes some measure of uncertainty, consider Kriging (https://en.wikipedia.org/wiki/Kriging), aka Gaussian process regression. Though, if the gaps are really as small as shown in your image, I think simply interpolating linearly between the two end-points of the gap would be fine as well. – LBogaardt May 23 '20 at 10:16
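For comparison with the linear interpolation suggested in the last comment, a minimal sketch with `numpy.interp` might look like the following. The helper name `fill_linear` is hypothetical, and the series is assumed to be evenly spaced so that the array index can stand in for time.

```python
import numpy as np

def fill_linear(y):
    """Replace NaNs by linear interpolation between the nearest observed points.

    Assumes evenly spaced samples, so the index is used as the time axis.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    observed = ~np.isnan(y)
    out = y.copy()
    # np.interp holds the first/last observed value constant beyond the ends,
    # so leading or trailing NaNs are filled with the nearest observation.
    out[~observed] = np.interp(t[~observed], t[observed], y[observed])
    return out

print(fill_linear([1.0, np.nan, 3.0, np.nan, np.nan, 6.0]))
```

The cubic-spline route mentioned in the earlier comment can be tried the same way by fitting `scipy.interpolate.CubicSpline` to the observed points and evaluating it at the missing indices.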

1 Answer

0

I tried this approach and I think it is very useful for filling gaps in a numeric variable. The same approach can be used to get some control over noise in numeric data (i.e., by fitting a polynomial of suitable degree to the noisy data): the fitted model can then be used to generate cleaner data for the same feature.
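One possible reading of the noise-control idea is sketched below: fit a single polynomial to the observed points and evaluate it everywhere, which both smooths the noise and fills the gaps. The function name `polynomial_smooth` and the default `degree` are assumptions for illustration, and the fit here is global over the whole array rather than local to each gap.

```python
import numpy as np
from numpy.polynomial import Polynomial

def polynomial_smooth(t, y, degree=3):
    """Fit one polynomial to the non-missing points and evaluate it at every t.

    `degree` is an illustrative default; too high a degree will follow the noise.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = ~np.isnan(y)
    # Polynomial.fit rescales t internally, which keeps the fit well conditioned.
    model = Polynomial.fit(t[observed], y[observed], degree)
    return model(t)
```

For a long series a single global polynomial is rarely adequate, so in practice this would be applied to a window around the region of interest.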

  • 2
    Could you clarify what you mean by "this approach"? The OP describes two different methods: the one in the question and a different one in a comment. Either one requires more details concerning the selection of local data used to perform the imputation across each gap. – whuber Jul 17 '19 at 20:20