
I have data indexed by time:

$$ D_1, D_2, D_3, ..., D_T $$

I have a model in which I assume the parameter $\theta_t$ changes with time $t$, so I adopt a rolling-window strategy:

$$ \theta_{t+1} = \underset{\theta}{\arg \max}~~\mbox{L}(\theta~; D_1, D_2, D_3, ..., D_{t}) $$

i.e.

$$ \theta_{t+n} = \underset{\theta}{\arg \max}~~\mbox{L}(\theta~; D_n, D_{n+1}, ..., D_{t+n-1}) $$
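For concreteness, a minimal sketch of this rolling-window refit, assuming a simple Gaussian model whose negative log-likelihood is minimized with `scipy.optimize.minimize` (the model, window length, and step size are illustrative assumptions, not part of the question):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=2.0, size=500)  # D_1, ..., D_T
window = 100  # rolling-window length (illustrative)

def neg_log_lik(theta, x):
    """Negative Gaussian log-likelihood; theta = (mu, log_sigma).

    Parameterizing by log_sigma keeps sigma > 0 without constraints.
    """
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    return 0.5 * np.sum(((x - mu) / sigma) ** 2) + x.size * log_sigma

estimates = []
for start in range(0, len(data) - window + 1, 50):
    x = data[start:start + window]  # the current window of observations
    res = minimize(neg_log_lik, x0=np.array([0.0, 0.0]), args=(x,))
    estimates.append(res.x)  # MLE for this window
```

Each pass through the loop is one full optimization, which is exactly the per-step cost the question is trying to reduce.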

My problem is that fitting a model at every time step is very time-consuming: each run of my optimization routine takes around two minutes, and I have thousands of time steps.

Is there any technique in statistics that exploits the structure of the rolling window (i.e. uses the previous estimate $\theta_{t}$ and the small difference between the two likelihood functions) to obtain $\theta_{t+1}$ quickly?

One constraint: I must use MLE, not MAP or anything else.

Remark: I am already using the previous estimate $\theta_{t}$ as the initial guess for the optimization that produces $\theta_{t+1}$.
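The warm start described in the remark amounts to passing the previous optimum as the initial point of the next optimization. A hedged sketch of that, reusing an illustrative Gaussian model and `scipy.optimize.minimize` (the model and the helper name `neg_log_lik` are assumptions, not the asker's actual code):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
data = rng.normal(loc=1.0, scale=2.0, size=300)
window = 100  # rolling-window length (illustrative)

def neg_log_lik(theta, x):
    # Gaussian model for illustration; theta = (mu, log_sigma)
    mu, log_sigma = theta
    return 0.5 * np.sum(((x - mu) / np.exp(log_sigma)) ** 2) + x.size * log_sigma

theta = np.zeros(2)  # cold start only for the very first window
nfev_total = 0
for start in range(0, len(data) - window + 1, 25):
    res = minimize(neg_log_lik, x0=theta, args=(data[start:start + window],))
    theta = res.x        # warm start: theta_t becomes x0 for theta_{t+1}
    nfev_total += res.nfev  # track total function evaluations
```

Since consecutive windows share most of their observations, consecutive optima are close, so the warm start shortens each solve; the comments below quantify this at roughly a 10% saving for the asker's problem.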

kjetil b halvorsen
wonghang
  • How many iterations does your optimization routine need to find $\theta_{t+1}$? – user603 Dec 22 '15 at 04:17
  • Around 130 iterations: roughly 900 evaluations of the function and 200 evaluations of the gradient. With a warm initial point (as said in the remarks) I can reduce that by around 10%. I computed $\frac{\|\theta_{t+1} - \theta_{t}\|}{\|\theta_{t}\|} \approx 0.08$. Since the objective is a log-likelihood, the objective functions of two consecutive optimizations differ by only two (log-)density terms. – wonghang Dec 22 '15 at 09:17
