
What are the steps involved in the use of Kalman filters in state space models?

I have seen a couple of different formulations, but I'm not sure about the details. For example, Cowpertwait starts with this set of equations:

$$y_{t} = F^{'}_{t}\theta_{t}+v_{t}$$ $$\theta_{t} = G_{t}\theta_{t-1}+w_{t}$$

where $\theta_{0} \sim N(m_{0}, C_{0})$, $v_{t} \sim N(0,V_{t})$, and $w_{t} \sim N(0, W_{t})$. Here $\theta_{t}$ is the unknown state we want to estimate and $y_{t}$ is the observed value.
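
To make the model concrete, here is a minimal simulation of one special case, the local level (random walk plus noise) model with $F_{t} = G_{t} = 1$ and a scalar $\theta_{t}$. The variances below are illustrative values of my own, not anything from Cowpertwait:

```python
import numpy as np

rng = np.random.default_rng(0)
n, V, W = 100, 1.0, 0.1                   # illustrative noise variances
theta = np.zeros(n)                       # hidden states theta_t
y = np.zeros(n)                           # observations y_t
theta_prev = rng.normal(0.0, 1.0)         # theta_0 ~ N(m_0, C_0) with m_0 = 0, C_0 = 1
for t in range(n):
    theta[t] = theta_prev + rng.normal(0.0, np.sqrt(W))  # theta_t = G theta_{t-1} + w_t
    y[t] = theta[t] + rng.normal(0.0, np.sqrt(V))        # y_t = F' theta_t + v_t
    theta_prev = theta[t]
```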

Cowpertwait defines the distributions involved (prior, likelihood and posterior distribution, respectively):

$$\theta_{t}|D_{t-1} \sim N(a_{t}, R_{t})$$ $$y_{t}|\theta_{t} \sim N(F^{'}_{t}\theta_{t}, V_{t})$$ $$\theta_{t}|D_{t} \sim N(m_{t}, C_{t})$$

with

\begin{align}
a_{t} &= G_{t}m_{t-1}, & R_{t} &= G_{t}C_{t-1}G^{'}_{t} + W_{t}, \\
e_{t} &= y_{t}-f_{t}, & m_{t} &= a_{t}+A_{t}e_{t}, \\
f_{t} &= F^{'}_{t}a_{t}, & Q_{t} &= F^{'}_{t}R_{t}F_{t}+V_{t}, \\
A_{t} &= R_{t}F_{t}Q^{-1}_{t}, & C_{t} &= R_{t}-A_{t}Q_{t}A^{'}_{t}.
\end{align}

By the way, $\theta_{t}|D_{t-1}$ means the distribution of $\theta_{t}$ given the observed values $y_{1},\dots,y_{t-1}$. A simpler notation is $\theta_{t|t-1}$, but I will stick with Cowpertwait's notation.
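
To check my understanding of these recursions, here is a minimal NumPy sketch of one iteration. `kalman_step` is just a name I made up, and `F`, `G`, `V`, `W` are passed in as constant matrices for simplicity:

```python
import numpy as np

def kalman_step(m_prev, C_prev, y, F, G, V, W):
    """One iteration of the recursions above, in Cowpertwait's notation.

    m_prev, C_prev : posterior mean/covariance of theta_{t-1} given D_{t-1}
    y              : observation y_t
    F, G           : observation and state matrices (observation equation is y_t = F' theta_t + v_t)
    V, W           : observation and state noise covariances
    """
    # Prior (predicted state): theta_t | D_{t-1} ~ N(a, R)
    a = G @ m_prev
    R = G @ C_prev @ G.T + W
    # One-step forecast of the observation: y_t | D_{t-1} ~ N(f, Q)
    f = F.T @ a
    Q = F.T @ R @ F + V
    # Error correction (update): theta_t | D_t ~ N(m, C)
    e = y - f
    A = R @ F @ np.linalg.inv(Q)   # gain matrix A_t
    m = a + A @ e
    C = R - A @ Q @ A.T
    return m, C, f, Q
```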

The author also describes the prediction for $y_{t+1}|D_{t}$ in terms of expectations:

$$E[y_{t+1}|D_{t}] = E[F^{'}_{t+1}\theta_{t+1} + v_{t+1}|D_{t}] = F^{'}_{t+1}E[\theta_{t+1}|D_{t}] = F^{'}_{t+1}a_{t+1} = f_{t+1}$$
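
For completeness, the same argument gives the forecast variance (this is not spelled out in the passage I am quoting, but it follows from the definitions above, since $\theta_{t+1}$ and $v_{t+1}$ are independent given $D_{t}$):

$$\operatorname{Var}[y_{t+1}|D_{t}] = \operatorname{Var}[F^{'}_{t+1}\theta_{t+1} + v_{t+1}|D_{t}] = F^{'}_{t+1}R_{t+1}F_{t+1} + V_{t+1} = Q_{t+1}$$

so the one-step forecast distribution is $y_{t+1}|D_{t} \sim N(f_{t+1}, Q_{t+1})$.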

As far as I understand, these are the steps; please let me know if there is a mistake or an imprecision:

  1. We start with $m_{0}$ and $C_{0}$, that is, we guess a prior mean and covariance for $\theta_{0}$.
  2. We predict a value for $y_{1}|D_{0}$. That prediction is $f_{1} = F^{'}_{1}a_{1}$, where $a_{1}$ is known since $a_{1} = G_{1}m_{0}$.
  3. Once we have our prediction for $y_{1}|D_{0}$, we compute the error $e_{1} = y_{1} - f_{1}$.
  4. The error $e_{1}$ is used to compute the posterior distribution $\theta_{1}|D_{1}$, which requires $m_{1}$ and $C_{1}$. $m_{1}$ is the prior mean corrected by the gain-weighted error: $m_{1} = a_{1} + A_{1}e_{1}$.
  5. In the following iteration, we start by predicting $y_{2}|D_{1}$ as in step 2. In this case, $f_{2} = F^{'}_{2}a_{2}$. Since $a_{2} = G_{2}m_{1}$ and $m_{1}$ is the expectation of $\theta_{1}|D_{1}$ that we already calculated in the previous step, we can compute the error $e_{2}$ and the mean of the posterior distribution $\theta_{2}|D_{2}$ as before. (A sketch of the full loop follows this list.)
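
Putting the five steps together, here is a minimal sketch of the full filtering loop, reusing the `kalman_step` function sketched above on the simulated local-level series; everything is cast to $1\times 1$ arrays so the matrix formulas apply unchanged:

```python
# Filtering loop over the simulated local-level series y from above.
m, C = np.array([0.0]), np.array([[1.0]])    # step 1: m_0, C_0 (our prior guess)
F = G = np.array([[1.0]])                    # local level model: F_t = G_t = 1
V_obs, W_state = np.array([[V]]), np.array([[W]])  # same variances as in the simulation

forecasts = []
for t in range(n):
    # Steps 2-3: kalman_step forms f_t = F' a_t and the error e_t = y_t - f_t;
    # step 4: it then updates to the posterior mean/covariance (m_t, C_t).
    m, C, f, Q = kalman_step(m, C, np.array([y[t]]), F, G, V_obs, W_state)
    forecasts.append(f[0])                   # f_t = E[y_t | D_{t-1}], the prediction
```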

I think the calculation of the posterior distribution $\theta_{t}|D_{t}$ is what some people call the update step and the use of the expectation of $y_{t+1}|D_{t}$ is the prediction step.

For the sake of brevity, I omitted the steps to calculate the covariance matrices.

Did I miss anything? Do you know a better way to explain this? I think this is still somewhat messy, so maybe there is a clearer approach.

Robert Smith

1 Answer


I think what you say is correct, and I do not think it is messy. A way of phrasing it would be to say that the Kalman filter is an error-correction algorithm that modifies predictions in light of the discrepancies with current observations. This correction is made in your step 4 using the gain matrix $A_t$.
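
To see this concretely, take everything scalar; the formulas in your question then give

$$A_{t} = \frac{R_{t}F_{t}}{F_{t}^{2}R_{t}+V_{t}},$$

so when the observation noise $V_{t}$ is large relative to the prior uncertainty $R_{t}$ the gain is small and the prediction $a_{t}$ is barely corrected, while when $V_{t}$ is small the filter moves $m_{t}$ nearly all the way to the value implied by the new observation.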

F. Tusell
  • Thank you for your answer. Maybe it's correct, but I'd like to read a more detailed (and natural) explanation of this. I have read descriptions in books and slides, but most of them are not very clear and there are slight differences. – Robert Smith Oct 01 '13 at 16:53