What are the steps involved in the use of Kalman filters in state space models?
I have seen a couple of different formulations, but I'm not sure about the details. For example, Cowpertwait starts with this set of equations:
$$y_{t} = F^{'}_{t}\theta_{t}+v_{t}$$ $$\theta_{t} = G_{t}\theta_{t-1}+w_{t}$$
where $\theta_{0} \sim N(m_{0}, C_{0})$, $v_{t} \sim N(0,V_{t})$, and $w_{t} \sim N(0, W_{t})$. Here $\theta_{t}$ is the unobserved state we want to estimate and $y_{t}$ are the observed values.
Cowpertwait defines the distributions involved (prior, likelihood and posterior distribution, respectively):
$$\theta_{t}|D_{t-1} \sim N(a_{t}, R_{t})$$ $$y_{t}|\theta_{t} \sim N(F^{'}_{t}\theta_{t}, V_{t})$$ $$\theta_{t}|D_{t} \sim N(m_{t}, C_{t})$$
with
\begin{align} a_{t} &= G_{t}m_{t-1}, & R_{t} &= G_{t}C_{t-1}G^{'}_{t} + W_{t} \\ e_{t} &= y_{t}-f_{t}, & m_{t} &= a_{t}+A_{t}e_{t} \\ f_{t} &= F^{'}_{t}a_{t}, & Q_{t} &= F^{'}_{t}R_{t}F_{t}+V_{t} \\ A_{t} &= R_{t}F_{t}Q^{-1}_{t}, & C_{t} &= R_{t}-A_{t}Q_{t}A^{'}_{t} \end{align}
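To check that I am reading these recursions correctly, here is a minimal NumPy sketch of a single filtering step. The function name `kalman_step` and the array shapes are my own assumptions, not something from Cowpertwait, and I use a plain matrix inverse for readability rather than numerical robustness.

```python
import numpy as np

def kalman_step(m_prev, C_prev, y, F, G, V, W):
    """One filtering step in the notation above (hypothetical helper).

    Assumed shapes: m_prev (p,), C_prev (p, p), y (k,),
    F (p, k), G (p, p), V (k, k), W (p, p).
    """
    # Prior for theta_t | D_{t-1}
    a = G @ m_prev
    R = G @ C_prev @ G.T + W
    # One-step forecast for y_t | D_{t-1}
    f = F.T @ a
    Q = F.T @ R @ F + V
    # Forecast error and weight matrix A_t
    e = y - f
    A = R @ F @ np.linalg.inv(Q)
    # Posterior for theta_t | D_t
    m = a + A @ e
    C = R - A @ Q @ A.T
    return m, C, f, Q

# Scalar example with F = G = 1, V = W = 1, m_0 = 0, C_0 = 1, y_1 = 2
one = np.array([[1.0]])
m1, C1, f1, Q1 = kalman_step(np.array([0.0]), one, np.array([2.0]),
                             F=one, G=one, V=one, W=one)
```

With these made-up numbers, $a_{1}=0$, $R_{1}=2$, $Q_{1}=3$ and $A_{1}=2/3$, so the posterior has $m_{1}=4/3$ and $C_{1}=2/3$.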
By the way, $\theta_{t}|D_{t-1}$ means the distribution of $\theta_{t}$ given the observed values $y$ up to $t-1$. A simpler notation is $\theta_{t|t-1}$ but I will stick with Cowpertwait's notation.
The author also describes the prediction for $y_{t+1}|D_{t}$ in terms of expectations:
$$E[y_{t+1}|D_{t}] = E[F^{'}_{t+1}\theta_{t+1} + v_{t+1}|D_{t}] = F^{'}_{t+1}E[\theta_{t+1}|D_{t}] = F^{'}_{t+1}a_{t+1} = f_{t+1}$$
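If I am reading this correctly, the matching variance follows in the same way, assuming $v_{t+1}$ is independent of $\theta_{t+1}$ given $D_{t}$:

$$Var[y_{t+1}|D_{t}] = F^{'}_{t+1}Var[\theta_{t+1}|D_{t}]F_{t+1} + V_{t+1} = F^{'}_{t+1}R_{t+1}F_{t+1} + V_{t+1} = Q_{t+1}$$

so that $y_{t+1}|D_{t} \sim N(f_{t+1}, Q_{t+1})$.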
As far as I understand, these are the steps. Please let me know if there is a mistake or an imprecision:
- We start with $m_{0}$ and $C_{0}$, that is, we specify a prior mean and covariance for the initial state $\theta_{0}$.
- We predict a value for $y_{1}|D_{0}$. That should be equal to $f_{1}$ which is $F^{'}_{1}a_{1}$. $a_{1}$ is known since $a_{1} = G_{1}m_{0}$.
- Once we have our prediction for $y_{1}|D_{0}$, we compute the error $e_{1} = y_{1} - f_{1}$.
- The error $e_{1}$ is used to calculate the posterior distribution $\theta_{1}|D_{1}$, which requires $m_{1}$ and $C_{1}$. $m_{1}$ is the prior mean plus the error weighted by $A_{1}$: $m_{1} = a_{1} + A_{1}e_{1}$.
- In the following iteration, we start by predicting $y_{2}|D_{1}$ in the same way we predicted $y_{1}|D_{0}$. In this case, $f_{2} = F^{'}_{2}a_{2}$. Since $a_{2} = G_{2}m_{1}$ and $m_{1}$ is the expectation of $\theta_{1}|D_{1}$ that we already calculated in the previous step, we can proceed to compute the error $e_{2}$ and the mean of the posterior distribution $\theta_{2}|D_{2}$ as before.
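The steps above can be sketched as a loop. This is only my own illustration for the scalar case with $F_{t} = G_{t} = 1$ and constant $V$ and $W$ (a local level model, which I picked for simplicity); the function name `local_level_filter` and the numbers are made up.

```python
def local_level_filter(ys, m0, C0, V, W):
    """Scalar filter assuming F_t = G_t = 1 (hypothetical helper)."""
    m, C = m0, C0
    history = []
    for y in ys:
        # Prior for theta_t | D_{t-1}: a_t = m_{t-1}, R_t = C_{t-1} + W
        a, R = m, C + W
        # Forecast for y_t | D_{t-1}: f_t = a_t, Q_t = R_t + V
        f, Q = a, R + V
        # Forecast error and weight A_t
        e = y - f
        A = R / Q
        # Posterior for theta_t | D_t
        m = a + A * e
        C = R - A * Q * A
        history.append({"f": f, "e": e, "m": m, "C": C})
    return history

# Two made-up observations, starting from m_0 = 0 and C_0 = 1
steps = local_level_filter([2.0, 1.0], m0=0.0, C0=1.0, V=1.0, W=1.0)
```

Working this by hand, the first pass gives $f_{1}=0$, $e_{1}=2$, $m_{1}=4/3$, $C_{1}=2/3$, and the second gives $f_{2}=4/3$, $e_{2}=-1/3$, $m_{2}=9/8$, $C_{2}=5/8$, which is what I would expect if my reading of the steps is right.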
I think the calculation of the posterior distribution $\theta_{t}|D_{t}$ is what some people call the update step and the use of the expectation of $y_{t+1}|D_{t}$ is the prediction step.
For the sake of brevity, I omitted the steps to calculate the covariance matrices.
Did I miss anything? Do you know a better way to explain this? I think this is still somewhat messy, so maybe there is a clearer approach.