Some notes on Kalman Filtering

State Space form

Measurement Equation

\displaystyle \boxed{\mathbf{\underbrace{y_{t}}_{N \times 1}=\underbrace{Z_{t}}_{N \times m}\underbrace{\alpha_{t}}_{m \times 1}+d_{t}+\varepsilon_{t}}}

\displaystyle Var(\varepsilon_{t})= \mathbf{H_{t}}

Transition Equation

\displaystyle \boxed{\mathbf{\underbrace{\alpha_{t}}_{m \times 1} =\underbrace{T_{t}}_{m \times m} \alpha_{t-1}+c_{t}+\underbrace{R_{t}}_{m \times g} \underbrace{\eta_{t}}_{g \times 1}}}

\displaystyle Var(\eta_{t})=\mathbf{Q}_{t}

\displaystyle E(\alpha_{0})= \mathbf{a_{0}} \; \; \; \; Var(\alpha_{0})=\mathbf{P_{0}} \; \; \; \; E(\varepsilon_{t}\alpha_{0}^{\top})=0 \; \; \; \; E(\eta_{t}\alpha_{0}^{\top})=0

Future form

\displaystyle \mathbf{\alpha_{t+1}=T_{t}\alpha_{t}+c_{t}+R_{t}\eta_{t}}
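As a purely illustrative sketch, the system matrices can be written down directly in NumPy for a time-invariant local level model; the names, dimensions and numerical values below are my own choices and not tied to any particular library:

```python
import numpy as np

# Illustrative local level model in the state-space form above:
# y_t = Z alpha_t + d + eps_t,   alpha_t = T alpha_{t-1} + c + R eta_t
N, m, g = 1, 1, 1            # dimensions of y_t, alpha_t and eta_t

Z = np.array([[1.0]])        # N x m measurement matrix
d = np.zeros(N)              # measurement intercept d_t
H = np.array([[0.5]])        # Var(eps_t)

T = np.array([[1.0]])        # m x m transition matrix (random walk state)
c = np.zeros(m)              # transition intercept c_t
R = np.array([[1.0]])        # m x g selection matrix
Q = np.array([[0.1]])        # Var(eta_t)
```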

1.2. Kalman Filter

A recursive procedure for computing the optimal estimator of the state vector at time t. When the model is Gaussian, the Kalman filter can be interpreted as updating the mean and covariance matrix of the conditional distribution of the state vector as new observations become available. If

\displaystyle \alpha_{t-1} \sim N(\mathbf{a_{t-1},P_{t-1}})

then

\displaystyle \alpha_{t} \sim N(\mathbf{a_{t|t-1}, P_{t|t-1}})

where

\displaystyle \mathbf{a_{t|t-1}}=\mathbf{T_{t}a_{t-1}+c_{t}}

\displaystyle \mathbf{P_{t|t-1}}=\mathbf{T_{t}P_{t-1}T_{t}^{\top}+R_{t}Q_{t}R_{t}^{\top}}

Predictive distribution of \mathbf{y_{t}}

\displaystyle \mathbf{\tilde{y}}_{t|t-1}=\mathbf{Z_{t}a_{t|t-1}+d_{t}}

\displaystyle \mathbf{F_{t}=Z_{t}P_{t|t-1}Z_{t}^{\top}+H_{t}}

\displaystyle \left[ \begin{array}{c} \mathbf{\alpha_{t}}\\ \mathbf{y_{t}} \end{array}\right] \sim N \left[ \left( \begin{array}{c} \mathbf{a_{t|t-1}}\\ \mathbf{Z_{t}a_{t|t-1}+d_{t}} \end{array} \right), \left( \begin{array}{cc} \mathbf{P_{t|t-1}} & \mathbf{P_{t|t-1}Z_{t}^{\top}}\\ \mathbf{Z_{t}P_{t|t-1}} & \mathbf{Z_{t}P_{t|t-1}Z_{t}^{\top}+H_{t}} \end{array}\right) \right]

Updating equations

\displaystyle \boxed{\mathbf{a_{t}=a_{t|t-1}+\underbrace{P_{t|t-1}Z_{t}^{\top}}_{\Sigma_{12}} \underbrace{F_{t}^{-1}}_{\Sigma_{22}^{-1}}(y_{t}\underbrace{-Z_{t}a_{t|t-1}-d_{t}}_{-\mu_{2}})}}

and

\displaystyle \boxed{\mathbf{P_{t}=P_{t|t-1}-P_{t|t-1}Z_{t}^{\top}F_{t}^{-1}Z_{t}P_{t|t-1}}}
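A minimal sketch of one prediction-plus-updating recursion, using the illustrative matrices above (the function name kalman_step and the use of np.linalg.solve instead of an explicit inverse are my own choices):

```python
import numpy as np

def kalman_step(a_prev, P_prev, y, Z, d, H, T, c, R, Q):
    """One filtering recursion: (a_{t-1}, P_{t-1}) -> (a_t, P_t)."""
    # Prediction: a_{t|t-1} = T a_{t-1} + c,  P_{t|t-1} = T P_{t-1} T' + R Q R'
    a_pred = T @ a_prev + c
    P_pred = T @ P_prev @ T.T + R @ Q @ R.T

    # Innovation and its variance: v_t = y_t - Z a_{t|t-1} - d,  F_t = Z P_{t|t-1} Z' + H
    v = y - Z @ a_pred - d
    F = Z @ P_pred @ Z.T + H

    # Updating equations; solve linear systems rather than forming F^{-1} explicitly
    PZ = P_pred @ Z.T
    a_filt = a_pred + PZ @ np.linalg.solve(F, v)
    P_filt = P_pred - PZ @ np.linalg.solve(F, Z @ P_pred)
    return a_filt, P_filt, v, F
```

Iterating kalman_step over {y_{1},\dots,y_{T}} yields the filtered estimates together with the innovations {v_{t}} and variances {F_{t}} that reappear in the likelihood below.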

 

Contemporaneous filter: \displaystyle \mathbf{a_{t-1}} \rightarrow \mathbf{a_{t}}

Predictive filter: \displaystyle \mathbf{a_{t|t-1}} \rightarrow \mathbf{a_{t+1|t}}

In the latter case

\displaystyle \mathbf{a_{t+1|t}=T_{t+1}a_{t|t-1}+c_{t+1}+K_{t}v_{t}}

or

\displaystyle \mathbf{\underbrace{a_{t+1|t}}_{=T_{t+1}a_{t}+c_{t+1}}=(T_{t+1}-K_{t}Z_{t})a_{t|t-1}+K_{t}y_{t}+(c_{t+1}-K_{t}d_{t})}

where the gain matrix {\mathbf{K_{t}}} is given by

\displaystyle \boxed{\mathbf{K_{t}=T_{t+1}P_{t|t-1}Z_{t}^{\top}F_{t}^{-1}}}

and

\displaystyle \boxed{\mathbf{P_{t+1|t}=T_{t+1}} \underbrace{\mathbf{(P_{t|t-1}-P_{t|t-1}Z_{t}^{\top}F_{t}^{-1}Z_{t}P_{t|t-1}})}_{{\mathbf{P_{t}}}} \mathbf{T_{t+1}^{\top}+R_{t+1}Q_{t+1}R_{t+1}^{\top}}}
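The same step in predictive form, propagating {\mathbf{a_{t|t-1}}} and {\mathbf{P_{t|t-1}}} directly through the gain matrix (again only a sketch, with hypothetical names):

```python
import numpy as np

def predictive_step(a_pred, P_pred, y, Z, d, H, T_next, c_next, R_next, Q_next):
    """Predictive recursion: (a_{t|t-1}, P_{t|t-1}) -> (a_{t+1|t}, P_{t+1|t})."""
    v = y - Z @ a_pred - d                          # innovation v_t
    F = Z @ P_pred @ Z.T + H                        # F_t
    K = T_next @ P_pred @ Z.T @ np.linalg.inv(F)    # K_t = T_{t+1} P_{t|t-1} Z' F_t^{-1}

    a_next = T_next @ a_pred + c_next + K @ v       # a_{t+1|t}
    P_filt = P_pred - P_pred @ Z.T @ np.linalg.solve(F, Z @ P_pred)   # P_t
    P_next = T_next @ P_filt @ T_next.T + R_next @ Q_next @ R_next.T  # P_{t+1|t}
    return a_next, P_next, v, F
```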

 

Initialization

Start the Kalman filter at {t=0} with a diffuse prior

\displaystyle \boxed{\mathbf{P_{0}=\kappa I} \; \; \; \; \kappa \rightarrow \infty}
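In practice the diffuse prior is often approximated by a large but finite κ; the value below is arbitrary:

```python
import numpy as np

m = 1                        # state dimension, as in the sketch above
kappa = 1e7                  # large but finite kappa approximates the diffuse prior
a0 = np.zeros(m)
P0 = kappa * np.eye(m)
```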

Prediction


\displaystyle \boxed{\mathbf{a_{T+\ell |T}=T_{T+\ell}a_{T+\ell-1|T}+c_{T+\ell}}}

\displaystyle \boxed{\mathbf{P_{T+\ell |T}=T_{T+\ell}P_{T+\ell-1|T}T_{T+\ell}^{\top}+R_{T+\ell}Q_{T+\ell}R_{T+\ell}^{\top}}}

Taking conditional expectations in the measurement equation for {y_{T+\ell}}

\displaystyle \boxed{\mathbf{E[y_{T+\ell}|Y_{T}]=\tilde{y}_{T+\ell|T}=Z_{T+\ell}a_{T+\ell|T}+d_{T+\ell}}}

with MSE matrix

\displaystyle MSE(\tilde{y}_{T+\ell|T})=\mathbf{Z_{T+\ell}P_{T+\ell|T}Z_{T+\ell}^{\top}+H_{T+\ell}}
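A sketch of the ℓ-step-ahead recursions under time-invariant system matrices, starting from the filtered {\mathbf{a_{T}}} and {\mathbf{P_{T}}} (the function name forecast is illustrative):

```python
import numpy as np

def forecast(a_T, P_T, ell, Z, d, H, T, c, R, Q):
    """l-step-ahead forecasts of y and their MSE matrices, for l = 1, ..., ell."""
    a, P = a_T, P_T
    y_hat, mse = [], []
    for _ in range(ell):
        a = T @ a + c                        # a_{T+l|T}
        P = T @ P @ T.T + R @ Q @ R.T        # P_{T+l|T}
        y_hat.append(Z @ a + d)              # E[y_{T+l} | Y_T]
        mse.append(Z @ P @ Z.T + H)          # MSE of the forecast of y_{T+l}
    return y_hat, mse
```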

 

MLE and prediction error decomposition

\displaystyle \mathbf{p(Y;\psi)=\prod_{t=1}^{T}p(y_{t}|Y_{t-1})}

Prediction errors or innovations


\displaystyle \boxed{\mathbf{v_{t}=y_{t}-\tilde{y}_{t|t-1} \sim NID(0,F_{t})}}

Prediction error decomposition

\displaystyle \boxed{\mathbf{\ell(\psi)= -\frac{NT}{2} \log 2\pi-\frac{1}{2} \sum_{t=1}^{T} \log|F_{t}|-\frac{1}{2} \sum_{t=1}^{T} v_{t}^{\top} F_{t}^{-1}v_{t}}}
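The decomposition translates directly into code: a sketch that accumulates the Gaussian log-likelihood from the innovations {v_{t}} and variances {F_{t}} returned by the filtering recursion above (the signature is illustrative):

```python
import numpy as np

def loglik(vs, Fs):
    """Prediction error decomposition: log-likelihood from innovations and variances."""
    ll = 0.0
    for v, F in zip(vs, Fs):
        N = v.shape[0]
        _, logdet = np.linalg.slogdet(F)     # log|F_t|, numerically stable
        ll -= 0.5 * (N * np.log(2 * np.pi) + logdet + v @ np.linalg.solve(F, v))
    return ll
```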

 

Diagnostic tests can be based on the standardized innovations {\mathbf{F_{t}^{-1/2} v_{t}}}, which are serially independent if {\psi} is known.
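A sketch of the standardization via a Cholesky factor of {\mathbf{F_{t}}} (assuming v and F come from the filtering recursion above):

```python
import numpy as np

def standardize(v, F):
    """Standardized innovation F_t^{-1/2} v_t via a Cholesky factor of F_t."""
    L = np.linalg.cholesky(F)                # F_t = L L'
    return np.linalg.solve(L, v)             # L^{-1} v_t has identity covariance
```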

 

 

* {L(\mathbf{\psi})} is maximized w.r.t. { \mathbf{\psi} } numerically. Diffuse prior {\Rightarrow} exact likelihood.
* { \mathbf{\psi= \left[ \underbrace{\psi^{\top}_{*} }_{n-1} , \sigma^{2}_{*} \right] ^{\top}} }: the scale {\sigma^{2}_{*}} can be concentrated out of the likelihood, so the numerical maximization runs only over the {n-1} parameters in {\psi_{*}} (sketched below).
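For concreteness, a sketch of the concentration step, assuming the scale enters as {\mathbf{H_{t}=\sigma^{2}_{*}H^{*}_{t}}} and {\mathbf{Q_{t}=\sigma^{2}_{*}Q^{*}_{t}}}, so that {\mathbf{F_{t}=\sigma^{2}_{*}F^{*}_{t}}}:

\displaystyle \mathbf{\tilde{\sigma}^{2}_{*}=\frac{1}{NT}\sum_{t=1}^{T}v_{t}^{\top}F^{*-1}_{t}v_{t}}

\displaystyle \mathbf{\ell_{c}(\psi_{*})=-\frac{NT}{2}\left(\log 2\pi+1\right)-\frac{NT}{2}\log\tilde{\sigma}^{2}_{*}-\frac{1}{2}\sum_{t=1}^{T}\log|F^{*}_{t}|}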

—————————————————————————————————————–

Further Reading:

Time Series Analysis

Time Series: Theory and Methods (Springer Series in Statistics)

An Introduction to State Space Time Series Analysis (Practical Econometrics Series)

Forecasting, Structural Time Series Models and the Kalman Filter
