Thursday, June 23, 2016

GSoC 2016 #2

Markov switching autoregression

If you studied statistics and remember the basics of time series analysis, you should be familiar with the autoregressive model, usually denoted AR(p):

    \phi(L) (y_t - \nu) = e_t, \qquad e_t \sim N(0, \sigma^2)        (1)

Here y_t is the AR process, e_t is a white noise term, and \nu is the mean of the process. \phi is a lag polynomial of order p:

    \phi(L) = 1 - \phi_1 L - \phi_2 L^2 - \dots - \phi_p L^p        (2)

L is the lag operator, which, applied to a time series element, gives the previous element: L y_t = y_{t-1}. So (1) can actually be rewritten in the following explicit form:

    y_t = \nu + \phi_1 (y_{t-1} - \nu) + \dots + \phi_p (y_{t-p} - \nu) + e_t        (3)
Since the process definition (1) is essentially a linear equation between lagged process values and the error term, it can be put in state space form, as shown in [1], chapter 3.3.
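For instance, here is a minimal NumPy sketch of one common companion-form representation of an AR(p) process; the coefficient values are made up, and [1] discusses such representations in detail:

    import numpy as np

    # Illustrative AR(3) coefficients
    phi = np.array([0.5, -0.2, 0.1])
    p = len(phi)

    # State: alpha_t = (y_t - nu, y_{t-1} - nu, ..., y_{t-p+1} - nu).
    # Transition: the first row applies the AR coefficients, the identity
    # block below it shifts the lagged states down by one position.
    T = np.zeros((p, p))
    T[0, :] = phi
    T[1:, :-1] = np.eye(p - 1)

    Z = np.zeros(p)                # observation: y_t = nu + Z @ alpha_t
    Z[0] = 1.0
    R = np.zeros((p, 1))           # the error e_t enters only the first state
    R[0, 0] = 1.0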
Now, let's extend equation (1) by adding an underlying discrete Markov process S_t of changing regimes:

    y_t = \nu_{S_t} + \phi_{1,S_t} (y_{t-1} - \nu_{S_{t-1}}) + \dots + \phi_{p,S_t} (y_{t-p} - \nu_{S_{t-p}}) + e_t, \qquad e_t \sim N(0, \sigma^2_{S_t})        (4)
You can notice that the mean, the error variance, and the lag polynomial now depend on the regime value. This is the so-called Markov switching autoregressive (MS AR) model. Where can it be used in practice? Let's look at the example from [2], chapter 4.4, which I also used for testing my code:
This is a series of U.S. real GDP. Looking at the data, two regimes are noticeable: expansion and recession. Using maximum likelihood estimation, we can fit a two-regime switching-mean AR model to this data to describe the dynamics of real GDP quantitatively. The authors use an AR(4) model, and so do we. The next picture displays the smoothed (that is, conditional on the whole dataset) probabilities of being in the recession regime:

Peaks of probability accurately correspond to known recession periods, which shows that Markov switching AR provides a sophisticated tool for analyzing the underlying structure of a time series.
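For comparison, here is a sketch of how the same Hamilton model can be estimated with the Hamilton-filter-based MarkovAutoregression class from statsmodels' regime_switching module (a separate implementation, not the state-space class from my PR); the data loading follows the statsmodels documentation example:

    import pandas as pd
    import statsmodels.api as sm

    # Quarterly U.S. real GNP growth, 1951-1984, as used by Hamilton (1989);
    # the URL is taken from the statsmodels documentation example.
    dta = pd.read_stata('https://www.stata-press.com/data/r14/rgnp.dta').iloc[1:]
    dta.index = pd.DatetimeIndex(dta.date, freq='QS')

    # Two regimes, AR(4), and only the mean switches (switching_ar=False)
    model = sm.tsa.MarkovAutoregression(dta.rgnp, k_regimes=2, order=4,
                                        switching_ar=False)
    results = model.fit()

    # Smoothed (conditional on the whole sample) probabilities of regime 0,
    # which turns out to be the recession regime for this dataset
    recession_probs = results.smoothed_marginal_probabilities[0]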

Implementation

Markov switching autoregression is implemented in the ms_ar.py file in my PR to Statsmodels. This file contains the MarkovAutoregression class, which extends RegimeSwitchingMLEModel and "translates" equation (4) into the state space "language".
It was quite entertaining to express the ideas explained in chapter 3.3 of [1] in Python code. One thing I had to be very careful about: for an AR(p) model with k regimes, the state space representation has to carry k^(p+1) regimes, because the switching means in (4) occur with p+1 different regime indices. Thus, every state space regime represents a combination of p+1 lagged AR regimes, as the sketch below illustrates.
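A quick illustration of this bookkeeping (the names are mine, not the PR's actual internals): every state space regime can be identified with a tuple of p+1 consecutive AR regime indices.

    from itertools import product

    k = 2  # number of AR regimes
    p = 4  # AR order

    # Each state space regime corresponds to a tuple (S_t, S_{t-1}, ..., S_{t-p})
    state_space_regimes = list(product(range(k), repeat=p + 1))
    print(len(state_space_regimes))  # k ** (p + 1) = 32 for the model above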
Such a large number of regimes leads to longer computation times, which caused some problems. For example, Kim filtering of the real GDP model above took 25 seconds, which is unacceptable when we are doing a lot of BFGS iterations to find the likelihood maximum. Luckily, I found a way to optimize the Kim filter, which was quite straightforward, in fact. If you remember the previous blog post, a Kim filter iteration consists of a heavyweight Kalman filter step, where the Kalman filtering iteration is applied a large number of times (k^(2(p+1)) for MS AR!), followed by summing the results with weights equal to the joint probabilities of being in the current and previous regimes. The thing is that in the case of a sparse regime transition matrix, which is exactly what MS AR produces, these joint probabilities are only rarely non-zero, and we don't have to run the Kalman filtering for the zero ones! This optimization, sketched below, decreased the Kim filter running time dramatically, to 2-3 seconds on my machine (which is not very powerful, by the way).
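Schematically, the optimization looks like this (kalman_filter_step here is a hypothetical stand-in for the per-regime-pair Kalman filtering iteration, not the actual Kim filter code):

    import numpy as np

    def kim_filter_step(joint_probs, kalman_filter_step):
        # joint_probs[i, j] - joint probability of being in regime i at t-1
        # and in regime j at t; for a sparse transition matrix most entries
        # are exactly zero.
        results = {}
        # Iterate only over the non-zero regime pairs, skipping the Kalman
        # filtering of pairs that cannot occur
        for i, j in zip(*np.nonzero(joint_probs)):
            results[i, j] = kalman_filter_step(i, j)
        # ...the results are then collapsed: weighted by joint_probs and
        # summed, as in the usual Kim filter iteration
        return results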

EM-algorithm

The MarkovAutoregression class also implements an EM-algorithm. The Markov switching autoregressive model defined by (4) is, with some approximations, a regression with switching parameters and lagged observations as regressors. Such a model, as shown in chapter 4.3.5 of [2], has a simple closed-form solution for the EM iteration. The EM-algorithm is a great device for achieving very fast convergence. For example, in the comments to my PR I copied a debug output with the following numbers:
#0 Loglike - -1941.85536159
#1 Loglike - -177.181731435
Here #0 indicates the likelihood of the random starting parameters, and #1 the likelihood after one iteration of the EM-algorithm. A very significant improvement, isn't it?
MarkovAutoregression has two public methods to run the EM-algorithm: fit_em and fit_em_with_random_starts. The first simply performs a number of EM iterations from given starting parameters, while the second generates a set of random starting parameters, applies the EM-algorithm to each of them, and finally chooses the result with the best likelihood.
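In pseudocode, fit_em_with_random_starts does roughly the following (random_start_params and em_iteration are hypothetical helpers standing in for the actual implementation):

    import numpy as np

    def fit_em_with_random_starts(model, n_starts=50, n_iters=10, seed=1):
        rng = np.random.default_rng(seed)
        best_params, best_loglike = None, -np.inf
        for _ in range(n_starts):
            # Draw random starting parameters (hypothetical helper)
            params = model.random_start_params(rng)
            # One EM session: a fixed number of closed-form EM iterations
            for _ in range(n_iters):
                params = model.em_iteration(params)
            # Keep the parameters with the best likelihood
            loglike = model.loglike(params)
            if loglike > best_loglike:
                best_params, best_loglike = params, loglike
        return best_params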

Testing

Right now there are two test files for the MarkovAutoregression class, each based on one model: test_ms_ar_hamilton1989.py and test_ms_ar_garcia_perron1996.py. Besides formal functional tests, such as checking that filtering, smoothing, and maximum likelihood estimation give correct values against this and this Gauss code samples, these files test the EM-algorithm in its typical usage scenario: the user knows nothing about the correct parameters but wants to estimate something close to the global likelihood maximum. This task is handled by the already mentioned fit_em_with_random_starts method, which by default runs 50 sessions of the EM-algorithm from random starts, each session consisting of 10 iterations.

What's next?

I hope that the hardest part of the project, that is, developing the Kim filter and Markov switching autoregression, is behind me. Two more models remain: the dynamic factor model and the time-varying parameters model with regime switching. There will also be a lot of refactoring of the already written code, so some future articles are going to be all about coding.

Literature

[1] - "Time Series Analysis by State Space Methods", Second Edition, by J. Durbin and S.J. Koopman.
[2] "State-space Models With Regime Switching" by Chang-Jin Kim and Charles R. Nelson.
