Sunday, June 5, 2016

GSoC 2016 #1

Kim filter


(Image: the picture I get when searching for "Kim filter" on Google.)

If you look at the code in my Pull Request, you can see the KimFilter class (in the kim_filter.py file), which is the main part of my project. So what exactly does it do?
There is a thing called a linear state space representation:

$$y_t = H_t \beta_t + A_t z_t + e_t$$
$$\beta_t = \nu + F_t \beta_{t-1} + G_t v_t$$

It describes an observed vector process y_t in time. H_t, A_t, F_t and G_t are matrices, z_t is a vector of exogenous data, and beta_t is an unobserved process vector which generates y_t. e_t and v_t are white noise vectors, and nu is the intercept term.
This representation is usually used when we have some real-life process y_t and want to understand its internal structure, that is, to find a law describing it.
The exact values of the matrices in the model definition depend on an unknown vector of parameters in a model-specific way. In the most general case it is just a vector of all the matrices' elements written in a row (although I have never witnessed anything like that in practice). To estimate the unknown parameters we use the principle of maximum likelihood, widely used in statistics and machine learning: the best parameter estimates are those that maximize the probability density of the observed data. So, to apply maximum likelihood estimation (MLE), we need to compute the probability density of the observed process values. This is exactly what the well-known Kalman filter does, calculating filtered values of the unobserved process in addition. We then apply a numerical optimizer to the Kalman filter's likelihood output, viewed as a function of the parameters, and obtain the solution.
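To make this concrete, here is a minimal sketch of MLE through a Kalman filter for a univariate local level model. This is a toy example written for this post, not the statsmodels implementation:

```python
import numpy as np
from scipy.optimize import minimize

# Toy local level model:
#   y_t = beta_t + e_t,          e_t ~ N(0, sigma2_e)
#   beta_t = beta_{t-1} + v_t,   v_t ~ N(0, sigma2_v)

def kalman_loglike(params, y):
    sigma2_e, sigma2_v = np.exp(params)  # exp keeps the variances positive
    beta, P = 0.0, 1e6                   # roughly diffuse initialization
    loglike = 0.0
    for obs in y:
        P = P + sigma2_v                 # prediction step
        err = obs - beta                 # one-step-ahead forecast error
        F = P + sigma2_e                 # ...and its variance
        loglike += -0.5 * (np.log(2 * np.pi * F) + err ** 2 / F)
        K = P / F                        # Kalman gain
        beta = beta + K * err            # updating step
        P = (1 - K) * P
    return loglike

y = np.cumsum(np.random.normal(size=200)) + np.random.normal(size=200)
res = minimize(lambda p: -kalman_loglike(p, y), x0=[0.0, 0.0])
print(np.exp(res.x))  # MLE of (sigma2_e, sigma2_v), should be close to (1, 1)
```

The Kim filter below plays the same role for the switching model: it produces a likelihood which an optimizer can maximize.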
Let's modify the linear state space representation a little:

$$y_t = H_{S_t} \beta_t + A_{S_t} z_t + e_t$$
$$\beta_t = \nu + F_{S_t} \beta_{t-1} + G_{S_t} v_t$$
You can see that instead of the time indexes under the representation matrices we now have S_t, a discrete Markov process with P states and a P x P transition matrix, which represents the changing regime of the process. When the process S_t is known a priori, we have a usual linear model. But when it is unknown and random, the model becomes much more complex. To estimate it, we use the Kim filter, described in [1], which I have recently implemented in the previously mentioned PR.
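To give a feeling for what this means, here is a small toy illustration of a regime path selecting the active parameters. This is my own example; I use the convention that column j of the transition matrix holds Pr(S_t = i | S_{t-1} = j), so columns sum to one (conventions vary in the literature):

```python
import numpy as np

# Two regimes: every switching parameter has one value per regime, and
# the Markov chain S_t selects which value is active at time t.
transition = np.array([[0.95, 0.10],
                       [0.05, 0.90]])  # column j: Pr(S_t = i | S_{t-1} = j)

F = np.array([0.9, 0.3])  # e.g. a regime-switching transition coefficient

# Simulate the regime path and see which coefficient is in effect.
rng = np.random.default_rng(42)
S = [0]
for _ in range(9):
    S.append(rng.choice(2, p=transition[:, S[-1]]))
print([F[s] for s in S])  # the coefficient actually used in each period
```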
The Kim filter iterates over the observed process, and during every iteration it runs through three steps (a schematic sketch of one iteration follows the list):
  1. Kalman filter step (the _kalman_filter_step method of the KimFilter class). During this phase we run a Kalman filter update for every combination of current and previous regimes, and thus get a battery of P x P filtered states and their covariances, conditional on every (previous regime, current regime) pair. If you look at the code, you can see that I delegate these calculations to P KalmanFilter instances, which is very handy.
  2. Hamilton filter step (the _hamilton_filter_step method). During this phase we calculate the likelihood of the current observation, conditional on the previous ones, and accumulate it into the overall likelihood. We also calculate some helpful weighting probabilities.
  3. Approximation step (the _approximation_step method). Here we average the filtered states and covariances obtained during the first step, using the weights obtained during the second step, to get filtered states conditional only on the current regime value.
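Here is a self-contained scalar sketch of one such iteration, mirroring the three steps above. It is an illustration written for this post, not the code from the PR, and it assumes the column convention for the transition matrix used earlier:

```python
import numpy as np

# Toy scalar switching model, with j = S_t the current regime:
#   y_t = H[j] * beta_t + e_t,                e_t ~ N(0, R[j])
#   beta_t = nu[j] + F[j] * beta_{t-1} + v_t, v_t ~ N(0, Q[j])

def kim_filter_iteration(y_t, b_prev, P_prev, prob_prev, trans, nu, F, Q, H, R):
    n = len(nu)  # number of regimes P
    b_filt = np.empty((n, n))
    P_filt = np.empty((n, n))
    dens = np.empty((n, n))
    # 1. Kalman filter step: a battery of n x n filtered states and
    #    covariances, one per (previous regime i, current regime j) pair.
    for i in range(n):
        for j in range(n):
            b_pred = nu[j] + F[j] * b_prev[i]
            P_pred = F[j] ** 2 * P_prev[i] + Q[j]
            err = y_t - H[j] * b_pred
            f = H[j] ** 2 * P_pred + R[j]
            K = P_pred * H[j] / f
            b_filt[i, j] = b_pred + K * err
            P_filt[i, j] = (1 - K * H[j]) * P_pred
            dens[i, j] = np.exp(-0.5 * err ** 2 / f) / np.sqrt(2 * np.pi * f)
    # 2. Hamilton filter step: likelihood of y_t and weighting probabilities.
    joint_pred = trans.T * prob_prev[:, None]  # Pr(S_{t-1}=i, S_t=j | past data)
    lik = (dens * joint_pred).sum()            # f(y_t | past data)
    weights = dens * joint_pred / lik          # Pr(S_{t-1}=i, S_t=j | data up to t)
    prob = weights.sum(axis=0)                 # Pr(S_t=j | data up to t)
    # 3. Approximation step: collapse the n x n battery into n filtered
    #    states, conditional only on the current regime.
    b = (weights * b_filt).sum(axis=0) / prob
    P = (weights * (P_filt + (b[None, :] - b_filt) ** 2)).sum(axis=0) / prob
    return b, P, prob, np.log(lik)
```

The weighted collapse in step 3 is exactly the approximation that gives the filter its name: without it, the number of regime histories to track would grow exponentially in t.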
For a detailed mathematical description you can refer to [1] or browse my code.

Maximum likelihood estimation

We can now evaluate the likelihood of the parameters. The next goal is to optimize the parameters with respect to this likelihood. This responsibility is handled by the RegimeSwitchingMLEModel class, located in the rs_mlemodel.py file and derived from the MLEModel class. It was very handy to extend MLEModel and substitute the Kalman filter with a Kim filter instance in the corresponding property. The two filters have a very similar interface, and MLEModel, being mostly about optimization, delegates all model-specific actions to the filter instance.
Most of the noticeable changes concern parameter transformation, since we now have an additional group of parameters representing the transition matrix. RegimeSwitchingMLEModel also has a feature discussed in the next section.
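For example, every column of the transition matrix must consist of non-negative probabilities summing to one, while the optimizer works in unconstrained space. A transformation along these lines does the job (a sketch of the general idea, not necessarily the exact transform used in the PR):

```python
import numpy as np

# Map unconstrained optimizer parameters to a valid transition matrix by
# pushing each column through a softmax; the last row is fixed at zero
# before the softmax for identifiability.

def untransform_transition(unconstrained, k_regimes):
    x = np.asarray(unconstrained).reshape(k_regimes - 1, k_regimes)
    x = np.vstack([x, np.zeros(k_regimes)])
    expx = np.exp(x - x.max(axis=0))  # numerically stable softmax
    return expx / expx.sum(axis=0)    # every column now sums to one

print(untransform_transition([0.5, -1.0], k_regimes=2))
```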

Starting parameters


Just passing the likelihood function to the optimizer doesn't guarantee convergence to the global maximum. The optimizer can get stuck in a local maximum, so the goal is to start from somewhere close to the best likelihood value. My proposal suggested two approaches to this problem:

  1. Pretend there is no problem and let the user provide the starting parameters, hoping that they will enter something very close to the global maximum. This scenario is also implemented, because it is used for testing against Kim's code samples, where the author passes a priori known starting values to the optimizer.
  2. First fit a one-regime analog of the switching model to the provided data (to use this feature, set fit_nonswitching_first=True in the fit method of your RegimeSwitchingMLEModel extension). Then initialize the switching model with the transition matrix set to identity (as if there were no switching) and all regimes equal to the parameters obtained by the non-switching fit; see the sketch after this list. This approach seems intuitive, because we choose the model by complicating it gradually, guided by likelihood improvement.
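Here is a tiny illustration of how such starting values could be assembled; the parameter layout is invented for the example and is not the actual layout in the code:

```python
import numpy as np

# Suppose the one-regime fit gave an AR coefficient and a variance.
nonswitching_params = np.array([0.8, 1.5])
k_regimes = 2

# Every regime starts from the same one-regime estimates...
regime_params = np.tile(nonswitching_params, k_regimes)
# ...and the transition matrix starts as identity, i.e. no switching at all.
transition = np.eye(k_regimes)

start_params = np.concatenate([transition.ravel(), regime_params])
print(start_params)  # [1.  0.  0.  1.  0.8 1.5 0.8 1.5]
```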

Tests

To test the model I used this Gauss code from Kim's site, implementing Lam's Generalized Hamilton model. There is a basic test of the Kim filter in test_kim.py, which simply checks the likelihood and filtered states obtained from my code against the Gauss code output. I am lucky that it passes! There is also a file test_rs_mlemodel.py, which tests MLE against the same code; it passes as well. You can see almost the same code, commented and more readable, in the notebook presenting the regime_switching module usage.
Looking at the output of the notebook, you can see that the fit-non-switching-first start works quite well: the resulting likelihood differs little from the best one, and so do the obtained parameters.
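The checks themselves boil down to comparisons against precomputed reference output, roughly like this (the number is a placeholder, not the actual Gauss output):

```python
from numpy.testing import assert_allclose

gauss_loglike = -178.24   # hypothetical reference value from Kim's Gauss code
python_loglike = -178.24  # hypothetical value from the Python Kim filter

# The two implementations should agree up to numerical precision.
assert_allclose(python_loglike, gauss_loglike, rtol=1e-5)
```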

What's next?

I hope to start implementing specific models soon, to make my module more ready to use. The first model in the row will be the Markov-switching autoregressive model, and it will probably be the hardest one. There is going to be some fun, like implementing the EM algorithm for starting parameter estimation. I can't wait to start! Feel free to leave comments if you don't understand something or have any thoughts about my project. Thanks!

Literature

[1] "State-space Models With Regime Switching" by Chang-Jin Kim and Charles R. Nelson.
