Monday, July 25, 2016

GSoC 2016 #4

Time Varying Parameters

Let's consider the following process:
Here y_t is the observed process, x_t is a vector of exogenous variables, and beta_t are the so-called time-varying parameters, which evolve over time as equation (2) states. e_t and v_t are white noise terms.
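In the notation above, and assuming the coefficients follow a random walk with Gaussian noise (a common specification; the exact form used in the project may differ), equations (1) and (2) can be written as:

```latex
\begin{align}
y_t     &= x_t' \beta_t + e_t,   & e_t &\sim N(0, \sigma_e^2) \tag{1} \\
\beta_t &= \beta_{t-1} + v_t,    & v_t &\sim N(0, Q)          \tag{2}
\end{align}
```

The symbols \sigma_e^2 and Q are names chosen here for the two noise variances.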
The model presented is called the Time-Varying-Parameter (TVP) model, and it was part of my proposal. As you can see, it is non-switching, but it is used to find good starting parameters for likelihood optimization of the switching model.
The TVP and MS-TVP models turned out to be the easiest and most pleasant items of my proposal. Thanks to their simplicity, I had no difficulties implementing and debugging them. During MLE, their parameters also converged nicely to the expected values.

TVP: Implementation and testing

The TVP model was implemented in the upper-level statespace module (tvp.py file), rather than in regime_switching. The implementation is a concise extension of the MLEModel class. I used Kim and Nelson's (1989) application to modelling changing conditional variance, or uncertainty, in U.S. monetary growth ([1], chapter 3.4) as a functional test and for the IPython notebook demonstration.
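To illustrate what such a model computes, here is a minimal sketch of Kalman filtering for a TVP regression with a single regressor, assuming random-walk coefficients and known noise variances. It is pure Python and does not use the project's tvp.py or MLEModel; the function name tvp_filter and all of its parameters are hypothetical.

```python
import math
import random


def tvp_filter(y, x, obs_var, state_var, beta0=0.0, p0=1e6):
    """Kalman filter for y_t = x_t * beta_t + e_t, beta_t = beta_{t-1} + v_t.

    Returns the filtered coefficient estimates and the Gaussian log-likelihood.
    A large p0 acts as an approximately diffuse prior on the first coefficient.
    """
    beta, p = beta0, p0
    filtered, loglike = [], 0.0
    for y_t, x_t in zip(y, x):
        # Prediction: a random walk keeps the mean, the variance grows
        p_pred = p + state_var
        # One-step-ahead forecast error and its variance
        err = y_t - x_t * beta
        f = x_t * p_pred * x_t + obs_var
        loglike += -0.5 * (math.log(2.0 * math.pi * f) + err * err / f)
        # Update: move the coefficient towards the data by the Kalman gain
        gain = p_pred * x_t / f
        beta = beta + gain * err
        p = p_pred - gain * x_t * p_pred
        filtered.append(beta)
    return filtered, loglike


# Simulate a slowly drifting coefficient and recover it with the filter
rng = random.Random(0)
n = 300
x = [rng.uniform(1.0, 2.0) for _ in range(n)]
true_beta, b = [], 1.0
for _ in range(n):
    b += rng.gauss(0.0, 0.05)
    true_beta.append(b)
y = [x_t * b_t + rng.gauss(0.0, 0.1) for x_t, b_t in zip(x, true_beta)]

filtered, loglike = tvp_filter(y, x, obs_var=0.1 ** 2, state_var=0.05 ** 2)
# Average tracking error after a short burn-in
mean_abs_err = sum(
    abs(f - t) for f, t in zip(filtered[50:], true_beta[50:])
) / (n - 50)
```

In an MLE setting, obs_var and state_var would be the parameters to optimize, with loglike as the objective.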
A special feature of TVP is that its MLE results class (TVPResults) has a plot_coefficient method, which draws a nice plot of the time-varying parameters as they change over time:
 

Heteroskedastic disturbances

Adding heteroskedastic disturbances to the observation equation (1) makes the model regime-switching: the variance of e_t now depends on S_t, where S_t is a Markov regime process.
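One way to write this, assuming two regimes and a Gaussian disturbance (the parameterization below is an assumption, in the usual form for this kind of model, and may differ from the one used in the project):

```latex
\begin{align}
e_t &\sim N\left(0, \sigma^2_{S_t}\right), \qquad S_t \in \{0, 1\} \\
\sigma^2_{S_t} &= \sigma_0^2 + \left(\sigma_1^2 - \sigma_0^2\right) S_t \\
P(S_t = j \mid S_{t-1} = i) &= p_{ij}
\end{align}
```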

MS-TVP: Implementation and testing

The TVP model with heteroskedastic disturbances is implemented in the switching_tvp.py file of the regime_switching module. It is as concise and elegant as its non-switching analog. I'm going to implement coefficient plotting soon.
I used Kim's (1993) time-varying-parameter model with heteroskedastic disturbances for U.S. monetary growth uncertainty for functional testing. One nice thing about MS-TVP is that it finds a near-correct likelihood maximum from a non-switching start. As you can see in the tests.test_switching_tvp.TestKim1993_MLEFitNonswitchingFirst class, I use a 0.05% relative tolerance.
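For reference, a 0.05% relative tolerance corresponds to a factor of 5e-4. A minimal sketch of such a check using only the standard library follows; the numbers are made up for illustration and are not results from the test suite.

```python
import math

REL_TOL = 5e-4  # 0.05% relative tolerance

expected_loglike = -100.0  # hypothetical reference value
fitted_loglike = -100.03   # hypothetical value found by the optimizer

# The relative difference is about 0.03%, which is within tolerance
close_enough = math.isclose(fitted_loglike, expected_loglike, rel_tol=REL_TOL)

# A relative difference of about 0.1% is outside the tolerance
too_far = math.isclose(-100.10, expected_loglike, rel_tol=REL_TOL)
```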

What's next?

The remaining part of the summer will be about improving and polishing the existing models. I am now working on adding heteroskedastic disturbances to the transition equation (2). As I noted above, I have to add coefficient plotting for the switching model. Other goals are an MS-TVP notebook demonstration and an overall improvement of the MS-AR model.

Literature

[1] "State-space Models With Regime Switching" by Chang-Jin Kim and Charles R. Nelson.

Sunday, July 10, 2016

GSoC 2016 #3

Improving the code

During May and June I worked hard, producing thousands of lines of code, implementing Markov switching state space logic and tests, and making sure that everything works correctly. By the midterm evaluation I had already implemented the Kim filter, the switching MLE model and Markov switching autoregression, all generally working and passing basic tests.
So this was a good moment to take a break and look more closely at the existing code. Since the primary concern of the project is its usability and maintainability after the summer, detailed documentation, comments covering the hard mathematical calculations, and architectural enhancements are even more important than producing another model.
Here are the items completed so far on the way to perfect code.

Refactoring

Several architectural improvements were made to decompose functionality into logical modules and to match Statsmodels state space idioms. The initial architecture of the regime_switching module wasn't anything sophisticated, just something that worked for a start:

As you can see, the KimFilter class aggregated the entire regime switching state space functionality in one blob of code, which is an obvious candidate for splitting into parts.
Another inconvenient thing about KimFilter was its complex stateful design. To perform filtering, you first had to bind some data to the filter, optionally select how the regime probabilities and the unobserved state are initialized, and then call the filter method; only after that were attributes such as filtered_regime_probs filled with useful data. This is inconvenient because you have to keep track of whether the current state is still relevant yourself.
This is how regime_switching looks after the completed refactoring iteration:



Responsibilities of different kinds are now divided among a larger number of entities:
  • SwitchingRepresentation handles the switching state space model; that is, it aggregates KalmanFilter instances for every regime and stores the regime transition probability matrix. FrozenSwitchingRepresentation is an immutable snapshot of the representation.
  • The KimFilter class is related to filtering, but it neither performs the actual filtering nor stores any filtered data; it only controls the process. The former is handled by the private _KimFilter class, and the latter by KimFilterResults, which is returned from the KimFilter.filter method.
  • Smoothing is organized in a mirrored way, as you can see from the diagram: the KimSmoother, KimSmootherResults and _KimSmoother classes.
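The controller / private worker / results split described in the list can be sketched on a toy example. This is not the project's code: the class names ToyFilter, _ToyFilter and ToyFilterResults are hypothetical, and a simple Hamilton-style regime probability filter for zero-mean Gaussian data with regime-dependent variance stands in for the much more involved Kim filter.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyFilterResults:
    """Immutable container for the output of one filtering run."""
    filtered_regime_probs: Tuple[Tuple[float, ...], ...]
    loglike: float


class _ToyFilter:
    """Private worker: performs the computation for one data set."""

    def __init__(self, transition, variances, data, initial_probs):
        self.transition = transition  # transition[i][j] = P(S_t=j | S_{t-1}=i)
        self.variances = variances
        self.data = data
        self.initial_probs = initial_probs

    def run(self):
        probs = list(self.initial_probs)
        k = len(probs)
        history, loglike = [], 0.0
        for y_t in self.data:
            # Predict regime probabilities one step ahead
            pred = [sum(self.transition[i][j] * probs[i] for i in range(k))
                    for j in range(k)]
            # Weight each regime by the N(0, variance) density of y_t
            joint = [
                pred[j]
                * math.exp(-0.5 * y_t * y_t / self.variances[j])
                / math.sqrt(2.0 * math.pi * self.variances[j])
                for j in range(k)
            ]
            total = sum(joint)
            loglike += math.log(total)
            probs = [w / total for w in joint]
            history.append(tuple(probs))
        return ToyFilterResults(tuple(history), loglike)


class ToyFilter:
    """Public controller: holds the model and returns results, stores no output."""

    def __init__(self, transition, variances):
        self.transition = transition
        self.variances = variances

    def filter(self, data, initial_probs=None):
        k = len(self.variances)
        if initial_probs is None:
            initial_probs = [1.0 / k] * k  # uniform start, an arbitrary choice
        worker = _ToyFilter(self.transition, self.variances, data, initial_probs)
        return worker.run()


# Calm observations followed by volatile ones
data = [0.1, -0.2, 0.05, 3.0, -2.5, 2.8]
toy = ToyFilter(transition=[[0.9, 0.1], [0.1, 0.9]], variances=[0.1, 4.0])
results = toy.filter(data)
```

Because filter returns a fresh, frozen results object on every call, the caller never has to check whether attributes stored on the filter still correspond to the latest data.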
The MLE model wasn't affected by any major changes, except that the private ssm attribute is now a KimSmoother instance rather than a KimFilter.

Docstrings

An iteration of documenting was also done. It covered all the main entities and the testing code.
This process also had educational benefits for me personally, because I often find it hard to express my thoughts and ideas to other people (e.g. my classmates) when it comes to very abstract things like coding or math. So this was good practice. Moreover, documenting helped me make the code clearer and more concise; sometimes it even helped me find bugs.

Comments

When it comes to an optimal implementation of mathematical algorithms with a lot of matrix manipulations, code becomes quite unreadable. This is where inline comments help a lot. I tried to comment almost every logical block inside every method; the densest comments are in the _KimFilter and _KimSmoother classes, which do all the hard computational work.

What's next?

I will continue to improve the existing code. There is some interface functionality to be added and covered by smoke tests. Only after that will I switch back to model implementation (MS-DFM and MS-TVP).