My IT-blog: GSoC 2016 #Final

Prediction and forecasting

The last challenge I faced during GSoC was implementing Kim prediction and forecasting. At first it appeared to be quite difficult, because I had to deal with both mathematical and architectural issues. Here's a short overview of the subject.

Prediction maths

The Kalman filter in Statsmodels has three modes of prediction:

Static prediction. This is the prediction of the next observation based on the current one. This type of prediction is equivalent to the usual Kalman filtering routine.
Dynamic prediction. I still don't get its purpose.
Forecasting. Mathematically speaking, the forecast of an out-of-sample observation is its expectation conditional on the known observations.
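
To make the difference between static prediction and forecasting concrete, here is a minimal sketch in plain Python. It is not the Statsmodels code: the scalar model (x[t+1] = phi*x[t] + w, y[t] = x[t] + v), the function name kalman_predict and all parameter names are assumptions made for illustration only.

```python
def kalman_predict(ys, phi, q, r, x0=0.0, p0=1.0, steps=0):
    """Scalar toy model (an assumption, not the Statsmodels API):
    x[t+1] = phi * x[t] + w,  y[t] = x[t] + v,  Var(w) = q,  Var(v) = r.
    Returns the predicted means and variances of y: one per in-sample
    observation (static prediction) plus `steps` out-of-sample forecasts.
    """
    x, p = x0, p0  # predicted state mean and variance for the first observation
    means, variances = [], []
    for y in ys:
        # Static prediction of y[t], made before y[t] is seen.
        means.append(x)
        variances.append(p + r)
        # Update step: fold in the actual observation.
        gain = p / (p + r)
        x_filtered = x + gain * (y - x)
        p_filtered = (1.0 - gain) * p
        # Predict step: move to the next time point.
        x = phi * x_filtered
        p = phi * phi * p_filtered + q
    for _ in range(steps):
        # Forecasting: there is no observation, so only the predict step runs.
        means.append(x)
        variances.append(p + r)
        x = phi * x
        p = phi * phi * p + q
    return means, variances


means, variances = kalman_predict([1.0, 1.0], phi=0.5, q=1.0, r=1.0, steps=2)
```

The first two entries of means are static predictions, the last two are forecasts. Each forecast mean is the previous one multiplied by phi, which is exactly the conditional expectation for this toy model.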

My goal was to implement two of these types, static prediction and forecasting, for the case of switching regimes, i.e. to construct a Kim prediction and forecasting device.

I didn't have any problems with static prediction, because it is also equivalent to Kim filtering. But forecasting is not covered in [1], so I had to work out the mathematical routine myself: it calculates the expectation and covariance of future data, conditional on the known observations. Basically it's just a Kim filter, but without the Hamilton filtering step and with the underlying Kalman filters working in prediction mode.
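
The description above can be sketched for a scalar state as follows. This is only an illustration under stated assumptions, not the code in kim_filter: the function names kim_forecast_step and observation_forecast, the scalar model and the regime-specific parameters phis and qs are all hypothetical. Each regime pair gets a Kalman predict step, the regime probabilities are propagated by the transition matrix only (no Hamilton update, since there is no new observation), and the results are collapsed back to one mean and variance per regime.

```python
def kim_forecast_step(probs, means, variances, transition, phis, qs):
    """One out-of-sample step for a scalar state with Markov switching.

    probs[i]         : Pr[S_t = i | known data]
    means[i]         : E[x_t | S_t = i, known data]
    variances[i]     : Var[x_t | S_t = i, known data]
    transition[i][j] : Pr[S_{t+1} = j | S_t = i]
    phis[j], qs[j]   : AR coefficient and state noise variance in regime j
    """
    m = len(probs)
    new_probs, new_means, new_vars = [], [], []
    for j in range(m):
        # Joint probability of (S_t = i, S_{t+1} = j); no observation to update on.
        joint = [probs[i] * transition[i][j] for i in range(m)]
        p_j = sum(joint)
        if p_j == 0.0:
            # Regime j is unreachable; its moments do not matter.
            new_probs.append(0.0)
            new_means.append(0.0)
            new_vars.append(0.0)
            continue
        # Kalman predict step for every previous regime i, moving into regime j.
        pred_means = [phis[j] * means[i] for i in range(m)]
        pred_vars = [phis[j] ** 2 * variances[i] + qs[j] for i in range(m)]
        # Collapse the m predictions into one mean and variance for regime j.
        mean_j = sum(w * mu for w, mu in zip(joint, pred_means)) / p_j
        var_j = sum(
            w * (v + (mean_j - mu) ** 2)
            for w, mu, v in zip(joint, pred_means, pred_vars)
        ) / p_j
        new_probs.append(p_j)
        new_means.append(mean_j)
        new_vars.append(var_j)
    return new_probs, new_means, new_vars


def observation_forecast(probs, means, variances, r):
    """Mean and variance of y = x + v (Var(v) = r) as a mixture over regimes."""
    mean = sum(p * mu for p, mu in zip(probs, means))
    var = sum(
        p * (v + r + (mu - mean) ** 2)
        for p, mu, v in zip(probs, means, variances)
    )
    return mean, var


probs, means, variances = kim_forecast_step(
    probs=[0.3, 0.7], means=[1.0, 1.0], variances=[2.0, 2.0],
    transition=[[0.9, 0.1], [0.2, 0.8]], phis=[0.5, 0.5], qs=[1.0, 1.0],
)
y_mean, y_var = observation_forecast(probs, means, variances, r=1.0)
```

In this usage example both regimes share the same parameters, so the result should agree with an ordinary Kalman predict step, which makes it a convenient sanity check. Calling kim_forecast_step repeatedly on its own output gives forecasts further ahead.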

Prediction code

Out of laziness I wanted to write less code and reuse what already existed. The idea was to somehow reuse the Kalman filter prediction, but once I came up with the correct forecast routine I understood that this doesn't seem to be possible. So I had to implement the whole routine myself; it is located in the kim_filter.KimPredictionResults._predict method. Luckily, the prediction code is much more compact than the filter's. I also didn't have to care so much about efficiency, since prediction doesn't take part in likelihood optimization.

Some nice pics

Since no test data is available for forecasting, I used IPython notebooks as a sort of visual testing.

I added pictures of static (one-step-ahead) prediction and forecasting to the Lam's model and MS-AR model notebooks. They look sensible, and my mentor liked them:

(this is for Lam's model)

(and that's for MS-AR)

The forecast variance looks constant because it converges to its stationary value very quickly.
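
A small illustration of why this happens, assuming a scalar stationary AR(1) state with coefficient phi and noise variance q (the function name and the numbers are made up for this example, they are not taken from the notebooks). Without observations the state variance follows p = phi^2 * p + q, and the distance to the fixed point q / (1 - phi^2) shrinks by a factor of phi^2 at every step.

```python
def forecast_state_variances(p0, phi, q, steps):
    """Iterate the predict-only variance recursion p <- phi^2 * p + q."""
    variances = []
    p = p0
    for _ in range(steps):
        p = phi * phi * p + q
        variances.append(p)
    return variances


phi, q = 0.5, 1.0
stationary = q / (1.0 - phi ** 2)  # fixed point of the recursion, valid for |phi| < 1
path = forecast_state_variances(p0=0.1, phi=phi, q=q, steps=20)
gaps = [abs(p - stationary) for p in path]
```

With phi = 0.5 the gap drops by a factor of four per step, so after a handful of forecast steps the variance is visually flat on a plot.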

GSoC: Summary

I'm proud to say that I've completed almost all items of my proposal, except for constructing a generic Kim filter with an arbitrary number r of previous states to look up. This is due to performance problems, which would take more time to solve than GSoC permits. In detail, the implemented pure-Python r=2 case works slowly and needs to be rewritten in Cython.
Anyway, a big advantage of my Kim filter implementation, as mentioned by my mentor, is that it uses logarithms to work with probabilities. As far as I can conclude from testing, this gives a large improvement in precision.
A broad report on what's completed and what's not can be found in the GitHub comments.
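
To show why working with logarithms of probabilities helps, here is a generic sketch of the well-known log-sum-exp trick. It is not taken from the Kim filter code, and the helper names logsumexp and normalize_log_probs are chosen just for this example.

```python
import math


def logsumexp(log_values):
    """Compute log(sum(exp(v))) without leaving log space."""
    peak = max(log_values)
    if peak == -math.inf:
        return -math.inf  # all probabilities are exactly zero
    # Subtracting the largest term keeps every exponent at or below zero.
    return peak + math.log(sum(math.exp(v - peak) for v in log_values))


def normalize_log_probs(log_weights):
    """Turn unnormalized log weights into log probabilities."""
    total = logsumexp(log_weights)
    return [w - total for w in log_weights]


# Two regime weights that are far too small for ordinary floats.
log_weights = [-1000.0, -1001.0]
naive = [math.exp(w) for w in log_weights]  # both underflow to 0.0
log_probs = normalize_log_probs(log_weights)
regime_probs = [math.exp(lp) for lp in log_probs]
```

The naive weights both underflow to zero, so they cannot be normalized at all, while the log-space version still recovers regime probabilities that sum to one.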

GSoC: My impressions

GSoC has surely increased my self-confidence. I've done a lot of nice work, written 10k lines of code (I was expecting much less, to be honest), and met many nice people and students.
I have to admit that GSoC turned out to be easier than I thought. The most difficult and nerve-racking part of GSoC was building a good proposal. I remember that I learned a lot of new material in a very short time - I even had to read books about state space models during vacation, from my smartphone, sitting on a plane or a subway train.
I also started working on my project quite early, at the beginning of May. So I had about 60% of my project completed by the midterm, even though I hadn't started working full time yet, because my school and exams finished only by the end of June.
So I worked hard during July, spending days in the local library, but still, I think I never worked as much as 8 hours a day. Eventually, by the beginning of August I had completed almost everything; the only thing left was prediction and forecasting, discussed earlier in this post.
I had dreamed about taking part in GSoC since I was a sophomore, and I'm glad I finally did it. The GSoC code I produced is definitely the best work I've ever done, but I hope to do more in the future.
Thanks for reading my blog! It was created for GSoC needs only, but I think I will continue writing when I have something interesting to tell.

Literature

[1] - "State-space Models With Regime Switching" by Chang-Jin Kim and Charles R. Nelson.

Thursday, August 18, 2016
