Friday, May 13, 2016

GSoC 2016 #0

About me


Welcome to my blog! Feel free to leave comments. I am a student from Russia, currently in my 3rd year at Ural Federal University, studying Computer Science. I am also studying at the Yandex School of Data Analysis, a free Master's-level program run by Yandex, the Russian search engine and the biggest IT company in the country.
My interests lie in Statistics and Machine Learning, and to a lesser extent in Computer Vision and Natural Language Processing. I really love Mathematics, and I equally love applying it in code.
My hobbies are listening to hip-hop music, working out at the gym, visiting art galleries and reading novels by Dostoyevsky.
This year I am happy to take part in Google Summer of Code with Statsmodels under the Python Software Foundation. It's a great opportunity, and I hope I will meet all the deadlines successfully :)

About my GSoC project

The goal is to implement a Python module for inference in linear state space models with regime switching, i.e. models whose underlying parameters change over time according to a Markov chain. The tool for this, called the Kim Filter, is described in the book "State-space Models With Regime Switching" by Chang-Jin Kim and Charles R. Nelson.
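To make "parameters changing according to a Markov chain" concrete, here is a minimal sketch, not part of the project code, that simulates a two-regime Markov chain and draws observations whose mean depends on the current regime. The function name, the transition probabilities and the regime means are all made up for illustration.

```python
import random


def simulate_switching_mean(n, trans, means, sigma, seed=0):
    """Simulate n observations y_t = means[S_t] + noise, where the regime
    S_t in {0, 1} follows a Markov chain with transition matrix `trans`
    (trans[i][j] = probability of moving from regime i to regime j)."""
    rng = random.Random(seed)
    regime = 0  # assumption: the chain starts in regime 0
    regimes, ys = [], []
    for _ in range(n):
        # Draw the next regime from the row of the current regime.
        regime = 0 if rng.random() < trans[regime][0] else 1
        regimes.append(regime)
        ys.append(rng.gauss(means[regime], sigma))
    return regimes, ys


# Illustrative values only: both regimes are fairly persistent.
trans = [[0.95, 0.05],
         [0.10, 0.90]]
regimes, ys = simulate_switching_mean(200, trans, means=[0.0, 3.0], sigma=1.0)
print(regimes[:20])
```

The inference problem is the reverse of this simulation: given only ys, recover the probabilities of the hidden regimes and the model parameters.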
The Kim Filter includes the Kalman Filter as one of its phases, which makes my work much easier and motivates a pure Python approach, because I can delegate all the heavy lifting to the statsmodels Cython module that performs the Kalman Filter routine.
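To show how the Kalman step sits inside the Kim Filter, here is a rough pure-Python sketch for a deliberately tiny model: a scalar random-walk state observed with noise whose variance depends on the regime. This is my own toy illustration of the approximation described by Kim and Nelson, not the statsmodels implementation; the function name, the argument names, the uniform initial regime probabilities and the default initial state are assumptions.

```python
import math


def kim_filter(ys, trans, q, r, b0=0.0, p0=10.0):
    """Toy Kim filter for  beta_t = beta_{t-1} + v_t,  v_t ~ N(0, q)
                           y_t    = beta_t + e_t,      e_t ~ N(0, r[S_t]).
    trans[i][j] = P(S_t = j | S_{t-1} = i).
    Returns the filtered regime probabilities and the log-likelihood."""
    k = len(r)
    b = [b0] * k            # collapsed state mean for each regime
    p = [p0] * k            # collapsed state variance for each regime
    prob = [1.0 / k] * k    # assumption: uniform initial regime probabilities
    loglik = 0.0
    filtered_probs = []
    for y in ys:
        joint = [[0.0] * k for _ in range(k)]
        b_upd = [[0.0] * k for _ in range(k)]
        p_upd = [[0.0] * k for _ in range(k)]
        for i in range(k):
            # Kalman prediction, conditional on the previous regime i.
            b_pred = b[i]
            p_pred = p[i] + q
            for j in range(k):
                # Kalman update, conditional on the current regime j.
                err = y - b_pred
                f = p_pred + r[j]
                gain = p_pred / f
                b_upd[i][j] = b_pred + gain * err
                p_upd[i][j] = p_pred * (1.0 - gain)
                dens = math.exp(-0.5 * err * err / f) / math.sqrt(2.0 * math.pi * f)
                # Hamilton step: joint weight of the pair (i, j).
                joint[i][j] = dens * trans[i][j] * prob[i]
        total = sum(sum(row) for row in joint)
        loglik += math.log(total)
        new_prob = [sum(joint[i][j] for i in range(k)) / total for j in range(k)]
        for j in range(k):
            # Collapse the k*k posteriors back to k (the Kim approximation).
            w = [joint[i][j] / total / new_prob[j] for i in range(k)]
            b[j] = sum(w[i] * b_upd[i][j] for i in range(k))
            p[j] = sum(w[i] * (p_upd[i][j] + (b[j] - b_upd[i][j]) ** 2)
                       for i in range(k))
        prob = new_prob
        filtered_probs.append(prob)
    return filtered_probs, loglik


# Illustrative data and parameters only.
ys = [0.1, -0.2, 0.05, 2.5, -3.0, 2.8, 0.0, 0.1]
probs, loglik = kim_filter(ys, trans=[[0.9, 0.1], [0.2, 0.8]], q=0.01, r=[0.1, 4.0])
print(probs[-1], loglik)
```

The two inner loops are the Kalman prediction and update; in the real module that part is what gets delegated to the Cython code, while the regime probabilities and the collapse step stay in Python.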
The next step of my project is to implement well-tested econometric models with regime switching, including a Markov switching autoregression, a Dynamic Factor model with regime switching and a Time varying parameter model with Markov-switching heteroscedasticity. You can find the details and exact specifications of the models in my proposal.
For testing, I am going to use the Gauss code examples published on Professor Chang-Jin Kim's website.

Setting up a development environment

To set up the environment, I followed the advice of my mentor Chad Fulton, who helps me a lot with technical, coding, and field-specific issues. This will probably be helpful to anyone who wants to contribute to statsmodels or any other Python library.
I am using Mac OS, so I performed the following steps:
  1. Deleted all versions of Statsmodels in my site-packages directory.
  2. Cloned the Statsmodels master repository to ~/projects/statsmodels (or anywhere else in your case).
  3. Added ~/projects/statsmodels to my PYTHONPATH environment variable (by adding the line export PYTHONPATH=~/projects/statsmodels:$PYTHONPATH at the end of ~/.bash_profile).
Now, any changes made to Python files are available when I restart the Python instance or use the reload(module) command in the Python shell.
If I pull any changes to Cython files, I recompile them with python setup.py build_ext -i in the statsmodels folder.
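A quick way to check that Python really picks up the development checkout rather than some leftover installed copy is to ask where a module would be imported from. The helper below is a small sketch of my own (the name module_origin is made up); it only uses the standard library. Note also that in Python 3 the reload function lives in importlib, i.e. importlib.reload(module).

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module or package would be imported
    from, or None if it cannot be found on the current path."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# With the setup above, this should print a path inside
# ~/projects/statsmodels; it prints None if statsmodels is not found.
print(module_origin("statsmodels"))
```

If the printed path points into site-packages instead, an old installed copy is still shadowing the checkout.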

Running Gauss code

The code examples provided by the authors of "State-space Models With Regime Switching" require a Gauss language interpreter, which is not free software. However, there is Ox Console, free for academic use, which can run Gauss code through OxGauss.
OxGauss doesn't support some Gauss functions by default, though, so you have to load analogous wrappers written in the Ox language. In my case this was the optmum function, widely used in the Kim-Nelson code samples; the Ox Console developers provide the M@ximize package for it. Another problem that took me some time to figure out is that M@ximize is incompatible with Ox Console 7, so I have to use version 6, which works just fine.


What's next?

I will post regular reports about my work throughout the summer. I have already implemented part of my project, so the next report is coming soon. If you are interested, you can already see the code. Next time I will talk about the design of the Kim Filter and the details of its implementation and testing. I'm sure you'll enjoy that!
