Time Series Clustering for Portfolio Management

I completed my PhD thesis in 2018. It applied time series clustering techniques to fixed-income investments (corporate bonds). My hypothesis was that if bonds could be clustered effectively according to how they react to changes in the macro-economic and financial environment, the resulting clusters could serve as the basis of a portfolio management methodology superior to the techniques currently in use (which are largely based on diversification across dimensions such as industry and geography).

In essence, my hypothesis is that different financial investments react differently to changes in the macro-economic or financial environment, which I modeled with the VIX index (a measure of the volatility implied by S&P 500 index options) and the TED Spread (an indicator of the perceived health of the banking system). In different ways, both indices are believed to reflect the level of risk (or volatility or fear) in the global macro-financial system. My belief is that some “clusters” of investments do badly in times of high risk while others do well due to a “flight to quality” effect, and that if a portfolio manager diversifies across these clusters, the resulting portfolio will exhibit superior performance on the traditional portfolio metrics of maximizing return while minimizing variance (risk).
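
To make the idea concrete, here is a minimal sketch in Python. It is not the method developed in the thesis; it is a deliberately simple stand-in in which each bond's sensitivity to changes in the VIX and the TED Spread is estimated by ordinary least squares and the bonds are then grouped by k-means on those sensitivities. The data are synthetic, and the names factor_betas and kmeans, the two "true" sensitivity profiles, and all parameter values are assumptions made only for illustration.

```python
import random

def factor_betas(y, x1, x2):
    """OLS slopes of y on x1 and x2 (with intercept), via the 2x2 normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return ((s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det)

def sq_dist(p, c):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, c))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(pt, centers[j])) for pt in points]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

# Synthetic factor changes (hypothetical stand-ins for VIX and TED Spread changes).
rng = random.Random(42)
T = 250
d_vix = [rng.gauss(0, 1) for _ in range(T)]
d_ted = [rng.gauss(0, 1) for _ in range(T)]

# Assumed profiles: five bonds that lose value when risk rises, and five
# "flight to quality" bonds that gain value.
true_betas = [(-0.8, -0.5)] * 5 + [(0.4, 0.3)] * 5
betas = []
for b1, b2 in true_betas:
    returns = [b1 * v + b2 * t + rng.gauss(0, 0.5) for v, t in zip(d_vix, d_ted)]
    betas.append(factor_betas(returns, d_vix, d_ted))

labels = kmeans(betas, k=2)
print(labels)
```

In this toy setting the two groups of bonds should receive different labels. A portfolio manager following the hypothesis would then spread holdings across the labels rather than across industry or geography.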

Primary outputs of my thesis included the following:

  • A high-level overview/literature survey of the field of time series clustering (Chapter 2 of thesis)
  • An extensive Matlab-based programming environment to implement and evaluate various clustering approaches (Chapters 3 and 4 of thesis)
  • An extension of model-based clustering approaches that allows candidate generative models to be specified using state space and Kalman filtering techniques. This was the primary theoretical contribution of my thesis (Chapter 5.3 of thesis and my 2018 paper published for the AETA conference).
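
As a rough illustration of the general idea behind model-based clustering with state space models (not the specific formulation in Chapter 5.3), the Python sketch below scores a series against two candidate local level models using the Kalman filter's prediction-error log-likelihood and assigns the series to the better-fitting candidate. The local level model, the candidate names, and their variance parameters are assumptions chosen only for this example.

```python
import math
import random

def local_level_loglik(y, q, r, x0=0.0, p0=1e6):
    """Gaussian log-likelihood of y under a local level model via the Kalman filter.

    State:       x_t = x_{t-1} + w_t,  w_t ~ N(0, q)
    Observation: y_t = x_t + v_t,      v_t ~ N(0, r)
    """
    x, p, loglik = x0, p0, 0.0
    for obs in y:
        p = p + q            # predict: state variance grows by q
        f = p + r            # innovation variance
        v = obs - x          # innovation (one-step prediction error)
        loglik += -0.5 * (math.log(2 * math.pi * f) + v * v / f)
        k = p / f            # Kalman gain
        x = x + k * v        # update the state estimate
        p = (1 - k) * p      # update the state variance
    return loglik

def simulate(q, r, n, rng):
    """Draw n observations from the local level model with variances q and r."""
    x, out = 0.0, []
    for _ in range(n):
        x += rng.gauss(0, math.sqrt(q))
        out.append(x + rng.gauss(0, math.sqrt(r)))
    return out

# Hypothetical candidate generative models: (state variance q, observation variance r).
candidates = {"stable_level": (0.01, 1.0), "drifting_level": (1.0, 0.01)}

def assign(y):
    """Return the name of the candidate model with the highest log-likelihood."""
    return max(candidates, key=lambda name: local_level_loglik(y, *candidates[name]))

rng = random.Random(7)
series_a = simulate(0.01, 1.0, 300, rng)   # generated by the "stable_level" model
series_b = simulate(1.0, 0.01, 300, rng)   # generated by the "drifting_level" model
print(assign(series_a), assign(series_b))
```

Each candidate here plays the role of a cluster: series are grouped by the generative model that explains them best. A fuller treatment would also estimate the model parameters from the data rather than fixing them in advance.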

A small team of students is currently working on extending my thesis work through the following tasks:

  • Refresh and improve the data used for evaluating various clustering approaches. The current dataset contains price histories for all US bond issues from approximately 2002 through 2010. I would like to refresh this data and to consider adding further explanatory time series data and further meta-data describing the individual bond issuers and issues.
  • Update my Matlab-based evaluation system to be more general purpose and more robust, perhaps re-coding it in Python or R. When complete, I would like to publish this evaluation system for use by other researchers.
  • Refine and evaluate the new clustering algorithm developed under this thesis. Its initial results appeared promising (Chapter 6 of thesis), and I would like to determine whether it can indeed yield superior results in the composition and management of investment portfolios.

During the summer and fall of 2024, the students largely completed the first task above (refreshing and re-building the database). The objectives for the spring of 2025 are to complete the second task (re-building the evaluation environment in Python) and to begin work on the third task (evaluating the results and improving the clustering algorithm).

Two to four students are needed on this project. A strong background in Python is required in addition to an interest in financial engineering.