Identifying patterns in financial markets: extending the statistical jump model for regime identification

  • Original Research
Annals of Operations Research

Abstract

Regime-driven models are popular for addressing temporal patterns in both financial market performance and underlying stylized factors, wherein a regime describes periods with relatively homogeneous behavior. Recently, statistical jump models have been proposed to learn regimes with high persistence, based on clustering temporal features while explicitly penalizing jumps across regimes. In this article, we extend the jump model by generalizing the discrete hidden state variable into a probability vector over all regimes. This allows us to estimate the probability of being in each regime, providing valuable information for downstream tasks such as regime-aware portfolio models and risk management. Our model’s smooth transition from one regime to another enhances robustness over the original discrete model. We provide a probabilistic interpretation of our continuous model and demonstrate its advantages through simulations and real-world data experiments. The interpretation motivates a novel penalty term, called mode loss, which pushes the probability estimates to the vertices of the probability simplex, thereby improving the model’s ability to identify regimes. We demonstrate through a series of empirical and real-world tests that the approach outperforms traditional regime-switching models. This outperformance is pronounced when the regimes are imbalanced and historical data is limited, both of which are common in financial markets.

Notes

  1. The optimal selection of the number of states K can be addressed in a similar manner.

  2. To address the issue of local optima, we run the algorithm ten times with different starting points generated by the K-means++ algorithm, just as for the discrete jump model.

  3. With an appropriate grid size, as described below, the approximate DP algorithm can reduce the running time by a factor of approximately five. This reduction is crucial because QP problems need to be solved tens to hundreds of times for each jump model solution. Another reason for proposing this DP approach is its adaptability to the mode loss introduced later in Sect. 4.4.

  4. The exact sequence of hidden vectors solved by a QP solver differs very little from the sequence given by the approximate DP algorithm.

  5. We remark that the LP problem (12) is a tight relaxation for fitting the hidden state sequence \({\varvec{S}}\) in the discrete jump model (1) when the model parameters \(\Theta \) are held fixed, a task previously addressed by the DP algorithm 3.

  6. Here, the signal-to-noise ratio measures the separability among clusters, and is calculated as the ratio of the distance between two cluster centers to the volatility (Balakrishnan et al., 2017).
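The multi-start procedure in Note 2 can be sketched for the discrete jump model as follows. This is a minimal illustration, not the authors' implementation: it assumes a scaled squared-Euclidean loss, the objective "sum of losses plus `lam` times the number of jumps", and coordinate descent that alternates a dynamic program for the states with a mean update for the centers. All function names (`sq_loss`, `kmeanspp_init`, `fit_states_dp`, `fit_jump_model`) and the synthetic data are hypothetical.

```python
import numpy as np

def sq_loss(Y, centers):
    """loss[t, k] = 0.5 * ||y_t - theta_k||^2 (assumed loss function)."""
    return 0.5 * ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

def kmeanspp_init(Y, K, rng):
    """K-means++ seeding: each new center is sampled with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [Y[rng.integers(len(Y))]]
    for _ in range(1, K):
        d2 = np.min([((Y - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Y[rng.choice(len(Y), p=d2 / d2.sum())])
    return np.array(centers, dtype=float)

def fit_states_dp(loss, lam):
    """Dynamic program for the state sequence with the centers held fixed."""
    T, K = loss.shape
    jump_cost = lam * (1.0 - np.eye(K))      # cost of moving from state i to j
    value = loss[0].copy()                   # best cost of paths ending in each state
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = value[:, None] + jump_cost    # cand[i, j]: end in i, then move to j
        back[t] = cand.argmin(axis=0)
        value = cand.min(axis=0) + loss[t]
    states = np.empty(T, dtype=int)
    states[-1] = value.argmin()
    for t in range(T - 1, 0, -1):            # backtrack the optimal path
        states[t - 1] = back[t, states[t]]
    return states

def objective(Y, centers, states, lam):
    fit = sq_loss(Y, centers)[np.arange(len(Y)), states].sum()
    return fit + lam * np.count_nonzero(np.diff(states))

def fit_jump_model(Y, K, lam, n_init=10, max_iter=50, seed=0):
    """Run coordinate descent from n_init K-means++ starts; keep the best."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_init):
        centers = kmeanspp_init(Y, K, rng)
        states = None
        for _ in range(max_iter):
            new_states = fit_states_dp(sq_loss(Y, centers), lam)
            if states is not None and np.array_equal(new_states, states):
                break                        # state sequence has stopped changing
            states = new_states
            for k in range(K):               # update centers of non-empty states
                if np.any(states == k):
                    centers[k] = Y[states == k].mean(axis=0)
        obj = objective(Y, centers, states, lam)
        if best is None or obj < best[0]:
            best = (obj, states, centers)
    return best

# Usage on synthetic data: two regimes with means 0 and 3 and unit variance.
rng = np.random.default_rng(1)
truth = np.repeat([0, 1], [200, 100])
Y = (3.0 * truth + rng.standard_normal(300))[:, None]
obj, states, centers = fit_jump_model(Y, K=2, lam=5.0)
```

Keeping the run with the lowest objective is one simple way to act on Note 2; the penalty `lam=5.0` and the data-generating parameters are arbitrary choices for illustration only.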

References

  • Aminikhanghahi, S., & Cook, D. J. (2017). A survey of methods for time series change point detection. Knowledge and Information Systems, 51(2), 339–367.

  • Andersson, S., Rydén, T., & Johansson, R. (2003). Linear optimal prediction and innovations representations of hidden Markov models. Stochastic Processes and their Applications, 108(1), 131–149.

  • Ang, A., & Timmermann, A. (2012). Regime changes and financial markets. Annual Review of Financial Economics, 4(1), 313–337.

  • Arthur, D., & Vassilvitskii, S. (2007). K-means++: The advantages of careful seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA ’07), pp. 1027–1035. Society for Industrial and Applied Mathematics.

  • Attouch, H., Bolte, J., Redont, P., & Soubeyran, A. (2010). Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Lojasiewicz inequality. Mathematics of Operations Research, 35(2), 438–457.

  • Bae, G. I., Kim, W. C., & Mulvey, J. M. (2014). Dynamic asset allocation for varied financial markets under regime switching framework. European Journal of Operational Research, 234(2), 450–458.

  • Balakrishnan, S., Wainwright, M. J., & Yu, B. (2017). Statistical guarantees for the EM algorithm: From population to sample-based analysis. The Annals of Statistics, 45(1), 77–120.

  • Barberis, N., & Thaler, R. (2003). A survey of behavioral finance. In Financial Markets and Asset Pricing, Handbook of the Economics of Finance (Vol. 1, Chapter 18, pp. 1053–1128). Elsevier.

  • Baum, L., Petrie, T., Soules, G., & Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics, 41(1), 164–171.

  • Bazzi, M., Blasques, F., Koopman, S. J., & Lucas, A. (2017). Time-varying transition probabilities for Markov regime switching models. Journal of Time Series Analysis, 38(3), 458–478.

  • Bemporad, A., Breschi, V., Piga, D., & Boyd, S. P. (2018). Fitting jump models. Automatica, 96, 11–21.

  • Bertsekas, D. P. (1999). Nonlinear Programming (2nd ed.). Athena Scientific.

  • Bertsimas, D., & Tsitsiklis, J. N. (1997). Introduction to Linear Optimization. Athena Scientific.

  • Bickel, P. J., Ritov, Y., & Rydén, T. (1998). Asymptotic normality of the maximum-likelihood estimator for general hidden Markov models. The Annals of Statistics, 26(4), 1614–1635.

  • Bolte, J., Sabach, S., & Teboulle, M. (2014). Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Mathematical Programming, 146(1–2), 459–494.

  • Boswijk, H. P., Hommes, C. H., & Manzan, S. (2007). Behavioral heterogeneity in stock prices. Journal of Economic Dynamics and Control, 31(6), 1938–1970.

  • Bradley, A. P. (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition, 30(7), 1145–1159.

  • Brodersen, K. H., Ong, C. S., Stephan, K. E., & Buhmann, J. M. (2010). The balanced accuracy and its posterior distribution. In 2010 20th International Conference on Pattern Recognition, pp. 3121–3124.

  • Bry, G., & Boschan, C. (1971). Cyclical Analysis of Time Series: Selected Procedures and Computer Programs. NBER.

  • Bulla, J., & Bulla, I. (2006). Stylized facts of financial time series and hidden semi-Markov models. Computational Statistics & Data Analysis, 51(4), 2192–2209.

  • Bulla, J. (2011). Hidden Markov models with t components: Increased persistence and other aspects. Quantitative Finance, 11(3), 459–475.

  • Bulla, J., & Berzel, A. (2008). Computational issues in parameter estimation for stationary hidden Markov models. Computational Statistics, 23(1), 1–18.

  • Bulla, J., Mergner, S., Bulla, I., Sesboüé, A., & Chesneau, C. (2011). Markov-switching asset allocation: Do profitable strategies exist? Journal of Asset Management, 12(4), 310–321.

  • Cartea, A., & Jaimungal, S. (2013). Modelling asset prices for algorithmic and high-frequency trading. Applied Mathematical Finance, 20(6), 512–547.

  • Cortese, F., Kolm, P., & Lindström, E. (2023). What drives cryptocurrency returns? A sparse statistical jump model approach. Digital Finance, 5(3), 483–518.

  • Cortese, F. P., Kolm, P. N., & Lindström, E. (2023). Generalized Information Criteria for Sparse Statistical Jump Models. In P. Linde (Ed.), Symposium I Anvendt Statistik. (Vol. 44). Copenhagen: Copenhagen Business School.

  • Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 1–22.

  • Dias, J. G., Vermunt, J. K., & Ramos, S. (2015). Clustering financial time series: New insights from an extended hidden Markov model. European Journal of Operational Research, 243(3), 852–864.

  • Ebbers, J., Heymann, J., Drude, L., Glarner, T., Haeb-Umbach, R., & Raj, B. (2017). Hidden Markov model variational autoencoder for acoustic unit discovery. In InterSpeech, pp. 488–492.

  • Elliott, R. J., Siu, T. K., & Badescu, A. (2010). On mean-variance portfolio selection under a hidden Markovian regime-switching model. Economic Modelling, 27(3), 678–686.

  • Fine, S., Singer, Y., & Tishby, N. (1998). The hierarchical hidden Markov model: Analysis and applications. Machine Learning, 32(1), 41–62.

  • Ghahramani, Z., & Jordan, M. (1995). Factorial hidden Markov models. In D. Touretzky, M. Mozer, & M. Hasselmo (Eds.), Advances in Neural Information Processing Systems. (Vol. 8). MIT Press.

  • Goutte, S., Ismail, A., & Pham, H. (2017). Regime-switching stochastic volatility model: Estimation and calibration to VIX options. Applied Mathematical Finance, 24(1), 38–75.

  • Gray, S. F. (1996). Modeling the conditional distribution of interest rates as a regime-switching process. Journal of Financial Economics, 42(1), 27–62.

  • Gu, J., & Mulvey, J. M. (2021). Factor momentum and regime-switching overlay strategy. The Journal of Financial Data Science, 3(4), 101–129.

  • Hallac, D., Vare, S., Boyd, S., & Leskovec, J. (2017). Toeplitz inverse covariance-based clustering of multivariate time series data. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’17), pp. 215–223. Association for Computing Machinery.

  • Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica, 57(2), 357–384.

  • Hamilton, J. D., & Susmel, R. (1994). Autoregressive conditional heteroskedasticity and changes in regime. Journal of Econometrics, 64(1), 307–333.

  • Hand, D., & Till, R. (2001). A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning, 45(2), 171–186.

  • Hardy, M. R. (2001). A regime-switching model of long-term stock returns. North American Actuarial Journal, 5(2), 41–53.

  • Himberg, J., Korpiaho, K., Mannila, H., Tikanmaki, J., & Toivonen, H. T. (2001). Time series segmentation for context recognition in mobile devices. In Proceedings 2001 IEEE International Conference on Data Mining, pp. 203–210. IEEE.

  • Hsu, D., Kakade, S. M., & Zhang, T. (2012). A spectral algorithm for learning hidden Markov models. Journal of Computer and System Sciences, 78(5), 1460–1480.

  • Kim, S.-J., Koh, K., Boyd, S., & Gorinevsky, D. (2009). \(\ell _1\) trend filtering. SIAM Review, 51(2), 339–360.

  • Kowalski, M. (2009). Sparse regression using mixed norms. Applied and Computational Harmonic Analysis, 27(3), 303–324.

  • Levin, D. A., Peres, Y., & Wilmer, E. L. (2017). Markov Chains and Mixing Times (2nd ed.). American Mathematical Society.

  • Li, X., & Mulvey, J. M. (2023). Optimal portfolio execution in a regime-switching market with non-linear impact costs: Combining dynamic program and neural network. Preprint.

  • Li, X., & Mulvey, J. M. (2021). Portfolio optimization under regime switching and transaction costs: Combining neural networks and dynamic programs. INFORMS Journal on Optimization, 3(4), 398–417.

  • Lin, M. (2023). Essays on Applications of Networks and Discrete Optimization. PhD dissertation, Princeton University.

  • Lloyd, S. P. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2), 129–137.

  • Mulvey, J. M., & Liu, H. (2016). Identifying economic regimes: Reducing downside risks for university endowments and foundations. The Journal of Portfolio Management, 43(1), 100–108.

  • Munkres, J. (2000). Topology (2nd ed.). Pearson.

  • Ng, A., Jordan, M., & Weiss, Y. (2001). On spectral clustering: Analysis and an algorithm. Advances in Neural Information Processing Systems, 14.

  • Nystrup, P., Kolm, P. N., & Lindström, E. (2020). Greedy online classification of persistent market states using realized intraday volatility features. The Journal of Financial Data Science, 2(3), 25–39.

  • Nystrup, P., Kolm, P. N., & Lindström, E. (2021). Feature selection in jump models. Expert Systems with Applications, 184, 115558.

  • Nystrup, P., Lindström, E., & Madsen, H. (2020). Learning hidden Markov models with persistent states by penalizing jumps. Expert Systems with Applications, 150, 113307.

  • Nystrup, P., Madsen, H., & Lindström, E. (2015). Stylised facts of financial time series and hidden Markov models in continuous time. Quantitative Finance, 15(9), 1531–1541.

  • Nystrup, P., Madsen, H., & Lindström, E. (2017). Long memory of financial time series and hidden Markov models with time-varying parameters. Journal of Forecasting, 36(8), 989–1002.

  • Pagan, A. R., & Sossounov, K. A. (2003). A simple framework for analysing bull and bear markets. Journal of Applied Econometrics, 18(1), 23–46.

  • Peyré, G., & Cuturi, M. (2019). Computational optimal transport. Foundations and Trends in Machine Learning, 11(5–6), 355–607.

  • Picard, F., Lebarbier, E., Budinskà, E., & Robin, S. (2011). Joint segmentation of multivariate Gaussian processes using mixed linear models. Computational Statistics & Data Analysis, 55(2), 1160–1170.

  • Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), 257–286.

  • Reus, L., & Mulvey, J. M. (2016). Dynamic allocations for currency futures under switching regimes signals. European Journal of Operational Research, 253(1), 85–93.

  • Rydén, T. (2008). EM versus Markov chain Monte Carlo for estimation of hidden Markov models: a computational perspective. Bayesian Analysis, 3(4), 659–688.

  • Rydén, T., Teräsvirta, T., & Åsbrink, S. (1998). Stylized facts of daily return series and the hidden Markov model. Journal of Applied Econometrics, 13(3), 217–244.

  • Sawhney, A. (2020). Regime identification, curse of dimensionality and deep generative models. Quantitative Brokers: Technical report.

  • Schwert, G. W. (1989). Why does stock market volatility change over time? The Journal of Finance, 44(5), 1115–1153.

  • Shu, Y., Yu, C., & Mulvey, J. M. (2024). Regime-aware asset allocation: A statistical jump model approach. SSRN.

  • Stock, J. H., & Watson, M. W. (1996). Evidence on structural instability in macroeconomic time series relations. Journal of Business & Economic Statistics, 14(1), 11–30.

  • Uysal, A. S., & Mulvey, J. M. (2021). A machine learning approach in regime-switching risk parity portfolios. The Journal of Financial Data Science, 3(2), 87–108.

  • Viterbi, A. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 13(2), 260–269.

  • Witten, D. M., & Tibshirani, R. (2010). A framework for feature selection in clustering. Journal of the American Statistical Association, 105(490), 713–726.

  • Wright, S. J. (2015). Coordinate descent algorithms. Mathematical Programming, 151(1), 3–34.

  • Yang, F., Balakrishnan, S., & Wainwright, M. J. (2017). Statistical and computational guarantees for the Baum-Welch algorithm. The Journal of Machine Learning Research, 18(1), 4528–4580.

  • Zhang, M., Jiang, X., Fang, Z., Zeng, Y., & Xu, K. (2019). High-order hidden Markov model for trend prediction in financial time series. Physica A: Statistical Mechanics and its Applications, 517, 1–12.

  • Zheng, K., Li, Y., & Xu, W. (2021). Regime switching model estimation: Spectral clustering hidden Markov model. Annals of Operations Research, 303, 297–319.

Author information

Corresponding author

Correspondence to Yizhan Shu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is dedicated to Professor Bill Ziemba, who spent his long career researching and discovering systematic approaches to improve investment performance by strategies and approaches based on conditional “inefficient” pricing behavior. His research covers examples in a wide range of markets and instruments, from sporting and gambling events to traditional financial markets, and even less conventional markets such as wine and Turkish rugs. These patterns are sometimes identified as anomalies (arbitrage), obtaining an edge (such as card counting in blackjack) or systematic risk-factors (traditional asset classes). With the rise of micro-level data and new data science methods, the research Bill initiated has continued to grow, marking him as a pioneer explorer.

We emphasize that the statistical jump models we study in this article are not related to jump-diffusion models, a common class of stochastic processes.

A Appendix: Proofs

Proof of Proposition 1

From the probabilistic assumptions, the joint likelihood is

$$\begin{aligned} p({\varvec{Y}},{\varvec{S}}|\Theta )&=p({\varvec{Y}}|{\varvec{S}}, \Theta )p({\varvec{S}}) \end{aligned}$$
(33)
$$\begin{aligned}&=\prod _{t=0}^{T-1}p({\varvec{y}}_t|\Theta , {\varvec{s}}_t)\times \pi ({\varvec{s}}_0)\times \prod _{t=1}^{T-1}K\left( {\varvec{s}}_{t-1}, {\varvec{s}}_t\right) \,. \end{aligned}$$
(34)

Thus

$$\begin{aligned} \log p({\varvec{Y}},{\varvec{S}}|\Theta )=\sum _{t=0}^{T-1}\log p({\varvec{y}}_t|\Theta , {\varvec{s}}_t)+\log \pi ({\varvec{s}}_0)+\sum _{t=1}^{T-1}\log K\left( {\varvec{s}}_{t-1}, {\varvec{s}}_t\right) \,. \end{aligned}$$
(35)

For the first term, since \({\varvec{s}}_t\) is a probability vector and the logarithm is concave, Jensen’s inequality gives

$$\begin{aligned} \log p({\varvec{y}}_t|\Theta , {\varvec{s}}_t)&=\log \left( \sum _{k=0}^{K-1}s_{tk}p({\varvec{y}}_t|{\varvec{\theta }}_k)\right) \end{aligned}$$
(36)
$$\begin{aligned}&\ge \sum _{k=0}^{K-1}s_{tk}\log p({\varvec{y}}_t|{\varvec{\theta }}_k) \,. \end{aligned}$$
(37)

Inserting (37) into (35), we obtain

$$\begin{aligned} \log p({\varvec{Y}},{\varvec{S}}|\Theta ) \ge&\sum _{t, k}s_{tk}\log p({\varvec{y}}_t|{\varvec{\theta }}_k) +\log \pi ({\varvec{s}}_0)+\sum _{t=1}^{T-1}\log K\left( {\varvec{s}}_{t-1}, {\varvec{s}}_t\right) \,. \end{aligned}$$
(38)

From equation (17), it follows that the right-hand side above is precisely the negation of the objective function J. Therefore, minimizing J is equivalent to maximizing a lower bound on the joint log-likelihood function. \(\square \)
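The Jensen step (36)–(37) can be checked numerically. The sketch below draws random probability vectors and positive density values and confirms that the gap between the two sides is non-negative, and that it vanishes when the probability vector sits at a vertex of the simplex. The function name `jensen_gap` and the sampling ranges are illustrative assumptions, not part of the article.

```python
import numpy as np

def jensen_gap(s, p):
    """log(sum_k s_k p_k) - sum_k s_k log p_k for a probability vector s
    and strictly positive density values p; non-negative by Jensen."""
    return np.log(s @ p) - s @ np.log(p)

rng = np.random.default_rng(0)
K = 3
# Random points in the simplex paired with random positive density values.
gaps = np.array([
    jensen_gap(rng.dirichlet(np.ones(K)), rng.uniform(1e-6, 5.0, size=K))
    for _ in range(1000)
])

# At a vertex of the simplex (a hard regime assignment) the bound is tight.
vertex_gap = jensen_gap(np.array([0.0, 1.0, 0.0]), np.array([0.2, 1.5, 3.0]))
```

The tightness at the vertices means that, for hard assignments, the lower bound in (38) coincides with the joint log-likelihood itself.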

Proof of Proposition 2

From the previous proof, we know that

$$\begin{aligned} \log p({\varvec{Y}},{\varvec{S}}|\Theta )&=\sum _{t=0}^{T-1}\log p({\varvec{y}}_t|\Theta , {\varvec{s}}_t)+\log p({\varvec{S}}) \end{aligned}$$
(39)
$$\begin{aligned}&\ge \sum _{t,k}s_{tk}\log p({\varvec{y}}_t|{\varvec{\theta }}_k)+\log p({\varvec{S}}) \end{aligned}$$
(40)

Inserting (18) into the right-hand side of (40), we obtain

$$\begin{aligned} \sum _{t, k}s_{tk}\left( -l({\varvec{y}}_t,{\varvec{\theta }}_k)-\log \nu \right) -{{\mathcal {L}}}({\varvec{S}})-\log \eta =&-\left( \sum _{t, k}s_{tk}l({\varvec{y}}_t,{\varvec{\theta }}_k) + {{\mathcal {L}}}({\varvec{S}})\right) +\text {const} \end{aligned}$$
(41)
$$\begin{aligned} =&-J+\text {const} \,. \end{aligned}$$
(42)

Thus, minimizing J is equivalent to maximizing a lower bound on the joint log-likelihood \(\log p({\varvec{Y}},{\varvec{S}}|\Theta )\). \(\square \)
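The constant in (41) can be written out explicitly. As a sketch, assume that (18) takes the form \(p({\varvec{y}}_t|{\varvec{\theta }}_k)=e^{-l({\varvec{y}}_t,{\varvec{\theta }}_k)}/\nu \) and \(p({\varvec{S}})=e^{-{{\mathcal {L}}}({\varvec{S}})}/\eta \), with \(\nu \) and \(\eta \) independent of \({\varvec{S}}\) and \(\Theta \). This form is suggested by the left-hand side of (41) but is not restated in this appendix. Because each \({\varvec{s}}_t\) sums to one,

$$\begin{aligned} \sum _{t, k}s_{tk}\left( -l({\varvec{y}}_t,{\varvec{\theta }}_k)-\log \nu \right) -{{\mathcal {L}}}({\varvec{S}})-\log \eta&=-\sum _{t, k}s_{tk}l({\varvec{y}}_t,{\varvec{\theta }}_k)-\log \nu \sum _{t=0}^{T-1}\sum _{k=0}^{K-1}s_{tk}-{{\mathcal {L}}}({\varvec{S}})-\log \eta \\&=-J-T\log \nu -\log \eta \,, \end{aligned}$$

so under these assumptions the constant equals \(-T\log \nu -\log \eta \).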

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aydınhan, A.O., Kolm, P.N., Mulvey, J.M. et al. Identifying patterns in financial markets: extending the statistical jump model for regime identification. Ann Oper Res (2024). https://doi.org/10.1007/s10479-024-06035-z

  • DOI: https://doi.org/10.1007/s10479-024-06035-z
