Risk measures-based cluster methods for finance

  • Original Article
  • Risk Management

Abstract

This paper performs an extensive comparison of cluster techniques for financial applications, using risk measures and returns as classification variables. We consider the cluster techniques and risk measures most widely used in the literature. For the analysis, we use a database of daily returns from the U.S. equity market. The financial applications we consider are capital determination, portfolio optimization, and asset pricing. We find that the number of clusters varies over the years. The years with the fewest clusters coincide with periods of instability, such as 2008 (Subprime Crisis) and 2015 (slowdown in United States domestic product). Overall, our data support the superiority of the Fanny and MC approaches. By construction, both techniques are more robust to differences in the probability distribution of the data, which is typical of financial data. Furthermore, our results highlight the practical utility of considering risk measures and returns as classification variables in financial applications.
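
To make the idea of using risk measures and returns as classification variables concrete, the following minimal Python sketch builds one feature vector per asset and groups the assets with a plain K-means routine. It is not the procedure used in the paper: the synthetic return series, the choice of three features (mean return, standard deviation, and historical VaR at \(\alpha = 0.01\)), the helper names `features`, `standardize`, and `kmeans`, and the fixed initial centers are all illustrative assumptions.

```python
import math
import random
import statistics

def features(returns, alpha=0.01):
    """Mean return, standard deviation, and historical VaR of one asset."""
    ordered = sorted(returns)
    k = max(0, math.ceil(alpha * len(ordered)) - 1)
    return [statistics.fmean(returns), statistics.pstdev(returns), -ordered[k]]

def standardize(rows):
    """Rescale every feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def kmeans(points, centers, n_iter=50):
    """Lloyd's algorithm from the given initial centers; returns cluster labels."""
    labels = []
    for _ in range(n_iter):
        labels = [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        updated = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the previous center if a cluster loses all its members.
            updated.append([statistics.fmean(c) for c in zip(*members)] if members else old)
        if updated == centers:
            break
        centers = updated
    return labels

# Synthetic data (assumption): one centered noise pattern, rescaled per asset.
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(500)]
center = statistics.fmean(noise)
noise = [z - center for z in noise]
drifts = [0.0002, 0.0003, 0.0004, 0.0008, 0.0009, 0.0010]
scales = [0.005, 0.006, 0.007, 0.028, 0.030, 0.032]
assets = [[d + s * z for z in noise] for d, s in zip(drifts, scales)]

X = standardize([features(r) for r in assets])
labels = kmeans(X, centers=[X[0], X[-1]])
print(labels)
```

In this toy setting the three low-volatility assets and the three high-volatility assets end up in separate clusters. The features are standardized first so that no single variable dominates the Euclidean distance; other risk measures or clustering techniques could be swapped in without changing the overall structure.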

Notes

  1. A coherent risk measure, as in Artzner et al. (1999), fulfills the Monotonicity, Translation Invariance, Subadditivity, and Positive Homogeneity axioms.

  2. A generalized deviation measure, in the sense of Rockafellar et al. (2006), fulfills the following properties: Translation Insensitivity, Positive Homogeneity, Subadditivity, and Nonnegativity.

  3. Limitedness indicates that the risk of a position is not greater than the maximum loss. See Righi (2019).

  4. A functional is called elicitable when it is the minimizer of the expectation of some score function. See Ziegel (2016) and Acerbi and Szekely (2017).

  5. For similarity measures, the interpretation is different. The higher the observed value, the more similar the observations.

  6. In addition to ML, a Bayesian approach can be considered. For the sake of brevity, we do not address this approach. Details can be found in Fraley and Raftery (2007).

  7. We downloaded data from the Trading Economics website (https://tradingeconomics.com/). We consider the American market due to its representativeness in the international scenario. In addition, we select this market because other studies that explore the use of clustering to classify financial data also use it. See, for example, Bjerring et al. (2017), Puerto et al. (2020), and Tayalı (2020).

  8. We include the most traded stocks to avoid convergence issues in our estimation. Furthermore, despite our restriction, we consider more than 67% of our sample each year. We do not consider transaction costs and bid-ask spread because our intention is not to compare portfolio formation strategies. Our aim in this paper is to illustrate the use of risk and return as classification variables.

  9. We use daily data for risk prediction because it is the frequency commonly used in risk prediction studies. See, for example, Kuester et al. (2006) and Müller et al. (2022).

  10. As the 2005 period was used to make the 2006 risk forecasts, our final dataset contains information from January 2006 to December 2017.

  11. Illustrations of the optimal number of clusters for K-means, PAM, C-Means, and Fanny clustering are available upon request.

  12. We employ an elicitable risk measure to assess the accuracy of the risk estimates utilizing a consistent loss function.

  13. We do not obtain risk forecasts for the out-of-sample period because each year, we consider a different set of assets to obtain cluster returns.

  14. We use \(\alpha = 0.01\) because it is recommended by the Basel Committee on Banking Supervision (2013).

  15. We opted for the mean-variance because it is usually used in portfolio optimization problems that use cluster returns as the dataset. See, for example, Tola et al. (2008) and Chen and Huang (2009).

  16. We use the deviation measure because it represents the Markowitz classical approach (Markowitz 1952).

  17. For the Sharpe ratio, we use Standard Deviation as a risk measure.

  18. We consider the Fama-French three-factor model because empirical evidence indicates that including the size and book-to-market (BM) effects improves the asset pricing ability of the Capital Asset Pricing Model (CAPM). See, for example, Fama and French (1996) and Gaunt (2004). Lawrence et al. (2007) empirically test and compare the performance of the traditional CAPM, the three-moment CAPM, and the Fama-French three-factor model. Their results show that the three-factor model outperforms the other models. Regarding the Fama-French five-factor model, Jiao and Lilti (2017) find that, on the Chinese stock market, this model does not capture more variation in expected stock returns than the three-factor model.

  19. We use OLS regression because it is a common method for time-series regression estimation in pricing. See, for instance, Bali et al. (2014) and Atilgan et al. (2019).

  20. Illustrations of portfolio returns for other years are available on request.

  21. For PAM, in 2006, the average market beta was close to 1.
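
As an illustration of notes 4, 12, and 14, the minimal Python sketch below checks numerically that a quantile is the minimizer of an average score. It assumes simulated Gaussian daily returns, the standard quantile (pinball) score, and a sign convention in which VaR is the negative of the \(\alpha\)-quantile of returns; the function names are hypothetical, and none of this is taken from the paper's own implementation.

```python
import math
import random

def quantile_score(q, y, alpha):
    """Quantile (pinball) score of candidate quantile q for a realized return y."""
    indicator = 1.0 if y <= q else 0.0
    return (indicator - alpha) * (q - y)

def mean_score(q, sample, alpha):
    """Sample average of the score, standing in for its expectation."""
    return sum(quantile_score(q, y, alpha) for y in sample) / len(sample)

def empirical_quantile(sample, alpha):
    """Smallest observation whose empirical CDF value is at least alpha."""
    ordered = sorted(sample)
    return ordered[max(0, math.ceil(alpha * len(ordered)) - 1)]

random.seed(0)
alpha = 0.01  # significance level from note 14
returns = [random.gauss(0.0, 0.01) for _ in range(1050)]  # simulated returns (assumption)

q_hat = empirical_quantile(returns, alpha)
var_hat = -q_hat  # VaR reported as a positive loss (sign convention assumed)

# Search the observed returns for the candidate with the lowest average score.
best = min(returns, key=lambda q: mean_score(q, returns, alpha))
print(best == q_hat)
```

The candidate with the lowest average score coincides with the empirical \(\alpha\)-quantile, which is the property that allows competing forecasts of an elicitable risk measure to be ranked with a consistent loss function.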

References

  • Acerbi, C. 2002. Spectral measures of risk: A coherent representation of subjective risk aversion. Journal of Banking & Finance 26 (7): 1505–1518.

  • Acerbi, C., and B. Szekely. 2017. General properties of backtestable statistics. Working Paper.

  • Aït-Sahalia, Y., and D. Xiu. 2016. Increased correlation among asset classes: Are volatility or jumps to blame, or both? Journal of Econometrics 194 (2): 205–219.

  • Artzner, P., F. Delbaen, J.M. Eber, and D. Heath. 1999. Coherent measures of risk. Mathematical Finance 9 (3): 203–228.

  • Atilgan, Y., T.G. Bali, K.O. Demirtas, and A.D. Gunaydin. 2019. Global downside risk and equity returns. Journal of International Money and Finance 98: 102065.

  • Atilgan, Y., T.G. Bali, K.O. Demirtas, and A.D. Gunaydin. 2020. Left-tail momentum: Underreaction to bad news, costly arbitrage and equity returns. Journal of Financial Economics 135 (3): 725–753.

  • Bali, T.G., N. Cakici, and R.F. Whitelaw. 2014. Hybrid tail risk and expected stock returns: When does the tail wag the dog? The Review of Asset Pricing Studies 4 (2): 206–246.

  • Bark, H.-K.K. 1991. Risk, return, and equilibrium in the emerging markets: Evidence from the Korean stock market. Journal of Economics and Business 43 (4): 353–362.

  • Basel Committee on Banking Supervision. 2013. Fundamental review of the trading book: A revised market risk framework. Consultative Document, October.

  • Bellini, F., and E. Di Bernardino. 2017. Risk management with expectiles. The European Journal of Finance 23 (6): 487–506.

  • Bellman, R., R. Kalaba, and L. Zadeh. 1966. Abstraction and pattern classification. Journal of Mathematical Analysis and Applications 13 (1): 1–7.

  • BenMim, I., and A. BenSaïda. 2019. Financial contagion across major stock markets: A study during crisis episodes. The North American Journal of Economics and Finance 48: 187–201.

  • Bezdek, J.C. 2013. Pattern recognition with fuzzy objective function algorithms. New York: Springer.

  • Binder, D.A. 1978. Bayesian cluster analysis. Biometrika 65 (1): 31–38.

  • Bjerring, T.T., O. Ross, and A. Weissensteiner. 2017. Feature selection for portfolio optimization. Annals of Operations Research 256 (1): 21–40.

  • Blume, M.E. 1970. Portfolio theory: A step toward its practical application. The Journal of Business 43 (2): 152–173.

  • Blume, M.E., and I. Friend. 1973. A new look at the capital asset pricing model. The Journal of Finance 28 (1): 19–33.

  • Charrad, M., N. Ghazzali, V. Boiteau, and A. Niknafs. 2015. Determining the best number of clusters in a data set. R Packages. http://cran.rediris.es/web/packages/NbClust/NbClust.pdf

  • Chen, B., J. Zhong, and Y. Chen. 2020. A hybrid approach for portfolio selection with higher-order moments: Empirical evidence from Shanghai Stock Exchange. Expert Systems with Applications 145: 113104.

  • Chen, L.-H., and L. Huang. 2009. Portfolio optimization of equity mutual funds with fuzzy return rates and risks. Expert Systems with Applications 36 (2): 3720–3727.

  • Cheong, D., Y.M. Kim, H.W. Byun, K.J. Oh, and T.Y. Kim. 2017. Using genetic algorithm to support clustering-based portfolio optimization by investor information. Applied Soft Computing 61: 593–602.

  • Cont, R. 2001. Empirical properties of asset returns: Stylized facts and statistical issues. Quantitative Finance 1 (2): 223–236.

  • Dempster, A.P., N.M. Laird, and D.B. Rubin. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological) 39 (1): 1–22.

  • Diaz, A., G. Garcia-Donato, and A. Mora-Valencia. 2017. Risk quantification in turmoil markets. Risk Management 19 (3): 202–224.

  • Dunn, J.C. 1973. A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. Journal of Cybernetics 3 (3): 32–57.

  • Fama, E.F., and K.R. French. 1992. The cross-section of expected stock returns. The Journal of Finance 47 (2): 427–465.

  • Fama, E.F., and K.R. French. 1993. Common risk factors in the returns on stocks and bonds. Journal of Financial Economics 33: 3–56.

  • Fama, E.F., and K.R. French. 1996. Multifactor explanations of asset pricing anomalies. The Journal of Finance 51 (1): 55–84.

  • Fischer, T. 2003. Risk capital allocation by coherent risk measures based on one-sided moments. Insurance: Mathematics and Economics 32: 135–146.

  • Föllmer, H., and A. Schied. 2002. Convex measures of risk and trading constraints. Finance and Stochastics 6: 429–447.

  • Fraley, C., and A.E. Raftery. 2002. Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97 (458): 611–631.

  • Fraley, C., and A.E. Raftery. 2007. Bayesian regularization for normal mixture estimation and model-based clustering. Journal of Classification 24 (2): 155–181.

  • French, K. 2020. Data library. http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

  • Gaunt, C. 2004. Size and book to market effects and the Fama French three factor asset pricing model: Evidence from the Australian stockmarket. Accounting & Finance 44 (1): 27–44.

  • Gneiting, T. 2011. Making and evaluating point forecasts. Journal of the American Statistical Association 106 (494): 746–762.

  • Halliwell, J., R. Heaney, J. Sawicki, et al. 1999. Size and book to market effects in Australian share markets: A time series analysis. Accounting Research Journal 12: 122–137.

  • Heaton, J.B., N.G. Polson, and J.H. Witte. 2017. Deep learning for finance: Deep portfolios. Applied Stochastic Models in Business and Industry 33 (1): 3–12.

  • Hung, M.-C., and D.-L. Yang. 2001. An efficient fuzzy c-means clustering algorithm. In: Proceedings 2001 IEEE International Conference on Data Mining. IEEE, pp. 225–232.

  • Iorio, C., G. Frasso, A. D’Ambrosio, and R. Siciliano. 2018. A P-spline based clustering approach for portfolio selection. Expert Systems with Applications 95: 88–103.

  • Jensen, M.C. 1968. The performance of mutual funds in the period 1945–1964. The Journal of Finance 23 (2): 389–416.

  • Jiao, W., and J.-J. Lilti. 2017. Whether profitability and investment factors have additional explanatory power comparing with Fama-French three-factor model: Empirical evidence on Chinese A-share stock market. China Finance and Economic Review 5 (1): 7.

  • Kaufman, L., and P.J. Rousseeuw. 2009. Finding groups in data: An introduction to cluster analysis, vol. 344. Hoboken: Wiley.

  • Kritzman, M. 1993. What practitioners need to know... about factor methods. Financial Analysts Journal 49 (1): 12–15.

  • Kuester, K., S. Mittnik, and M.S. Paolella. 2006. Value-at-risk prediction: A comparison of alternative strategies. Journal of Financial Econometrics 4 (1): 53–89.

  • Lau, J.W., and P.J. Green. 2007. Bayesian model-based clustering procedures. Journal of Computational and Graphical Statistics 16 (3): 526–558.

  • Lawrence, E.R., J. Geppert, and A.J. Prakash. 2007. Asset pricing models: A comparison. Applied Financial Economics 17 (11): 933–940.

  • León, D., A. Aragón, J. Sandoval, G. Hernández, A. Arévalo, and J. Niño. 2017. Clustering algorithms for risk-adjusted portfolio construction. In International Conference on Computational Science, ICCS, 1334–1343.

  • Lisi, F., and M. Corazza. 2008. Clustering financial data for mutual fund management. In Mathematical and Statistical Methods in Insurance and Finance, 157–164. Cham: Springer.

  • MacQueen, J. 1967. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, 281–297. Oakland, CA, USA.

  • Maechler, M., P. Rousseeuw, A. Struyf, M. Hubert, and K. Hornik. 2020. cluster: Cluster Analysis Basics and Extensions. R package version 2.1.0. For new features, see the 'Changelog' file (in the package source).

  • Mahdi, I.B.S., and M.B. Abbes. 2018. Relationship between capital, risk and liquidity: A comparative study between Islamic and conventional banks in MENA region. Research in International Business and Finance 45: 588–596.

  • Majumdar, S., and A.K. Laha. 2020. Clustering and classification of time series using topological data analysis with applications to finance. Expert Systems with Applications 162: 113868.

  • Markowitz, H. 1952. Portfolio selection. Journal of Finance 7: 77–91.

  • McLachlan, G.J., S.X. Lee, and S.I. Rathnayake. 2019. Finite mixture models. Annual Review of Statistics and its Application 6: 355–378.

  • Meyer, D., E. Dimitriadou, K. Hornik, A. Weingessel, F. Leisch, C.-C. Chang, C.-C. Lin, and M.D. Meyer. 2020. Package 'e1071'. The R Journal.

  • Müller, F., and M. Righi. 2020. Model risk measures: A review and new proposals on risk forecasting. Working Paper. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3489917

  • Müller, F.M., S.S. Santos, T.W. Gössling, and M.B. Righi. 2022. Comparison of risk forecasts for cryptocurrencies: A focus on Range Value at Risk. Finance Research Letters 48: 102916.

  • Nanda, S., B. Mahanty, and M. Tiwari. 2010. Clustering Indian stock market data for portfolio management. Expert Systems with Applications 37 (12): 8793–8798.

  • Newey, W.K., and K.D. West. 1987. A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55 (3): 703–708.

  • Ogryczak, W., and A. Ruszczyński. 1999. From stochastic dominance to mean-risk models: Semideviations as risk measures. European Journal of Operational Research 116 (1): 33–50.

  • Pai, G.V., and T. Michel. 2009. Evolutionary optimization of constrained \(k\)-means clustered assets for diversification in small portfolios. IEEE Transactions on Evolutionary Computation 13 (5): 1030–1053.

  • Pérignon, C., and D.R. Smith. 2010. The level and quality of value-at-risk disclosure by commercial banks. Journal of Banking & Finance 34 (2): 362–377.

  • Pradhan, R.P., M.B. Arvin, S. Bahmani, J.H. Hall, and N.R. Norman. 2017. Finance and growth: Evidence from the ARF countries. The Quarterly Review of Economics and Finance 66: 136–148.

  • Puerto, J., M. Rodríguez-Madrena, and A. Scozzari. 2020. Clustering and portfolio selection problems: A unified framework. Computers & Operations Research 117: 104891.

  • Reynolds, A.P., G. Richards, B. de la Iglesia, and V.J. Rayward-Smith. 2006. Clustering rules: A comparison of partitioning and hierarchical clustering algorithms. Journal of Mathematical Modelling and Algorithms 5 (4): 475–504.

  • Righi, M.B. 2019. A composition between risk and deviation measures. Annals of Operations Research 282 (1–2): 299–313.

  • Righi, M.B., and D. Borenstein. 2018. A simulation comparison of risk measures for portfolio optimization. Finance Research Letters 24: 105–112.

  • Righi, M.B., and P. Ceretta. 2016. Shortfall deviation risk: An alternative to risk measurement. Journal of Risk 19 (2): 81–116.

  • Righi, M.B., and P.S. Ceretta. 2015. A comparison of expected shortfall estimation models. Journal of Economics and Business 78: 14–47.

  • Righi, M.B., F.M. Müller, and M.R. Moresco. 2020. On a robust risk measurement approach for capital determination errors minimization. Insurance: Mathematics and Economics 95: 199–211.

  • Rockafellar, R., S. Uryasev, and M. Zabarankin. 2006. Generalized deviations in risk analysis. Finance and Stochastics 10: 51–74.

  • Ruspini, E.H. 1969. A new approach to clustering. Information and Control 15 (1): 22–32.

  • Schwarz, G. 1978. Estimating the dimension of a model. Annals of Statistics 6 (2): 461–464.

  • Scrucca, L., M. Fop, T.B. Murphy, and A.E. Raftery. 2016. mclust 5: Clustering, classification and density estimation using Gaussian finite mixture models. The R Journal 8 (1): 289–317.

  • Shawky, H.A., R. Kuenzel, and A.D. Mikhail. 1997. International portfolio diversification: A synthesis and an update. Journal of International Financial Markets, Institutions and Money 7 (4): 303–327.

  • Tayalı, S.T. 2020. A novel backtesting methodology for clustering in mean-variance portfolio optimization. Knowledge-Based Systems 209: 106454.

  • Tibshirani, R., G. Walther, and T. Hastie. 2001. Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63 (2): 411–423.

  • Tola, V., F. Lillo, M. Gallegati, and R.N. Mantegna. 2008. Cluster analysis for portfolio optimization. Journal of Economic Dynamics and Control 32 (1): 235–258.

  • Ziegel, J. 2016. Coherence and elicitability. Mathematical Finance 26 (4): 901–918.

Funding

We are grateful for the financial support of CNPq (Brazilian Research Council) projects number 302369/2018-0 and 407556/2018-4.

Author information

Corresponding author

Correspondence to Fernanda Maria Müller.

Ethics declarations

Conflict of interest

We confirm there are no interests to declare regarding this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Tables 13, 14, 15 and 16.

Table 13 P value of the left one-tailed Mann–Whitney test
Table 14 P value of the Mann–Whitney test applied under the performance indexes of portfolio optimization
Table 15 The right one-tailed Mann–Whitney test applied to the adjusted \(R^2\) of the Fama-French three-factor model used to explain the excess return of the following clustering techniques: K-means, PAM, C-means, Fanny, and MC
Table 16 P value of the right one-tailed Mann–Whitney test

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Guedes, P.C., Müller, F.M. & Righi, M.B. Risk measures-based cluster methods for finance. Risk Manag 25, 4 (2023). https://doi.org/10.1057/s41283-022-00110-0

  • DOI: https://doi.org/10.1057/s41283-022-00110-0
