Abstract
This paper presents an extensive comparison of clustering techniques for financial applications, using risk measures and returns as classification variables. We consider the clustering techniques and risk measures most widely used in the literature. For the analysis, we use a database of daily returns from the U.S. equity market. As financial applications, we consider capital determination, portfolio optimization, and asset pricing. We find that the number of clusters varies over the years. The years with the fewest clusters coincide with periods of instability, such as 2008 (Subprime Crisis) and 2015 (slowdown in United States domestic product). Overall, our data support the superiority of the Fanny and MC approaches. By construction, both techniques are more robust to the distinct probability distributions of the data, a typical feature of financial data. Furthermore, our results highlight the practical utility of considering risk measures and returns as classification variables in financial applications.
Notes
A coherent risk measure, as in Artzner et al. (1999), fulfills the Monotonicity, Translation Invariance, Subadditivity, and Positive Homogeneity axioms.
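For reference, the four axioms can be written out as follows. This is a sketch under an assumed sign convention that the note does not state: \(X\) and \(Y\) are random results of positions (gains positive), \(\rho\) is the risk measure, \(c\) is a real constant, and \(\lambda\) is a nonnegative scalar.

\[
\begin{aligned}
&\text{Monotonicity:} && X \le Y \;\Rightarrow\; \rho(X) \ge \rho(Y),\\
&\text{Translation Invariance:} && \rho(X + c) = \rho(X) - c,\\
&\text{Subadditivity:} && \rho(X + Y) \le \rho(X) + \rho(Y),\\
&\text{Positive Homogeneity:} && \rho(\lambda X) = \lambda\,\rho(X), \quad \lambda \ge 0.
\end{aligned}
\]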
A generalized deviation measure, in the sense of Rockafellar et al. (2006), fulfills the following properties: Translation Insensitivity, Positive Homogeneity, Subadditivity, and Nonnegativity.
Limitedness indicates that the risk of a position is not greater than the maximum loss. See Righi (2019).
For similarity measures, the interpretation is different. The higher the observed value, the more similar the observations.
In addition to ML, a Bayesian approach can be considered. For the sake of brevity, we do not address this approach. Details can be found in Fraley and Raftery (2007).
We downloaded data from the Trading Economics website (https://tradingeconomics.com/). We consider the American market due to its representativeness in the international scenario. We also select this market because other studies that use cluster analysis to classify financial data rely on it. See, for example, Bjerring et al. (2017), Puerto et al. (2020), and Tayalı (2020).
We include the most traded stocks to avoid convergence issues in our estimation. Even with this restriction, we retain more than 67% of our sample each year. We do not consider transaction costs and bid-ask spreads because our intention is not to compare portfolio formation strategies. Our aim in this paper is to illustrate the use of risk and return as classification variables.
As the 2005 period was used to make the 2006 risk forecasts, our final dataset contains information from January 2006 to December 2017.
Illustrations of the optimal number of clusters for K-means, PAM, C-Means, and Fanny clustering are available on request.
We employ an elicitable risk measure to assess the accuracy of the risk estimates utilizing a consistent loss function.
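As a minimal sketch of this kind of evaluation, the snippet below scores Value at Risk forecasts with the quantile (pinball) loss, a standard consistent loss function for quantiles. The choice of VaR as the elicitable measure, the function name `var_quantile_loss`, and the convention that forecasts are reported as positive loss amounts are assumptions made for illustration; the note itself does not specify them.

```python
import numpy as np

def var_quantile_loss(returns, var_forecasts, alpha=0.01):
    """Average quantile (pinball) loss of VaR forecasts at level alpha.

    Assumption: VaR forecasts are positive loss amounts, so the
    forecast alpha-quantile of the return distribution is -VaR.
    """
    r = np.asarray(returns, dtype=float)
    q = -np.asarray(var_forecasts, dtype=float)  # forecast quantile of returns
    hit = (r < q).astype(float)                  # 1 when the return breaches VaR
    # Pinball loss: (alpha - 1{r < q}) * (r - q), always nonnegative
    return float(np.mean((alpha - hit) * (r - q)))

# Hypothetical numbers: two daily returns against a constant 2% VaR forecast
loss = var_quantile_loss([0.01, -0.05], [0.02, 0.02], alpha=0.01)
print(loss)
```

Lower average loss indicates more accurate forecasts, so competing forecasting procedures can be ranked by this score.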
We do not obtain risk forecasts for the out-of-sample period because we consider a different set of assets each year to obtain cluster returns.
We use \(\alpha = 0.01\) because it is recommended by the Basel Committee on Banking Supervision (2013).
We use the deviation measure because it represents the Markowitz classical approach (Markowitz 1952).
For the Sharpe ratio, we use Standard Deviation as a risk measure.
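A minimal sketch of this computation is shown below. The function name `sharpe_ratio`, the per-period risk-free rate argument, and the use of the sample standard deviation (`ddof=1`) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation."""
    excess = np.asarray(returns, dtype=float) - risk_free
    return float(excess.mean() / excess.std(ddof=1))

# Hypothetical per-period portfolio returns
print(sharpe_ratio([0.01, 0.03, -0.02, 0.015]))
```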
We consider the Fama-French three-factor model because empirical evidence indicates that including the size and book-to-market (BM) effects improves the asset pricing ability of the Capital Asset Pricing Model (CAPM). See, for example, Fama and French (1996) and Gaunt (2004). Lawrence et al. (2007) empirically test and compare the performance of the traditional CAPM, the three-moment CAPM, and the Fama-French three-factor model. Their results show that the three-factor model outperforms the other models. Regarding the Fama-French five-factor model, Jiao and Lilti (2017) find, for the Chinese stock market, that this model does not capture more variation in expected stock returns than the three-factor model.
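To make the three-factor specification concrete, the sketch below estimates the regression of a portfolio's excess return on the market, size (SMB), and value (HML) factors by ordinary least squares. The function name `three_factor_fit` and the synthetic factor series are hypothetical; in practice the factors would come from a source such as the French (2020) data library, and the paper's own estimation details (for example, any robust standard errors) are not reproduced here.

```python
import numpy as np

def three_factor_fit(excess_returns, mkt, smb, hml):
    """OLS estimates of alpha and the three factor loadings."""
    y = np.asarray(excess_returns, dtype=float)
    # Design matrix: intercept, market excess return, SMB, HML
    X = np.column_stack([np.ones(len(y)), mkt, smb, hml])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(["alpha", "beta_mkt", "beta_smb", "beta_hml"], coef))

# Synthetic example: returns built from known loadings plus small noise
rng = np.random.default_rng(42)
mkt, smb, hml = rng.normal(scale=0.01, size=(3, 250))
noise = rng.normal(scale=0.001, size=250)
portfolio = 0.0002 + 1.0 * mkt + 0.5 * smb - 0.3 * hml + noise
print(three_factor_fit(portfolio, mkt, smb, hml))
```

A market loading near 1 means the portfolio moves roughly one-for-one with the market factor.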
Illustrations of portfolio returns for other years are available on request.
For PAM, in 2006, the average market beta was close to 1.
References
Acerbi, C. 2002. Spectral measures of risk: A coherent representation of subjective risk aversion. Journal of Banking & Finance 26 (7): 1505–1518.
Acerbi, C., and B. Szekely. 2017. General properties of backtestable statistics. Working Paper.
Aït-Sahalia, Y., and D. Xiu. 2016. Increased correlation among asset classes: Are volatility or jumps to blame, or both? Journal of Econometrics 194 (2): 205–219.
Artzner, P., F. Delbaen, J.M. Eber, and D. Heath. 1999. Coherent measures of risk. Mathematical Finance 9 (3): 203–228.
Atilgan, Y., T.G. Bali, K.O. Demirtas, and A.D. Gunaydin. 2019. Global downside risk and equity returns. Journal of International Money and Finance 98: 102065.
Atilgan, Y., T.G. Bali, K.O. Demirtas, and A.D. Gunaydin. 2020. Left-tail momentum: Underreaction to bad news, costly arbitrage and equity returns. Journal of Financial Economics 135 (3): 725–753.
Bali, T.G., N. Cakici, and R.F. Whitelaw. 2014. Hybrid tail risk and expected stock returns: When does the tail wag the dog? The Review of Asset Pricing Studies 4 (2): 206–246.
Bark, H.-K.K. 1991. Risk, return, and equilibrium in the emerging markets: Evidence from the Korean stock market. Journal of Economics and Business 43 (4): 353–362.
Basel Committee on Banking Supervision. 2013. Fundamental review of the trading book: A revised market risk framework. Consultative Document, October.
Bellini, F., and E. Di Bernardino. 2017. Risk management with expectiles. The European Journal of Finance 23 (6): 487–506.
Bellman, R., R. Kalaba, and L. Zadeh. 1966. Abstraction and pattern classification. Journal of Mathematical Analysis and Applications 13 (1): 1–7.
BenMim, I., and A. BenSaïda. 2019. Financial contagion across major stock markets: A study during crisis episodes. The North American Journal of Economics and Finance 48: 187–201.
Bezdek, J.C. 2013. Pattern recognition with fuzzy objective function algorithms. New York: Springer.
Binder, D.A. 1978. Bayesian cluster analysis. Biometrika 65 (1): 31–38.
Bjerring, T.T., O. Ross, and A. Weissensteiner. 2017. Feature selection for portfolio optimization. Annals of Operations Research 256 (1): 21–40.
Blume, M.E. 1970. Portfolio theory: A step toward its practical application. The Journal of Business 43 (2): 152–173.
Blume, M.E., and I. Friend. 1973. A new look at the capital asset pricing model. The Journal of Finance 28 (1): 19–33.
Charrad, M., N. Ghazzali, V. Boiteau, and A. Niknafs. 2015. Determining the best number of clusters in a data set. R Packages. http://cran.rediris.es/web/packages/NbClust/NbClust.pdf
Chen, B., J. Zhong, and Y. Chen. 2020. A hybrid approach for portfolio selection with higher-order moments: Empirical evidence from Shanghai Stock Exchange. Expert Systems with Applications 145: 113104.
Chen, L.-H., and L. Huang. 2009. Portfolio optimization of equity mutual funds with fuzzy return rates and risks. Expert Systems with Applications 36 (2): 3720–3727.
Cheong, D., Y.M. Kim, H.W. Byun, K.J. Oh, and T.Y. Kim. 2017. Using genetic algorithm to support clustering-based portfolio optimization by investor information. Applied Soft Computing 61: 593–602.
Cont, R. 2001. Empirical properties of asset returns: Stylized facts and statistical issues. Quantitative Finance 1 (2): 223–236.
Dempster, A.P., N.M. Laird, and D.B. Rubin. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological) 39 (1): 1–22.
Diaz, A., G. Garcia-Donato, and A. Mora-Valencia. 2017. Risk quantification in turmoil markets. Risk Management 19 (3): 202–224.
Dunn, J.C. 1973. A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. Journal of Cybernetics 3 (3): 32–57.
Fama, E.F., and K.R. French. 1992. The cross-section of expected stock returns. The Journal of Finance 47 (2): 427–465.
Fama, E.F., and K.R. French. 1993. Common risk factors in the returns on stocks and bonds. Journal of Financial Economics 33: 3–56.
Fama, E.F., and K.R. French. 1996. Multifactor explanations of asset pricing anomalies. The Journal of Finance 51 (1): 55–84.
Fischer, T. 2003. Risk capital allocation by coherent risk measures based on one-sided moments. Insurance: Mathematics and Economics 32: 135–146.
Föllmer, H., and A. Schied. 2002. Convex measures of risk and trading constraints. Finance and Stochastics 6: 429–447.
Fraley, C., and A.E. Raftery. 2002. Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97 (458): 611–631.
Fraley, C., and A.E. Raftery. 2007. Bayesian regularization for normal mixture estimation and model-based clustering. Journal of Classification 24 (2): 155–181.
French, K. 2020. Data library. http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
Gaunt, C. 2004. Size and book to market effects and the Fama French three factor asset pricing model: Evidence from the Australian stockmarket. Accounting & Finance 44 (1): 27–44.
Gneiting, T. 2011. Making and evaluating point forecasts. Journal of the American Statistical Association 106 (494): 746–762.
Halliwell, J., R. Heaney, J. Sawicki, et al. 1999. Size and book to market effects in Australian share markets: A time series analysis. Accounting Research Journal 12: 122–137.
Heaton, J.B., N.G. Polson, and J.H. Witte. 2017. Deep learning for finance: Deep portfolios. Applied Stochastic Models in Business and Industry 33 (1): 3–12.
Hung, M.-C., and D.-L. Yang. 2001. An efficient fuzzy c-means clustering algorithm. In: Proceedings 2001 IEEE International Conference on Data Mining. IEEE, pp. 225–232.
Iorio, C., G. Frasso, A. D’Ambrosio, and R. Siciliano. 2018. A P-spline based clustering approach for portfolio selection. Expert Systems with Applications 95: 88–103.
Jensen, M.C. 1968. The performance of mutual funds in the period 1945–1964. The Journal of Finance 23 (2): 389–416.
Jiao, W., and J.-J. Lilti. 2017. Whether profitability and investment factors have additional explanatory power comparing with Fama-French three-factor model: Empirical evidence on Chinese A-share stock market. China Finance and Economic Review 5 (1): 7.
Kaufman, L., and P.J. Rousseeuw. 2009. Finding groups in data: An introduction to cluster analysis, vol. 344. Hoboken: Wiley.
Kritzman, M. 1993. What practitioners need to know... about factor methods. Financial Analysts Journal 49 (1): 12–15.
Kuester, K., S. Mittnik, and M.S. Paolella. 2006. Value-at-risk prediction: A comparison of alternative strategies. Journal of Financial Econometrics 4 (1): 53–89.
Lau, J.W., and P.J. Green. 2007. Bayesian model-based clustering procedures. Journal of Computational and Graphical Statistics 16 (3): 526–558.
Lawrence, E.R., J. Geppert, and A.J. Prakash. 2007. Asset pricing models: A comparison. Applied Financial Economics 17 (11): 933–940.
León, D., A. Aragón, J. Sandoval, G. Hernández, A. Arévalo, and J. Niño. 2017. Clustering algorithms for risk-adjusted portfolio construction. In: International Conference on Computational Science, ICCS. pp. 1334–1343.
Lisi, F., and M. Corazza. 2008. Clustering financial data for mutual fund management. In Mathematical and Statistical Methods in Insurance and Finance, 157–164. Cham: Springer.
MacQueen, J., et al. 1967. Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. Vol. 1. Oakland, CA, USA, pp. 281–297.
Maechler, M., P. Rousseeuw, A. Struyf, M. Hubert, and K. Hornik. 2020. cluster: Cluster Analysis Basics and Extensions. R package version 2.1.0. For new features, see the 'Changelog' file (in the package source).
Mahdi, I.B.S., and M.B. Abbes. 2018. Relationship between capital, risk and liquidity: A comparative study between Islamic and conventional banks in MENA region. Research in International Business and Finance 45: 588–596.
Majumdar, S., and A.K. Laha. 2020. Clustering and classification of time series using topological data analysis with applications to finance. Expert Systems with Applications 162: 113868.
Markowitz, H. 1952. Portfolio selection. Journal of Finance 7: 77–91.
McLachlan, G.J., S.X. Lee, and S.I. Rathnayake. 2019. Finite mixture models. Annual Review of Statistics and its Application 6: 355–378.
Meyer, D., E. Dimitriadou, K. Hornik, A. Weingessel, F. Leisch, C.-C. Chang, C.-C. Lin, and M.D. Meyer. 2020. Package 'e1071'. The R Journal.
Müller, F., and M. Righi. 2020. Model risk measures: A review and new proposals on risk forecasting. Working Paper. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3489917
Müller, F.M., S.S. Santos, T.W. Gössling, and M.B. Righi. 2022. Comparison of risk forecasts for cryptocurrencies: A focus on Range Value at Risk. Finance Research Letters 48: 102916.
Nanda, S., B. Mahanty, and M. Tiwari. 2010. Clustering Indian stock market data for portfolio management. Expert Systems with Applications 37 (12): 8793–8798.
Newey, W.K., and K.D. West. 1987. A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55 (3): 703–708.
Ogryczak, W., and A. Ruszczyński. 1999. From stochastic dominance to mean-risk models: Semideviations as risk measures. European Journal of Operational Research 116 (1): 33–50.
Pai, G.V., and T. Michel. 2009. Evolutionary optimization of constrained \(k\)-means clustered assets for diversification in small portfolios. IEEE Transactions on Evolutionary Computation 13 (5): 1030–1053.
Pérignon, C., and D.R. Smith. 2010. The level and quality of value-at-risk disclosure by commercial banks. Journal of Banking & Finance 34 (2): 362–377.
Pradhan, R.P., M.B. Arvin, S. Bahmani, J.H. Hall, and N.R. Norman. 2017. Finance and growth: Evidence from the ARF countries. The Quarterly Review of Economics and Finance 66: 136–148.
Puerto, J., M. Rodríguez-Madrena, and A. Scozzari. 2020. Clustering and portfolio selection problems: A unified framework. Computers & Operations Research 117: 104891.
Reynolds, A.P., G. Richards, B. de la Iglesia, and V.J. Rayward-Smith. 2006. Clustering rules: A comparison of partitioning and hierarchical clustering algorithms. Journal of Mathematical Modelling and Algorithms 5 (4): 475–504.
Righi, M.B. 2019. A composition between risk and deviation measures. Annals of Operations Research 282 (1–2): 299–313.
Righi, M.B., and D. Borenstein. 2018. A simulation comparison of risk measures for portfolio optimization. Finance Research Letters 24: 105–112.
Righi, M.B., and P. Ceretta. 2016. Shortfall deviation risk: An alternative to risk measurement. Journal of Risk 19 (2): 81–116.
Righi, M.B., and P.S. Ceretta. 2015. A comparison of expected shortfall estimation models. Journal of Economics and Business 78: 14–47.
Righi, M.B., F.M. Müller, and M.R. Moresco. 2020. On a robust risk measurement approach for capital determination errors minimization. Insurance: Mathematics and Economics 95: 199–211.
Rockafellar, R., S. Uryasev, and M. Zabarankin. 2006. Generalized deviations in risk analysis. Finance and Stochastics 10: 51–74.
Ruspini, E.H. 1969. A new approach to clustering. Information and Control 15 (1): 22–32.
Schwarz, G., et al. 1978. Estimating the dimension of a model. Annals of Statistics 6 (2): 461–464.
Scrucca, L., M. Fop, T.B. Murphy, and A.E. Raftery. 2016. mclust 5: Clustering, classification and density estimation using Gaussian finite mixture models. The R Journal 8 (1): 289–317.
Shawky, H.A., R. Kuenzel, and A.D. Mikhail. 1997. International portfolio diversification: A synthesis and an update. Journal of International Financial Markets, Institutions and Money 7 (4): 303–327.
Tayalı, S.T. 2020. A novel backtesting methodology for clustering in mean-variance portfolio optimization. Knowledge-Based Systems 209: 106454.
Tibshirani, R., G. Walther, and T. Hastie. 2001. Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63 (2): 411–423.
Tola, V., F. Lillo, M. Gallegati, and R.N. Mantegna. 2008. Cluster analysis for portfolio optimization. Journal of Economic Dynamics and Control 32 (1): 235–258.
Ziegel, J. 2016. Coherence and elicitability. Mathematical Finance 26 (4): 901–918.
Funding
We are grateful for the financial support of CNPq (Brazilian Research Council) projects number 302369/2018-0 and 407556/2018-4.
Ethics declarations
Conflict of interest
We confirm that there are no interests to declare regarding this manuscript.
About this article
Cite this article
Guedes, P.C., Müller, F.M. & Righi, M.B. Risk measures-based cluster methods for finance. Risk Manag 25, 4 (2023). https://doi.org/10.1057/s41283-022-00110-0
DOI: https://doi.org/10.1057/s41283-022-00110-0