Cluster Analysis for Investment Funds Portfolio Optimisation: A Symbolic Data Approach

Terraza, Virginie; Toque, Carole

doi:10.1007/978-3-030-66691-0_5

Part of the book series: Risk, Systems and Decisions (RSD)

Abstract

In risk management and portfolio optimization, building a diversified portfolio requires knowing which assets move individually and which move in groups. The statistical uncertainty of the correlation matrix is the main problem in the optimization of a financial portfolio. Indeed, correlation estimates are often noisy, particularly in periods of stress, and unreliable because estimation horizons are always finite. Another drawback of the classical estimation of correlations is that they are estimated on historical data, and prediction based on past data is very difficult because it is hard to find elementary structures in the data that remain valid and persistent in the future. Markowitz portfolio optimization approaches suffer from these estimation errors. From the perspective of machine learning, new approaches have been proposed in the applied finance literature. Among these techniques, clustering is considered a significant method for capturing the natural structure of data. The objective of this research is to use data mining approaches to identify the best clustering indicators for building optimal portfolios. Clustering is an empirical procedure for grouping financial assets into homogeneous groups. The aim of cluster analysis is to maximize similarity within groups of assets and minimize similarity between groups. The similarities and dissimilarities are based on attribute values and frequently involve distance measures. Clustering techniques include partitioning-based, density-based, model-based and grid-based techniques. In this research we consider the symbolic approach, based on histogram-valued data and clusters, as a new approach to investment fund portfolio optimization. Firstly, it is based on aggregating individual-level data into group-level data summarized by symbols. In our case, the symbols are histogram-valued data, which take into account the variability inside groups.
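As a rough illustration of the aggregation step, the sketch below summarizes the individual returns of each group as a histogram of relative frequencies over shared bins. The group names, return values, bin edges and the `histogram_symbol` helper are hypothetical and chosen only for illustration; the chapter's actual variables and binning are not given in the abstract.

```python
import numpy as np

def histogram_symbol(returns, bin_edges):
    """Summarise a group's individual returns as relative frequencies per bin."""
    counts, _ = np.histogram(returns, bins=bin_edges)
    total = counts.sum()
    if total == 0:
        raise ValueError("no observation falls inside the bin edges")
    return counts / total

# Hypothetical individual-level returns for two groups (illustrative values only).
group_returns = {
    "group_a": np.array([-0.02, -0.01, 0.00, 0.01, 0.01, 0.03]),
    "group_b": np.array([-0.08, -0.03, 0.02, 0.05, 0.09, 0.10]),
}

# Shared bin edges so that the histograms of different groups are comparable.
bin_edges = np.array([-0.10, -0.05, 0.0, 0.05, 0.10])

# One histogram-valued symbol per group: a vector of bin weights summing to 1.
symbols = {
    name: histogram_symbol(values, bin_edges)
    for name, values in group_returns.items()
}

for name, weights in symbols.items():
    print(name, np.round(weights, 3))
```

Both groups have a similar mean return here, but their histograms differ in spread, which is the within-group variability that a single mean would hide.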
Secondly, for partitioning, we use dynamical clustering, an extension of K-means in which the means are replaced by other kinds of centers called 'kernels', which are distributions in our case. After clustering, stock samples are selected from these clusters to create optimal fund-of-funds portfolios with the lowest risk, measured by Conditional Value at Risk, for a given return. The funds' portfolios are compared over the period 2008–2016 using the conditional Sharpe ratio, and the year 2017 is used to validate our results out of sample. In this research we show that the use of symbolic data clustering algorithms can improve the reliability of the portfolio in terms of risk-adjusted performance.
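The two risk measures named above can be sketched as follows. This is a minimal sketch under stated assumptions: Conditional Value at Risk is estimated historically as the mean loss at or beyond the empirical Value at Risk, and the conditional Sharpe ratio is taken to be mean excess return divided by CVaR. The abstract does not state the authors' estimator or confidence level, so the functions, the sample returns and the 80% level are hypothetical, and the sketch does not perform the portfolio optimization itself.

```python
import numpy as np

def historical_cvar(returns, alpha=0.95):
    """Mean loss at or beyond the empirical VaR at level alpha (losses positive)."""
    losses = -np.asarray(returns, dtype=float)
    var = np.quantile(losses, alpha)  # empirical Value at Risk
    return losses[losses >= var].mean()

def conditional_sharpe(returns, risk_free=0.0, alpha=0.95):
    """Mean excess return per unit of CVaR (assumed definition, see lead-in)."""
    returns = np.asarray(returns, dtype=float)
    cvar = historical_cvar(returns, alpha)
    if cvar <= 0:
        raise ValueError("CVaR must be positive for the ratio to be meaningful")
    return (returns.mean() - risk_free) / cvar

# Hypothetical periodic portfolio returns (illustrative values only).
sample_returns = [-0.10, -0.04, 0.01, 0.02, 0.03, 0.03, 0.04, 0.05, 0.06, 0.10]

cvar_80 = historical_cvar(sample_returns, alpha=0.80)
ratio_80 = conditional_sharpe(sample_returns, alpha=0.80)
print(f"CVaR(80%) = {cvar_80:.4f}, conditional Sharpe = {ratio_80:.4f}")
```

A higher ratio indicates more return per unit of tail risk, which is the sense in which competing portfolios can be ranked on risk-adjusted performance.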

Author information

Authors and Affiliations

Department of Economics and Management, University of Luxembourg, Luxembourg, Luxembourg
Virginie Terraza
Ministère de la Transition écologique et solidaire, Paris - La Défense, France
Carole Toque

Corresponding author

Correspondence to Virginie Terraza.

Editor information

Editors and Affiliations

Technical University of Crete, Greece and Audencia Business School, Nantes Cedex 3, France
Constantin Zopounidis
Audencia Nantes School of Management, Nantes Cedex 3, France
Ramzi Benkraiem
Audencia Nantes School of Management, Nantes Cedex 3, France
Iordanis Kalaitzoglou

Appendix

Period 2010–2012

Table A1 The symbolic data table for the four classes

Table A2 Funds by cluster

Period 2013–2014

Table A3 The symbolic data table for the three classes

Table A4 Funds by cluster

Period 2015–2016

Table A5 The symbolic data table for the five classes

Table A6 Funds by cluster

About this chapter

Cite this chapter

Terraza, V., Toque, C. (2021). Cluster Analysis for Investment Funds Portfolio Optimisation: A Symbolic Data Approach. In: Zopounidis, C., Benkraiem, R., Kalaitzoglou, I. (eds) Financial Risk Management and Modeling. Risk, Systems and Decisions. Springer, Cham. https://doi.org/10.1007/978-3-030-66691-0_5

DOI: https://doi.org/10.1007/978-3-030-66691-0_5
Published: 14 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66690-3
Online ISBN: 978-3-030-66691-0
eBook Packages: Business and Management; Business and Management (R0)
