Abstract
Financial time series clustering finds application in forecasting, noise reduction and enhanced index tracking. Central to every clustering algorithm is the dissimilarity measure it employs. The dissimilarity measures used or suggested for the financial domain in past research are the correlation based, the temporal correlation based and the dynamic time warping (DTW) based dissimilarity measures. One shortcoming of these measures is that they do not take into account the lead or lag existing between the returns of different stocks, which changes with time. In most cases, stocks that are highly correlated at some lead or lag belong to the same cluster (or sector). The present paper proposes two new dissimilarity measures that show superior clustering results compared with past measures when evaluated over 3 data sets comprising 526 companies.
References
Dose, C.: Clustering of financial time series with application to index and enhanced index tracking portfolio. Phys. A 355, 145–151 (2005)
Mantegna, S.: An Introduction to Econophysics Correlations and Complexity in Finance. Cambridge University Press, Cambridge (1999)
Basalto, N., et al.: Hausdorff clustering of financial time series. Phys. A 379, 635–644 (2007)
Saeed, T.: Effective clustering of time-series data using FCM. Int. J. Mach. Learn. Comput. 4(2), 170–176 (2014)
Guan, J.: Cluster financial time series for portfolio. In: Proceedings of the 2007 ICWAPR, Beijing, China, 2–4 November 2007
Marti, A., Nielson, D.: Clustering financial time series: how long is enough? In: Proceedings of the Twenty-Fifth IJCAI (2016)
Aghabozorgi, S., Shirkhorshidi, A.S., Wah, T.Y.: Time-series clustering - a decade review. Inform. Syst. 53, 16–38 (2015)
John, G.: k-shape: efficient and accurate clustering of time series. SIGMOD Rec. 45(1), 69–76 (2016)
Montero, V.: TSclust: an R package for time series clustering. J. Stat. Softw. 62(1), 1–43 (2014)
Murtagh, L.: Ward’s hierarchical agglomerative clustering method: which algorithms implement Ward’s criterion? J. Classif. 31, 274–295 (2014)
Berndt, D.J., Clifford, J.: Using dynamic time warping to find patterns in time series. In: KDD, Workshop, pp. 359–370 (1994)
Chouakria, A.D., Nagabhushan, P.N.: Adaptive dissimilarity index for measuring time series proximity. Adv. Data Anal. Classif. 1(1), 5–21 (2007)
Chouakria-Douzal, A.: Compression technique preserving correlations of a multivariate temporal sequence. In: Advances in Intelligent Data Analysis, pp. 566–577. Springer, Heidelberg (2003)
Gavrilov, M., et al.: Mining the stock market: which measure is best? In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD, pp. 487–496 (2000)
Giorgino, T.: Computing and visualizing dynamic time warping alignments in R: the dtw package. J. Stat. Softw. 31(7), 1–24 (2009)
Hoffmann, M., et al.: Estimation of the lead-lag parameter from non-synchronous data. ISI/BS Bernoulli 19(2), 426–461 (2013)
Acknowledgments
Kartikay Gupta was supported by a Teaching Assistantship Grant from the Ministry of Human Resource Development, India.
A Appendix
A.1 Correlation Based Dissimilarity Measure (COR)
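A simple dissimilarity measure for time series clustering is based on Pearson’s correlation coefficient between the time series \( X_{T} = (X_{1}, \ldots , X_{T}) \) and \( Y_{T} = (Y_{1}, \ldots , Y_{T}) \), given by
\[ COR(X_{T},Y_{T}) = \frac{\sum _{t=1}^{T} (X_{t} - m_{X})(Y_{t} - m_{Y})}{\sqrt{\sum _{t=1}^{T} (X_{t} - m_{X})^{2}} \, \sqrt{\sum _{t=1}^{T} (Y_{t} - m_{Y})^{2}}}, \]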
where \( m_{X} \) and \( m_{Y} \) are the average values of the time series \( X_{T} \) and \( Y_{T} \) respectively. The dissimilarity measure is then given by
\[ d_{COR}(X_{T},Y_{T}) = \sqrt{2\left( 1 - COR(X_{T},Y_{T}) \right) }. \]
For more information regarding this distance measure, one may refer to [9].
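The correlation based measure can be sketched in a few lines of Python. This is only an illustration: it assumes the dissimilarity takes the form \( \sqrt{2(1 - COR)} \), which is one of the correlation based distances described in [9], and the function names `pearson_cor` and `d_cor` are hypothetical, not taken from the paper.

```python
import numpy as np


def pearson_cor(x, y):
    """Pearson correlation between two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean m_X
    dy = y - y.mean()  # deviations from the mean m_Y
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))


def d_cor(x, y):
    """Assumed dissimilarity form: sqrt(2 * (1 - COR))."""
    # max(...) guards against tiny negative values caused by rounding
    return float(np.sqrt(max(0.0, 2.0 * (1.0 - pearson_cor(x, y)))))


a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]  # perfectly correlated with a
c = [4.0, 3.0, 2.0, 1.0]  # perfectly anti-correlated with a
print(d_cor(a, b), d_cor(a, c))
```

Under this assumed form, perfectly correlated series are at distance 0 and perfectly anti-correlated series are at the maximum distance 2, so the measure depends only on the co-movement of the two series and not on their scale.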
A.2 Temporal Correlation Based Dissimilarity Measure (CORT)
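As introduced by Douzal [12], the similarity between two time series is evaluated using the first order temporal correlation coefficient [13], given by
\[ CORT(X_{T},Y_{T}) = \frac{\sum _{t=1}^{T-1} (X_{t+1} - X_{t})(Y_{t+1} - Y_{t})}{\sqrt{\sum _{t=1}^{T-1} (X_{t+1} - X_{t})^{2}} \, \sqrt{\sum _{t=1}^{T-1} (Y_{t+1} - Y_{t})^{2}}}, \]
where \( X_{t+1} - X_{t} \) and \( Y_{t+1} - Y_{t} \) are the first differences of the two series at time \( t \).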
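The dissimilarity proposed by Douzal [12] modulates the dissimilarity value between \( X_T \) and \( Y_T \) using the coefficient \( CORT(X_{T},Y_{T}) \). Specifically, it is defined as
\[ d_{CORT}(X_{T},Y_{T}) = \phi _{k}\left[ CORT(X_{T},Y_{T}) \right] \cdot \text{Dissimilarity}(X_{T},Y_{T}), \]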
where \( \phi _{k} \) is an adaptive function given by
\[ \phi _{k}(u) = \frac{2}{1 + \exp (k u)}, \quad k \ge 0, \]
and Dissimilarity\( (X_{T},Y_{T}) \) refers to the dissimilarity value computed using any available dissimilarity measure, such as the Euclidean distance or DTW. In this paper, we choose DTW because it accommodates slight shape distortions when computing the dissimilarity value. For more details regarding DTW, readers are referred to [9, 11].
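The following Python sketch shows how the pieces of the CORT measure fit together, under stated assumptions: the adaptive function is taken to be \( \phi _{k}(u) = 2/(1+\exp (ku)) \) as in [12], the value \( k = 2 \) is an arbitrary illustrative choice, and `dtw` is a textbook dynamic programming recursion with an absolute-difference local cost, not necessarily the DTW configuration used in the paper. All function names are hypothetical.

```python
import numpy as np


def cort(x, y):
    """First order temporal correlation of two equal-length series."""
    dx = np.diff(np.asarray(x, dtype=float))  # X_{t+1} - X_t
    dy = np.diff(np.asarray(y, dtype=float))  # Y_{t+1} - Y_t
    denom = np.sqrt(np.sum(dx**2)) * np.sqrt(np.sum(dy**2))
    return float(np.sum(dx * dy) / denom)


def phi(u, k=2.0):
    """Adaptive function 2 / (1 + exp(k * u)); k = 2 is an assumed value."""
    return float(2.0 / (1.0 + np.exp(k * u)))


def dtw(x, y):
    """Textbook O(n * m) DTW with absolute-difference local cost."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            # extend the cheapest of the three admissible predecessor cells
            cost[i, j] = step + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


def d_cort(x, y, k=2.0):
    """CORT-modulated dissimilarity: phi_k(CORT) * DTW."""
    return phi(cort(x, y), k) * dtw(x, y)


up = [1.0, 2.0, 3.0, 4.0, 5.0]
shifted = [2.0, 3.0, 4.0, 5.0, 6.0]  # same dynamics as up
down = [5.0, 4.0, 3.0, 2.0, 1.0]     # opposite dynamics to up
print(d_cort(up, shifted), dtw(up, shifted))
print(d_cort(up, down), dtw(up, down))
```

Since \( \phi _{k} \) decreases from 2 to 0 as its argument grows, series that move in the same direction (CORT close to 1) have their DTW value shrunk, while series that move in opposite directions (CORT close to -1) have it inflated by a factor of up to 2.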
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Gupta, K., Chatterjee, N. (2018). Financial Time Series Clustering. In: Satapathy, S., Joshi, A. (eds) Information and Communication Technology for Intelligent Systems (ICTIS 2017) - Volume 2. ICTIS 2017. Smart Innovation, Systems and Technologies, vol 84. Springer, Cham. https://doi.org/10.1007/978-3-319-63645-0_16
DOI: https://doi.org/10.1007/978-3-319-63645-0_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63644-3
Online ISBN: 978-3-319-63645-0
eBook Packages: Engineering, Engineering (R0)