Clusterization of Indices and Assets in the Stock Market

  • Leszek J. Chmielewski
  • Maciej Janowicz
  • Luiza Ochnio
  • Arkadiusz Orłowski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9375)


The K-means clustering algorithm has been used to classify major world stock market indices as well as the most important assets on the Warsaw Stock Exchange (GPW). In addition, to obtain information about mutual connections between indices and stocks, the Granger causality test has been applied and Pearson's R correlation coefficients have been calculated. The three procedures have been found to provide qualitatively different kinds of information about the groups of financial data. Not surprisingly, the major world stock market indices appear to be very strongly interconnected from the point of view of both Granger causality and correlation. Such connections are less transparent in the case of individual stocks. However, “cluster leaders” can be identified, which opens the possibility of more efficient trading.
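As a rough illustration of the three procedures named in the abstract, the following Python sketch applies K-means, Pearson correlation, and a one-lag Granger-type F-test to synthetic return series. Everything here is an assumption made for illustration: the data are randomly generated, the choice of standardized return series as clustering features is a guess, and the helper `granger_f_test` is a hypothetical hand-written test rather than the authors' implementation.

```python
import numpy as np
from scipy import stats
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
T = 300  # number of observations per series (arbitrary)

# Synthetic returns (illustrative only): two groups of three series,
# each group driven by its own common factor plus independent noise.
factor_a = rng.normal(size=T)
factor_b = rng.normal(size=T)
returns = np.vstack(
    [factor_a + 0.3 * rng.normal(size=T) for _ in range(3)]
    + [factor_b + 0.3 * rng.normal(size=T) for _ in range(3)]
)

# 1) K-means: each row (one standardized return series) is one sample.
z = (returns - returns.mean(axis=1, keepdims=True)) / returns.std(axis=1, keepdims=True)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(z)

# 2) Pearson correlation coefficients between all pairs of series.
corr = np.corrcoef(returns)


# 3) Granger-type F-test: do lagged values of `cause` improve an
#    autoregressive model of `effect`? (Hypothetical helper.)
def granger_f_test(cause, effect, lag=1):
    n = len(effect) - lag
    y = effect[lag:]
    own = np.column_stack(
        [effect[lag - k: len(effect) - k] for k in range(1, lag + 1)]
    )
    other = np.column_stack(
        [cause[lag - k: len(cause) - k] for k in range(1, lag + 1)]
    )
    restricted = np.column_stack([np.ones(n), own])  # own lags only
    full = np.column_stack([restricted, other])      # plus lags of `cause`

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return resid @ resid

    rss_r, rss_f = rss(restricted), rss(full)
    dof = n - full.shape[1]
    f_stat = ((rss_r - rss_f) / lag) / (rss_f / dof)
    p_value = stats.f.sf(f_stat, lag, dof)
    return f_stat, p_value


# A synthetic "leader" and a "follower" that echoes it one step later.
leader = factor_a
follower = 0.5 * rng.normal(size=T)
follower[1:] += 0.8 * leader[:-1]

f_lf, p_lf = granger_f_test(leader, follower)  # leader -> follower
f_fl, p_fl = granger_f_test(follower, leader)  # follower -> leader

print("cluster labels:", labels)
print("correlation within group:", round(corr[0, 1], 3))
print("correlation across groups:", round(corr[0, 3], 3))
print("leader -> follower: F = %.2f, p = %.3g" % (f_lf, p_lf))
print("follower -> leader: F = %.2f, p = %.3g" % (f_fl, p_fl))
```

In this toy setting the three outputs answer different questions: the cluster labels group series with similar behaviour, the correlation matrix measures symmetric co-movement, and the F-test is directional, which is what would allow a leading series to be singled out.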


Keywords: Technical analysis, Clustering, K-means, Granger causality, Pearson correlation



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Leszek J. Chmielewski (1)
  • Maciej Janowicz (1)
  • Luiza Ochnio (1)
  • Arkadiusz Orłowski (1)
  1. Faculty of Applied Informatics and Mathematics (WZIM), Warsaw University of Life Sciences (SGGW), Warsaw, Poland
