Hubness-Aware Classification, Instance Selection and Feature Construction: Survey and Extensions to Time-Series

  • Nenad Tomašev
  • Krisztian Buza
  • Kristóf Marussy
  • Piroska B. Kis
Part of the Studies in Computational Intelligence book series (SCI, volume 584)


Time-series classification is the common denominator in many real-world pattern recognition tasks. In the last decade, the simple nearest neighbor classifier, in combination with dynamic time warping (DTW) as distance measure, has been shown to achieve surprisingly good overall results on time-series classification problems. On the other hand, the presence of hubs, i.e., instances that are similar to an exceptionally large number of other instances, has been shown to be one of the crucial properties of time-series data sets. To achieve high performance, the presence of hubs should be taken into account in machine learning tasks related to time-series. In this chapter, we survey hubness-aware classification methods and instance selection, and we propose to use selected instances for feature construction. We provide a detailed description of the algorithms using uniform terminology and notations. Many of the surveyed approaches were originally introduced for vector classification, and their application to time-series data is novel; therefore, we provide experimental results on a large number of publicly available real-world time-series data sets.
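The two notions the abstract builds on can be made concrete with a short sketch: the DTW distance used by the nearest neighbor classifier, and the k-occurrence count N_k(x), i.e., how often an instance x appears among the k nearest neighbors of the other instances, whose heavy-tailed distribution gives rise to hubs. The function names and the toy data below are illustrative, not from the chapter.

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between two 1-D series,
    computed with the standard O(len(a) * len(b)) dynamic program."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible warping steps
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m]

def k_occurrence(series, k=1):
    """N_k(x) for every series x in the data set: the number of times
    x appears among the k nearest neighbors (under DTW) of the other
    series. Instances with exceptionally large N_k(x) are the hubs."""
    n = len(series)
    dist = np.array([[dtw(a, b) for b in series] for a in series])
    np.fill_diagonal(dist, np.inf)  # an instance is not its own neighbor
    counts = np.zeros(n, dtype=int)
    for i in range(n):
        for j in np.argsort(dist[i])[:k]:
            counts[j] += 1
    return counts

# Toy example: the second series is the nearest neighbor of both others,
# so it accumulates the largest k-occurrence count.
data = [np.array([0.0, 0.0]), np.array([0.1, 0.0]), np.array([5.0, 5.0])]
print(k_occurrence(data, k=1))  # → [1 2 0]
```

Since every instance votes for exactly k neighbors, the counts always sum to k times the number of instances; hubness refers to the skewness of this distribution, which grows with intrinsic dimensionality.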


Keywords: Time-series classification · Hubs · Instance selection · Feature construction



Research partially performed within the framework of the grant of the Hungarian Scientific Research Fund (grant No. OTKA 108947). The position of Krisztian Buza is funded by the Warsaw Center of Mathematics and Computer Science (WCMCS).



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Nenad Tomašev (1)
  • Krisztian Buza (2)
  • Kristóf Marussy (3)
  • Piroska B. Kis (4)
  1. Institute Jožef Stefan, Artificial Intelligence Laboratory, Ljubljana, Slovenia
  2. Faculty of Mathematics, Informatics and Mechanics, University of Warsaw (MIMUW), Warszawa, Poland
  3. Department of Computer Science and Information Theory, Budapest University of Technology and Economics, Budapest, Hungary
  4. Department of Mathematics and Computer Science, College of Dunaújváros, Dunaújváros, Hungary
