A Competitive Measure to Assess the Similarity between Two Time Series

  • Joan Serrà
  • Josep Lluís Arcos
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7466)

Abstract

Time series are ubiquitous, and a measure to assess their similarity is a core part of many systems, including case-based reasoning systems. Although several proposals have been made, the most robust and reliable time series similarity measures are still the classical ones, introduced long ago. In this paper we propose a new approach to time series similarity based on the costs of iteratively jumping (or moving) between the sample values of two time series. We show that this approach can be very competitive when compared against the aforementioned classical measures. In fact, extensive experiments show that it can be statistically significantly superior for a number of data sources. Since the approach is also computationally simple, we foresee its application as an alternative off-the-shelf tool to be used in many case-based reasoning systems dealing with time series.
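
The abstract describes the jumping idea only at a high level. The following is a minimal Python sketch of what accumulating jump costs between two series could look like; it is an illustration under stated assumptions, not the formulation published in the paper. The function names `jump_cost` and `dissimilarity`, the temporal penalty `phi`, the squared cost, the stopping rule, and the symmetrisation by taking the minimum are all hypothetical choices made for this example.

```python
from typing import Sequence


def jump_cost(x: Sequence[float], y: Sequence[float], phi: float = 1.0) -> float:
    """Accumulate the cost of alternately jumping between x and y.

    Assumed cost of a jump (not taken from the paper): the squared
    amplitude difference plus a squared penalty `phi * d` for skipping
    d samples ahead in the destination series.
    """
    series = [list(x), list(y)]
    pos = [0, 0]   # next unvisited index in each series
    src = 0        # index of the series we are currently standing on
    total = 0.0
    # Assumed stopping rule: stop as soon as either series is exhausted.
    while pos[0] < len(series[0]) and pos[1] < len(series[1]):
        dst = 1 - src
        value = series[src][pos[src]]
        candidates = series[dst][pos[dst]:]
        # Pick the cheapest landing sample at or after the current position.
        cost, delta = min(
            ((phi * d) ** 2 + (value - v) ** 2, d)
            for d, v in enumerate(candidates)
        )
        total += cost
        pos[src] += 1          # leave the sample we jumped from
        pos[dst] += delta + 1  # continue just after the landing sample
        src = dst              # the next jump goes in the other direction
    return total


def dissimilarity(x: Sequence[float], y: Sequence[float], phi: float = 1.0) -> float:
    """Symmetric value: the cheaper of starting from x or starting from y."""
    return min(jump_cost(x, y, phi), jump_cost(y, x, phi))
```

In this sketch, two identical series accumulate zero cost, and a larger `phi` makes skipping ahead in time more expensive relative to amplitude mismatches. Each jump scans the remaining destination samples, so the sketch is quadratic in the worst case; it does not attempt to reproduce the computational simplicity claimed in the abstract.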

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Joan Serrà (1)
  • Josep Lluís Arcos (1)
  1. IIIA-CSIC, Artificial Intelligence Research Institute, Spanish National Research Council, Barcelona, Spain
