Towards Time Series Classification without Human Preprocessing

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 8556)

Abstract

Similarity search is a core functionality in many data mining algorithms. Over the past decade, these algorithms were mostly designed to rely on human assistance to extract characteristic, aligned patterns of equal length and scaling, and such assistance is not cost-effective. We propose the shotgun distance, a similarity metric that extracts, scales, and aligns segments from a query to a sample time series. This simplifies the classification of time series as produced by sensors. Our shotgun ensemble classifier classifies a time series based on its segments at varying lengths. It improves on the best published accuracies in case studies from bioacoustics, human motion detection, spectrographs, and personalized medicine. Finally, it outperforms the state of the art on the official UCR classification benchmark.
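
The abstract only names the three steps of the shotgun distance (extract, scale, align), so the following Python is a minimal sketch under stated assumptions rather than the paper's implementation: segments are assumed to be disjoint query windows of a fixed length, scaling is assumed to be z-normalization, and alignment is assumed to mean taking the best-matching sliding window of the sample under squared Euclidean distance. The names `znorm`, `shotgun_distance_sketch`, and `window` are hypothetical.

```python
import math


def znorm(segment):
    """Scale a segment to zero mean and unit variance (assumed scaling step)."""
    n = len(segment)
    mean = sum(segment) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in segment) / n)
    if std < 1e-12:
        # A flat segment carries no shape information; map it to all zeros.
        return [0.0] * n
    return [(x - mean) / std for x in segment]


def shotgun_distance_sketch(query, sample, window):
    """Sum, over disjoint query segments, of the best match found in the sample.

    Hypothetical reading of "extract, scale, align":
      extract - cut the query into disjoint segments of length `window`
      scale   - z-normalize each query segment and each sample window
      align   - slide over the sample and keep the smallest squared distance
    """
    if window < 1 or window > len(query) or window > len(sample):
        raise ValueError("window must fit into both query and sample")

    # Every sliding window of the sample, scaled once up front.
    candidates = [
        znorm(sample[j:j + window])
        for j in range(len(sample) - window + 1)
    ]

    total = 0.0
    for i in range(0, len(query) - window + 1, window):
        segment = znorm(query[i:i + window])
        total += min(
            sum((a - b) ** 2 for a, b in zip(segment, candidate))
            for candidate in candidates
        )
    return total


# Usage: the sample hides a shifted and amplified copy of the pattern.
pattern = [0.0, 1.0, 0.0, -1.0]
sample = [5.0, 5.0] + [10.0 + 2.0 * x for x in pattern] + [5.0, 5.0]
d_pattern = shotgun_distance_sketch(pattern, sample, window=4)
d_ramp = shotgun_distance_sketch([0.0, 1.0, 2.0, 3.0], sample, window=4)
```

In this toy example `d_pattern` comes out at, or within rounding error of, zero, because the scaled pattern occurs in the sample regardless of offset and amplitude, while the ramp query has no exact match and yields a larger value. An ensemble in the spirit of the abstract could repeat this for several values of `window`, but how the paper combines them is not stated here.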

Keywords

  • Time Series
  • Gait Cycle
  • Window Length
  • Dynamic Time Warping
  • Data Mining Algorithm

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Chapter length: 15 pages

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Schäfer, P. (2014). Towards Time Series Classification without Human Preprocessing. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2014. Lecture Notes in Computer Science (LNAI), vol 8556. Springer, Cham. https://doi.org/10.1007/978-3-319-08979-9_18

  • DOI: https://doi.org/10.1007/978-3-319-08979-9_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08978-2

  • Online ISBN: 978-3-319-08979-9

  • eBook Packages: Computer Science, Computer Science (R0)