Unsupervised Discovery of Motifs under Amplitude Scaling and Shifting in Time Series Databases

  • Tom Armstrong
  • Eric Drewniak
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6871)

Abstract

We introduce an algorithm, MD-RP, for unsupervised discovery of frequently occurring patterns, or motifs, in time series databases. Unlike prior approaches that can handle pattern distortion in the time dimension only, MD-RP is robust in finding pattern instances under amplitude shifting and amplitude scaling. Using an established discretization method, SAX, we augment the existing real-valued time series representation with additional features to capture shifting and scaling. We evaluate this representation change with a modified randomized projection algorithm, both on synthetic data with planted, known motifs and on real-world data with known motifs (e.g., GPS). The empirical results demonstrate the effectiveness of MD-RP at discovering motifs that are undiscoverable by prior approaches. Finally, we show that MD-RP can be used to find the subsequences of a time series that are least similar to all other subsequences.
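The two building blocks named in the abstract, SAX discretization and random projection, can be illustrated with a minimal sketch. This is not the authors' MD-RP implementation: the abstract does not say which additional features MD-RP adds, so none are reproduced here. The function names (`znorm`, `sax_word`, `projection_collisions`), the word length, the alphabet, and the mask size are all hypothetical choices made for illustration, and the sketch assumes each window length is divisible by the word length.

```python
import random
from bisect import bisect_right
from collections import defaultdict
from itertools import combinations
from statistics import NormalDist, mean, pstdev


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    mu, sigma = mean(window), pstdev(window)
    if sigma == 0:
        return [0.0] * len(window)
    return [(x - mu) / sigma for x in window]


def sax_word(window, word_len=4, alphabet="abcd"):
    """SAX-style word: z-normalize, average over equal segments (PAA),
    then map each average to a symbol using equiprobable Gaussian cuts.
    Assumes len(window) is divisible by word_len (illustrative only)."""
    z = znorm(window)
    seg = len(z) // word_len
    paa = [mean(z[i * seg:(i + 1) * seg]) for i in range(word_len)]
    k = len(alphabet)
    cuts = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    return "".join(alphabet[bisect_right(cuts, v)] for v in paa)


def projection_collisions(words, n_iter=10, mask_size=2, seed=0):
    """Random projection in the style of Buhler and Tompa: repeatedly
    pick a random subset of word positions, bucket words by the symbols
    at those positions, and count how often each pair shares a bucket."""
    rng = random.Random(seed)
    counts = defaultdict(int)
    word_len = len(words[0])
    for _ in range(n_iter):
        cols = sorted(rng.sample(range(word_len), mask_size))
        buckets = defaultdict(list)
        for idx, w in enumerate(words):
            buckets[tuple(w[c] for c in cols)].append(idx)
        for members in buckets.values():
            for i, j in combinations(members, 2):
                counts[(i, j)] += 1
    return counts


base = [0, 0, 1, 1, 2, 2, 3, 3]
scaled_shifted = [5 + 3 * x for x in base]  # same shape, new amplitude
reversed_shape = list(reversed(base))

words = [sax_word(w) for w in (base, scaled_shifted, reversed_shape)]
counts = projection_collisions(words)
```

In this toy input, `base` and `scaled_shifted` receive the same word because each window is z-normalized before discretization, so the pair collides in every projection, while the reversed window never shares a bucket with either. Pairs with high collision counts would be the motif candidates to verify against the real-valued data.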

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Tom Armstrong (1)
  • Eric Drewniak (1)

  1. Wheaton College, Norton, USA