
Mining Time Series Data

Data Mining and Knowledge Discovery Handbook

Summary

Much of the world’s supply of data is in the form of time series. In the last decade, there has been an explosion of interest in mining time series data. A number of new algorithms have been introduced to classify, cluster, segment, index, discover rules in, and detect anomalies or novelties in time series. While the algorithms used to solve these problems draw on a multitude of different techniques, they share one common factor: they require some high-level representation of the data rather than the original raw data. These high-level representations are necessary as a feature extraction step, or simply to make the storage, transmission, and computation of massive datasets feasible. A multitude of representations have been proposed in the literature, including spectral transforms, wavelet transforms, piecewise polynomials, eigenfunctions, and symbolic mappings. This chapter gives a high-level survey of time series data mining tasks, with an emphasis on time series representations.
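
To make the idea of a high-level representation concrete, the following is a minimal sketch of one of the simplest piecewise polynomial schemes: a piecewise constant approximation that replaces each equal-width frame of the series with its mean. It is offered only as an illustration of the general idea, not as the chapter's own method. The function name `piecewise_means` and the sample values are hypothetical, and the sketch assumes the number of segments divides the series length evenly.

```python
from typing import List, Sequence


def piecewise_means(series: Sequence[float], n_segments: int) -> List[float]:
    """Reduce a series to n_segments values by averaging equal-width frames.

    Illustrative helper (hypothetical name): a piecewise constant
    approximation, i.e. a degree-0 piecewise polynomial representation.
    Assumes n_segments divides len(series) evenly.
    """
    n = len(series)
    if n_segments <= 0 or n % n_segments != 0:
        raise ValueError("n_segments must be positive and divide len(series)")
    width = n // n_segments
    # Each frame of `width` consecutive points collapses to its mean.
    return [sum(series[i:i + width]) / width for i in range(0, n, width)]


# Eight raw points are summarized by two values, a 4x reduction.
raw = [1.0, 3.0, 2.0, 4.0, 10.0, 12.0, 11.0, 13.0]
reduced = piecewise_means(raw, 2)
print(reduced)  # [2.5, 11.5]
```

The reduced series keeps the coarse shape of the data (a low level followed by a high level) while discarding point-to-point variation, which is the kind of trade-off that makes storage and computation on massive datasets more tractable.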

Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Ratanamahatana, C.A., Lin, J., Gunopulos, D., Keogh, E., Vlachos, M., Das, G. (2009). Mining Time Series Data. In: Maimon, O., Rokach, L. (eds) Data Mining and Knowledge Discovery Handbook. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09823-4_56

  • DOI: https://doi.org/10.1007/978-0-387-09823-4_56

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-09822-7

  • Online ISBN: 978-0-387-09823-4

  • eBook Packages: Computer Science (R0)
