The VLDB Journal, Volume 15, Issue 1, pp 1–20

Indexing Multidimensional Time-Series

  • Michail Vlachos
  • Marios Hadjieleftheriou
  • Dimitrios Gunopulos
  • Eamonn Keogh
Regular Paper

Abstract

While most time series data mining research has concentrated on providing solutions for a single distance function, in this work we motivate the need for an index structure that can support multiple distance measures. Our specific area of interest is the efficient retrieval and analysis of similar trajectories. Trajectory datasets are very common in environmental applications, mobility experiments, and video surveillance and are especially important for the discovery of certain biological patterns. Our primary similarity measure is based on the longest common subsequence (LCSS) model that offers enhanced robustness, particularly for noisy data, which are encountered very often in real-world applications. However, our index is able to accommodate other distance measures as well, including the ubiquitous Euclidean distance and the increasingly popular dynamic time warping (DTW). While other researchers have advocated one or other of these similarity measures, a major contribution of our work is the ability to support all these measures without the need to restructure the index. Our framework guarantees no false dismissals and can also be tailored to provide much faster response time at the expense of slightly reduced precision/recall. The experimental results demonstrate that our index can help speed up the computation of expensive similarity measures such as the LCSS and the DTW.
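To make the three measures named in the abstract concrete, the following is a minimal sketch of each on small 2-D trajectories. It uses common textbook formulations and is not the authors' implementation or index. The function names, the matching threshold `eps`, and the time window `delta` are assumptions introduced here for illustration, since the abstract does not define them.

```python
from math import dist, inf

def euclidean(a, b):
    """Euclidean distance between two equal-length trajectories of points."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length trajectories")
    return sum(dist(p, q) ** 2 for p, q in zip(a, b)) ** 0.5

def dtw(a, b):
    """Unconstrained dynamic time warping cost, by dynamic programming."""
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def lcss(a, b, eps, delta):
    """Length of the longest common subsequence of two trajectories.

    Two points match when every coordinate differs by at most eps and
    their positions differ by at most delta (both assumed parameters).
    """
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            close = all(abs(x - y) <= eps for x, y in zip(a[i - 1], b[j - 1]))
            if close and abs(i - j) <= delta:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Unmatched points are skipped rather than penalised.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]

def lcss_similarity(a, b, eps, delta):
    """LCSS length normalised by the shorter trajectory, in [0, 1]."""
    return lcss(a, b, eps, delta) / min(len(a), len(b))

if __name__ == "__main__":
    clean = [(0, 0), (1, 1), (2, 2), (3, 3)]
    noisy = [(0, 0), (1, 1), (50, 50), (3, 3)]  # one outlier sample
    print(euclidean(clean, noisy))                # dominated by the outlier
    print(dtw(clean, noisy))                      # also pays for the outlier
    print(lcss_similarity(clean, noisy, 0.5, 1))  # outlier is simply skipped
```

In this toy example the single outlier dominates the Euclidean and DTW values, whereas LCSS leaves the outlying point unmatched. That is the kind of robustness to noise the abstract attributes to the LCSS model. All three functions run in quadratic or linear time per pair, which is why an index that prunes candidates before these computations is attractive.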

Keywords

Ensemble index, Longest common subsequence, Dynamic time warping, Trajectories, Motion capture

Copyright information

© Springer-Verlag 2006

Authors and Affiliations

  • Michail Vlachos (1)
  • Marios Hadjieleftheriou (2)
  • Dimitrios Gunopulos (2)
  • Eamonn Keogh (2)

  1. IBM T.J. Watson Research Center, Hawthorne, USA
  2. Computer Science Department, University of California, Riverside, USA
