Correlation Analysis of Spatial Time Series Datasets: A Filter-and-Refine Approach

  • Pusheng Zhang
  • Yan Huang
  • Shashi Shekhar
  • Vipin Kumar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2637)

Abstract

A spatial time series dataset is a collection of time series, each referencing a location in a common spatial framework. Correlation analysis is often used to identify pairs of potentially interacting elements from the cross product of two spatial time series datasets. However, the computational cost of correlation analysis is very high when the dimensionality of the time series and the number of locations in the spatial frameworks are large. The key contribution of this paper is the use of spatial autocorrelation among spatially neighboring time series to reduce this computational cost. A filter-and-refine algorithm based on coning, i.e., grouping of locations, is proposed to reduce the cost of correlation analysis over a pair of spatial time series datasets. Cone-level correlation computation can be used to eliminate (filter out) a large number of element pairs whose correlation is clearly below (or above) a given threshold. Element-pair correlation then needs to be computed only for the remaining pairs. Using experimental studies with Earth science datasets, we show that the filter-and-refine approach can save a large fraction of the computational cost, particularly when the minimal correlation threshold is high.
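The filter-and-refine idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cone representation (a center direction plus a covering half-angle), the function names, and the input layout (lists of `(indices, series)` groups) are assumptions made for illustration only. The sketch relies on the standard fact that, once each series is centered and scaled to unit length, the Pearson correlation of two series equals the cosine of the angle between them, so angular bounds on two cones give correlation bounds for all their member pairs.

```python
import numpy as np


def normalize(series):
    """Center each row and scale it to unit length.

    After this step the Pearson correlation of two rows is their dot
    product, i.e. the cosine of the angle between them.
    Assumes no row is constant (zero variance).
    """
    centered = series - series.mean(axis=1, keepdims=True)
    return centered / np.linalg.norm(centered, axis=1, keepdims=True)


def make_cone(unit_rows):
    """Summarize a group of unit vectors as (center, half_angle).

    Hypothetical representation: the center is the normalized mean
    direction and half_angle is the largest angle from it to any member.
    Assumes the members do not cancel out to a zero mean.
    """
    center = unit_rows.mean(axis=0)
    center = center / np.linalg.norm(center)
    angles = np.arccos(np.clip(unit_rows @ center, -1.0, 1.0))
    return center, float(angles.max())


def correlation_bounds(cone_a, cone_b):
    """Bound the correlation of every (member of a, member of b) pair."""
    (center_a, half_a), (center_b, half_b) = cone_a, cone_b
    between = float(np.arccos(np.clip(center_a @ center_b, -1.0, 1.0)))
    # Triangle inequality for angles on the unit sphere.
    angle_min = max(0.0, between - half_a - half_b)
    angle_max = min(np.pi, between + half_a + half_b)
    # Cosine decreases on [0, pi], so the largest angle gives the lower bound.
    return float(np.cos(angle_max)), float(np.cos(angle_min))


def filter_and_refine(groups_a, groups_b, threshold):
    """Return (pairs, refined) for pairs with correlation >= threshold.

    groups_a / groups_b: lists of (indices, raw_series), one per group
    of neighboring locations. pairs is a set of (index_a, index_b);
    refined counts the element-pair correlations computed explicitly.
    """
    units_a = [(idx, normalize(rows)) for idx, rows in groups_a]
    units_b = [(idx, normalize(rows)) for idx, rows in groups_b]
    cones_a = [make_cone(u) for _, u in units_a]
    cones_b = [make_cone(u) for _, u in units_b]
    pairs, refined = set(), 0
    for (idx_a, ua), cone_a in zip(units_a, cones_a):
        for (idx_b, ub), cone_b in zip(units_b, cones_b):
            low, high = correlation_bounds(cone_a, cone_b)
            if high < threshold:
                continue  # filtered out: no pair can reach the threshold
            if low >= threshold:
                # filtered in: every pair qualifies, nothing to compute
                pairs.update((i, j) for i in idx_a for j in idx_b)
                continue
            # refine: the bounds straddle the threshold, so check each pair
            corr = ua @ ub.T
            refined += corr.size
            for r, c in zip(*np.nonzero(corr >= threshold)):
                pairs.add((idx_a[r], idx_b[c]))
    return pairs, refined
```

In this sketch the saving comes from the two early exits: a cone pair whose bounds fall entirely on one side of the threshold is resolved without touching its members. Tighter cones (strongly autocorrelated neighbors) produce narrower bounds and therefore fewer refinements, which is consistent with the abstract's observation that savings grow when the threshold is high.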

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Pusheng Zhang (1)
  • Yan Huang (1)
  • Shashi Shekhar (1)
  • Vipin Kumar (1)
  1. Computer Science & Engineering Department, University of Minnesota, Minneapolis, USA