Spatio-temporal Outlier Detection in Precipitation Data

  • Elizabeth Wu
  • Wei Liu
  • Sanjay Chawla
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5840)

Abstract

The detection of outliers from spatio-temporal data is an important task due to the increasing amount of spatio-temporal data available and the need to understand and interpret it. Due to the limitations of current data mining techniques, new techniques to handle this data need to be developed. We propose a spatio-temporal outlier detection algorithm called Outstretch, which discovers the outlier movement patterns of the top-k spatial outliers over several time periods. The top-k spatial outliers are found using the Exact-Grid Top-k and Approx-Grid Top-k algorithms, which extend algorithms developed by Agarwal et al. [1]. Since they use the Kulldorff spatial scan statistic, they are capable of discovering all outliers, unaffected by neighbouring regions that may contain missing values. After generating the outlier sequences, we show one way they can be interpreted, by comparing them to the phases of the El Niño Southern Oscillation (ENSO) weather phenomenon to provide a meaningful analysis of the results.
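
To make the role of the Kulldorff spatial scan statistic concrete, the following is a minimal brute-force sketch in Python, not the Exact-Grid Top-k or Approx-Grid Top-k algorithms themselves. It assumes the Poisson form of the Kulldorff discrepancy as formulated by Agarwal et al. [1], where m_R and b_R are the fractions of the measured and baseline totals that fall inside a rectangular region R. The function names (kulldorff, prefix_sums, rect_sum, top_k_regions) and the two-grid input layout are hypothetical and chosen only for illustration. Handling of missing values and of overlap between the reported regions is left out.

```python
import heapq
import math


def kulldorff(m_r, b_r):
    """Kulldorff discrepancy for measured fraction m_r and baseline fraction b_r."""
    # Only regions holding more than their expected share are scored;
    # a baseline fraction of 0 or 1 is treated as degenerate.
    if not (0.0 < b_r < 1.0) or m_r <= b_r:
        return 0.0

    def term(p, q):
        # p * log(p / q), with the convention that 0 * log(0) = 0.
        return 0.0 if p <= 0.0 else p * math.log(p / q)

    return term(m_r, b_r) + term(1.0 - m_r, 1.0 - b_r)


def prefix_sums(grid):
    """2-D prefix sums so that any rectangle sum costs O(1)."""
    rows, cols = len(grid), len(grid[0])
    ps = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            ps[i + 1][j + 1] = grid[i][j] + ps[i][j + 1] + ps[i + 1][j] - ps[i][j]
    return ps


def rect_sum(ps, r0, c0, r1, c1):
    """Sum over rows r0..r1 and columns c0..c1 (inclusive)."""
    return ps[r1 + 1][c1 + 1] - ps[r0][c1 + 1] - ps[r1 + 1][c0] + ps[r0][c0]


def top_k_regions(measured, baseline, k):
    """Score every axis-aligned rectangle and return the k highest.

    Returns a list of (score, (r0, c0, r1, c1)) tuples, best first.
    This exhaustive search is a hypothetical illustration only.
    """
    rows, cols = len(measured), len(measured[0])
    pm, pb = prefix_sums(measured), prefix_sums(baseline)
    total_m, total_b = pm[rows][cols], pb[rows][cols]
    scored = []
    for r0 in range(rows):
        for r1 in range(r0, rows):
            for c0 in range(cols):
                for c1 in range(c0, cols):
                    m_r = rect_sum(pm, r0, c0, r1, c1) / total_m
                    b_r = rect_sum(pb, r0, c0, r1, c1) / total_b
                    scored.append((kulldorff(m_r, b_r), (r0, c0, r1, c1)))
    return heapq.nlargest(k, scored)
```

Calling top_k_regions(measured, baseline, k) on two equally sized grids returns the k best-scoring rectangles for one time period. The sketch enumerates every rectangle, so it is only practical for small grids, and it stops short of linking regions across time periods into outlier sequences.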

Keywords

Spatio-Temporal, Data Mining, Outlier Detection, South America, Precipitation Extremes

References

  1. Agarwal, D., Phillips, J.M., Venkatasubramanian, S.: The Hunting of the Bump: On Maximizing Statistical Discrepancy. In: Proceedings 17th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1137–1146. ACM Press, New York (2006)
  2. Han, J., Altman, R.B., Kumar, V., Mannila, H., Pregibon, D.: Emerging Scientific Applications in Data Mining. Communications of the ACM 45(8), 54–58 (2002)
  3. Miller, H.J.: Geographic Data Mining and Knowledge Discovery. In: Wilson, Fotheringham, A.S. (eds.) Handbook of Geographic Information Science (2007)
  4. Miller, H.J., Han, J.: Geographic Data Mining and Knowledge Discovery: An Overview. In: Geographic Data Mining and Knowledge Discovery. Taylor & Francis, New York (2001)
  5. Openshaw, S.: Geographical Data Mining: Key Design Issues. In: Proceedings of GeoComputation 1999 (1999)
  6. Liebmann, B., Allured, D.: Daily precipitation grids for South America. Bulletin of the American Meteorological Society 86, 1567–1570 (2005)
  7. D'Aleo, J.S., Grube, P.G.: The Oryx Resource Guide to El Niño and La Niña. Oryx Press, CT (2002)
  8. National Oceanic and Atmospheric Administration (NOAA) Climate Prediction Center: Monthly Atmospheric & SST Indices, http://www.cpc.noaa.gov/data/indices (Accessed February 2, 2008)
  9. Wu, E., Chawla, S.: Spatio-Temporal Analysis of the relationship between South American Precipitation Extremes and the El Niño Southern Oscillation. In: Proceedings of the 2007 International Workshop on Spatial and Spatio-temporal Data Mining. IEEE Computer Society, Washington (2007)
  10. Agarwal, D., McGregor, A., Phillips, J.M., Venkatasubramanian, S., Zhu, Z.: Spatial Scan Statistics: Approximations and Performance Study. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 24–33. ACM Press, New York (2006)
  11. Cheng, T., Li, Z.: A Multiscale Approach for Spatio-Temporal Outlier Detection. Transactions in GIS 10(2), 253–263 (2006)
  12. Birant, D., Kut, A.: Spatio-temporal outlier detection in large databases. In: 28th International Conference on Information Technology Interfaces, pp. 179–184 (2006)
  13. Ng, R.: Detecting Outliers from Large Datasets. In: Geographic Data Mining and Knowledge Discovery, pp. 218–235. Taylor & Francis, New York (2001)
  14. Theodoridis, Y., Silva, J.R.O., Nascimento, M.A.: On the Generation of Spatiotemporal Datasets. In: Proceedings of the 6th International Symposium on Advances in Spatial Databases, pp. 147–164. Springer, London (1999)
  15. Kulldorff, M.: A Spatial Scan Statistic. Communications in Statistics - Theory and Methods 26, 1481–1496 (1997)
  16. Iyengar, V.S.: On Detecting Space-Time Clusters. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 587–592. ACM, New York (2004)
  17. Chawla, S., Shekhar, S., Wu, W., Ozesmi, U.: Modelling Spatial Dependencies for Mining Geospatial Data: An Introduction. In: Geographic Data Mining and Knowledge Discovery, pp. 131–159. Taylor & Francis, New York (2001)
  18. Tobler, W.R.: A computer model simulation of urban growth in the Detroit region. Economic Geography 46(2), 234–240 (1970)
  19. Redmond, K.: Classification of El Niño and La Niña Winters, http://www.wrcc.dri.edu/enso/ensodef.html (Accessed October 24, 2008)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Elizabeth Wu (1)
  • Wei Liu (1)
  • Sanjay Chawla (1)

  1. School of Information Technologies, The University of Sydney, Sydney, Australia
