Detecting Extreme Events from Climate Time Series via Topic Modeling

  • Conference paper

Abstract

We propose a topic-model-based approach to define and detect patterns that correspond to extreme climate-related events in different regions of the globe, using time series data of various climate variables. While topic models are popular in fields such as natural language processing, bioinformatics, and computer vision, we are unaware of any application of them to modeling climate extremes. Inference from our model can be used to construct climate extreme indices, predict disastrous extreme events such as droughts and floods, and understand the influence of climate change on climate extremes.

Keywords

  • Climate extremes
  • Extreme events
  • Topic modeling
  • Latent Dirichlet allocation
  • Unsupervised learning


Notes

  1.

    As an example, each variable over a time span s can be discretized into "too low", "normal", or "too high" according to its deviation from its typical value (the mean), calculated over a longer time epoch E for a geographical region l. Section 19.3 describes how we obtain I_n for each month of E at each geo-location from the climate data.

  2.

    Richer model structures were added later in order to encode bias from our knowledge of the data and the problem. See Zhu and Xing (2010), Agovic and Banerjee (2012), Hennig et al. (2012), and Blei and McAuliffe (2007).

  3.

    Both the generalized extreme value (GEV) distribution and the generalized Pareto distribution (GPD) have three specific realizations (Gumbel, Fréchet, and Weibull), determined by their shape parameter.

  4.

    The data grid has size 144 (longitude) by 73 (latitude).
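
The discretization described in Note 1 can be illustrated with a minimal sketch. The notes do not say how far a value must deviate from the epoch mean to count as "too low" or "too high", so the threshold `k` (in units of the epoch standard deviation), the function name `discretize`, and the sample values below are all assumptions made for illustration, not the authors' procedure.

```python
from statistics import mean, pstdev


def discretize(span_values, epoch_values, k=1.0):
    """Label each value of a time span relative to the epoch mean.

    span_values:  values of one climate variable over the time span s
    epoch_values: values of the same variable over the longer epoch E
    k:            hypothetical threshold, in epoch standard deviations
    """
    mu = mean(epoch_values)
    sigma = pstdev(epoch_values)  # population standard deviation over E
    labels = []
    for v in span_values:
        if v < mu - k * sigma:
            labels.append("too low")
        elif v > mu + k * sigma:
            labels.append("too high")
        else:
            labels.append("normal")
    return labels


# Made-up values for one variable at one grid location.
epoch = [10.0, 12.0, 14.0, 16.0, 18.0]
print(discretize([9.0, 14.0, 19.0], epoch))
```

In practice the same labeling would be repeated for every variable and every cell of the grid; a different rule for "typical value" or for the cut-offs (for example quantiles of E) would fit the description in Note 1 equally well.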

References

  • Agovic A, Banerjee A (2012) Gaussian process topic models. In: Uncertainty in Artificial Intelligence (UAI), 2010. CoRR. abs/1203.3462

  • Beirlant J, Goegebeur Y, Segers J, Teugels J, De Waal D, Ferro C (2004) Statistics of extremes: theory and applications. Wiley series in probability and statistics. Wiley, Hoboken

  • Blei DM (2012) Probabilistic topic models. Commun ACM 55(4):77–84

  • Blei DM, McAuliffe JD (2007) Supervised topic models. In: Advances in neural information processing systems 20, Proceedings of the twenty-first annual conference on neural information processing systems, Vancouver, 3–6 Dec 2007

  • Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022

  • Cook KH (2008) Climate science: the mysteries of Sahel droughts. Nat Geosci 1(10):647–648

  • Dai A, Trenberth KE, Qian T (2004) A global dataset of Palmer drought severity index for 1870–2002: relationship with soil moisture and effects of surface warming. J Hydrometeorol 5:1117–1130

  • Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing by latent semantic analysis. J Am Soc Inf Sci 41(6):391–407

  • Dirmeyer PA, Shukla J (1996) The effect on regional and global climate of expansion of the world’s deserts. Q J R Meteorol Soc 122(530):451–482

  • Fu Q, Banerjee A, Liess S, Snyder PK (2012) Drought detection of the last century: an MRF-based approach. In: SIAM SDM, Anaheim, pp 24–34

  • Rekatsinas T, Ghosh S, Mekaru SR, Nsoesie EO, Brownstein JS, Getoor L, Ramakrishnan N (2013) Forecasting rare disease outbreaks using multiple data sources. In: SIAM International Conference on Data Mining (SDM), 2015, NIPS 2013 workshop on topic models

  • Griffiths T, Steyvers M (2004) Finding scientific topics. Proc Natl Acad Sci 101:5228–5235

  • Gumbel EJ (1954) Statistical theory of extreme values and some practical applications: a series of lectures. Applied mathematics series. U.S. Govt. Print. Office, Washington DC

  • Heffernan JE, Tawn JA (2004) A conditional approach for multivariate extreme values. J R Stat Soc B 66:497–547

  • Hennig P, Stern DH, Herbrich R, Graepel T (2012) Kernel topic models. In: Proceedings of the fifteenth international conference on artificial intelligence and statistics, AISTATS 2012, La Palma, pp 511–519, 21–23 April 2012

  • Liu Y, Bahadori MT, Li H (2012) Sparse-GEV: sparse latent space model for multivariate extreme value time series modeling. In: Proceedings of the 29th international conference on machine learning, ICML 2012, Edinburgh, June 26–July 1 2012

  • Managing the risks of extreme events and disasters to advance climate change adaptation. Special Report of the IPCC (2012)

  • Mimno DM, McCallum A (2012) Topic models conditioned on arbitrary features with Dirichlet-multinomial regression. In: CoRR. UAI, 2008, abs/1206.3278

  • Monteleoni C et al (2013) Climate Informatics, chapter 4, pp 81–126

  • Papadimitriou CH, Raghavan P, Tamaki H, Vempala S (2000) Latent semantic indexing: a probabilistic analysis. J Comput Syst Sci 61(2):217–235

  • Scheffer M, Holmgren M, Brovkin V, Claussen M (2005) Synergy between small- and large-scale feedbacks of vegetation on the water cycle. Glob Chang Biol 11:1003–1012

  • Schubert SD, Suarez MJ, Pegion PJ, Koster RD, Bacmeister JT (2004) On the cause of the 1930s Dust Bowl. Science 303(5665):1855–1859

  • Steinbach M, Tan P-N, Kumar V, Klooster SA, Potter C (2003) Discovery of climate indices using clustering. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, Washington DC, pp 446–455, 24–27 Aug 2003

  • Steinhaeuser K, Chawla NV, Ganguly AR (2011) Comparing predictive power in climate data: clustering matters. In: Advances in spatial and temporal databases – 12th international symposium, SSTD 2011, Proceedings, Minneapolis, 24–26 Aug 2011, pp 39–55

  • Wallach HM, Murray I, Salakhutdinov R, Mimno DM (2009) Evaluation methods for topic models. In: Proceedings of the 26th Annual international conference on machine learning, ICML 2009, Montreal, pp 1105–1112, 14–18 June 2009

  • World climate research programme: Grand challenges (2013)

  • Zhu J, Xing EP (2010) Conditional topic random fields. In: Proceedings of the 27th international conference on machine learning (ICML-10), Haifa, pp 1239–1246, 21–24 June 2010

  • http://www.esrl.noaa.gov/psd/

Corresponding author

Correspondence to Cheng Tang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Tang, C., Monteleoni, C. (2015). Detecting Extreme Events from Climate Time Series via Topic Modeling. In: Lakshmanan, V., Gilleland, E., McGovern, A., Tingley, M. (eds) Machine Learning and Data Mining Approaches to Climate Science. Springer, Cham. https://doi.org/10.1007/978-3-319-17220-0_19
