Data Mining and Knowledge Discovery, Volume 27, Issue 2, pp 225-258

Open Access: This content is freely available online to anyone, anywhere, at any time.

Discovery of extreme events-related communities in contrasting groups of physical system networks

  • Zhengzhang Chen (North Carolina State University; Oak Ridge National Laboratory)
  • William Hendrix (North Carolina State University)
  • Hang Guan (Zhejiang University)
  • Isaac K. Tetteh (North Carolina State University)
  • Alok Choudhary (Northwestern University)
  • Fredrick Semazzi (North Carolina State University)
  • Nagiza F. Samatova (North Carolina State University; Oak Ridge National Laboratory)


The latent behavior of a physical system that can exhibit extreme events, such as hurricanes or rainfall, is complex. Complex networks have recently emerged as a promising means of studying such systems. Networks representing relationships between individual objects usually exhibit community dynamics. Conventional community detection methods mainly focus on either mining frequent subgraphs in a network or detecting stable communities in time-varying networks. In this paper, we formulate a novel problem, the detection of predictive and phase-biased communities in contrasting groups of networks, and propose an efficient and effective machine learning solution for finding such anomalous communities. We build different groups of networks corresponding to different phases of the system, such as high or low hurricane activity; discover phase-related system components to serve as seeds that bound the search space of community generation in each network; and use the proposed contrast-based technique to identify the communities that change across groups. The detected anomalous communities are hypothesized (1) to play an important role in defining the target system’s state(s) and (2) to improve the predictive skill for the system’s states when used collectively in an ensemble of predictive models. When tested on two important extreme event problems, the identification of tropical cyclone-related and of African Sahel rainfall-related climate indices, our algorithm demonstrated superior performance in terms of various skill and robustness metrics, including an 8–16% increase in accuracy, as well as physical interpretability of the detected communities. The experimental results also show the efficiency of our algorithm on synthetic datasets.
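
The idea of contrasting groups of networks can be illustrated with a toy sketch. This is not the paper's algorithm: it omits seed discovery and the predictive ensemble, and substitutes a simple edge-support comparison for the proposed contrast-based technique. The node names (`i1`, `i2`, `i3`), the correlation threshold, the `min_gap` parameter, and all data values are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (sx * sy)


def build_network(series, threshold=0.8):
    """Edges (u, v) whose absolute correlation meets the (assumed) threshold."""
    return {(u, v) for u, v in combinations(sorted(series), 2)
            if abs(pearson(series[u], series[v])) >= threshold}


def edge_support(networks):
    """Fraction of the networks in one group that contain each edge."""
    counts = {}
    for edges in networks:
        for edge in edges:
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: c / len(networks) for edge, c in counts.items()}


def contrast_edges(group_a, group_b, min_gap=0.5):
    """Support difference (A minus B) for edges that differ by at least min_gap."""
    sa, sb = edge_support(group_a), edge_support(group_b)
    gaps = {e: sa.get(e, 0.0) - sb.get(e, 0.0) for e in set(sa) | set(sb)}
    return {e: g for e, g in gaps.items() if abs(g) >= min_gap}


def communities(edges):
    """Connected components of an edge set, each as a sorted list of nodes."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, result = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        result.append(sorted(comp))
    return result


# Hypothetical data: one dict of index time series per observation period.
high_phase = [
    {"i1": [1, 2, 3, 4], "i2": [2, 4, 6, 8], "i3": [1, -1, 1, -1]},
    {"i1": [4, 3, 2, 1], "i2": [8, 6, 4, 2], "i3": [1, -1, -1, 1]},
]
low_phase = [
    {"i1": [1, 2, 3, 4], "i2": [1, -1, 1, -1], "i3": [-1, 1, -1, 1]},
    {"i1": [1, 2, 3, 4], "i2": [1, -1, -1, 1], "i3": [2, -2, -2, 2]},
]

high_nets = [build_network(s) for s in high_phase]
low_nets = [build_network(s) for s in low_phase]

contrast = contrast_edges(high_nets, low_nets)
high_biased = communities([e for e, g in contrast.items() if g > 0])
low_biased = communities([e for e, g in contrast.items() if g < 0])
print(high_biased, low_biased)
```

In this toy setting, an edge that appears in every network of one phase and in none of the other receives the largest possible support gap, and the connected components of such edges play the role of phase-biased candidate communities.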


Keywords: Spatio-temporal data mining; Complex network analysis; Community detection; Comparative analysis; Network motif detection; Extreme event prediction