Identifying Developing Cloud Clusters Using Predictive Features


Forecasters need better data-driven techniques, based on feature extraction, for determining whether a loosely organized cluster of clouds will develop into a tropical cyclone. Prior studies have attempted to predict tropical cyclone formation using numerical weather prediction models together with satellite and radar data. However, refined observational data and forecasting techniques are not always available or accurate in data-sparse areas such as the North Atlantic Ocean. In response, this research investigates the predictive features that contribute to a cloud cluster developing into a tropical cyclone without using dynamic models. Instead, it uses only global gridded satellite data, which are readily available. Classifying cloud clusters is generally an imbalanced problem, since non-developing cloud clusters outnumber developing ones. Imbalanced data are a major source of poor performance when learning about rare events. To address this issue, the cloud cluster feature dataset is balanced by applying the Selective Clustering based Oversampling Technique (SCOT), which addresses data imbalance in a selective manner and can be used in many applications. The predictive features are identified by how well a standard classifier separates developing from non-developing cloud clusters in the balanced feature dataset. A feature set is accepted as predictive only if the classification yields a geometric mean of at least 80% and a Heidke Skill Score of at least 0.8.
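As an illustration of the acceptance criterion, the two scores can be computed from a 2x2 contingency table. The sketch below is not the authors' code: it assumes the common definitions (geometric mean as the square root of sensitivity times specificity, and the standard two-class Heidke Skill Score), and the function names and the example counts are hypothetical.

```python
import math


def g_mean(tp, fn, fp, tn):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    pos, neg = tp + fn, tn + fp
    if pos == 0 or neg == 0:
        return 0.0  # undefined without both classes; treated as no skill here
    sensitivity = tp / pos  # fraction of developing clusters detected
    specificity = tn / neg  # fraction of non-developing clusters rejected
    return math.sqrt(sensitivity * specificity)


def heidke_skill_score(tp, fn, fp, tn):
    """Two-class Heidke Skill Score: accuracy relative to random chance."""
    numerator = 2 * (tp * tn - fp * fn)
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    if denominator == 0:
        return 0.0  # degenerate table; treated as no skill here
    return numerator / denominator


def meets_thresholds(tp, fn, fp, tn, g_min=0.80, hss_min=0.8):
    """Check both thresholds quoted in the abstract (80% and 0.8)."""
    return (g_mean(tp, fn, fp, tn) >= g_min
            and heidke_skill_score(tp, fn, fp, tn) >= hss_min)


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not results from the study.
    counts = dict(tp=95, fn=5, fp=5, tn=95)
    print(g_mean(**counts), heidke_skill_score(**counts), meets_thresholds(**counts))
```

Requiring both scores guards against a classifier that looks accurate only because it favors the majority (non-developing) class: the geometric mean drops when either class is poorly recognized, and the Heidke Skill Score discounts agreement expected by chance.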


Keywords: Feature extraction, Imbalanced data, Oversampling, Tropical cyclone



This work is partially supported by the National Science Foundation's Expeditions in Computing program under Award CCF-1029731.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, North Carolina Agricultural and Technical State University, Greensboro, USA
