Conjunction of hard k-mean and fuzzy c-mean techniques in clustering and identifying some critical meteorological parameters for thunderstorm formation over a metro city of India during pre-monsoon season

Journal of Earth System Science

Abstract

Among the metro cities in India, Kolkata is chosen incidentally for the present study. The study aims at clustering the pre-monsoon days of the urban area of Kolkata (22°32′N, 88°20′E), India, into two groups (thunderstorm days, denoted by TS, and non-thunderstorm days, denoted by NTS in the literature) using the hard k-means technique, a backward selection procedure and the fuzzy c-means (FCM) algorithm. Various thermodynamic and dynamic parameters that have already been identified by several scientists as responsible for thunderstorm formation are considered here for different atmospheric layers up to 500 hPa. The study is performed in two stages. In the first stage, the hard k-means technique is applied to cluster the days of a semi-supervised dataset into the two categories mentioned above. The backward selection procedure is then used to find, on the basis of the performance score (PC), the best possible combination of the theoretically influential atmospheric parameters considered in the study, that is, the combination that plays the dominant role in the categorization. In the second stage, the fuzzy c-means algorithm is applied to the same semi-supervised dataset of parameters to clarify the results obtained in the first stage. The study is performed separately for the morning (0000 UTC) and afternoon (1200 UTC) atmosphere, as it has already been shown that there is a structural difference between the morning and afternoon atmosphere of Kolkata, India. In the first stage, the 0000 UTC (morning) data for thunderstorm and non-thunderstorm days reveal that the combination of maximum vertical velocity and PPLCL at the 1000 hPa level performs better in detecting pre-monsoon thunderstorm days, whereas the 1200 UTC (afternoon) data show that vertical wind speed shear for the (1000–850) hPa layer, maximum vertical velocity, PPLCL at the 1000 hPa level and (θes − θe) at the 850 hPa level perform better in detecting thunderstorm days. It is interesting to note that these findings are supported by FCM in the second stage for both the morning and the afternoon atmosphere.
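
As a purely illustrative sketch of the two clustering steps named in the abstract, the Python below implements a minimal hard k-means (with a pluggable distance, Euclidean or Manhattan) and a minimal fuzzy c-means using the standard membership and centre updates. The eight two-feature "days", the initial centres, the fuzzifier m = 2 and all function names are assumptions made for this example; they are not the study's data, settings or code. The centroid is updated as the arithmetic mean under both distances, which is a simplification in the Manhattan case.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def hard_kmeans(data, init_centers, dist=euclidean, iters=100):
    """Hard clustering: every day is assigned to exactly one cluster."""
    centers = [list(c) for c in init_centers]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
                      for p in data]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # simplification: mean update for either metric
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def fcm_memberships(data, centers, m):
    """Membership of each point in each cluster; every row sums to 1."""
    u = []
    for p in data:
        d = [max(euclidean(p, c), 1e-12) for c in centers]  # avoid division by zero
        u.append([1.0 / sum((di / dj) ** (2.0 / (m - 1.0)) for dj in d) for di in d])
    return u

def fuzzy_cmeans(data, init_centers, m=2.0, iters=100, tol=1e-8):
    """Alternate membership and centre updates until the centres stop moving."""
    centers = [list(c) for c in init_centers]
    for _ in range(iters):
        u = fcm_memberships(data, centers, m)
        new_centers = []
        for i in range(len(centers)):
            w = [row[i] ** m for row in u]  # weights u_ki ** m
            new_centers.append([sum(wk * p[f] for wk, p in zip(w, data)) / sum(w)
                                for f in range(len(data[0]))])
        shift = max(euclidean(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return fcm_memberships(data, centers, m), centers

# Synthetic, standardized two-parameter "days" (hypothetical values only).
days = [[0.1, 0.2], [0.0, -0.1], [-0.2, 0.1], [0.2, 0.0],
        [2.0, 2.1], [1.9, 1.8], [2.2, 2.0], [2.1, 2.2]]
seeds = [days[0], days[4]]  # assumed initial centres

labels_euclid, _ = hard_kmeans(days, seeds, dist=euclidean)
labels_manhattan, _ = hard_kmeans(days, seeds, dist=manhattan)
memberships, fcm_centers = fuzzy_cmeans(days, seeds)
for lab, row in zip(labels_euclid, memberships):
    print(lab, [round(v, 3) for v in row])
```

On well-separated toy data such as this, the hard labels and the largest fuzzy memberships coincide; the graded membership values are what would allow borderline days to be examined more closely.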

Research highlights

  • The first stage of the study reveals that, of the two distance metrics used with the hard k-means clustering technique, Euclidean distance produces more stable results than Manhattan distance.

  • In the second stage, the fuzzy c-means algorithm helps to clarify the reasons behind the results obtained in the first stage.

  • In this study, proportion correct, the HK skill score and the backward selection procedure indicate that the combination of maximum vertical velocity and PPLCL furnishes better results in the morning, whereas in the afternoon, vertical wind speed shear (1000–850 hPa) and (θes − θe) at 850 hPa, along with the above two parameters, also give better results for categorizing the pre-monsoon days of Kolkata (India) into two groups, thunderstorm and non-thunderstorm.
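
The scoring and selection steps mentioned above can be sketched as follows. Proportion correct (PC) and the Hanssen-Kuipers (HK) skill score are computed from a 2×2 contingency table using their standard definitions. The backward selection loop, its stopping rule (stop when no single removal raises PC), the small 2-means helper and the synthetic three-parameter dataset are all assumptions made for illustration; they are not the authors' procedure or data.

```python
import math

def contingency(observed, predicted):
    """2x2 table: a hits, b false alarms, c misses, d correct negatives."""
    pairs = list(zip(observed, predicted))
    a = sum(1 for o, p in pairs if o == 1 and p == 1)
    b = sum(1 for o, p in pairs if o == 0 and p == 1)
    c = sum(1 for o, p in pairs if o == 1 and p == 0)
    d = sum(1 for o, p in pairs if o == 0 and p == 0)
    return a, b, c, d

def proportion_correct(a, b, c, d):
    return (a + d) / (a + b + c + d)

def hk_score(a, b, c, d):
    # Hit rate minus false-alarm rate; needs both classes among the observations.
    return a / (a + c) - b / (b + d)

def two_means(rows, iters=50):
    """Tiny hard 2-means (Euclidean), seeded with the first and last rows."""
    centers = [list(rows[0]), list(rows[-1])]
    labels = []
    for _ in range(iters):
        labels = [0 if math.dist(r, centers[0]) <= math.dist(r, centers[1]) else 1
                  for r in rows]
        for j in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def evaluate(data, observed, features):
    """Cluster on the chosen parameter columns and return (PC, HK)."""
    rows = [[row[f] for f in features] for row in data]
    labels = two_means(rows)
    flipped = [1 - lab for lab in labels]
    # Cluster ids are arbitrary, so keep the orientation with the higher PC.
    pred = max(labels, flipped,
               key=lambda p: proportion_correct(*contingency(observed, p)))
    table = contingency(observed, pred)
    return proportion_correct(*table), hk_score(*table)

def backward_selection(data, observed, features):
    """Drop one parameter at a time while doing so raises PC (assumed rule)."""
    current = list(features)
    best_pc = evaluate(data, observed, current)[0]
    while len(current) > 1:
        trials = [(evaluate(data, observed, [f for f in current if f != drop])[0], drop)
                  for drop in current]
        trial_pc, drop = max(trials)
        if trial_pc <= best_pc:  # no removal improves the score
            break
        best_pc = trial_pc
        current.remove(drop)
    return current

# Synthetic data: columns 0 and 1 separate the groups, column 2 is pure noise.
data = [[0.0, 0.1, 3.0], [0.1, 0.0, -3.0], [0.2, 0.1, 3.0], [0.0, 0.2, -3.0],
        [1.0, 0.9, 3.0], [0.9, 1.0, -3.0], [1.1, 1.0, 3.0], [1.0, 1.1, -3.0]]
observed = [0, 0, 0, 0, 1, 1, 1, 1]  # hypothetical labels: 0 = NTS, 1 = TS

selected = backward_selection(data, observed, [0, 1, 2])
final_pc, final_hk = evaluate(data, observed, selected)
print(selected, final_pc, final_hk)
```

In this toy setting the noisy third column dominates the distances until it is removed, which illustrates why a selection step can change which parameter combination appears to categorize the days best.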

Acknowledgements

The authors thank Dr. Ganesh Kumar Das, Regional Meteorological Centre, India Meteorological Department, Alipore, for supplying the necessary data. Our sincere gratitude goes to the late Prof. Utpal Kumar De, former Emeritus Professor of the School of Environmental Studies, Jadavpur University, without whose active participation and encouragement this study would not have been possible.

Author information

Contributions

SC and SG: Conceptualization, methodology, formal analysis, investigation, writing (original draft preparation), writing (review and editing), visualization and supervision. SC, SG and SKM: Data collection.

Corresponding author

Correspondence to Sweta Chakraborty.

Additional information

Communicated by Parthasarathi Mukhopadhyay

Corresponding editor: Parthasarathi Mukhopadhyay

About this article

Cite this article

Chakraborty, S., Ghosh, S. & Midya, S.K. Conjunction of hard k-mean and fuzzy c-mean techniques in clustering and identifying some critical meteorological parameters for thunderstorm formation over a metro city of India during pre-monsoon season. J Earth Syst Sci 132, 59 (2023). https://doi.org/10.1007/s12040-023-02059-4

  • DOI: https://doi.org/10.1007/s12040-023-02059-4
