Abstract
The western North Pacific Subtropical High (WNPSH) is a key circulation system regulating the East Asian climate, and its area is a crucial indicator of summer precipitation in China. In this study, we established a classification prediction model for the summer WNPSH area using Gaussian Naive Bayes (GNB). By setting different category proportions and different training set sample sizes, we investigated the prediction ability of GNB and its dependence on the data sample size. A comparison of the prediction performance of GNB with that of tree models, which are commonly used in short-term climate prediction, showed that the accuracy scores (ACC) and balanced accuracy scores (BCC) of GNB were statistically significantly higher than those of the tree models. Additionally, under different category classification criteria, the ACC and BCC of GNB remained above 0.77 and 0.75, respectively. For the anomalous categories in particular, the recall values remained above 0.5. These results indicate that GNB had strong prediction ability for the summer WNPSH area and could also better predict the degree of anomaly of the WNPSH area. Moreover, under different training set sample sizes, the ACC of GNB remained above 0.6, which suggests that GNB was less dependent on the data sample size and could, to some extent, reduce the limitation that abrupt interdecadal climate changes impose on the available data sample size. This study reveals the strong prediction ability of GNB for the summer WNPSH area and provides a useful reference for research on other short-term climate prediction problems.
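The workflow summarized in the abstract (fit a GNB classifier, then score it with ACC, BCC, and per-category recall) can be illustrated with a minimal sketch. It is not the authors' implementation: the three category names, the two predictors, the class proportions, and the 70/30 split are hypothetical, the data are synthetic, and BCC is taken here as the mean of per-class recall, which is a common definition but an assumption with respect to this paper.

```python
import math
import random
from collections import defaultdict

def fit_gnb(X, y):
    """Estimate the prior, feature means and feature variances of each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        # Small variance floor avoids division by zero (a choice of this sketch).
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(cols, means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gnb(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(params):
        prior, means, variances = params
        total = math.log(prior)
        for v, m, s2 in zip(row, means, variances):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))

def recalls(y_true, y_pred):
    """Recall of each class that occurs in y_true."""
    out = {}
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        out[label] = sum(y_pred[i] == label for i in idx) / len(idx)
    return out

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (assumed definition of BCC)."""
    r = recalls(y_true, y_pred)
    return sum(r.values()) / len(r)

# Synthetic, imbalanced three-category data with two hypothetical predictors.
rng = random.Random(0)
centers = {"low": (-2.0, -2.0), "normal": (0.0, 0.0), "high": (2.0, 2.0)}
counts = {"low": 30, "normal": 60, "high": 30}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(counts[label]):
        X.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
        y.append(label)

# Random 70/30 split into training and test samples.
order = list(range(len(X)))
rng.shuffle(order)
split = int(0.7 * len(order))
train, test = order[:split], order[split:]

model = fit_gnb([X[i] for i in train], [y[i] for i in train])
y_test = [y[i] for i in test]
y_pred = [predict_gnb(model, X[i]) for i in test]

print("ACC:", accuracy(y_test, y_pred))
print("BCC:", balanced_accuracy(y_test, y_pred))
print("Recall per category:", recalls(y_test, y_pred))
```

Shrinking `split` and refitting gives a rough way to probe how the scores change with training set size, in the spirit of the sample-size experiments described above.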
Data availability
Enquiries about data availability should be directed to the authors.
Change history
23 September 2022
A Correction to this paper has been published: https://doi.org/10.1007/s00382-022-06514-8
Acknowledgements
This work was supported by the National Natural Science Foundation of China (41975076 and 42175067) and the National Key Research and Development Program of China (2017YFC1502305 and 2019YFA0607104).
Funding
This work was supported by the National Natural Science Foundation of China (41975076, 42175067, and 41775069) and the National Key Research and Development Program of China (2017YFC1502305 and 2019YFA0607104).
Ethics declarations
Conflict of interest
The authors have not disclosed any competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Li, D., Hu, S., He, W. et al. The area prediction of western North Pacific Subtropical High in summer based on Gaussian Naive Bayes. Clim Dyn 59, 3193–3210 (2022). https://doi.org/10.1007/s00382-022-06252-x
DOI: https://doi.org/10.1007/s00382-022-06252-x