Abstract
Machine learning methods are effective tools for improving short-term climate prediction. However, commonly used methods often build classification and regression prediction models separately and independently. Modeling the two tasks in isolation can yield classification and regression results that are inconsistent with each other and thus may not meet the needs of practical applications. To address this issue, this study proposes a selective Naive Bayes ensemble model (SENB-EM) by introducing causal effect and a voting strategy into Naive Bayes. The new model can not only screen effective predictors but also perform classification and regression prediction simultaneously. When applied to the prediction of the summer western North Pacific subtropical high (WNPSH) area from 2008 to 2021, the accuracy classification score (a metric of overall classification prediction accuracy) and the time correlation coefficient (TCC) of SENB-EM reach 1.0 and 0.81, respectively. After integrating the results of different models [including the multiple linear regression ensemble model (MLR-EM), SENB-EM, and the Chinese Multi-model Ensemble Prediction System (CMME) used by the National Climate Center (NCC)] for 2017–2021, the TCC of the ensemble of SENB-EM and CMME reaches 0.92 (the highest result among them). This indicates that the summer WNPSH area predictions provided by SENB-EM have high reference value for real-time prediction. It is worth noting that, in addition to the numerical prediction results, the SENB-EM model can also give numerical prediction intervals and predictions of the degree of anomaly of the WNPSH area, thus providing more reference information for meteorological forecasters.
Overall, as a new hybrid machine learning model, SENB-EM has good prediction ability; performing classification prediction and regression prediction simultaneously through integration is informative for short-term climate prediction.
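The two skill measures quoted above can be illustrated with a minimal sketch. It assumes that the accuracy classification score is the usual fraction of correctly classified years and that the TCC is the Pearson correlation between the observed and predicted yearly series; the abstract does not spell out either formula, so both definitions are assumptions here. The function names `accuracy_score` and `time_correlation_coefficient` are hypothetical and are not taken from the paper.

```python
import math


def accuracy_score(observed_classes, predicted_classes):
    """Fraction of years whose predicted class equals the observed class (assumed definition)."""
    if len(observed_classes) != len(predicted_classes) or not observed_classes:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(o == p for o, p in zip(observed_classes, predicted_classes))
    return hits / len(observed_classes)


def time_correlation_coefficient(observed, predicted):
    """Pearson correlation between two yearly series (assumed definition of TCC)."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sums of products of deviations from the mean; the 1/n factors cancel in the ratio.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_o * var_p)
```

Under these definitions, a classification score of 1.0 means every year in the evaluation period was assigned to the correct class, while a TCC of 0.81 measures how closely the predicted values follow the year-to-year variation of the observations.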
Funding
Supported by the National Natural Science Foundation of China (42130610, 41975076, and 42175067) and National Key Research and Development Program of China (2019YFA0607104).
About this article
Cite this article
Li, D., Hu, S., Guo, J. et al. A New Hybrid Machine Learning Model for Short-Term Climate Prediction by Performing Classification Prediction and Regression Prediction Simultaneously. J Meteorol Res 36, 853–865 (2022). https://doi.org/10.1007/s13351-022-1214-3
DOI: https://doi.org/10.1007/s13351-022-1214-3