Abstract
Nowcasting of strong convective precipitation and radar-based quantitative precipitation estimation have long been active yet challenging research topics in the meteorological sciences. Data-driven machine learning, especially deep learning, offers a new technical approach to the quantitative estimation and forecasting of precipitation. A high-quality, large-sample, labeled training dataset is critical to applying machine learning successfully in any specific field. This study develops a benchmark dataset for machine learning applied to minute-scale quantitative precipitation estimation and forecasting (QpefBD). It contains 231,978 samples from 3185 heavy precipitation events that occurred in 6 provinces of central and eastern China from April to October of 2016–2018. Each sample consists of 8 weather radar products at 6-min intervals within the time window of the corresponding event, together with 27 physical quantities at hourly intervals that describe the atmospheric dynamic and thermodynamic conditions. Two data labels at 6-min intervals are also included: ground precipitation intensity and the areal coverage of heavy precipitation. The paper describes the basic components of the dataset and the data processing, and provides metrics for evaluating model performance in precipitation estimation and forecasting. Using these metrics, several simple and commonly used methods are evaluated for precipitation estimation and forecasting; the results can serve as a benchmark reference for evaluating machine learning models on this dataset.
This paper also offers suggestions and scenarios for applying QpefBD. We believe that the use of this benchmark dataset will promote interdisciplinary collaboration between the meteorological sciences and artificial intelligence, providing a new way to identify and forecast heavy precipitation.
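The two sampling rates described above (6-min radar products and labels, hourly physical quantities) imply that about ten radar frames share one hourly field. The following is a minimal sketch of that time alignment only. The function names, the example date, and the rule of pairing each frame with the most recent whole hour are illustrative assumptions, not the dataset's actual file layout or the authors' matching procedure.

```python
from datetime import datetime, timedelta
from typing import List

RADAR_INTERVAL_MIN = 6    # radar products and labels: 6-min intervals (from the abstract)
HOURLY_INTERVAL_MIN = 60  # physical quantities: hourly intervals (from the abstract)


def nearest_prior_hour(radar_time: datetime) -> datetime:
    """Return the most recent whole hour at or before a radar frame time.

    Assumption: each radar frame is paired with the preceding hourly field.
    """
    return radar_time.replace(minute=0, second=0, microsecond=0)


def radar_frame_times(start: datetime, end: datetime) -> List[datetime]:
    """List the 6-min radar frame times in the closed window [start, end]."""
    step = timedelta(minutes=RADAR_INTERVAL_MIN)
    times = []
    t = start
    while t <= end:
        times.append(t)
        t += step
    return times


# Hypothetical two-hour event window, chosen only for illustration.
frames = radar_frame_times(datetime(2017, 7, 1, 8, 0), datetime(2017, 7, 1, 9, 54))
pairs = [(t, nearest_prior_hour(t)) for t in frames]
frames_per_hour = HOURLY_INTERVAL_MIN // RADAR_INTERVAL_MIN
```

Here `pairs` holds each radar frame time with the hourly time it would be matched to, and `frames_per_hour` is the number of radar frames that fall under one hourly field.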
Additional information
Supported by the National Key Research and Development Program of China (2018YFC1507305).
About this article
Cite this article
Xiong, A., Liu, N., Liu, Y. et al. QpefBD: A Benchmark Dataset Applied to Machine Learning for Minute-Scale Quantitative Precipitation Estimation and Forecasting. J Meteorol Res 36, 93–106 (2022). https://doi.org/10.1007/s13351-022-1140-4
DOI: https://doi.org/10.1007/s13351-022-1140-4