Applying machine learning to forecast daily Ambrosia pollen using environmental and NEXRAD parameters

  • Gebreab K. Zewdie
  • Xun Liu
  • Daji Wu
  • David J. Lary
  • Estelle Levetin
Part of the following topical collections:
  1. Topical Collection on Geospatial Technology in Environmental Health Applications


Approximately 50 million Americans have allergic diseases, and airborne plant pollen is a significant trigger for several of them. In North America, Ambrosia (ragweed) is known for its abundant pollen production and its potent allergenic effect. Estimating and predicting the daily atmospheric concentration of pollen, ragweed pollen in particular, is therefore useful both for people with allergies and for the health professionals who care for them. In this study, we show that a suite of variables, including meteorological and land surface parameters as well as next-generation radar (NEXRAD) measurements, can be used together with machine learning to successfully estimate the daily pollen concentration. The supervised machine learning approaches we used were random forests, neural networks, and support vector machines. The trained models were validated independently on 10% of the original dataset, set aside using the holdout cross-validation method. The random forests (R = 0.61, R2 = 0.37), support vector machines (R = 0.51, R2 = 0.26), and neural networks (R = 0.46, R2 = 0.21) effectively predicted the daily Ambrosia pollen concentration, where the correlation coefficient (R) and R-squared (R2) values are given in parentheses. Three independent approaches (random forests, correlation coefficients, and interaction information) were employed to rank the relative importance of the available predictors.
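The validation scheme described above (fit on 90% of the data, score on a held-out 10%, report R and R2) can be sketched in a few lines. This is only an illustration under stated assumptions: the data are synthetic, the helper names `holdout_split`, `pearson_r`, and `fit_line` are hypothetical, and a one-variable least-squares line stands in for the random forest, support vector machine, and neural network models used in the study. R2 is taken here as the square of R, which is consistent with the reported pairs (for example 0.61 squared is about 0.37).

```python
import math
import random


def holdout_split(n_samples, test_fraction=0.10, seed=0):
    """Shuffle indices 0..n_samples-1 and reserve a fraction for validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]  # (train, test)


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept (stand-in model)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic, made-up data (an assumption for illustration only): one
# predictor and a noisy response standing in for daily pollen concentration.
rng = random.Random(42)
x = [rng.uniform(10.0, 35.0) for _ in range(500)]
y = [3.0 * xi + rng.gauss(0.0, 20.0) for xi in x]

# Hold out 10% of the samples; fit the stand-in model on the remaining 90%.
train_idx, test_idx = holdout_split(len(x), test_fraction=0.10, seed=1)
slope, intercept = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])

# Score only on the held-out samples, which were not used for fitting.
y_test = [y[i] for i in test_idx]
y_pred = [slope * x[i] + intercept for i in test_idx]
r = pearson_r(y_test, y_pred)
r_squared = r ** 2
print(f"holdout size = {len(test_idx)}, R = {r:.2f}, R2 = {r_squared:.2f}")
```

Scoring on samples excluded from fitting is what makes the reported R and R2 an estimate of out-of-sample skill rather than of training fit; the same split and metrics would apply unchanged if the stand-in line were replaced by any of the three model families named in the abstract.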


Keywords: Pollen, Machine learning, Environmental parameters, NEXRAD measurements




Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. William B. Hanson Center for Space Sciences, The University of Texas at Dallas, Richardson, USA
  2. The University of Tulsa, Tulsa, USA
