Applying machine learning to forecast daily Ambrosia pollen using environmental and NEXRAD parameters
Abstract
Approximately 50 million Americans have allergic diseases, and airborne plant pollen is a significant trigger for several of them. In North America, Ambrosia (ragweed) is known for its abundant pollen production and its potent allergenic effect. Estimating and predicting the daily atmospheric concentration of pollen, ragweed pollen in particular, is therefore useful both for people with allergies and for the health professionals who care for them. In this study, we show that a suite of variables, including meteorological and land surface parameters as well as next-generation radar (NEXRAD) measurements, can be combined with machine learning to estimate the daily pollen concentration. The supervised machine learning approaches we used were random forests, neural networks, and support vector machines. Model performance was validated independently on 10% of the original dataset, set aside using the holdout cross-validation method. On this held-out data, the random forests (R = 0.61, R² = 0.37), support vector machines (R = 0.51, R² = 0.26), and neural networks (R = 0.46, R² = 0.21) predicted the daily Ambrosia pollen concentration, where the correlation coefficient (R) and R-squared (R²) values are given in parentheses. Three independent approaches (random forests, correlation coefficients, and interaction information) were employed to rank the relative importance of the available predictors.
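The holdout validation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the helper names (`holdout_split`, `fit_line`, `pearson_r`) are hypothetical, and a one-variable least-squares line stands in for the random forest, neural network, and support vector machine models. It shows only the evaluation pattern: set aside 10% of the samples, fit on the rest, and report R and R² on the held-out part. The reported R² values are consistent with the square of R (for example, 0.61² ≈ 0.37), so the sketch computes R² that way.

```python
import math
import random


def holdout_split(n_samples, test_fraction=0.10, seed=0):
    """Return (train_idx, test_idx) for one random holdout partition."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def fit_line(x, y):
    """Least-squares slope and intercept (a stand-in for the real models)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observations and predictions."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Synthetic data (assumption): one predictor and a noisy linear response.
rng = random.Random(42)
x = [rng.uniform(0.0, 30.0) for _ in range(500)]
y = [2.0 * xi + rng.gauss(0.0, 10.0) for xi in x]

# Hold out 10% of the samples; fit only on the remaining 90%.
train_idx, test_idx = holdout_split(len(x), test_fraction=0.10, seed=0)
slope, intercept = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])

# Evaluate on the held-out samples only.
y_test = [y[i] for i in test_idx]
y_pred = [slope * x[i] + intercept for i in test_idx]
r = pearson_r(y_test, y_pred)
r_squared = r ** 2

print(f"held-out samples: {len(test_idx)}, R = {r:.2f}, R^2 = {r_squared:.2f}")
```

Because the test indices never enter the fit, R and R² here measure performance on unseen data, which is the purpose of the holdout partition. A single random split gives one estimate that varies with the seed.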
Keywords
Pollen; Machine learning; Environmental parameters; NEXRAD measurements