Abstract
Machine learning has been applied to air pollution prediction in a large number of global and regional studies. This study examines that body of work, its current state, and its projected expansion. Using Scopus as the primary database, it analyses 1,794 documents written by 5,354 researchers and published in 745 sources between 1991 and 2023. The documents were evaluated with Biblioshiny, VOSviewer, and S-curve analysis to identify and visualise the leading authors, journals, countries, and publications, as well as the key trends in the field. We find that interest in this subject began to grow in 2017 and has since grown at a rate of 18.56 percent per year. Journals such as Environmental Pollution, Atmospheric Environment, and Science of the Total Environment have been at the forefront of research on machine learning for air pollution forecasting, although they are not the only ones publishing it. The four leading countries by total citations are China (6,784 citations), the United Kingdom (2,758 citations), the United States (2,145 citations), and India (1,117 citations). The three most productive institutions by article count are Fudan University, China (63 articles), the University of Southern California, USA (60 articles), and Tsinghua University, China (56 articles). The authors' keyword co-occurrence network shows that the three most frequent keywords are machine learning (577 occurrences), air pollution (282 occurrences), and air quality (166 occurrences).
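Two of the quantities reported above, the annual growth rate and the keyword occurrence counts, can be illustrated with a minimal Python sketch. It assumes the growth rate is a compound annual growth rate between the first and last year, which may differ from the exact formula the study's tools use. The function names and the toy data are hypothetical and are not taken from the study's Scopus export.

```python
from collections import Counter
from itertools import combinations


def annual_growth_rate(counts_by_year):
    """Compound annual growth rate (percent) between the first and last year.

    Assumption: growth is measured as CAGR; this is not confirmed by the source.
    """
    years = sorted(counts_by_year)
    first, last = counts_by_year[years[0]], counts_by_year[years[-1]]
    periods = years[-1] - years[0]
    if periods <= 0 or first <= 0:
        raise ValueError("need at least two years and a positive starting count")
    return ((last / first) ** (1 / periods) - 1) * 100


def keyword_cooccurrence(records):
    """Count keyword occurrences and unordered keyword pairs across records."""
    occurrences = Counter()
    pairs = Counter()
    for keywords in records:
        # Normalise case and whitespace, and count each keyword once per record.
        unique = sorted({k.strip().lower() for k in keywords})
        occurrences.update(unique)
        pairs.update(combinations(unique, 2))
    return occurrences, pairs


# Hypothetical toy data for illustration only.
toy_counts = {2017: 100, 2018: 110, 2019: 121}
toy_records = [
    ["Machine learning", "Air pollution"],
    ["machine learning", "air quality"],
    ["Machine Learning", "air pollution", "PM2.5"],
]

rate = annual_growth_rate(toy_counts)
occurrences, pairs = keyword_cooccurrence(toy_records)
print(f"annual growth rate: {rate:.2f}%")
print(occurrences.most_common(3))
print(pairs[("air pollution", "machine learning")])
```

Each pair is stored with its keywords in alphabetical order, so a pair lookup must use that order. The pair counts correspond to edge weights in a co-occurrence network of the kind VOSviewer draws, while the occurrence counts correspond to node sizes.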
Data Availability
The datasets used in the current work will be made available by the corresponding author upon reasonable request.
Funding
No funding was received for conducting this study.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical Approval
The authors hereby declare that this manuscript is not published or considered for publication elsewhere.
About this article
Cite this article
Ansari, A., Quaff, A.R. Bibliometric Analysis on Global Research Trends in Air Pollution Prediction Research Using Machine Learning from 1991–2023 Using Scopus Database. Aerosol Sci Eng (2024). https://doi.org/10.1007/s41810-024-00221-z
DOI: https://doi.org/10.1007/s41810-024-00221-z