Bibliometric Analysis on Global Research Trends in Air Pollution Prediction Research Using Machine Learning from 1991–2023 Using Scopus Database

  • Original Paper
  • Aerosol Science and Engineering

Abstract

A large number of global and regional studies address air pollution prediction using machine learning. This study examines how machine learning is applied to predict air pollution, the current state of the field, and its projected growth. It analyses 1,794 documents written by 5,354 authors and published in 745 sources between 1991 and 2023, with Scopus as the primary database. To identify and visualise the major authors, journals, countries, publications, and key trends, the retrieved articles were evaluated using Biblioshiny, VOSviewer, and S-curve analysis. Interest in the subject began to grow in 2017 and has since increased at a rate of 18.56% per year. Environmental Pollution, Atmospheric Environment, and Science of the Total Environment have been at the forefront of research on machine learning for air pollution forecasting, although they are not the only journals contributing to it. The four leading countries by total citations are China (6,784 citations), the United Kingdom (2,758 citations), the United States (2,145 citations), and India (1,117 citations). The three leading universities by number of articles are Fudan University, China (63 articles), the University of Southern California, USA (60 articles), and Tsinghua University, China (56 articles). The authors' keyword co-occurrence network shows that machine learning (577 occurrences), air pollution (282 occurrences), and air quality (166 occurrences) are the three most frequent keywords.
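The abstract reports an annual growth rate but does not state how it was derived. One common convention in bibliometric work is the compound annual growth rate between the first and last yearly publication counts. The sketch below illustrates that convention only; the function name and the yearly counts are hypothetical and are not taken from the paper.

```python
def compound_annual_growth_rate(first_count, last_count, n_periods):
    """Return the compound annual growth rate, in percent.

    first_count: publications in the first year of the window
    last_count:  publications in the last year of the window
    n_periods:   number of yearly steps between the two counts
    """
    if first_count <= 0 or n_periods <= 0:
        raise ValueError("first_count and n_periods must be positive")
    return ((last_count / first_count) ** (1.0 / n_periods) - 1.0) * 100.0


# Hypothetical yearly publication counts, for illustration only
# (NOT the Scopus figures behind the reported 18.56%).
yearly_counts = {2017: 100, 2018: 120, 2019: 144}

years = sorted(yearly_counts)
rate = compound_annual_growth_rate(
    yearly_counts[years[0]],
    yearly_counts[years[-1]],
    years[-1] - years[0],
)
print(f"{rate:.2f}% per year")
```

Because this measure uses only the endpoint counts, it can be sensitive to the choice of start year, which may matter when growth begins late in the window, as it does here from 2017.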

Data Availability

The datasets used in the current work will be made available by the corresponding author upon reasonable request.

Funding

No funding was received for conducting this study.

Author information

Corresponding author

Correspondence to Asif Ansari.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical Approval

The authors declare that this manuscript has not been published and is not under consideration for publication elsewhere.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 1931 kb)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ansari, A., Quaff, A.R. Bibliometric Analysis on Global Research Trends in Air Pollution Prediction Research Using Machine Learning from 1991–2023 Using Scopus Database. Aerosol Sci Eng (2024). https://doi.org/10.1007/s41810-024-00221-z

  • DOI: https://doi.org/10.1007/s41810-024-00221-z
