An ensemble learning based hybrid model and framework for air pollution forecasting

Chang, Yue-Shan; Abimannan, Satheesh; Chiao, Hsin-Ta; Lin, Chi-Yeh; Huang, Yo-Ping

doi:10.1007/s11356-020-09855-1

Research Article
Published: 03 July 2020

Volume 27, pages 38155–38168, (2020)

Environmental Science and Pollution Research

Yue-Shan Chang¹,
Satheesh Abimannan²,
Hsin-Ta Chiao³,
Chi-Yeh Lin¹ &
…
Yo-Ping Huang⁴

Abstract

With the advance of the economy and industry, the impact of air pollution has gradually gained attention. Many studies have exploited machine learning techniques to build predictive models for pollutant concentration or air quality. However, improving prediction performance remains a common problem in existing studies. Traditional models based on machine learning and deep learning methods, such as GBTR (gradient boosted tree regression), SVR (support vector machine-based regression), and LSTM (long short-term memory), are among the most promising approaches to this problem. Previous research has shown that ensemble learning can improve predictive performance in other domains. In this paper, we propose a hybrid model and framework to improve the accuracy of air pollution forecasting. We use a stacking-based ensemble learning scheme, with the Pearson correlation coefficient used to calculate the correlation between different machine learning models, to integrate various forecasting models. We also construct a framework based on Spark+Hadoop machine learning and the TensorFlow deep learning framework that physically integrates these models and demonstrates air pollution forecasting for the next 1 to 8 h. We conduct experiments and compare the results with the GBTR, SVR, LSTM, and LSTM2 (version 2) models to demonstrate the predictive performance of the proposed hybrid model. The experimental results show that the hybrid model is superior to the existing models used for predicting air pollution.
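
The stacking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn and NumPy, uses a synthetic hourly series in place of real pollutant data, omits the LSTM base models, and uses a linear meta-learner. The helper `make_lagged`, the split points, and the lag count are hypothetical choices made for illustration. Computing the Pearson correlation between base-model predictions is only one possible reading of how the correlation is used.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR


def make_lagged(series, n_lags, horizon):
    """Build (X, y): each row of X holds n_lags past values, y is the value `horizon` steps ahead."""
    rows = len(series) - n_lags - horizon + 1
    X = np.stack([series[i:i + n_lags] for i in range(rows)])
    y = series[n_lags + horizon - 1:]
    return X, y


def rmse(pred, truth):
    """Root mean squared error as a plain float."""
    return float(np.sqrt(np.mean((pred - truth) ** 2)))


# Synthetic stand-in for an hourly pollutant series (assumption: daily cycle plus noise).
rng = np.random.default_rng(0)
t = np.arange(600)
series = 30 + 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 2, size=t.size)

# Forecast 1 h ahead from the previous 24 h; the abstract covers horizons of 1 to 8 h.
X, y = make_lagged(series, n_lags=24, horizon=1)

# Chronological split: base-model training, meta-learner training, final test.
base_end, meta_end = 300, 450

# Level 0: base forecasters (GBTR and SVR only; the LSTM models are left out of this sketch).
base_models = {
    "gbtr": GradientBoostingRegressor(random_state=0),
    "svr": SVR(C=10.0),
}
for model in base_models.values():
    model.fit(X[:base_end], y[:base_end])


def base_predictions(features):
    """Stack each base model's predictions as one column per model."""
    return np.column_stack([m.predict(features) for m in base_models.values()])


# Pearson correlation between the base models' held-out predictions.
meta_inputs = base_predictions(X[base_end:meta_end])
pearson = np.corrcoef(meta_inputs, rowvar=False)

# Level 1: the meta-learner combines base predictions it has not been trained on.
meta_model = LinearRegression().fit(meta_inputs, y[base_end:meta_end])

test_inputs = base_predictions(X[meta_end:])
hybrid = meta_model.predict(test_inputs)

scores = {name: rmse(test_inputs[:, k], y[meta_end:]) for k, name in enumerate(base_models)}
scores["hybrid"] = rmse(hybrid, y[meta_end:])

print("Pearson correlation between base models:\n", pearson)
print("Test RMSE:", scores)
```

Fitting the meta-learner on a later slice than the base models keeps it from seeing predictions made on the base models' own training data, which matters for time-ordered series. Whether the stacked forecast beats each base model depends on the data, so the sketch prints the scores rather than assuming a result.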

Funding

This work was partially supported by the Ministry of Science and Technology of Taiwan, Republic of China, under Grant Nos. MOST 106-3114-M-305-001-A, MOST 108-2119-M-305-001-A, MOST 109-2119-M-305-001-A, and MOST 108-2321-B-027-001-; and by National Taipei University under Grant Nos. 106-NTPU_A-H&E-143-001, 107-NTPU_A-H&E-143-001, and 108-NTPU_A-H&E-143-001.

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Taipei University, New Taipei City, Taiwan
Yue-Shan Chang & Chi-Yeh Lin
Galgotias University, Greater Noida, Uttar Pradesh, India
Satheesh Abimannan
Tunghai University, Taichung City, Taiwan
Hsin-Ta Chiao
National Taipei University of Technology, Taipei City, Taiwan
Yo-Ping Huang

Corresponding author

Correspondence to Yue-Shan Chang.

Additional information

Responsible editor: Marcus Schulz

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chang, YS., Abimannan, S., Chiao, HT. et al. An ensemble learning based hybrid model and framework for air pollution forecasting. Environ Sci Pollut Res 27, 38155–38168 (2020). https://doi.org/10.1007/s11356-020-09855-1

Received: 13 April 2020
Accepted: 22 June 2020
Published: 03 July 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s11356-020-09855-1
