Abstract
This chapter evaluates and compares the performance of six machine-learning (ML) algorithms in predicting China’s building-related carbon emissions. The models took into account five input parameters influencing building-related CO₂ emissions: urbanisation, R&D, population size, GDP, and energy use. The study used quarterly data covering 1971Q1–2014Q4 to develop, calibrate, and validate the models. Each model was developed using 140 observations and validated on 36 observations. To tune each ML model for comparative purposes, a 10-fold cross-validation approach was used to select the optimal hyperparameters and their associated arguments. The results indicate that the random forest (RF) model attained the highest coefficient of determination (R²) of 99.88%, followed by the k-nearest neighbour (KNN) (99.87%), extreme gradient boosting (XGBoost) (99.77%), decision tree (DT) (99.63%), adaptive boosting (AdaBoost) (99.56%), and support vector regression (SVR) (97.67%) models. Overall, the RF algorithm is the best-performing ML algorithm for accurately predicting building-related CO₂ emissions, whereas the DT algorithm is the best in terms of time efficiency. The KNN model is highly recommended when practitioners need accurate predictions quickly. RF, KNN, and DT models could be added to the toolkits of environmental policymakers to provide high-quality, accurate, real-time forecasts and patterns of building-related CO₂ emissions.
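The tuning-and-validation workflow summarised in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the data are synthetic (176 rows with five made-up features), a small hand-written KNN regressor stands in for the six models, the candidate grid `k_grid` is hypothetical, and assigning the first 140 rows to development and the remaining 36 to validation is an assumption, since the abstract does not say how observations were split.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(X_train, y_train, x, k):
    """Predict the mean target of the k nearest training rows (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), target)
        for row, target in zip(X_train, y_train)
    )
    return sum(target for _, target in dists[:k]) / k


def kfold_indices(n, n_folds=10, seed=0):
    """Shuffle the row indices once and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cv_r2(X, y, k, n_folds=10):
    """Mean R^2 over the folds, each fold held out once for scoring."""
    scores = []
    for held_out in kfold_indices(len(X), n_folds):
        held = set(held_out)
        X_tr = [X[i] for i in range(len(X)) if i not in held]
        y_tr = [y[i] for i in range(len(X)) if i not in held]
        preds = [knn_predict(X_tr, y_tr, X[i], k) for i in held_out]
        scores.append(r2_score([y[i] for i in held_out], preds))
    return sum(scores) / len(scores)


# Synthetic stand-in data (NOT the chapter's data): 176 rows, five features.
rng = random.Random(42)
X = [[rng.random() for _ in range(5)] for _ in range(176)]
weights = [0.5, 1.0, 1.5, 2.0, 2.5]  # hypothetical coefficients
y = [sum(w * v for w, v in zip(weights, row)) + rng.gauss(0, 0.05) for row in X]

# Assumed split: 140 rows for development, 36 held back for validation.
X_dev, y_dev = X[:140], y[:140]
X_val, y_val = X[140:], y[140:]

# Select the hyperparameter by 10-fold cross-validation on the development set.
k_grid = [1, 3, 5, 7, 9]  # hypothetical candidate values
cv_results = {k: cv_r2(X_dev, y_dev, k) for k in k_grid}
best_k = max(cv_results, key=cv_results.get)

# Score the tuned model once on the untouched validation rows.
val_preds = [knn_predict(X_dev, y_dev, x, best_k) for x in X_val]
val_r2 = r2_score(y_val, val_preds)
print(f"best k = {best_k}, validation R^2 = {val_r2:.4f}")
```

The validation rows play no part in choosing `best_k`, so the final R² is an out-of-sample figure. With a library such as scikit-learn, the same pattern is usually expressed through a grid search with `cv=10`.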
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Boateng, E.B., Twumasi, E.A., Darko, A., Tetteh, M.O., Chan, A.P.C. (2021). Predicting Building-Related Carbon Emissions: A Test of Machine Learning Models. In: Hassanien, AE., Taha, M.H.N., Khalifa, N.E.M. (eds) Enabling AI Applications in Data Science. Studies in Computational Intelligence, vol 911. Springer, Cham. https://doi.org/10.1007/978-3-030-52067-0_11
DOI: https://doi.org/10.1007/978-3-030-52067-0_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-52066-3
Online ISBN: 978-3-030-52067-0
eBook Packages: Intelligent Technologies and Robotics (R0)