1 Introduction

After the signing of the Kyoto Protocol in 1997, governments, industry stakeholders and academia began working on the development of effective and efficient environmentally driven policies and economic mechanisms. In this context, the European Union's (EU) current efforts in support of sustainable development and climate change mitigation comprise three main challenges, i.e. reducing greenhouse gas (GHG) emissions, consistently increasing energy production from renewables (Renewable Energy Directive—RED), and increasing energy efficiency (Energy Efficiency Directive—EED).

Within the EU environmental protection framework, the European Union Emissions Trading System (EU-ETS) was launched in 2005, and it currently covers approximately 45% of EU28 polluting emissions. The EU-ETS was implemented in a staggered, phased approach, and 2020 is the last year of the third phase, as presented below.

  • EU-ETS phase 1 (2005-2007) - the absence of reliable data on actual emissions, and the consequently inaccurate estimates, led to an allowance surplus (in million metric tonnes of CO2 equivalent - MtCO2-eq):

    • Allocated allowances: 6370 MtCO2-eq;

    • CO2 emissions: 6215 MtCO2.

  • EU-ETS phase 2 (2008-2012):

    • Allocated allowances: 11373 MtCO2-eq;

    • Carbon emissions: 9613 MtCO2.

  • EU-ETS phase 3 (2013-2020):

    • Allowances surplus in 2017: 1.6 billion.

Moreover, in 2019 the EU had to artificially remove 700 million allowances from circulation. Hence, the observed evolution of the EU trading system indicates a sustained and substantial structural supply–demand imbalance that kept distorting the market and compromising the scheme's effectiveness as an emissions reduction driver [1].

Furthermore, notwithstanding the systematic and continued EU efforts and policies addressing the reduction of carbon emissions, the EU Member States' projections converge to an EU-wide total GHG emissions reduction of at most 32%, which falls short of the 40% target for 2030. EU-ETS specific projections indicate that stationary installations could reach a 10% reduction in emissions between 2020 and 2030, which is insufficient to accomplish the 2030 reduction target of 43% compared to 2005 levels [2].

Thus, inaccurate carbon emissions predictions may be one of the root factors behind the overall ineffectiveness of the EU environmental regulatory framework. Therefore, the achievement of the European Union's ambitious targets will require additional policies resulting from new holistic and creative approaches to predicting carbon emissions trends.

In this scenario, and considering the findings related to the EU-ETS experience, our contribution explores the following research opportunities: (a) the efficiency of market-based climate change mitigation policies could be significantly improved by more accurate prediction of carbon emissions trends; (b) there is a crucial need for more accurate carbon emissions predictions that better support each climate change mitigation initiative (i.e. EU-ETS, EED, RED) by considering the particularities of the industry / economy sectors under their coverage; (c) machine learning (ML) methods and techniques have the potential to grasp such particularities from economic and energy consumption indicators, which might lead to more realistic carbon emissions forecasts.

Therefore, the present article introduces a computational learning framework for carbon emissions prediction incorporating a sophisticated and effective algorithm for the assessment and selection of the most relevant environmental impacting factors. This algorithm was innovatively engineered together with an artificial neural network (ANN), forming a forecast framework able to improve its architecture according to the selected impacting factors (predicting features). The framework's evaluation against current mainstream machine learning models, and its benchmarking against recently published research on carbon emissions prediction, indicate that our contribution is relevant and capable of supporting the improvement of environmental policy effectiveness.

2 Relevant related research

A myriad of studies have analyzed carbon emissions behavior and attempted to predict it by means of several different approaches, methods and techniques, considering different emissions impacting factors (predictors). Econometric approaches were used by Guan et al. [3], Anger [4], Li and Lu [5], Robalino-Lopez et al. [6], Scott et al. [7], and Mi et al. [8]. Game Theory emerged as one of the preferred approaches to address decision-making and supply chain challenges linked to carbon emissions and carbon policies, as in Chang and Chang [9], Yang et al. [10], Yang et al. [11], and Xu et al. [12].

Whereas General Equilibrium Theory (Wang and Wang [13], Gavard et al. [14], Zhang et al. [15]), Operational Research (Cui et al. [16], Hong et al. [17]), Index Number Theory (Wang et al. [18], Solaymani et al. [19]), Variational Inequality Theory (Allevi et al. [20]), and Grey Systems Theory (Jiang et al. [21]) also played an important role in support of such studies, a significant number of studies opted for the application of techniques within the Statistical and Computational Learning domain, as presented in Table 1.

Table 1 Methodologies, techniques and predicting variables for carbon emissions prediction

The literature review provided fundamental insights into the existing carbon emissions impacting factors and how to apply them in emissions prediction models. It was noted that the studies benefiting from Computational Learning and Statistical Learning theories were the ones providing more information on how the predictors (or the availability / choice of different predictors) impact prediction confidence level and accuracy.

The literature review also allowed us to identify some very important challenges related to carbon emissions prediction, considering the amount and diversity of potential predictors. Firstly, a particular predictor's correlation with a specific target varies depending on the scenario / region. Secondly, the systematic generation of trustworthy carbon emissions information only started in the 1990s, and its availability is restricted to some parts of the world.

Thirdly, carbon emissions can be characterized as a worldwide, multisectoral, interconnected phenomenon; e.g. the pollution outcomes of international flights may have contributing components spread across all continents if we consider the airline headquarters location, the flight route (origin–overflown area–destination), the aircraft manufacturer (engine manufacturer, fuselage manufacturer, tyre manufacturer, etc.), and the fuel producer (petroleum, biofuels).

As an additional example, consider that a huge transnational enterprise may move its heavily polluting production to regions where carbon policies are less strict or even nonexistent (carbon leakage). In such a scenario, carbon emissions prediction models should be scalable in order to progressively cover broader scopes and process more data.

However, such required scalability would lead to the use of an increasing number of predictors (the dimension of the predicting model's feature space), which would not be accompanied by additional instances, since the availability of trustworthy data is limited. Thus, any intended predicting model should be capable of addressing data-related problems such as endogeneity, heteroskedasticity, multicollinearity, and dimensionality.

The overall insights accrued from the literature review drove us to choose a computational learning / statistical learning approach for the design of our prediction framework. It was also noted that, considering the nature of carbon emissions related data, it would generally be impossible to work with any parametric learning method. Additionally, fitting high-dimensional statistical models often requires the use of non-linear parameter estimation procedures [32]. Therefore, we opted for neural networks as the core learning model of our proposed prediction framework.

3 Methods and framework evaluation

Within our research we designed and implemented a computational learning framework for carbon emissions prediction incorporating a RReliefF-driven feature selection method and an iterative neural network architecture improvement.

The Relief family of algorithms [33, 34] qualifies the attributes in a dataset as a function of the Euclidean distance computed between neighbouring instances. These non-parametric and non-myopic algorithms are able to capture non-linear relationships and run in low-order polynomial time.

The algorithms' outcomes are attribute weights that have a probabilistic interpretation, i.e. the weights are proportional to the difference between two conditional probabilities: the probability of the attribute's value being different conditioned on the nearest hit, and conditioned on the nearest miss [34].

Figure 1 graphically presents the designed framework, composed of four modules: the Features Engineering Module (FEM), the Model Generation Module (MGM), the Model Evaluation Module (MEM), and the Prediction Explanation Module (PEM).

Fig. 1

Computational learning framework modules

3.1 Data analysis

Our research focused on the European Union—28 States (EU28), and our dataset comprises data obtained from the European Union (Eurostat), International Energy Agency (IEA), Organisation for Economic Co-operation and Development (OECD), and World Bank (WB) databases, as described below.

  • Sources: Eurostat, IEA, OECD, World Bank.

  • Scope:

    • EU28;

    • 1990 – 2017;

    • Total CO2 emissions / sectoral CO2 emissions;

    • 26 economic / energy indicators (candidate predictors).

  • Data Aggregation Levels:

    • Regional;

    • Annual;

    • Total Emissions / Energy Industries / Industries / Commerce – Public Services / Transport / Residential / Aviation.

Tables 2 and 3 introduce the prediction targets and potential predictors explored in our research.

Table 2 Prediction targets—MtCO2 / IEA
Table 3 Candidate predictors

The characteristics of our research data led us to apply neural networks (NN) as the core learning model in our proposed framework. In this section we provide a deeper analysis of our data, which corroborates this choice and provides additional insights on how to deal with the data issues mentioned in the previous section.

Pearson's coefficient measures the statistical association between two continuous variables as a function of the covariance observed between them. It provides information about the magnitude of the association, or correlation, as well as the direction of the relationship.

The results of the Pearson correlation test are bound by some important assumptions regarding the tested data, i.e. the variables should be normally distributed and exhibit linearity and homoskedasticity. The test outcomes are values ranging from −1 to +1, where +1 indicates a perfect positive relationship, −1 indicates a perfect negative relationship, and 0 indicates that no relationship exists; strong correlations are indicated by values between ±0.5 and ±1. The research data was submitted to the test, and Table 4 presents the potential predictors listed in order of correlation strength.

Table 4 Pearson correlation test—target T1 / Total CO2 (MtCO2)

The analysis of the Pearson test outcomes flags some important insights. The predictor A18 (Energy Use) shows the strongest correlation with total CO2 emissions, which is in accordance with the knowledge accrued from the literature review. However, regarding the predictor A3 (Population), the test result is counterintuitive, as it indicates a strong negative relationship between population and CO2 emissions; the literature review also contradicts such a negative correlation.

Figure 2 provides a visualization of the predictors A18 and A3, and corroborates the Pearson test outcome. A deeper analysis of the test outcome raises another important flag, i.e. the predictor A4 (Temperature HDD) shows an irrelevant correlation with total CO2 emissions; considering the research scope (EU28), such an outcome seems inconsistent with real-world energy consumption dynamics.

Fig. 2

Exploratory visualization for T1, A18, A3 and A4

The combined analysis of Table 4 and Fig. 2 indicates potential non-linearity between the variables, as well as possible additional violations of the Pearson test assumptions. Thus there is a need for a more sophisticated correlation analysis of the data, and as such, we submitted it to the Spearman correlation test and to the Hoeffding's D statistic test.

Spearman correlation uses ranks instead of assumptions about the distributions of the two variables and, as such, it analyzes the association between variables at the ordinal measurement level. Thus, the test does not assume that the variables are normally distributed, and it can be applied in cases where the Pearson assumptions (continuous-level variables, homoskedasticity, and normality) are not fulfilled.

Similarly to the Pearson test, Spearman analysis outcomes are values between −1 and +1, and the test results are representative provided the data can be ordinally arranged and the scores on one variable can be monotonically related to the other variable. The Hoeffding's D test [35] measures the independence of the data sets by computing the distance between the product of the marginal distributions under the null hypothesis and the empirical bivariate distribution. The test is able to identify linear / non-linear and monotonic / non-monotonic functional relationships, as well as non-functional relationships. The test outcome is a value between −0.5 and +1, where larger values indicate stronger relationships; it provides no information about the direction of the correlation.
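The practical difference between the two tests can be illustrated with a small numpy sketch (illustrative data and our own function names, not the research dataset): for a monotonic but non-linear relationship, Spearman detects the perfect monotonic association while Pearson understates it.

```python
import numpy as np

def pearson(x, y):
    # Pearson's r: covariance normalized by the standard deviations
    return np.corrcoef(x, y)[0, 1]

def spearman(x, y):
    # Spearman's rho: Pearson's r computed on the ranks of the data
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pearson(rank(x), rank(y))

x = np.linspace(0.0, 1.0, 100)
y = np.exp(5.0 * x)        # monotonic but strongly non-linear

# Spearman captures the perfect monotonic association (rho = 1.0),
# while Pearson understates it because the relationship is not linear
print(round(spearman(x, y), 3))   # 1.0
print(pearson(x, y) < 0.95)       # True
```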

We then continued the analysis of the potentially abnormal results related to the predictors A3 and A4. Table 5 presents the results of the aforementioned tests, and it is possible to observe that A3 increases its relative correlation level as the test sophistication increases. Moreover, the Spearman test result corroborates the abnormal negative relationship between population and total carbon emissions.

Table 5 Potential predictors ranked according to the specified tests' outcomes

Regarding A4, the joint results imply that none of the tests were able to properly analyze the correlation level between temperature and CO2 emissions. In this scenario, the next data analysis step comprised the use of a state-of-the-art, computationally efficient, non-myopic, and non-parametric algorithm able to indicate and weight complex patterns of association, i.e. the RReliefF algorithm. Once a target variable is defined, RReliefF scores the correlated variables with values ranging from −1 (worst) to +1 (best).

Table 6 presents the outcomes of the aforementioned analysis. The Hoeffding's D test and the RReliefF algorithm do not provide information about the direction of the correlation; thus the Pearson and Spearman outcomes are presented in terms of absolute value.

Table 6 Data tests results / comparative perspective for Target T1

Although the RReliefF score for feature A3 does not contradict the information obtained from the other tests, the score attributed to feature A4 indicates an important level of correlation between temperature (A4) and CO2 emissions, whereas such a condition was not apparent in the other tests' outcomes. Consequently, this discrepancy between the tests was further investigated.

The aggregation level of the data is a characteristic that may adversely limit the effectiveness of such tests. Therefore, the next research step comprised the analysis of CO2 emissions at a lower level of aggregation, i.e. testing the emissions data of specific industry / economy sectors. Figure 3 shows the comparison between the emissions from commercial and public services (T4) and temperature (A4).

Fig. 3

Exploratory visualization: T4 and A4

The similarities between the curves are relevant, and Table 7 confirms this observation by presenting the results of the Hoeffding's D correlation analysis, where feature A4 is listed as the most relevant one when T4 is the target variable. The application of the Hoeffding's D test to a carbon emissions dataset featuring a lower level of aggregation confirmed the outcome of the RReliefF analysis.

Table 7 Hoeffding's D statistic test—prediction target T4

Based on these conclusions, the next research step consisted of expanding the RReliefF analysis to our research dataset in its entirety, while taking the carbon emissions at a lower level of aggregation, i.e. total emissions split into sectoral emissions (Table 2). Table 8 presents the results of the RReliefF scoring for our research data.

Table 8 RReliefF scores for the research dataset

Still analyzing feature A4, it is possible to observe a strong correlation towards the target T6 (residential emissions), which is confirmed by the exploratory visualization in Fig. 4.

Fig. 4

Exploratory visualization: T6 and A4

These findings confirmed the applicability of the RReliefF algorithm to assess (score and rank) the correlation level of our research data, and qualified its use in our learning framework's features engineering process.

3.2 Features engineering and neural network architecture iterative design

In the proposed learning framework, the features engineering and the model generation (i.e. NN architecture design) are iteratively accomplished by two modules, i.e. the Features Engineering Module (FEM) and the Model Generation Module (MGM), as can be observed in Fig. 1.

The FEM accomplishes the data dimensionality and quality treatment. Such combined treatment is done by a RReliefF-driven Backward Feature Elimination (BFE) aiming at: (a) selecting relevant predictors, in order to reduce the dataset's feature space and avoid the curse of dimensionality; (b) reducing the computational complexity of the learning algorithm featured in the MGM; (c) improving the accuracy of predictions; (d) facilitating the interpretation of results; and (e) reducing the data storage space. The feature selection process follows the notation presented in Table 9.

Table 9 RReliefF algorithm notation

The RReliefF algorithm, as presented in Fig. 5, takes as input a vector of attribute values \(\left[ {\varvec{A}} \right]\) and the predicted value \(\tau\) for each training instance \(I\), and provides as outcome a vector \(W\) containing the scores of the attributes.

Fig. 5

RReliefF algorithm pseudocode (Robnik-Šikonja and Kononenko, 2003)

As observed in algorithm steps 8 and 9, RReliefF uses Eq. 16 for the iterative update of feature weights according to their probabilistic relevance for the predictions. The intuition behind this weight computation as an expression of probabilistic relevance is conveyed by Eq. 17.

$$W\left[ A \right] = \frac{{P_{diffC|diffA} \cdot P_{diffA} }}{{P_{diffC} }} - \frac{{\left( {1 - P_{diffC|diffA} } \right) \cdot P_{diffA} }}{{1 - P_{diffC} }}$$
(17)

In the formulation of Eq. 17, \(P_{diffA}\) represents the probability of having different values of \(A\) within the nearest instances, \(P_{diffC}\) represents the probability of having a different prediction within the nearest instances, and \(P_{diffC | diffA}\) represents the probability of having a different prediction given a different value of \(A\) within the nearest instances.
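The weight update can be sketched in a compact numpy implementation. This is our own simplification of the Fig. 5 pseudocode: the instance-influence weights d(i, j) are taken as uniform over the k neighbours, and all function and variable names are ours.

```python
import numpy as np

def rrelieff(X, y, m=100, k=10, seed=0):
    """Simplified RReliefF (regression Relief) weight estimation.

    X: (n, a) feature matrix; y: (n,) continuous target.
    Returns W, a vector of a scores, roughly in [-1, +1]."""
    rng = np.random.default_rng(seed)
    n, a = X.shape
    span_x = X.max(axis=0) - X.min(axis=0)
    span_x[span_x == 0] = 1.0                 # guard constant features
    span_y = y.max() - y.min()
    if span_y == 0:
        span_y = 1.0

    n_dc = 0.0                                # N_dC accumulator
    n_da = np.zeros(a)                        # N_dA[A] accumulators
    n_dcda = np.zeros(a)                      # N_dC&dA[A] accumulators

    for _ in range(m):
        i = rng.integers(n)                   # random instance R_i
        dist = np.linalg.norm((X - X[i]) / span_x, axis=1)
        neigh = np.argsort(dist)[1:k + 1]     # k nearest neighbours (self excluded)
        d = np.full(k, 1.0 / k)               # uniform influence d(i, j)
        diff_y = np.abs(y[neigh] - y[i]) / span_y    # diff(tau, R_i, I_j)
        diff_x = np.abs(X[neigh] - X[i]) / span_x    # diff(A, R_i, I_j)
        n_dc += np.sum(diff_y * d)
        n_da += diff_x.T @ d
        n_dcda += diff_x.T @ (diff_y * d)

    # frequency form of Eq. 17: W[A] = N_dC&dA/N_dC - (N_dA - N_dC&dA)/(m - N_dC)
    return n_dcda / n_dc - (n_da - n_dcda) / (m - n_dc)

# illustrative check: feature 0 drives the target, feature 1 is pure noise
rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 2))
y = X[:, 0].copy()
w = rrelieff(X, y, m=400)
```

In this illustrative run the informative feature receives a visibly higher weight than the noise feature, which is the behaviour the FEM relies on.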

Thus, within the FEM, the initial features \(A_{1} \ldots A_{26}\) (potential predictors) are scored by the RReliefF algorithm; the features are then indexed and ranked based on the attributed scores, which leads to a rearranged feature set. In the next step, the interaction with the MGM starts, i.e. the rearranged feature set is fed to the Backpropagation Neural Network (NN/BP), the network is trained, and the vector containing the current learning framework status (features subset, NN/BP architecture, NN/BP prediction accuracy) is stored. Subsequently, new feature subsets are created by backward feature elimination, the NN/BP is trained with each new subset, and the learning framework status vector is updated.
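The FEM/MGM interaction can be sketched as follows. This is a simplified illustration, not the paper's implementation: an ordinary least-squares fit stands in for the NN/BP, the hidden-layer resizing is omitted, and all names are ours.

```python
import numpy as np

def backward_elimination(X, y, scores, train_frac=0.8):
    """RReliefF-score-driven BFE sketch: rank features by their scores,
    then drop the weakest-scored feature one at a time, refitting the
    stand-in learner and tracking the best status (error, subset)."""
    n = len(y)
    cut = int(train_frac * n)                 # Pareto-style 80/20 split
    order = list(np.argsort(scores)[::-1])    # best-scored feature first
    subset = order[:]
    best_rmse, best_subset = None, None
    while subset:
        Xs = X[:, subset]
        # stand-in learner: least squares with intercept (paper: NN/BP)
        A = np.c_[Xs[:cut], np.ones(cut)]
        coef, *_ = np.linalg.lstsq(A, y[:cut], rcond=None)
        pred = np.c_[Xs[cut:], np.ones(n - cut)] @ coef
        rmse = float(np.sqrt(np.mean((y[cut:] - pred) ** 2)))
        if best_rmse is None or rmse < best_rmse:
            best_rmse, best_subset = rmse, subset[:]
        subset.pop()                          # eliminate current weakest feature
    return best_rmse, best_subset

# illustrative run: features 0 and 1 are informative, feature 2 is noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2 * X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=200)
best_rmse, best_subset = backward_elimination(X, y, np.array([0.9, 0.5, 0.01]))
```

The returned status pair (best error, best subset) corresponds to the framework status vector described above, minus the NN/BP architecture component.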

The NN/BP featured in the MGM has the following characteristics:

Feed-forward network \(\nu \left( x \right)\) defined as follows:

$$\nu \left( x \right):\, = f^{L} \left( {W^{L} f^{{L - 1}} \left( {W^{{L - 1}} ...f^{1} \left( {W^{1} x} \right)...} \right)} \right)$$
(18)

Number of layers: 3; hidden layer activation function (transfer function): logistic (sigmoid), defined as follows:

$$f^{L - 1} \left( {x_{i} } \right) = \frac{1}{{1 + e^{{ - x_{i} }} }}$$
(19)

Training method: backpropagation;

Normalization method: unit interval;

Training cost (loss) function: residual sum of squares, defined as follows:

$$RSS = \mathop \sum \limits_{i = 1}^{n} \left( {y_{i} - \nu \left( {x_{i} } \right)} \right)^{2}$$
(20)

where \(y_{i}\) is the \(i\)th value of the target variable, \(x_{i}\) is the \(i\)th value of the predicting variables, and \(\nu (x_{i})\) is the predicted value of \(y_{i}\).
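A minimal numpy sketch of the network defined by Eqs. 18–20: one logistic hidden layer, full-batch backpropagation on the RSS loss. A linear output layer is assumed for simplicity, and the hyperparameters and names are illustrative.

```python
import numpy as np

def sigmoid(z):
    # logistic transfer function, Eq. 19
    return 1.0 / (1.0 + np.exp(-z))

def train_nn_bp(X, y, hidden=7, lr=0.1, epochs=100, seed=0):
    """3-layer feed-forward net nu(x) = W2 . sigmoid(W1 . x) (cf. Eq. 18),
    trained by full-batch backpropagation on the RSS loss (Eq. 20)."""
    rng = np.random.default_rng(seed)
    n, a = X.shape
    W1 = rng.normal(scale=0.5, size=(a, hidden))
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    Y = y.reshape(-1, 1)
    for _ in range(epochs):
        H = sigmoid(X @ W1)                              # hidden activations
        E = H @ W2 - Y                                   # residuals nu(x_i) - y_i
        gW2 = 2.0 * H.T @ E                              # dRSS/dW2
        gW1 = 2.0 * X.T @ ((E @ W2.T) * H * (1.0 - H))   # dRSS/dW1 (chain rule)
        W2 -= lr * gW2 / n
        W1 -= lr * gW1 / n
    return W1, W2

def rss(X, y, W1, W2):
    # residual sum of squares, Eq. 20
    pred = sigmoid(X @ W1) @ W2
    return float(np.sum((y.reshape(-1, 1) - pred) ** 2))

# illustrative run on a simple synthetic target
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(100, 2))
y = 0.5 * X[:, 0] - 0.3 * X[:, 1]
W1_0, W2_0 = train_nn_bp(X, y, epochs=0)     # untrained baseline
W1, W2 = train_nn_bp(X, y, epochs=200)       # trained weights
```

Training steadily reduces the RSS relative to the untrained baseline, which is all this sketch is meant to demonstrate.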

As previously mentioned, the FEM and the MGM interact iteratively, and this interaction allows for an innovative BFE/RReliefF-driven improvement of the NN/BP, i.e. the number of neurons in the hidden layer is changed along with the feature subsets, and the learning framework status vector is updated accordingly. Once the framework's stop conditions are reached, the status vector encloses the best feature subset and the best NN/BP architecture in terms of prediction accuracy.

3.3 Learning framework evaluation

As observed in Fig. 1, the Model Evaluation Module (MEM) built into the implemented learning framework features the best NN/BP architecture (the MGM outcome) and feeds it with the evaluation dataset. The module also features three additional ML models (Support Vector Machine—SVM, Gradient Boosting Machine—GBM, and Random Forest—RF), which are used to benchmark and thereby complement the learning framework's performance evaluation. The results of the benchmarking are presented in the next subsection, along with the overall assessment of the proposed framework's performance.

3.3.1 Original contribution

To the best of our knowledge, our proposed learning framework is the first to implement an iterative neural network architecture improvement supported by a Backward Feature Elimination search method driven by the RReliefF algorithm.

The framework iteratively learns (NN/BP architecture, feature subset) on an ad hoc basis, i.e. specifically for each economy / industry sector. The implemented framework's evaluation / validation processes benefited from real-world data accrued from the EU, OECD, IEA, and the World Bank. The whole dataset covers the period 1990–2017, and the training and test datasets were determined in accordance with the Pareto principle (80/20) for data sampling.

Table 10 presents the accuracy (Root Mean Square Error—RMSE for the test dataset) of the proposed learning framework for the totality of the EU28 CO2 emissions as well as for sectoral emissions. The table also presents the accuracy for the experiment control NN/BP, the framework featuring NN/BP supported by plain BFE (NN/BP-BFE), and the framework featuring NN/BP supported by RReliefF driven BFE (NN/BP-RReliefF/BFE).

Table 10 Computation learning framework evaluation figures

The accuracy figures in Table 10 demonstrate the improved performance of the proposed learning framework when compared to the control NN/BP, as well as to other possible framework designs. The table also presents the number of predictors in the learned feature subset, and the number of neurons in the hidden layer.

As observed in Table 10, the proposed approach combining backward feature elimination, RReliefF feature qualification, and iterative improvement of the NN/BP architecture effectively boosted the carbon emissions prediction accuracy for the EU28 scope dataset.

The computational complexity of the neural network is \(O\left( {h^{a} } \right)\), where \(h\) is the number of hidden layers and \(a\) is the number of features (predictors), and the training process converged in less than 100 epochs with a learning rate of 0.1. The RReliefF algorithm's computational complexity is \(O\left( {n \cdot m \cdot a} \right)\), where \(n\) is the number of training instances and \(m\) is the number of training instances used by the algorithm to update the weights. The computational environment featured an Intel i9-9900K CPU supported by an NVIDIA RTX 2080 GPU (8 GB) and 32 GB of DDR4 RAM, and the whole prediction process took approximately 1 h on average per prediction target.

3.3.2 Learning framework validation

The validation of the original contribution provided by the proposed learning framework consisted of two different analyses. Firstly, its outcomes were compared to three current mainstream ML models (i.e. SVM, GBM, RF) using our research dataset. Secondly, our accuracy figures were benchmarked against the results of recently published research targeting carbon emissions prediction.

In the validation process we worked with the mean absolute percentage error (MAPE) as the accuracy metric, and focused on the prediction target T1 (total CO2 emissions). The best-performing model designed by our proposed learning framework features 7 neurons in the hidden layer and 16 predictors out of the candidates presented in Table 3, i.e.: A18, A4, A23, A16, A8, A17, A10, A6, A22, A12, A11, A25, A9, A7, A24, and A26.
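For reference, the MAPE metric used throughout this subsection can be computed as follows (the values below are illustrative only, not our research data):

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * float(np.mean(np.abs((y_true - y_pred) / y_true)))

# illustrative values only: both predictions are off by 2% of the
# true value, so MAPE = 2.0
print(mape([4000.0, 3500.0], [4080.0, 3430.0]))   # 2.0
```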

The trained framework achieved an accuracy of 2.28% (MAPE) on the test dataset, and Table 11 allows us to benchmark the result of the proposed learning framework against the results of other NN/BP implementations.

Table 11 Computational learning framework benchmarking figures (MAPE) for prediction target T1—Total EU28 Carbon Emissions (NN/BP specific)

As observed in Table 11, the predictions of the proposed learning framework are significantly better than those provided by similar studies.

Table 12 compares the result of our proposed framework with the results of three ML models (GBM, RF, SVM) supported by plain BFE and using our research dataset, as well as with the results of recently published research using models other than NN/BP.

Table 12 Computational learning framework benchmarking figures (MAPE) for prediction target T1—Total EU28 Carbon Emissions (mainstream ML models)

The figures in Tables 11 and 12 reinforce the relevance of the results accrued by our research, and confirm ANNs as a powerful algorithm capable of processing a large amount of non-linear and non-parametric data. The RReliefF algorithm, in turn, efficiently assesses and ranks the predicting variables (features) by effectively addressing non-linear relationships, time series, noisy and correlated features, as well as high-order feature interactions (complex association patterns).

The iterative combination of these two algorithms produced a powerful and scalable prediction tool able to process huge datasets featuring complex and incomplete data. The improved ad hoc learning capability of our framework makes it potentially applicable to any region in the world and to any level of data aggregation.

Whereas our original contribution represents an important step towards the better design and implementation of environment protection initiatives and policies, its effectiveness is greatly enhanced when combined with an explainable Artificial Intelligence (XAI) technique, given the black-box nature of ANN algorithms.

3.4 Predictions explanation

Regarding ML outcomes, it is critically important to ensure that prediction accuracy relies on valid feature (predictor) computations, i.e. that the ML model is providing the right answer for the right reasons. Specifically addressing our research: although ANNs are top-performing, non-parametric and scalable algorithms, they lack the algorithmic transparency required to adequately support policy-making and decision-making processes targeting complex environmental challenges. Therefore, we improved our learning framework with an XAI module (i.e. the PEM—Prediction Explanation Module) featuring the Local Interpretable Model-agnostic Explanations—LIME technique [36].

LIME is a scalable method that creates local interpretable surrogate models (explanations) around a given instance in order to estimate how data points influence the global model's predictions. LIME translates the explanation problem into an optimization problem. The search space comprises explanations generated by the local interpretable surrogate models \(g \in G\), where \(G\) is a class of interpretable models. Locality is defined by a proximity measure \(\pi \left( {x,z} \right)\) expressing the distance between an instance \(z\) and \(x\). The interpretability degree of the surrogate model is assessed by means of a complexity measure \({\Omega }\left( g \right)\). Thus, considering \(f\left( x \right)\) the model to be explained, the local fidelity measure \({\mathcal{L} }\left( {f,g,\pi } \right)\) expresses how unfaithful \(g\) is in approximating \(f\) in the locality \(\pi\). Finally, the LIME outcome ensuring both interpretability and local fidelity is defined by:

$$\varepsilon \left( x \right) = \mathop {\arg \min }\limits_{g \in G} {\mathcal{L}}\left( {f,g,\pi } \right) + {\Omega }\left( g \right)$$
(21)

Within the learning framework's XAI module, the model to be explained is defined as \(f\left( x \right) = \nu \left( x \right)\) (i.e. the ANN). \(G\) comprises ridge regression models over the perturbed samples \(z^{\prime} \in Z\) (the perturbed samples dataset), such that \(g\left( {z^{\prime}} \right) = \beta_{g} \cdot z^{\prime}\). The complexity measure \({\Omega }\left( g \right)\) is expressed in terms of the number of non-zero coefficients in the linear model, \(\pi\) is defined by the Euclidean distance, and the local fidelity is computed as a square loss. We thus define:

$${\mathcal{L}}\left( {f,g,\pi } \right) = \mathop \sum \limits_{z,z^{\prime} \in Z} \pi \left( {x,z} \right)\left( {\nu \left( z \right) - g\left( {z^{\prime}} \right)} \right)^{2}$$
(22)

From a general perspective, for each prediction to be explained, the LIME algorithm perturbs the observation n times; statistics for each variable are extracted and perturbations are then sampled from the variable distributions. The model to be explained then predicts the outcome of all perturbed observations, and the algorithm calculates the Euclidean distance from all perturbations to the original observation and selects the m features with the highest absolute weight in a ridge regression fit of the complex model outcome.

Afterwards, a simple model is fitted to the perturbed data, explaining the complex model outcome with the m features from the perturbed data, weighted by their distance to the original observation. Finally, the algorithm extracts the feature weights from the simple model and uses them to explain the local behavior of the complex model.
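The procedure above can be sketched as a minimal, self-contained local surrogate. This is our own simplified implementation of the idea in Eqs. 21–22, not the lime library: Gaussian perturbations, an exponential kernel on the Euclidean distance, and a closed-form weighted ridge fit; the stand-in black-box f and all names are illustrative.

```python
import numpy as np

def lime_explain(f, x, n_samples=500, kernel_width=0.75, alpha=1.0, seed=0):
    """Minimal LIME-style surrogate (cf. Eqs. 21-22): perturb x, query the
    black-box f, weight samples by an exponential kernel on the Euclidean
    distance pi(x, z), and fit a weighted ridge regression g whose
    coefficients explain f in the locality of x."""
    rng = np.random.default_rng(seed)
    a = len(x)
    Z = x + rng.normal(scale=0.5, size=(n_samples, a))   # perturbations z
    fz = np.array([f(z) for z in Z])                     # black-box outputs nu(z)
    dist = np.linalg.norm(Z - x, axis=1)                 # pi(x, z)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)         # locality weights
    A = np.c_[Z, np.ones(n_samples)]                     # add intercept column
    G = A.T * w                                          # A^T W
    beta = np.linalg.solve(G @ A + alpha * np.eye(a + 1), G @ fz)
    return beta[:a]                                      # local feature weights

# illustrative stand-in black box: locally, feature 0 dominates feature 1
f = lambda z: 3.0 * z[0] + 0.1 * z[1] ** 2
weights = lime_explain(f, np.array([1.0, 1.0]))
```

The returned weights play the role of the per-feature explanations reported in Table 13: near x, the surrogate recovers the dominant local influence of feature 0.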

In order to demonstrate the whole outcome of our implemented learning framework, Table 13 presents predictions and explanations (feature weights) for two cases, i.e. predictions for the years 2013 (the first year of EU-ETS phase 3) and 2016 (the most accurate learning framework prediction). For the sake of simplicity, among the 16 selected features, Table 13 presents the explanations for the first 5 as ranked by the RReliefF algorithm.

Table 13 Predictions explanation module outcomes

As observed in Table 13, the XAI module featuring LIME provides insights on how a specific feature contributes to a specific prediction (case). We observe, for instance, that oil-based energy supply (predictor A23) consistently and positively contributes to, and induces, total CO2 emissions. In this context it is extremely important to differentiate two distinct impacts, i.e. contribution and induction. Although it is clear that energy production from oil combustion definitely contributes to CO2 emissions, its use may induce a reduction of total CO2 emissions due to the replacement phenomenon, by which an increment in oil use may imply a reduction in the use of a more polluting source of energy, such as coal. Hence, our learning framework's outcomes express inductive relationships rather than contributory ones.

This improvement makes our research contribution much more valuable for the design of better environmental initiatives and policies dependent on CO2 emissions forecasts. Additionally, the outcomes of our proposed learning framework may provide some background for future work addressing carbon emissions causality analysis, as well as potential improvements on both ANN and XAI techniques.

4 Conclusions

The accurate forecasting of CO2 emissions is one of the most important inputs for any decision-making process targeting climate change / global warming mitigation. Therefore, in our attempt to contribute to this global environmental challenge, we implemented a computational learning framework for carbon emissions prediction. Our framework features the capacity to iteratively improve the prediction feature set and the backpropagation neural network (NN/BP) architecture according to the statistical assessment of the data computed by the RReliefF algorithm. Our research relied on real-world data obtained from the European Union, the OECD, the International Energy Agency and the World Bank, for the period 1990–2017.

The outcomes of the designed prediction framework were successfully evaluated against different NN/BP-based solutions (NN/BP, NN/BP-BFE, NN/BP-RReliefF/BFE, NN/BP-CT, NN/BP-IPSO, NN/BP-PCA, NN/BP-RF), as well as against different mainstream machine learning models (GBM-BFE, RF-BFE, SVM-BFE, SVM-RF). Additionally, the featured XAI module provided insights on how different predictors impacted a specific prediction case. Therefore, our results demonstrate the effectiveness of our approach in terms of increased and explained prediction accuracy, which may adequately support the design and improvement of environmental initiatives and policies.

Finally, the outcomes of our implemented learning framework may provide some background for future work addressing carbon emissions causality analysis, as well as potential improvements on both ANN and XAI techniques.