Analysis of decarbonization path in New York state and forecasting carbon emissions using different machine learning algorithms

Ekaterina, Glukhova; Li, Jia

doi:10.1007/s43979-024-00083-x

Analysis of decarbonization path in New York state and forecasting carbon emissions using different machine learning algorithms

original article
Open access
Published: 27 March 2024

Volume 3, article number 8, (2024)
Cite this article

Download PDF

You have full access to this open access article

Carbon Neutrality Aims and scope Submit manuscript

Analysis of decarbonization path in New York state and forecasting carbon emissions using different machine learning algorithms

Download PDF

Glukhova Ekaterina¹ &
Jia Li¹

479 Accesses
Explore all metrics

Abstract

The state of New York admitted 143 million metric tons of carbon emissions from fossil fuels in 2020, prompting the ambitious goal set by the CLCPA to achieve carbon neutrality. The paper focused on analyzing and predicting carbon emissions using four different machine-learning algorithms. It examined emissions from fossil fuel combustion from 1990 to 2020 and validated four different algorithms to choose the most effective one for predicting emissions from 2020 to 2050. The analysis covered various economic sectors including transportation, residential, commercial, industrial, and electric power. By analyzing policies, the paper forecasted emissions for 2030 and 2050, leading to the identification of different pathways to reach carbon neutrality. The research concluded that in order to achieve neutrality, radical measures must be taken by the state of New York. Additionally, the paper compared the most recent data for 2021 with the forecasts, showing that significant measures need to be implemented to achieve the goal of carbon neutrality. Despite some studies assuming a trend of decreased emissions, the research revealed different results. The paper presents three pathways, two of which follow the ambitious plan to reach carbon neutrality. As a result, the emission amount by 2050 for the different pathways was projected to be 31.1, 22.4, and 111.95 of MMt CO₂ e, showcasing the need for urgent action to combat climate change.

Strategies to achieve a carbon neutral society: a review

Article Open access 08 April 2022

Investigating the assumption of perfect displacement for global energy transition: panel evidence from 73 economies

Article 27 December 2023

Does a Carbon Tax Reduce CO2 Emissions? Evidence from British Columbia

Article Open access 13 April 2022

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

The IPCC category represents emissions associated with the burning of fossil fuels [20], making it the largest source of greenhouse gas emissions in the whole world [21]. The issue of CO₂ emission is a critical concern for many regions, including New York State. The role of fossil fuels in contributing to carbon emissions is well-documented and requires strategic solutions to address. This includes fuels combusted for electricity, transportation, and heating in residential, commercial, and industrial buildings. It is evident that reducing these emissions is imperative for the state. New York State took significant action in response to the latest climate science by enacting the Climate Leadership and Community Protection Act (CLCPA) during the 2019 legislative session. The CLCPA sets some of the most ambitious climate targets in the United States. By 2030, the state aims to achieve a 40% reduction in greenhouse gas (GHG) emissions below 1990 [30].

In order to analyze and accurately predict the future trajectory of emissions, machine learning algorithms were applied to examine historical data. This study validated four machine learning algorithms with historical data from 1990–2020. After evaluation of the result, it was found that the ELM algorithm showed the most accurate results with the smallest MAPE, making it the ideal choice for forecasting carbon emissions until 2050.

The study also explored three pathways in alignment with the goals of the CLCPA for New York State. These pathways were designed to scrutinize the potential impact of different policies on carbon emissions in 2030 and 2050. The forecast trajectory was compared with the most recent values of CO₂ emission in 2021, revealing small deviations in the commercial, industrial, and electric power sectors, and underestimations in residential, transportation, and total state carbon emissions. Previous studies have noted a decrease in carbon emissions for 2019 and 2020, assuming the trend will continue to decline. However, the analysis in this article indicates an increase in emissions in 2021, emphasizing the need to adopt a number of proposed policies for each pathway in order to achieve neutrality. Overall, the findings from this research highlight the importance of implementing effective strategies to combat carbon emissions in NYS. With the use of machine learning [18] and detailed analysis of historical data [14], it is possible to make informed predictions and work towards a more sustainable future for the state.

One of the primary challenges faced was the need to make accurate percentage settings for carbon emissions in 2030 and 2050 for different economic sectors in accordance with the state's future plans for carbon reduction. Another issue encountered was the selection of a simple and effective forecasting method [33] that would be suitable for the study's focus on carbon emissions as the single variable of interest. Was carefully evaluate and compared various forecasting algorithms, including ELM, SVM, SVM-ELM, and GM optimized with SVM-ELM, to determine the most suitable approach for the study. Additionally, the big problem faced was the impact of the pandemic on different economic sectors. As the study aimed to understand the effect of the pandemic on carbon emissions, the analysis of emissions from 1990–2020 revealed a decreasing trend in the forecast model for 2050. Upon further examination, it was observed that while some sectors showed small deviations with actual values in 2021 compared to the scenario-based forecast, other sectors exhibited larger underestimations, particularly those that had seen a significant uptrend in 2021. This discrepancy in the forecasted values poses a challenge for researchers in developing effective strategies and policies to reduce carbon emissions. This study will present comprehensive plans for the development of policies and measures tailored to each sector of the economy.

The research roadmap Fig. 1 outlines the comprehensive approach taken to analyze state-wide CO₂emissions in New York State:

The first step involves collecting data on CO₂ emissions from various sectors, including electric power, residential, commercial, industrial, and transportation. The data is then analyzed by fuel categories, such as natural gas, coal, and petroleum. Additionally, an investigation into existing policies and initiatives in New York State to achieve carbon neutrality is conducted, along with an examination of different scenarios and targets outlined in these policies for reducing CO₂ emissions [15].
The second step involves data analysis. Four different machine-learning models were chosen for validation in order to forecast fossil fuel combustion emissions. These models are then trained and tested using historical CO₂ emissions data from 1990 to 2020. Key metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and Mean Absolute Percentage Error (MAPE) are calculated to assess the accuracy of each forecasting model, and the most accurate model is selected.
The third step is Scenario-based prediction, which outlines three prediction scenarios for carbon emission reduction pathways in New York State. These scenarios are based on the city's goal of achieving carbon neutrality and are essential for informing policies and energy development plans. The chosen forecasting model has been employed to predict CO₂ emissions for the period of 2020–2050.

2 Literature review

2.1 Literature on Carbon emission studies and reports

The issue of carbon emissions and its impact on the environment has garnered significant attention in recent years. With the growing concerns about climate change, many researchers and organizations have been focusing on analyzing carbon emissions and exploring ways to mitigate its effects. The study by Dietz and Rosa (1997), they developed a stochastic version of the IPAT model to estimate the effects of population, affluence, and technology on national CO₂ emissions [8]. Their results suggest that there are diseconomies of scale for the largest nations and that the effects of affluence on CO₂ emissions reach a maximum at about $10,000 in per-capita gross domestic product. These findings highlight the complex interplay of socio-economic factors in influencing carbon emissions. The study conducted by Zhao and Yu in 2013 were able to identify key influences on carbon emissions, including industrial production, energy consumption, and transportation [48]. Charfeddine and Kahia (2019) [2] utilized the panel vector autoregressive (PVAR) model to examine the impact of renewable energy consumption and financial development on carbon dioxide (CO₂) emissions and economic growth in the MENA region. Their Inglesi-Lotz [19] analysis suggested that both renewable energy consumption and financial development have a slight influence on CO₂ emissions and economic growth, highlighting the need for stronger policies to improve environmental quality and economic growth in the region. Another study by Keerthana et al. (2023) focused on the United States' energy consumption and carbon dioxide emissions, providing a comprehensive forecast using a regression model. The study emphasized the significance of energy consumption and its link to carbon dioxide emissions. It also highlighted the challenges and potential approaches for reducing emissions and developing improved greenhouse gas mitigations for the country [22]. Ren and Long (2021) examined carbon emission forecasting and scenario analysis in Guangdong Province, China, utilizing a Fast Learning Network (FLN) forecasting algorithm improved by Chicken Swarm Optimization (CSO). Their study provided insights into the potential for achieving carbon emission peak and carbon neutrality in the province, offering scenarios and countermeasures to reduce emissions [33].

The New York State Climate Action Council's Scoping Plan [29], published in 2022, outlines sector-specific and economywide actions necessary to achieve the Climate Act's ambitious goals and requirements. The plan is driven by the urgent need to address climate mitigation, justice, economic opportunity, and long-term job growth for New Yorkers. Developed through a comprehensive, science-based integration analysis of advisory panel recommendations, the Scoping Plan is founded on the principles of energy system reliability, cost-effectiveness, and the welfare, safety, and prosperity of New Yorkers. The plan's recommended actions aim to meet the Climate Act's requirements across various economic sectors, including transportation, buildings, electricity, industry, agriculture, forestry, and waste. Through rigorous analysis, the plan seeks to ensure that the strategies put forth are both effective and achievable, considering the complexities and challenges of reducing greenhouse gas emissions in each sector. As New York moves forward with its climate action strategy, the Scoping Plan provides a clear roadmap to navigate the intricate landscape of climate policy and sets the stage for meaningful progress towards a sustainable future. In support of these efforts, the New York State Energy Research and Development Authority (NYSERDA) published a report in 2023 on energy sector greenhouse gas emissions under the Climate Act [30]. This report, prepared by the Eastern Research Group Inc, provides valuable insights into the trends and dynamics of greenhouse gas emissions within New York's energy sector between 1990 and 2021. The data and analysis presented in the report offer a critical foundation for understanding the state's progress in addressing climate change and inform future actions to achieve the Climate Act's targets. As New York continues its journey towards a cleaner, more sustainable energy future [6], this report serves as an essential tool for policymakers, researchers, and stakeholders to make informed decisions and drive meaningful change.

2.2 Literature on machine learning algorithms

The literature on machine learning algorithms is extensive and diverse, with multiple methodologies and approaches for estimating empirical models from data [28]. Cherkassky and Mulier (2007) highlight the emergence of various methodologies from classical statistics, engineering, signal processing, and computer science, among others. They emphasize the need for a clear understanding of fundamental issues involved in learning algorithms [3]. Alternatively, Huang, Zhu, and Siew (2006) [17] propose the Extreme Learning Machine (ELM) as a fast learning algorithm for single-hidden layer feedforward neural networks. They demonstrate its generalization performance and fast learning speed compared to conventional algorithms. Olatunji [32] (2017) compares ELM and Support Vector Machines (SVM) on the problem of email spam detection, finding that SVM outperforms ELM in accuracy but lags in speed. The research of Huang and colleagues (2015) discusses ELM as a unified learning solution for feature learning, clustering, regression, and classification applications. They highlight the independence of hidden neurons from training samples as a key feature of ELM [16]. The review by Chorowski, Wang, and Zurada (2014) compares SVM and ELM-based classifiers, demonstrating similar performance across various datasets but varying computational resource requirements [4]. Zeng, Zhou, and Zhang (2017) forecast the total energy consumption of China's manufacturing using the Homologous Grey Prediction Model, showcasing its improved performance over traditional Grey Models. They also emphasize the importance of optimizing manufacturing structures and promoting energy-saving technologies [46]. Similarly with study of Zhang et al. (2017) [47]. In the context of environmental protection, Additionally, Li et al. (2018) utilized the grey prediction theory and SVM-ELM algorithm to forecast carbon emissions related to energy consumption in the Beijing-Tianjin-Hebei region, showing the effectiveness of these methods in addressing sustainability challenges [25]. Collectively, these studies underscore the significance of machine learning algorithms in addressing complex challenges related to carbon emissions, energy consumption, and environmental sustainability. Ko et al. (2021) [23] developed a machine learning-based sketch planning model for predicting mobile emissions, highlighting the diverse applications of these algorithms in transportation and urban planning.

3 Methodology and data collection

3.1 Data collection

To effectively analyze Fuel Combustion Emissions for different economic sectors and Fuel Combustion Emissions by Fuel Category for the period from 1990 to 2020 in New York State [1], the following pre-processing and sorting steps was implemented:

Data Collection: Obtain comprehensive datasets from reputable sources such as the New York State Department of Environmental Conservation, the U.S. Environmental Protection Agency, and other relevant governmental agencies. These datasets should encompass detailed information on fuel consumption, emissions, and economic sector data.

Data Cleaning: Clean the datasets to ensure accuracy and consistency. This involves checking for missing or incomplete data, removing duplicates, and rectifying any inconsistencies in the data [44].
Segmentation by Economic Sectors: Categorize the emissions data based on economic sectors such as electric power, industrial, commercial, residential, and transportation sectors.
Fuel Category Classification: Classify the fuel data into categories such as coal, natural gas, and petroleum.
Temporal Segmentation: Organize the data chronologically to analyze emissions trends over the period from 1990 to 2020 [31].
Standardization and Consolidation: Normalize the data if necessary to adjust for any variations in the scale of emissions across different sectors and fuel categories [39]. Aggregate the emissions data based on the economic sectors and fuel types for a comprehensive overview of emissions trends.

When it comes to analyzing CO₂ emissions in New York State, there are several common sources of data that are frequently used by researchers and policymakers. One of the primary sources of data is the NYSERDA [30] which collects and analyzes data related to energy consumption and emissions within the state. The United States Energy Information Administration (EIA) is another important source of data for analyzing CO₂ emissions in New York State. The EIA provides comprehensive data on energy production, consumption, and emissions at both the state and national levels, allowing for comparisons and trend analysis [40,41,42,43].

Finally, the Environmental Protection Agency (EPA) also offers valuable data and resources for studying CO₂ emissions in New York State. The EPA collects data on greenhouse gas emissions from various sources, including industrial facilities, transportation, and energy production, making it a crucial source for understanding the sources and trends of CO₂ emissions [10,11,12].

3.2 Description of the implementation of the four algorithms and the hyperparameters used

In order to forecast CO₂ emissions, four different algorithms were implemented and tested: Extreme Learning Machines (ELM), Support Vector Machines (SVM), SVM-ELM, and the grey prediction model optimized with SVM-ELM. The implementation involved preparing the data, normalizing the input, setting the parameters for the models, generating random input weights and biases, computing hidden layer output matrices, training the regression models, and predicting the output for the training and testing data.

The choice of these methods was based on their ability to handle non-linear data and their effectiveness in regression tasks. ELM and SVM are well-known for their capability to handle complex, non-linear relationships, making them suitable for forecasting CO₂ emissions. The SVM-ELM hybrid model was chosen to leverage the strengths of both algorithms and maximize performance. Additionally, the grey model optimized with SVM-ELM was included to incorporate a different approach to forecasting based on historical data patterns. The hyperparameters used for each model were based on prior research and experimentation to optimize their performance in accurately forecasting CO₂ emissions. Overall, the combination of these algorithms provided a comprehensive approach to forecasting CO₂ emissions and allowed for comparison and evaluation of their effectiveness in handling the given dataset.

3.2.1 Basic methodology of extreme learning machine algorithm

ELM is a machine learning algorithm that involves a three-layered neural network structure. The first layer is the input layer, which receives the input data. The second layer, called the hidden layer, contains a randomly initialized set of neurons that are summed together with the input data to produce a set of hidden layer outputs. The outputs of the hidden layer are then connected to the third layer, the output layer. The output layer has the final classification or prediction based on the weighted sum of the hidden layer outputs. The training of an ELM network involves randomly selecting the weights that connect the input layer to the hidden layer and then solving a set of linear equations to calculate the weights that connect the hidden layer to the output layer. Further details of this calculation can be found in [17, 32] (Fig. 2).

ELM doesn’t require gradient-based backpropagation to work. It uses the Moore–Penrose generalized inverse to set its weights.

$$\sum_{i=1}^{L}{\upbeta }_{i}f\left({\omega }_{i}*{x}_{j}+{b}_{i}\right)={O}_{j},j=1,\dots ,N$$

(1)

Where: f is the activation function, L is the number of hidden units, N is the number of training samples, $\upbeta$ is output weight,

$\omega$ _i = [$\omega$ _i1, $\omega$ _i2,... $\omega$ _in]^T is a weight vector between input and hidden layer, b_i is the offset of the hidden layer unit. The goal of the extreme learning machine algorithm is to minimize the input error which can be expressed as:

$$\sum_{j=1}^{L}{O}_{j}-{T}_{j}=0$$

(2)

Where β_i, ${\omega }_{i}$ and b_i must make sure:

$$\sum_{i=1}^{L}{\upbeta }_{i}f\left({\omega }_{i}*{x}_{j}+{b}_{i}\right)={T}_{j},j=1,\dots ,N$$

(3)

The matrix can be written shortly as:

$${M}_{\beta }=T$$

(4)

$${M}_{\beta }=\left[\begin{array}{lll}f\left({\omega }_{1}*{x}_{1}+{b}_{1}\right)& \dots & f\left({\omega }_{L}*{x}_{1}+{b}_{L}\right)\\ \vdots & \dots & \vdots \\ f\left({\omega }_{1}*{x}_{N}+{b}_{1 }\right)& \dots & f\left({\omega }_{L}*{x}_{N}+{b}_{L}\right)\end{array}\right]N\times L$$

(5)

$$\beta =\left[\begin{array}{c}{\beta }_{1}^{T}\\ \vdots \\ {\beta }_{L}^{T}\end{array}\right] {L\times m }^{T=}\left[\begin{array}{c}{t}_{1}^{T}\\ \vdots \\ {t}_{L}^{T}\end{array}\right] N \times m$$

(6)

According to Huang and Babri's proposal, the output matrix of the neural network is considered. In light of this, the training procedures for the ELM-based classifier can be summarized in the following algorithmic steps. For more comprehensive information regarding the functionality of the ELM algorithm, refer to [16, 17].

The parameter settings used in the process are described in Fig. 3. Firstly, historical data from 1990–2020 was imported and prepared for analysis. The input data was normalized, a recommended step to ensure accurate predictions. Next, the parameters were set. In this case, 100 hidden nodes were used in the hidden layer. Random input weights and biases were generated, and the hidden layer output matrix was computed using the sigmoid function. The output weights were then solved using the Moore–Penrose pseudo-inverse method. Finally, the model was used to predict the CO₂ emissions for both the training and testing data.

3.2.2 Basic methodology of support vector machines

Support vector machines (SVM) is a statistical machine learning technique that uniquely models complex relationships among variables [7]. The Support Vector Machine (SVM) algorithm aims to identify a classification boundary [27] that minimizes the error of training samples about the optimal classification boundary. However, when applying SVM to regression fitting analysis for predicting carbon emissions, multiple influencing factors come into play [37], making it a linearly inseparable problem. In such cases, the kernel function can transform the sample data into a high-dimensional space, reducing the computation efforts required at higher dimensions. For a training sample (x_i, y_i), i = 1, 2, …, n, mapping with a kernel function, the sample can be mapped to a high dimensional linear space:

$$f\left(x\right)=\omega Kernel\left(x\right)+ p$$

(7)

It is hard for a computer to solve the strict constraint conditions, so the relaxation factor ${\upbeta }_{i}$ is added, and the problem is transformed into a linear programming problem of the convex function.

$$max=\left(\frac{1}{2}{|\omega |}^{2}\right)+C\sum_{i=1}^{n}{\upbeta }_{i}$$

(8)

Subject to y_i × (ω x_i + b) ≥ 1−${\upbeta }_{i},{\upbeta }_{i}$ ≥ 0.

Quoting Lagrange factor:

$$\begin{array}{ll}f\left(x\right)=\omega Kernel\left(x\right)+ p\\ -\sum_{i=1}^{n}{\alpha }_{i}\times \left({y}_{i}\times \left(\omega {x}_{i}+b\right)-1+{\beta }_{i}\right)-\sum_{i=1}^{n}{\mu }_{i}\times {\beta }_{i}\end{array}$$

(9)

To obtain the Lagrange dual problem, the problem is transformed into solving the equation α_i.

$${\text{min}}\begin{array}{ll}\left(\sum_{i=1}^{n}\sum_{j=1}^{n}{\mathrm{\alpha }}_{i}{\mathrm{\alpha }}_{j}{{\text{y}}}_{i}{y}_{j}Kernel\left({x}_{i}\times {x}_{j}\right)-\sum_{i=1}^{n}{\mathrm{\alpha }}_{i}\right)\\ C\ge {\mathrm{\alpha }}_{i}\ge 0\end{array}$$

(10)

In Eq. (9), C is a regularization parameter that controls the tradeoff between margin and misclassification error. These ${x}_{j}$ are called Support Vectors only if the corresponding ${\mathrm{\alpha }}_{i}>0$. Further details on the algorithmic implementation of SVM utilized in this work can be found in [25].

The Fig. 4 describes the parameter settings, which involve the following steps: importing the data for training an SVM regression model, preparing the data for training, training the SVM, and making predictions. The use of the Gaussian kernel function suggests that the model is suitable for capturing non-linear relationships in the data.

3.2.3 Principal of the SVM-ELM model

To exploit the similar mapping functionality as the kernel function in SVM, this study replaces the single hidden layer map h(x) in the extreme learning machine (ELM) algorithm with the kernel function used in SVM. In cases where the exact form of h(x) is unknown, a nuclear matrix is defined based on the Mercer condition [4].

$${\Omega }_{ELM}={HH}^{T}=h\left({x}_{i}\right)h\left({x}_{j}\right)= K\left({x}_{i}, {x}_{j}\right)$$

(11)

Using the kernel function ${\text{K}} \left({{\text{X}}}_{{\text{i}}},{{\text{X}}}_{{\text{j}}}\right)$can enhance the stability of the original ELM algorithm. According to the standard optimization principle, the initial objective function can be expressed as:

$$min{L}_{p}=\frac{1}{2}{\omega }^{2}+\frac{1}{2}C\sum_{i=1}^{n}{{\xi }_{i}}^{2}$$

(12)

s.t.ψ(x_i)ω = y_i − ξ_i,i = 1,2…,n, where C is a regular coefficient and ξi is a training error. According to the KKT theory (Karush–Kuhn–Tucker conditions) [13], the above problem can be transformed into:

$$\begin{array}{c}{L}_{p kelm}=\frac{1}{2}{\omega }^{2}+\frac{1}{2}C\sum_{i=1}^{n}{{\xi }_{i}}^{2}-\\ -\sum_{i=1}^{n}{\eta }_{i}\left(\psi \left({x}_{i}\right)\omega = {y}_{i} -{\xi }_{i}\right)\end{array}$$

(13)

where η_i is the Lagrange operator. The output function of the SVM-ELM algorithm is:

$$\omega = \sum_{i=1}^{n}{\eta }_{i}\psi {\left({x}_{i}\right)}^{T}={\psi }^{T}\eta$$

(14)

$${\text{C}}\xi i = \eta i$$

(15)

$$\psi \left({x}_{i}\right)\omega - {y}_{i}+{\xi }_{i}=0$$

(16)

where η = [η₁,η₂,…,η_n]^T

Defining the kernel function K(x_i, x_j) = ψ(x_i) · ψ(x_j) ^T, i = 1, 2... n, and SVM-ELM single hidden layer output is:

$$f\left(x\right)=\psi \left(x\right)\omega =\psi \left(x\right){\psi }^{T}{\left(\psi {\psi }^{T}+\frac{1}{C}\right)}^{-1}Y$$

(17)

Due to the arbitrary assignment of initial input weight and threshold in the original ELM, the algorithm exhibits instability. In contrast, the SVM-ELM algorithm leverages the powerful non-linear mapping capability of the kernel function, resulting in improved function approximation. Consequently, the SVM-ELM algorithm showcases higher learning speed, accuracy, and generalization ability than the original model. Different multi-core functions such as RBF, wavelet, and polynomial can be used to address SVM. In this study, we utilized the RBF kernel function.

The parameter settings for the model to forecast state-wide CO₂ emissions using ELM and SVM regression. The settings begin by specifying the import options for a spreadsheet, including the number of variables, the sheet and range, and the variable names and types. It then imports the data from the specified Excel file into a table. The input data is then normalized. The parameters for the ELM model are then set, including the number of hidden nodes. Random input weights and biases are generated for the ELM model, and the hidden layer output matrix is computed using the sigmoid activation function. Subsequently, the SVM regression model is trained using the ELM outputs, with a Gaussian kernel function specified. Finally, the model is used to predict the output for the training data. The model described is a hybrid SVM-ELM regression model, combining the strengths of ELM in feature extraction and the power of SVM in regression tasks (Fig. 5).

3.2.4 Hybrid GM (1,1) optimized by SVM -ELM algorithm

The grey prediction theory is used to find the laws of the system through processing, writing data, and establishing the corresponding mathematical model Fig. 6. Therefore, using GM to predict the main type of Fuel Combustion Emissions by Fuel Category and Statewide Greenhouse Gas Emissions. GM(1;1) [26, 34, 35] is mainly divided into the following step:

Obtaining original sequence

$$x(0) =(x(0) (1), x(0) (2), \dots , x(0) (n))$$

(18)

Calculating the accumulated generated value

$${x}^{\left(1\right)}\left(k\right)\sum_{m=1}^{k}{x}^{\left(0\right)}\left({\text{m}}\right){\text{k}}=\mathrm{1,2},\dots ,n$$

(19)

Calculating (a) and (u) parameters on the basis of the differential equation

$$\frac{{dx}^{\left(1\right)}}{dt}+{ax}^{\left(1\right)}=u$$

(20)

The GM (1,1) prediction model

$$\begin{array}{c}\text{x}^\wedge\left(1\right)\left(\mathrm k+1\right)=\left(\text{x}\left(0\right)\left(1\right)-\frac ua\right)\text{e}-\mathrm{ak}+\\+\frac ua\text{k}=1,2,\dots,\text{n}\end{array}$$

(21)

The result xˆ⁽⁰⁾ (k + 1)) is obtained by the gradual reduction of the GM (1,1) model [46].

The Hybrid GM (1,1) Optimized by SVM-ELM Algorithm Fig. 7, this model includes a number of processes that are executed in order to prepare, analyze, and predict carbon dioxide emission data using the GM(1,1) and SVM-ELM regression. The code first imports the data from an Excel file and then splits it into a training set and a testing set. The data is then preprocessed by normalizing the input data and preparing it for the Grey Prediction model. The Grey Prediction model is then implemented to forecast carbon emissions based on the training data. This involves calculating the accumulated generation (AG) sequence, estimating parameters a and b, and computing the predicted CO₂ emissions using the Grey Prediction model. Next, sets the parameters for the ELM and computes the hidden layer output matrix using ELM. The SVM regression model is then trained using the ELM outputs and the predicted emissions. Finally, predicts the output for the testing data using the ELM-SVM model.

3.3 Performance metrics used for evaluating the accuracy of the models

To understand the performance of the forecasting model, error analysis indicators are necessary. This research uses mean absolute error (MAE), mean absolute percentage error (MAPE), and mean square error (MSE). MAE represents the mean absolute errors and is used to reflect the actual situation of the forecast error. MAE is calculated as:

$$MAE= \frac{1}{n}\sum_{i=1}^{n}\left|y\left(i\right)-\widehat{y}\left(i\right)\right|$$

(22)

where $\widehat{y}\left(i\right)$ is the prediction, and y(i) is the actual value. The range of values for MAE is 0 to + ∞. As the result of MAE increases, the size of the error likewise increases.

MAPE is a straightforward measure of the prediction accuracy of a forecasting method and is thus usually considered the fairest indicator. On this basis, MAPE is generally viewed as the most critical precision evaluation indicator. Dividing each error by the actual value to get an average result in percentage Table 1, MAPE is defined by:

Table 1 Interpretation of typical MAPE

Full size table

$$MAPE=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{y\left(i\right)-\widehat{y}\left(i\right)}{y\left(i\right)}\right|*100\%$$

(23)

MSE denotes the ratio of the square of the two norms of the error vector to the number of samples and is defined as

$$MSE= \frac{1}{n}\sum_{i=1}^{n}\left(y\left(i\right)-\widehat{y}\left(i\right)\right){ }^{2}$$

(24)

The performance evaluation metrics such as MSE, MAE, and MAPE are used to assess the accuracy and robustness of the different modeling approaches. By carefully adjusting the parameters and evaluating the model performance, the effectiveness of each method and make informed decisions about the most suitable model for predicting CO₂ emissions.

3.4 Experimental setup and implementation

Parameter settings for the algorithms used in this study involved the use of historical data from 1990–2020 as training data. The data was prepared in Excel and then imported into MATLAB for further processing. Each model required specific implementation steps, including algorithm training, testing, and results evaluation.

In the context of scenario forecasting, preliminary emissions calculations for 2030 and 2050 were utilized. These calculations were based on carbon dioxide emissions reduction pathways. The training set for the models included historical data from 1990–2020, as well as the calculated data for the specified scenarios in 2030 and 2050. This approach allowed for the comprehensive evaluation of different parameter settings within the algorithms used, taking into account historical data and projected scenarios.

3.5 New York State decarbonization pathways analysis

New York State has developed decarbonization pathways for various sectors [38], outlining a comprehensive strategy to achieve substantial reductions in greenhouse gas emissions [45] Table 2. In the transportation sector, the focus is on transitioning to zero-emission vehicles and equipment, enhancing public transportation and mobility alternatives, promoting smart growth and mobility-oriented development, and facilitating market-based solutions and financing. The residential and commercial buildings sector aims to adopt zero-emission codes and standards, transition from hydrofluorocarbons, and electrify space and water heating using heat pumps. The electric power sector aims to transform power generation, enhance the grid, and invest in new technology to achieve renewable energy and energy storage goals. In the industrial sector, the strategies are focused on providing financial and technical assistance, incentivizing procurement for low-carbon products, supporting workforce development, and facilitating research, development, and demonstration. By implementing these strategies, New York State aims to achieve a zero-emission electricity system by 2040, electrify 85% of homes and commercial building space by 2050, and reduce direct GHG emissions from industrial activities. These initiatives reflect a commitment to addressing climate change and transitioning towards a sustainable and low-carbon future.

Table 2 Strategy for economic sectors to develope pathways, key assumptions

Full size table

3.6 Scenario settings

The specific percentage carbon emission reduction pathways for New York State's economic sectors were chosen based on careful analysis and consideration of various factors Table 3. Pathway 1, which focuses on high technology availability, aims for a 40% reduction in CO₂ emissions by 2030 and an 85% reduction by 2050 across all sectors. This pathway was chosen because it is ambitious yet achievable with the technological advancements expected in the coming years. Pathway 2, known as the Integrated Pathway, sets higher targets for the electric power sector (70% reduction by 2030, 100% reduction by 2050) and moderate targets for the other sectors. This pathway was selected to prioritize the decarbonization of the electric power sector, which is crucial for overall emissions reductions. Pathway 3, with lower reduction targets across all sectors, provides a more conservative approach. This pathway was included to show the potential outcomes if more modest efforts are made to reduce carbon emissions.

Table 3 Carbon emission percentage reduction for different pathways relative to 1990

Full size table

4 Results and discussion

4.1 NY state CO₂ emission data analysis and pre-processing

Emissions overview for economic sectors Table 4 and Fig. 8:

Table 4 NY state Fuel Combustion Emissions for different economic sectors, 1990–2020 (MMt CO₂ e)

Full size table

Transportation emissions in New York State are a significant contributor to air pollution and carbon emissions. The extensive road network and high population density result in elevated emissions from cars, trucks, planes and public transit. In 2020, the transportation sector was responsible for 42% of New York’s carbon emissions. It is largely dependent on petroleum-based fuels such as gasoline, diesel, and jet fuel. The emissions from fossil fuel combustion grew 25% from 1990 to 2019 and declined 6% from 1990 to 2020. This deference is described with restriction during the COVID-19 pandemic period.

Residential and commercial sectors: The emissions from the combustion of fossil fuels in 2020 in the residential sector are 22%, and in the commercial sector, they are 14%. Together, this makes it the second-largest source of emissions. Direct emissions from buildings originate from the burning of fossil fuels onsite in residential and commercial buildings, primarily for space and water heating, cooking, and associated upstream emissions. The fuels used in buildings today include fossil natural gas, distillate fuel, wood, propane, kerosene, and residual fuel.

Electric power sector: comprised 16% of statewide emissions in 2020. In New York, the electricity sector consists of conventional fossil fuel-fired power generation facilities and nuclear generation facilities, as well as renewable energy generation from sources like wind, solar, hydropower, energy storage, and transmission infrastructure. The most significant trend in fuel combustion emissions since 1990 was a 63% reduction from the electricity sector.

Industrial sector: Industrial emissions made up 6% of statewide emissions in 2020, including emissions from combustion from the oil and gas system in New York. The largest share of emissions is created by the production of food, paper, bulk chemicals, glass, cement, metals, semiconductors, wood products, and plastics.

The majority of fuel combustion emissions in 2020 were from the burning of petroleum fuels, which is unsurprising as these are the main fuels used in transportation. However, emissions from the combustion of petroleum fuels have declined by 41,7% since 1990, while emissions from natural gas increased by nearly 46%. Emissions from the combustion of coal declined by 84% Table 5. Fuel Combustion Emissions In New York State from natural gas makes up for 48% of the fuel combustion emissions, with petroleum fuels accounting for the remaining 52%. the absence of coal in New York State's fuel combustion emissions highlights a shift towards cleaner fuel sources Fig. 9.

Table 5 NY state fuel combustion emissions by fuel category, 1990–2020 (MMt CO₂ e)

Full size table

4.2 New York State carbon emission forecasting results and projections for fuel categories

The comparison of simulation results for four different machine learning algorithms—ELM, SVM, SVM-ELM, and GM & SVM-ELM—has provided the accuracy of predicting carbon emissions for different types of fossil fuels in NYS. The MAPE values for each model were analyzed to determine their accuracy in forecasting carbon emissions for coal, petroleum fuels, and natural gas.

For coal, from Table 1 can be concluded the ELM model demonstrated excellent accuracy with a MAPE of 5%, while the SVM model exhibited a MAPE of 32%, indicating reasonable but less accurate predictions. The combined SVM-ELM model showed improved accuracy with a MAPE of 19%, while the GM&SVM-ELM model was considered unqualified due to its significantly high MAPE of 276%. This analysis allows for the evaluation of the performance of the machine learning models and their ability to accurately predict coal carbon emissions.

In the case of petroleum fuels, the ELM and SVM models exhibited the smallest MAPE values of 2%, suggesting that they are the most accurate in predicting carbon emissions for this category. This visual comparison enabled an assessment of the accuracy and reliability of the forecasting provided by each model.

For natural gas, according to the interpretation of typical MAPE values from Table 1 all four machine learning algorithms demonstrated excellent results in predicting carbon emissions. However, the ELM and SVM models stood out with the smallest percentage error, indicating their particularly accurate predictions (Fig. 10).

According to the evaluation results in Table 6, which include the calculation of MAE, MSE, and MAPE, it can be concluded that the ELM model has demonstrated the highest accuracy in predicting fossil fuel type emissions. This model achieved the lowest values for MAE, MSE, and MAPE, indicating a closer fit between predicted and actual values. The SVM-ELM model comes in second place in terms of accuracy. Although it performed slightly less accurately than the ELM model, it still demonstrated good predictive capabilities for fossil fuel types. Next in line, we have the SVM model and the GM&SVM-ELM model. These models have shown relatively lower accuracy levels in predicting fossil fuel types compared to the ELM and SVM-ELM models.

Table 6 The evaluation of four models

Full size table

Overall, based on the evaluation results shown in Table 6, it can be concluded that the ELM model is the most accurate for predicting fossil fuel types, followed by the SVM-ELM model. The SVM and GM&SVM-ELM models achieved lower levels of accuracy in this regard.

4.3 NY state CO₂ emission scenario-based prediction

The parameters of the three scenarios outlined in Table 3 are primarily set based on the proposed percentage goals for total CO2 emissions in the state of New York. These projections are visually represented in Fig. 11 and Fig. 12. The data set contains 33 values, including those from 1990–2020, projections for 2030 and 2050. The calculation for 2030 and 2050 is made based on the proposed percentage goals for each economic sector. The process begins with the importation of the data. The data is then read and manipulated to prepare it for analysis. Then parameters for the ELM are set, including the number of hidden nodes. Random input weights and biases are generated, followed by the computation of the hidden layer output matrix using the sigmoid function. The output weights are then solved using the Moore–Penrose pseudo-inverse, and the predicted output for training and testing data is calculated (Table 7).

Table 7 Predicted values of CO₂ emissions for New York State, MMtCO₂ e. in different economic sectors

Full size table

Based on the results obtained from the ELM model Fig. 12 it can be observed.

For the residential sector, the original data of 33.702 MMt CO₂ e in 2021 is closest to the P3 prediction of 31.45 MMt CO₂ e. In the commercial sector, the original data of 21.763 MMt CO₂ e in 2021 is closest to the P1 prediction of 20.067 MMt CO₂ e. For the industrial sector, the original data of 8.284 MMt CO₂ e in 2021 is closest to the P1 prediction of 8.386 MMt CO₂ e. In the transportation sector, the original data of 67.296 MMt CO₂ e in 2021 is closest to the P3 prediction of 58.94 MMt CO₂ e. For the electric power sector, the original data of 25.004 MMt CO₂ e in 2021 is closest to the P3 prediction of 25.23 MMt CO₂ e. Lastly, for NYS energy-related emissions, the original data of 179.486 MMt CO₂ e in 2021 is closest to the P3 prediction of 145.41 MMt CO₂ e.

These comparisons indicate that for certain sectors, the predictions of Pathway 3 (P3) are closer to the original data for CO₂ emissions in 2021. This suggests that a more conservative approach, as outlined in Pathway 3, may align better with the actual emissions data in certain economic sectors.

Table 8 and Fig. 13 the findings of the carbon emission share in New York State under different scenarios for 2030 and 2050 reveal significant differences in the distribution of emissions across various sectors. In Pathway 1, the total CO₂ emissions in 2030 are projected to be 124.395 MMtCO₂ e, with the residential sector accounting for 16.3%, the commercial sector for 12.9%, the industrial sector for 9.5%, the transportation sector for 30.8%, and the electric power sector for 30.5%. Similarly, in 2050, the percentages remain consistent across these sectors.

Table 8 Share of carbon emission in NYS under different scenarios in 2030 and 2050 (%)

Full size table

In Pathway 2, the total CO₂ emissions in 2030 are projected to be 118.410 MMtCO₂ e, with the residential sector accounting for 19.7%, the commercial sector for 15.6%, the industrial sector at 11.4%, the transportation sector at 37.2%, and the electric power sector for 16.0%. Interestingly, in 2050, the percentages for the residential, commercial, and industrial sectors increase while the electric power sector will decrease to 0%.

In Pathway 3, the total CO₂ emissions in 2030 are projected to be 160.19 MMtCO₂ e, with the residential sector accounting for 19%, the commercial sector at 15.1%, the industrial sector at 10.4%, the transportation sector at 27.9%, and the electric power sector for 27.6%. And in 2050, the percentages for each sector show slight variations from the 2030 numbers.

These findings suggest that each pathway offers an approach to reducing carbon emissions, with varying impacts on sectors such as transportation and electric power. Therefore, policymakers must carefully consider these pathways and their implications for achieving emission reduction targets.

4.4 Suggestions

The comparison of forecasted results with 2021 shows that NYS is still far from its ambitious targets. Suggestions for reaching these targets include switching to zero-emission energy sources, improving energy efficiency, electrifying space and water heating, electrifying vehicles, and increasing the use of renewable energy in the power sector.

Residential and commercial sector: Transition from fossil fuels to highly efficient equipment and systems powered by zero-emission energy sources. Implement more sustainable building construction methods, such as reusing existing structures and utilizing lower carbon materials. Electrify space and water heating with high-efficiency heat pumps. Promote the use of ground source heat and district geothermal energy networks to reduce the strain on the electric grid. By 2030, aim to make 250 thousand NY homes and commercial buildings energy efficient. By 2050, target electrification of 85% of buildings.

Industrial sector: Improve energy efficiency and promote industrial electrification. Utilize alternative fuels and increase the use of renewable electricity. Implement carbon capture [5], utilization, and sequestration technologies.

Transportation: Aim for the electrification of all light-duty vehicles by 2030, as well as 40% of medium and heavy-duty vehicles. Increase the use of renewable diesel and low-carbon fuels. Progressively shift from fossil fuels to low-carbon electricity. By 2050, target a 10% shift to zero-emission vehicles. Promote the use of low-carbon options in public transportation and reduce personal trips. Encourage the aviation industry to transition to green hydrogen and renewable biofuels.

Electric power: Work towards achieving a 100% zero-emission power grid by phasing out the use of fossil fuels for power generation. Increase the deployment of renewable energy technologies such as solar, energy storage, and wind power to achieve a 70% renewable energy target by 2030. Reduce the use of emitting fuels in the electricity sector by deploying clean energy resources such as land-based wind and solar, offshore wind, hydropower, fuel cells, and energy storage.

These suggestions align with the goals outlined in the New York State's climate action plan to reduce emissions and transition to a sustainable, low-carbon future. Implementing these changes across the residential, commercial, industrial, transportation, and electric power sectors will be crucial in meeting the ambitious targets set for 2030 and 2050. It will require collaboration between government, businesses, and communities to drive the necessary transformation and achieve a more sustainable and environmentally friendly state.

5 Conclusion

This paper has made significant contributions to the understanding of carbon emissions in the state of New York. The analysis of emissions from different economic sectors, including transportation, residential, commercial, industrial, and electric power, has provided clarification into the sources and trends of carbon emissions in the state. The comparison of four different machine learning algorithms to predict carbon emissions has demonstrated the effectiveness of the ELM model in accurately forecasting emissions for different types of fossil fuels.

The findings of this research have also shed light on the pathways to reach carbon neutrality in New York. The comparison of three pathways—Pathway 1, Pathway 2, and Pathway 3—has revealed the potential impact of different strategies on emissions reduction across various sectors. The analysis of the data for 2021 and the projections for 2030 and 2050 have shown that the predictions of Pathway 3 aligned better with the actual emissions data in certain economic sectors, suggesting that a more conservative approach may be essential in achieving emission reduction targets.

The analysis of this research focused solely on emissions from fossil fuel combustion and did not consider other sources of carbon emissions, such as industrial processes, waste management, and agriculture. Future research should expand the scope to include a comprehensive analysis of all sources of carbon emissions to provide a more complete understanding of the carbon footprint of New York state. Future research should also focus on the development and forecasting of green energy use and the replacement of fossil fuels. The transition to renewable and sustainable energy sources is needed to achieve carbon neutrality, and additional studies should explore the potential impact of investing in green technologies and infrastructure on carbon emissions reduction.

Availability of data and materials

All data generated or analyzed during this study are included in this published article.

Abbreviations

CLCPA:: Climate Leadership Community Protection Act
CCS:: Carbon Capture and Storage
BECCS:: Bioenergy with Carbon Capture and Storage
EPA:: Environmental Protection Agency
EIA:: Energy Information Administration
GHG:: Greenhouse Gas
NY:: New York
ELM:: Extreme Learning Machine
SVM:: Support Vector Machines
GM&SVM-ELM:: Grey Prediction Model optimized by SVM-ELM
MAPE:: Mean Absolute Percentage Error
MAE:: Mean Absolute Error
MSE:: Mean Square Error
IPCC:: Intergovernmental Panel on Climate Change
CO₂ :: Carbon Dioxide
MMt:: Million Metric tons
P1:: Pathway 1 High technology availability
P2:: Pathway 2 Integrated pathway
P3:: Pathway 3 Reference case pathway
HVAC:: Heating, ventilation, and air conditioning
LPG:: Liquid petroleum gas
LDV:: Light-duty vehicle
LDA:: Light-duty auto
LDT:: Light-duty truck
MDV/HDV:: Medium-duty vehicle/Heavy-duty vehicle
ZEV:: Zero -emission vehicle

References

Annual emissions beginning 1990 through 2020.(https://data.ny.gov/Energy-Environment/Statewide-Greenhouse-Gas-Emissions-Beginning-1990/5i6e-asw6)
Charfeddine L, Kahia M (2019) Impact of renewable energy consumption and financial development on CO2 emissions and economic growth in the MENA region: a panel vector autoregressive (PVAR) analysis. Renewable Energy 139:198–213
Article Google Scholar
Cherkassky V, Mulier FM (2007) Learning from data: concepts, theory, and methods. John Wiley & Sons
Book Google Scholar
Chorowski J, Wang J, Zurada JM (2014) Review and performance comparison of SVM-and ELM-based classifiers. Neurocomputing 128:507-516. 38
Article Google Scholar
Cohen SM, Rochelle GT, Webber ME (2012) Optimizing post-combustion CO₂ capture in response to volatile electricity prices. Int J Greenhouse Gas Control 8:180–195
Article Google Scholar
Con Edison, National Grid, Pathways to Carbon-Neutral NYC, April 2021 EPA
Cortes, Vapnik V (1995) Support vector networks. Mach Learn 20:273–297
Article Google Scholar
Dietz T, Rosa EA (1997) Effects of population and affluence on CO2 emissions. Proc Natl Acad Sci 94(1):175–179
Article Google Scholar
Energy and Environmental Economics. Pathways to Deep Decarbonization in New York State, June 24, 2020
EPA. 2020. State Greenhouse Gas Inventory and Projection Tool. September 2020. Washington, D.C.: U.S. Environmental Protection Agency. https://www.epa.gov/statelocalenergy/download-state-inventory-and-projection-tool
EPA (2021) Inventory of U.S. Greenhouse gas emissions and sinks, 1990–2019. EPA 430-R-21–005. Environmental Protection Agency, Washington
Google Scholar
EPA (2021) Table 1. Stationary combustion. Municipal solid waste.” GHG emission factors hub. Environmental Protection Agency, Washington
Google Scholar
Gordon G, Tibshirani R (2012) Karush-kuhn-tucker conditions. Optimization 10(725/36):725
Google Scholar
Hassani H, Huang X, Silva ES, Hassani H, Huang X, Silva ES (2019) Fusing big data, blockchain, and cryptocurrency. Springer International Publishing, p 99–117
Hosseini SM, Saifoddin A, Shirmohammadi R, Aslani A (2019) Forecasting of CO₂ emissions in Iran based on time series and regression analysis. Energy Rep 5:619–631
Article Google Scholar
Huang GB, Bai Z, Kasun LLC, Vong CM (2015) Local receptive fields based extreme learning machine. IEEE Comput Intell Mag 10(2):18–29
Article Google Scholar
Huang GB, Zhu QY, Siew CK (2006) Extreme learning machine: theory and applications. Neurocomputing 70(1–3):489–501
Article Google Scholar
Hussin F, Rahim SANM, Hatta NSM, Aroua MK, Mazari SA (2023) A systematic review of machine learning approaches in carbon capture applications. J CO2 Utilization 71:102474
Article Google Scholar
Inglesi-Lotz R, Dogan E (2018) The role of renewable versus non-renewable energy to the level of CO₂ emissions a panel analysis of sub Saharan Africa’s Big 10 electricity generators. Renew Energy 123:36–43
Article Google Scholar
IPCC, Climate Change 2022: Impacts, Adaptation, and Vulnerability. Contribution of Working Group II to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change, ed. H.-O.R. Pörtner, D. C.; Adams, H.; Adelekan, I.; Adler, C.; Adrian, R.; Aldunce, P.; Ali, E.; Begum, R. Ara; Friedl, B. Bednar; Kerr, R. Bezner; Biesbroek, R.; Birkmann, J.; Bowen, K.; Caretta, M. A.; Carnicer, J.; Castellanos, E.; Cheong, T. S.; Chow, W.; G. Cissé, G. Cissé & Ibrahim, Z. Zaiton. 2022, Cambridge, UK and New York, USA: Cambridge University Press.
Ji L, Zhang BB, Huang GH, Xie YL, Niu DX (2018) GHG-mitigation oriented and coal-consumption constrained inexact robust model for regional energy structure adjustment–a case study for Jiangsu Province, China. Renewable Energy 123:549–562
Article Google Scholar
Keerthana KB, Wu SW, Wu ME, Kokulnathan T (2023) The United States energy consumption and carbon dioxide emissions: a comprehensive forecast using a regression model. Sustainability 15(10):7932
Article Google Scholar
Ko S, Park J, Lee D (2021) Development of a machine learning-based sketch planning model for predicting mobile emissions. Trans Res Interdisc Perspect 10:100379
Google Scholar
Lewis CD (1982) Industrial and business forecasting methods. Butterworths, London
Google Scholar
Li M, Wang W, De G, Ji X, Tan Z (2018) Forecasting carbon emissions related to energy consumption in Beijing-Tianjin-Hebei region based on grey prediction theory and extreme learning machine optimized by support vector machine algorithm. Energies 11(9):2475
Article Google Scholar
Liu, S. and Forrest, J.Y.L., 2010. Grey systems: theory and applications. Springer Science & Business Media.
Ma TM, Yamamori K, Thida A (2020) A comparative approach to Naïve Bayes classifier and support vector machine for email spam classification. 2020 IEEE 9th Global Conference on Consumer Electronics (GCCE). IEEE
Morariu C et al (2020) Machine learning for predictive scheduling and resource allocation in large scale manufacturing systems. Comput Ind 120:103244
New York State Climate Action Council. 2022. “New York State Climate Action Council Scoping Plan.” climate.ny.gov/ScopingPlan
New York State Energy Research and Development Authority (NYSERDA). 2023. “Energy Sector Greenhouse Gas Emissions under the New York State Climate Act: 1990–2021.” NYSERDA Report 23–27. Prepared by Eastern Research Group Inc, Concord, MA, USA. www.nyserda.ny.gov/About/Publications.
Nocedal, J. and Wright, S.J., 1999. Numerical optimization springer-verlag. New York.
Olatunji SO (2017) Extreme Learning machines and Support Vector Machines models for email spam detection. 30th Canadian conference on electrical and computer engineering (CCECE). IEEE, p 1–6
Ren F, Long D (2021) Carbon emission forecasting and scenario analysis in Guangdong Province based on optimized fast learning network. J Clean Prod 317:128408
Article Google Scholar
Safdarnejad SM, Hedengren JD, Baxter LL (2015) Plant-level dynamic optimization of Cryogenic Carbon.
Sapnken FE, Noume HC, Tamba JG (2023) Forecasting CO₂ emissions from road fuel combustion using grey prediction models: a novel approach. MethodsX 11:102271
Article Google Scholar
EIA U (2011) State Energy Data System (SEDS). US Department of Energy, Energy Information Administration
Tang D et al (2016) Trend prediction and decomposed driving factors of carbon emissions in Jiangsu Province during 2015–2020. Sustainability 8(10):1018
Article Google Scholar
Terry J (2020) Exploring New York City's path to decarbonization
Thai HT (2022) Machine learning for structural engineering: a state-of-the-art review. In Struct 38:448–491 Elsevier
Article Google Scholar
U.S. Energy Information Administration (EIA), State Energy Data System and EIA calculations made for this analysis. Electric Power Emissions by State (1970 - 2021)
U.S. Energy Information Administration (EIA), State Energy Data System and EIA calculations made for this analysis.Industrial Emissions by State (1970 – 2021)
U.S. Energy Information Administration (EIA), State Energy Data System and EIA calculations made for this analysis. Residential Emissions by State (1970 - 2021)
U.S. Energy Information Administration (EIA), State Energy Data System and EIA calculations made for this analysis. Transportation Emissions by State (1970 - 2021)
Wang N, Zhao Y, Song T, Zou X, Wang E, Du S (2022) Accounting for China’s net carbon emissions and research on the realization path of carbon neutralization based on ecosystem carbon sinks. Sustainability 14(22):14750
Article Google Scholar
Wiley, C., 2023. New York’s Climate Leadership and Community Protection Act Standards for Renewable and Clean Energy: An Evaluation of Policy Implementation (Doctoral dissertation)
Zeng B, Zhou M, Zhang J (2017) Forecasting the energy consumption of China’s manufacturing using a homologous grey prediction model. Sustainability 9(11):1975
Article Google Scholar
Zhang S, Wang J, Zheng W (2018) Decomposition analysis of energy-related CO₂ emissions and decoupling status in China’s logistics industry. Sustainability 10(5):1340
Article Google Scholar
Zhao HN, Yu WY (2013) Research on influence factors of carbon emissions and forecast in Hebei province. Adv Mat Res 807:790–794
Google Scholar

Download references

Acknowledgements

This work was supported by the National Natural Science Foundation of China (51534005, 51278293and 51178262) and China Environmental Protection Foundation.

Funding

Shanghai Jiao Tong University provides open-access funding.

Author information

Authors and Affiliations

China-UK Low Carbon College, Shanghai Jiao Tong University, 3 Yinlian Road, Shanghai, 201306, People’s Republic of China
Glukhova Ekaterina & Jia Li

Authors

Glukhova Ekaterina
View author publications
You can also search for this author in PubMed Google Scholar
Jia Li
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

EG performed the methodology employed in this study, allowing for accurate data collection. EG and JL carried out the experiment design and implementation. JL provided conceptualization and played a vital role in the project administration and supervision. The original draft was written by EG, who also conducted data analysis and diagramming. JL contributed significantly to the document's writing, reviewing, and editing process. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Jia Li.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

All authors agree to the publication of this manuscript.

Competing interests

The authors declare no competing financial interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1.

Supplementary Material 2.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Ekaterina, G., Li, J. Analysis of decarbonization path in New York state and forecasting carbon emissions using different machine learning algorithms. Carb Neutrality 3, 8 (2024). https://doi.org/10.1007/s43979-024-00083-x

Download citation

Received: 25 October 2023
Revised: 19 January 2024
Accepted: 23 February 2024
Published: 27 March 2024
DOI: https://doi.org/10.1007/s43979-024-00083-x

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Analysis of decarbonization path in New York state and forecasting carbon emissions using different machine learning algorithms

Abstract

Similar content being viewed by others

Strategies to achieve a carbon neutral society: a review

Investigating the assumption of perfect displacement for global energy transition: panel evidence from 73 economies

Does a Carbon Tax Reduce CO2 Emissions? Evidence from British Columbia

1 Introduction

2 Literature review

2.1 Literature on Carbon emission studies and reports

2.2 Literature on machine learning algorithms

3 Methodology and data collection

3.1 Data collection

3.2 Description of the implementation of the four algorithms and the hyperparameters used

3.2.1 Basic methodology of extreme learning machine algorithm

3.2.2 Basic methodology of support vector machines

3.2.3 Principal of the SVM-ELM model

3.2.4 Hybrid GM (1,1) optimized by SVM -ELM algorithm

3.3 Performance metrics used for evaluating the accuracy of the models

3.4 Experimental setup and implementation

3.5 New York State decarbonization pathways analysis

3.6 Scenario settings

4 Results and discussion

4.1 NY state CO2 emission data analysis and pre-processing

4.2 New York State carbon emission forecasting results and projections for fuel categories

4.3 NY state CO2 emission scenario-based prediction

4.4 Suggestions

5 Conclusion

Availability of data and materials

Abbreviations

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Additional information

Publisher’s Note

Supplementary Information

Supplementary Material 1.

Supplementary Material 2.

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation

4.1 NY state CO₂ emission data analysis and pre-processing

4.3 NY state CO₂ emission scenario-based prediction