State-of-health estimation of Lithium-ion battery based on back-propagation neural network with adaptive hidden layer

The reliability and safety of lithium-ion batteries (LIBs) are key issues in battery applications. Accurate prediction of the state-of-health (SOH) of LIBs can reduce or even avoid battery-related accidents. In this paper, a new back-propagation neural network (BPNN) is proposed to predict the SOH of LIBs. The BPNN uses as input the LIB voltage, current and temperature, as well as the charging time, since the latter is strongly correlated with the SOH. The number of hidden layer nodes is set adaptively based on the training data in order to improve the generalization capability of the BPNN. The effectiveness and robustness of the proposed scheme are verified using four distinct battery datasets and different training data. Experimental results show that the new BPNN is able to accurately predict the SOH of LIBs, revealing superiority when compared to other alternatives.


Introduction
The rapid development of LIBs has led to an increasing electrification of transportation systems, namely electric vehicles (EVs). Many countries have enacted policies to stimulate the development of EVs in order to reduce greenhouse gas emissions and save non-renewable energy [1]. The battery energy [2], charging [3] and thermal [4] management are important components of the battery management system (BMS). In order to make sure that a LIB operates safely and reliably, the SOH is estimated to assess its working condition [5][6][7]. According to the IEEE 1188-1996 standard, the SOH of a LIB can be defined as:

$$ \mathrm{SOH} = \frac{C_{now}}{C_{new}} \times 100\% $$

where C_now and C_new are the current and the nominal capacity of the LIB, respectively. To predict the SOH of a LIB, the internal resistance and capacity are often used as characteristics of aging [8][9][10], except when dealing with the solid phase diffusion time of lithium-ions in the positive electrode [11], or when monitoring aspects like the cyclable lithium-ions [12]. By using capacity, DC pulse or electrochemical impedance spectroscopy (EIS) [13,14] tests, among others, the LIB health parameters, such as voltage, current, temperature and charging time, are obtained and can be used directly in distinct methods to predict the SOH. For example, the capacity can be obtained by measuring the charge transferred through the battery during the charging or discharging phases [15], and the internal resistance can be determined by calculating the instantaneous voltage drop during a pulse test [16]. However, existing direct methods have limitations in practical applications. Indeed, they require very accurate sensor measurements, the battery must be put out of service for testing (e.g., in capacity or EIS tests), and specific methods are restricted to specific test systems (e.g., DC pulses with high currents are not allowed by the BMS, as they are seen as abnormal operating conditions).
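As a minimal illustration of the direct methods just described, the sketch below computes the capacity by coulomb counting (integrating the measured current over a discharge phase) and the SOH from the definition above. The function names are ours, for illustration only, and do not come from any battery library.

```python
def capacity_from_current(current_a, time_s):
    """Estimate the transferred charge (Ah) by coulomb counting:
    trapezoidal integration of the measured current over time."""
    q = 0.0  # accumulated charge in ampere-seconds
    for k in range(1, len(time_s)):
        q += 0.5 * (current_a[k] + current_a[k - 1]) * (time_s[k] - time_s[k - 1])
    return q / 3600.0  # A*s -> Ah


def soh(c_now_ah, c_new_ah):
    """SOH (%) as the ratio of current to nominal capacity."""
    return c_now_ah / c_new_ah * 100.0
```

For example, a constant 2 A current sustained for one hour corresponds to 2 Ah of transferred charge.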
In the last few years, several methods have been proposed and applied to SOH estimation, yielding real-time assessment of LIBs [17]. These approaches can be generally divided into model-based and data-driven methods. The most common model-based approaches rely on electrochemical models or equivalent circuit models (ECMs) [18][19][20][21][22]. Electrochemical models can accurately describe the dynamics of LIBs, but the modeling process is complicated and the required computational cost is high, which hinders their use in practical applications. Conversely, ECMs are easier to obtain and involve a smaller computational burden; however, their accuracy strongly depends on the values of the models' parameters, which are difficult to estimate. Due to the complexity and limitations of model-based approaches, data-driven techniques have gained interest in SOH estimation. Indeed, a variety of machine learning (ML) algorithms have been proposed, namely artificial neural networks (ANN) [23,24], support vector machines (SVM) [25,26] and relevance vector machines (RVM) [27,28]. In terms of structure, the most recent versions rely on deep learning techniques [29]. Among ML methods, ANNs exhibit excellent accuracy, adaptability, generalization capability and robustness [30]. For example, a nonlinear autoregressive scheme with exogenous input was proposed in [31], while an extreme learning machine (ELM) was used in [32] for SOH estimation.
Choosing the input data is the first step before training a ML method. Therefore, the dataset to be used as input to the BPNN is worth exploring. The battery data related to aging are either external or internal. The external data include, for example, the temperature, the charging and discharging rates, and the depth of discharge [17]. The internal data refer mainly to physical and chemical properties of the LIB, such as the growth of the solid electrolyte interphase (SEI) layer, self-discharge, and decomposition of the anode [33]. In real-world applications, the BMS sensors can collect data such as the battery terminal voltage, the current, the surface temperature, and the time of charge and discharge. Using multiple types of data improves the accuracy of the SOH estimation. Current, voltage and temperature have often been used as input to ML algorithms [31,34]. However, correlation analysis reveals that the charging time of the LIB is even more closely related to the SOH than these quantities. Therefore, using the charging time as input to the BPNN is crucial for high accuracy estimation of the SOH.
The BPNN is one of the most convenient types of ANN for estimation purposes due to its capabilities of nonlinear mapping, self-learning, adaptability, generalization and fault tolerance. Several methods have been applied to improve the performance of existing BPNNs [35][36][37][38] and have shown promising results. However, in previous research, the BPNN structure has not been fully explored, especially with regard to the number of hidden layer units. Indeed, most BPNNs in the references above use a fixed number of hidden layer units, or set the number of hidden layer units to a value in a given interval [39,40]. A BPNN with a fixed number of hidden layer units may yield excellent results for a given training set, but behave poorly when the training data change. For example, it was shown in [41] that, for an optimal fractional order of 7/9 and a learning rate of 0.5, the prediction accuracy of a BPNN with 500 hidden layer units was lower than that obtained with 300 or 200 hidden layer units, while for a learning rate of 1 the number of hidden layer units should be set to 500 to obtain the best prediction. Therefore, it is crucial to have a method that finds the optimal number of hidden layer units, so that the BPNN exhibits the best performance as the training data change.
In this paper, a SOH prediction method based on a BPNN with adaptive hidden layer is proposed. The method takes the mean absolute error (MAE) of the prediction on a training dataset and chooses the number of hidden layer units that minimizes the MAE. Compared with other methods, the proposed scheme determines the optimal number of hidden layer units during the training phase, instead of fixing its value after experimentation, thus reducing the time required for setting up the network and improving the accuracy of the SOH prediction. The main contributions of the paper are: 1) The proposed method determines the optimal BPNN structure adaptively during the training process, relying only on the results obtained on a training dataset. This is different from other neural network algorithms, which adjust the hyperparameters after each experiment; 2) The charging time of the LIB is employed, along with the voltage, current and temperature, as input to the BPNN; 3) The new BPNN is applied to four distinct LIBs and different training datasets, and its prediction results are compared with those obtained with two other BPNNs. It is shown that the proposed scheme, by adaptively changing the structure of the network according to the dataset, leads to superior SOH prediction; 4) The proposed BPNN is shown to require only 50% of the data for training, yielding a mean absolute percentage error (MAPE) lower than 0.8%.
The remainder of the paper is organized as follows. Section 2 introduces the LIBs datasets and the parameters used for SOH estimation. Section 3 presents the new BPNN algorithm developed. Section 4 shows the experimental results obtained with the proposed BPNN. Section 5 draws the conclusions.

Datasets and features for SOH estimation
Four battery aging datasets from the NASA Ames Prognostics Center of Excellence (PCoE) repository are used in the follow-up [42]. Figure 5 shows that the temperature of the LIB rises as the number of charging and discharging cycles increases: the peak temperature is reached faster, and the average temperature increases. Moreover, the temperature variation of the different batteries is similar, with the exception of B0018 (18#), which exhibits a sudden change in temperature at the beginning of the charging and discharging phases.
Based on the above discussion, one can see that the charging time is related to the aging state of the LIB. To explore this relationship more closely, the grey relational analysis (GRA) method is applied here. This method can calculate the correlation between parameters with very little information [43]. The GRA results are shown in Fig. 6. We verify that the charging time of the LIB is more closely related to the SOH than the voltage, current and temperature. Therefore, in the follow-up, the charging time is adopted as input to the BPNN, while the LIB capacity related to the charge-discharge cycle is selected as the output.
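The GRA correlation can be sketched as follows. This is a minimal illustration using the common resolution coefficient rho = 0.5, not the authors' implementation; it assumes non-constant sequences of equal length.

```python
def grey_relational_grade(reference, comparison, rho=0.5):
    """Grey relational grade between a reference sequence (e.g. SOH) and a
    comparison sequence (e.g. charging time), after min-max normalization.
    rho is the resolution coefficient, commonly set to 0.5."""
    def norm(seq):
        lo, hi = min(seq), max(seq)
        return [(v - lo) / (hi - lo) for v in seq]

    x0, x1 = norm(reference), norm(comparison)
    deltas = [abs(a - b) for a, b in zip(x0, x1)]
    d_min, d_max = min(deltas), max(deltas)
    # grey relational coefficient at each point, then average -> grade
    coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in deltas]
    return sum(coeffs) / len(coeffs)
```

A grade closer to 1 indicates a stronger relation between the two sequences.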

The BPNN
The structure of the BPNN is shown in Fig. 7. Let us consider the training dataset $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_i, y_i), \ldots, (x_n, y_n)\}$, $x_i \in \mathbb{R}^a$, $y_i \in \mathbb{R}^b$, where n is the number of samples in the dataset, x_i and y_i are the input and output of the i-th sample, i = 1, 2, ..., n, and a and b represent the dimensions of the input and output vectors, respectively.
We normalize all parameters by:

$$ z'_i = \frac{z_i - z_{min}}{z_{max} - z_{min}} $$

where z'_i and z_i represent the normalized and non-normalized values, and z_max and z_min denote the maximum and minimum values of the data, respectively.
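This min-max normalization can be written compactly as:

```python
def minmax_normalize(values):
    # z'_i = (z_i - z_min) / (z_max - z_min), mapping the data into [0, 1];
    # assumes the sequence is not constant (z_max != z_min)
    z_min, z_max = min(values), max(values)
    return [(z - z_min) / (z_max - z_min) for z in values]
```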
The output of the hidden layer is:

$$ h_j = g\left( \sum_{i=1}^{a} w_{ij}\, x_i - \theta_j \right) $$

where g denotes the activation function of the hidden layer, h_j is the j-th output node of the hidden layer, w_ij represents the weight connecting the i-th input node to the j-th hidden node, and θ_j is the threshold of the j-th hidden node. The output of the network is:

$$ \hat{y}_k = f\left( \sum_{j} w_{jk}\, h_j - d_k \right) $$

where ŷ_k represents the prediction of the k-th node, f is the activation function of the output layer, w_jk is the weight connecting the hidden and output layers, and d_k is the threshold of the k-th output node.
According to the model of the BPNN, the mean square error (MSE) of the sample (x_l, y_l) is calculated from the actual and predicted values:

$$ E_l = \frac{1}{2} \sum_{k=1}^{b} \left( \hat{y}_k^{\,l} - y_k^{\,l} \right)^2 $$

The BPNN is an iterative learning algorithm, where the updated estimate of any parameter v is given by:

$$ v' = v + \Delta v $$

After the training target error is given, the BPNN uses the gradient descent method to adjust the weights of the network. Given the learning rate η, the update formulas for the weights are:

$$ w'_{ij} = w_{ij} - \eta \frac{\partial E_l}{\partial w_{ij}}, \qquad w'_{jk} = w_{jk} - \eta \frac{\partial E_l}{\partial w_{jk}} $$

where w'_ij and w'_jk are the weights after iterating. The goal of the BPNN is to minimize the accumulated error over the training set D:

$$ E = \frac{1}{n} \sum_{l=1}^{n} E_l $$
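The forward pass and gradient descent updates above can be sketched as a minimal single-hidden-layer network. The layer sizes, learning rate, sigmoid hidden activation and linear output layer are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class TinyBPNN:
    """Single-hidden-layer BPNN trained by gradient descent on one sample
    at a time (a sketch of the update rules above)."""

    def __init__(self, a, h, b, eta=0.1):
        self.W1 = rng.normal(0.0, 0.5, (a, h))  # input -> hidden weights w_ij
        self.th = np.zeros(h)                   # hidden thresholds theta_j
        self.W2 = rng.normal(0.0, 0.5, (h, b))  # hidden -> output weights w_jk
        self.d = np.zeros(b)                    # output thresholds d_k
        self.eta = eta

    def forward(self, x):
        # h_j = g(sum_i w_ij x_i - theta_j); linear output layer
        self.hid = sigmoid(x @ self.W1 - self.th)
        return self.hid @ self.W2 - self.d

    def step(self, x, y):
        y_hat = self.forward(x)
        err = y_hat - y                         # dE/dy_hat for E = 0.5*||err||^2
        # gradient descent: w' = w - eta * dE/dw (and likewise for thresholds)
        self.W2 -= self.eta * np.outer(self.hid, err)
        self.d -= self.eta * (-err)
        dh = (self.W2 @ err) * self.hid * (1.0 - self.hid)
        self.W1 -= self.eta * np.outer(x, dh)
        self.th -= self.eta * (-dh)
        return 0.5 * float(err @ err)           # sample error E_l
```

Repeatedly calling `step` on the training samples drives the accumulated error down, as in the iterative scheme above.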

The BPNN with adaptive hidden layer
In most BPNNs, the number of hidden layer nodes is fixed [39,44]. Although the network can perform well for a given number of nodes and training dataset, its performance may degrade seriously when the training dataset changes. Therefore, the BPNN model needs to be re-designed. Indeed, when the network has too many hidden layer nodes, over-fitting may occur if the training data carry insufficient information, or the training times may become long otherwise. Conversely, when the network has too few neurons in the hidden layer, under-fitting will result. The number of hidden layer nodes is chosen by means of the empirical formula:

$$ h_{max} = \sqrt{r + s} + \rho \tag{10} $$

where h_max is the maximum number of hidden layer nodes, r and s denote the numbers of input and output nodes, respectively, and ρ is a constant less than 10 [45]. Herein, the mean absolute error (MAE) is used to quantitatively evaluate the accuracy on the training set, that is:

$$ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| $$

where y_i and ŷ_i denote the real and predicted values of the dataset. Usually, there exists a positive correlation between the accuracy on the training set and on the test set [46], meaning that high (low) training accuracy often leads to high (low) test accuracy. Therefore, we vary the number of hidden layer nodes of the BPNN in the interval [1, h_max] and compute the training set error. We denote the error of the last calculation as MAE_new, the corresponding number of hidden layer nodes as h_new, the minimum error over all previous calculations as MAE_previous, and the corresponding number of hidden layer nodes as h_previous. The proposed optimal number of hidden layer nodes is given by:

$$ h_{best} = \begin{cases} h_{new}, & \mathrm{MAE}_{new} < \mathrm{MAE}_{previous} \\ h_{previous}, & \text{otherwise} \end{cases} $$

The proposed BPNN with adaptive number of hidden layer nodes is depicted in Fig. 8. The algorithm is as follows:
1. Initialization: set the parameters, namely the numbers of nodes of the input and output layers, the training epochs and the learning rate;
2. Calculate the maximum number of hidden layer nodes h_max by means of the empirical formula (10), and consider candidate numbers of hidden layer nodes between 1 and h_max;
3. Train the BPNN with each candidate number of hidden layer nodes and calculate the corresponding MAE;
4. Choose the number of hidden layer nodes that yields the minimum MAE as the best hidden layer size, h_best;
5. Train the BPNN with the optimal number of hidden layer nodes on the training samples, thus creating the best-fitting network;
6. Use the resulting network to forecast the test samples.

Fig. 9 Results using 70% of the data as the training dataset: (a) B0005, (b) B0006, (c) B0007, (d) B0018
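Steps 1-4 above amount to a scan over candidate hidden layer sizes. In the sketch below, `train_and_mae` is a hypothetical user-supplied routine that trains a BPNN with h hidden nodes and returns its training-set MAE; only the selection logic is shown.

```python
import math


def adaptive_hidden_search(train_and_mae, r, s, rho=5):
    """Scan candidate hidden layer sizes 1..h_max and keep the one with the
    lowest training-set MAE. r and s are the numbers of input and output
    nodes; rho is the constant (< 10) of the empirical formula."""
    h_max = int(round(math.sqrt(r + s) + rho))
    h_best, mae_best = None, float("inf")
    for h in range(1, h_max + 1):
        mae_new = train_and_mae(h)
        if mae_new < mae_best:  # keep h_new when MAE_new < MAE_previous
            h_best, mae_best = h, mae_new
    return h_best, mae_best
```

With four inputs (voltage, current, temperature, charging time) and one output (capacity), the candidate range stays small, so the scan adds little training overhead.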

Results
In this Section, the effectiveness and generalization capability of the proposed BPNN are analyzed and discussed. The health parameters mentioned in Sect. 2 (or a subset thereof) are used as input to the BPNNs tested, and the capacity of the batteries is taken as the output. The proposed adaptive (AD) hidden layer four-dimensional (4D) network (AD4DBPNN), with the voltage, current, temperature and charging time as input parameters, is compared with an AD hidden layer three-dimensional (3D) network (AD3DBPNN), with the voltage, current and temperature as inputs, and with a fixed hidden layer 4D network (4DBPNN) having the same inputs as the AD4DBPNN.
It should be noted that the method proposed in this paper aims to adaptively change the number of hidden layer units in order to obtain the optimal neural network structure every time the dataset changes. Indeed, the optimal network structure for two distinct training datasets, let us say A and B, is not necessarily the same. Therefore, if the network structure is set based on the training dataset A, then this network may not be suitable for using with dataset B. Thus, the network may be suboptimal and poor prediction results may be obtained.
In the follow-up, we use the front part of a single dataset for training and the latter part for testing. The number of hidden layer nodes of the 4DBPNN is set equal to the optimal number of hidden layer nodes determined for the AD4DBPNN when trained using 70% of the data for each battery and, then, it is kept fixed.
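The front/latter partition of each dataset can be expressed as a trivial helper, assuming the samples are ordered by cycle number:

```python
def front_split(data, ratio=0.7):
    # use the front `ratio` of a cycle-ordered dataset for training
    # and the remaining part for testing
    cut = int(len(data) * ratio)
    return data[:cut], data[cut:]
```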
The AD4DBPNN, AD3DBPNN and 4DBPNN are implemented using the software MATLAB 2021. With the exception of the number of hidden layer nodes and the dimension of the input, all BPNNs use the same structure and features. The root mean squared error (RMSE) [47] and the MAPE [48] are used as evaluation functions:

$$ \mathrm{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2 } $$

$$ \mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\hat{y}_i - y_i}{y_i} \right| \times 100\% $$

where ŷ_i and y_i represent the estimated and true values of the SOH, respectively.

First, 70% of the data of the four batteries are used for training, and the remaining 30% are used for testing the effectiveness of the BPNNs. Figure 9 depicts the results of the AD4DBPNN, AD3DBPNN and 4DBPNN, from which we verify that the predictions are relatively close to the real values. When the charging time is included as input (AD4DBPNN and 4DBPNN), the networks yield smaller errors than the AD3DBPNN. Figures 10 and 11 show the RMSEs and MAPEs, respectively. One can see that, after adding the charging time as an input, the RMSE for the battery B0005 decreases from 1.87% to 0.41%. For the other three LIBs, the AD4DBPNN still has the lowest RMSE, or values close to those of the 4DBPNN. Moreover, with the charging time as input, the MAPE decreased by 0.4%, 0.52% and 0.35% for the batteries B0006, B0007 and B0018, respectively.

Fig. 12 Results using 60% of the data as the training dataset

Then, we reduce the proportion of the training set from 70% to 60% of the data. Figure 12c and d highlight that the predictions of the AD4DBPNN have higher accuracy than those of the AD3DBPNN. This means that, without the charging time as input to the neural network, the error of the SOH prediction becomes larger. From Fig. 12a-d, one can see that the prediction results of the AD4DBPNN are closer to the true values than those of the 4DBPNN, whose hidden layer is kept unchanged. As shown in Fig. 13, the RMSE of the predicted results of the AD4DBPNN is 30.3% lower than that of the 4DBPNN. As shown in Fig. 14, for the batteries B0006, B0007 and B0018, the MAPE of the AD4DBPNN decreased by 36.3%, 23.5% and 23.4%, respectively, compared with the values of the 4DBPNN. This shows that, for the same dataset, decreasing the proportion of training data while keeping the number of hidden layer units of the BPNN constant leads to a local optimum. This behavior means that the errors are small on the training set (the first 60%) and large on the testing set (the last 40%), which can be seen in Fig. 12 in the region where the capacity drops from 1.5 to 1.4.
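The two evaluation functions can be sketched as:

```python
import math


def rmse(y_true, y_pred):
    # root mean squared error between true and estimated SOH values
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    # mean absolute percentage error, in percent
    return 100.0 * sum(abs((p - t) / t) for t, p in zip(y_true, y_pred)) / len(y_true)
```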
Finally, the training set was further reduced to 50% of the data. Figure 15 shows the performance of the three BPNNs. It can be seen that the AD4DBPNN exhibits the best generalization and accuracy, while the results of the AD3DBPNN and 4DBPNN show different degrees of fluctuation. Figure 16 compares the RMSE of the predicted results. We verify that the AD4DBPNN still maintains good accuracy, namely 1.45% for the worst performing battery (B0006). In contrast, both the AD3DBPNN and the 4DBPNN have RMSEs exceeding 1.9%, showing poor predictive capability. Figure 17 shows that the proposed method achieves 0.65% MAPE on the worst performing dataset, while the other two algorithms achieve 0.95% and 1.18%, respectively. Table 1 summarizes the best number of hidden layer nodes of the BPNNs when the training set changes. It can be seen that the number of hidden layer nodes yielding the best results is different for different training sets.
In general, the AD4DBPNN is more capable of predicting the SOH of LIBs when the training dataset changes. When the charging time is used as input for training, the prediction accuracy is effectively improved. In addition, when the proportion of training data drops to 50% of the data, the maximum MAPE of the AD4DBPNN is only 0.65%, which proves its effectiveness. From Figs. 9-17, the AD4DBPNN maintains high accuracy when the battery dataset used to train the neural network changes. When the dataset changes from B0007 to B0018, the total number of samples in the dataset decreases from 168 to 132; even so, the AD4DBPNN still predicts the SOH with very high accuracy. The above shows that the AD4DBPNN has good generalization capability.

Conclusion
This paper proposed a three-layer, four-dimensional, adaptive hidden layer network (AD4DBPNN), with the voltage, current, temperature and charging time as input parameters, to predict the SOH of LIBs. The method determines the optimal number of hidden layer units based on the MAE of the prediction on a training dataset. The accuracy of the SOH prediction is improved by including the charging time as input to the BPNN, since it is highly correlated with the SOH of the LIB. Even when the ratio of training data is reduced to 50% of the full dataset, the proposed method shows good accuracy, with only 1.45% RMSE and 0.65% MAPE on the worst performing dataset. Although the proposed method uses an adaptive scheme to determine the optimal structure of the BPNN, it still relies on the gradient descent method. Future work will focus on adopting a suitable algorithm to optimize the iterative update of the network weights and thresholds.

Declarations
Conflict of interest The authors declare no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.