1 Introduction

Lithium-ion batteries have been widely used in commercial products such as electric vehicles and smartphones, thanks to their relatively high energy density and long service life [1, 2]. For safe operation, reliable and timely detection of various faults associated with batteries is necessary [3]. Among those faults, detecting internal short circuits (ISCs) of batteries is of significant importance, as ISC is one of the major causes of battery thermal runaway [4, 5]. Various abuse conditions, such as mechanical (e.g., collision and puncture), electrical (e.g., overcharge/discharge), and thermal (e.g., high-temperature heating) abuses, can result in ISC [6, 7]. Typically, an ISC can be described as the penetration of the separator by a lithium dendrite, leading to an electrical connection (i.e., short circuit) between the high-potential and low-potential components of the battery [8]. As a result, battery energy will be depleted through the short circuit and dissipated as heat, which imposes safety threats by overheating the battery [9, 10].

Different approaches have been proposed in the literature for ISC detection of single battery cells. For example, a recursive least squares algorithm was adopted in Ref. [11] to identify ISC based on abnormal state of charge (SOC) depletion and heat generation. In Ref. [12], electrochemical impedance spectroscopy was applied to infer ISC of different types. The open-circuit voltage (OCV) was utilized in Ref. [13] to determine the SOC depletion and estimate the ISC severity. A random forest model that correlates the slope of the OCV-SOC curve and internal resistance to ISC was developed in Ref. [14]. These studies show that ISC detection of single battery cells can be achieved using the OCV, SOC, internal resistance, and temperature information. Nevertheless, the cell-level ISC detection approaches cannot be applied to battery strings and modules unless individual cells are equipped with independent current sensors.

For ISC detection within battery strings, a considerable number of studies have utilized the difference in dynamic responses among individual cells. For instance, the difference between the SOC of one individual cell and the mean SOC of other cells in a series battery string was leveraged in Ref. [15] to assess the ISC severity. In Ref. [16], the correlation coefficients of voltage among individual cells were computed to identify an off-trend voltage drop for ISC detection. Deviations in the temperatures of series-connected cells were exploited to infer ISC status in Ref. [17]. Since assessing the difference among individual cells can be easier than assessing the conditions of individual cells, these approaches may remain effective even under sparse sensor placement. Nevertheless, their effectiveness relies on the assumption of uniform cell capacity and internal resistance [14].

This paper investigates the ISC detection problem of parallel-connected battery cells and aims at tackling the challenges in two aspects. First, the solution to the problem will be developed assuming sensor limitation, i.e., only one current sensor and one voltage sensor for a parallel battery string. Second, non-uniformity in cell capacity and internal resistance is assumed as cell-to-cell variations inevitably exist under manufacturing variability or non-uniform operating conditions (e.g., exposed to different temperatures) [18,19,20]. In this work, ISC detection is formulated as a binary classification problem. The objective is to determine whether the ISC of a string is severe enough for an immediate examination or cell replacement. To understand the effect of ISC on the dynamic response of parallel-connected cells, an electro-thermal model with ISC captured explicitly is first derived as a virtual test-bed to generate data for different ISC conditions. Through analyzing the ISC data, it is identified that the distribution of surface temperature among individual cells is a key indicator for ISC detection, given the limited sensing capability in onboard applications. The convolutional neural network (CNN) [21] is adopted for ISC detection using cell surface temperature and the total capacity of the string as inputs. Performance evaluation of the proposed CNN on data with noisy inputs validates the effectiveness and robustness of the proposed approach.

The contribution of this paper is three-fold. First, a model of parallel-connected battery cells with ISC is proposed by modifying the electrical and thermal models of a healthy battery string. Second, the electrical and thermal responses of parallel-connected battery cells are analyzed to reveal ISC-related signatures applicable for ISC detection. Third, a CNN-based ISC detection algorithm that exploits the cell temperature distribution and string capacity information is proposed and validated through extensive simulation studies.

The remainder of this paper is organized as follows: Sect. 2 introduces the electro-thermal models for the parallel battery string with ISC. The electrical and thermal responses of the battery string and the formulation of the ISC detection problem are presented in Sect. 3. A CNN-based ISC detector is proposed and evaluated in Sect. 4. Finally, concluding remarks and a plan for future work are provided in Sect. 5.

2 System Modeling

In this section, the electro-thermal model of a battery cell is first introduced. Then, the electro-thermal model is modified to account for the existence of ISC. Finally, the battery string model containing one cell with ISC is presented, which will be used to generate data for subsequent analysis and algorithm development.

Fig. 1 Model of the parallel-connected battery cells with ISC

2.1 Electro-Thermal Model of a Healthy Battery Cell

Consider the dynamic model of a healthy battery cell as shown in Fig. 1a. It consists of three parts. The electrical model describes the relationship between current and voltage. The thermal model characterizes temperature variations caused by heterogeneous heat generation between cells. The resistance update model quantifies the dependency of cell internal resistance on the capacity, temperature, and SOC of a cell.

2.1.1 Electrical Model

The first-order equivalent circuit model (ECM), as shown in Fig. 1b, is used, given its adequate fidelity and low computational burden [22]. The dynamics of the ECM are given as [23]

$$\begin{aligned} \begin{bmatrix} \dot{v}_{\text {OC},j} \\ \dot{v}_{\text {c},j} \end{bmatrix} = \begin{bmatrix} 0 &{} 0 \\ 0 &{} -\frac{1}{\tau _j} \end{bmatrix} \begin{bmatrix} v_{\text {OC},j} \\ v_{\text {c},j} \end{bmatrix} + \begin{bmatrix} \alpha _j \\ \frac{R_{\text {t},j}}{\tau _j} \end{bmatrix} i_{\text {b},j} \end{aligned}$$
(1a)
$$\begin{aligned} v_\text {b} = \begin{bmatrix} 1&-1 \end{bmatrix} \begin{bmatrix} v_{\text {OC},j} \\ v_{\text {c},j} \end{bmatrix} - R_{\text {s},j}i_{\text {b},j} \end{aligned}$$
(1b)

where \(v_{\text {OC},j}\), \(i_{\text {b},j}\), and \(R_{\text {s},j}\) are the OCV, current (positive for discharge and negative for charge), and ohmic resistance of the \(j^{\text {th}}\) cell, \(R_{\text {t},j}\), \(\tau _{j}\), and \(v_{\text {c},j}\) are the resistance, time constant, and voltage of the RC pair of the \(j^{\text {th}}\) cell, and \(v_\text {b}\) is the terminal voltage. The coefficient \(\alpha _j\) in Eq. (1a), which describes the relationship between OCV and charge/discharge capacity, is given by

$$\begin{aligned} \alpha _j = -\frac{s_j}{Q_{j}} \end{aligned}$$
(2)

where \(Q_j\) is the capacity of the \(j^{\text {th}}\) cell, and \(s_j \ge 0\) is the slope of the OCV-SOC curve of the \(j^{\text {th}}\) cell.
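To make the use of Eqs. (1) and (2) in simulation concrete, the following Python sketch advances a single healthy cell by one time step. The function name `ecm_step`, the forward-Euler discretization, and the convention that the capacity is expressed in ampere-seconds are assumptions made for illustration; they are not taken from the original study.

```python
def ecm_step(v_oc, v_c, i_b, dt, s, q, r_t, tau, r_s):
    """Advance the first-order ECM of one cell by dt seconds (forward Euler).

    v_oc, v_c : OCV and RC-pair voltage (V)
    i_b       : cell current (A), positive for discharge
    s         : slope of the OCV-SOC curve (V per unit SOC), s >= 0
    q         : cell capacity in ampere-seconds (assumed unit)
    r_t, tau  : resistance (ohm) and time constant (s) of the RC pair
    r_s       : ohmic resistance (ohm)
    """
    alpha = -s / q                           # Eq. (2)
    dv_oc = alpha * i_b                      # first row of Eq. (1a)
    dv_c = -v_c / tau + (r_t / tau) * i_b    # second row of Eq. (1a)
    v_oc_next = v_oc + dt * dv_oc
    v_c_next = v_c + dt * dv_c
    v_b = v_oc_next - v_c_next - r_s * i_b   # Eq. (1b)
    return v_oc_next, v_c_next, v_b
```

With zero current the states remain unchanged, whereas a positive (discharge) current lowers the OCV and raises the RC-pair voltage, in line with the sign convention stated above.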

2.1.2 Thermal Model

The lumped-parameter thermal model for a battery cell is adopted from Ref. [24]. This study assumes that the cells are separated by adiabatic layers, as shown in Fig.  1d. Therefore, heat conduction between adjacent cells is negligible, and the thermal dynamics of the \(j^{\text {th}}\) cell are represented with the cell core (\(T_{\text {c},j}\)) and surface (\(T_{\text {s},j}\)) temperatures as

$$\begin{aligned} \begin{bmatrix} \dot{T}_{\text {c},j} \\ \dot{T}_{\text {s},j} \end{bmatrix} = \begin{bmatrix} -\frac{1}{C_\text {c}R_{\uptheta }} &{} \frac{1}{C_\text {c}R_{\uptheta }} \\ \frac{1}{C_\text {s} R_{\uptheta }} &{} -\frac{h}{C_\text {s}}-\frac{1}{C_\text {s} R_{\uptheta }} \end{bmatrix} \begin{bmatrix} T_{\text {c},j} \\ T_{\text {s},j} \end{bmatrix} + \begin{bmatrix} \frac{H_j}{C_\text {c}} \\ \frac{hT_{\text {f},j}}{C_\text {s}} \end{bmatrix} \end{aligned}$$
(3)

where \(R_{\uptheta }\) lumps the conduction and contact thermal resistance between the core and surface of a cell, \(C_\text {c}\) and \(C_\text {s}\) are the core and surface heat capacities of a cell, and \(h\) is the heat transfer coefficient between the cell surface and coolant. Herein, the cells are assumed to have the same thermal properties, i.e., \(R_{\uptheta }\), \(C_\text {c}\), \(C_\text {s}\), and \(h\) are the same for all cells. \(T_{\text {f},j}\) and \(H_{j}\) are the coolant temperature and heat generation, respectively, at the \(j^{\text {th}}\) cell. They are treated as exogenous inputs driving the battery thermal model. The Joule heat \(H_j\) of a healthy cell is assumed to be generated primarily by the ohmic and polarization resistance and is given in Ref. [25] as

$$\begin{aligned} H_j = i^2_{\text {b},j}(R_{\text {s},j}+R_{\text {t},j}) \end{aligned}$$
(4)

As validated in Ref. [26], Eq. (4) can be inaccurate in computing the heat generation under dynamic load profiles but is relatively accurate for modeling cell temperature variations. Therefore, it is adopted in this study as the ISC detection primarily relies on temperature variation.
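As an illustration of how Eqs. (3) and (4) can be evaluated numerically, the sketch below computes the Joule heat of a healthy cell and advances its core and surface temperatures by one step. The function names and the forward-Euler discretization are assumptions for illustration only.

```python
def joule_heat(i_b, r_s, r_t):
    """Heat generation of a healthy cell, Eq. (4), in watts."""
    return i_b ** 2 * (r_s + r_t)


def thermal_step(t_c, t_s, t_f, heat, dt, c_c, c_s, r_theta, h):
    """Advance core/surface temperatures of one cell by dt (forward Euler).

    The two derivatives are the rows of Eq. (3) written out explicitly.
    """
    dt_c = (t_s - t_c) / (c_c * r_theta) + heat / c_c
    dt_s = (t_c - t_s) / (c_s * r_theta) - h * (t_s - t_f) / c_s
    return t_c + dt * dt_c, t_s + dt * dt_s
```

When the core, surface, and coolant temperatures are equal and no heat is generated, both derivatives vanish, which is a quick sanity check on the signs in Eq. (3).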

2.1.3 Resistance Update Model

The resistance of a cell depends strongly on its remaining capacity, SOC, and temperature [27]. Therefore, the cell resistance at a given SOC level (\(R_{ {x},0}\)) is first computed by linearly interpolating the experimental data from Ref. [28], which were collected at different SOC levels. The effect of capacity and temperature variations on the ohmic and diffusion resistance is then accounted for as in Refs. [29, 30]

$$\begin{aligned} R_{ {x},j} = \epsilon (1 + \kappa (T_{\text {c},j}-T_{\text {c},0})) \left( \frac{Q_{0}}{Q_{j}} \right) ^{\lambda }R_{{x},0}, \; \text {for}\, {x} = {\text {s}},{\text{t}} \end{aligned}$$
(5)

where \(\epsilon\), \(\kappa\), and \(\lambda\) are empirical coefficients with \(\epsilon > 1\), \(\kappa < 0\), and \(\lambda \ge 1\). \(T_{\text {c},0}\) and \(Q_{0}\) are the nominal cell core temperature and capacity, respectively. From Eq. (5), it can be seen that the resistance increases as the temperature or capacity decreases [31].
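Eq. (5) translates directly into a short function. The sketch below is illustrative only; the function and argument names are hypothetical, and the baseline resistance `r_x0` is assumed to have already been interpolated at the current SOC.

```python
def updated_resistance(r_x0, t_c, q, t_c0, q0, eps, kappa, lam):
    """Resistance update of Eq. (5) for x = s (ohmic) or x = t (diffusion).

    r_x0      : baseline resistance at the current SOC (ohm)
    t_c, t_c0 : actual and nominal core temperature
    q, q0     : actual and nominal capacity (same unit)
    eps, kappa, lam : empirical coefficients of Eq. (5)
    """
    temperature_factor = 1.0 + kappa * (t_c - t_c0)
    capacity_factor = (q0 / q) ** lam
    return eps * temperature_factor * capacity_factor * r_x0
```

At the nominal temperature and capacity, both factors equal one and the function returns \(\epsilon R_{x,0}\).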

2.2 Electro-Thermal Model of a Battery Cell with ISC

Typically, there are three types of ISC for lithium-ion cells: (I) a short between two current collectors, (II) a short between one current collector and the anode or cathode, and (III) a short between the anode and cathode. This study considers the type III ISC, which is the most common scenario [32]. Assuming that ISC resistance can properly include the resistance from chemical reactions, the ISC resistance is connected in parallel to the OCV to exclude the current collector resistance (see Fig. 1c). Mathematically, the dynamics of the electrical model in Eq. (1a) are modified as

$$\begin{aligned} \begin{bmatrix} \dot{v}_{\text {OC},j} \\ \dot{v}_{\text {c},j} \end{bmatrix} = \begin{bmatrix} \frac{\alpha _j}{R_{\text {ISC},j}} &{} 0 \\ 0 &{} -\frac{1}{\tau _j} \end{bmatrix} \begin{bmatrix} v_{\text {OC},j} \\ v_{\text {c},j} \end{bmatrix} + \begin{bmatrix} \alpha _j \\ \frac{R_{\text {t},j}}{\tau _j} \end{bmatrix} i_{\text {b},j} \end{aligned}$$
(6)

where \(R_{\text {ISC},j}\) is the ISC resistance of the \(j^{\text {th}}\) cell. To account for the energy depleted by the ISC resistor, the term \(\frac{\alpha _jv_{\text {OC},j}}{R_{\text {ISC},j}}\) (i.e., the product of the ISC current and the coefficient \(\alpha _j\)) is included in the OCV dynamics. As current flows through the ISC resistance, additional heat is generated. Consequently, under a constant-current profile, the Joule heat \(H_j\) is the sum of the heat generated by the ohmic, polarization, and ISC resistance, derived as

$$\begin{aligned} H_j = i^2_{\text {b},j}(R_{\text {s},j}+R_{\text {t},j}) + \frac{v^2_{\text {OC},j}}{R_{\text {ISC},j}} \end{aligned}$$
(7)

From Eq. (7), it can be seen that a smaller ISC resistance leads to more heat generation and hence faster energy depletion during battery charge/discharge.
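The two ISC-related modifications, i.e., the extra term in the OCV dynamics of Eq. (6) and the extra heat term of Eq. (7), can be sketched as follows. The function names are hypothetical and the numerical values in the usage note are arbitrary.

```python
def isc_ocv_rate(v_oc, i_b, alpha, r_isc):
    """Time derivative of the OCV for a cell with ISC (first row of Eq. (6)).

    v_oc / r_isc is the ISC current; alpha = -s / Q as in Eq. (2).
    """
    return alpha * v_oc / r_isc + alpha * i_b


def isc_heat(v_oc, i_b, r_s, r_t, r_isc):
    """Heat generation of a cell with ISC, Eq. (7), in watts."""
    return i_b ** 2 * (r_s + r_t) + v_oc ** 2 / r_isc
```

For instance, at an OCV of 3.3 V and zero load current, the ISC term alone contributes about 21.8 W for a 0.5 Ω short but only about 0.11 W for a 100 Ω short, which illustrates why the ISC severity shows up in the thermal response.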

2.3 Electro-Thermal Model of a Parallel Battery String with ISC

Consider a battery string consisting of N parallel-connected battery cells. A detailed model of a healthy parallel battery string can be found in Ref. [30]. With ISC, the electrical and thermal models of a string need to be modified. Suppose only the \(k^{\text {th}}\) cell of a battery string has ISC. Based on Eqs. (1) and (6), the electrical model of the battery string with ISC can be given as

$$\begin{aligned} \dot{\varvec {{X}}}_\text {s} = \varvec {{A}} \varvec {{X}}_\text {s} + \varvec {{B}} \varvec {{I}}_\text {s} \end{aligned}$$
(8a)
$$\begin{aligned} v_\text {b} = \varvec {C} \varvec {X} _\text {s} + \varvec {D} \varvec {I} _\text {s} \end{aligned}$$
(8b)

where \(\varvec {A} \in \varvec {R}^{2N\times 2N}\), \(\varvec {X} _\text {s} \in \varvec {R}^{2N}\), \(\varvec {B} \in \varvec {R}^{2N \times N}\), \(\varvec {I} _\text {s} \in \varvec {R}^{N}\), \(\varvec {C} \in \varvec {R}^{2N}\), and \(\varvec {D} \in \varvec {R}^{N}\) are given as follows.

$$\begin{aligned} \varvec {A} = \begin{bmatrix} 0 &{} 0 &{} \dots &{} 0 &{} 0 &{} \dots &{} 0 &{} 0 \\ 0 &{} -\frac{1}{\tau _1} &{} \dots &{} 0 &{} 0 &{} \dots &{} 0 &{} 0 \\ &{} &{} \ddots &{} \\ 0 &{} 0 &{} \dots &{} \frac{\alpha _k}{R_{\text {ISC},k}} &{} 0 &{} \dots &{} 0 &{} 0 \\ 0 &{} 0 &{} \dots &{} 0 &{} -\frac{1}{\tau _k} &{} \dots &{} 0 &{} 0 \\ &{} &{}&{}&{} &{} \ddots &{} \\ 0 &{} 0 &{} \dots &{} 0 &{} 0 &{} \dots &{} 0 &{} 0 \\ 0 &{} 0 &{} \dots &{} 0 &{} 0 &{} \dots &{} 0 &{} -\frac{1}{\tau _N} \\ \end{bmatrix} \end{aligned}$$
(8c)
$$\begin{aligned} \varvec {X} _\text {s} = \begin{bmatrix} v_{\text {OC},1}&v_{\text {c},1}&v_{\text {OC},2}&v_{\text {c},2}&\dots&v_{\text {OC},N}&v_{\text {c},N} \end{bmatrix}^{\text{T}} \end{aligned}$$
(8d)
$$\begin{aligned} \varvec {B} = \begin{bmatrix} \alpha _1 &{} 0 &{} \dots &{} 0 &{} 0 \\ \frac{R_{\text {t},1}}{\tau _1} &{} 0 &{} \dots &{} 0 &{} 0 \\ 0 &{} \alpha _2 &{} \dots &{} 0 &{} 0 \\ 0 &{} \frac{R_{\text {t},2}}{\tau _2} &{} \dots &{} 0 &{} 0 \\ &{} &{} \ddots &{} \\ 0 &{} 0 &{} \dots &{} 0 &{} \alpha _N \\ 0 &{} 0 &{} \dots &{} 0 &{} \frac{R_{\text {t},N}}{\tau _N} \\ \end{bmatrix}, \; \varvec {I} _\text {s} = \begin{bmatrix} i_{\text {b},1} \\ i_{\text {b},2} \\ i_{\text {b},3} \\ i_{\text {b},4} \\ \vdots \\ i_{\text {b},N-1} \\ i_{\text {b},N} \end{bmatrix} \end{aligned}$$
(8e)
$$\begin{aligned} \varvec {C} = \begin{bmatrix} 1&-1&0&\dots&0 \end{bmatrix}, \; \varvec {D} = \begin{bmatrix} -R_{\text {s},1}&0&\dots&0 \end{bmatrix} \end{aligned}$$
(8f)

Since the cells are connected in parallel, the currents of individual cells satisfy Kirchhoff’s laws, expressed as

$$\begin{aligned} \varvec {R} \mathbf {I} _\text {s} = \varvec {E} \mathbf {X} _\text {s} + \varvec {F} i_\text {t} \end{aligned}$$
(9a)

where \(\varvec {R} \in \varvec {R}^{N \times N}\), \(\varvec {E} \in \varvec {R}^{N \times 2N}\), and \(\varvec {F} \in \varvec {R}^{N}\) are given as follows.

$$\begin{aligned} \varvec {R} = \begin{bmatrix} 1 &{} 1 &{} 1 &{} \dots &{} 1 &{} 1 \\ -R_{\text {s},1} &{} R_{\text {s},2} &{} 0 &{} \dots &{} 0 &{} 0 \\ 0 &{} -R_{\text {s},2} &{} R_{\text {s},3} &{} \dots &{} 0 &{} 0 \\ \vdots &{} \vdots &{} &{} \ddots &{} \\ 0 &{} 0 &{} 0 &{} \dots &{} -R_{\text {s},N-1} &{} R_{\text {s},N}\\ \end{bmatrix} \end{aligned}$$
(9b)
$$\begin{aligned} \varvec {E} = \begin{bmatrix} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} \dots &{} 0 &{} 0 &{} 0 &{} 0 \\ -1 &{} 1 &{} 1 &{} -1 &{} 0 &{} 0 &{} \dots &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} -1 &{} 1 &{} 1 &{} -1 &{} \dots &{} 0 &{} 0 &{} 0 &{} 0 \\ &{} &{} &{} &{} &{} &{} \ddots &{} \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} \dots &{} -1 &{} 1 &{} 1 &{} -1\\ \end{bmatrix}, \; \mathbf {F} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \end{aligned}$$
(9c)

and \(i_\text {t}\) is the total current of the parallel battery string. By substituting Eq. (9) into Eq. (8), the electrical model of a battery string containing one cell with ISC is derived as

$$\begin{aligned} \dot{\varvec {{X}}}_\text {s} =(\varvec {{A}} + \varvec {{B}} \varvec {{R}}^{-1} \varvec {E} ) \varvec {{X}}_\text {s} + \varvec {{B}} \varvec {{R}}^{-1} \varvec {{F}} i_\text {t} \end{aligned}$$
(10a)
$$\begin{aligned} v_\text {b} = (\varvec {C} + \varvec {D} \varvec {R} ^{-1} \varvec {E} ) \varvec{X} _\text {s} + \varvec {D} \varvec {R} ^{-1} \varvec {F} i_\text {t} \end{aligned}$$
(10b)
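The current distribution implied by Eq. (9a) can also be obtained without forming \(\varvec {R}^{-1}\) explicitly. Because all cells share the terminal voltage \(v_\text {b}\), Eq. (1b) gives \(i_{\text {b},j} = (v_{\text {OC},j} - v_{\text {c},j} - v_\text {b})/R_{\text {s},j}\), and the first row of Eq. (9a) requires the cell currents to sum to \(i_\text {t}\). The following sketch solves these two conditions in closed form; the function name and argument layout are hypothetical.

```python
def split_current(v_oc, v_c, r_s, i_t):
    """Distribute the string current i_t among parallel cells.

    v_oc, v_c, r_s : per-cell lists of OCV, RC-pair voltage, ohmic resistance
    i_t            : total string current (A), positive for discharge
    Returns (list of cell currents, terminal voltage).
    """
    u = [oc - c for oc, c in zip(v_oc, v_c)]   # v_OC,j - v_c,j for each cell
    g = [1.0 / r for r in r_s]                 # ohmic conductances
    # Sum over j of g_j * (u_j - v_b) must equal i_t; solve for v_b.
    v_b = (sum(gj * uj for gj, uj in zip(g, u)) - i_t) / sum(g)
    i_b = [gj * (uj - v_b) for gj, uj in zip(g, u)]
    return i_b, v_b
```

The returned currents satisfy every row of Eq. (9a): they sum to \(i_\text {t}\), and each pair of adjacent cells yields the same terminal voltage.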

To model the thermal dynamics, the temperature of the coolant at different locations is first introduced, given in Ref. [33] as

$$\begin{aligned} T_{\text {f},j} = {\left\{ \begin{array}{ll} T_{\text {f,in}}, \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad j = 1 \\ T_{\text {f},j-1}-\frac{h}{C_\text {f}}(T_{\text {f},j-1}-T_{\text {s},j-1}), \quad j = 2, ..., N \end{array}\right. } \end{aligned}$$
(11)

where \(T_{\text {f,in}}\) is the coolant temperature at the inlet, \(C_\text {f} = C_\text {p}V_{\text {cool}}\), \(V_{\text {cool}}\) is the flow rate of the coolant, and \(C_\text {p}\) is the heat capacity of the coolant. \(V_{\text {cool}}\) is determined by the thermal management system and typically increases as the cell number N increases. By stacking up the thermal model of individual cells and incorporating Eq. (11), the thermal model for the battery string with ISC is given as

$$\begin{aligned} \dot{\varvec {{X}}}_\text {T} = \varvec {{A}}_{\text {TS}} \varvec {{X}}_\text {T} + \varvec {{B}}_{\text {TS}} \varvec {{u}}_\text {T} \end{aligned}$$
(12a)

where the matrices \(\varvec {A} _{\text {TS}} \in \varvec {R}^{2N \times 2N}\), \(\varvec {X} _\text {T} \in \varvec {R}^{2N}\), \(\varvec {B} _{\text {TS}} \in \varvec {R}^{2N \times (N+1) }\), and \(\varvec {u} _\text {T} \in \varvec {R}^{N+1}\) are presented in Eqs. (12b) and (12c). It should be pointed out that the above battery string model is obtained by assembling models of single battery cells. Therefore, the effectiveness of the model will deteriorate if there exists large interconnect resistance or thermal gradient among cells [34]. In addition, the model can be ineffective in the SOC ranges where the nonlinearity between SOC and OCV is large (e.g., low and high SOC ranges for lithium iron phosphate cells) [35].

$$\begin{aligned} \begin{aligned} { \varvec {A} _{\text {TS}} = \begin{bmatrix} -\frac{1}{R_{\uptheta }C_\text {c}} &{} \frac{1}{R_{\uptheta }C_\text {c}} &{} 0 &{} 0 &{} \dots &{} 0 &{} 0 &{} 0 \\ \frac{1}{R_{\uptheta }C_\text {s}} &{} -\frac{1}{C_\text {s}}(h+\frac{1}{R_{\uptheta }}) &{} 0 &{} 0 &{} \dots &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} -\frac{1}{R_{\uptheta }C_\text {c}} &{} \frac{1}{R_{\uptheta }C_\text {c}} &{} \dots &{} 0 &{} 0 &{} 0 \\ 0 &{} \frac{h^2}{C_\text {s}C_\text {f}} &{} \frac{1}{R_{\uptheta }C_\text {s}} &{} -\frac{1}{C_\text {s}}(h+\frac{1}{R_{\uptheta }}) &{} \dots &{} 0 &{} 0 &{} 0 \\ \vdots &{} \vdots &{} &{} &{} \ddots &{} \\ 0 &{} 0 &{} 0 &{} 0 &{} \dots &{} 0 &{} -\frac{1}{R_{\uptheta }C_\text {c}} &{} \frac{1}{R_{\uptheta }C_\text {c}} \\ 0 &{} \frac{h^2}{C_\text {s}C_\text {f}}(1-\frac{h}{C_\text {f}})^{N-2} &{} 0 &{} \frac{h^2}{C_\text {s}C_\text {f}}(1-\frac{h}{C_\text {f}})^{N-3} &{} \dots &{} \frac{h^2}{C_\text {s}C_\text {f}} &{}\frac{1}{R_{\uptheta }C_\text {s}} &{} -\frac{1}{C_\text {s}}(h+\frac{1}{R_{\uptheta }}) \\ \end{bmatrix}} \end{aligned} \end{aligned}$$
(12b)
$$\begin{aligned} \begin{aligned} {{\varvec{B}} _{\text {TS}} = \begin{bmatrix} \frac{1}{C_\text{c}} &{} 0 &{} 0 &{} \dots &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} \dots &{} 0 &{} \frac{h}{C_\text{s}} \\ 0 &{} \frac{1}{C_\text{c}} &{} 0 &{} \dots &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} \dots &{} 0 &{} \frac{h}{C_\text {s}}(1-\frac{h}{C_\text {f}}) \\ &{} &{} &{} \ddots &{} \\ 0 &{} 0 &{} 0 &{} \dots &{} \frac{1}{C_\text {c}} &{} 0\\ 0 &{} 0 &{} 0 &{} \dots &{} 0 &{} \frac{h}{C_\text {s}}(1-\frac{h}{C_\text {f}})^{N-1}\\ \end{bmatrix}, \; {\varvec{X}}_{\text{T}} = \begin{bmatrix} {{T}}_{\text {c},1} \\ {{T}}_{\text {s},1} \\ {{T}}_{\text {c},2} \\ {{T}}_{\text {s},2} \\ \vdots \\ {{T}}_{\text {c},N} \\ {{T}}_{\text {s},N} \end{bmatrix}, \; {\varvec{u}}_{\text{T}} = \begin{bmatrix} {H}_{1} \\ {H}_{2} \\ {H}_{3} \\ {H}_{4} \\ \vdots \\ {{H}}_{N} \\ {T}_{\text{f,in}} \end{bmatrix}} \end{aligned} \end{aligned}$$
(12c)
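The coupling terms in Eqs. (12b) and (12c) originate from the coolant recursion in Eq. (11). As a minimal sketch, the recursion can be evaluated directly for a given set of cell surface temperatures; the function name is hypothetical, and the ratio \(h/C_\text {f}\) is passed through its two factors.

```python
def coolant_temperatures(t_f_in, t_s, h, c_f):
    """Coolant temperature seen by each cell along the flow path, Eq. (11).

    t_f_in : coolant temperature at the inlet
    t_s    : list of cell surface temperatures, ordered from upstream to downstream
    h, c_f : heat transfer coefficient and coolant heat capacity rate
    """
    t_f = [t_f_in]                      # j = 1 sees the inlet temperature
    for j in range(1, len(t_s)):
        prev = t_f[-1]
        # The coolant is warmed (or cooled) by the surface of the previous cell.
        t_f.append(prev - (h / c_f) * (prev - t_s[j - 1]))
    return t_f
```

If an upstream cell runs hotter, every downstream entry of the returned list increases, which is the mechanism behind the factors \((1-\frac{h}{C_\text {f}})\) and \(\frac{h^2}{C_\text {s}C_\text {f}}\) in the string-level matrices.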

3 ISC Signature Characterization

In this section, the setup for simulating the battery string with ISC using the models derived in Sect. 2 is first presented. Then, the electrical and thermal responses to characterize the signatures corresponding to ISC are analyzed. Finally, the ISC detection problem is formulated.

3.1 Simulation Setup

The experimental data for a \(\text {LiFePO}_4\) cylindrical cell with a nominal capacity \(Q_0\) of \(62.18~\mathrm {A\cdot h}\), collected in Ref. [28], are adopted to calculate the parameters in Eq. (8). The capacity values of the cells are first chosen. Then, the values of ohmic resistance (\(R_{\text {s},j}\)), diffusion resistance (\(R_{\text {t},j}\)), and time constant (\(\tau _{j}\)) are obtained by interpolating the experimental data based on cell capacity. The parameters of the thermal model in Eq. (12) and those for updating internal resistance in Eq. (5) are the same for all the cells. These parameters are adopted from Ref. [30] and summarized in Table 1. This study considers battery strings with five cells connected in parallel, i.e., \(N = 5\), and assumes, without loss of generality, that only one cell has ISC. The load profile employed to generate data consists of a repeated charge and discharge process with a duration of \(20~\mathrm {s}\) and a current magnitude of \(1~\mathrm {C}\). The initial SOC of all the cells is 0.6. Since the battery responses are only marginally affected by the step size when the step size is no larger than \(10~\mathrm {s}\), the step size is set to \(10~\mathrm {s}\) to reduce the computational cost of the simulation. Each battery string is simulated for \(5000~\mathrm {s}\) to ensure that the thermal system has reached its steady state.

Table 1 Model parameters

3.2 Electrical Versus Thermal Signatures

To understand the effect of ISC on battery string responses, a parallel battery string containing five cells with one cell having different ISC resistance is simulated. Without loss of generality, the third cell is considered as the ISC cell. The capacity values of five cells are listed in Table  2. Four different ISC resistance values are investigated, which are 0.5, 1, 10,  and \(100~\mathrm {\Omega }\). The simulations are performed with the model developed in Sect. 2 and the setup presented in Sect. 3.1.

Table 2 Capacity of individual cells

The terminal voltage and cell SOC under different ISC resistance values are compared in Fig. 2. It can be seen in Fig. 2(b)–(e) that the SOC levels of parallel-connected battery cells diverge over time due to the non-uniformity in cell capacity. As the ISC resistance decreases, the SOC depletion of the cell with ISC accelerates, which is an ISC-related signature. However, accurately estimating the SOC of each cell requires an individual current sensor per cell, which is impractical for most real-world applications (e.g., electric vehicles). The terminal voltage of a battery string, in contrast, can be obtained using one voltage sensor. However, as shown in Fig. 2(a), the terminal voltages of battery strings with different ISC resistance differ only negligibly, so the difference is easily masked by voltage sensor noise. Moreover, the relationship between the magnitude of terminal voltage fluctuations and the ISC severity is non-monotonic: although the OCV/SOC drop increases monotonically with the ISC severity, the terminal voltage is also perturbed by the change in internal resistance caused by temperature variations. Therefore, ISC-related signatures in electrical responses are insufficient for ISC detection of parallel battery strings.

Fig. 2 Terminal voltage and cell SOC of a string with cell \(\#3\) having different ISC resistance

As assessing cell surface temperature is more practical, e.g., using infrared (IR) techniques [36] and temperature estimators [37], the distribution of cell surface temperature within a battery string is examined. From Fig. 3, the following observations can be made:

(1) With a sequential cooling structure, the coolant will be heated by the upstream cells, leading to reduced cooling effects for downstream cells. Consequently, the temperatures of downstream cells will be higher than those of upstream cells regardless of the presence of ISC.

(2) The temperatures of all cells rise due to the heat from ohmic and polarization resistance. For the cell with ISC, additional heat is generated by the ISC resistance. As a result, the temperature rise of the cell with ISC (i.e., cell \(\#3\)) increases as the ISC becomes more severe.

(3) As the ISC resistance decreases, the temperature rise of the cells downstream of the ISC cell (i.e., cell \(\#4\) and cell \(\#5\)) increases. This is because the coolant temperature becomes higher after absorbing more heat from the ISC cell.

Based on the above observations, the cell surface temperature distribution can be a promising indicator for ISC detection of parallel battery strings. The above analysis is performed under a specific setup of cell-to-cell capacity (and resistance) non-uniformity. The level of non-uniformity in cell capacity (and resistance) will impact the effectiveness of identified signatures, which will be further discussed in Sect. 4.1.

Fig. 3 Cell surface temperature of a string with cell \(\#3\) having different ISC resistance

3.3 ISC Detection Problem

The ISC detection problem of parallel battery strings is formulated as a binary classification problem in this study. Given a threshold value of ISC resistance (\(\zeta _\text {R}\)), the parallel battery strings with ISC are categorized into two classes: non-faulty and faulty strings. The non-faulty strings include the healthy strings and the strings with ISC resistance larger than the threshold value. These strings can charge/discharge with negligible energy loss caused by ISC and pose a low risk of thermal runaway. On the other hand, the faulty strings have ISC resistance smaller than or equal to the threshold value, indicating that immediate examination or replacement is necessary. Therefore, the objective of the ISC detection is to correctly classify a parallel battery string into one of the two classes.
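The labeling rule above amounts to a single comparison against \(\zeta _\text {R}\). In the sketch below, the function name is hypothetical and a healthy string is represented by an infinite ISC resistance, which is a modeling convenience assumed here rather than a convention stated in the paper.

```python
import math


def classify_string(r_isc, threshold):
    """Return 1 for a faulty string and 0 for a non-faulty string.

    r_isc     : ISC resistance of the string (ohm); math.inf for a healthy string
    threshold : threshold value zeta_R (ohm)
    """
    return 1 if r_isc <= threshold else 0


HEALTHY = math.inf  # assumed encoding of "no ISC"
```

With an illustrative threshold of 10 Ω, a string with a 5 Ω short is labeled faulty, whereas a string with a 50 Ω short and a healthy string are both labeled non-faulty.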

4 CNN-Based ISC Detection Algorithm

In this section, a CNN is proposed to address the ISC detection problem, and the proposed approach is evaluated on the simulated data with noise in the cell temperature measurement and uncertainties in string capacity to demonstrate its effectiveness and robustness. Then, discussions are provided concerning the practical aspects of the proposed approach.

4.1 CNN-Based ISC Detector

To assess the ISC severity of a parallel battery string using thermal information, one can follow the cell-level or string-level approaches proposed in the literature.

(1) Cell-level approach: The cell-level approach uses the temperature rise from the nominal temperature of each cell to assess its ISC severity. The effectiveness of this approach relies on the exact knowledge of the cell condition (e.g., capacity and internal resistance). If the non-uniformity in cell capacity is small, the cell condition may be estimated using the string condition or dynamic responses. However, in the presence of substantial cell-to-cell (capacity) variations, the cell condition cannot be accurately monitored, and the ISC detection performance will degrade.

(2) String-level approach: As illustrated in Sect. 3.2, the difference in temperature rise between the cell with ISC and other cells increases as the ISC gets more severe. Thus, if a cell has a considerably larger temperature rise compared to other cells, the cell is likely to have high ISC severity. However, in the presence of cell-to-cell variations, the temperature difference between cells can also be caused by uneven current distribution. Figure 4 shows the cell surface temperature distribution of a string with larger cell-to-cell capacity variation (i.e., the standard deviation of cell capacity is \(3.898~\mathrm {A \cdot h}\)) compared to those in Fig. 3 (i.e., the standard deviation of cell capacity is \(1.383~\mathrm {A \cdot h}\)) and \(1~\mathrm {\Omega }\) ISC resistance in the third cell. As shown in Fig. 4, the temperature of cell \(\#3\) increases significantly due to the existence of ISC. Meanwhile, the temperature rise of cell \(\#4\), a cell without ISC, is also abnormally large. This is because cell \(\#4\) has a much larger capacity than other cells, and a larger current (i.e., load) will flow through it. As a result, using only the difference in the temperature rise can also be insufficient for ISC detection under large cell non-uniformity.

Fig. 4 Cell surface temperature of a string with larger non-uniformity in cell capacity compared to the string in Fig. 3 (Cell \(\#3\) of the string has an ISC of \(1~\mathrm {\Omega }\))

Given the aforementioned difficulties in detecting ISC for parallel-connected cells, a deep learning approach is adopted to fuse the cell-level and string-level approaches. The temperatures of cells within a battery string are correlated both temporally (i.e., over consecutive periods) and spatially (i.e., along the path of the same coolant flow). Vanilla and recurrent neural networks (NNs) can therefore be ineffective: a vanilla NN cannot capture both spatial and temporal correlations well, and a recurrent NN cannot capture the spatial correlation well. A CNN is thus chosen to develop the ISC detector, given its ability to capture both spatial and temporal correlations. Since the electrical responses of a battery string provide marginal information about the ISC severity, the cell surface temperature sequences are used as the input to the CNN. In addition, the total capacity of a battery string (\(Q^\text {t} = \sum _{j=1}^N Q_j\)) is used as an additional input. The string capacity provides a rough estimate of the cell capacity, which may help assess the ISC severity as in the cell-level approach. For the output of the CNN, one could label non-faulty strings with 0 and faulty strings with 1 and devise the CNN to output a value between 0 and 1. However, the ISC resistance information cannot be fully utilized during CNN training in this setup to learn ISC-related signatures. Therefore, in this study, the CNN is formulated to output the ISC resistance, and the string is then classified by comparing the CNN output with the threshold value. With this ISC detection framework, the threshold value of ISC resistance becomes a design parameter of the battery management system, which users can determine based on the battery chemistry, cooling system, type of application, and battery replacement cost.

The architecture of the proposed CNNs is given in Fig. 5, with the hyperparameters summarized in Table 3. “Conv1D” and “Maxpool1D” represent one-dimensional convolution and max-pooling operations, respectively. “ReLU” and “FC” denote the rectified linear activation and the fully-connected unit, respectively. The architecture and hyperparameters of the proposed CNN are chosen based on the results presented in Ref. [38]. The available data, containing 50000 strings, are partitioned into \(60\%\) for training, \(20\%\) for validation, and \(20\%\) for testing. To avoid bias in testing accuracy caused by the dataset partition, five-fold cross-validation is conducted. The development and evaluation of CNNs are conducted in PyTorch with the Adam optimizer.
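One way to reconcile the 60/20/20 partition with five-fold cross-validation is to cut the shuffled data into five equal folds and rotate which fold serves for testing and which for validation. The paper does not spell out the rotation scheme, so the sketch below is an assumption, and the function name is hypothetical.

```python
import random


def five_fold_splits(n_samples, seed=0):
    """Yield (train, validation, test) index lists for five rotations.

    Each rotation uses one fold (20%) for testing, the next fold (20%)
    for validation, and the remaining three folds (60%) for training.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    size = n_samples // 5
    folds = [indices[k * size:(k + 1) * size] for k in range(5)]
    folds[-1].extend(indices[5 * size:])  # leftover samples go to the last fold
    for k in range(5):
        val_k = (k + 1) % 5
        train = [i for m in range(5) if m not in (k, val_k) for i in folds[m]]
        yield train, folds[val_k], folds[k]
```

For 50000 strings, each rotation contains 30000 training, 10000 validation, and 10000 testing strings, and every string is used for testing exactly once.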

Fig. 5 Architecture of the proposed CNNs

Table 3 Hyperparameters of the proposed CNNs

Let \(N_\text {f}\) and \(N_\text {n}\) be the number of faulty and non-faulty strings, as illustrated in Fig.  6. \(N^\text {c}_\text {f}\) and \(N^\text {c}_\text {n}\) are the numbers of correctly-classified faulty and non-faulty strings. \(N^\text {i}_\text {f}\) is the number of faulty strings that are incorrectly classified as non-faulty strings, and \(N^\text {i}_\text {n}\) is the number of non-faulty strings that are incorrectly classified as faulty strings. According to the above notations, the accuracy, false alarm rate, and missed detection rate are defined as follows to quantify the classification performance:

$$\begin{aligned} \text {Accuracy }(\%) = \frac{N^\text {c}_\text {f}+N^\text {c}_\text {n}}{N_\text {f}+N_\text {n}} \end{aligned}$$
(13a)
$$\begin{aligned} \text {False Alarm }(\%) = \frac{N^\text {i}_\text {n}}{N_\text {n}} \end{aligned}$$
(13b)
$$\begin{aligned} \text {Missed Detection }(\%) = \frac{N^\text {i}_\text {f}}{N_\text {f}} \end{aligned}$$
(13c)
Fig. 6 Illustration of the quantities used in the definitions of accuracy, false alarm rate, and missed detection rate
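As a minimal sketch of Eqs. (13a) to (13c), the three metrics can be computed from true and predicted fault flags as shown below. The function name `classification_metrics` and the boolean-list interface are hypothetical choices for illustration; the results are scaled by 100 so that they are expressed in percent.

```python
def classification_metrics(true_faulty, pred_faulty):
    """Return (accuracy, false_alarm, missed_detection) in percent.

    true_faulty, pred_faulty: equal-length sequences of booleans, where
    True means "faulty string" (hypothetical interface for illustration).
    """
    if len(true_faulty) != len(pred_faulty):
        raise ValueError("inputs must have the same length")

    pairs = list(zip(true_faulty, pred_faulty))
    n_f = sum(1 for t, _ in pairs if t)            # N_f: faulty strings
    n_n = sum(1 for t, _ in pairs if not t)        # N_n: non-faulty strings
    if n_f == 0 or n_n == 0:
        raise ValueError("both classes must be present")

    n_c_f = sum(1 for t, p in pairs if t and p)          # faulty, correctly classified
    n_c_n = sum(1 for t, p in pairs if not t and not p)  # non-faulty, correctly classified
    n_i_f = n_f - n_c_f   # faulty strings classified as non-faulty
    n_i_n = n_n - n_c_n   # non-faulty strings classified as faulty

    accuracy = 100.0 * (n_c_f + n_c_n) / (n_f + n_n)
    false_alarm = 100.0 * n_i_n / n_n
    missed_detection = 100.0 * n_i_f / n_f
    return accuracy, false_alarm, missed_detection
```

Note that the false alarm rate is normalized by the number of non-faulty strings and the missed detection rate by the number of faulty strings, so the two rates do not sum to the overall error rate unless the classes are balanced.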

4.2 CNN Training and Performance Analysis

A dataset containing 50000 parallel battery strings with different ISC resistance values is generated through simulations based on the model presented in Sect. 2 and the setup discussed in Sect. 3.1. One random cell in each battery string is selected as the cell with ISC. The ISC resistance is sampled as \(R_{\text {ISC}} \sim \mathcal {U}(0.1,100)\), where \(\mathcal {U}(a,b)\) denotes a uniform distribution with a and b being the lower and upper bounds, respectively. The lower bound is set as \(0.1~\mathrm {\Omega }\) since a thermal runaway is usually triggered in the simulation when \(R_{\text {ISC}} < 0.1~\mathrm {\Omega }\), which makes the ISC detection trivial. The upper bound is chosen as \(100~\mathrm {\Omega }\) because the strings with \(R_{\text {ISC}} > 100~\mathrm {\Omega }\) have electrical and thermal responses nearly identical to those of healthy strings. The string capacity is sampled as \(Q^\text {t} \sim \mathcal {U}(0.8NQ_0,NQ_0)\). The cell capacity values are then selected given the string capacity. Following the above procedure, strings at different health conditions are considered, and the constructed strings contain different levels of cell-to-cell (capacity) variations.
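The sampling procedure above can be sketched as follows. This is only an illustration under stated assumptions: the function name `sample_string`, the dictionary layout, and the example value of \(Q_0\) used in the usage note are hypothetical, and the step that allocates individual cell capacities is omitted because its details are not given here.

```python
import random


def sample_string(n_cells, q0, rng):
    """Draw the random quantities that define one simulated parallel string.

    n_cells: number of cells N in the string
    q0:      nominal cell capacity Q_0 in A*h (value supplied by the caller)
    rng:     a random.Random instance, for reproducibility
    """
    faulty_cell = rng.randrange(n_cells)        # one random cell carries the ISC
    r_isc = rng.uniform(0.1, 100.0)             # R_ISC ~ U(0.1, 100) ohm
    q_total = rng.uniform(0.8 * n_cells * q0,   # Q^t ~ U(0.8 N Q_0, N Q_0)
                          n_cells * q0)
    # The individual cell capacities are then chosen so that they sum to
    # q_total; that allocation rule is not specified here and is left out.
    return {"faulty_cell": faulty_cell, "r_isc": r_isc, "q_total": q_total}
```

For example, `sample_string(5, 2.5, random.Random(0))` would draw one five-cell string for a hypothetical nominal cell capacity of 2.5 A·h; repeating the call 50000 times would produce the random parameters for a dataset of the size used in this study.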

Based on the ISC resistance values adopted in Refs. [17, 39, 40], the threshold value for ISC resistance is chosen as \(\zeta _\text {R} = 1~\mathrm {\Omega }\) for the case study. Since our knowledge of the cell surface temperature and string capacity can be inaccurate, we consider that the temperature and total capacity are corrupted by additive white noise.

4.2.1 Effect of Using Different Data Labeling Strategies

As discussed in Sect. 4.1, one can label the non-faulty and faulty strings as 0 and 1, respectively, to train a CNN that directly performs the binary classification (i.e., classification CNN). However, the classification CNN will pay more attention to the characteristics of strings with \(R_{\text {ISC}}\) close to \(\zeta _\text {R}\) (i.e., near the boundary between the two classes) to better differentiate the two classes. As a result, the ISC-related signatures extracted by the classification CNN can lack generalizability for strings with different ISC severity and robustness under inaccurate inputs.

Alternatively, one can label a string with its ISC resistance and train a CNN (i.e., estimation CNN). The estimation CNN will be trained to estimate the ISC resistance at different ISC severity. However, since ISC causes marginal differences in the dynamic responses of strings with large ISC resistance, the ISC resistance of these strings cannot be accurately estimated merely based on ISC-related signatures, especially under cell non-uniformity. Consequently, the estimation CNN must learn non-ISC-related signatures for enhanced accuracy, leading to overfitting and deteriorated robustness.

Based on the above analysis, the maximum ISC resistance value (\(R^{\text {max}}_{\text {ISC}}\)) will be constrained when labeling strings. In particular, the strings with ISC resistance larger than \(R^{\text {max}}_{\text {ISC}}\) will be labeled with \(R^{\text {max}}_{\text {ISC}}\). In this way, CNN can extract most of the ISC-related signatures and does not need to extract non-ISC-related signatures to estimate large ISC resistance accurately. Here, \(R^{\text {max}}_{\text {ISC}}\) should be chosen based on the threshold value \(\zeta _\text {R}\) and dynamic responses of string at different ISC severity. In this study, \(10~\mathrm {\Omega }\) is adopted as the maximum ISC resistance value for labeling.
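The three labeling options discussed above, together with the final threshold decision, can be sketched as follows. The function names are hypothetical, and treating a string as faulty when its ISC resistance is strictly below \(\zeta _\text {R}\) is an assumption made here for illustration; the default values of \(1~\mathrm {\Omega }\) and \(10~\mathrm {\Omega }\) are the ones adopted in this study.

```python
def classification_label(r_isc, threshold=1.0):
    """Binary label for the classification CNN: 1 = faulty, 0 = non-faulty.

    Assumption: a string is faulty when r_isc is strictly below the threshold.
    """
    return 1 if r_isc < threshold else 0


def estimation_label(r_isc, r_max=None):
    """Regression label for the estimation CNN.

    With r_max=None the raw ISC resistance is used; otherwise every string
    with r_isc above r_max is labeled with r_max.
    """
    return r_isc if r_max is None else min(r_isc, r_max)


def classify_from_estimate(r_est, threshold=1.0):
    """Turn an estimated ISC resistance into a faulty / non-faulty decision."""
    return r_est < threshold


# Labels for three example strings (ISC resistance in ohm).
for r in (0.5, 5.0, 50.0):
    print(r, classification_label(r), estimation_label(r), estimation_label(r, r_max=10.0))
```

Because the saturation value of \(10~\mathrm {\Omega }\) lies well above the threshold of \(1~\mathrm {\Omega }\), saturating the label does not change which class a string belongs to; it only removes the need to distinguish among strings that are all far from the decision boundary.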

To demonstrate the effectiveness of using the estimation CNN with \(R^{\text {max}}_{\text {ISC}}\), the performance of the following CNNs is compared:

(1) Classification CNN;

(2) Estimation CNN without \(R^{\text {max}}_{\text {ISC}}\);

(3) Estimation CNN with \(R^{\text {max}}_{\text {ISC}}\).

According to the uncertainty in temperature reported in Refs. [37, 41], noise with standard deviations of 0.03, 0.05, 0.07, 0.1, 0.3, 0.5, 0.7, and 1 \(^\circ\)C is added to the temperature data. The noise standard deviation in the string capacity is chosen as \(0.02~\mathrm {A \cdot h}\). The performance of the three CNNs is compared in Fig. 7. As can be seen from Fig. 7, the estimation CNN with \(R^{\text {max}}_{\text {ISC}}\) achieves the best performance and robustness among the three CNNs.
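A minimal sketch of how the CNN inputs could be corrupted with additive white noise at the levels listed above is given below. The function name `corrupt_inputs` and the nested-list data layout are hypothetical, and drawing the noise from a zero-mean Gaussian distribution is an assumption, since only the standard deviations are specified.

```python
import random

# Noise levels taken from the text.
TEMP_NOISE_STD_C = [0.03, 0.05, 0.07, 0.1, 0.3, 0.5, 0.7, 1.0]  # deg C
CAPACITY_NOISE_STD_AH = 0.02                                    # A*h


def corrupt_inputs(temps, q_total, sigma_t, sigma_q, rng):
    """Add independent zero-mean noise to every temperature sample and to Q^t.

    temps:   list of per-cell temperature sequences (list of lists, deg C)
    q_total: string capacity Q^t in A*h
    rng:     a random.Random instance
    Assumption: the white noise is Gaussian.
    """
    noisy_temps = [[t + rng.gauss(0.0, sigma_t) for t in cell] for cell in temps]
    noisy_q = q_total + rng.gauss(0.0, sigma_q)
    return noisy_temps, noisy_q
```

Sweeping `sigma_t` over `TEMP_NOISE_STD_C` while holding `sigma_q` at `CAPACITY_NOISE_STD_AH` would reproduce the structure of the robustness comparison, with one evaluation of each CNN per temperature noise level.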

Fig. 7 CNN performance with different data labeling strategies

4.2.2 Effect of Using String Capacity

The proposed CNN (i.e., estimation CNN with \(R^{\text {max}}_{\text {ISC}}\)) is compared to the CNN that has the same setup but does not use string capacity (i.e., CNN without \(Q^\text {t}\)). The noise standard deviations in the temperature data are 0.03, 0.05, 0.07, 0.1, 0.3, 0.5, 0.7, and 1 \(^\circ\)C. The noise standard deviations in the string capacity (\(\sigma _{Q^\text {t}}\)) are 0.01, 0.02, and \(0.03~\mathrm {A \cdot h}\). As shown in Fig.  8, the proposed approach can achieve better performance (i.e., higher accuracy, lower false alarm rate, and lower missed detection rate) under noisy CNN inputs by incorporating the string capacity. When the noise in temperature is low and the uncertainty in string capacity is high, the proposed approach has a slightly worse accuracy and false alarm rate than the CNN without \(Q^\text {t}\). However, the CNN without \(Q^\text {t}\) has a significantly larger missed detection rate, leading to compromised battery safety.

Fig. 8 CNN performance with or without string capacity

4.3 Discussion

Based on the analysis in Sect. 4.2, the architecture in Fig. 5 and a data labeling strategy are proposed to train a CNN for ISC detection under noise in temperature and uncertainties in string capacity. Additional discussion is provided to facilitate the practical application of the proposed approach.

4.3.1 Observability of Cell Surface Temperature

The surface temperature of individual cells within a battery string is used to detect ISC in the proposed approach. IR techniques may be employed to measure the cell surface temperature directly [36, 42]. Alternatively, temperature estimators can be applied to reconstruct the cell surface temperature based on sparse temperature measurements of some cells. To understand how the number of temperature sensors influences the cell surface temperature estimation, observability analysis is conducted based on the thermal model given in Eq. (12).

By treating the heat generation as inputs to the model, the observability matrix of the thermal system only depends on the thermal model parameters listed in Table 1, the number of temperature sensors, and the locations of these sensors. The minimum values of the condition number, listed in Table 4, are then computed by using different numbers of sensors for the thermal model with two states (i.e., core and surface temperatures). The sensor allocation that achieves the minimum condition number is also provided. As shown in Table 4, a minimum of two sensors is required to ensure the full observability of a battery string containing five cells. However, considering the large condition number obtained with two sensors, a minimum of three sensors will be needed for satisfactory temperature estimation accuracy and robustness.
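For reference, the standard quantities behind such an analysis can be written as follows. This is a generic sketch in which the state-space form and the symbols \(A\), \(B\), \(C\), \(x\), \(u\), \(y\), \(\mathcal {O}\), and \(\kappa\) are notation assumed here for a linear thermal model, with \(C\) selecting the measured cell temperatures; using the ratio of extreme singular values as the condition number is likewise an assumption about how the reported values are defined.

$$\begin{aligned} \dot{x} = A x + B u, \quad y = C x, \qquad \mathcal {O} = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix}, \qquad \kappa (\mathcal {O}) = \frac{\sigma _{\max }(\mathcal {O})}{\sigma _{\min }(\mathcal {O})} \end{aligned}$$

The system is fully observable when \(\mathcal {O}\) has rank \(n\), the number of states, which would be \(n = 10\) for five cells with two states each. A large \(\kappa (\mathcal {O})\) indicates that some state directions are only weakly visible in the measurements, so the reconstructed temperatures become sensitive to sensor noise even though the rank condition holds.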

Since only the surface temperature is required for ISC detection, one may lump the surface and core temperatures of a cell as one state variable and reduce the order of the thermal model, as discussed in Refs. [43, 44]. The minimum values of the condition number for the one-state model are listed in Table 4 with the employed sensor allocation. It can be seen that fewer sensors (e.g., two sensors) will be required to achieve a satisfactory estimation performance. In addition, the temperature sensor placed at cell \(\#5\) can provide the most information about the cell surface temperature distribution, as the temperature of cell \(\#5\) will be affected by the temperature of all upstream cells. Finally, it should be pointed out that the heat generation (i.e., inputs to the thermal system) is assumed to be known in the above analysis. However, given that the current and resistance of individual cells cannot be directly monitored, advanced observer design that can remain effective under unknown or uncertain inputs (e.g., [45, 46]) will be required for temperature reconstruction.

Table 4 Observability analysis of cell surface temperature under sparse measurements

4.3.2 Computational Complexity

The computation time for the proposed algorithm is about \(0.002~\mathrm {s}\) on a computer with a 2.9 GHz Intel Core i5 processor and 16 GB RAM. In the future, cloud computing technology may be used to implement the proposed approach on platforms with limited onboard computing resources, e.g., electric vehicles [47] and smartphones [48]. With cloud computing, the onboard measurements can be periodically sent to the cloud [49], which will return the classification results. Since the CNN will be stored in the cloud, the memory overhead of the onboard computer can also be reduced.

4.3.3 Cell Chemistry and Shape

The data of a LiFePO\(_4\) cylindrical cell is adopted for the analysis and evaluations in this study. In reality, the chemistry or the shape of the batteries can vary. It should be pointed out that the proposed methodology and ISC-related signatures should remain effective for different battery cell assemblies, as temperature rise is a well-established characteristic for cells of different chemistries and shapes [50, 51]. Nevertheless, the deep-learning model may need to be retrained with data from cell assemblies of the same type if the cell characteristics (e.g., thermal capacity) change significantly.

4.3.4 ISC Resistance Values and the Threshold Value

In this study, the ISC resistance values are considered to vary from 0.1 to \(100~\mathrm {\Omega }\) because the battery performance starts to degrade (i.e., increased battery temperature and decreased charge/discharge efficiency) in this resistance range. Yet, this range for the ISC resistance may vary based on the electrical and thermal characteristics of the battery system. In addition, the choice of the ISC threshold value (i.e., \(1~\mathrm {\Omega }\)) also needs to be changed accordingly. In a real application, these values can be determined based on the temperature rise (e.g., risk of triggering thermal runaway given the operating temperature) and the energy depleted per charge/discharge cycle (e.g., the ratio of the depleted energy to the provided energy) from preliminary testing data.

5 Conclusions and Future Work

This paper investigates the internal short circuit (ISC) detection problem for parallel-connected battery cells. The ISC detection problem is formulated as classifying a battery string under faulty or non-faulty conditions based on the ISC resistance. An electro-thermal model for the studied battery considering ISC is first derived. By analyzing the string terminal voltage, cell state of charge, and cell surface temperature, the distribution of cell surface temperature is found to be critical for assessing the ISC severity. A convolutional neural network (CNN) is then developed to map the cell surface temperature sequences and the string capacity to the ISC resistance. Based on the estimated ISC resistance from CNN, the parallel battery strings are then classified. To enhance the robustness of the proposed CNN, a data labeling strategy is also proposed when processing the data for CNN training. By evaluating the proposed approach under noisy CNN inputs, the effectiveness and robustness of the proposed approach are demonstrated.

In this work, the cell resistance and time constant are assumed to vary with cell capacity based on the specific experimental data. Therefore, the variability of their relationship due to cell non-uniformity may not be well captured. In the future, experiments on more cells will be conducted to better capture this variability and further validate the proposed approach. In addition, considering the effect of the inhomogeneity in cell surface temperature on the proposed approach will also be of interest. Finally, experiments on strings with and without ISC will be conducted in the future to verify the proposed approach under real-world conditions.