Introduction

Growing reliance on lithium-ion batteries (LiBs) for hybrid or electric vehicles (EVs) and stationary and grid-storage applications drives market growth.1,2 In these applications, LiBs have stringent performance and service-life expectations. In EV applications, LiBs are expected to last 1000 cycles/15 calendar years.3,4 In grid-storage applications, the expected life target is as high as 8000 cycles/20 calendar years.5 Ensuring safety throughout this long service life is crucial in these applications, where hundreds or thousands of cells are connected in complex and variable series–parallel configurations. Failure would increase the likelihood of expensive warranty liabilities, recalls, and, in worst-case scenarios, catastrophic events leading to increased risk or loss of life.6,7,8,9,10,11

In EV applications, a safety-critical event in a LiB could arise post-collision and manifest at different time scales: for example, as an immediate outburst after a major collision12 or as a latent manifestation at a later time (i.e., a stranded energy scenario).13,14,15,16 Besides such major events, the state-of-health (SOH) and the state-of-safety (SOS) of a single battery or a subset of batteries in larger systems could devolve in the absence of proper monitoring or diagnosis, eventually rendering a string or pack unstable.17,18 Conditions that could lead to LiB SOH and SOS issues may include fast charge, low-temperature charge, overcharge, overdischarge, internal and external shorts, and overtemperature, sometimes triggered by malfunctioning battery management systems (BMSs) and/or thermal-management systems, sensor malfunction or resiliency issues, inappropriate compression, or divergent aging in a pack.12,17

Existing research primarily focuses on LiB-diagnostic technologies for single cells. Little research has focused on extending those technologies to multicell configurations or demonstrating their efficacy there. Many reasons could account for this lag in diagnostic technology transfer (e.g., nascent and developing technology, the subtle nature of LiB faults and their traits in complex multicell configurations, sensor limitations, resource intensiveness related to extending single-cell diagnostics to multicells, and increased price considerations). This gap in diagnostic technology extension or maturation impedes the adoption of LiBs in larger battery systems. This article provides a review of existing cell-level diagnostics that could potentially be extended to systems-level multicell diagnostics. The complexities and challenges associated with extending cell-level diagnostics to multicell configurations are presented through specific battery fault case studies: Li plating, internal short circuit, state-of-charge (SOC) imbalance, and cell-to-cell aging heterogeneity. Through this article, we hope to give researchers, end-users, and supporting communities broader visibility into, and recognition of, the challenges and gaps associated with the development of LiB diagnostics, and to bolster efforts to accelerate diagnostic technology development, validation, and demonstration for current and future high-energy batteries.

Battery performance and safety diagnostics

Cell-level diagnostics

LiB performance, life, and safety are closely intertwined. A number of standards and testing protocols exist that provide a general sense of LiB safety under different major-abuse scenarios (e.g., major collision events).19 Many of the safety-critical responses could be minimized or suppressed by better designing the cell and systems.20,21,22,23,24 However, in the absence of proper monitoring or diagnostics, battery aging and safety issues can also evolve gradually over time before manifesting catastrophically. Common conditions include Li plating due to fast or low-temperature charging or path-dependent aging; mild overcharge and overdischarge due to faulty or less-resilient sensors and management; nonuniform aging due to aggressive use or inadequate or faulty management; internal and external shorts due to latent manufacturing defects or incorrect use, inappropriate fixturing, or cell damage and aging; and divergent aging due to cell-to-cell variability.17,25,26,27,28,29,30,31,32,33,34,35,36,37,38 Literature surveys indicate the existence of a diverse set of diagnostics for LiB cells (diagnosing Li plating,39,40,41,42,43,44 overcharge,45,46,47 overdischarge,45,48,49 internal shorts,50,51,52,53 external shorts,54,55,56 and electrode- and cell-level inhomogeneities28,57,58,59,60,61,62) that employ various thermal, electrochemical, acoustic, and entropy-based methods, often combined with different modeling-based approaches. Most of these methods are still in development; therefore, they have not been extended to module, string, or pack levels.

Module- and pack-level diagnostics

Fewer advanced diagnostics are available at the module, string, and pack levels than at the cell level. A few cell-level diagnostics have been extended to small modules and strings: SOH estimation along with identifying cell-level nonuniformities, such as overvoltage, low voltage, capacity, and impedance nonuniformity;17,63,64,65,66,67,68 internal69,70,71 and external55,72 short diagnostics; and overcharge and overdischarge diagnostics.17,73,74,75,76 Although the adoption rate of these diagnostics in state-of-the-art BMS technologies for EVs and stationary and grid-storage applications is unclear, diagnostics are making inroads into the commercial space,77,78,79,80,81,82 and more are expected to follow this path in the coming years.

Case studies: The challenges associated with extending diagnostics from single to multicell

Model-based management and diagnostics

Extending cell-level diagnostics to multicell configurations is not straightforward. Many cell-level diagnostics discussed earlier rely on mathematical reduced-order models (ROMs): physics-based, data-driven, or equivalent-circuit ROMs. The parameters of these ROMs evolve with SOH and are sensitive to chemistry, design, and operating conditions (e.g., temperature or SOC). These sensitivities make it difficult to generalize a particular model-based detection method across LiB chemistries and designs, and they introduce additional uncertainty in actual field applications. Some of the model-based techniques also require additional current and temperature measurements that may be unavailable in current pack designs. Moreover, the additional computations associated with multicell model-based diagnostics could easily overwhelm the BMSs. Despite their significant promise, all these challenges hinder the adoption of model-based cell-level advanced diagnostics in multicell configurations.
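To make the computational point concrete, the following is a minimal sketch of one common equivalent-circuit ROM, a first-order Thevenin model, used to form per-cell voltage residuals. It is an illustration under stated assumptions, not a method taken from the cited studies: the class and function names and all parameter values are hypothetical, and in practice the parameters would have to be re-identified as SOH, temperature, and SOC change, which is the sensitivity described above.

```python
import math
from dataclasses import dataclass


@dataclass
class TheveninCell:
    """First-order equivalent-circuit model: OCV source, series R0, one RC pair.

    All parameter values passed in are hypothetical placeholders.
    """
    r0: float           # ohmic resistance, ohm
    r1: float           # polarization resistance, ohm
    c1: float           # polarization capacitance, farad
    v_rc: float = 0.0   # voltage across the RC pair, volt

    def predict(self, current: float, ocv: float, dt: float) -> float:
        """Advance the RC state by dt seconds and return the terminal voltage.

        Positive current means discharge. The OCV is supplied by the caller
        (e.g., from a lookup table), which is an assumption of this sketch.
        """
        alpha = math.exp(-dt / (self.r1 * self.c1))
        self.v_rc = alpha * self.v_rc + self.r1 * (1.0 - alpha) * current
        return ocv - self.r0 * current - self.v_rc


def voltage_residuals(models, currents, ocvs, measured, dt):
    """Measured minus predicted terminal voltage for every monitored cell."""
    return [
        v_meas - model.predict(i, ocv, dt)
        for model, i, ocv, v_meas in zip(models, currents, ocvs, measured)
    ]
```

Even this simple model needs one state update per monitored cell per sample, plus a per-cell current and OCV estimate, so the workload and sensor requirements grow with the number of cells in the string.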

Feature and rule-based diagnostics

Incremental capacity or differential voltage-based management and diagnostics

Feature and rule-based diagnostics are simpler, but still have limitations for multicell configurations. For instance, a prominent method of diagnosing battery aging modes is incremental capacity (IC) analysis (dQ dV–1) or differential voltage analysis (dV dQ–1).17,28,62,64 The method works well for single cells at the chemistry research and development (R&D) stage, but its implementation in multicell configurations is challenging. First, IC requires slow-rate charge or discharge data with a high sampling rate, which would take days to complete. Even if applied to specific portions of the charge or discharge profile, the assessment would still take time. Second, taking derivatives of discrete signals, particularly with respect to voltage, could be problematic in the presence of increased noise for battery chemistries or voltage regions with a flatter response.83,84 Better instrumentation in laboratory-scale analysis could avoid some of these issues; however, implementing this technique in the field would require more accurate sensors inside packs or advanced algorithms to condition the data before differentiation.83
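As a rough illustration of the data conditioning mentioned above, the sketch below estimates dQ dV–1 by first resampling the charge curve onto a uniform voltage grid and only then differentiating. This is one simple option among many, not the processing used in the cited studies; the function name and the default bin width are assumptions, and the input voltage is assumed to be monotonically increasing (e.g., a slow charge after filtering).

```python
import numpy as np


def incremental_capacity(voltage, capacity, dv=0.005):
    """Estimate dQ/dV on a uniform voltage grid from a slow charge curve.

    voltage  : monotonically increasing cell voltage samples, V
    capacity : cumulative charge at each voltage sample, Ah
    dv       : voltage bin width, V (hypothetical default; a wider bin
               suppresses noise but blurs narrow IC peaks)
    """
    voltage = np.asarray(voltage, dtype=float)
    capacity = np.asarray(capacity, dtype=float)
    # Resample Q(V) onto evenly spaced voltages so that small, noisy
    # sample-to-sample voltage steps never appear in the denominator.
    grid = np.arange(voltage[0], voltage[-1], dv)
    q_on_grid = np.interp(grid, voltage, capacity)
    # Central differences in the interior, one-sided at the two ends.
    dq_dv = np.gradient(q_on_grid, dv)
    return grid, dq_dv
```

The choice of `dv` is the trade-off the text describes: flat voltage regions need wider bins or better voltage resolution before the derivative becomes usable.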

Figure 1 shows the slow-rate IC signatures of cells arranged in different configurations at different aging states. Details on testing and aging conditions can be found in the literature.17,28 For the single cell with graphite/lithium nickel manganese cobalt oxide (NMC) chemistry shown in Figure 1a, the IC peak-intensity reductions can be correlated with percent capacity fade, primarily caused by loss of lithium inventory (LLI) as the solid electrolyte interphase (SEI) layer grows at the negative electrode during the elevated-temperature calendar test. Combining a few 20% calendar-aged cells with unaged cells (0% fade) in series–parallel configurations shows a completely different picture (Figure 1b–c). The capacity heterogeneity in the parallel strings has a significant influence on the IC curves (Figure 1b) and, as a result, would make separating overall module capacity fade from cell-to-cell aging heterogeneity extremely challenging. For the series configurations, the IC evolutions associated with cell-to-cell heterogeneity are somewhat more encouraging, showing potential for tracing the gradual disappearance of the low-voltage peak back to a particular fault (Figure 1c). However, when such IC signatures are tracked in a full pack, we see additional uncertainties. As an example, we present IC signatures of a 96S2P pack with a different chemistry (i.e., graphite/lithium manganese oxide [LMO]), aged up to 27% under a different duty cycle (Figure 1d).28

Figure 1

Slow rate dQ dV–1 (IC) plots for multicell configurations at different aging states: (a) single gr/NMC cell, calendar aged,17 (b) 1S4P gr/NMC module, calendar aged,17 (c) 10S1P gr/NMC string, calendar aged,17 (d) 96S2P gr/LMO Nissan Leaf pack aged with a practical duty cycle (DST discharge and 2C DCFC charged).28

Using additional single-cell test data under similar conditions, it was qualitatively determined that the cells in the pack had loss of active material (LAM) in the negative electrode (LAMNE) as an added aging mode along with LLI28 and, as a result, showed distinctly different peak evolution. Besides this, the pack IC evolution also shows distinct drift with aging, even after ohmic correction at a low C-rate of C/25. These drifts did not show up in the cells, and it is not yet fully clear why they arise in the pack. Possible reasons could include distinct aging behavior associated with different cell designs and operating conditions, coupled with the nonuniformity of aging. Nevertheless, the challenge for this diagnostic in estimating overall SOH and underlying aging modes in the presence of heterogeneity in high-voltage strings can be clearly seen. Therefore, a logical, practical approach would be to examine cells in smaller groups (e.g., modules and strings), as shown in Figure 1b–c, and combine the responses to obtain relevant information on the overall SOH of a large pack and determine the underlying causes.

In summary, IC analysis, while useful at the chemistry R&D stage, is likely not practical as a diagnostic and prognostic tool for battery architectures in service because of the complexity of the method, its long duration, and the uncertainty in analysis.

Broadband electrochemical impedance-based management and diagnostics

Multiple case studies of broadband electrochemical impedance spectroscopy (EIS) show its effectiveness as a diagnostic tool to evaluate (1) external short in a cell (Figure 2), (2) higher self-discharge within a cell in different size strings (Figure 3), and (3) aging nonuniformity within series and parallel connected modules or strings (Figure 4). The broadband (0.1–1638.4 Hz) impedance spectra are generated by the impedance measurement box in ~10 s, thereby making this diagnostic a practically promising one.17 Either the difference in complex magnitude of impedance, Z, or the real part of Z, Zʹ, with or without the specific battery fault is used as the diagnostic signal. The purpose is to evaluate whether early and fast diagnosis of these battery faults is possible considering the noise behavior of the instrument. Details of these studies can be found in References 17 and 76.

Figure 2

Change in complex magnitude of impedance, Z (\(Z = \sqrt{Z^{\prime 2} + Z^{\prime\prime 2}}\)), at 0% depth of discharge (DOD): (a) measurement immediately after applying short and (b) measurement 30 min after applying short. ∆Z is the difference between the baseline (without short, at rest) and the signal of interest (with applied short). The EIS measurement uncertainty is characterized by limit of detection (LOD = 3σ), the lowest value measurable above the background noise, and limit of quantitation (LOQ = 10σ), the minimum value at which a quantitative value can be confidently measured.17,76

Figure 3

Change in complex magnitude of impedance, Z (\(Z = \sqrt{Z^{\prime 2} + Z^{\prime\prime 2}}\)), at 50% depth of discharge for (a) 4S, (b) 6S, and (c) 10S strings in series. One of the cells in these strings is discharged 20% more than the other cells to create a condition of localized increased self-discharge within strings of different sizes.17 ∆Z is the difference between the baseline (without any self-discharge) and the signal of interest (with self-discharge). Note the LOD and LOQ change with string size.

Figure 4

Change in the real part of impedance, Zʹ, at 50% depth of discharge for different cell configurations with capacity heterogeneity: (a) 10S1P string with either one or three 20% calendar-aged cells and (b) 1S4P module with either one or three 20% calendar-aged cells.17

Figure 2 shows that the ΔZ signal strength (or detectability) associated with external short depends on multiple factors: the short severity, measurement time, frequency, and instrument noise. First, it is difficult to get a detectable signal above the noise floor if the short is very soft, for example, at C/100 (a C/100-equivalent internal or external short will discharge a battery in 100 h). Second, measurement delay weakens the detection signal even when the short is more aggressive (i.e., C/5 [5 h discharge]) (Figure 2b). Third, the detectable diagnostic signals (beyond the LOQ line) are primarily concentrated in the low-frequency domain (less than 2 Hz in this case), indicating a diffusion-dominant change due to the nature of the fault. Note: cell-level observations do not provide information on how detectability would be affected in multicell configurations where a subset of cells or a whole string experiences the same fault.
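The detectability criterion used in Figures 2 and 3 can be sketched as follows, using the definitions from the Figure 2 caption (impedance magnitude from Zʹ and Zʺ, LOD = 3σ, LOQ = 10σ). The function names and label strings are hypothetical, σ is assumed to be the instrument noise standard deviation supplied by the user (a scalar or one value per frequency), and taking the absolute value of the difference is an assumption of this sketch.

```python
import numpy as np


def impedance_magnitude(z_real, z_imag):
    """Complex magnitude Z = sqrt(Z'^2 + Z''^2) at each frequency."""
    return np.hypot(z_real, z_imag)


def classify_delta_z(baseline_mag, fault_mag, sigma):
    """Compare the impedance change with the LOD and LOQ at each frequency.

    baseline_mag : impedance magnitude without the fault
    fault_mag    : impedance magnitude with the fault
    sigma        : noise standard deviation, same units as the magnitudes
    """
    delta = np.abs(np.asarray(fault_mag, dtype=float)
                   - np.asarray(baseline_mag, dtype=float))
    lod = 3.0 * np.asarray(sigma, dtype=float)    # limit of detection
    loq = 10.0 * np.asarray(sigma, dtype=float)   # limit of quantitation
    labels = np.where(delta >= loq, "quantifiable",
                      np.where(delta >= lod, "detectable", "below noise"))
    return delta, labels
```

Because the caption of Figure 3 notes that the LOD and LOQ change with string size, `sigma` would need to be characterized separately for each configuration rather than reused from single-cell measurements.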

Figure 3 shows the diagnostic signal, ∆Z, for strings where a single cell’s DOD is 20% higher than that of the other cells. The diagnostic signal associated with the imbalance is detectable in the same low-frequency domain (less than 2 Hz) for a small 4S string, but it becomes weak for a 6S string and falls below the noise level for the 10S string. Therefore, string size could impact the detectability of impedance-based diagnostics for similar faults. Increasing the resolution and accuracy of the instrument may improve detectability, but challenges to this technique in larger strings could persist.

Figure 4 shows another application of the impedance-based method: detecting cell-to-cell capacity heterogeneity due to nonuniform aging. The signal can distinguish an increased number of aged cells within a 10S string (Figure 4a), but fails to do so for a 4P module (Figure 4b) because the signal is insensitive to the increased impedance in parallel settings. Besides these mild-abuse and nonuniform-aging examples, this impedance-based method can identify impending major-abuse conditions (overcharge and overtemperature) in single- and multicell series configurations, where similar but more distinct changes are observed.76

The impedance-based method discussed here has the potential to quickly identify battery issues. Beyond online battery monitoring and management, it could help first and second responders evaluate battery SOS or a battery pack’s aging heterogeneity at the end of useful first life. How the method would perform in field applications, under the compounding effects of aging and temperature nonuniformity, remains to be seen. The method requires baseline information for fault and aging conditions and, therefore, could require extensive resources. More evaluation is needed to generalize detectability to configurations beyond lab settings.

Electrochemical nascent short circuit detection

Additional methods of detecting a short circuit (SC) exist; one notable example is internal SC detection, developed by Sazhin et al.52,70,85 The method is validated using precise simulated internal shorts with different external resistors that cause evolution of elevated self-discharge (SD) or internal SC current (ISC), as shown in Figure 5a. Other useful detection metrics are current zero crossing point (CZCP) time and SD or SC resistance (also shown in Figure 5a). As the severity of the short increases (i.e., lower resistance value), the SC current increases and the associated CZCP time (in the range of minutes) decreases. The method has also been extended to small multicell series and parallel configurations for different detection scenarios.75 Figure 5b shows that the SC current can be detected in multicell configurations and that signal strength increases when similar shorts exist in multiple cells. However, detecting a single-cell short in longer strings can be less straightforward because the signal strength decreases (Figure 5c). This poses a limitation for isolating one or more faulty cells from long-string measurements. Therefore, the SD technique could be implemented following a top-down approach: first on reasonably sized smaller string segments with higher SD current and, finally, on individual cells. This method is particularly useful for cells connected in parallel where a fault exists, because adding a healthy string in parallel to a string containing one shorted cell does not reduce detection sensitivity (Figure 6).76 The healthy string generates no additional shorting current. The SD current strengthens when similar faults exist in multiple strings. In summary, this method could identify a nascent internal short in single- and multicell configurations in minutes. Unlike impedance-based methods, the SC method is sensitive to parallel-string issues. The method is promising for field applications because of its simplicity, brief duration, and broad applicability to any battery chemistry and design.
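The top-down approach described above can be sketched as a recursive search that tests a segment, and only if the SD signal exceeds a threshold, splits it and tests the halves. This is an illustrative sketch, not the procedure of the cited work: the function name, the `measure_sd_current` callback, and the threshold are all hypothetical, and how the signal would actually be measured on a segment is left to the caller.

```python
from typing import Callable, List, Sequence


def locate_faulty_cells(
    cells: Sequence[int],
    measure_sd_current: Callable[[Sequence[int]], float],
    threshold: float,
) -> List[int]:
    """Narrow down which cells show an elevated self-discharge signal.

    cells              : indices of the cells in the segment under test
    measure_sd_current : hypothetical callback that returns the SD/SC
                         signal measured across the given segment
    threshold          : hypothetical detection threshold on that signal
    """
    if not cells or measure_sd_current(cells) < threshold:
        return []              # segment looks healthy, stop descending
    if len(cells) == 1:
        return [cells[0]]      # fault isolated to a single cell
    mid = len(cells) // 2
    return (locate_faulty_cells(cells[:mid], measure_sd_current, threshold)
            + locate_faulty_cells(cells[mid:], measure_sd_current, threshold))
```

If the signal from a single shorted cell weakens with string length, as Figure 5c indicates, the search should start from segments short enough for that signal to clear the threshold; starting from the full string risks missing the fault at the first step.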

Figure 5

(a) Single-cell current response with simulated external short, (b) SC current evolution in a three-cell string with up to three short-circuited cells, and (c) SC current evolution for one short-circuited cell within strings of up to three cells.52,70 Shorts are created by using external resistors in parallel to the battery.

Figure 6

ISC evolution with the addition of a parallel string at 90% SOC.76

Electrochemical methods for Li plating

The last example shows the issues associated with electrochemical (EC) detection of Li plating. As discussed earlier, researchers have identified and used individual global EC signatures, namely open-circuit voltage (OCV) or end-of-charge rest voltage (EOCV), dV dt–1, dQ dV–1 (or dV dQ–1), and Coulombic efficiency (CE), for varied Li-plating conditions: low-temperature operation, N:P ratio < 1, and fast-charge conditions.36,37,39,40,41,86 However, one should be cautious about generalizing these individual signatures as diagnostics for Li plating across all conditions. For instance, the strength of the OCV, dV dt–1, and dQ dV–1 signatures varies significantly with charging rate, upper charge cutoff voltage, rest after charging, discharge C-rate (where applicable), and temperature.40,87 One can expect to see EC signatures distinctly and reliably in early cycling under low-temperature and N:P ratio < 1 Li-plating conditions, but in other Li-plating conditions, such as high-temperature aggressive fast charging, some signatures might not present distinctly and reliably, although plating conditions exist. For instance, Chen et al.42 observed distinct dV dt–1 signatures (i.e., the plateau between 2 and 4 min in Figure 7) due to chemical intercalation of plated Li at 30°C. The plateau vanishes after Cycle 5 in Figure 7 even though other signs of Li plating, such as CE and EOCV, are present. No pertinent dQ dV–1 signature is observed, even in early cycles where distinct dV dt–1 signatures are observed. Chinnam et al.87 later found that the reversibility of plated lithium is a strong function of C-rate, decreasing with increased C-rate and thereby making the pertinent dV dt–1 signature very weak. The absence of a dQ dV–1 signature under fast-charging plating conditions is attributed to this reduced reversibility, combined with rest after charge.40,87 Chen et al.42 concluded that multiple signatures must be combined to make a conclusive decision on Li plating for fast-charging conditions. Chen et al.’s framework should be generally applicable to other Li-plating conditions.

Figure 7

Post-charge open-circuit dV dt–1 (or dOCV dt−1) signatures associated with Li plating for a gr/NMC532 cell. The cells were tested at 30°C with 6C CC-CV charge and C/2 discharge with 15 min rest in between charge and discharge steps.42,95

Safety and diagnostic needs for next-generation high-energy-density batteries

Tremendous R&D efforts are being undertaken to increase the specific energy of current state-of-the-art LiBs from 250 Wh kg–1 to as high as 500 Wh kg–1 by moving away from graphite in the anode to silicon or Li metal, paired with either high-nickel cathodes or sulfur and with either liquid or solid electrolytes.3,88 Besides pushing for high-energy-density active materials with innovative architectures, shedding weight from nonactive materials by using thinner current collectors, thinner and less-porous separators, tabs, etc., is an additional strategy to boost Wh kg−1.3 Unlike conventional graphite-based batteries, these battery technologies are also suggested to operate at higher compressive pressures, up to megapascals,34,83,89 which further increase with battery SOC and age.84,90 Reported mild- and major-abuse testing of these high-energy-density batteries suggests more safety concerns than for traditional LiBs. Therefore, the use of high-energy-density materials in a cell with thinner supporting components at higher compressive pressure calls for more innovative, reliable, and advanced monitoring and diagnostics in addition to some of the diagnostics highlighted in this article.

The need for standard testing methods and platforms

R&D efforts in developing and maturing advanced management and diagnostics (M&D) toolsets and algorithms for different battery configurations are expected to continue. Implementing, extending, or validating these advanced M&D technologies in multicell configurations in real applications for different battery aging and fault scenarios is time- and resource-intensive, which could delay their maturation and adoption. Therefore, reconfigurable test setups and rapid validation platforms must be developed in parallel91,92,93 to facilitate quick maturation of the emerging M&D technologies developed by the broader R&D community. These platforms advance the ability to compare and verify the applicability and validity of different M&D technologies systematically and more directly in real-time environments. Such activities would also help ensure that appropriate tools are in place for the safety of consumers and emergency responders.

Summary

LiBs have become an integral part of human life. Advanced management and diagnostics (M&Ds) throughout the service life of LiBs are indispensable, particularly in large, expensive, and critical systems (e.g., EVs, stationary and grid-storage applications, and the aerospace industry). Relying on overdesign and underutilization of battery packs as an alternative to advanced M&Ds is a poor strategy; it brings unmitigated risk of potentially catastrophic events and significant financial loss related to warranty and recall liabilities and could ultimately fuel negative public perception. The declining price of LiB packs94 also justifies more investment in developing robust and advanced M&Ds to ensure the safety of LiB-powered systems.

Advanced M&Ds in multicell configurations are currently in a nascent state. Many cell-level electrochemical battery M&Ds exist, but very few have been implemented in multicell configurations. Challenges to extending single-cell diagnostics to multicell configurations include the subtle and transient nature of detection signals; the sensitivity of a detection signal to string configuration and size; the time required to detect the signal; issues separating cell-to-cell heterogeneity from overall aging; and the need for additional, more accurate sensors, more onboard computational power, and large sets of baselining data specific to design and chemistry. Some of the diagnostics have the potential to work for a particular battery fault, but not for another. It thus makes sense to have multiple complementary diagnostics in place, possibly combining multiple electrochemical signals. A modular approach to detecting issues in large strings and parallel configurations can be implemented. This requires more experimental validation to enable quicker maturation and transition from lab to field applications.