Introduction

Background on Lithium-Ion Battery Research

Due to growing concerns about the environment and sustainability, there is an urgent need for advanced energy storage technology to facilitate the adoption of Electric Vehicles (EVs) and smart grids [1]. A Lithium-ion Battery (LIB) stores energy through reversible lithium-ion reduction. In a standard cell, graphite serves as the negative electrode, acting as the anode during discharge. The positive electrode is often a metal oxide that acts as the cathode during discharge [2]. Figure 1 illustrates the most commonly used LIB configuration. The schematic shows the movement of lithium ions in the electrolyte, shuttling reversibly between the two electrodes of the device.

Fig. 1 Illustration of the widely utilized LIB configuration featuring LiCoO2 cathode and graphite anode during the discharge state

LIBs with a LiCoO2 cathode and a graphite anode are at the forefront of modern energy storage systems, powering a wide range of applications. To ensure their optimal performance, it is crucial to develop advanced models, optimization techniques, and management strategies. LIBs have become a transformative technology because of their high operating potential and high energy and power density. They are widely used in modern society, from small-scale applications such as mobile phones and laptops to large-scale applications such as EVs and microgrids [2]. Although the energy density of conventional LIBs is nearing its theoretical maximum, their performance and cost remain unsatisfactory [3]. Consequently, extensive efforts have been dedicated to exploring new electrode and electrolyte materials in order to enhance the capabilities of current LIBs [4].

Overview of Machine Learning and its Relevance in Scientific Research

Machine Learning (ML) is a branch of Artificial Intelligence (AI) that focuses on using data and algorithms to imitate the way humans learn, gradually improving in accuracy. ML methods enable computers to learn without being explicitly programmed and have multiple applications in scientific research, such as data analysis, pattern recognition, anomaly detection, and simulation [5]. ML can also help scientists gain new insights and understanding from large and complex datasets [6].

ML has recently emerged as a powerful tool in LIB research. It has been used to accelerate materials development, for example by screening fast ion conductor candidates and by filtering electrolytes for their ability to suppress dendrite formation in lithium metal anodes [7]. More examples of using ML in LIB research are presented in Chap. 5. ML algorithms commonly used in battery research include supervised and unsupervised learning [8]; these and other major ML techniques are explained later in this chapter. These approaches have gained attention due to the high complexity of the LIB cell production chain and advancements in digitalization and information technology [9]. The main objective of using ML in LIB research is to accelerate the design and optimization of the next generation of batteries [10]. Chapter 4 of this paper addresses the current challenges of LIB technology, while Chap. 5 focuses on how ML can address those challenges as well as the knowledge gaps in the field.

By harnessing the power of algorithms and data analysis, researchers have been able to unravel complex relationships and patterns within LIBs. Various ML methods, such as Artificial Neural Networks (ANN), support vector machines, and random forests, have been applied to predict LIB performance, optimize material synthesis, and enhance electrolyte design [11].

Historically, investigations into materials have depended predominantly on either trial-and-error experiments or fortunate discoveries, both of which require a large number of laborious trials that are time-consuming, expensive, and inefficient [12]. Figure 2 shows how efficiency in materials research has improved from trial and error to first principles, high-throughput screening, and the ML approach. In the past decades, computational chemistry methods such as first-principles calculations, molecular dynamics, and Monte Carlo techniques have become a major approach to aid and enhance experimental research for materials exploration and design. However, current models cannot predict many real-world materials challenges because the calculations scale poorly and carry high computational costs [13]. This is explained in more detail in Sect. 5.8. Thus, it is crucial to accelerate materials research by finding new approaches and methods. ML is one such approach: it accelerates LIB research by processing data and finding correlations between different factors and variables.

Fig. 2 The advancement of techniques in materials research

Different ML Techniques used in LIB Research

Using ML techniques, researchers have made great progress in predicting how batteries will behave, improving their performance, and managing them efficiently. ML techniques have been applied to LIB research for various purposes, such as screening fast ion conductor candidates [7], predicting ionic conductivity [10], and designing solid-state electrolytes [14]. Detailed examples are presented in Chap. 3 and 4 of this paper. Some of the common ML techniques used in LIB research are:

  • Supervised learning: a technique that learns from labelled data and predicts the output for new inputs. Supervised learning algorithms can be used to predict the ionic conductivity of a material based on its chemical composition [10].

  • Unsupervised learning: a technique that learns from unlabeled data and finds patterns or clusters in the data. Unsupervised learning algorithms can be used to group materials with similar properties or structures [7].

  • Artificial Neural Network (ANN): a model composed of layers of nodes (an input layer, one or more hidden layers, and an output layer) that learns complex features from data. ANN algorithms can be used to model the electrochemical behavior of batteries based on experimental data [15].

  • Reinforcement learning: a technique that learns from trial and error and optimizes the actions based on rewards or feedback. Reinforcement learning algorithms can be used to design optimal experiments for material synthesis or characterization [14].
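
As a minimal illustration of the first two categories, the sketch below fits a least-squares line to labeled pairs (supervised) and groups unlabeled values with a one-dimensional k-means (unsupervised). The function names and the numerical values are hypothetical and are not taken from any of the cited studies; they only show the mechanics of each approach.

```python
from statistics import mean

def fit_line(xs, ys):
    """Supervised: ordinary least squares for y = a*x + b from labeled pairs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar

def kmeans_1d(values, k=2, iters=20):
    """Unsupervised: 1D k-means (k >= 2); returns the sorted cluster centres."""
    ordered = sorted(values)
    # Initialise the centres at evenly spaced positions in the sorted data.
    centres = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            groups[nearest].append(v)
        # Keep the previous centre if a cluster received no points.
        centres = [mean(g) if g else c for g, c in zip(groups, centres)]
    return sorted(centres)

# Hypothetical labeled data: one input feature and a measured target.
slope, intercept = fit_line([0.0, 0.1, 0.2, 0.3], [1.0, 1.5, 2.0, 2.5])

# Hypothetical unlabeled data: a single property measured for six materials.
centres = kmeans_1d([1.0, 1.1, 0.9, 5.0, 5.2, 4.8], k=2)
```

The supervised routine needs the target values to learn from, whereas the clustering routine receives only the measurements and discovers the two groups on its own.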

Main Contributions in this Research

This paper provides a comprehensive review of the current state and future trends in using ML techniques in LIB research. The paper discusses the applications, challenges, and opportunities in applying ML techniques to LIB research. Chapter 1 starts with background on LIBs and ML and a general description of using ML techniques in LIB research. Chapter 2 describes the applications of ML across the LIB lifecycle: in design (such as materials optimization), in manufacturing, in service, and at end of life in the recycling stage. The most common applications of ML techniques in LIB research are the prediction of State Of Charge (SOC), State Of Health (SOH), and Remaining Useful Life (RUL). Chapter 3 describes the challenges of applying ML techniques in LIB research. These include, but are not limited to, data availability, computational complexity, and the black-box nature, interpretability, and explainability of the models. Data bias is a main challenge that is described in detail. Moreover, solutions are suggested to tackle the challenges. Chapter 4 covers possible future trends. These include, but are not limited to, ML models, lack of knowledge of micro-behavior and micro-mechanics, self-improving models, big data, the need for accuracy, and the incorporation of first-principles models.

The four chapters of this paper aim to explore the significance of using ML in LIB research by discussing its applications, challenges, knowledge gaps, and potential future directions. Existing review studies of ML methodologies and LIB technology have limitations. The main objectives of this study are to highlight the benefits of using ML in LIB research and to provide insights into how it can accelerate the discovery and development of lithium-based materials and technologies. This research makes three main contributions: (i) it gives a comprehensive review of the latest advancements in ML techniques applied to LIB research up to June 2023, with a focus on their practical applications and outcomes; (ii) it addresses the main challenges of research in this field; and (iii) it identifies research gaps. The findings presented in this paper will serve as a valuable resource for researchers and practitioners in the field, aiding in the design and optimization of lithium-based systems.

Unlike the existing literature on machine learning in LIB research, this manuscript delves deeper into applications, challenges, and future trends, offering a more comprehensive analysis of recent advancements in this area. Specifically, the challenges and future trends of using ML in LIB research have not been studied in the available review literature. For example, the importance of the emerging N-Shot Learning field is not covered in current reviews.

In conclusion, our manuscript distinguishes itself from other recent reviews by offering a novel synthesis of literature, focusing on emerging research directions, providing in-depth analysis of controversies, and integrating multidisciplinary perspectives. This paper not only contributes to our understanding of Machine Learning applications but also empowers professionals in this field to harness its capabilities effectively.

Applications of ML in Lithium Research

The application of ML techniques in this field has enabled researchers to gain deeper insights and enhance various aspects of the battery lifecycle, from materials design and fabrication to performance evaluation and optimization. Figure 3 provides a concise summary of ML applications throughout the entire lifecycle of LIBs. It includes an overview of existing methods and of important considerations in each lifecycle stage: design, manufacturing, service, and end of life. The figure also presents some examples of using ML in LIB research at each stage.

Fig. 3 LIB lifecycle research strategy and main applications of ML in LIB research

In particular, the remarkable characteristics of ML approaches motivated the authors to employ data-driven approaches for summarizing the advancements in LIB technology throughout its entire lifecycle. The commonly used input characteristics, their respective applications, and the corresponding ML models are summarized in Table 1. For example, Hsu et al. used a Neural Network (NN) approach to predict SOH and RUL in LIBs.

Table 1 ML models used in a range of research topics of LIBs

SOH, SOC, and RUL are metrics that are relevant across all phases of LIB lifecycle. Table 2 shows how they relate to each of the lifecycle stages:

Table 2 Relationship between SOH, SOC, and RUL metrics and each one of the LIB lifecycle phases

In general, while SOH, SOC, and RUL are influenced by the initial design, they are metrics primarily used for ongoing monitoring, management, and end-of-life decisions throughout the battery’s operational life. They are not confined to the design phase but are integral throughout the entire lifecycle of the battery.

Consequently, machine learning techniques applied to predict SOH, SOC, and RUL are applicable throughout all stages of the LIB lifecycle.

Design

ML has been employed to expedite the development of enhanced LIBs. Rather than solely expediting scientific analysis through data pattern recognition, researchers have merged ML with empirical knowledge and physics-guided equations to uncover processes that impact LIB design [35]. The main trends in utilizing ML for the design of LIBs are presented below.

Optimizing Battery Materials and Design

ML has been applied to characterize battery performance [36], lifetime, and safety [37]. ML can be used to accelerate the understanding of new materials, chemistries [38], and cell designs [39]. For instance, ML has been used to increase battery life prediction accuracy by automatically generating equation components and narrowing down the selection from millions of possible combinations to identify a model that balances predictive accuracy with simplicity [38]. As another example, ML has been used in quantitative microscopy analysis to map the orientation and morphology of sub-particle grains in 3D [39]. This combined approach of electron backscatter diffraction (EBSD), ML, and modeling represents the first demonstration of mapping and simulating dynamic phenomena within single electrode particles [39].

As another example, Naha et al. [40] used a supervised ML technique for internal short circuit detection in LIBs. They identified and extracted a set of features that encompass the physics of a Li-ion cell with a short-circuit fault. Then, a training feature set was generated with and without an external short-circuit resistance across the battery terminals. They introduced an internal short by mechanical abuse to emulate a real user scenario. A random forest classifier was trained on the training feature set. The minimum fault detection accuracy on the testing dataset was 97% [40].

The microstructure of a composite electrode controls the charging and discharging process of individual LIB particles. Visualization of LIB particles helps to understand electrode degradation mechanisms that are directly associated with the spatial arrangement of different components in the electrode, including carbon matrix, void, binder, and active particles. For instance, Jiang et al. used ML-assisted statistical analysis and experiment-informed mathematical modeling to understand the electrochemical consequences of LIB particles’ evolving (de)attachment with the conductive matrix [41]. Figure 4 shows how a mix of active LiNi0.8Mn0.1Co0.1O2 (NMC) particles and inactive carbon/binder domains build the porous electrode.

Fig. 4 3D microstructure of the composite battery cathode, generated with the help of ML. A mix of active NMC particles and inactive carbon/binder domains build the porous electrode [41]

Lithium Metal Anode Design and Stability

A range of strategies have been reported to improve the stability and safety of lithium metal as an anode. For instance, Ahmad et al. used ML techniques to develop a computational screening method of inorganic solid electrolytes for the suppression of dendrite formation in lithium metal anodes. ML accelerated the screening by predicting the properties of solid electrolytes through the identification of structure-property relationships [42]. As another example, Kim et al. developed an ML model to facilitate the design of electrolytes for lithium metal anodes. With this approach, they extracted a previously unidentified insight: low solvent oxygen content can lead to superior cyclability [43].

Predicting Battery Performance Using ML Techniques

Predicting the performance and behavior of LIBs is important for their optimal utilization and integration into various applications. ML techniques are a powerful tool in this domain, enabling accurate prediction and characterization of battery performance. ML has recently emerged as a promising modeling approach to determine the SOC, SOH, and RUL of batteries [44]. Data-driven modeling uses historical data, real-time data, or both to train an ML algorithm to predict the future behavior of LIBs [45]. The prediction of SOC, SOH, and RUL is probably the most common application of ML techniques in LIB research. Table 1, presented in Sect. 2, shows a range of ML models for various applications, including RUL, SOC, and SOH estimation.
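
To make the three quantities concrete, the following sketch uses common capacity-based definitions: SOH as present full capacity over rated capacity, SOC as available charge over present full capacity, and RUL as the number of cycles left before SOH reaches an end-of-life threshold. The 80% threshold, the linear fade model, the function names, and the data are all assumptions made for illustration; the studies cited here use far richer data-driven models.

```python
def state_of_health(capacity_now_ah, capacity_rated_ah):
    """SOH as the ratio of present full capacity to rated capacity."""
    return capacity_now_ah / capacity_rated_ah

def state_of_charge(charge_available_ah, capacity_now_ah):
    """SOC as the ratio of available charge to present full capacity."""
    return charge_available_ah / capacity_now_ah

def remaining_useful_life(cycles, soh_values, eol_soh=0.8):
    """Extrapolate a least-squares line through (cycle, SOH) to the EOL threshold.

    Returns the estimated number of cycles left after the last observed cycle,
    or None if the fitted trend shows no capacity fade.
    """
    n = len(cycles)
    c_bar = sum(cycles) / n
    s_bar = sum(soh_values) / n
    sxx = sum((c - c_bar) ** 2 for c in cycles)
    slope = sum((c - c_bar) * (s - s_bar) for c, s in zip(cycles, soh_values)) / sxx
    if slope >= 0:
        return None
    intercept = s_bar - slope * c_bar
    eol_cycle = (eol_soh - intercept) / slope
    return max(0.0, eol_cycle - cycles[-1])

# Hypothetical cell rated at 2.0 Ah, measured every 100 cycles.
cycles = [0, 100, 200, 300]
soh = [state_of_health(c, 2.0) for c in (2.00, 1.96, 1.92, 1.88)]
soc = state_of_charge(0.94, 1.88)
rul = remaining_useful_life(cycles, soh)
```

With this made-up fade of 2% per 100 cycles, the line reaches the 80% threshold at cycle 1000, so roughly 700 cycles remain after cycle 300.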

Battery Material Characterization

Most studies apply ML techniques to estimate the SOC, SOH, and RUL of LIBs; fewer focus on characterization. As an example of mechanism characterization, Bhowmik et al. used a semi-supervised generative deep learning model to characterize the formation of the Solid Electrolyte Interphase (SEI) in batteries [46]. As another example, Zhang et al. used ML techniques to characterize the degradation pattern of LIBs from impedance spectroscopy data [47].

Manufacturing

Various stages of LIB manufacturing, such as electrode production, cell assembly, cell finishing, evaluation, and screening, have benefitted from the application of ML techniques. ML techniques offer a non-intrusive solution with remarkable accuracy and minimal processing requirements for optimizing and modeling engineering challenges within LIB manufacturing. For instance, Drakopoulos et al. used an ML approach to develop graphite-based anode electrodes [48]. They established a connection between manufacturing protocols and the final electrochemical and cycle life performance parameters by leveraging an ML model trained on a database containing input and output attributes. Consequently, they predicted and designed the formulation and manufacturing process to yield thick, high-coat-weight, graphite-based electrodes [48]. Another study proposed an ensemble learning framework based on the RUBoost ML method for classifying electrode quality in LIB manufacturing [49]. The framework effectively classified three important quality indicators, (1) electronic conductivity, (2) thickness, and (3) half-cell capacity, for both LiFePO4- and Li4Ti5O12-based electrodes. The models developed within the framework addressed class imbalance issues and accurately predicted the qualities of the manufactured electrodes [49].
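
One common way to handle class imbalance is to randomly undersample the majority class so that each class contributes equally to training. The sketch below shows only that resampling step; it is not the boosting framework of [49], and whether it matches the exact procedure used there is an assumption. The function name, labels, and class sizes are hypothetical.

```python
import random
from collections import Counter, defaultdict

def random_undersample(samples, labels, seed=0):
    """Return (sample, label) pairs with every class cut to the minority size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    n_min = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        # Draw n_min samples without replacement from each class.
        for sample in rng.sample(items, n_min):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return balanced

# Hypothetical electrode batch: 90 acceptable and 10 defective samples.
samples = list(range(100))
labels = ["ok"] * 90 + ["defect"] * 10
balanced = random_undersample(samples, labels)
class_counts = Counter(label for _, label in balanced)
```

After resampling, both classes hold 10 samples, so a classifier trained on the result can no longer reach high accuracy by always predicting the majority class.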

Service

Owing to the efficiency of ML models, researchers have also used them to improve LIBs in service.

Lithium-ion Battery Modeling, Optimization, and Management

Battery cells can be optimized to maximize either energy or power, depending on their intended use. For instance, batteries with thicker electrodes exhibit higher capacity, while batteries with thinner electrodes are more suitable for power delivery [2]. Optimization techniques such as the progressive quadratic response surface method have been used to optimize design variables of existing electrode materials for enhanced power and capacity of LIBs [50]. A range of ML models used for fault detection and diagnosis of LIBs is presented in a study by Samanta et al. [51].

Early Detection of Battery Degradation Using ML

ML techniques have been widely adopted for efficient, reliable, and accurate prediction of battery degradation in LIBs [15]. For example, researchers have combined a Gaussian process ML model with electrochemical impedance spectroscopy (EIS) to build an accurate battery forecasting system [47]. This approach takes the entire EIS spectrum as input and automatically determines which spectral features predict degradation [47]. Another example is the use of a deep extreme learning approach to predict battery degradation and the RUL of LIBs [52].

End of Life

ML in LIB Recycling

ML has great potential for improving the efficiency of lithium recycling [53]. By predicting battery lifetime and optimizing the recycling process, it is possible to reduce waste and increase the yield of recycled materials. For instance, researchers used a data-driven ML algorithm to predict the lifetime of solid-state batteries [54]. This approach provides a new way for the batch classification, echelon utilization, and recycling of batteries [54]. ML algorithms have shown great promise in predicting the lifetime of solid-state batteries [55]. In addition, ML algorithms such as ANN and random forests have been used to predict waste generation at the municipal level with high accuracy [56]. Furthermore, ML techniques can be used to optimize the recycling process by predicting the optimal conditions for each step. This can help reduce the amount of energy required for recycling and increase the yield of recycled materials [55]. Additionally, ML can be used to identify impurities in recycled materials and predict their effect on battery performance [57].

Challenges in Applying ML to LIB Research

Realizing the applications of ML in LIB research described above requires addressing many challenges and open research areas. These require interdisciplinary and systemic approaches that consider the technical and economic aspects of LIB studies.

Data Availability

Applying ML models to predict LIB properties requires high-quality data. One of the challenges of using ML, and of explaining model predictions, in LIB research is therefore the availability of battery data. This is a crucial hurdle in battery informatics, and researchers have been working on mitigating the data scarcity challenge [15]. For example, LIB recycling data is very scarce and complex. The variety of recycling methods, differences among recycler producers, and variations in experimental setups make it difficult to compare data and validate models [58]. Section 4.1 explains how integration of domain knowledge and transfer learning can help to mitigate data scarcity.

Data Preprocessing and Cleaning Challenges

The next challenge is data pre-processing and cleaning, which are critical in any ML project. These steps involve identifying and correcting errors, inconsistencies, and missing values; transforming the data into a format that the ML algorithm can readily use; and creating new features from the existing ones or reducing the dimensionality of the data. Poor data quality reduces accuracy and leads to false predictions [59]. Because of the importance of data pre-processing for using ML algorithms in LIB research, researchers have been trying to provide pre-processed databases. For example, Hargreaves et al. provided a pre-processed dataset of lithium-ion conductors and their conductivities, to be used by other researchers in this field [14]. Their dataset contains the chemical composition, an assigned structural label, and the ionic conductivity at a specific temperature. This dataset saves researchers time and aids experimentalists in prioritizing candidates for further investigation as lithium-ion conductors [14].
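
The cleaning steps described above can be sketched on a small table of records. The example drops rows with a missing target, imputes a missing feature with the median, and adds a standardized copy of the feature. The field names (`temp_c`, `log_sigma`), the values, and the choice of median imputation are assumptions for illustration and do not reproduce the dataset or procedure of [14].

```python
from statistics import mean, median, pstdev

def clean_and_scale(records, feature, target):
    """Drop rows without a target, impute the feature, and add a z-score."""
    # 1. Drop records whose target value is missing.
    kept = [dict(r) for r in records if r.get(target) is not None]
    # 2. Impute missing feature values with the median of the observed ones.
    observed = [r[feature] for r in kept if r.get(feature) is not None]
    fill = median(observed)
    for r in kept:
        if r.get(feature) is None:
            r[feature] = fill
    # 3. Standardise the feature to zero mean and unit variance.
    values = [r[feature] for r in kept]
    mu, sigma = mean(values), pstdev(values)
    for r in kept:
        r[feature + "_z"] = (r[feature] - mu) / sigma if sigma else 0.0
    return kept

# Hypothetical raw records with gaps in both the feature and the target.
raw = [
    {"temp_c": 25.0, "log_sigma": -4.0},
    {"temp_c": None, "log_sigma": -5.0},
    {"temp_c": 35.0, "log_sigma": None},
    {"temp_c": 45.0, "log_sigma": -3.0},
]
cleaned = clean_and_scale(raw, feature="temp_c", target="log_sigma")
```

The function copies each record before modifying it, so the raw data stay available for auditing how much pre-processing was applied.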

Limited Sample Size

LIB research often involves a limited number of samples, particularly for novel materials or designs. This can lead to overfitting when training ML models, where the model captures noise in the data rather than the true underlying patterns [60]. For example, Zhang et al. faced the challenge of limited sample size in their research on predicting the RUL of LIBs. They used a dropout technique to address the overfitting challenge [61].
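
For readers unfamiliar with dropout, the sketch below shows the widely used "inverted dropout" variant: during training each activation is zeroed with a given probability and the survivors are rescaled, while at inference the activations pass through unchanged. This is a generic illustration with hypothetical names and values, not the network or settings used in [61].

```python
import random

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    # Dividing by `keep` leaves the expected value of each unit unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
hidden = [1.0] * 10000                       # hypothetical hidden-layer outputs
train_out = dropout(hidden, rate=0.5, rng=rng, training=True)
eval_out = dropout(hidden, rate=0.5, rng=rng, training=False)
train_mean = sum(train_out) / len(train_out)
```

Because a different random subset of units is silenced on every training step, the model cannot rely on any single unit, which is what discourages it from memorizing noise in a small dataset.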

Computational Complexity

Another challenge is deriving models that are highly accurate, have low computational complexity, and enable real-time state and parameter estimation [62]. The application of ML techniques in LIB research presents challenges related to computational requirements and complexity. ML models, particularly ANN models, can have millions or even billions of parameters, leading to considerable computational and memory requirements [63]. This can result in long training times and high energy consumption. Ensuring low-latency, real-time processing while maintaining model performance and accuracy is also a critical computational complexity challenge. Researchers are working on finding ways to increase performance without increasing computing power [63]. For instance, to keep the computational complexity of SOC estimation low, Lucchetta used a Nonlinear AutoRegressive with eXogenous input (NARX) model with only one hidden layer and a few neurons [64].
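
A short calculation shows why a single small hidden layer keeps the cost low. The sketch builds the lagged input vector of a NARX-style model and counts the weights and biases of a one-hidden-layer network. The lag orders, layer size, signal values, and function names are hypothetical, and the convention of using only past inputs u(t-1), u(t-2), ... is an assumption (some formulations also include the current input u(t)).

```python
def narx_regressor(y_hist, u_hist, n_y, n_u):
    """Input vector [y(t-1)..y(t-n_y), u(t-1)..u(t-n_u)] for predicting y(t)."""
    return y_hist[-n_y:][::-1] + u_hist[-n_u:][::-1]

def mlp_parameter_count(n_inputs, n_hidden, n_outputs=1):
    """Weights plus biases of a fully connected network with one hidden layer."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs

# Hypothetical histories: past SOC estimates (y) and past current samples (u).
soc_history = [0.9, 0.8, 0.7]
current_history = [1.0, 1.2, 1.1]
x = narx_regressor(soc_history, current_history, n_y=2, n_u=2)

small = mlp_parameter_count(n_inputs=len(x), n_hidden=5)
large = mlp_parameter_count(n_inputs=len(x), n_hidden=1000)
```

With four lagged inputs and five hidden neurons the model has only 31 trainable parameters, against 6001 for a hidden layer of 1000 neurons, which illustrates how a compact architecture supports real-time estimation on modest hardware.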

Model Generalization

Generalization refers to the ability of a trained model to make accurate predictions on new or unseen data [65]. LIBs operate under various conditions, such as temperature, discharge rate, and cycling protocol. Ensuring that ML models generalize well across these diverse conditions is challenging and requires careful consideration of the model’s architecture and training methodology. For example, Zhang et al. suggested a deep learning model that is capable of overcoming the generalization challenge to predict the life of LIBs [66]. Moreover, Schofer et al. developed an ML framework to predict the lifetime of lithium-ion cells with improved generalization [65].
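
One simple way to check generalization across operating conditions is to hold out every sample from one condition at a time, so that the model is always tested on a condition it has never seen. The sketch below implements such a leave-one-group-out split. The cell records and the use of temperature as the grouping variable are hypothetical, and this is a generic validation scheme rather than the method of [65] or [66].

```python
def leave_one_group_out(samples, group_of):
    """Yield (held_out_group, train, test), holding each group out once."""
    groups = sorted({group_of(s) for s in samples})
    for g in groups:
        train = [s for s in samples if group_of(s) != g]
        test = [s for s in samples if group_of(s) == g]
        yield g, train, test

# Hypothetical cells cycled at three different temperatures.
cells = [
    {"id": "A", "temp_c": 10},
    {"id": "B", "temp_c": 10},
    {"id": "C", "temp_c": 25},
    {"id": "D", "temp_c": 45},
]
splits = list(leave_one_group_out(cells, lambda cell: cell["temp_c"]))
```

A random split would mix cells from the same temperature into both sets and could overstate accuracy; grouping by condition gives a more honest estimate of performance on unseen conditions.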

Black-Box Nature of ML Models, Their Interpretability and Explainability

One of the major challenges is the black-box nature of some ML models, which may lead to a lack of interpretability and explainability. The term refers to the difficulty of understanding the decision-making process of these models [68]. Due to their complex and non-linear structures, it becomes difficult to interpret and understand the underlying processes and decision-making mechanisms [67]. This lack of transparency limits the ability to gain insights into the relationships between input variables and model outputs, hindering the development of a comprehensive understanding of LIB behavior. While these models can provide accurate predictions, their inner workings can be difficult to interpret, limiting their usefulness for scientific discovery [69].

Interpretability and explainability are important qualities for ML models used in scientific research. These qualities permit the identification of potential model issues or limitations, build trust in model predictions, and unveil unexpected correlations that may lead to scientific insights [58]. However, achieving interpretability and explainability can be challenging due to the complexity of ML models and the need for uncertainty estimates for model explanations [70]. Furthermore, quantifying and propagating uncertainties through the models is critical for assessing the reliability of predictions and making informed decisions. To overcome this challenge, when Hargreaves et al. could not interpret the material data by visualization, they calculated the error of each prediction and plotted the errors as a histogram to quantify their distribution [14].
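
The error-distribution idea can be sketched in a few lines: compute the signed error of every prediction and count how many fall into each bin. The bin width, the values, and the function name are hypothetical, and the sketch illustrates the general technique rather than the exact analysis in [14].

```python
import math

def error_histogram(y_true, y_pred, bin_width):
    """Count signed errors (pred - true) per bin.

    Bin index i covers the interval [i * bin_width, (i + 1) * bin_width).
    """
    counts = {}
    for t, p in zip(y_true, y_pred):
        index = math.floor((p - t) / bin_width)
        counts[index] = counts.get(index, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical measured and predicted values of a material property.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [0.7, 2.1, 3.2, 4.7]
histogram = error_histogram(measured, predicted, bin_width=0.5)
```

A histogram that is centered on the zero bin and falls off quickly suggests small, unbiased errors, whereas a shifted or long-tailed histogram points to systematic over- or under-prediction.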

Quantification of uncertainty is a method to address the black-box nature and enhance the interpretability and explainability of ML models. However, it can be a challenge in LIB research. Battery systems exhibit inherent variability, and understanding the uncertainty in model predictions is crucial for decision-making and risk assessment [71]. Uncertainty quantification methods play a pivotal role in reducing the impact of uncertainties during both optimization and decision-making processes [71]. However, the poor explainability of some ML models, such as ANN models, has hindered their adoption in safety- and quality-critical applications [72]. To address this, Zhang et al. developed methods to enhance the explainability of ANN models through uncertainty quantification-based frameworks [72].
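
One simple, model-agnostic way to attach an uncertainty estimate to a prediction is a bootstrap ensemble: the model is refitted on many resampled copies of the data, and the spread of the resulting predictions serves as the uncertainty. The sketch below applies this to a straight-line fit of SOH against cycle number. It is a stand-in for illustration only and is not the framework of [72]; the data, the linear model, and the function names are hypothetical.

```python
import random
from statistics import mean, stdev

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return a, y_bar - a * x_bar

def bootstrap_prediction(xs, ys, x_new, n_models=200, seed=0):
    """Mean and standard deviation of predictions from bootstrap refits."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue                                 # slope undefined, skip
        a, b = fit_line(bx, by)
        predictions.append(a * x_new + b)
    return mean(predictions), stdev(predictions)

# Hypothetical SOH measurements with a small alternating measurement error.
cycles = [0, 50, 100, 150, 200, 250, 300, 350]
soh = [1.0 - 0.0002 * c + (0.002 if i % 2 == 0 else -0.002)
       for i, c in enumerate(cycles)]
soh_mean, soh_std = bootstrap_prediction(cycles, soh, x_new=500)
```

Reporting `soh_mean` together with `soh_std` tells a decision-maker not only what the model predicts at cycle 500 but also how much that prediction depends on the particular samples that were collected.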

Furthermore, the lack of interpretability hinders the ability to extract actionable insights from ML models and limits their practical implementation. It can be addressed by methods known as explainable or interpretable ML (XML/IML), which aim to fill this gap in transparency. Faraji et al. presented a comprehensive review of XML/IML methods in LIB research [73]. Although interpretability and explainability in LIB research have not yet been considered comprehensively, more researchers are expected to focus on the topic in future studies.

Scalability of ML Algorithms for Large Datasets

The next challenge is scalability, which refers to the ability of a system to handle an increasing amount of work or data without a decrease in performance. Scalability is a significant challenge when applying ML algorithms to large datasets in LIB research [74]. The abundance of data generated from experimental and computational studies requires efficient algorithms that can handle the volume, velocity, and variety of the data. Traditional ML algorithms often struggle to scale with large datasets, resulting in increased computation time and resource requirements, which prevents them from unleashing the hidden value of big data [75]. This challenge has led to the development of scalable ML techniques specifically tailored for big data applications in LIB research [74]. In a successful case, Roman et al. designed scalable data-driven models for battery SOH estimation by emphasizing the value of confidence bounds around the prediction [74].

Data Bias

Data Collection Bias

Data bias, the next challenge, refers to the systematic error introduced into collected data by the complex interplay among the many factors that shape the overall characteristics of battery materials [76]. These factors span various length scales and include electronic, structural, and microstructural variables. For instance, the atomic-level crystalline structure and chemical composition dictate the conductivity of a solid electrolyte. At larger scales, aspects like particle morphology, size, and arrangement within the electrolyte’s microstructure impact its conductivity. On the scale of the battery cell, the interplay between the electrolyte and electrode, along with the formation of an interface layer between them, further contributes to conductivity variations [76]. The interplay of these factors gives rise to significant fluctuations in the measured conductivities, even among materials that share identical compositions, and this introduces a bias into the collected data. As an illustration, the conductivity of garnet Li5La3Ta2O12 can span two orders of magnitude, from 10⁻⁶ to 10⁻⁴ S cm⁻¹, depending on the synthesis techniques and temperatures employed [77].

This complexity underscores the necessity of augmenting data labeling beyond the materials’ scope by extending it to encompass details regarding synthesis, processing, and characterization. Nonetheless, this poses a considerable challenge, given that not all data sources inherently include comprehensive material characterizations. Even when information is comprehensively presented in publications, accurately associating materials’ properties with their corresponding characterizations remains a complex task, often requiring extensive scanning of lengthy articles. This inherent need for cross- or co-referencing across various sections of content presents a significant hurdle in the transition from human-readable to machine-readable formats. To tackle this quandary, a recent development involves the introduction of a canonical ontology for materials synthesis. This ontology employs a controlled lexicon and establishes constrained relationships between concepts to address the challenge [78].

Anthropogenic Bias

Anthropogenic bias refers to the influence of human beings on nature, and how this influence can introduce bias into data, models, and systems. For example, if a dataset is collected by humans, it may contain biases that reflect the beliefs, values, and perspectives of the people who collected the data. Scientists tend to focus on systems that have the highest likelihood of success and often choose to present the most significant results to demonstrate their scientific points. This can result in an overrepresentation of certain domains and a lack of negative examples in published literature [79].

Realistically, only a small portion of the entire materials space should exhibit special functionality. Negative data, which is often not considered worthy of publication, can actually benefit ML models by enabling more trustworthy exploration of unknown domains [80]. Because it disregards the abundance of negative data, anthropogenically biased sampling fails to represent the actual data distribution. In a comparison of ML models trained on biased, human-selected reactions with models trained on unbiased, randomly generated reactions for synthesizing amine-templated metal oxides, addressing this bias significantly improved the ML models and expedited the discovery of new materials [79].

Avoid the Bias

Ensuring the integrity of model performance while mitigating data bias and anthropogenic bias necessitates complete transparency regarding the quantity and quality of data. Caution is needed, because assessing the quantity and quality of datasets can be complex and subjective, influenced by the choice of ML algorithms and intended applications [81]. Therefore, data quantity and quality should not be used as judgment criteria when reporting and evaluating ML research. A more crucial step is to disclose the data collection and pre-processing methodology and to promote open access to published data. In a recent study, Artrith et al. suggested a set of guidelines for reporting ML models [81]. These guidelines encompass detailing all data sources, documenting the data selection strategy, including access dates or version numbers, describing data cleaning procedures, and assessing the extent of data pre-processing. They proposed a comprehensive checklist for the reporting and assessment of ML models that aims to establish a high standard for data reporting protocols within the materials domain [81].
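As an illustration only, the reporting items listed above can be captured in a small machine-readable record. The sketch below is not the checklist of Artrith et al.; the class name DatasetReport, its field names, and the example values are hypothetical and were chosen to mirror the items named in the text.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DatasetReport:
    """Hypothetical record of the data-reporting items named in the text."""
    sources: list = field(default_factory=list)            # all data sources
    selection_strategy: str = ""                            # how data points were chosen
    version_or_access_date: str = ""                        # dataset version or access date
    cleaning_steps: list = field(default_factory=list)      # data cleaning procedures
    preprocessing_steps: list = field(default_factory=list) # extent of pre-processing

    def missing_items(self):
        """Return the names of reporting items that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Example values are invented for illustration.
report = DatasetReport(
    sources=["in-house cycling data (hypothetical)"],
    selection_strategy="all cells cycled at 25 C",
    version_or_access_date="v1.2",
    preprocessing_steps=["min-max scaling"],
)
unreported = report.missing_items()
```

Here unreported lists the entries that have not been filled in, which gives authors or reviewers a quick way to spot undocumented steps before a model is published.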

The materials research community still needs time to fully understand and transition to improved communication of materials synthesis, in order to expand the impact of the insights contained in each published synthesis method and contribute to a global body of unified knowledge on materials synthesis. This can be the ultimate approach to avoid data bias.

Interdisciplinary Nature

The interdisciplinary nature of using ML in LIB research is another challenge in developing the technology. Successful application of ML techniques in LIB research requires expertise from multiple fields, including materials science, electrochemistry, data science, and engineering. Effective collaboration among these disciplines can be challenging but is essential for holistic progress [82].

Future Trends

LIB research has made significant advancements in applying ML techniques, but it still faces important knowledge gaps that need to be addressed to ensure the long-term viability and development of AI-powered systems in LIB-related domains. The following section explores these future trends and gaps. Figure 5 shows the future trends in LIB research that are addressed in this research.

Fig. 5

Future trends of using ML in LIB research

ML Techniques for Small Datasets

In the realm of LIB research, certain domains yield substantial data volumes, whereas others may only possess small datasets. This discrepancy can arise from various factors, including the expense and time constraints associated with testing, the necessity for specialized equipment, or the extended duration required for specific experiments. For example, conducting tests on LIBs demands significant time and financial resources. This is because specialized equipment, such as multi-channel cyclers, potentiostats, and thermal chambers, is essential [83], and a standard battery degradation reliability test can span over six months of continuous cycling [84]. Consequently, datasets for these tests may be relatively limited in size. In these situations, ML techniques that are capable of handling small datasets become notably valuable. Approaches such as Transfer Learning, N-shot Learning, Imbalance Undersampling/Oversampling, Asymmetric Loss Function, and Ensemble Learning can be particularly advantageous. These approaches assist researchers in deriving meaningful insights and predictions from limited datasets. For example, Ma et al. used Transfer Learning to predict LIB health status with high accuracy [85], and Zhang et al. used N-shot learning to estimate SOH because LIB degradation datasets are small [86]. Although few-shot learning has been used for a range of applications in LIB research, it is most frequently used to predict the lifetime of LIBs [87, 88]. For example, Tang et al. used this technique to detect abnormal LIB lifetimes from a relatively small dataset of first-cycle aging data [89]. In the future, more research will likely use these ML techniques to make the most of small datasets.
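Of the approaches listed above, random oversampling is the simplest to sketch. The example below is a minimal illustration, not a method taken from the cited studies: the function random_oversample, the two features per cell, and the toy labels are all hypothetical. It duplicates randomly chosen minority-class samples until every class is equally frequent.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())          # size of the largest class
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):        # no-op for the majority class
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Hypothetical toy data: (capacity fade per cycle, resistance rise per cycle) per cell
cells = [(0.010, 0.020), (0.012, 0.018), (0.011, 0.021), (0.009, 0.020), (0.050, 0.090)]
status = ["normal", "normal", "normal", "normal", "abnormal"]

balanced_cells, balanced_status = random_oversample(cells, status)
class_counts = Counter(balanced_status)
```

In practice, oversampling should be applied to the training split only, so that duplicated samples do not leak into the validation or test data.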

ML Techniques for Big Datasets

Large fleets of EVs now rely on LIBs. Transmitting their daily operating data to the cloud can help improve LIB design, manufacturing, and use. A cloud-based architecture can periodically develop ML models, allowing for self-teaching and self-improvement by leveraging server farms [90]. In addition, advances in measurement techniques will enable more high-throughput experiments that generate big data (Fig. 2), and the ability to use high-throughput results to decide in real time what to synthesize and test next will further accelerate big data generation [91]. Research directions for fields in which big data is available will therefore become increasingly important [92]. These directions involve developing ML algorithms optimized for larger datasets through techniques such as ANNs. Three main techniques for tackling big data challenges are explained below.

Deep Learning Architectures

Deep learning is a subset of ML that uses artificial neural networks with multiple hidden layers to analyze complex data and features [93]. Researchers have used deep learning techniques to improve the accuracy and efficiency of LIB research [94]. For instance, a deep learning-based segmentation approach has been used to achieve reliable segmentations of volumetric images of LIB electrodes [95]. The application of deep learning architectures can enhance the accuracy, efficiency, and understanding of complex LIB systems, thereby driving advancements in the technology. However, it may raise the computational complexity challenges discussed earlier in section 3.4. Table 3 presents examples of using deep learning in LIB research, including the type of model used, the aim, and the output. As the table suggests, deep learning is used for a range of purposes, from estimation of SOC to integration with physics-based models.

Table 3 Examples of using deep learning and reinforcement learning in LIB research

Reinforcement Learning for Optimization

Reinforcement learning (RL) is a type of ML that trains a learning algorithm through a feedback system of rewards. RL presents opportunities for optimization in LIB systems [101]. Among the available ML models, RL is highlighted here because researchers have used it for a range of applications in LIB research and believe that it can improve the efficiency and quality of that research. For example, Mishra et al. used RL to optimize the performance of LIBs, an approach that improved the accuracy and efficiency of ML models for LIB research [103]. RL has also been applied to tasks such as battery management [102], optimal resource allocation [101], and control of LIB systems [104]. Table 3 shows that RL is used in a range of applications and that researchers have combined first principles models with ML models, such as entropy-based RL, which is discussed in section 4.6.
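To make the feedback loop concrete, the following is a minimal sketch of tabular Q-learning on a toy charging problem. It is not drawn from the cited studies: the environment (five discretized state-of-charge levels, a slow and a fast charging action, and a stress penalty that grows with state of charge) and all reward values are assumptions made for illustration.

```python
import random

N_LEVELS = 5            # discretized state-of-charge levels 0..4; level 4 = fully charged
SLOW, FAST = 0, 1


def step(soc, action):
    """Hypothetical toy charger: returns (next_soc, reward)."""
    if action == SLOW:
        return min(soc + 1, N_LEVELS - 1), -2.0        # slow charging costs more time per level
    return min(soc + 2, N_LEVELS - 1), -(1.0 + soc)    # fast charging: stress penalty grows with SOC


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_LEVELS)]
    for _ in range(episodes):
        soc = rng.randrange(N_LEVELS - 1)              # random non-terminal start level
        while soc != N_LEVELS - 1:
            if rng.random() < epsilon:
                action = rng.choice((SLOW, FAST))      # explore
            else:
                action = max((SLOW, FAST), key=lambda a: q[soc][a])  # exploit
            nxt, reward = step(soc, action)
            target = reward + gamma * max(q[nxt])      # terminal row stays at zero
            q[soc][action] += alpha * (target - q[soc][action])
            soc = nxt
    return q


q_table = train()
policy = ["slow" if row[SLOW] >= row[FAST] else "fast" for row in q_table[:-1]]
```

Under these assumed rewards, the learned policy charges fast at low state of charge and switches to slow charging at the highest non-terminal level, where the assumed stress penalty outweighs the time saved.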

Active Learning

Active learning is another ML approach with high potential for future use. It is a type of ML that actively selects valuable data points to construct a high-performance classifier while keeping the size of the training dataset to a minimum [105]. Through the strategic selection of informative data points for labeling, active learning algorithms can substantially decrease the amount of labeled data needed for model training. This is particularly valuable in LIB research, where data collection and labeling can be time-consuming and expensive [106]. By leveraging active learning, the discovery of new materials, battery performance optimization, and experimental design can be accelerated. In addition, by actively querying the samples that have the highest potential to enhance the model's performance, active learning allows researchers to concentrate their efforts on the most informative data points, resulting in faster and more accurate predictions [105].

Researchers have used active learning to screen new functional materials for lithium solid-state electrolytes [15, 99], which led to improved accuracy and efficiency of ML models in LIB research [15].
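The query strategy described above can be sketched with pool-based uncertainty sampling. The example below is a deliberately simple illustration under stated assumptions: a single hypothetical material descriptor in the range 0 to 1, a threshold classifier, and an oracle function that stands in for an expensive experiment. None of these names or values come from the cited studies.

```python
def oracle(x, true_threshold=0.62):
    """Stand-in for an expensive experiment; the threshold is a hypothetical ground truth."""
    return 1 if x > true_threshold else 0


def boundary(labeled):
    """Decision boundary: midpoint between the largest negative and the smallest positive."""
    lo = max(x for x, y in labeled if y == 0)
    hi = min(x for x, y in labeled if y == 1)
    return (lo + hi) / 2


def active_learn(pool, n_queries):
    """Repeatedly label the pool point closest to the current decision boundary."""
    unlabeled = sorted(pool)
    # Seed with the two extremes of the pool (assumed to fall in different classes).
    labeled = [(x, oracle(x)) for x in (unlabeled.pop(0), unlabeled.pop(-1))]
    for _ in range(n_queries):
        b = boundary(labeled)
        x = min(unlabeled, key=lambda u: abs(u - b))   # most uncertain candidate
        unlabeled.remove(x)
        labeled.append((x, oracle(x)))
    return boundary(labeled), len(labeled)


pool = [i / 200 for i in range(201)]                   # 201 candidate descriptor values
estimate, n_labels = active_learn(pool, n_queries=8)
```

With only ten labeled points out of 201 candidates, the estimated boundary lies within 0.01 of the assumed true threshold, which illustrates how targeted queries reduce labeling effort compared with labeling the whole pool.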

Addressing Lack of Knowledge of Micro-Behavior and Micro-Mechanics

Another future trend in the field is addressing the lack of knowledge of micro-behavior and micro-mechanics. Data-driven methods can be constructed without considering the underlying mechanisms of a system. However, batteries are complex systems with non-linear interactions between multiple factors [107], and some electrochemical processes within batteries are not fully understood [108]. Investigating these internal mechanisms can improve the ability to extract meaningful information from numerous interacting features and increase the effectiveness of testing efforts [109]. Future research is expected to use ML techniques to investigate the micro-behavior and micro-mechanics of LIBs and their interrelationships.

Self-Improving Models or Algorithms in Continuous Evolution

A further future trend of using ML in LIB research is self-improving models. Due to the nonlinear interactions among multiple factors, individually altering each parameter may not provide a comprehensive understanding [108]. On one hand, incorporating domain knowledge and corresponding testing technologies is necessary to ensure effective modeling efforts. On the other hand, when battery types or operating conditions change, models specifically designed for particular settings require recalibration or reconstruction. Furthermore, as tasks evolve, model re-training becomes essential. Advanced algorithms now pave the way for self-improving models [110]. The concept of meta learning, inspired by human learning processes, leverages prior knowledge to facilitate the learning of new tasks and is often referred to as learning to learn. Researchers have started using meta-models in LIB research [111]; however, more research is expected in the future.

Incorporating First Principles Models with ML

The next trend will be incorporating first principles models with ML. This approach can present significant opportunities for advancing LIB research. By integrating the fundamental principles and equations that govern LIB behavior into ML frameworks, the accuracy, interpretability, and generalization of the models can be enhanced. Physics-based models provide a solid foundation for understanding the underlying mechanisms and interactions within the LIB system [10]. ML algorithms can then be utilized to capture complex non-linear relationships and learn from data to improve predictions and optimize battery performance [112]. This hybrid approach combines the strengths of both first principles models and ML models, making it possible to overcome challenges such as limited data availability and the black-box nature of pure ML models, which were discussed in detail in chapter 3.

Hybrid Models for Improved Accuracy and Interpretability

Hybrid models that integrate physics-based models with ML models present opportunities for improved accuracy and interpretability in LIB research. These models blend domain knowledge with data-driven approaches to perform physics-informed learning of LIB behavior. For example, researchers have proposed hybrid models that combine a single particle model with thermal dynamics and a feedforward ANN to achieve high-precision modeling of LIBs [113]. These hybrid models can provide considerable voltage predictive accuracy over a broad range of C-rates and can account for the state-of-health to make predictions throughout a battery's cycle life [113]. Furthermore, by incorporating feature engineering techniques and domain knowledge, hybrid models can capture relevant physical and chemical properties of lithium systems, leading to more accurate predictions and actionable results [113].
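The idea of correcting a physics-based prediction with a data-driven term can be sketched as follows. This is a deliberately simplified stand-in, not the single particle model with a feedforward ANN described in [113]: the physics part is a basic ohmic voltage model, the learned part is a least-squares line fitted to the residual, and the open-circuit-voltage curve, resistance values, and synthetic measurements are all assumptions.

```python
def ocv(soc):
    """Hypothetical linear open-circuit-voltage curve in volts."""
    return 3.0 + 1.2 * soc


def physics_voltage(soc, current, r0=0.03):
    """Simplified physics-based model: OCV minus ohmic drop with a nominal resistance."""
    return ocv(soc) - current * r0


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Synthetic "measurements" from a cell whose true resistance (0.05 ohm, assumed)
# differs from the nominal value used by the physics model.
data = [(s / 10, cur, ocv(s / 10) - cur * 0.05)
        for s in range(1, 10) for cur in (0.5, 1.0, 2.0, 4.0)]

# Data-driven part: learn the residual of the physics model as a function of current.
residuals = [v - physics_voltage(s, i) for s, i, v in data]
slope, intercept = fit_line([i for _, i, _ in data], residuals)


def hybrid_voltage(soc, current):
    """Physics prediction plus the learned residual correction."""
    return physics_voltage(soc, current) + slope * current + intercept


def rmse(model):
    """Root-mean-square voltage error of a model on the synthetic data."""
    return (sum((model(s, i) - v) ** 2 for s, i, v in data) / len(data)) ** 0.5


physics_error = rmse(physics_voltage)
hybrid_error = rmse(hybrid_voltage)
```

In this toy setting the learned correction recovers the resistance mismatch, so the hybrid error is far below the physics-only error. The errors are computed on the same synthetic data used for fitting, so the numbers only illustrate the mechanism and say nothing about generalization.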

Transfer Learning and Knowledge Transfer

Finally, transfer learning, an ML approach that applies knowledge learned from a source domain to a new target domain, has high potential to enhance LIB research in the future [114]. In LIB research, transfer learning can be used to reduce the data requirement of model training for new batteries by leveraging knowledge learned from a source battery with a large amount of data [114]. This approach can improve the accuracy and efficiency of ML models for LIB health management [85]. Ma et al. developed a transfer learning framework to realize real-time personalized health status prediction for unseen battery discharge protocols at any charge-discharge cycle [85]. In a separate study, Zhou et al. used transfer learning to estimate SOH [115]. Using common feature canonical variates, they employed transfer learning as a bridge to transfer the knowledge obtained by the source SOH estimation model, which was trained on data from the complete degradation process [115]. However, another study that used Matrix Profile Empowered Online Knee Onset Identification reported results that outperformed the transfer learning approach [116].

Knowledge transfer techniques can transfer knowledge from one lithium system to another, thereby enhancing the predictive capabilities and efficiency of the models [117]. This approach can accelerate the development of accurate and robust models for LIB research [118]. Leveraging knowledge from related domains or materials provides valuable opportunities for enhancing the performance of ML models in LIB research.
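A minimal form of this idea is parameter transfer: fit a model on a data-rich source battery, freeze part of it, and re-estimate the remainder from a few target measurements. The sketch below assumes a linear capacity-fade model and invented numbers; it does not reproduce the frameworks of the cited studies, and it assumes that the source and target cells share the same fade rate. If that assumption fails, the transferred slope can hurt rather than help.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict(slope, intercept, cycle):
    """Linear capacity model evaluated at a given cycle number."""
    return slope * cycle + intercept


# Source cell: plentiful data (hypothetical linear fade from a normalized capacity of 1.0).
source_cycles = list(range(0, 300, 5))
source_capacity = [1.0 - 0.002 * c for c in source_cycles]
src_slope, src_intercept = fit_line(source_cycles, source_capacity)

# Target cell: only three noisy early-life measurements (invented values).
target_cycles = [0, 5, 10]
target_capacity = [0.904, 0.890, 0.876]

# Transfer: keep the source slope frozen and re-estimate only the intercept.
tl_intercept = sum(y - src_slope * x
                   for x, y in zip(target_cycles, target_capacity)) / len(target_cycles)

# Baseline: fit both parameters from the three target points alone.
scratch_slope, scratch_intercept = fit_line(target_cycles, target_capacity)

true_capacity_at_200 = 0.9 - 0.002 * 200     # assumed ground truth for the target cell
transfer_error = abs(predict(src_slope, tl_intercept, 200) - true_capacity_at_200)
scratch_error = abs(predict(scratch_slope, scratch_intercept, 200) - true_capacity_at_200)
```

With these assumed values, the transferred model extrapolates to cycle 200 with a much smaller error than the model fitted from scratch, because the three noisy target points alone do not constrain the fade rate well.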

Conclusion

In conclusion, this research underscores the transformative potential of machine learning (ML) in addressing the fundamental challenges of lithium-ion battery (LIB) optimization. By leveraging ML techniques, we can streamline the exploration of chemical, formulation, and operational condition spaces, significantly reducing the need for extensive experiments and computations. This not only accelerates development cycles but also aids in identifying critical variables that impact battery behavior.

The outcome of this research can be used by researchers who are interested in leveraging ML techniques to explore LIBs. As discussed earlier in this paper, although obstacles persist, extensive efforts have been invested in every aspect of the LIB life cycle, from micro-mechanisms to macro-operations, where ML algorithms play a vital role in explaining features, uncovering behaviors, optimizing parameters, determining operational status, and predicting cycle life. With initial steps taken and substantial progress achieved, we are optimistic about the prospects of a data-based LIB exploration that is healthy, safe, cost-effective, and environmentally friendly.

The notable contribution of this research is the exploration of the opportunities and challenges of using ML in LIB research. In addition, future trends of using ML in the field are presented to help researchers accelerate their work by overcoming the challenges. A road map is presented that covers advanced ML models, addressing the lack of knowledge of micro-behavior and micro-mechanics, self-improving models, models for big data, models for small datasets, incorporating first principles models with ML, hybrid models, and transfer learning and knowledge transfer, with the aim of overcoming current challenges and driving innovation.

This research not only contributes to the current state of LIB research but also influences its trajectory. It offers valuable insights and practical guidance for researchers and practitioners in the field, paving the way for a future where ML-driven approaches redefine how we approach LIB optimization and research. As we continue to advance our understanding and application of ML in LIBs, we are confident that our work will inspire further developments in theory, practice, and research, propelling the field toward sustainable and efficient energy solutions.