Abstract
Perovskite solar cells (PSC) are formed by layers of thin films of various materials, and the properties of every layer affect the performance of the cell. Identifying the most relevant properties (or descriptors) has a significant impact on the optimization and cost reduction of Perovskite solar cells. This relevance is typically evaluated by fitting a model to subsets of features; in the present work, we instead propose to use the mutual information measure to quantify the statistical association between input descriptors and Perovskite solar cell performance parameters (Voc, Jsc, FF, PCE). We find that the X ion is the factor that most impacts the performance of the solar cell. Variables such as band gap, Perovskite layer thickness, and the A and B ions are also important. In this work, we identify some of the most important factors affecting the performance of Perovskite solar cells, which could help to improve their efficiency. In addition, the proposed method could also be applied to other types of functional coatings, thin films, and surfaces.
1 Introduction
Perovskite solar cells are currently one of the best alternatives to replace silicon solar cells due to their high absorption coefficient, low cost, and ease of manufacture [1, 2]. As a result, there are numerous publications attempting to cover all the different issues associated with these cells.
However, the performance of Perovskite solar cells (PSC) is highly sensitive to the physicochemical properties of the Perovskite material. These properties include the crystal structure, composition, and morphology of the Perovskite film. Information about these properties (under real rather than simulated conditions) can be extracted from scientific articles and stored in data sets, as already done in [3,4,5,6]. This information is then codified into variables called descriptors or characteristics. It is of considerable importance for answering questions of interest in Perovskite materials science, e.g., determining the most relevant descriptors and, therefore, the synthesis conditions that most influence solar cell performance.
The traditional way to develop new materials is usually based on trial and error, which is time-consuming and expensive [7]; thus, knowing which variables are the most relevant could have a significant impact. In fact, researchers can develop more efficient solar cells by understanding how different processing techniques, parameters, and elements affect their performance [8,9,10], with consequent savings in time and resources during the perovskite cell synthesis process.
On the other hand, interest in machine learning has recently increased among materials science researchers [11,12,13,14]. From a machine learning perspective, knowing the most relevant descriptors (those most statistically related to performance) makes it possible to build more precise and accurate models [14, 15] by reducing complexity and avoiding overfitting [16]. In addition, before model construction, it is important to identify the key features closely related to the target properties in order to obtain simpler and more explainable models.
There are several methods for relevant feature selection. A widely used set of methods in machine learning, called wrapper methods, measures the relevance of descriptors by using the predictive power of a model fitted to the available data [17]. In addition, they can find relevant subsets of features by estimating the predictive power of each candidate subset. These methods are usually computationally expensive, and the returned result is only valid from the point of view of the particular model used [18]. On the other hand, filter methods tend to be faster and more computationally efficient; their results are easier to interpret; and the results are unique in the sense that they are independent of any model, because no model is fitted. These methods are typically based on statistical association measures between random variables.
Among those statistical association measures, Pearson’s correlation coefficient offers simplicity and ease of interpretation, but it assumes that the relationship between the involved random variables is linear, which may not hold for relationships as complex as those found in materials science. Even so, the Pearson coefficient was used in [19] as a filter method to select input descriptors for machine learning algorithms, in [20, 21] to quantify the linear correlation between synthesis descriptors of Perovskite solar cells, and in [9, 22] to determine the quality of the predictions of five machine learning algorithms. However, this metric only reveals variables that are linearly correlated and ignores the others. On the other hand, rank correlation measures work for non-linear phenomena, but only if the underlying relationship is monotonic [23]. In contrast, mutual information (MI) measures are able to capture general non-linear dependencies between solar cell descriptors, which is more appropriate. Furthermore, mutual information is relatively resistant to noise and outliers, and no assumptions are required about the probability distributions of the involved variables. These properties make MI an adequate measure of statistical association and, therefore, of the relevance of the descriptors with respect to important variables associated with the performance of Perovskite solar cells.
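The contrast between the three families of measures can be illustrated with a minimal sketch on synthetic data (not the Perovskite dataset). The quadratic relationship, sample size, and noise level below are assumptions chosen only to produce a non-monotonic dependence; the MI estimate uses scikit-learn's k-nearest-neighbour estimator with its default of three neighbours.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr
from sklearn.feature_selection import mutual_info_regression

# Synthetic, non-monotonic dependence: y is a noisy function of x**2.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=2000)
y = x**2 + rng.normal(0.0, 0.05, size=x.size)

pearson_r = pearsonr(x, y)[0]     # linear association only
spearman_r = spearmanr(x, y)[0]   # monotonic association only
mi = mutual_info_regression(      # general dependence, in nats
    x.reshape(-1, 1), y, n_neighbors=3, random_state=0
)[0]

print(f"Pearson  r = {pearson_r:+.3f}")
print(f"Spearman r = {spearman_r:+.3f}")
print(f"MI (nats)  = {mi:.3f}")
```

On data of this kind both correlation coefficients stay close to zero while the MI estimate is clearly positive, which is the behaviour the filter method relies on.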
Identifying relevant descriptors for Perovskite solar cell synthesis is crucial for advancing the field. Identifying the key factors that influence performance can streamline the development process, leading to more efficient and cost-effective solar cell production [24]. This knowledge could reduce the need for time-consuming and resource-intensive trial-and-error methods and allow for targeted experimentation and optimization. Scientists can accelerate progress towards high performance and stable Perovskite solar cells by understanding the meaning of these descriptors.
In the present work, it is proposed to use the measure of mutual information in order to quantify the amount of information contained in the descriptive variables of the synthesis process with respect to physicochemical properties of Perovskite solar cells. These properties measure the Perovskite solar cell performance and correspond to Open Circuit Voltage (Voc), Short Circuit Current Density (Jsc), Fill Factor (FF), and Power Conversion Efficiency (PCE).
2 Method
2.1 Data
The data used in the present study were taken from the dataset published in [25], which consists of 43,239 records with 411 descriptors: 262 inputs (characteristics) that describe the synthesis process of the Perovskite solar cell and 149 outputs, including the performance values \(V_{oc}, \,\, J_{sc},\,\, FF,\,\, PCE \). These data were extracted by manual review from 16,000 scientific articles published from 2008 (the first studies) to 2020. The descriptors can be Boolean, categorical, or numeric (integer or float).
We opted for analyzing the performance parameters \(V_{OC},\) \(J_{SC}, FF\), and PCE. During the pre-processing stage, those input variables with zero variance were discarded because they do not provide any information. Additionally, those categorical variables that presented more than 100 categories were discarded due to the possibility of increased uncertainty in the mutual information estimation. Finally, those observations related to Perovskites of more than one layer were removed in order to reduce the complexity of the analysis.
Out of a total of 49 variables resulting from the pre-processing, 9 are categorical, 23 are numeric, and 17 are Boolean. Categorical and Boolean variables were converted to numeric variables using the LabelEncoder tool from the sklearn Python library. In addition, the data were encoded to represent the Perovskite material in terms of the proportion of elements in the Perovskite layer: MA, methylammonium; FA, formamidinium; Cs, cesium; Pb, lead; Sn, tin; I, iodine; Br, bromine; Cl, chlorine. This representation is easier to interpret. Finally, three variables were created to represent the A, B, and X ions of the Perovskite structure, where \(A = MA - FA -Cs\), \(B = Pb - Sn\), and \(X = I - Br - Cl\).
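The pre-processing steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the function name `preprocess`, the toy column names, and the choice to leave missing entries as NaN are hypothetical, and a column is treated as zero-variance when it has at most one distinct non-missing value.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

def preprocess(df: pd.DataFrame, max_categories: int = 100) -> pd.DataFrame:
    """Drop uninformative columns and integer-encode non-numeric ones."""
    out = df.copy()
    # 1) Zero variance: at most one distinct non-missing value.
    constant = [c for c in out.columns if out[c].nunique(dropna=True) <= 1]
    out = out.drop(columns=constant)
    # 2) Categorical columns with more than `max_categories` levels.
    is_cat = {c: not pd.api.types.is_numeric_dtype(out[c]) for c in out.columns}
    too_many = [c for c in out.columns
                if is_cat[c] and out[c].nunique(dropna=True) > max_categories]
    out = out.drop(columns=too_many)
    # 3) Encode categorical and Boolean columns, keeping missing values as NaN.
    for c in list(out.columns):
        if is_cat[c] or pd.api.types.is_bool_dtype(out[c]):
            mask = out[c].notna()
            encoded = pd.Series(np.nan, index=out.index, dtype=float)
            encoded[mask] = LabelEncoder().fit_transform(out.loc[mask, c].astype(str))
            out[c] = encoded
    return out

# Hypothetical toy records, only to show the behaviour.
toy = pd.DataFrame({
    "deposition": ["spin", "spin", "blade", None],
    "annealed": [True, False, True, True],
    "thickness_nm": [400.0, 350.0, None, 500.0],
    "constant_flag": [1, 1, 1, 1],
})
clean = preprocess(toy)
print(clean)
```

Note that LabelEncoder assigns integer codes in sorted label order, so the codes carry no physical meaning; they should be flagged as discrete when passed to an MI estimator.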
2.2 Mutual information
The mutual information I(X, Y) is a measure of statistical dependence between two random variables X and Y. It is symmetric in X and Y, that is, \(I(X, Y) = I(Y, X)\); it is non-negative, \(I(X, Y) \ge 0\); and it is equal to zero if and only if X and Y are independent random variables. The MI between two random variables X and Y, with joint density \(f_{X, Y}(x, y)\) and marginal densities \(f_X(x)\) and \(f_Y(y)\), is defined as [26],
\[ I(X, Y) = \int \int f_{X, Y}(x, y) \log \frac{f_{X, Y}(x, y)}{f_X(x)\, f_Y(y)} \, dx \, dy \qquad (1) \]
Mutual information (MI) can also be expressed in terms of entropy \(H(\cdot )\), which is a measure of the uncertainty of a random variable. I(X, Y) is the reduction in the uncertainty of one random variable due to knowledge of another. In particular, \(I(X, Y) = H(X) - H(X \mid Y)\) is the reduction in the uncertainty of X due to the knowledge of Y, and \(I(X, Y) = H(Y) - H(Y \mid X)\) is the reduction in the uncertainty of Y due to the knowledge of X. In addition, it can be written as \(I(X, Y) = H(X) + H(Y) - H(X, Y)\).
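These identities can be checked numerically on a small discrete example. The joint probability table below is an arbitrary assumption chosen for illustration; entropies are in nats.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (nats) of a probability table; zero cells are skipped."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# Arbitrary joint distribution of two binary variables, indexed [x, y].
joint = np.array([[0.30, 0.10],
                  [0.05, 0.55]])
p_x = joint.sum(axis=1)
p_y = joint.sum(axis=0)

# I(X, Y) = H(X) + H(Y) - H(X, Y)
mi_from_entropies = entropy(p_x) + entropy(p_y) - entropy(joint)
# I(X, Y) = H(X) - H(X | Y), with H(X | Y) = H(X, Y) - H(Y)
mi_from_conditional = entropy(p_x) - (entropy(joint) - entropy(p_y))
# Direct definition: sum of p(x, y) * log(p(x, y) / (p(x) p(y)))
mi_from_definition = float(np.sum(joint * np.log(joint / np.outer(p_x, p_y))))

# An independent pair (joint = product of marginals) has zero MI.
mi_independent = entropy(p_x) + entropy(p_y) - entropy(np.outer(p_x, p_y))

print(mi_from_entropies, mi_from_conditional, mi_from_definition, mi_independent)
```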
\(H(\cdot )\) can also be viewed as the amount of information required, on average, to describe the random variable. For a continuous random variable, the term differential entropy is typically used instead of entropy, because not all the properties of discrete entropy carry over to the continuous case. The differential entropy H(X) of a continuous random variable X with density \(f_X(x)\) is defined as [26],
\[ H(X) = -\int f_X(x) \log f_X(x) \, dx \]
Although mutual information is able to detect and quantify non-linear relationships between random variables, the interpretation of the quantified value, unlike the Pearson correlation \(\rho \), is less intuitive. Pearson’s correlation provides standardized values between -1 and 1 that indicate the strength and direction of the relationship. In contrast, MI gives only non-negative values, and they are not standardized.
A transformation of the MI value is proposed in [27], called the informational correlation coefficient (ICC), which provides a standardized version (zero as the minimum value and one as the maximum value) that allows comparisons with Pearson’s correlation \(\rho \). For a bivariate normal distribution, ICC is equal to \(|\rho |\). Recently, a modified version of \(\rho _{ICC}\), denoted \(\rho _{MICC}\) (modified informational correlation coefficient, MICC), was proposed in [28] in order to improve the performance of ICC and to reduce its bias. This transformation, given in [28], is expressed in terms of the Lambert function \(\mathcal {W}(\cdot )\) and the estimated mutual information value I(x, y). As a consequence, an MI value of 0.2 would be comparable to a Pearson’s correlation value of 0.34.
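For orientation, the unmodified ICC of [27] can be written out explicitly. The sketch below assumes MI is measured in nats (natural logarithm); it does not reproduce the bias-corrected MICC of [28], whose Lambert-function form should be taken from that reference.

\[ \rho _{ICC} = \sqrt{1 - e^{-2 I(X, Y)}} \]

For a bivariate normal pair with Pearson correlation \(\rho \), the mutual information has a closed form, and substituting it recovers the magnitude of \(\rho \):

\[ I(X, Y) = -\tfrac{1}{2} \ln \left( 1 - \rho ^{2} \right) \quad \Rightarrow \quad \rho _{ICC} = \sqrt{1 - \left( 1 - \rho ^{2} \right) } = |\rho | \]

Under this unmodified formula, \(I = 0.2\) nats maps to \(\rho _{ICC} \approx 0.57\); the MICC applies a correction and therefore yields a different value for the same MI.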
MI is useful for determining the descriptors with the highest statistical association with the performance parameters. However, due to the interaction between input descriptors, MI by itself does not answer the question of which set of descriptors, taken as a whole, provides the most information [29]. Conditional mutual information can be used to address that question. It is defined as the reduction in the uncertainty of X due to knowledge of Y when Z is given [30]:
\[ I(X, Y \mid Z) = H(X \mid Z) - H(X \mid Y, Z) \]
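A small discrete example shows why conditional mutual information matters when descriptors interact. In the sketch below, which uses a constructed toy distribution rather than Perovskite data, a binary target T is the exclusive OR of two independent fair bits A and B: A alone carries no information about T, but it is fully informative once B is known. The conditional MI is computed from the equivalent entropy form \(I(X, Y \mid Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)\).

```python
import numpy as np

def entropy(p):
    """Shannon entropy (nats) of a probability table; zero cells are skipped."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def mutual_information(p_xy):
    """I(X, Y) from a joint table indexed [x, y]."""
    return entropy(p_xy.sum(axis=1)) + entropy(p_xy.sum(axis=0)) - entropy(p_xy)

def conditional_mi(p_xyz):
    """I(X, Y | Z) from a joint table indexed [x, y, z]."""
    p_xz = p_xyz.sum(axis=1)
    p_yz = p_xyz.sum(axis=0)
    p_z = p_xyz.sum(axis=(0, 1))
    return entropy(p_xz) + entropy(p_yz) - entropy(p_xyz) - entropy(p_z)

# Toy joint distribution indexed [t, a, b] with t = a XOR b.
p = np.zeros((2, 2, 2))
for a in (0, 1):
    for b in (0, 1):
        p[a ^ b, a, b] = 0.25

mi_t_a = mutual_information(p.sum(axis=2))  # I(T, A): A on its own
cmi_t_a_given_b = conditional_mi(p)         # I(T, A | B): A once B is known

print(f"I(T, A)     = {mi_t_a:.4f} nats")
print(f"I(T, A | B) = {cmi_t_a_given_b:.4f} nats")
```

Here the marginal MI is zero while the conditional MI equals ln 2, so ranking descriptors by marginal MI alone would miss A entirely.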
2.3 Mutual information estimation
As observed in Eq. (1), the MI calculation is straightforward if the underlying joint probability distribution \(f_{X, Y}(x, y)\) is known; however, it is typically unknown, and our knowledge of the distribution comes from the data itself. For discrete variables, estimating the joint probability mass function from the data is a straightforward task; however, this is not the case for continuous variables. In these cases, non-parametric methods are required. They make use of the geometry of the underlying sample to estimate the local probability density function \(f(x_i, y_i)\) from the data \((x_i, y_i)\) [31]. The most popular method for estimating MI is the non-parametric estimator introduced in [32], which estimates MI from k-nearest neighbour statistics. The k-nearest neighbour estimator estimates the density of data points in the feature space and uses this information to compute mutual information. For the case in which there are both discrete and continuous variables, improved versions have been proposed, such as the one in [33].
We used the scikit-learn Python library to estimate MI, which is based on the k-nearest neighbour methods described in [32] and [33]. Although this tool can process all the input variables at the same time, we carried out the estimation variable by variable. The dataset reported in [4] has a high number of missing values in the input variables, so requiring complete records across all variables would greatly reduce the amount of available data and aggravate the curse of dimensionality.
We performed cross-validation and bootstrapping procedures to obtain an MI estimate with less uncertainty, together with its associated standard error. In particular, 10-fold cross-validation is performed, yielding a \(10\times 1\) vector of MI estimates. The average of this vector constitutes one instance of a 10-times bootstrapping process; the average of these ten values is reported as the final MI estimate, and their standard deviation as the standard error of the MI estimate.
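A variable-by-variable estimation of this kind might look like the sketch below. It is an assumption-laden illustration, not the authors' code: the helper `mi_per_descriptor`, the column names, and the synthetic data are hypothetical, and `n_neighbors=3` is simply scikit-learn's default, since the value of k is not stated in the text. For each descriptor, only the rows where both the descriptor and the target are present are used.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_regression

def mi_per_descriptor(df, target, discrete_cols=(), n_neighbors=3, seed=0):
    """Estimate MI (nats) between each descriptor and the target, one pair at a time."""
    scores = {}
    for col in df.columns:
        if col == target:
            continue
        pair = df[[col, target]].dropna()  # pairwise deletion of missing rows
        if len(pair) <= n_neighbors:       # too few points for a kNN estimate
            continue
        mi = mutual_info_regression(
            pair[[col]].to_numpy(dtype=float),
            pair[target].to_numpy(dtype=float),
            discrete_features=(col in discrete_cols),
            n_neighbors=n_neighbors,
            random_state=seed,
        )[0]
        scores[col] = (mi, len(pair))
    table = pd.DataFrame.from_dict(scores, orient="index", columns=["mi_nats", "n_used"])
    return table.sort_values("mi_nats", ascending=False)

# Synthetic stand-in data with values missing at random (hypothetical names).
rng = np.random.default_rng(0)
n = 600
d_nonlinear = rng.uniform(-1.0, 1.0, size=n)
df = pd.DataFrame({
    "d_nonlinear": d_nonlinear,
    "d_noise": rng.normal(size=n),
    "PCE": d_nonlinear**2 + rng.normal(0.0, 0.05, size=n),
})
df.loc[rng.random(n) < 0.3, "d_nonlinear"] = np.nan
df.loc[rng.random(n) < 0.3, "d_noise"] = np.nan

table = mi_per_descriptor(df, target="PCE")
print(table)
```

Reporting `n_used` next to each estimate makes it visible when a score rests on very few observations.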
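One possible reading of this procedure is sketched below. Several details are assumptions, since the text does not fix them: each bootstrap instance resamples the data with replacement, MI is estimated on the training portion of each of the ten folds, and the function name `mi_cv_bootstrap` and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression
from sklearn.model_selection import KFold
from sklearn.utils import resample

def mi_cv_bootstrap(x, y, n_folds=10, n_boot=10, n_neighbors=3, seed=0):
    """Return (MI estimate, standard error) from bootstrapped k-fold averages."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float)
    boot_means = []
    for b in range(n_boot):
        # One bootstrap instance: resample rows with replacement.
        xb, yb = resample(x, y, replace=True, random_state=seed + b)
        folds = KFold(n_splits=n_folds, shuffle=True, random_state=seed + b)
        # Vector of n_folds MI estimates, one per training split (assumption).
        fold_mi = [
            mutual_info_regression(
                xb[idx], yb[idx], n_neighbors=n_neighbors, random_state=seed
            )[0]
            for idx, _ in folds.split(xb)
        ]
        boot_means.append(np.mean(fold_mi))
    boot_means = np.asarray(boot_means)
    return float(boot_means.mean()), float(boot_means.std(ddof=1))

# Synthetic example with a non-linear dependence.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=500)
y = x**2 + rng.normal(0.0, 0.05, size=x.size)
mi_mean, mi_se = mi_cv_bootstrap(x, y)
print(f"MI = {mi_mean:.3f} +/- {mi_se:.3f} nats")
```

Resampling with replacement duplicates some points, which may affect nearest-neighbour distances; drawing subsamples without replacement is an alternative worth comparing if the estimates look inflated.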
3 Results and discussion
MI estimates with respect to each of the performance variables are shown in Fig. 1, which displays the 20 most relevant variables among the 49 included in this study. In general, it is observed that the X ion is the factor that most impacts the performance of the solar cell. Variables such as band gap, Perovskite layer thickness, and the A and B ions are also important.
Iodine concentration consistently appears as the most relevant descriptor from the perspective of PCE, \(V_{oc}\), \(J_{sc}\), and FF, as observed in Fig. 1. The presence of iodine in the absorber layer intervenes in the bandgap adjustment, thus improving the Voc [34]. Moreover, it helps to obtain films with larger grain sizes and fewer defects [35, 36]. Bromine concentration is another feature with one of the highest MI values. This shows the importance of the X-site anion in the perovskite structure; bromine is used to improve solar cell performance and to reduce the effects of iodine in the cells.
MA, FA, and Cs concentrations are also variables with remarkable relevance. This result seems to be in agreement with [20], which concludes that \(A-\)site cations have the most significant influence on PCE. In that same work, regression techniques such as XGBoost were used to determine the most relevant descriptors. In the present work, although MA is not the descriptor that contributes the most in terms of information, it appears among the most important. In [20], a Pearson correlation matrix shows considerable correlation between PCE and A-site cations. On the other hand, Eg is relevant with respect to \(PCE, V_{oc}\), and \(J_{sc}\). It is important to take into account the intrinsic relationship between Eg and \(V_{oc}\).
A matrix of Pearson correlation values was obtained in [20] in order to detect interactions between descriptors. Similarly, but for the case of non-linear relationships, in the present work we obtained a matrix of statistical associations between descriptors (see Figure S1 in the supporting information). In [20], as well as in the present work, considerable correlations between PCE and \(A-\)site cations are observed.
Results about correlations with Tperovskite are not reported in [20]. In particular, when we estimate the Pearson correlation coefficient between Tperovskite and PCE, we obtain a value of \(-0.0005\), indicating that there is no linear relationship between the two variables; using mutual information, however, a value of 0.07 is obtained, indicating that a relationship does exist. A scatter plot of Tperovskite versus PCE is shown in Fig. 2.
Figure S1 (included in the supporting information) shows that there are interactions between descriptors. For example, there is a strong relationship between FA (formamidinium ratio) and MA (methylammonium ratio), suggesting that only one of the two should be included in a feature set. The same is true for iodine content I and bromine content Br. Eg and Tperovskite also show a considerable relationship. In this scenario, it is appropriate to apply feature selection methods based on partial information measures, in particular, conditional mutual information.
Regarding CMI (conditional mutual information), the results for the first-order case are shown in Table 1. According to these results, if we take Tperovskite as the best variable (as shown in Fig. 1), then the variable that best complements it, that is, the one that provides the most additional information beyond that already provided by Tperovskite, is I.
It is important to clarify that several descriptors contained in the data set were not taken into account due to the amount of data available. That is, variables with little data were discarded in order to provide a more reliable estimate of the mutual information. Although the dataset consists of more than 40,000 observations, it is plagued by missing data. In particular, taking relative humidity versus PCE (see Fig. 3) yields only 67 observations out of 42,000. The graph shows lower values of PCE for very low and very high values of relative humidity. The best PCE values occur for relative humidity values between 30 and \(40\%\). In other words, statistical association measures suited to detecting non-linear relationships in the presence of missing data are needed.
4 Conclusions and future work
We introduce a method that quantifies the degree of statistical association between descriptors of Perovskite solar cells and their performance parameters. It is able to measure this association even in cases of non-linear relationships; moreover, since no model is fitted, the estimate does not depend on any model, thus achieving a general quantification of the relevance of the input features. With this study, we have found that the X ion is the factor that most impacts the performance of the solar cell. Variables such as band gap, Perovskite layer thickness, and the A and B ions are also important.
Regarding future work, due to the amount of missing data and the curse of dimensionality, it is difficult to estimate the joint mutual information between sets of input descriptors and performance measures in order to establish the optimal set of features. It is important to take into account that mutual information estimation implies N-dimensional probability density function estimation procedures. On the other hand, feature selection by means of fitted models (wrapper methods) would also run into problems. As the dimension of the model increases, the number of available complete observations decreases, so fewer observations remain as input descriptors are added. We would be discarding information as we increase the model complexity. Therefore, it would be necessary to use feature selection methods suited to missing data problems. Another strategy is to implement techniques such as MICE (Multiple Imputation by Chained Equations) to estimate missing values before estimation.
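As a rough indication of what such an imputation step could look like, the sketch below uses scikit-learn's `IterativeImputer`, which follows the chained-equations idea; it is not a full MICE workflow and has not been applied to the Perovskite data. The two synthetic columns, the missingness rate, and the choice of five imputed copies are assumptions for illustration.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the import below)
from sklearn.impute import IterativeImputer

# Synthetic data: column b depends on column a, and 20% of b is removed.
rng = np.random.default_rng(0)
n = 300
a = rng.normal(size=n)
b = 2.0 * a + rng.normal(scale=0.1, size=n)
X_true = np.column_stack([a, b])
X_missing = X_true.copy()
missing = rng.random(n) < 0.2
X_missing[missing, 1] = np.nan

# Several imputed copies, each drawn from the predictive posterior with a
# different seed, so that the imputation uncertainty can be propagated.
imputations = [
    IterativeImputer(sample_posterior=True, max_iter=10, random_state=s)
    .fit_transform(X_missing)
    for s in range(5)
]

mean_imputed_b = np.mean([imp[missing, 1] for imp in imputations], axis=0)
mean_abs_error = float(np.mean(np.abs(mean_imputed_b - X_true[missing, 1])))
print(f"mean absolute imputation error: {mean_abs_error:.3f}")
```

An MI estimate would then be computed on each imputed copy and the results pooled, so that the spread across copies reflects the uncertainty introduced by the missing values.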
Availability of data and materials
The data used in this work are published in the Perovskite Database Project [4].
Change history
31 May 2024
Funding information updated.
References
J.J. Yoo, G. Seo, M.R. Chua, et al., Efficient perovskite solar cells via improved carrier management. Nature 590, 587–593 (2021). https://doi.org/10.1038/s41586-021-03285-w
D.V. Anand, Q. Xu, J. Wee, K. Xia, T.C. Sum, Topological feature engineering for machine learning based halide perovskite materials design. npj Comput. Mater. 8(1), 203 (2022). https://doi.org/10.1038/s41524-022-00883-8
Performance analysis of perovskite solar cells in 2013-2018 using machine-learning tools. Nano Energy 56, 770–791 (2019). https://doi.org/10.1016/j.nanoen.2018.11.069
T.J. Jacobsson, A. Hultqvist, A. García-Fernández, et al., An open-access database and analysis tool for perovskite solar cells based on the FAIR data principles. Nature Energy 7, 107–115 (2022). https://doi.org/10.1038/s41560-021-00941-3
Chemical Materials Solutions Center-Korea Research Institute of Chemical Technology: Perovskite Solar Cells DB. http://www.perovskite.info/about Accessed 11 Jan 2022
J. Velez-Sanchez, M. Botero-Londoño, A. Sepúlveda, C. Otalora-Bastidas, C. Camacho Parra, Absorber layer thickness as a new feature in statistical learning tools of perovskite solar cells. J. Appl. Res. Technol. 21, 858–865 (2023) https://doi.org/10.22201/icat.24486736e.2023.21.5.2057
Q. Tao, P. Xu, M. Li, W. Lu, Machine learning for perovskite materials design and discovery. npj Comput. Mater. 7(23) (2021). https://doi.org/10.1038/s41524-021-00495-8
C. Suryanarayana, Mechanical alloying and milling. Prog. Mater Sci. 46(1), 1–184 (2001). https://doi.org/10.1016/S0079-6425(99)00010-9
J. Li, B. Pradhan, S. Gaur, J. Thomas, Predictions and strategies learned from machine learning to develop high-performing perovskite solar cells. Adv. Energy Mater. 9(46), 1901891 (2019)
F. Khmaissia, H. Frigui, M. Sunkara, J. Jasinski, A.M. Garcia, T. Pace, M. Menon, Accelerating band gap prediction for solar materials using feature selection and regression techniques. Comput. Mater. Sci. 147, 304–315 (2018). https://doi.org/10.1016/j.commatsci.2018.02.012
P. Raccuglia, K.C. Elbert, P.D.F. Adler, C. Falk, M.B. Wenny, A. Mollo, M. Zeller, S.A. Friedler, J. Schrier, A.J. Norquist, Machine-learning-assisted materials discovery using failed experiments. Nature 533(7601), 73–76 (2016). https://doi.org/10.1038/nature17439
R. Ramprasad, R. Batra, G. Pilania, A. Mannodi-Kanakkithodi, C. Kim, Machine learning in materials informatics: recent applications and prospects. npj Comput. Mater. 3(1), 54 (2017). https://doi.org/10.1038/s41524-017-0056-5
K.T. Butler, D.W. Davies, H. Cartwright, O. Isayev, A. Walsh, Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018). https://doi.org/10.1038/s41586-018-0337-2
S.R. Kalidindi, M. De Graef, Materials data science: current status and future outlook. Annu. Rev. Mater. Res. 45(1), 171–193 (2015)
C.M. Bishop, Pattern recognition and machine learning (Information Science and Statistics), 1st (edn.) Springer, USA, (2007)
T. Hastie, R. Tibshirani, J. Friedman, The elements of statistical learning : data mining inference and prediction, Springer, Boston, (2009)
M. Mammeri, L. Dehimi, H. Bencherif, F. Pezzimenti, Paths towards high perovskite solar cells stability using machine learning techniques. Sol. Energy 249, 651–660 (2023). https://doi.org/10.1016/j.solener.2022.12.002
I. Guyon, A. Elisseeff, An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003). https://doi.org/10.1162/153244303322753616
Y. Hu, X. Hu, L. Zhang, T. Zheng, J. You, B. Jia, Y. Ma, X. Du, L. Zhang, J. Wang, B. Che, T. Chen, S.F. Liu, Machine-learning modeling for ultra-stable high-efficiency perovskite solar cells. Adv. Energy Mater. 12(41), 2201463 (2022)
Y. Lu, D. Wei, W. Liu, J. Meng, X. Huo, Y. Zhang, Z. Liang, B. Qiao, S. Zhao, D. Song, Z. Xu, Predicting the device performance of the perovskite solar cells from the experimental parameters through machine learning of existing experimental results. J. Energy Chem. 77, 200–208 (2023). https://doi.org/10.1016/j.jechem.2022.10.024
Y. Liu, W. Yan, S. Han, H. Zhu, Y. Tu, L. Guan, X. Tan, How machine learning predicts and explains the performance of perovskite solar cells. Solar RRL 6(6), 2101100 (2022)
H. Sahu, W. Rao, A. Troisi, H. Ma, Toward predicting efficiency of organic solar cells via machine learning and improved descriptors. Adv. Energy Mater. 8(24), (2018). https://doi.org/10.1002/aenm.201801032
J. Dickinson Gibbons, S. Chakraborti, Nonparametric statistical inference 4th (edn.) Marcel Dekker, USA, (2003)
M.M. Salah, Z. Ismail, S. Abdellatif, Selecting an appropriate machine-learning model for perovskite solar cell datasets. Mater. Renewable Sustainable Energy 12(3), 187–198 (2023). https://doi.org/10.1007/s40243-023-00239-2
T.J. Jacobsson, A. Hultqvist, A. García-Fernández, et al, The perovskite database project. https://www.perovskitedatabase.com/ Accessed 11 Jan 2022
T.M. Cover, J.A. Thomas, Elements of information theory (Wiley Series in Telecommunications and Signal Processing) Wiley-Interscience, USA, (2006)
E.H. Linfoot, An informational measure of correlation. Inf. Control 1(1), 85–89 (1957). https://doi.org/10.1016/S0019-9958(57)90116-X
A modification of Linfoot’s informational correlation coefficient. Austrian J. Stat. 46(3-4), 99–105 (2019). https://doi.org/10.17713/ajs.v46i3-4.675
G. Wang, F.H. Lochovsky, Feature selection with conditional mutual information maximin in text categorization. In: Proceedings of the thirteenth ACM international conference on information and knowledge management. CIKM ’04, pp. 342–349. Association for Computing Machinery, New York, NY, USA (2004). https://doi.org/10.1145/1031171.1031241
Entropy, relative entropy, and mutual information, pp. 13–55. Wiley, Ltd (2005). Chap. 2. https://doi.org/10.1002/047174882X.ch2
N. Carrara, J. Ernst, On the estimation of mutual information. Proceedings 33(1), (2019). https://doi.org/10.3390/proceedings2019033031
A. Kraskov, H. Stögbauer, P. Grassberger, Estimating mutual information. Phys. Rev. E 69(6), (2004)
B.C. Ross, Mutual information between discrete and continuous data sets. PLoS ONE 9(2), (2014). https://doi.org/10.1371/journal.pone.0087357
N.J. Jeon, J.H. Noh, Y.C. Kim, W.S. Yang, S. Ryu, S.I. Seok, Solvent engineering for high-performance inorganic-organic hybrid perovskite solar cells. Nat. Mater. 13(9), 897–903 (2014). https://doi.org/10.1038/nmat4014
H. Zhou, Q. Chen, G. Li, S. Luo, T.-B. Song, H.-S. Duan, Z. Hong, J. You, Y. Liu, Y. Yang, Interface engineering of highly efficient perovskite solar cells. Science 345(6196), 542–546 (2014)
W. Nie, H. Tsai, R. Asadpour, J.-C. Blancon, A.J. Neukirch, G. Gupta, J.J. Crochet, M. Chhowalla, S. Tretiak, M.A. Alam, H.-L. Wang, A.D. Mohite, High-efficiency solution-processed perovskite solar cells with millimeter-scale grains. Science 347(6221), 522–525 (2015)
Funding
Open Access funding provided by Colombia Consortium. The Universidad Industrial de Santander (www.uis.edu.co) provided financial support through the project 3952 and MINCIENCIAS-Colombia (minciencias.gov.co) through the project 890.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Jeisson Emilio Vélez Sánchez, Alexander Sepúlveda Sepúlveda, and Monica Andrea Botero Londoño. The first draft of the manuscript was written by Alexander Sepúlveda, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Vélez, J., Botero L., M.A. & Sepulveda, A. Measurement of information content of Perovskite solar cell’s synthesis descriptors related to performance parameters. emergent mater. (2024). https://doi.org/10.1007/s42247-024-00667-4
DOI: https://doi.org/10.1007/s42247-024-00667-4