Introduction

As one of the most effective infertility treatments, assisted reproductive technology (ART) is becoming increasingly advanced and widely utilized due to continuous improvements in essential technologies such as controlled ovarian stimulation (COS), ultrasound-guided oocyte collection, sperm processing, embryo culture and transfer, pre-embryo transfer genetic diagnosis, etc. Despite significant advancements, the success rate of In vitro fertilization (IVF) cycle seems to have reached a plateau: currently, the clinical success rate still hovers at 40–50% with the final live birth rate of around 30% [1]. Given that the primary objective of IVF treatment is to attain live birth, the current situation is far from ideal. The low success rate, necessity for repeated cycles, costly treatments, and complex procedures impose a significant financial and emotional burden on infertile couples. If clinicians are able to make an accurate and reliable prediction of the live birth outcome (LBO) prior to the IVF cycle and adjust the treatment strategies accordingly, improved LBO could be achieved, which would hold great significance for both clinicians and patients.

Several factors have been identified as potential influencers of the final pregnancy outcome, such as basic clinical characteristics, COS strategies, embryo transfer-related details, and so forth [2]. Van Loendersloot et al. [3] developed a pregnancy rate prediction model based on 13 indicators including the female age, duration of subfertility, previous ongoing pregnancy, male subfertility, diminished ovarian reserve, endometriosis, basal follicle stimulating hormone (bFSH), number of failed IVF cycles, fertilization, number of embryos, mean morphological score per Day 3 embryo, presence of 8-cell embryos on Day 3 and presence of morulae on Day 3. Nevertheless, we believe that the prediction of LBO deserves more attention than that of the pregnancy rate. On this particular research topic, a retrospective cohort study conducted by Metello et al. [4] showed that the age of the patient, anti-Müllerian hormone (AMH), antral follicle count (AFC), and infertility factors are significant determinants of LBO. Additionally, a multi-center big data study conducted by Wen et al. [5] confirmed that female age, cycle number, female body mass index (BMI), male factor, ovulation disorder, and endometrial thickness are important predictors of LBO. The current prediction models have only focused a limited number of predictors before the IVF cycle and lacked consistent and reliable standards, which limits their application in real clinical practice. For a complete IVF cycle, there are numerous factors influencing the LBO, especially after oocyte retrieval (such as the embryo transfer strategy). Therefore, it is highly meaningful to comprehensively evaluate the influencing factors in the IVF cycle (including post-oocyte retrieving factors) and establish a comprehensive and accurate prediction model for LBO.

Based on historical clinical information of patients and machine learning methods, this research aims to develop a prediction model for LBO (i.e., a classification model of live birth or not), which fully considers representative indicators of patients during IVF-ET cycle, such as basic clinical characteristics, COS strategies, ovarian responsiveness to gonadotropin (Gn), embryo transfer strategy, etc. The proposed model is then utilized for predicting the LBO as well as optimizing the embryo transfer strategy in real clinical practice. The establishment of the LBO model can play a positive role in optimizing treatment plans, reducing short-term and long-term risks of IVF treatment, and improving LBO. Additionally, it can provide a scientific and objective assessment of the outcome of IVF treatment, alleviate patient’s psychological burden, and increase treatment confidence and patient’s compliance during the process.

Methods

Study design and participants

Patients undergoing IVF/Intracytoplasmic sperm injection (ICSI) cycle in the Reproductive Center of Renmin Hospital of Wuhan University were enrolled in our study, and each patient is transferred with one or two embryos in each fresh cycle. Exclusion criteria comprised: (1) pulmonary tuberculosis; (2) serious medical diseases such as hypertension, diabetes, liver, and kidney diseases; (3) uterine malformation, intrauterine adhesion and hydrosalpinx; (4) oocyte donation cycles or natural cycles; (5) Progestin-primed ovarian stimulation (PPOS), luteal phase stimulation or micro-stimulation program; (6) chromosomal abnormalities in infertile couples. A total of 1405 women’s clinical data were included in developing the LBO prediction model.

IVF patients underwent COS and transvaginal oocyte retrieval following human chorionic gonadotropin (HCG) trigger when one or two dominant follicles reached 18 mm in diameter. The selected sperm and egg was fertilized to form embryos. One or two embryos were cultured and transferred at either cleavage stage (2–3 days after oocyte collection) or blastocyst stage (5–6 days after oocyte collection). Serum HCG test and B-ultrasonography were conducted 14 and 30 days respectively post-transfer to confirm pregnancy. Live birth was defined as the delivery of a fetus alive at 28 weeks gestation, who remained alive for at least one month.

To build the LBO model, 16 related factors were analyzed initially in our study, including 5 basic clinical characteristics (age, BMI, infertility type, infertility duration, infertility cause); and 11 clinical cycle indexes (COS treatment regimen, Gn starting dose, Ovarian sensitivity index (OSI), E2 level on HCG day, Progesterone (P) value on HCG day, luteinizing hormone (LH) value on HCG day, endometrial thickness on HCG day; pronuclei (2PN) number, transferable embryos number, high-quality embryos number, embryo transfer strategy (the stage and embryos transferred number)).

The proposed research is outlined in Fig. 1.

Fig. 1
figure 1

Overview of the proposed research. (1)~(2) Collect data of patients and find the factors related to the LBO; (3) Conduct univariate analysis to determine the significant influencing factors of LBO initially; (4) Conduct multivariate analysis to determine the influencing factors of LBO further; (5) Based on machine learning method, the ANN-based LBO model and SVM-based LBO model are built; (6) The prediction performance of the two models are compared, and the one, i.e., the SVM-based LBO model, that has better modeling performance is selected. (7) 81 new patients were involved in the clinical application of the proposed model to verify the prediction performance and the function of recommendation of the embryo transfer strategy

Machine learning methods

The LBO prediction model aims to, by taking the clinical information of patients as input, output the result of live birth or not. Therefore, it is a classification model. This study uses two typical classification models, i.e., artificial neural network (ANN) and support vector machines (SVM) to build the LBO prediction model. The corresponding two models are defined as the ANN-based LBO model and the SVM-based LBO model.

Artificial neural network model

By choosing appropriate ANN hyperparameters, ANN, in theory, can approximate any linear and nonlinear function. The dataset is randomly divided into a training set (70%), a validation set (17%), and a test set (13%). These sets are respectively used for calibrating and optimizing the parameters of the ANN, adjusting the hyperparameters and complexity of the model, and testing the generalization ability of the trained ANN model. Note that the sample division for the three sets is determined via a trial-and-error method: we have tried dozens of combinations for the percentage of training, validation, and test set, and the one with the best prediction performance is our final selection. During training, the Polak-Ribiére conjugate gradient algorithm is employed to update the parameters of the neural network, with the iteration termination conditions being: the maximum number of iterations is 2000, the minimum gradient is \({1\times 10}^{-10}\), or the minimum iteration step size is \({1\times 10}^{-6}\). Since the proposed model is a classification model, cross entropy function is selected as the loss function.

Support vector machines (SVM) model

SVM is also a powerful classification model in machine learning. When training the SVM model, the data is randomly divided into a training set (80%) and a test set (20%), with the main parameters set as follows: SVM kernel function: Gaussian function; kernel scale parameters: automatically selected; optimization algorithm: ISDA iterative single data algorithm. Other parameters of the SVM model (such as the number of cross-folds, tolerance of gradient differences, maximum number of numerical optimization iterations, etc.) are automatically selected and optimized using the OptimizeHyperparameters function of the SVM algorithm.

Evaluation indicators of the prediction model

The following indicators of classification models can express the modeling performance of the proposed ANN-based and SVM-based LBO:

(1) Graphic indicators: receiver operating characteristic (ROC) curve and area under curve (AUC) value. Among them, the closer the ROC curve is toward the point (0,1), i.e., the further it deviates from the 45-degree diagonal to the upper left corner, the larger the area of the AUC, and the better performance of the classification model.

(2) Quantitative indicators: precision, sensitivity (also called recall), accuracy, and F1 score. The greater the precision, sensitivity, accuracy and F1 score, the better the performance of the prediction model.

Clinical application and validation of the model

The clinical application of the proposed model is applied to the incoming patients after the model has been built. With the same screening criterion as these for building the mode, we select 82 patients who underwent IVF-ET treatment at the Reproductive Center of Renmin Hospital of Wuhan University, to whom we apply the proposed model for predicting the LBO and customizing the embryo transfer strategy.

Statistical and machine learning model

Univariate and multivariate analyses were applied to determine whether the 16 factors, as previously described, had statistically significant effects on the LBO. The selected meaningful influencing factors (P<0.05) were taken as input to establish the machine learning model of LBO. Data processing and correlation analysis are completed in IBM SPSS Statistics 24. In the univariate analysis, continuous and categorical variables were analyzed by \(t\)-test and \({\chi }^{2}\)-test, respectively, while the multivariate analysis was conducted by binary logistic regression. A two-sided test was performed, with P < 0.05 considered significant.

The ANN and SVM machine learning method is programmed and implemented in MATLAB R2021a. The software prototype “Decision Support System of IVF–Embryo transfer strategy recommendation”, as shown in Fig. 2, is developed in Visual Studio 2019 and QT 6.0.

Fig. 2
figure 2

Prototype software - Decision Support System of IVF – Embryo transfer strategy determination. For each patient, the user manually inputs the values of 7 influencing factors on the left side of the software and clicks the “Predict the LBO” button on the right to call the SVM-based LBO model embedded in the software. The predicted result will show in the dialog box of “LBO”. By clicking the “Recommend embryo transfer strategy” button, the strategy that has good LBO will be recommended in the dialog box. The information displayed on the software interface in Fig. 3 is the clinical information of Patient # 48 and its corresponding predicted LBO, for which the predicted LBO is “Non-live birth” and the recommended strategy is “2 embryos at cleavage stage”

Results

Univariate analysis of factors influencing live birth outcomes

The 1405 patients were divided into a live birth group (592 patients) and a non-live birth group (813 patients), as listed in Table 1. Univariate analysis revealed that 10 out of the 16 potential influencing factors exhibited statistical significance on the outcome of live birth: age (P < 0.001), OSI (P = 0.003), infertility cause (P = 0.048), COS treatment regimen (P < 0.001), Gn starting dose (P = 0.001), endometrial thickness on HCG day (P = 0.007), LH value on HCG day (P < 0.001), P value on HCG day (P = 0.032), 2PN number (P = 0.049) and embryo transfer strategy (P < 0.001). The age, P value of HCG day, and LH value of HCG day in the live birth group were significantly lower than those in the non-live birth group (P<0.05); and OSI, Gn starting dose, endometrial thickness on HCG day and 2PN number of the live birth group were significantly higher than those of non-live birth group (P<0.05). However, there were no statistically significant differences in infertility type, infertility duration, BMI, E2 level on HCG day, transferable embryo number, and high-quality embryo number between the live birth group and the non-live birth group (P > 0.05).

Table 1 Demographic information and univariate analysis results of the 16 influencing factors of the LBO model

Multivariate analysis of factors influencing live birth outcomes

Ten factors selected in univariate analysis (age, OSI, infertility cause, COS treatment regimen, Gn starting dose, endometrial thickness on HCG day, LH value on HCG day, P value on HCG day, 2PN number, embryo transfer strategy) were further included in binary logistic regression for multivariate analysis. The results are presented in Table 2. It shows that age (P < 0.001), OSI (P = 0.027), COS treatment regimen (P ≤ 0.028), Gn starting dose (P < 0.001), endometrial thickness on HCG day (P < 0.001), P value on HCG day (P = 0.023), embryo transfer strategy (P ≤ 0.045) were significant correlated with LBO (P < 0.05). The remaining three factors (infertility causes, LH value on HCG day, and 2PN number) have no significance (P > 0.05), and therefore were excluded in the subsequent modeling of LBO process.

Table 2 Multivariate analysis results of the 10 initial screened impact factors

Live birth outcome model based on machine learning

Given the 7 final selected impact factors from multivariate analysis, the proposed ANN-based and SVM-based LBO model can be built with the input and output being:

7 inputs: Age, OSI, COS treatment regimen, Gn starting dose, endometrial thickness on HCG day, P value on HCG day, and embryo transfer strategy.

Output: LBO (whether live birth is achieved or not)

Modeling results of ANN-based LBO model

The total 1405 samples were randomly divided into a training set (979 cases), a verification set (243 cases), and a test set (183 cases). To effectively train the model, the samples in the three sets are randomly chosen from the total dataset, and each sample can be randomly chosen only once when constructing the three sets. In this way, the model can be trained with good prediction performance and generalization. The number of nodes and hidden layers of the ANN model was determined by trial and error: the ANN-based LBO model in this work has 2 hidden layers containing 5 and 3 nodes, respectively.

The modeling performance of the proposed ANN-based LBO is shown in Fig. 3. The ROC curves of the training set, validation set, and test set are shown in Fig. 3(a), (b), and (c), respectively, with the AUC being 0.726, 0.719, and 0.701. The precision, sensitivity, accuracy, and F1 score is listed in Table 3.

Fig. 3
figure 3

ROC curves of ANN-based LBO model. (A) the training set; (B) the validation set; (C) the test set

Table 3 Modeling performance of the proposed ANN-based LBO model

The overall prediction performance of the proposed model on the training set is a litter better than that on the validation and test sets. This is because the parameters of the ANN model itself are calibrated and optimized from the training set. In addition, there is not much difference between the performance of the two, indicating that the established ANN-based LBO model has good generalization ability.

Modeling results of SVM-based LBO model

Unlike the ANN model, the SVM model only needs to divide the samples into two groups: training set (1124) and test set (281). The two sets are also generated randomly, like those for the ANN model. For both the training set and test set, the established SVM-based LBO model has good classification performance, with the two ROC curves shown in Fig. 4 and AUCs being 0.912 and 0.854, respectively. The precision, sensitivity, accuracy, and F1 score are listed in Table 4.

Fig. 4
figure 4

ROC curves of SVM-based LBO model. (A) the training set; (B) the test set

Compared with the test set, the training set has better performance except for sensitivity. This is because the parameters of the SVM model itself are determined based on the training set, so it has a better classification effect on the training set. In addition, for the test set itself, the SVM model’s AUC = 0.854 and F1 score = 77.18% for its classification results, indicating that the established model has good generalization ability, facilitating its further clinical application and verification.

Table 4 Modeling performance of the proposed SVM-based LBO model

Comparison and selection of the final LBO model

To select the final model for predicting the LBO of the IVF-ET cycle, the modeling performance of the proposed two models is compared. Regardless of the training set or the test set, the modeling evaluation indicators of SVM-based model is significantly better these of the ANN model: training set (AUC: 0.912 vs. 0.726; precision: 86.41% vs. 67.23%; sensitivity: 75.58% vs. 67.72%; accuracy: 84.78% vs. 75.52%; F1-score: 80.63% vs. 67.47%); test set (AUC: 0.854 vs. 0.701; precision: 77.50% vs. 62.96%; sensitivity: 76.86% vs. 68.92%; accuracy: 80.43% vs. 71.04%; F1-score: 77.18% vs. 65.81%). Therefore, the SVM-based LBO model is selected as the final model for predicting the LBO of patients under the IVF-ET cycle.

Validation and clinical application of the proposed SVM-based LBO model

Validation of the proposed SVM-based LBO model

Based on the proposed SVM-based LBO model, a prototype software called “Decision Support System of IVF – Embryo transfer strategy determination” is developed, with the user interface shown in Fig. 2.

The predictive performance of the established SVM-based LBO model was verified in actual clinical practices. For each of the new 82 patients, the information of 7 impact factors (i.e., OSI, COS treatment regimen, Gn starting dose, endometrial thickness on HCG day, P value on HCG day, and embryo transfer strategy) was taken as the input of the software (shown in Fig. 2), and the SVM-based LBO model embedded in the software is used for prediction.

The overall prediction results (i.e., ROC) of 82 patients are shown in Fig. 5, with the evaluation indicators of the prediction: AUC = 0.862, precision = 90.57%, sensitivity = 75.00%, accuracy = 74.39%, F1 score = 82.05%. Prediction results prove that the proposed model has good prediction performance, which further verifies the effectiveness of the model.

Fig. 5
figure 5

ROC of the proposed model on the clinical application sample

Also, the calibration plot and decision curve analysis (DCA) of the proposed model toward the clinical application sample are shown in Fig. 6. Both the apparent and bias-corrected curves are pretty close to the ideal line (Fig. 6(A)), and by utilizing our model, the net benefit could constantly be improved significantly for any threshold probability (Fig. 6(B)); therefore, the effectiveness of the model in terms of prediction the LOB and clinical application is further validated.

Fig. 6
figure 6

(A) Calibration plot of the proposed model; (B) DCA curve of the proposed model

Clinical application of the proposed model in recommending the embryo transfer strategy

In addition to the prediction of LBOs, the established model is utilized to guide the clinical practice, e.g., achieving optimal embryo transfer strategy in the IVF cycle. The process of customizing the embryo transfer strategy involves two steps:

  1. (1)

    Given six basic indicators of the patients (Age, OSI, COS treatment regimen, Gn starting dose, endometrial thickness on HCG day, and P value on HCG day), the corresponding LBO is predicted for the four embryo transfer strategies (i.e., Strategy-1: 1 embryo at cleavage stage; Strategy-2: 2 embryos at cleavage stage; Strategy-3: 1 embryo at blastocyst stage; Strategy-4: 2 embryos at blastocyst stage).

  2. (2)

    Among four predicted LBOs, the one with the good outcome (i.e., live birth) will be selected as the optimal strategy, which will be further used to assist clinicians in making the final decision on embryo transfer.

The recommended embryo transfer strategy based on our model for the 82 patients is shown in Fig. 7. As can be seen from the figure, there may be several different alternatives for the same patient, and all of which have the predicted live birth. For example, both Strategy-2 and Strategy-4 can give Patient #4 live birth; Patient #20 has even more options: Strategy-2, Strategy-3, and Strategy-4.

Fig. 7
figure 7

Comparison between model-based recommended strategy and clinician-based strategy (In the figure, the horizontal axis represents the patient ID., and the vertical axis represents the embryo transfer strategy)

Compared with a clinician-based embryo transfer strategy that relies on the experience of clinicians, the recommended strategy based on the proposed model may have better LBOs for some patients. For example, for Patient #48 shown in Fig. 2: the clinician-based decision is Strategy-4, and the outcome is non-live birth; however, the recommended result from our model is Strategy-2 (as shown in Fig. 7), meaning that, by choosing Strategy-2 instead of Strategy-4, Patient #48 could have a greater likelihood of live birth. Similarly, for Patient # 53, if Strategy-4 is adopted instead of Strategy-3, it has a greater probability of obtaining a live birth.

Discussion

This study constructed a novel LBO prediction model based on the retrospective clinical data from 1405 patients. Before constructing the model, univariate and multivariate analysis identified 7 statistically significant influencing factors out of 16: Age, OSI, COS treatment regimen, Gn starting dose, endometrial thickness on HCG day, P value on HCG day and embryo transfer strategy. By taking the 7 screened factors as inputs and the LBO as outputs, two machine learning models, i.e., the ANN-based LBO model and the SVM-based LBO model, were established. By comparing the evaluation indexes of the two models, the SVM-based LBO model demonstrates superior modeling performance, with precision: 77.50% (test set) ~ 86.41% (training set), sensitivity: 75.58%~76.86%, accuracy: 80.43%~84.78%, F1 score: 77.18%~80.63%, AUC: 0.854 ~ 0.912. Therefore, the SVM-based LBO model is selected as the final model for predicting the LBO. Compared with the prevailing research methods, the model proposed in this study demonstrates significantly improved predictive performance.

Our proposed model demonstrates that female age is a significant predictor of LBO. As women age, their ovarian reserve capacity and reactivity decrease, leading to a reduction in the number of oocytes and embryos acquired [2, 3, 6], decreased fertilization ability of eggs and developmental potential of embryos, and an increased proportion of abortion and abnormal birth [7, 8]. In clinical practice, it is advisable for older women to promptly arrange a pregnancy plan and actively engage in pregnancy assistance intervention.

Our finding in Sect. 3 also revealed that the Gn starting dose is an essential factor affecting the LBO. The individualization of Gn starting dose is a standard clinical practice during COS in patients undergoing ART treatment [9], and it is determined based on the patient’s age, AMH, AFC, bFSH, BMI, etc. [10]. A small amount of Gn starting dose will lead to insufficient follicle recruitment. However, a large amount will lead to excessive follicle recruitment [11], resulting in an increased incidence of OHSS, and a rise in the progesterone level during COS, ultimately leading to an increase in the cancellation rate of fresh cycle transfer or a decrease in pregnancy rate [12] due to asynchronous endometrial development.

In our proposed model, we have established a robust correlation between the Ovarian Sensitivity Index (OSI), a composite measure of ovarian response [13, 14], and the LBO in patients undergoing IVF treatment for the first time. While previous studies have relied on indicators such as the number of retrieved oocytes, which are often used to reflect ovarian responsiveness, to predict LBO [15], clinical observations have revealed that patients with high ovarian response can still achieve favorable LBO outcomes even with minimal Gn dosage and a smaller quantity of retrieved oocytes [16]. On the contrary, for patients with poor ovarian response (POR), even with increased doses and duration of Gn stimulation and a normal number of oocytes obtained, the live birth rate could still remain low [17, 18]. The concept of OSI serves as a superior measure of ovarian responsiveness to Gn stimulation [14, 19]. In this study, we chose OSI as an indicator due to its comprehensive reflection of both the total dose of Gn and the ovarian response. It allows for simplification in inputting the proposed model without compromising modeling performance.

Results in Sect. 3 also indicate that the COS treatment regimen is a crucial determinant of LBO. In contrast to agonists, antagonists lack the “flare-up” effect and can suppress Gn within a few hours without causing excessive pituitary gland suppression, thereby reducing the required dose and duration of Gn and significantly lowering the incidence of OHSS [20, 21]. Nevertheless, numerous studies have demonstrated that the live birth outcome associated with GnRH antagonist regimens is inferior to that of GnRH agonist long regimens [22,23,24]. In our study, a total of 353 patients were treated with GnRH antagonist regimens, among which only 94 achieved live birth, with a live birth rate of 26.63%, which was much lower than the 55.24% of the GnRH agonist long regimens and the 43.05% of the Ultra-long GnRH agonist regimens. This phenomenon may be attributed to reduced endometrial receptivity in infertile women undergoing GnRH antagonist regimens, leading to a decreased embryo implantation rate [25].

Scholars have suggested establishing a threshold for the P value on HCG day at 1.5-2.0ng/mL [26] due to the potential risk of fresh transplant pregnancy failure associated with higher values [27]. Furthermore, studies have indicated that an increase in P value on HCG day may lead to abnormal expression of endometrial embryo implantation proteins (vascular endothelial growth factor and placental expression factor) and differences in epigenetic profiles, ultimately leading to asynchronous development of the embryo and endometrium and adverse pregnancy outcomes [28]. Elevated P value on HCG day has also been linked to reduced oocyte and embryo quality, leading to lower rates of excellent embryos and cumulative live birth rates [29]. These findings are consistent with the negative partial regression coefficient of P value on HCG day (i.e., B=-0.310) presented in Table 2.

The appropriate endometrial thickness on HCG day is crucial for successful embryo implantation. Some studies have reported that no pregnancy occurs when endometrial thickness is less than 5 mm [30]. Additionally, when the endometrial thickness falls below 7 mm [31, 32] or exceeds 16 mm, it is not conducive to embryo transfer and implantation, which results in a low clinical pregnancy rate [33]. Currently, endometrial thickness of more than 7 mm is considered the conventional lower limit for embryo transfer. Our study suggests that endometrial thickness on HCG day should be regarded as an important index before embryo transfer; endometrial thickness should be adjusted to the ideal thickness before transfer to improve the live birth rate.

Embryos are usually transferred into the womb during the cleavage stage (2–3 days after fertilization) in early times. Prolongation of embryo culture time in vitro to blastocyst stage (5–6 days after fertilization) has been widely employed recently, with the benefits of screening high-quality embryos and keeping the embryo development stage relatively synchronous with the endometrium [34]. Studies have reported that blastocyst transfer yields a higher clinical pregnancy rate and birth rate than cleavage embryos when an equivalent number of embryos are transferred [35,36,37]. The data in Table 1 also demonstrates that the live birth rate following blastocyst transfer is significantly higher than that of embryo transfers at the cleavage stage (1 embryo transfer: 22.42% vs. 15.38%; 2 embryo transfer: 54.34% vs. 46.34%). Two cleaved embryos and one or two blastocyst embryos were independent promotors of clinical live birth compared with one cleaved embryo. It is important to note that the outcome of IVF live birth is not only related to the timing and number of embryos transferred but also the quality of the embryos [38]. However, due to the inherent subjectivity in embryo quality rating and the lack of widespread adoption of time-lapse technology in IVF centers, IVF center rating is not entirely consistent, so we did not include factors of embryo quality rating in this study. This is an aspect that will be explored in future research, utilizing a machine learning algorithm to assess the impact of embryo quality on LBOs. Overall, our findings highlight the significant influence of both stage and number of embryos transferred on the outcome of live birth.

The findings in Sect. 3.4 indicate that the recommended strategy derived from the proposed model may yield superior LBOs for some patients compared to the clinician-based embryo transfer approach. These results have potential implications for informing evidence-based decision-making in IVF clinical practice. Due to the traditional preference for transferring 2 embryos in IVF-ET treatment, the historical data utilized to model the LBO also predominantly supports this strategy, leading to a tendency towards a potential transplant strategy of 2 embryos (the clinical LBO of Strategy-3 is significantly higher than that of Strategy-1 in Sect. 3.4.2 ). However, scholars believe higher LBOs can be achieved with a single blastocyst transferred (i.e., Strategy-3), and this strategy is becoming the mainstream method of embryo transfer for its decreased multiple pregnancies rate and OHSS risk and increased cumulative pregnancy rate. As the practice of single blastocyst transfer gradually becomes prevalent in current and forthcoming data, the expertise and knowledge encompassed in the newly established model will be continuously updated to align with this mainstream embryo transfer strategy, thereby enhancing the recommended outcomes.

The proposed LBO prediction model in this study is developed by leveraging clinical big data through machine learning techniques. The model considers a comprehensive range of factors, including basic clinical characteristics, COS treatment regimen, OSI, P value on HCG day, endometrial thickness on HCG day, and embryo transfer-related indicators to capture the critical processes in the IVF cycle. Based on the established prediction model, it is possible to forecast LBO by iteratively selecting different embryo transfer strategies and ultimately identifying the optimal strategy based on expected outcomes.

In addition to the recommendation of embryo transfer strategy, the proposed model is also suitable for decision-making on other influencing factors of the LBO model. For example, the model can calculate the endometrial thickness interval, from which the expected live birth situation can be achieved. In this way, whether or not to perform embryo transfer can be determined based on actual endometrial thickness and the prediction LBO. This will be another future work.

Conclusions

We have developed an SVM-based model that can accurately predict the LBO of the IVF, as measured by the indicators of precision, accuracy, sensitivity, and F1 score. This model has been successfully applied in clinical practice to provide precise LBO predictions and as a reliable and scientific tool for guiding decision-making in IVF treatment. For example, it can recommend embryo transfer strategies, optimize COS treatment regimens, and determine the ideal endometrial thickness interval for future embryo transfers. It is of great significance in making treatment decisions, alleviating patients’ psychological burden, and promoting patient compliance throughout the IVF treatment cycle.