Introduction

Normal fertilisation is defined by the appearance of two pronuclei (2PN) 16–18 h after fertilisation. In in vitro fertilisation (IVF) treatments, the genetic state of an embryo cannot be predicted before transfer. Typically, normally fertilised embryos of high morphological grade are selected first for transfer1. However, 2PN embryos can also carry chromosomal abnormalities, namely gains or losses of whole chromosomes or chromosomal fragments, which can result in infertility, recurrent pregnancy loss (RPL), or birth defects after embryo transfer2. Pre-implantation genetic testing (PGT) based on high-throughput sequencing or microarray can detect the genetic status of embryos before transfer, so that euploid embryos can be selected to reduce the risk of adverse pregnancy outcomes3,4. In IVF treatments without PGT, selecting euploid embryos is the most important yet most difficult task. Previous studies have shown that the euploidy rate of embryos is correlated with female age, morphological grade, and the number of frozen days of embryos5,6,7. However, predicting euploidy from these factors alone in non-PGT treatments is subjective, lacks rigour, and is insufficient.

Embryonic development is a complex and dynamic process. Conventional morphological grading is limited to observing and evaluating the embryo under a microscope at fixed points in its development. This method also requires the embryos to be exposed to changing environments, which could affect their developmental potency. Embryos are frequently overlooked when 2PN or other important morphological features are not captured at the moment of observation. Although PGT can show the genetic state of embryos before transfer, the biopsy it requires may damage their developmental potency. In addition, the clinical application of PGT is strictly restricted in China.

In recent years, artificial intelligence (AI) methods have linked the morphokinetic characteristics of embryos to their developmental potency without invasive testing. Meseguer et al. showed that embryo implantation can be predicted by an AI-based model using morphological and morphokinetic characteristics8. A preclinical validation study by Storr et al. suggested that the results of such studies may differ between clinical settings, algorithms, or IVF clinics9. In 2020, VerMilyea et al. showed effective prediction of embryo viability with an AI-based model using static images captured by optical light microscopy, in a multicentre study including 8,886 embryos and 11 IVF clinics10. Lee et al. demonstrated that euploid embryos had better morphokinetic performance than mosaic embryos and that the generally applicable KIDScore D5 model was able to distinguish euploid blastocysts with specific morphokinetic characteristics11. Furthermore, AI-based models have been applied to predict frozen embryo transfer (FET) outcomes in many studies, with positive results. Adolfsson et al. reported that, compared with the morphological grade (Gardner Schoolcraft criteria), the D3 KIDScore model was superior for D3 embryos and blastocysts in predicting live birth12. Petersen et al. found that the KIDScore model could predict the implantation potency of embryos transferred on D313. Kato et al. demonstrated that the KIDScore model worked well for the prediction of pregnancy and live birth outcomes in patients of advanced age14. Ueno et al. demonstrated a deep learning model capable of predicting pregnancy after FET of blastocysts15. Gazzo et al. reported that embryo selection using the KIDScore model improved implantation rates after FET, and that embryos with the highest KIDScore had a higher probability of being euploid and of implanting16. 
Reignier et al. reported that the KIDScore model was significantly associated with implantation rates after FET of D5 blastocysts, although its predictive performance required further improvement17. Adolfsson et al. later contradicted their earlier study, which had reported that the KIDScore model was superior to the Gardner grade of blastocysts in predicting live births12. The authors concluded that the KIDScore did not outperform the morphological grade of blastocysts, albeit in a small sample, but that it remained effective in predicting live birth18. Together, these studies demonstrate the value of AI-based models built on morphological and morphokinetic characteristics and provide a clinical basis for developing further applications of such models in IVF treatments.

One of the most important goals of IVF is to acquire as many euploid embryos as possible. Euploid embryos have excellent developmental potency and offer a high chance of live birth, but selecting them in non-PGT treatments is challenging. Embryos with better morphokinetic performance are more likely to be euploid. Studying how well an AI-based model predicts the euploidy of embryos, rather than transfer outcomes, keeps endometrial conditions, subjective embryo selection, and immunological factors after frozen embryo transfer (FET) out of the analysis. Moreover, PGT provides an objective assessment of the genetic status of embryos, which helps in building a model to predict euploidy in non-PGT treatments. In addition, a time-lapse monitoring (TLM) system enables continuous assessment throughout embryonic development instead of describing this dynamic process by static observations, and the morphokinetic characteristics it records provide a basis for predicting the euploidy of embryos. TLM also reduces the time embryos are exposed to changing environments. Although many studies have shown that AI-based models effectively predict FET outcomes in IVF treatments, a previous study suggested that results may differ between clinical settings or IVF clinics9. 
In our study of PGT treatments with TLM, we comprehensively assessed how well the following factors predicted the euploidy of blastocysts and live birth after FET: female age; clinical indication for PGT (recurrent implantation failure (RIF), RPL, severe teratozoospermia (STS), or female advanced age (FAA: 38 years or older and requiring assisted reproductive technology)); number of embryonic frozen days; morphological characteristics (Gardner grade of blastocysts); and morphokinetic characteristics (time to reach two cells (t2), three cells (t3), four cells (t4), five cells (t5), time to reach blastocyst (tB), and the D5 KIDScore of blastocysts). The aim was to provide a basis for embryo selection in non-PGT IVF treatments.

Results

In this study, a total of 403 PGT for aneuploidy (PGT-A) treatments were investigated, including 83 patients with RIF, 184 patients with RPL, 83 patients with STS, and 53 patients with FAA (Table 1). Female age did not differ significantly among the RIF, RPL, and STS groups (P > 0.05), but was significantly lower in these groups than in the FAA group (P < 0.05). A total of 1,396 blastocysts were successfully tested by PGT-A, including 296 in the RIF group, 691 in the RPL group, 274 in the STS group, and 135 in the FAA group. Across all PGT-A treatments, the rates of euploidy, mosaicism, and aneuploidy were 52.51%, 10.32%, and 37.17%, respectively. The euploidy rate did not differ significantly among the RIF, RPL, and STS groups (P > 0.05), but was significantly higher in these groups than in the FAA group (P < 0.05).

Table 1 Introduction of PGT-A treatments in this study.

Relationship between morphological characteristics and euploidy of blastocysts

In the present study, 773 and 623 blastocysts were frozen on D5 and D6, respectively (Table 2). The euploidy rate of D5 blastocysts graded as AA in patients < 38 years old was 98.43%, and in patients ≥ 38 years of age was 70.83%, significantly higher than that of D5 blastocysts graded as AB, BA, BB, BC, or CB (P < 0.05). The euploidy rate of D6 blastocysts graded as AA in patients younger than 38 years was 96.49%, and in patients ≥ 38 years of age was 88.24%, significantly higher than that of D6 blastocysts graded as AB, BA, BB, BC, or CB (P < 0.05). Blastocysts graded as AA showed a significantly higher euploidy rate than the others, which indicated that the Gardner morphology grade was an important characteristic for predicting euploid blastocysts.

Table 2 Relationship between Gardner grade and euploidy of blastocysts.

Relationship of morphokinetic characteristics and clinical parameters to euploidy of blastocysts

The times to reach two cells (t2), three cells (t3), four cells (t4), five cells (t5), and blastocyst (tB) were significantly delayed in blastocysts frozen on D6 compared with blastocysts frozen on D5 (P < 0.05), and the KIDScore of D5 blastocysts was higher than that of D6 blastocysts (P < 0.05) (Table 3). For all blastocysts, the Spearman correlations of euploidy with t2, t3, t4, t5, tB, KIDScore, female age, clinical indication for PGT-A, Gardner grade of blastocysts, and the number of embryonic frozen days are presented in Table 4. t2, t3, t4, t5, tB, female age, and the number of embryonic frozen days were significantly and negatively correlated with euploidy (P < 0.05), indicating that the likelihood of euploidy increased as these values decreased. KIDScore and Gardner grade were significantly and positively correlated with euploidy (P < 0.05), suggesting that the likelihood of euploidy increased with the KIDScore and the Gardner grade. There was no significant correlation between euploidy and the clinical indication for PGT-A (P > 0.05).
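
As an illustration of the rank correlation used here, the following minimal Python sketch computes Spearman's rho between a morphokinetic timing and a binary euploidy label. The data values and the function names (`average_ranks`, `spearman_rho`) are hypothetical and are not taken from this study; the sketch only shows why a negative coefficient means that euploidy becomes less likely as the timing increases.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: t2 in hours and euploidy status (1 = euploid).
t2_hours = [24.1, 25.0, 26.3, 27.8, 29.5, 31.0]
euploid = [1, 1, 1, 0, 1, 0]

rho = spearman_rho(t2_hours, euploid)
print(f"Spearman rho = {rho:.3f}")  # negative: later t2, fewer euploid embryos
```

Because euploidy is binary, most of its ranks are tied, which is why the sketch assigns mean ranks to ties rather than using the shortcut formula that assumes distinct ranks.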

Table 3 Morphokinetic characteristics of blastocysts in this study.
Table 4 Relationship between characteristics and euploidy by Spearman correlation analysis.

Binary logistic regression was used to analyse the relationship between euploidy and t2, t3, t4, t5, tB, KIDScore, female age, Gardner grade, and number of embryonic frozen days (Table 5). In all blastocysts, t2 (OR 0.730, 95% CI 0.584–0.913), t3 (OR 0.597, 95% CI 0.460–0.775), t5 (OR 0.544, 95% CI 0.468–0.633), KIDScore (OR 0.544, 95% CI 0.389–0.762), female age (OR 0.892, 95% CI 0.865–0.920), Gardner grade (OR 1.790, 95% CI 1.436–2.231), and number of embryonic frozen days (OR 0.150, 95% CI 0.055–0.407) were significantly associated with euploidy (P < 0.05), whereas t4 (OR 1.005, 95% CI 0.846–1.195) and tB (OR 0.935, 95% CI 0.869–1.005) were not (P > 0.05). After adjustment by binary logistic regression, t2 (adjusted OR 0.742, 95% CI 0.593–0.928), t3 (adjusted OR 0.601, 95% CI 0.489–0.738), t5 (adjusted OR 0.543, 95% CI 0.472–0.625), KIDScore (adjusted OR 0.646, 95% CI 0.488–0.855), female age (adjusted OR 0.891, 95% CI 0.864–0.919), Gardner grade (adjusted OR 1.728, 95% CI 1.392–2.145), and number of embryonic frozen days (adjusted OR 0.341, 95% CI 0.218–0.535) remained significantly associated with euploidy (P < 0.05). The predicted probability value of each blastocyst was also calculated by binary logistic regression. A receiver operating characteristic (ROC) curve and its area under the curve (AUC) were used to evaluate the model combining t2, t3, t5, KIDScore, female age, Gardner grade, and number of embryonic frozen days in the prediction of euploid blastocysts (Fig. 1). The ROC curve is shown in red, and the black line is the reference line. The model predicted the euploidy of blastocysts effectively (AUC = 0.879). The Youden index reached its maximum at a predicted probability value of 0.4182, which was taken as the predictive cut-off value; at this point, sensitivity was 0.884 and specificity was 0.700. 
When the predicted probability value of a blastocyst of unknown ploidy was equal to or greater than 0.4182, the blastocyst was predicted to be euploid. The model combining t2, t3, t5, KIDScore, female age, Gardner grade, and number of embryonic frozen days predicted the euploidy of blastocysts more effectively than any single parameter: t2 (AUC = 0.243), t3 (AUC = 0.206), t5 (AUC = 0.166), KIDScore (AUC = 0.730), female age (AUC = 0.371), Gardner grade (AUC = 0.707), and number of embryonic frozen days (AUC = 0.572) (Table 5).
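
The reported figures can be checked with a little arithmetic, sketched below in Python. The Youden index is J = sensitivity + specificity - 1. An AUC below 0.5 for a single parameter does not by itself mean the parameter is uninformative: if the ROC analysis treats larger values as indicating euploidy, an inversely related parameter (such as a timing or female age) yields an AUC below 0.5, and reversing its direction gives 1 - AUC. Whether the single-parameter AUCs above were computed in that direction is an assumption of this sketch, and the helper names are illustrative.

```python
# Values reported in the text above.
SENSITIVITY = 0.884
SPECIFICITY = 0.700
MODEL_AUC = 0.879
SINGLE_AUC = {
    "t2": 0.243,
    "t3": 0.206,
    "t5": 0.166,
    "KIDScore": 0.730,
    "female age": 0.371,
    "Gardner grade": 0.707,
    "frozen days": 0.572,
}


def youden_index(sensitivity, specificity):
    """Youden's J statistic for one cut-off on the ROC curve."""
    return sensitivity + specificity - 1.0


def direction_free_auc(auc):
    """AUC obtained if the score direction is reversed whenever AUC < 0.5."""
    return max(auc, 1.0 - auc)


j_at_cutoff = youden_index(SENSITIVITY, SPECIFICITY)
flipped = {name: direction_free_auc(auc) for name, auc in SINGLE_AUC.items()}

print(f"Youden index at cut-off 0.4182: {j_at_cutoff:.3f}")
for name, auc in flipped.items():
    print(f"{name}: direction-free AUC = {auc:.3f}")
```

Under this assumption, every single parameter still has a direction-free AUC below the combined model's 0.879, which is consistent with the comparison drawn in the text.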

Table 5 Relationship between characteristics and euploidy by binary logistic regression and ROC curve analysis.
Figure 1

ROC of characteristics to predict euploidy of blastocysts. The ROC curve is shown in red and the black line is the reference line. The x-axis shows 1 - specificity and the y-axis shows sensitivity.

Relationship between morphokinetic characteristics, morphological characteristics, and clinical parameters to live birth

A total of 295 euploid blastocysts, including 186 frozen on D5 and 109 frozen on D6, were transferred in 295 FET treatments. The intrauterine pregnancy rate (80.11% vs. 72.48%) and live birth rate (65.59% vs. 58.72%) of transferred D5 blastocysts were higher, but not significantly, than those of D6 blastocysts (P > 0.05) (Table 6). The non-pregnancy rate (19.89% vs. 27.52%) of transferred D5 blastocysts was lower, but not significantly, than that of D6 blastocysts (P > 0.05). The abortion rate (18.12% vs. 18.99%) was nearly the same in D5 and D6 blastocysts (P > 0.05). Live birth was not significantly correlated with t3, t4, t5, tB, KIDScore, female age, clinical indication for PGT-A, Gardner grade of blastocysts, endometrial thickness, or number of embryonic frozen days (Table 7) (P > 0.05). t2 was significantly correlated with live birth (P < 0.05); however, this association was not significant in binary logistic regression (OR 0.856, 95% CI 0.602–1.219, P > 0.05), and ROC analysis of t2 as a predictor of live birth gave an AUC of 0.569. These results showed that t2, t3, t4, t5, tB, KIDScore, female age, clinical indication for PGT-A, Gardner grade, endometrial thickness, and number of embryonic frozen days were not capable of predicting live birth after FET in PGT-A treatments.
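
The comparison of live birth rates between D5 and D6 transfers can be illustrated with a chi-square test on a 2 x 2 table. In the Python sketch below, the live birth counts are back-calculated from the reported percentages and transfer numbers, which is an assumption, and the test is a plain Pearson chi-square without continuity correction, which may differ from the exact procedure applied in SPSS. The function name is illustrative.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p), where p is the upper-tail probability with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Transfers reported in the text; live birth counts are reconstructed (assumption).
d5_transfers, d6_transfers = 186, 109
d5_live = round(0.6559 * d5_transfers)
d6_live = round(0.5872 * d6_transfers)

chi2, p = pearson_chi2_2x2(
    d5_live, d5_transfers - d5_live,
    d6_live, d6_transfers - d6_live,
)
print(f"D5 live births: {d5_live}/{d5_transfers}, D6 live births: {d6_live}/{d6_transfers}")
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

With these reconstructed counts the p-value is above 0.05, in line with the non-significant difference reported in the text.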

Table 6 FET outcomes after transfer of euploid blastocysts in this study.
Table 7 Relationship between characteristics and live birth by Spearman correlation analysis.

Discussion

In the current study, we failed to predict live births using the D5 KIDScore v3.1 model or clinical characteristics in PGT-A treatments, which is inconsistent with previous research12. In fact, live birth was not significantly related to morphological characteristics, morphokinetic characteristics, or clinical parameters after FET of euploid blastocysts, although we set strict limits to minimise individual variation between patients. A model that can effectively predict FET outcomes from basic and objective parameters would certainly be desirable. However, we believe that euploidy is vital to FET outcomes and that good morphokinetic or morphological performance most likely indicates a euploid embryo. The positive results of previous studies suggest that embryos with high developmental potency were selected for FET; however, the predictive power and significance diminished as pregnancy progressed. We consider that many individual factors, such as endometrial conditions, may affect the prediction of FET outcomes. For example, a considerable number of patients with RIF have a shifted window of implantation, which can cause failure even after FET of euploid embryos19,20. Therefore, although previous studies have demonstrated that good morphokinetic performance of embryos can predict implantation, pregnancy, or live birth, we suspect that predicting FET outcomes requires more patient-level parameters rather than the morphokinetic characteristics of embryos alone. Indeed, in the present study, the morphokinetic characteristics of embryos were not significantly related to live birth when euploid blastocysts were transferred. 
We believe that it is more rigorous and objective to create a model using morphokinetic, morphological characteristics, or clinical parameters to predict the euploidy of blastocysts, rather than to predict the FET outcomes such as pregnancy or live birth, which could be more likely to be affected by individual factors.

Here, we used a model combining t2, t3, t5, KIDScore, Gardner grade, female age, and number of embryonic frozen days to effectively predict the euploidy of blastocysts (AUC = 0.879, sensitivity = 0.884, and specificity = 0.700). The criteria for blastocyst selection can be summarised as follows. In non-PGT IVF treatments, we collected t2, t3, t5, KIDScore, Gardner grade, and number of embryonic frozen days of blastocysts from TLM, together with female age, and entered them into our model. Binary logistic regression was then used to calculate the predicted probability value of each blastocyst. When the predicted probability value was equal to or greater than 0.4182, the blastocyst was predicted to be euploid. Quality assurance (QA) was provided by two embryologists during blastocyst development. However, more samples should be included in future studies to confirm the effectiveness of the model and to further optimise it. It is also worth noting that the KIDScore alone was not a sufficient predictor of euploidy in the current study. The KIDScore is a comprehensive judgment based on the characteristics of each embryo during development. Some embryos with poor morphokinetic characteristics in the early stages acquired high morphological grades, resulting in scores similar or even identical to those of well-developed embryos with average morphological grades. In the present study, the combined characteristics model performed better than the KIDScore algorithm (AUC = 0.879 vs. AUC = 0.730, respectively). Therefore, more morphokinetic characteristics must be included in the predictive model.
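
The selection rule described above can be sketched in Python as follows. In a logistic regression, each coefficient is the natural logarithm of the corresponding odds ratio, and the predicted probability is the logistic function of the linear predictor (intercept plus the sum of coefficient times value). The intercept and the numeric coding of each predictor are not reported in this article, so the sketch does not reproduce the fitted model; it only shows the conversion from the reported adjusted ORs to coefficients and the cut-off rule. All function names are illustrative.

```python
import math

# Adjusted odds ratios and cut-off reported in the Results.
ADJUSTED_OR = {
    "t2": 0.742,
    "t3": 0.601,
    "t5": 0.543,
    "KIDScore": 0.646,
    "female age": 0.891,
    "Gardner grade": 1.728,
    "frozen days": 0.341,
}
CUTOFF = 0.4182


def log_odds_coefficients(odds_ratios):
    """Convert odds ratios to logistic regression coefficients (beta = ln OR)."""
    return {name: math.log(value) for name, value in odds_ratios.items()}


def predicted_probability(linear_predictor):
    """Logistic function: probability of euploidy for a given linear predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def is_predicted_euploid(probability, cutoff=CUTOFF):
    """Apply the cut-off rule: euploid if probability >= cutoff."""
    return probability >= cutoff


coefficients = log_odds_coefficients(ADJUSTED_OR)
# Linear predictor at which the predicted probability equals the cut-off.
cutoff_logit = math.log(CUTOFF / (1.0 - CUTOFF))

for name, beta in coefficients.items():
    print(f"{name}: beta = {beta:+.3f}")
print(f"Cut-off on the log-odds scale: {cutoff_logit:+.3f}")
print(is_predicted_euploid(0.55), is_predicted_euploid(0.30))
```

Because the cut-off is below 0.5, a blastocyst is called euploid even when its linear predictor is slightly negative, which trades some specificity for the higher sensitivity reported for the model.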

Euploidy is an important factor affecting the developmental potency of embryos. In IVF treatments, the aneuploidy rate of 2PN embryos is 25–40%, and the rate of aneuploid embryos increases with female age21,22. PGT-A can help exclude aneuploid embryos from FET; however, PGT-A is a restricted medical technology in IVF treatments in China, and only patients who strictly conform to clinical indications, including FAA, RIF, RPL, and STS, can receive PGT-A treatment. Selecting euploid embryos for FET using PGT-A in patients with FAA increased the cumulative pregnancy rate22. Patients with RIF and RPL experience multiple persistent pregnancy failures, and PGT-A can decrease the abortion rate and increase the pregnancy rate in these patients23,24. STS is one of the clinical indications for PGT-A because the aneuploidy rate of sperm from patients diagnosed with STS is increased, resulting in a higher aneuploidy rate in embryos than in patients without STS25. From the development of embryos in the PGT-A treatments of the present study, we found that the clinical parameters, morphokinetics, and morphological characteristics of euploid embryos are worthy of attention and reference, and could aid the selection of euploid embryos in non-PGT IVF treatments.

A mosaic rate of up to 26% among blastocysts has been considered acceptable in PGT-A treatments, and whether mosaic embryos can be transferred remains controversial26,27. A previous study demonstrated that many mosaic diagnoses in blastocysts were caused by the amplification method used in PGT-A, technical aspects of the biopsy, and poor embryo quality27. In fact, many blastocysts diagnosed as mosaic have been proven to be euploid after FET. Embryos with low mosaic levels are recommended for FET when euploid embryos are lacking28. However, FET of mosaic embryos is not common in China, and in most IVF clinics mosaic embryos are treated in the same way as aneuploid embryos when selecting embryos for FET. As a result, blastocysts with good developmental potency have been excluded from FET in PGT-A treatments because they were diagnosed as mosaic. More studies on the FET outcomes of mosaic embryos are needed.

In this study, we reported the predictive effect of female age, clinical indication for PGT, number of embryonic frozen days, morphological characteristics, and morphokinetic characteristics on the euploidy of blastocysts and on live birth; these results may differ between IVF clinics9. Using data from our own IVF laboratory, we found that the model we built was effective in predicting the euploidy of blastocysts but failed to predict live birth after FET. However, the present study had some limitations. Since this was a retrospective study including 1,396 blastocysts from a single IVF clinic, more samples from different clinics are required to further optimise this model. A prospective randomised study is needed to confirm the relevance of morphokinetics, morphological characteristics, and clinical parameters to the euploidy of blastocysts. Although rigorous QA and statistical analysis were performed, and the AI-based model assessed the euploidy of blastocysts by a mathematical method, the parameter adjustments made by embryologists were subjective, which is difficult to avoid. An AI-based model can quantify the developmental potency of embryos, but reality is more complicated than the model. Here, although the model worked well, its predictions served only as a reference for embryologists. Our study provides non-PGT IVF treatments with a noninvasive method to predict euploid blastocysts, but more samples are required to verify the effectiveness of this model.

Methods

Patients

This study included a total of 403 patients with RIF (n = 83), RPL (n = 184), STS (n = 83), or FAA (n = 53) who received PGT-A treatment at the Reproductive Medicine Centre of Xuzhou Maternal and Child Health Care Hospital from January 2019 to January 2022. None of the patients had chromosomal karyotype abnormalities. This study was approved by the Ethics Committee of Reproductive Medicine of Xuzhou Maternal and Child Health Care Hospital, and all patients received detailed genetic counselling and signed the relevant informed consent forms.

Procedure

All methods were performed in accordance with the relevant guidelines and regulations. Controlled ovarian hyperstimulation (COH) was performed using long-acting or antagonist protocols. Human chorionic gonadotropin (HCG) was injected when the largest follicle reached a diameter of 18 mm or two follicles reached a diameter of 16 mm. Transvaginal ovum pick-up (OPU) was guided by B-mode ultrasound. Intracytoplasmic sperm injection (ICSI) was performed for all PGT-A treatments. Embryonic culture and morphological observations were performed using the G-TL culture system (Vitrolife, Sweden) and a time-lapse monitoring (TLM) system (Vitrolife). During embryo development, the TLM system provided the times to reach t2, t3, t4, t5, and tB using the D5 KIDScore v3.1 model. For quality assurance (QA), two embryologists reviewed and adjusted these parameters after 5–6 days of embryonic culture post fertilisation. Morphological grading of blastocysts (according to the Gardner blastocyst grading standard) was likewise performed with the D5 KIDScore v3.1 model, and the grades were adjusted by two embryologists. Blastocysts graded BC or CB and above were biopsied, and PGT-A was used for genetic testing of the biopsy products. Euploid blastocysts were prepared for FET according to the PGT results. Clinical pregnancy was defined as blood HCG > 20 mIU/mL 14 days after transfer. Intrauterine pregnancy was defined as a gestational sac (GS) detected by B-mode ultrasound 28 days after blastocyst transfer. At 16–20 weeks of gestation, amniocentesis was performed to analyse foetal chromosome euploidy and karyotypes. Follow-up visits were performed until the end of the pregnancy or live birth.

Statistical analysis

SPSS 19.0 software (SPSS Inc., Chicago, IL, USA) was used for statistical analysis. The Student’s t-test was used to analyse the differences in female age, KIDScore, t2, t3, t4, t5, and tB among the groups. The chi-square test was used to analyse the differences in euploidy, mosaic, aneuploidy, non-pregnancy, intrauterine pregnancy, abortion, and live birth rates among the groups. Spearman’s correlation analysis was used to analyse the relationship between the characteristics or parameters and euploidy or live birth. Binary logistic regression was used to estimate odds ratios (ORs) with 95% confidence intervals (CIs) and the predicted probability value of each blastocyst. ROC curves and the AUC were used to evaluate how effectively the characteristics or parameters predicted euploidy. Statistical significance was set at P < 0.05.
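
To make the ROC-based evaluation concrete, the following Python sketch computes an AUC as the probability that a randomly chosen euploid blastocyst receives a higher score than a randomly chosen non-euploid one (ties counted as one half), and then scans candidate cut-offs for the one that maximises the Youden index. The scores and labels are hypothetical toy values, not study data, and the function names are illustrative; the analysis in this study was performed in SPSS.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_youden_cutoff(scores, labels):
    """Return (cutoff, J) maximising sensitivity + specificity - 1.

    A sample is called positive when its score is >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        j = tp / n_pos + tn / n_neg - 1.0
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Hypothetical predicted probabilities and true ploidy (1 = euploid).
probabilities = [0.90, 0.70, 0.50, 0.45, 0.60, 0.20, 0.10]
euploid = [1, 1, 1, 1, 0, 0, 0]

auc = roc_auc(probabilities, euploid)
cutoff, j = best_youden_cutoff(probabilities, euploid)
print(f"AUC = {auc:.3f}, best cut-off = {cutoff}, Youden J = {j:.3f}")
```

The pairwise definition is quadratic in the number of samples, which is acceptable for a sketch; rank-based formulas give the same AUC more efficiently on large datasets.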