Background

Accurately determining the normal range of early pregnancy markers can help to predict adverse pregnancy outcomes, such as miscarriage. It is also useful for determining the number of foetuses and their viability, the type of twinning, and the presence of gross foetal abnormalities, placental problems, and uterine or adnexal problems. Some studies have constructed reference intervals that mostly depend on natural conceptions in women with regular menstrual cycles and known dates of their last menstrual periods (LMPs) [1,2,3]. However, a discrepancy of more than 7 days between gestation calculated by menstrual history and by ultrasound was found in approximately 15% of women with regular menstrual cycles and specific LMP dates, owing to variance in the day of ovulation [4]. Thus, ultrasound measurements of embryonic and foetal crown–rump length (CRL) are useful to estimate gestational age (GA) in early pregnancy [5, 6], and the classic Robinson curve is the most common method [7]. However, some researchers have shown that the Robinson curve generally underestimates GA [8, 9]. These findings leave the accuracy of reference intervals derived from natural pregnancies uncertain.

During in vitro fertilization-embryo transfer (IVF-ET), the days of oocyte retrieval and ET are known; thus, the GA estimation is accurate. We speculate that reference intervals derived from IVF-ET data are more accurate than those derived from natural conception and are more suitable for IVF populations. With the rapid development of assisted reproductive technology, especially after the implementation of the two-child policy in mainland China, more infertile couples conceive with this treatment [10, 11]. However, no research has focused on constructing reference intervals for 4 ultrasound indicators of early pregnancy following IVF-ET or has specifically targeted the Chinese population.

This study analysed data from a large cohort of 30,416 singleton pregnancies with normal outcomes from a Chinese population, aiming to construct reference intervals for gestational sac diameter (GSD), yolk sac diameter (YSD), heart rate (HR) and CRL at 6–10 gestational weeks (GW) following IVF-ET. The optimal models for predicting GSD, CRL, YSD and HR based on GA were also analysed.

Methods

Patients

The institutional review board approved this study before data collection (LL-SC-2019-015). The study was conducted using an anonymised patient dataset for research purposes, in agreement with the Declaration of Helsinki. The STROBE guidelines were followed for reporting this observational study [12]. This retrospective study involved 30,416 singleton pregnancies following IVF-ET at the Reproductive and Genetic Hospital of CITIC Xiangya (Changsha, China) from January 2010 to December 2016 (Fig. 1). To create models that are applicable to more patients, the study population was minimally selected. The age of women in the studied population was up to 45 years. Due to the retrospective nature of this study, informed consent was waived. The insemination methods included IVF, intracytoplasmic sperm injection (ICSI), IVF/ICSI, and preimplantation genetic diagnosis (PGD). In these patients, 1–3 good-quality fresh or frozen embryos were transferred at the day-3 or day-5 stage; the embryo scoring method was described in our previous studies [13, 14]. Serum human chorionic gonadotropin (hCG) levels were measured on day 14 (day 12 for blastocysts), and transvaginal scans were usually performed in the first trimester to confirm clinical pregnancy.

Fig. 1

Flow chart of patient inclusion. IVF-ET, in vitro fertilization-embryo transfer; MA, maternal age; GA, gestational age

Pregnancy and perinatal outcomes were tracked by a dedicated team at our centre via telephone call or fax. The inclusion criteria were as follows: (1) live singleton pregnancy following IVF-ET; (2) embryonic GSD, YSD and HR were fully measured and recorded at 6–10 GW; and (3) live birth after 37 GW of a phenotypically normal neonate with a birth weight > 5th percentile for GA [15]. All enrolled women were informed, before ultrasound examinations were performed, of the possibility that their first-trimester ultrasound records would be used to construct reference intervals.

Ultrasound measurements

We collected the first ultrasound examination results from each patient during 6–10 GW. The ultrasound scans were performed by 4 experienced sonographers with a GE VOLUSON E8/730 (General Electric Tech Co., Ltd., New York, USA) equipped with a 5–9 MHz transvaginal probe. The measurements followed the ISUOG practice guidelines [16] and conformed to uniform standards: GSD was calculated as the mean of 3 perpendicular diameters with the callipers placed at the inner edges of the trophoblast; YSD was calculated as the mean of 3 perpendicular diameters with the callipers placed at the centre of the yolk sac (YS) wall; CRL was measured as the greatest length of the embryo in the anterior to posterior dimension; and HR was calculated from frozen M-mode images with electronic callipers by measuring the distance between two heart waves.

Intra- and interobserver reliability of measurements was tested on a random selection of 30 pregnancies at day 28 after ET. Each observer performed two measurements of GSD, YSD, CRL and HR on separate occasions and was unaware of the others’ results. Written informed consent was obtained from all test patients before ultrasound scanning. The reference intervals were analysed according to gestational days (GD) and GW. The GD was obtained by adding 17 to the number of days after ET for cleavage-stage embryos or 19 for blastocysts (Day 5 or Day 6), and the corresponding GW was obtained by dividing the GD by 7 [14]. The same calculation was used for fresh and frozen embryos.
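The GD and GW calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names and the stage labels "cleavage" and "blastocyst" are hypothetical, and the offsets of 17 and 19 days are the ones stated in the text.

```python
# Offsets taken from the text: 17 days for cleavage-stage embryos,
# 19 days for blastocysts (Day 5 or Day 6). The stage labels are hypothetical.
GD_OFFSET = {"cleavage": 17, "blastocyst": 19}


def gestational_days(days_after_et: int, stage: str) -> int:
    """Return gestational days (GD) for a scan done `days_after_et` days after ET."""
    if stage not in GD_OFFSET:
        raise ValueError(f"unknown embryo stage: {stage!r}")
    if days_after_et < 0:
        raise ValueError("days_after_et must be non-negative")
    return days_after_et + GD_OFFSET[stage]


def gestational_weeks(gd: int) -> float:
    """Return gestational weeks (GW) as GD divided by 7."""
    return gd / 7


# A scan on day 28 after transfer of a cleavage-stage embryo
gd = gestational_days(28, "cleavage")
print(gd, round(gestational_weeks(gd), 2))  # 45 6.43
```

For a scan on day 28 after transfer of a cleavage-stage embryo, this gives 45 GD, or about 6.43 GW.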

Statistical analysis

All statistical analyses were performed using SPSS software version 17.0 (SPSS, Inc., Chicago, IL, USA). Measurements are presented as the mean ± standard deviation (SD), and the enumerated data are expressed as numbers (percentages). The curve-fitting method was used to screen candidate models for predicting GSD, CRL, YSD and HR based on GD and GW, and the model with the largest coefficient of determination (R²) was selected as the optimal model. Additionally, the percentile method was used to calculate the 5th, 50th, and 95th percentiles for each time point. Scatter plots of GSD, CRL, YSD and HR against GD and GW were obtained. Correlation coefficients were calculated to analyse the intra- and interobserver reliability. A p value < 0.05 was considered significant.
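The two steps of this analysis, choosing the model with the largest R² and reading off percentiles per time point, can be illustrated with a minimal NumPy sketch. It is not the SPSS procedure used in the study: it assumes that the candidate models are polynomials of degree 1 and 2, it runs on synthetic data generated inside the block, and all function and variable names are hypothetical.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def best_polynomial(x, y, degrees=(1, 2)):
    """Fit each candidate degree by least squares; keep the largest R^2."""
    best = None
    for degree in degrees:
        coeffs = np.polyfit(x, y, degree)
        r2 = r_squared(y, np.polyval(coeffs, x))
        if best is None or r2 > best[2]:
            best = (degree, coeffs, r2)
    return best


def percentiles_by_day(x, y, qs=(5, 50, 95)):
    """Empirical percentiles of y for every distinct value of x."""
    return {int(day): np.percentile(y[x == day], qs) for day in np.unique(x)}


# Synthetic data for illustration only (not study data):
# a quadratic trend in gestational days plus Gaussian noise.
rng = np.random.default_rng(0)
gd = rng.integers(42, 71, size=2000)
crl = 0.01 * (gd - 30) ** 2 + rng.normal(0.0, 1.0, size=gd.size)

degree, coeffs, r2 = best_polynomial(gd, crl)
table = percentiles_by_day(gd, crl)
print(degree, round(r2, 3))
print(table[45])  # 5th, 50th and 95th percentiles at 45 GD
```

One caveat with this selection rule: on the same data, a higher-degree polynomial never has a smaller R² than a lower-degree one, so a practical variant might prefer the simpler model when the gain in R² is negligible. The sketch does not do this.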

Results

From January 2010 to December 2016, 100,718 infertile patients achieved clinical pregnancy via IVF-ET in our hospital. After data exclusion, a total of 30,416 singleton pregnancies with normal outcomes were included in this study. The clinical characteristics of the study population are shown in Table 1. The measurements of GSD, CRL, YSD, and HR showed significant intra- and interobserver correlations (p < 0.001).

Table 1 Clinical characteristics of the study population

Gestational sac diameter

There was a significant linear association between GSD and GA. The best fit models were as follows: GSD = −29.180 + 1.070 GD (R² = 0.796, p < 0.001) and GSD = −29.180 + 7.492 GW (R² = 0.796, p < 0.001). Figure 2 shows scatter plots with the 5th, 50th, 95th percentiles of GSD against GD.

Fig. 2

Scatter plots with the 5th, 50th, 95th percentiles of gestational sac diameter (GSD) against gestational days (GD)

Crown–rump length

There was a significant quadratic association between CRL and GA. The most appropriate fit models were as follows: CRL = −11.960 − 0.147 GD + 0.011 GD² (R² = 0.976, p < 0.01) and CRL = −11.960 − 1.028 GW + 0.535 GW² (R² = 0.976, p < 0.001). Figure 3 shows scatter plots with the 5th, 50th, 95th percentiles of CRL against GD.

Fig. 3

Scatter plots with the 5th, 50th, 95th percentiles of crown–rump length (CRL) against gestational days (GD)

Yolk sac diameter

A significant association between YSD and GA was found. The following quadratic models showed the most appropriate fit: YSD = −2.304 + 0.184 GD − 0.001 GD² (R² = 0.500, p < 0.01) and YSD = −2.304 + 1.288 GW − 0.054 GW² (R² = 0.500, p < 0.001). Scatter plots with the 5th, 50th, 95th percentiles of YSD against GD are presented in Fig. 4.

Fig. 4

Scatter plots with the 5th, 50th, 95th percentiles of yolk sac diameter (YSD) against gestational days (GD)

Heart rate

A significant association between HR and GA was found. The following quadratic models showed the best fit: HR = −350.410 + 15.398 GD − 0.112 GD² (R² = 0.911, p < 0.001) and HR = −350.410 + 107.788 GW − 5.488 GW² (R² = 0.911, p < 0.001). Scatter plots with the 5th, 50th, 95th percentiles of HR against GD are presented in Fig. 5.

Fig. 5

Scatter plots with the 5th, 50th, 95th percentiles of heart rate (HR) against gestational days (GD)

Additionally, the details of the reference intervals for GSD, CRL, YSD and HR based on GD and GW are shown in Tables 2 and 3, respectively.

Table 2 Reference intervals for GSD, YSD, CRL and HR based on GD
Table 3 Reference intervals for GSD, YSD, CRL and HR based on GW

Discussion

In this study, we constructed reference intervals for GSD, YSD, CRL, and HR at 6–10 GW for an IVF population with a large sample of Chinese women. The optimal models for predicting GSD, CRL, YSD and HR based on GA were also presented.

A high proportion of caesarean section (CS) deliveries is noted in Table 1. The CS rate in China has been estimated at approximately 50% of births [11, 17], but it was as high as 73.2% in this study. Several factors might have contributed to the high CS rate in the IVF population: the babies were conceived via IVF; the implementation of the two-child policy in China has led to an increase in the number of pregnancies at advanced maternal age; and over half of older mothers underwent CS for their first delivery [11].

Optimal models for predicting GSD, CRL, YSD and HR based on GA were established. GSD increased linearly with GA, whereas CRL, YSD, and HR had significant quadratic associations with GA. These models can be conveniently used in clinical practice to calculate the expected values of GSD, CRL, YSD and HR for a given GA. However, the YSD models showed a lower R² (0.500 for both GD and GW) than the other models, suggesting that the prediction models explain only 50% of the variance in YSD; thus, factors other than GA remain to be explored.
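As an illustration of how the fitted models might be applied, the following Python sketch evaluates the four GW-based equations reported in the Results at a given GA and locates the vertex of a quadratic model. The coefficients are copied from the Results; the function names and the dictionary layout are hypothetical, the units are presumably millimetres for GSD, CRL and YSD and beats per minute for HR, and the models were fitted only for 6–10 GW, so the sketch rejects values outside that range.

```python
# (intercept, linear, quadratic) coefficients of the GW-based models
# reported in the Results; GSD is linear, so its quadratic term is 0.
GW_MODELS = {
    "GSD": (-29.180, 7.492, 0.0),
    "CRL": (-11.960, -1.028, 0.535),
    "YSD": (-2.304, 1.288, -0.054),
    "HR": (-350.410, 107.788, -5.488),
}


def predict(marker: str, gw: float) -> float:
    """Expected value of `marker` at `gw` gestational weeks (6-10 GW only)."""
    if not 6.0 <= gw <= 10.0:
        raise ValueError("models were fitted for 6-10 GW only")
    a, b, c = GW_MODELS[marker]
    return a + b * gw + c * gw ** 2


def quadratic_vertex(marker: str) -> tuple:
    """GW at which a quadratic model peaks (or bottoms out), and its value."""
    a, b, c = GW_MODELS[marker]
    if c == 0:
        raise ValueError(f"{marker} model is linear and has no vertex")
    gw = -b / (2 * c)
    return gw, a + b * gw + c * gw ** 2


for marker in GW_MODELS:
    print(marker, round(predict(marker, 7.0), 1))

peak_gw, peak_hr = quadratic_vertex("HR")
print(round(peak_gw, 2), round(peak_hr, 1))
```

At 7 GW the models give approximately 23.3 for GSD, 7.1 for CRL, 4.1 for YSD and 135.2 for HR. The vertex of the fitted HR curve lies at about 9.8 GW, in the same region as the peak at 9 GW seen in the reference intervals; a model vertex and an empirical peak need not coincide exactly.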

The reference intervals for GSD, YSD, HR and CRL at 6–10 GW were constructed from a large sample in this study. These data can provide clinicians with a reliable reference to analyse the development of early embryos after IVF-ET and facilitate monitoring of pregnancy outcomes at an early stage. GSD, YSD and CRL were found to gradually increase from 6 to 10 GW. In contrast, HR increased from 6 GW, reached a peak at 9 GW (176.0 bpm) and decreased thereafter. This trend in HR was consistent with the results of previous studies [18, 19] and may be due to the development of the embryonic heart and its conductive system [20].

For comparison with previous studies, we performed a literature search of PubMed, and representative literature is listed in Table 4 [2, 5, 7, 21,22,23,24,25,26]. Most previous studies were conducted between the 1990s and 2000s and had small sample sizes of subjects with spontaneous conception or a mixed population. The most obvious difference between our study and previous studies was the CRL at early GA. In the studies by Grisolia et al. [22] and McLennan et al. [26], the CRL at day 45 was 7 mm; however, the CRL was 3.4 mm in our study. Both these studies used dating models among spontaneous conception or mixed populations to calculate GA according to CRL. Some researchers have suggested that the use of assisted reproduction data can improve dating accuracy; however, the accuracy is limited before 7 GW and is equivocal for menstrual dating beyond that GA [26], which may partly explain the considerable differences in CRL at day 45 between our study and previous studies. Additionally, CRL has been reported to overestimate gestation [27], and using CRL to determine GA has been reported to be less accurate than GA estimated by a certain LMP or day of oocyte retrieval in early pregnancy [28]; therefore, the CRL corresponding to the calculated GA is longer than the CRL of the same GA in IVF populations.

Table 4 Reference values for GSD, YSD, CRL and HR in previous studies and the present study

The most popular formula for pregnancy dating originated from the study by Robinson and Fleming [7], and several studies proposing different dating equations have been reported since then. The use of different formulas can lead to discrepancies in GA estimation and corresponding differences in GSD, CRL, YSD and HR. In addition, different measurement methods may also lead to differences in ultrasound indicators. For example, when measuring YSD, some researchers prefer to place the callipers on the outer limits of the YS wall [29], while others place them on the inner limits of the YS wall [30]. The measurements in the study by Robinson and Fleming [7] were obtained transabdominally and might not be the same as measurements obtained transvaginally. Furthermore, the values in some articles were presented as means [23,24,25], while they were reported as medians in other studies [5, 22], which may also partly explain these differences.

Our study has several strengths. The large sample size allowed us to establish specific reference intervals and construct optimal models for GSD, CRL, YSD and HR for IVF populations, which may be helpful for accurately analysing and monitoring the development of early pregnancy following IVF-ET. However, the study also has limitations. Firstly, all the data were confined to one reproductive centre; although it is the largest centre in China, territorial limitations exist. Future studies with multi-centre samples are necessary to establish nationwide or worldwide references. Secondly, although the total sample was quite large, the patients were unevenly distributed. The largest group of patients underwent their first ultrasound on day 28 after ET (45 GD, n = 12,687), and far fewer patients underwent ultrasound on other days, particularly on later days. It is impractical to perform ultrasound for each patient every day to evenly distribute the sample; therefore, future studies are needed to verify our reference intervals. Thirdly, we did not compare normal data with abnormal outcomes; examining whether these measurements can function as prognostic factors for abnormalities would be interesting future work. Fourthly, since we collected the data retrospectively from the hospital database, some baseline data, such as pharmacological treatments, parity, significant maternal diseases and smoking status, were missing.

In addition, the transfer records captured only whether embryos were fresh or frozen and the day of transfer, not blastocyst transfer as such, so we were unable to further analyse the results of blastocyst transfer. Previous studies have found a lower uterine artery pulsatility index and a lower proportion of small-for-gestational-age (SGA) neonates [31], as well as decreased risks of preterm birth and low birth weight but higher risks of large-for-GA babies and hypertensive disorders of pregnancy, in pregnancies conceived from frozen embryos compared with fresh transfer [32]. The difference between fresh and frozen embryos needs to be further confirmed by our follow-up studies. Another potential weakness is that IVF pregnancy may not be biologically equivalent to spontaneous conception, as increased risks of obstetric and perinatal complications have been shown for IVF pregnancies [33,34,35]. Thus, whether references based on an IVF population are suitable for natural conceptions needs further elucidation.

Conclusions

In conclusion, this study, which involved a large number of normal pregnancies, presents reference intervals for GSD, CRL, YSD and HR at 6–10 GW. These data can be used as reliable references for analysing the development of early embryos after IVF-ET and for monitoring pregnancy outcomes at early stages.