Background

Thanks to the development of assisted reproductive technology and delayed childbearing, the incidence of twin pregnancies has risen steadily over the last four decades. The twinning rate is now estimated at around 2–3% of all pregnancies [1, 2]. Twin pregnancies are at higher risk of multiple adverse perinatal outcomes than singleton pregnancies, mainly due to prematurity and/or fetal growth restriction (FGR) [3, 4]. Thus, identifying fetuses with growth restriction is crucial in the prenatal care of twin pregnancies.

It is common clinical practice to evaluate twin growth status using a fetal growth chart that was developed for singleton pregnancies. Twin and singleton fetuses may follow similar growth patterns during the first and second trimesters [5], but the growth of twins slows down in the third trimester: the growth curves of twin and singleton pregnancies diverge significantly after 28–32 weeks of gestation, and the difference widens with advancing gestation [6,7,8]. Whether the growth difference between singletons and twins is a pathological consequence (a real growth problem) or a physiological adaptation remains controversial. However, it is now well recognized that the growth of twins lags behind that of singletons, especially in late gestation. Therefore, using singleton standards for twins may classify more fetuses as small for gestational age (SGA), especially in late gestation, leading to over-monitoring and overtreatment and increasing the medical burden and costs. It is now widely acknowledged that singletons and twins need separate growth charts to assess their growth accurately [9].

Some studies have tried to establish fetal growth charts for twins from population-based birthweight data [7, 10,11,12]. However, as infants born prematurely are more likely to be growth restricted than fetuses who remain in utero at the same gestational age [13], a birthweight-based chart would underestimate the proportion of FGR before term. In the past decade, several fetal growth charts for twins based on ultrasonographic measurements have been created, some of which were stratified by chorionicity [8, 14,15,16,17].

At the same time, Zhang et al. proposed a method to develop an adjustable fetal weight standard for twins [18]. It adopts Gardosi’s proportionality principle [19] and assumes that the standard deviation is a constant fraction of the mean weight throughout gestation [20]. Under this approach, by anchoring to the mean birth weight and standard deviation at a specific gestational age (i.e. 37.5 weeks), the corresponding percentiles at each gestational age can be calculated from a normal distribution following Hadlock’s formula [21]. To date, no ultrasound-based growth chart has been built specifically for Chinese twins. In addition, the effectiveness of Zhang’s method needs to be validated.

Our study aimed to construct a fetal growth chart for Chinese twins based on ultrasound biometric measurements, and to compare it with Zhang’s chart and other twin fetal growth charts for validation [8, 14,15,16,17].

Methods

Population

This study used data from a prospective study on preeclampsia screening in twin pregnancies. A total of 1475 women were approached and 1225 were enrolled between 11 weeks 0 days and 13 weeks 6 days of gestation and followed to delivery or end of pregnancy at the Shanghai First Maternity and Infant Hospital in 2014–2017 [22]. At enrollment, an ultrasound scan was conducted for each twin. Ultrasound-estimated gestational age (Us-GA) was calculated from the fetal crown-rump length (CRL) of the larger twin using the formula by Robinson and Fleming: Us-GA (in exact weeks) = \((8.052 \times \sqrt{\mathrm{CRL}} + 23.73)/7\) [23]. Chorionicity was determined by the presence of the T sign (monochorionic diamniotic, MCDA) or the λ sign (dichorionic diamniotic, DCDA) at the junction of the intertwin membrane with the placenta. Pregnancies with uncertain chorionicity were not eligible for the preeclampsia screening study. Written informed consent was obtained from all participants.
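As a minimal sketch of the dating formula above, the following Python functions convert a crown-rump length into gestational age. The sketch assumes that CRL is given in millimetres (the unit is not stated in the text), and the function names are illustrative only.

```python
import math


def us_ga_weeks(crl_mm: float) -> float:
    """Gestational age in exact weeks from crown-rump length.

    Implements (8.052 * sqrt(CRL) + 23.73) / 7 as quoted in the text.
    Assumption: crl_mm is in millimetres, so the numerator is in days.
    """
    if crl_mm <= 0:
        raise ValueError("CRL must be positive")
    ga_days = 8.052 * math.sqrt(crl_mm) + 23.73
    return ga_days / 7.0


def weeks_and_days(ga_weeks: float) -> tuple:
    """Split exact weeks into (completed weeks, completed days)."""
    total_days = int(ga_weeks * 7)  # truncate to completed days
    return total_days // 7, total_days % 7


if __name__ == "__main__":
    ga = us_ga_weeks(60.0)  # hypothetical CRL of 60 mm
    print(round(ga, 2), weeks_and_days(ga))
```

Under these assumptions, a CRL of 60 mm corresponds to roughly 12 weeks 2 days, which lies within the enrollment window of 11 weeks 0 days to 13 weeks 6 days.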

In the present study, we first excluded pregnancies in which Us-GA and last menstrual period-based gestational age (LMP-GA) did not match (n = 32), i.e. in which the difference between Us-GA and LMP-GA was: 1) more than 6 days for gestation estimates between 11 weeks 0 days and 12 weeks 6 days; or 2) more than 7 days for estimates between 13 weeks 0 days and 13 weeks 6 days. For pregnancies conceived by in vitro fertilization (IVF), the last menstrual period (LMP) was calculated as the date of transfer minus 14 days and minus the embryo age at transfer. LMP-GA was used as the gestational age in all analyses.
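The dating rules above can be sketched as follows. This is an illustration under stated assumptions rather than the study's actual code: the function names are hypothetical, and the tolerance window is chosen from Us-GA because the text does not say which estimate defines the window.

```python
from datetime import date, timedelta


def ivf_lmp(transfer_date: date, embryo_age_days: int) -> date:
    """Equivalent LMP for IVF: transfer date minus 14 days minus embryo age."""
    return transfer_date - timedelta(days=14 + embryo_age_days)


def ga_days_from_lmp(lmp: date, scan_date: date) -> int:
    """LMP-based gestational age in days on the scan date."""
    return (scan_date - lmp).days


def ga_unmatched(us_ga_days: float, lmp_ga_days: float) -> bool:
    """True if Us-GA and LMP-GA disagree beyond the stated tolerance.

    Assumption: the window (11+0 to 12+6, or 13+0 to 13+6) is selected
    from Us-GA; the text does not specify this.
    """
    diff = abs(us_ga_days - lmp_ga_days)
    if 77 <= us_ga_days < 91:   # 11 weeks 0 days to 12 weeks 6 days
        return diff > 6
    if 91 <= us_ga_days < 98:   # 13 weeks 0 days to 13 weeks 6 days
        return diff > 7
    raise ValueError("Us-GA outside the 11+0 to 13+6 week enrollment window")


if __name__ == "__main__":
    lmp = ivf_lmp(date(2015, 3, 20), embryo_age_days=5)  # hypothetical day-5 transfer
    lmp_ga = ga_days_from_lmp(lmp, date(2015, 5, 24))
    print(lmp, lmp_ga, ga_unmatched(88, lmp_ga))
```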

We further excluded pregnancies: 1) with monochorionic monoamniotic twins (n = 2); 2) with maternal age < 20 or > 35 years (n = 191); 3) with fetal chromosomal or major structural abnormality reported during pregnancy or after delivery (n = 59); 4) with crown-rump length discordance > 10%, or nuchal translucency ≥ 3.5 mm in either twin (n = 124); 5) with complications including but not limited to hypertensive disorders (including preeclampsia), diabetes, twin-twin transfusion syndrome (TTTS), and selective intrauterine growth restriction (sIUGR, defined as estimated fetal weight < 10th percentile in the smaller fetus and weight discordance ≥ 25% between the two fetuses) (n = 219); 6) undergoing fetal reduction (n = 22); 7) delivered before 32 weeks (n = 13); 8) ending in miscarriage, termination, or fetal death in either twin (n = 52); or 9) lost to follow-up (n = 50) or having no data on ultrasound biometric measurements (n = 63). In this way, we aimed to select only healthy women with conditions favorable for optimal fetal growth and only healthy fetuses considered to have optimal growth, in order to construct an optimal growth standard for twin fetuses. The flow chart for the study population is presented in Fig. 1.
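The discordance-based criteria above (CRL discordance > 10% and the sIUGR definition) can be illustrated with a short sketch. It assumes that discordance is expressed as a percentage of the larger twin's value, which is the usual convention but is not spelled out in the text; `p10_g` is a hypothetical 10th-percentile weight taken from whichever reference chart is in use.

```python
def discordance_pct(a: float, b: float) -> float:
    """Inter-twin discordance as a percentage of the larger twin's value.

    Assumption: (larger - smaller) / larger * 100.
    """
    larger, smaller = max(a, b), min(a, b)
    if smaller <= 0:
        raise ValueError("measurements must be positive")
    return (larger - smaller) / larger * 100.0


def crl_discordant(crl_a_mm: float, crl_b_mm: float) -> bool:
    """Exclusion criterion: crown-rump length discordance > 10%."""
    return discordance_pct(crl_a_mm, crl_b_mm) > 10.0


def is_siugr(efw_a_g: float, efw_b_g: float, p10_g: float) -> bool:
    """sIUGR: smaller twin's EFW < 10th percentile and discordance >= 25%."""
    smaller = min(efw_a_g, efw_b_g)
    return smaller < p10_g and discordance_pct(efw_a_g, efw_b_g) >= 25.0


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(crl_discordant(60.0, 55.0))          # about 8.3% discordance
    print(is_siugr(2000.0, 1500.0, 1600.0))    # 25% discordance, below p10
```

Both conditions must hold for sIUGR in this sketch, so a pair with 25% discordance whose smaller twin is still above the 10th percentile is not flagged.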

Fig. 1 Flow chart for the study population

Maternal characteristics and birth weight

Maternal characteristics and medical history were recorded, including maternal age, weight, height, parity (nulliparous or parous), and method of conception (spontaneous conception, ovulation induction, or in vitro fertilization). The birth weight of the twins was measured with an electronic baby scale and recorded immediately after birth.

Ultrasound biometric measurements

Transabdominal ultrasound scans for fetal biometric measurements were conducted by 3 certified sonographers in the Department of Fetal Medicine at the Shanghai First Maternity and Infant Hospital, who were specially trained and experienced in obstetrical and fetal ultrasonography. All scans were performed on Voluson E8 machines (GE Healthcare Ultrasound, Milwaukee, WI, USA). At the first scan, twin A and twin B were labeled using the placental site, fetal position (up or down; right or left), and cord insertion. For each fetus, biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC), and femur length (FL) were measured according to the ISUOG Guideline [24]. Each biometric index was measured twice, and the average was calculated. Estimated fetal weight (EFW) was calculated from the ultrasound biometric parameters using Hadlock formula IV: \(\log_{10}(\mathrm{weight}) = 1.3596 - 0.00386 \times \mathrm{AC} \times \mathrm{FL} + 0.0064 \times \mathrm{HC} + 0.00061 \times \mathrm{BPD} \times \mathrm{AC} + 0.0424 \times \mathrm{AC} + 0.174 \times \mathrm{FL}\) [25]. Measurements were excluded if the EFW was implausible, defined as more than 5 standard deviations from the mean.
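The EFW calculation described above can be sketched in a few lines of Python. The sketch assumes that all four biometric measurements are in centimetres and that the resulting weight is in grams (the usual convention for Hadlock's formulas, though not stated in the text). The helper names, and the idea of passing in a reference mean and SD for the plausibility check, are illustrative assumptions.

```python
def mean_of_two(first: float, second: float) -> float:
    """Average of the two repeated measurements of one biometric index."""
    return (first + second) / 2.0


def hadlock_iv_efw(bpd_cm: float, hc_cm: float, ac_cm: float, fl_cm: float) -> float:
    """Estimated fetal weight from Hadlock formula IV as quoted in the text.

    Assumption: inputs in centimetres, output in grams.
    """
    log10_weight = (
        1.3596
        - 0.00386 * ac_cm * fl_cm
        + 0.0064 * hc_cm
        + 0.00061 * bpd_cm * ac_cm
        + 0.0424 * ac_cm
        + 0.174 * fl_cm
    )
    return 10.0 ** log10_weight


def is_plausible(efw_g: float, ref_mean_g: float, ref_sd_g: float) -> bool:
    """Keep a measurement only if it lies within 5 SD of the reference mean."""
    return abs(efw_g - ref_mean_g) <= 5.0 * ref_sd_g


if __name__ == "__main__":
    # Hypothetical third-trimester measurements, each taken twice.
    ac = mean_of_two(25.9, 26.1)
    print(round(hadlock_iv_efw(bpd_cm=8.0, hc_cm=29.0, ac_cm=ac, fl_cm=6.0)))
```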

Statistical analysis

Smoothed estimates of the fetal growth chart and percentiles for both monochorionic (MC) and dichorionic (DC) twins were obtained using linear mixed models, which can account for the dependency of the data, including the clustering of twins and serial measurements on the same fetus. The EFW measurements were log-transformed to ensure homoscedasticity across gestational age and a normal distribution of EFW at each gestational age. We included random effects for both the mother (twin-pair level) and the individual fetus (serial EFW measurements, fetus level). In the modeling procedure, we tested models of log-transformed EFW on gestational age, gestational age squared, and gestational age cubed. The best-fitting model was selected based on the lowest Akaike information criterion (AIC) value and residual standard error (Supplementary Table S1). Finally, the model of log-transformed EFW on gestational age and gestational age squared was selected. Scatter plots of log-transformed EFW against gestational age are shown in Fig. 2 (a, MC twins; b, DC twins).
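For orientation, the selected model can be written out explicitly. The notation below is an illustrative assumption: the text does not give the model equation, and the sketch assumes random intercepts only at the twin-pair and fetus levels, with i indexing the mother (twin pair), j the fetus within the pair, and k the scan.

```latex
\[
\log(\mathrm{EFW}_{ijk}) = \beta_0 + \beta_1\,\mathrm{GA}_{ijk} + \beta_2\,\mathrm{GA}_{ijk}^{2}
  + u_i + v_{ij} + \varepsilon_{ijk}
\]
\[
u_i \sim N(0, \sigma_u^{2}), \qquad
v_{ij} \sim N(0, \sigma_v^{2}), \qquad
\varepsilon_{ijk} \sim N(0, \sigma_\varepsilon^{2})
\]
\[
\mathrm{AIC} = 2p - 2\ln\hat{L}
\]
```

Here p is the number of estimated parameters and \(\hat{L}\) the maximized likelihood. The cubic candidate model adds a term \(\beta_3\,\mathrm{GA}_{ijk}^{3}\), and the candidate with the lowest AIC is preferred.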

Fig. 2 Scatter plot of log-transformed estimated fetal weight vs. gestational age in MC (a) and DC twins (b)

Gestational age-specific percentiles for fetal weight were calculated on the log scale and then back-transformed to the original fetal weight scale in grams. The gestational age-specific variance was estimated by combining the estimated twin-pair level, fetus-level, and residual variances, and the corresponding standard deviation (SD) was then estimated by regressing the square root of the gestational age-specific variance on gestational age [17]. We assumed a normal distribution of log fetal weight at each gestational age and used the formula Mean ± Zα × SD to obtain the log-scale percentiles, where Mean is the predicted value from the optimal model, Zα is the corresponding value for the percentile of the standard Gaussian distribution, and SD is the gestational age-specific standard deviation [20]. The standards for twin fetal biometric measurements (BPD, HC, AC, and FL) were calculated using the same method. For these measurements, however, the model of log-transformed measurement on gestational age, gestational age squared, and gestational age cubed was selected.
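The percentile calculation described above can be sketched as follows. The sketch assumes a natural-log transformation (the base is not stated in the text), skips the step in which the SD is smoothed by regression on gestational age, and uses hypothetical variance components and a hypothetical predicted mean purely for illustration.

```python
import math
from statistics import NormalDist


def total_sd(var_pair: float, var_fetus: float, var_resid: float) -> float:
    """Combine twin-pair, fetus-level, and residual variances into one SD."""
    return math.sqrt(var_pair + var_fetus + var_resid)


def weight_percentile(mean_log: float, sd_log: float, pct: float) -> float:
    """Percentile of fetal weight in grams from a log-scale mean and SD.

    Computes Mean + Z * SD on the log scale, then back-transforms.
    Assumption: natural logarithm was used for the transformation.
    """
    z = NormalDist().inv_cdf(pct / 100.0)  # standard normal quantile
    return math.exp(mean_log + z * sd_log)


if __name__ == "__main__":
    # Hypothetical values: predicted median of 2000 g at some gestational age.
    sd = total_sd(0.004, 0.006, 0.0044)
    for pct in (5, 10, 50, 90, 95):
        print(pct, round(weight_percentile(math.log(2000.0), sd, pct)))
```

Because the percentiles are symmetric on the log scale, the back-transformed chart is asymmetric in grams: the 90th percentile lies further above the median than the 10th percentile lies below it.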

In order to investigate whether the fetal growth is different between pregnancies conceived naturally and pregnancies conceived by in vitro fertilization, we added a sensitivity analysis and compared the fetal growth charts between the two sub-populations. At the same time, for comparison, following Zhang’s method [18], we created a growth chart for twins by anchoring to the mean birth weight and standard deviation at the gestational age of 37 weeks in the study population (37 + 0 to 37 + 6 weeks, 356 fetuses, 2709.8 ± 274.0 g).
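A minimal sketch of the anchoring approach described above is given below. It is an illustration under stated assumptions, not a reproduction of the method in [18]: the shape of the growth curve uses the commonly quoted Hadlock in-utero weight equation, ln(weight) = 0.578 + 0.332 × GA − 0.00354 × GA², whose coefficients do not appear in the text above; week 37 is anchored at its midpoint of 37.5 weeks; and percentiles are taken from a normal distribution with a constant coefficient of variation.

```python
import math
from statistics import NormalDist

ANCHOR_GA_WEEKS = 37.5   # midpoint of 37+0 to 37+6 weeks (assumption)
ANCHOR_MEAN_G = 2709.8   # mean birth weight at 37 weeks, from the text
ANCHOR_SD_G = 274.0      # SD of birth weight at 37 weeks, from the text


def hadlock_curve(ga_weeks: float) -> float:
    """Reference weight curve; coefficients are an assumption (see lead-in)."""
    return math.exp(0.578 + 0.332 * ga_weeks - 0.00354 * ga_weeks ** 2)


def anchored_mean(ga_weeks: float) -> float:
    """Mean weight at ga_weeks, scaled in proportion to the anchor mean."""
    return ANCHOR_MEAN_G * hadlock_curve(ga_weeks) / hadlock_curve(ANCHOR_GA_WEEKS)


def anchored_percentile(ga_weeks: float, pct: float) -> float:
    """Percentile assuming SD is a constant fraction of the mean."""
    cv = ANCHOR_SD_G / ANCHOR_MEAN_G
    z = NormalDist().inv_cdf(pct / 100.0)
    return anchored_mean(ga_weeks) * (1.0 + z * cv)


if __name__ == "__main__":
    for ga in (28.5, 32.5, 37.5):
        print(ga, [round(anchored_percentile(ga, p)) for p in (10, 50, 90)])
```

With these inputs the coefficient of variation is about 10%, so at every gestational age the 10th and 90th percentiles sit roughly 13% below and above the mean.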

To assess the performance of the growth chart in identifying the truly “small” fetuses who were at higher risk of adverse perinatal outcomes, we applied the established chart to live-born twins of the source cohort and compared it with Hadlock’s singleton standard. Among the 1225 twin pregnancies enrolled for preeclampsia screening, 1091 women delivered 2 live births, of which 1920 births had complete perinatal information. The odds ratios (ORs) of neonatal death and adverse perinatal outcomes between small for gestational age (SGA) and non-SGA infants were estimated. Neonatal death was defined as death within 7 days after birth, and adverse perinatal outcomes included neonatal death, neonatal intensive care unit (NICU) stay for ≥ 14 days, and transfer to a higher-level or special care unit.
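To make the comparison above concrete, the sketch below classifies infants as SGA and computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2 × 2 table. This is an assumption about the estimator: the text does not say how the ORs were obtained, and this simple version ignores the correlation between co-twins. All counts are hypothetical.

```python
import math
from statistics import NormalDist


def is_sga(birth_weight_g: float, p10_g: float) -> bool:
    """SGA: birth weight below the chart's 10th percentile for gestational age."""
    return birth_weight_g < p10_g


def odds_ratio_ci(a: int, b: int, c: int, d: int, level: float = 0.95) -> tuple:
    """Unadjusted OR and Wald confidence interval.

    a: SGA with outcome,     b: SGA without outcome,
    c: non-SGA with outcome, d: non-SGA without outcome.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print([round(x, 2) for x in odds_ratio_ci(10, 90, 5, 95)])
```

When an outcome is rare, as neonatal death is, the small cell counts make the interval wide even if the point estimate is large.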

Role of funding source

The funders had no role in: the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication. The corresponding author had full access to all the study data and had final responsibility to submit for publication.

Results

A total of 398 twin pregnancies were included, with 796 live-born infants, of whom 214 were MC and 582 were DC. Overall, 3954 ultrasound measurements were included (1877 for MC twins and 2077 for DC twins). There was a median of 10 (interquartile range 8–11) ultrasound scans per fetus in MC twins and 2 (interquartile range 1–6) in DC twins. The maternal and fetal characteristics are displayed in Table 1. The average maternal age was 29.8 ± 2.8 years. About 90% of the women were nulliparous, and 54.0% conceived by in vitro fertilization (11.2% of MC and 69.8% of DC twin pregnancies). The average gestational age at delivery was 36.5 ± 1.2 weeks, and the average birth weight of the infants was 2567 ± 344 g; MC twins were delivered earlier and were smaller at birth.

Table 1 Maternal and fetal characteristics of the twin pregnancies by chorionicity

As shown in the growth chart (Fig. 3), the MC twins were consistently lighter than the DC twins, but the difference was small throughout gestation. Thus, for simplicity, we built a single combined growth standard for both MC and DC twins using a linear mixed model. The weight percentiles for twin fetuses by gestational age are presented in Table 2 and the percentiles for fetal biometric measurements (BPD, HC, AC, and FL) in Table 3. In the sub-population sensitivity analysis, we found that twin fetuses conceived naturally were slightly lighter than those conceived by in vitro fertilization; again, the difference was small throughout gestation (Supplementary Figure S1).

Fig. 3 Growth chart for MC and DC twins in the present study (linear mixed model)

Table 2 Weight percentiles for twin fetuses by gestational age
Table 3 Percentiles for twin fetal sonography measurements by gestational age

When comparing the chart with that built using Zhang’s method (weight percentiles displayed in Supplementary Table S2), we found that the two charts almost overlapped, except that Zhang’s curve was slightly lower at the 90th percentile (Fig. 4). Furthermore, we compared our charts with those from previous studies based on different populations. Compared to those from the US (NICHD study) [8], Brazil [15], Italy [16] and the UK [17], the Chinese twins had very similar 50th percentiles, but higher 5th and 10th percentiles and lower 90th and 95th percentiles, especially in late gestation (> 28 weeks or > 32 weeks; Fig. 5). The only exception was the comparison with Canada [14]: the 10th percentiles for Chinese MC and DC twins almost overlapped with the Canadian ones, but the Chinese twins had lower 50th and much lower 90th percentiles, especially in late gestation (Fig. 5).

Fig. 4 Growth chart for twins built from linear mixed model and that built from Zhang’s method

Fig. 5 Comparison of growth chart for our twins and that of Fetal Growth Studies from the NICHD (Grantz KL, 2016), Brazil (Araujo Júnior E, 2014), Italy (Ghi T, 2017), UK (Stirrup OT, 2016), and Canada (Shivkumar S, 2015)

Compared with the Hadlock singleton standard, the application of our growth chart to live births of the source cohort resulted in a much lower proportion of SGA (< 10th percentile): 26.9% with Hadlock’s standard vs 16.1% with our growth chart. When applying our growth chart, the ORs of neonatal death and adverse perinatal outcomes for SGA compared with non-SGA infants [3.49 (95% CI: 0.58, 20.99) and 3.74 (95% CI: 2.85, 4.92), respectively] were higher than those obtained with Hadlock’s standard [1.81 (95% CI: 0.30, 10.91) and 2.30 (95% CI: 1.79, 2.94), respectively]. The ORs increased slightly when the analyses were restricted to infants without birth defects (Table 4).

Table 4 Comparison of the ability of the growth chart in predicting adverse perinatal outcomes

Discussion

Principal findings

In this prospective study, we constructed a normal fetal growth standard for Chinese twins. The MC twins were consistently lighter than the DC twins, but the differences were very small throughout gestation. The growth chart built using a linear mixed model was comparable to that built by Zhang’s method [18]. Overall, Chinese twins had 50th percentiles almost identical to those reported in previous studies, but tended to have a narrower range between the 10th and 90th (5th and 95th) percentiles in late gestation (> 28 weeks or > 32 weeks).

Comparison with previous studies in twin pregnancies

The construction of a fetal growth chart depends heavily on the population selected and the statistical method adopted. To obtain an optimal fetal growth standard, we selected only healthy twin pregnancies, as did most previous studies [8, 14, 15]; the exceptions were one study that used unselected pregnancies [17] and another that further excluded twins with a birthweight below the 5th percentile of the national singleton standard [16]. We also used stricter inclusion criteria than other studies [14,15,16]: pregnancies with unmatched Us-GA and LMP-GA, maternal age < 20 years or > 35 years, crown-rump length discordance > 10%, or sIUGR were all excluded. When modeling fetal growth for twins, the dependency of the data, namely the clustering of the twins and the serial measurements on the same twin, should be taken into consideration. The present study used a linear mixed model accounting for data correlation at both the mother level and the fetus level; such correlation was also considered in most previous studies [8, 14, 16, 17], but not in the Brazil study, which used polynomial regression [15].

When comparing our charts with those from previous studies, we found that the Chinese twins had very similar 50th percentiles, but higher 5th and 10th percentiles and lower 90th and 95th percentiles, especially in late gestation [8, 14,15,16,17]. The difference may originate from several sources. Firstly, it is now well recognized that differences in fetal growth are largely due to biological differences among regions and ethnicities [26]. Some of the previous studies were multicenter or included several ethnicities, which would lead to a wider range. In contrast, the present study included only Chinese twin pregnancies, most of the women were of Han ethnicity, and few of them were very thin or very heavy. Thus, the population may be genetically and physically more homogeneous, which would make the range of growth percentiles narrower. Secondly, the study design may also play an important role. Some of the previous studies used unselected twin pregnancies [17], which might have had more complications (e.g. sIUGR or TTTS) and larger variation in fetal growth, resulting in a wider range for the fetal growth reference. As the present study aimed to construct an optimal growth standard, healthy twin pregnancies were selected using a stricter definition, and these were likely to have smaller variation and a narrower range. Also, since the fetal growth standard is gestational age-dependent, the exclusion of women with inaccurate GA (unmatched Us-GA and LMP-GA) can lead to a narrower range. Furthermore, repeated measurements on an individual fetus were more homogeneous than those from a cross-sectional study that used only one measurement per fetus (the Brazil study [15]). Finally, our study was conducted in one center and the ultrasound scans were performed by three experienced, well-trained sonographers, whereas some of the previous studies used data from several centers, which would entail larger inter-observer variation and a wider range for fetal growth. Supplementary Figure S2 indicates low inter-observer variation and good homogeneity.

Clinical implications

The use of a singleton fetal growth chart to evaluate twin pregnancies is very common practice. However, it has been well demonstrated that, compared to singletons, the growth of twin fetuses becomes slower and the fetal growth curves diverge significantly in late gestation (i.e. after 28–32 weeks) [6,7,8]. Therefore, twins need a separate standard to evaluate their growth and to identify growth restriction and adverse perinatal outcomes more accurately. Indeed, when the present chart was applied instead of the Hadlock singleton standard to live-born twins, SGA was identified more precisely, and the risk of adverse events among the infants identified as SGA was substantially higher. When SGA is identified more precisely, unnecessary medical costs and burden can be avoided. Moreover, Zhang’s curve is very similar to the curves of this study, which indicates that Zhang’s method is applicable to Chinese twins; other areas may therefore use Zhang’s method to generate their own curves if deemed necessary. However, before a new standard is applied in clinical use, prospective studies are warranted to confirm its ability to identify pregnancies that are at higher risk of adverse perinatal outcomes.

Strengths and limitations

The present study has several strengths. Firstly, all the data were from a prospectively designed cohort study, which enabled us to obtain the information with minimal bias. Secondly, gestational age was ascertained by the first-trimester CRL of the larger twin, and pregnancies with unmatched Us-GA and LMP-GA were excluded from the present study, which ensured the accuracy of gestational age. Thirdly, all the ultrasound scans were conducted by three experienced sonographers, and ultrasound biometry was measured according to the same standard operating procedure. Fourthly, the linear mixed model, which took the correlation within the twin pair and among serial measurements of a single fetus into account, provided a better estimate of fetal growth for twin pregnancies.

Still, there are limitations that we should acknowledge. First, the study was conducted in a single tertiary center, which might limit the generalizability of its results. However, most twin pregnancies in China are referred to and delivered at tertiary hospitals. As our hospital is a tertiary hospital in Shanghai with about 30,000 deliveries per year, our study population should be reasonably representative. Indeed, although 69% of our study subjects were from the east of China, the subjects came from 88% (30/34) of the provinces in China. Thus, our results can be applied at least to twin pregnancies in the eastern part of China. Other areas may generate their own curves by Zhang’s method, given that Zhang’s curve is very similar to the curves of this study. Finally, although the ability of the growth chart to identify small fetuses at risk of neonatal death and adverse perinatal outcomes appeared to be good, future studies with long-term follow-up are needed to determine the best cut-point for predicting long-term outcomes, which is the ultimate goal of monitoring fetal growth.

Conclusion

We created a fetal growth chart for Chinese twins. The MC twins were consistently lighter than the DC twins, but the differences were small throughout gestation. Overall, the 50th percentiles for Chinese twins were nearly identical to those of previous studies, but the percentile ranges tended to be narrower in late gestation. Our standard performed much better than Hadlock’s standard in identifying small infants at risk of adverse perinatal outcomes in twin pregnancies. Our study also indicates that Zhang’s method is applicable to Chinese twins for generating a fetal growth reference.