Background

Ultrasound has been an indispensable tool for diagnosis in obstetrics and fetal growth assessment for at least four decades [1,2,3]. Clinical management in pregnancy is increasingly based on ultrasound measurements obtained in the first trimester and on the recognition of pathological fetal growth, which depends on reliable, standardized growth curves [4]. Although it is widely known that boys are slightly larger than girls in the first trimester and at birth, fetal gender has not been considered in the development and interpretation of fetal growth curves [5,6,7,8]. This gender dichotomy seems important since there is clear evidence that gestation-specific neonatal outcomes are worse in boys, indicating the vulnerability of the male embryo and fetus [9, 10].

Many charts have been published on fetal growth using different methodologies from the early 1990s until early in this decade, after which new (dating) protocols emerged [11]. Most normal ranges were designed from cross-sectional data [12,13,14,15,16,17,18,19], which by their nature may represent fetal size at a given point but do not directly infer growth. To derive information on fetal growth, statistical strategies using repeat measurements are required but longitudinal methodologies are utilized more rarely [20, 21]. Given these complexities, the World Health Organization (WHO) Multicentre Growth Reference Study (MGRS) Group recommended Generalized Additive Model for Location, Scale and Shape (GAMLSS) for the construction of the WHO Growth Standards [22, 23]. Most recently, growth charts have been developed in the regions of Europe and the USA and customization based on ethnicity is reported [11, 12, 18, 19, 24].

Our aim was to develop gender-specific longitudinal first, second, and third trimester normal growth reference curves within a low-risk Caucasian population with a robust WHO-endorsed longitudinal statistical methodology. Further, we aimed to test the validity of these curves by comparing the estimated fetal weights derived from these charts to actual birth weight, and determine whether there were gender differences in fetal growth trajectories and immediate birth outcomes.

Methods

This was an observational longitudinal cohort study of first, second, and third trimester fetal biometry ultrasound examinations performed during 2002–2012 at the University Hospital Leuven. The study was approved by the ethics committee of the University Hospitals KU Leuven. The data were selected from the astraia© ultrasound database using the following criteria (Fig. 1): indication “routine fetal growth” (level 1 and 2 ultrasound scans for fetal anomalies were excluded), singleton pregnancy, ethnicity “Caucasian,” and gestational age confirmed by a crown-rump length (CRL) measurement (3–83 mm) in the first trimester [25]. Only pregnancies with at least two and at most three scans (first, second, and third trimester) were selected, representing a routine care scheme for a low-risk population. The measurements were performed with the following ultrasound machines (with time period of usage): Kretz Voluson 730 (2002–2006), ESAOTE Technos (2002–2006), Acuson Sequoia (2002–2007), General Electric Voluson® 730 Expert (GE Healthcare Medical Systems, Kretztechnik, Zipf, Austria, 2007–2012), and General Electric Voluson E8 (GE Healthcare Medical Systems, Kretztechnik, Zipf, Austria, 2007–2012). The first three devices were equipped with a 4–8-MHz curved linear array probe. The GE Voluson 730 Expert and GE Voluson E8 used a curved 4–8-MHz volumetric 3D abdominal probe. All growth data were immediately stored in an electronic database (astraia© Software Inc., Munich, Germany). Fetal measurements were based on the following two-dimensional biometric parameters: biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC), and femur length (FL), as designated in the guideline descriptions (Additional file 1) [26]. Only complete fetal datasets (all four measurements) were analyzed.
Neonatal data from the included patients were extracted from their birth files for gestational age at delivery, gender, birth weight, birth length, head circumference, Apgar scores (AS) for the first and fifth minute after birth, umbilical cord arterial pH, and base excess (BE) measurement. Only the gender-specific neonatal datasets were analyzed.

Fig. 1 Flowchart on the selection procedure for normal routine fetal ultrasound scans between 2002 and 2012. *UK, unknown gender

Statistical analysis

Outliers in BPD, HC, AC, or FL were removed from the data. Generalized Additive Models for Location, Scale and Shape (GAMLSS; www.gamlss.org) were applied, using the R software, to construct the growth curves for all four routine fetal biometry measurements: BPD, HC, AC, and FL [22, 23]. We assessed several distributions: Box-Cox-t, Box-Cox Cole and Green, and Box-Cox power exponential. Goodness-of-fit of the models was assessed with QQ plots, the Akaike information criterion (AIC), and worm plots. The goodness-of-fit was investigated over the whole gestational age period of 12–40 weeks and for substrata of this period. GAMLSS smoothed the antenatal growth curves for BPD, HC, AC, FL, and estimated fetal weight (EFW). For the EFW, the Hadlock-3 formula [log10 EFW = 1.335 − 0.0034 (AC)(FL) + 0.0316 (BPD) + 0.0457 (AC) + 0.1623 (FL)] was used [11]. The 5th, 10th, 50th, 90th, and 95th percentiles were plotted with grid lines. The whole analysis was done three times: for all pregnancies, for boys, and for girls. SAS 9.4 was used for merging the fetal database with the neonatal database and for analyzing the neonatal data (Mann-Whitney test).
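As a minimal sketch of the EFW step, the Hadlock-3 formula can be evaluated as below. The constant 1.335 and the negative sign on the AC × FL interaction term follow the published Hadlock formulation. The function name, the assumption that all measurements are entered in centimetres, and the example values are illustrative and are not taken from the study data.

```python
import math


def hadlock3_efw_grams(bpd_cm: float, ac_cm: float, fl_cm: float) -> float:
    """Estimated fetal weight in grams from BPD, AC and FL.

    Assumption: all three measurements are given in centimetres,
    as in the usual statement of the Hadlock formulas.
    """
    log10_efw = (
        1.335
        - 0.0034 * ac_cm * fl_cm   # AC x FL interaction term
        + 0.0316 * bpd_cm
        + 0.0457 * ac_cm
        + 0.1623 * fl_cm
    )
    return 10.0 ** log10_efw


if __name__ == "__main__":
    # Hypothetical mid-pregnancy measurements, not study data.
    efw = hadlock3_efw_grams(bpd_cm=6.0, ac_cm=19.0, fl_cm=4.3)
    print(f"EFW: {efw:.0f} g (log10 = {math.log10(efw):.4f})")
```

Because the formula is linear on the log10 scale, small measurement errors in AC or FL translate into proportional, not absolute, errors in the estimated weight.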

Results

Between 2002 and 2012, 89,933 scans were selected. After restricting to a low-risk population, a total of 27,680 scans remained, representing 12,368 pregnancies (Fig. 1). The mean maternal BMI was 23.8 kg/m2 (std. 4.8), and 6.6% of the women smoked. Gender-specific birth datasets could be ascertained in 76.1% of the cases and are outlined in Table 1. In total, we had 4900 boys and 4513 girls, representing 10,992 and 10,092 scans, respectively. The mean birth weight, birth length, and head circumference were significantly (p < 0.001) different for boys (3450 g, 50.9 cm, 34.9 cm) as compared to girls (3329 g, 50.1 cm, 34.3 cm). A low 1-min AS (≤ 5) was more common in boys (3.8%) than in girls (2.9%; p = 0.01), as was a low 5-min AS (≤ 7) (boys 3.2%, girls 2.3%; p = 0.009; Table 1). The arterial umbilical cord pH was lower in boys than in girls (p < 0.001). There was no difference in asphyxia, defined as a pH < 7.10, between boys (0.9%) and girls (1.0%, p = 0.90), and abnormal BE (< − 10 mEq/L) was the same for both sexes. Preterm birth (< 37 weeks) occurred in 6% of the pregnancies overall, with no difference between girls (5.7%) and boys (6.5%, p = 0.14; Table 2). In the preterm group, boys were heavier (p = 0.003), longer (p = 0.005), and had larger head circumferences (p = 0.006). The immediate outcomes of AS and pH also differed between boys and girls in this group, although not statistically significantly, owing to the smaller size of the preterm group (Table 2). The term group is outlined separately in Additional file 2.

Table 1 Neonatal data for boys, girls, and combined in term and preterm pregnancies
Table 2 Neonatal data for boys, girls, and combined in preterm (< 37 weeks) pregnancies

GAMLSS longitudinal antenatal growth curves for BPD, HC, AC, and FL from 12 to 40 weeks were developed for boys, girls, and both combined (Additional file 3). For each parameter, the 5th, 10th, 50th, 90th, and 95th centiles were constructed. Actual values for these centiles and grid curves are given in Additional file 4. Comparing the growth trajectories and percentiles of the two genders, BPD measurements were significantly higher in boys at all percentiles (p < 0.001; Fig. 2, Table 3). At 24 weeks, the 50th percentile BPD for boys (60.4 mm) was significantly higher than for girls (58.9 mm, p < 0.001; Additional file 5). This corresponds to a difference of three gestational days. The boys’ 5th percentile aligns with the 10th percentile of the girls, and the boys’ 90th percentile aligns with the 95th percentile of the girls. For HC, these differences were even more pronounced (p < 0.001; Additional file 5). The prenatal HC difference in favor of boys at the 95th percentile increases to + 6.5 mm at 35 weeks, but it is already present at 20 weeks of gestation (+ 3.8 mm; Fig. 3, Table 4). The neonatal head circumference confirmed this difference, which was + 6 mm between boys and girls and significant (p < 0.001; Table 1). Generally, prenatal AC measurements were significantly higher in boys than in girls, but the difference was less demonstrable across the total gestational period than for BPD and HC (Fig. 4). For FL, there was no significant difference between boys and girls in their antenatal growth percentiles (Fig. 5). The EFW differed between boys and girls at the different percentiles throughout gestation, except at the 40-week measurement (Table 5). Girls reach an EFW of 500 g 1 day later (22 3/7 weeks) than boys (22 2/7 weeks; Additional file 5). At the 50th percentile at 24 weeks, boys are estimated to be 21 g heavier than girls (p = 0.02; Additional file 5).
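To illustrate how centiles such as those above follow from a fitted model, the sketch below evaluates the classic LMS-type centile formula that underlies the Box-Cox Cole and Green family, one of the candidate distributions assessed. It is not the authors' R code. The function name is hypothetical, the sigma and nu values are invented for illustration, and only the median of 60.4 mm comes from the text. The fitted parameters and the distribution finally chosen are not reported in this section, and GAMLSS software may additionally apply a small truncation correction that is ignored here.

```python
import math
from statistics import NormalDist


def bccg_centile(mu: float, sigma: float, nu: float, p: float) -> float:
    """Centile at probability p for median mu, coefficient of variation sigma
    and Box-Cox power nu, using C = mu * (1 + nu * sigma * z) ** (1 / nu)."""
    z = NormalDist().inv_cdf(p)          # standard normal quantile
    if abs(nu) < 1e-12:                  # limiting log-normal case
        return mu * math.exp(sigma * z)
    base = 1.0 + nu * sigma * z
    if base <= 0.0:
        raise ValueError("centile undefined for these parameters")
    return mu * base ** (1.0 / nu)


if __name__ == "__main__":
    # mu: reported boys' median BPD at 24 weeks; sigma and nu are hypothetical.
    for p in (0.05, 0.10, 0.50, 0.90, 0.95):
        value = bccg_centile(mu=60.4, sigma=0.04, nu=1.0, p=p)
        print(f"P{round(p * 100):02d}: {value:.1f} mm")
```

In a GAMLSS fit, mu, sigma, and nu are themselves smooth functions of gestational age, so repeating this calculation week by week traces out each centile curve.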

Fig. 2 Biparietal diameter (BPD) in millimeters for boys and girls from 20 to 30 weeks of gestation for percentiles 5, 10, 50, 90, and 95

Table 3 BPD reference values for boys and girls from 12–40 weeks
Fig. 3 Head circumference (HC) in millimeters for boys and girls from 20 to 30 weeks of gestation for percentiles 5, 10, 50, 90, and 95

Table 4 HC reference values for boys and girls from 12 to 40 weeks
Fig. 4 Abdominal circumference (AC) in millimeters for boys and girls from 20 to 30 weeks of gestation for percentiles 5, 10, 50, 90, and 95

Fig. 5 Femur length (FL) in millimeters for boys and girls from 20 to 30 weeks of gestation for percentiles 5, 10, 50, 90, and 95

Table 5 EFW reference values for boys and girls from 12 to 40 weeks

Discussion

In this study, we have constructed antenatal growth and estimated fetal weight charts, separately for boys and girls, using a strict and clearly defined selection protocol in a normal Caucasian population. Boys have significantly larger late second and third trimester HC, BPD, and AC measurements than girls. For FL, there are no differences. The implication of these findings is that a boy and a girl at exactly 24 weeks of gestation might, under the current late second trimester dating protocols based on head measurements, be assigned gestational ages that differ by as much as 3 days and EFWs that differ by 21 g, in favor of the boy. These antenatal differences were confirmed at birth, with boys being significantly heavier, longer, and having larger head circumferences than girls. The 1- and 5-min AS and cord pH were lower in boys. The dating and weight estimation differences could potentially be taken into account in prenatal and immediate perinatal management around viability, in terms of timing the administration of maternal steroids for fetal lung maturation, decisions for delivery, and possible resuscitation. In the management of post-term pregnancy, these gender differences could also influence decisions such as the timing of labor induction, affecting an even larger population. Consequently, if the pregnancy has been dated in the second trimester, girls are potentially put at risk of stillbirth in the post-term period because their gestational maturity is assumed to be less than it is [27].
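The dating argument can be made concrete with the two reported 24-week medians. The sketch below is only a back-of-the-envelope illustration: the local growth rate is inferred from the reported 3-day equivalence under an assumed linear approximation, and the 59.5 mm measurement is hypothetical.

```python
# 50th-percentile BPD at 24 weeks as reported in the Results (mm).
MEDIAN_BPD_24W_MM = {"boys": 60.4, "girls": 58.9}
# Reported dating equivalent of the difference between the medians (days).
REPORTED_DATING_SHIFT_DAYS = 3


def median_gap_mm() -> float:
    """Difference between the boys' and girls' median BPD at 24 weeks."""
    return MEDIAN_BPD_24W_MM["boys"] - MEDIAN_BPD_24W_MM["girls"]


def implied_growth_mm_per_day() -> float:
    """Local BPD growth rate implied by the reported 3-day equivalence.

    Assumption: growth is treated as linear over this short interval.
    """
    return median_gap_mm() / REPORTED_DATING_SHIFT_DAYS


def offset_from_median_mm(bpd_mm: float, sex: str) -> float:
    """Signed distance of a measurement from the sex-specific median."""
    return bpd_mm - MEDIAN_BPD_24W_MM[sex]


if __name__ == "__main__":
    print(f"Median gap: {median_gap_mm():.1f} mm")
    print(f"Implied growth: {implied_growth_mm_per_day():.2f} mm/day")
    # Hypothetical measurement lying between the two medians.
    for sex in ("boys", "girls"):
        print(f"59.5 mm vs {sex}: {offset_from_median_mm(59.5, sex):+.1f} mm")
```

A measurement that lies between the two medians falls below the boys' median but above the girls', which is how the same head size can suggest different gestational ages when a single chart is used for both.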

In one cross-sectional study, a difference in fetal head measurements between boys and girls was shown, although the curves were constructed with the older linear regression models [28]. The authors also confirmed the birth weight difference but did not report information on neonatal head circumference or other outcomes (AS, cord pH). Another unselected, multi-ethnic, combined cross-sectional and longitudinal population study also found differences in fetal head and abdomen measurements using statistical methods current at that time; however, no birth outcomes were available [29].

While it has been demonstrated that gestation-specific neonatal outcomes are worse in boys than in girls [9, 10], what had not previously been appreciated in a routine population is that boys have lower Apgar scores at both 1 and 5 min and lower cord pH values at delivery than girls. These results underline male vulnerability in the perinatal period. In a recently published, elegant report on neonatal outcome in appropriately grown term babies, gender differences were demonstrated in terms of lower Apgar scores at 5 min and higher rates of instrumental deliveries for failure to progress in labor for boys [30]. That report concerned a multi-ethnic retrospective cohort from one center, with birth data specified for both genders. The authors demonstrated a birth weight difference of 135 g at term, comparing closely with the 121 g that we report, but their data lacked other anthropometric data (birth length and head circumference) and antenatal growth data. It is of course possible that neonatal outcomes are worse because immediate birth outcomes are worse. Whether this is an attribute of being male per se, or some effect of fetal size on delivery, cannot be explained from their results or ours. We can demonstrate that the gender differences in fetal anthropometry from 20 weeks onwards affect fetal dating and the estimated fetal weight. In our preterm sub-analysis, the differences between boys and girls are also present as absolute mean differences (∆birth weight 161 g, ∆birth length 0.8 cm, ∆HC 0.6 cm), and there are noticeable differences in AS and umbilical cord pH (Table 2), although these are not statistically significant because of the smaller numbers. One hypothesis is that the differences in biometry are relatively more important in the (full-grown) male fetus, interacting with maternal pelvic limitations and causing more labor dystocia for boys, and hence lower AS.
Alternatively, other fetal gender-specific factors may influence the birth process and compromise the immediate birth outcomes. Gender-specific body composition at birth has been reported, where the male infant has more fat mass and lean body mass than the female infant, especially in well-nourished mothers [31]. This phenomenon has been associated with gender-different intrauterine physical adaptations to an enhanced nutrient supply from the mother. The male infant body composition has been more subject to maternal influences such as higher pre-gestational BMI and excessive gestational weight gain [32]. Lastly, the lung maturation of the male fetus proceeds more slowly than in the female fetus, possibly contributing to a higher rate of low AS in the term-grown fetus. In animal studies, lung fluid secretion is inhibited and lung fluid absorption is initiated by adrenaline infusions at birth [33]. Moreover, preterm asphyxiated male infants have lower adrenaline levels than female infants, again putting the boys at higher risk [34]. Whether this is similar in the term infant is unknown.

Strength and weakness

Our antenatal growth curves are unique in that all four fetal growth parameters (BPD, HC, AC, and FL) were measured in standardized circumstances in accordance with international guidelines [26]. Longitudinal growth charts were constructed for each parameter, with the WHO advocated GAMLSS method used [22, 23]. GAMLSS can combine longitudinal data with a cross-sectional component and can construct centiles in a way that they are constrained and do not cross. Further, in using the GAMLSS analysis statistics, one could, by synchronizing the statistical methods of the WHO, align the biometry measurements with the neonatal and pediatric charts [22, 23]. With the available neonatal data, we could discriminate different growth curves for boys and girls for all four fetal growth parameters and hence the EFW. Since the introduction of ultrasound in antenatal care, many reports on fetal growth curves have been published [11,12,13,14,15,16,17,18,19,20,21]. Recognizing pathological fetal growth depends on reliable, standardized growth curves [35]. Discrepancies between the curves have often been attributed to the differences in methodology and population selection [36]. A recent report reviewed fetal growth charts, demonstrating the wide variations of methodologies on how these charts have been constructed concluding that there were many grounds for bias in the growth curves that are currently used [37]. Particularly in “inclusion/exclusion criteria,” “ultrasound quality control measures,” and “gestational dating protocols,” many ambiguities existed. Standardization of the methodologies with a checklist was recommended to define a high-quality study [37]. When we compare our growth charts to the requirements, these would be compliant for the combination of a high-quality control score, longitudinal design, sample size, and the fact that all four parameters (BPD, HC, AC, and FL) were examined (Additional file 6). 
All growth measurements were reviewed by certified staff members, who judged whether the scanned images adhered to the protocol described. We also incorporated a strict protocol on pregnancy dating. Only pregnancies with a first trimester scan confirming gestational age were included: crown-rump length (CRL) measurement between 3 and 83 mm (gestational age ≥ 5+0 and < 14+0 weeks) [4, 25]. In Belgium, in routine obstetrical care, every pregnant woman is offered a first, second, and third trimester ultrasound scan with fetal growth measurements. In many countries, the third trimester scan is not part of the routine care for low-risk pregnancies [38]. Measuring the four fetal growth parameters in the first trimester is also not routine care; doing so allowed us to define “fetal growth” through serial measurements, instead of “fetal size,” as defined through cross-sectional measurements [12,13,14,15,16,17,18,19, 39]. Furthermore, we were able to eliminate aberrant fetal growth and extreme maternal influences by excluding fetal anomalies (level 1 and 2 indications) and including only the mothers enrolled in a routine obstetric care scheme [40]. Finally, a population-based cohort was generated with a significant sample size over a period of 11 years. The description of a routine population could also be supported by our neonatal data. Neonatal data were complete for 76% of our cohort. The rate of premature birth was 6%, which is consistent with accepted European national norms. In our population selection, we further customized the charts for one maternal and one fetal factor: we selected on ethnicity “Caucasian” and on fetal gender. Other ethnicity-derived customized growth curves have arisen in response to the early reference charts from mainly Europe and the USA [18, 19]. Ethnicity was reported to have a discriminative influence on fetal growth [24, 41].
The aim of the INTERGROWTH-21st study was to construct prescriptive instead of descriptive curves using the same statistical method as used in our study (GAMLSS) [42]. The study population comprised 35% of the pregnant population: highly selected healthy, educated (> 75% of a local level), non-obese (BMI 18–30 kg/m2), non-smoking women, 18–35 years of age, recruited in selected institutes. This high-quality study (Additional file 6) represents a fascinating investigation of the physiology of fetal growth, concluding that optimal growth potential can be attained irrespective of ethnicity in a selected population, which contradicts the previous studies. Unfortunately, it lacks information on fetal gender differences; not all measurements were longitudinal, and the derived charts are by their selective nature manifestly not representative of a general population, regardless of the ethnicity concerned. Our current study adds these elements. Girls and boys have different neonatal growth curves, which assumes a discriminative effect of gender on their growth trajectories. In more than three quarters of our cohort, complete neonatal data were registered, including gender. Therefore, we focused on developing two separate fetal growth charts, one for boys and one for girls. Comparing the extremes of growth (< p5 and > p95), the female fetus is wrongly considered small or non-macrosomic, and the male fetus vice versa, when compared to the INTERGROWTH-21st curves (Table 6). Fetal gender, unlike maternal ethnicity, is not commonly known in the first trimester, but it is known from the 20 weeks’ scan onwards (“anomaly” scan). From a clinical point of view, it therefore seemed relevant to start discriminating these curves from 20 weeks of gestation onwards.

Table 6 Cross-sectional gestational age comparison of INTERGROWTH-21st and gender-specific (M/F) fetal head measurements at 5th and 95th percentiles

Some limitations in constructing these charts have to be addressed. The study was performed in a university teaching hospital, a large tertiary referral center, not necessarily reflecting a routine setting. This center, on the other hand, also has a regional remit for routine obstetric care for low-risk pregnancies, and the included cases were not selected on maternal morbidity or on parental characteristics. Some maternal characteristics (e.g., smoking, which occurred in 6.6%) were deliberately not excluded from the selected cohort, to prevent “super-normalization” of the cohort. Conception by intracytoplasmic sperm injection was, however, excluded, since this is a level 1 ultrasound indication. Finally, it is expected that within this long time period, some women with subsequent pregnancies were included more than once in this cohort.

Implications for clinical practice

Our fetal growth curves for the Caucasian population resemble predictive growth curves with the gender specified, which can discern aberrant from normal fetal growth. The longitudinal aspect and large cohort, covering all trimesters, have not been reported before in the Caucasian population. The neonatal data gave us the opportunity to customize for fetal gender. There was a marked difference between boys and girls in their growth trajectory for fetal head measurements and, to a lesser extent, for the abdominal circumference. There was also a difference in the estimated fetal weight. This gender differentiation is important in antenatal and perinatal care. Prenatal ultrasound is used to define not only fetal growth but also gestational age. Both growth and fetal age are important in defining the time point of fetal viability and in optimizing the timing of obstetrical interventions, e.g., medical elective birth or administration of corticosteroids for fetal lung maturation in cases of threatened premature birth. Second trimester dating depends on fetal growth parameters and particularly on the fetal head measurement. Our results suggest a gender-specific approach in counseling future parents on important issues such as when fetal viability starts and when the best time point is to start obstetrical interventions.

The gender differences are further demonstrated by the immediate birth outcomes for males: different anthropometry (heavier, longer, and bigger heads), lower AS, and lower cord pH. The significantly lower AS and umbilical cord pH in boys underline the fetal male vulnerability, although in the asphyxia group (pH < 7.10) there was no predominance of males, indicating that boys do not have a higher risk of acidemia at birth in a routine population. Therefore, the clinical importance of the pH findings (and perhaps also the AS) in our study is open to debate.

Conclusion

In summary, we present fetal growth curves constructed with the latest statistical tools in a large, routine pregnant population examined with state-of-the-art ultrasound technology. The data cover the pregnancy period from 12 weeks onwards, and there were differences between boys and girls for the fetal head and fetal abdomen measurements and the estimated fetal weight. The immediate neonatal outcome also demonstrated gender differences favoring the girls. This could give caregivers the opportunity to adopt a gender-tailored approach in critical care decisions, both at the margins of viability and post-term.