Introduction

Human milk oligosaccharides (HMOs) are found in abundance in human milk and make up the largest solid component after lactose and lipids [1,2,3,4]. HMOs can be characterized as a complex group of more than 200 different, nondigestible, and non-nutritional carbohydrates that provide an energy source for beneficial intestinal bacteria. There is evidence that HMOs improve host defense by strengthening the gut barrier, modulating immune function, and acting through other mechanisms. Bovine milk, in contrast to human milk, contains relatively low levels of oligosaccharides, and the prevalence of fucosylated oligosaccharides, in particular, is quite low [5]. 2′-fucosyllactose (2′FL) is a trisaccharide composed of glucose, galactose, and fucose and is one of the most abundant HMOs. Levels of 2′FL vary depending on the secretor blood group status of an individual woman as well as ethnicity and stage of lactation, with 2′FL levels from about 0.9 to above 4 g/L in mature milk among secretors [6,7,8,9,10,11,12,13,14]. Another predominant HMO in human milk is lacto-N-neotetraose (LNnT) at levels ranging from 0.1 to 0.6 g/L with higher levels within the first month of lactation [7,8,9,10,15,16,17].

Advancements in manufacturing technology now enable the synthesis of HMOs, and preclinical studies have established their safety for the supplementation of infant formulas [18, 19]. Safety, tolerance, and adequate growth as well as potential clinical benefits have been demonstrated in randomized controlled trials (RCTs) of term infant formulas supplemented with 2′FL alone and in combination with LNnT [20,21,22]. An RCT in the United States of America (USA) found that infants receiving formula supplemented with either galacto-oligosaccharides (GOS) or GOS + 2′FL demonstrated adequate growth and good tolerance [21]. Another RCT conducted in Belgium and Italy examined a study formula containing 1.0 g/L of 2′FL and 0.5 g/L of LNnT in the test arms, while the control arm received standard formula without HMOs [20]. The HMO-supplemented formula was again well tolerated and supported age-appropriate growth. A third study in the USA compared tolerance in infants receiving a 100% whey, partially hydrolyzed infant formula with the probiotic Bifidobacterium lactis with and without the further addition of 2′FL and found that the HMO-supplemented formula was well tolerated [22].

Evidence is emerging that HMOs play an important role in the development of a balanced intestinal microbiota and in supporting immune protection in breastfed infants [23,24,25]. Preclinical models have found that both 2′FL and LNnT promote the growth of Bifidobacterium species [26, 27]. Additionally, in an RCT of a term infant formula supplemented with 2′FL and LNnT, lower rates of parent-reported morbidity (particularly lower respiratory tract illnesses such as bronchitis) and lower use of antipyretics and antibiotics were reported in the group receiving HMO-supplemented formula compared to the control infants [20]. Stool samples collected for microbiota assessment and metabolic signature at 3 months showed that the addition of 2′FL and LNnT shifted the stool microbiota closer to that observed in breastfed infants both in composition and function [28]. Collectively, these findings, in conjunction with the documented differences in HMO composition between human and bovine milk, have provided a solid rationale for the benefits of bovine milk-based infant formulas with HMOs.

While the evidence provided to date in RCTs is supportive of the safety and tolerance of HMO-supplemented infant formula, studies are needed in a real-world setting because results from a highly controlled RCT do not always translate outside of the trial setting [29]. Additionally, a relatively large proportion of infants in real-world settings are fed with both human milk and formula [30,31,32], a mixed feeding regimen not studied in RCTs. The current study was thus designed to complement and enhance existing RCTs by assessing the growth, safety, and tolerance of healthy term infants consuming an infant formula supplemented with HMOs, either exclusively or mixed with human milk, in a real-world setting.

Participants and methods

Study design

This was a three-group, non-randomized, open-label, prospective study in healthy, term (37–42 weeks of gestation) infants enrolled at age 7 days to 2 months. The study was conducted between 08/07/2019 and 24 July 2020, in 12 centers (pediatric and adolescent medicine practices) throughout Germany and Austria. One study group included infants who were exclusively formula fed (FF), while a second group included infants who were mixed fed, i.e., received both formula and human milk (MF). The third group included exclusively breastfed infants (BF) serving as a reference population. Formula-fed infants were eligible to participate if their parent(s) had independently elected, before study enrollment, to formula feed. Breastfed infants were eligible if they had been exclusively breastfed since birth and their parent(s) had decided to continue exclusively breastfeeding until at least 4 months of age. Exclusion criteria included any known intolerance/allergy to cow’s milk (formula-fed groups only), conditions requiring infant feedings other than those specified in the protocol, and evidence of significant systemic disorders (cardiac, respiratory, endocrinological, hematologic, gastrointestinal, or other).

At study enrollment, FF and MF infants received the study formula and were fed for 8 weeks (56 days). Formula was prepared and fed at home and was given ad libitum. Infants completed an in-person clinic visit at enrollment (baseline) and again at day 56 ± 3 days (week 8 visit). A telephone visit with the parents was also conducted on day 28 ± 3 days (week 4 visit).

Study product

Commercially available in Germany and Austria since autumn 2018, the study formula was provided to the participants at no charge. It was a partially hydrolyzed, 100% whey, term infant formula providing 67 kcal/100 mL and containing 1.9 g of protein, 11.5 g of carbohydrates, and 5.1 g of lipids per 100 kcal, with two HMOs: 1.0 g/L of 2′FL and 0.5 g/L of LNnT.

Ethical approval and informed consent

This study protocol was approved by the ethics committee of the Berlin Chamber of Physicians. Prior to the conduct of any screening tests, informed consent was obtained from each participant’s parent. Good clinical practice was followed by all sites throughout the study. The study was registered with ClinicalTrials.gov NCT05150288.

Study measures

At baseline and again at the clinic visit at week 8, anthropometry measures were obtained including weight, length, and head circumference using standardized procedures. Infant weight was measured without clothing or diaper on a calibrated electronic scale to the nearest 10 g. Recumbent length was measured on a pediatric length board to the nearest 1 mm. Head circumference was measured to the nearest 1 mm using a nonelastic plastic-coated measuring tape. Body mass index (BMI) was calculated as weight (kg)/[length (m)]². Z-scores for weight for age, length for age, head circumference for age, and BMI for age were calculated using the World Health Organization (WHO) Child Growth Standards [33].
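As a worked illustration of the BMI formula above, the following minimal Python sketch may be helpful. The function name and the example measurements are hypothetical and are not study data; z-scores would additionally require the WHO reference tables, which are not reproduced here.

```python
def bmi(weight_kg: float, length_m: float) -> float:
    """Body mass index: weight (kg) divided by squared length (m)."""
    if weight_kg <= 0 or length_m <= 0:
        raise ValueError("weight and length must be positive")
    return weight_kg / length_m ** 2


# Hypothetical infant (not study data): 4.50 kg, recumbent length 55.0 cm.
example_bmi = bmi(4.50, 0.550)
print(f"BMI = {example_bmi:.2f} kg/m^2")
```

Note that length must be converted from centimeters to meters before squaring.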

The infant’s gastrointestinal (GI) symptom burden was assessed via the Infant Gastrointestinal Symptom Questionnaire (IGSQ) [34], a validated 13-item questionnaire that assesses GI-related signs and symptoms as observed by parents over the previous week in 5 domains: stooling, spitting up/vomiting, gassiness, crying, and fussing. Each item is scored on a scale of 1 to 5 with higher values indicating greater GI distress. A composite IGSQ score is derived from summing the individual scores with a possible range of 13 to 65 where higher values indicate greater GI distress and values ≤ 23 indicate no digestive distress [34]. The IGSQ was administered at baseline, week 4, and week 8.
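The composite scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (13 items, each scored 1 to 5, summed, with values ≤ 23 indicating no digestive distress); the function names and the example responses are hypothetical, and the mapping of items to the five domains is not reproduced because it is not given here.

```python
N_ITEMS = 13              # number of IGSQ items
NO_DISTRESS_CUTOFF = 23   # composite values <= 23 indicate no digestive distress


def igsq_composite(item_scores):
    """Sum the 13 item scores (each 1-5) into a composite score (13-65)."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores")
    if any(not 1 <= s <= 5 for s in item_scores):
        raise ValueError("each item score must be between 1 and 5")
    return sum(item_scores)


def no_digestive_distress(composite):
    """Apply the published cutoff to a composite score."""
    return composite <= NO_DISTRESS_CUTOFF


# Hypothetical questionnaire responses (not study data).
responses = [1, 2, 1, 1, 2, 3, 1, 1, 2, 1, 2, 1, 1]
score = igsq_composite(responses)
print(score, no_digestive_distress(score))
```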

A formula satisfaction questionnaire was administered to parents of infants in the formula-fed groups at week 4 and week 8 including three questions regarding the parent(s)’ experience with the study formula. Questions included the following: “Did your child like what he/she consumed?”, “How satisfied are you overall with the study product?”, and “Would you continue to provide the study formula to your child?”.

Adverse events (AEs) were captured from the time of enrollment through the end of the study. All AEs were assessed by the site investigator for duration, intensity, frequency, and relationship to study formula. AEs were classified by MedDRA system organ class (SOC). Based on published data from other studies, we expected AEs such as atopic diseases (e.g., eczema or cow’s milk protein allergy), infectious diseases, gastrointestinal symptoms, use of medication, and others.

Statistical methods

Demographics and other baseline characteristics were compared between all pairwise combinations of feeding groups using two-sided Wilcoxon rank-sum tests for continuous variables and Fisher’s exact tests for categorical variables. Fisher’s exact tests were computed from contingency tables. For tables larger than 2 × 2, a Monte Carlo estimation of the exact p-value was performed with 20,000 samples; otherwise, a direct exact p-value computation was performed. Missing values were excluded before performing the aforementioned tests.
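To make the two Fisher's exact test variants concrete, here is a minimal standard-library Python sketch. It is not the authors' SAS code: the function names, the label-shuffling scheme, and the fixed seed are assumptions made for illustration. The 2 × 2 case sums hypergeometric probabilities directly; the larger-table case estimates the exact p-value by shuffling category labels (which preserves both margins) and counting how often a sampled table is no more probable than the observed one.

```python
import math
import random


def fisher_exact_2x2(a, b, c, d):
    """Two-sided exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def _sum_log_factorials(rows, cols):
    """Sum of log(n_ij!) over cells; with fixed margins, larger = less probable."""
    counts = {}
    for cell in zip(rows, cols):
        counts[cell] = counts.get(cell, 0) + 1
    return sum(math.lgamma(k + 1) for k in counts.values())


def fisher_exact_monte_carlo(rows, cols, n_samples=20_000, seed=0):
    """Monte Carlo estimate of the exact p-value for an r x c table.

    rows, cols: per-subject labels (e.g. feeding group, category level).
    """
    rng = random.Random(seed)
    observed = _sum_log_factorials(rows, cols)
    shuffled = list(cols)
    hits = 0
    for _ in range(n_samples):
        rng.shuffle(shuffled)  # random table with the same margins
        if _sum_log_factorials(rows, shuffled) >= observed - 1e-9:
            hits += 1
    return hits / n_samples


# Hypothetical labels (not study data): 3 feeding groups x 3 category levels.
groups = ["FF"] * 4 + ["MF"] * 4 + ["BF"] * 4
levels = ["low", "mid", "high", "mid"] * 3
print(fisher_exact_2x2(3, 1, 1, 3))
print(fisher_exact_monte_carlo(groups, levels, n_samples=2_000))
```

The Monte Carlo estimate depends on the seed and the number of samples; 20,000 samples, as used in the study, gives a tighter estimate than the 2,000 used in the usage lines.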

The co-primary outcomes were growth and composite IGSQ score. Feeding group comparisons were assessed individually at each time point (baseline visit and week 8 visit) for all growth measures using the analysis of covariance (ANCOVA) controlling for baseline value, age, gender, and study center. Listwise deletion was performed to handle missing values in the models. Tolerance was assessed via the IGSQ scores. The 13 individual questions in the IGSQ as well as the five domain scores and the composite IGSQ score were tabulated for each feeding group at each time point (baseline visit, week 4 visit, and week 8 visit). These scores were compared between the feeding groups individually at each time point using ANCOVA controlling for the baseline scores and age at baseline. The derived inferential statistics on the IGSQ scores were based on the sandwich estimator of the variance–covariance matrix of the models’ parameters due to some heteroscedasticity observed in the models’ residuals. Listwise deletion was performed to handle missing values in the models.
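The modeling steps above (listwise deletion, a linear model adjusting for baseline score and age, and sandwich standard errors) can be sketched in plain Python as follows. This is an illustrative sketch only, not the authors' SAS code: the helper names and the data are hypothetical, and the HC0 form of the sandwich estimator is an assumption, since the exact variant is not stated.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse(m):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def listwise_delete(X, y):
    """Drop every observation with a missing (None) outcome or covariate."""
    kept = [(xi, yi) for xi, yi in zip(X, y) if yi is not None and None not in xi]
    return [xi for xi, _ in kept], [yi for _, yi in kept]


def ols_sandwich(X, y):
    """Return (coefficients, HC0 sandwich standard errors) for y ~ X."""
    Xt = [list(col) for col in zip(*X)]
    bread = inverse(matmul(Xt, X))                      # (X'X)^-1
    beta = [row[0] for row in matmul(bread, matmul(Xt, [[v] for v in y]))]
    resid = [yi - sum(b * x for b, x in zip(beta, xi)) for xi, yi in zip(X, y)]
    # "Meat" of the sandwich: X' diag(e_i^2) X
    meat = matmul(Xt, [[e * e * x for x in xi] for xi, e in zip(X, resid)])
    cov = matmul(matmul(bread, meat), bread)
    return beta, [math.sqrt(max(cov[i][i], 0.0)) for i in range(len(beta))]


# Hypothetical data (not study data).
# Columns: intercept, FF indicator (1 = FF, 0 = BF), baseline score, age in days.
X = [(1, 0, 20, 30), (1, 0, 18, 45), (1, 0, 25, 20), (1, 0, 22, 50),
     (1, 1, 28, 35), (1, 1, 24, 15), (1, 1, 30, 40), (1, 1, 21, None)]
y = [19, 17, 22, 20, 21, 20, 23, 18]          # week 8 composite scores
X_complete, y_complete = listwise_delete(X, y)
coef, robust_se = ols_sandwich(X_complete, y_complete)
print("adjusted FF-BF difference:", coef[1], "robust SE:", robust_se[1])
```

The coefficient on the group indicator plays the role of the baseline-adjusted mean difference between feeding groups; the sandwich standard error remains valid when residual variance differs across observations.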

All analyses were conducted using SAS BASE 9.4/SAS STAT 15.1 on the SAS Life Science Analytics Framework (SAS LSAF, SAS Institute Inc., Cary, NC, USA) version 5.2.2. Due to the descriptive nature of the trial, no adjustment for multiple testing was performed. The statistical significance was assessed using an α-level of 5%.

As this was a real-world evidence (RWE) study, sample size was based on practical and logistical feasibility and on the experience of published RCTs investigating safety and tolerance of formula containing 2′FL/LNnT [20,21,22]. The analysis set was defined by excluding infants who did not comply with the protocol (e.g., switched from breastfed group to formula-fed group and vice versa), were lost to follow-up, experienced tolerance issues, withdrew without explanation, or did not provide data for other reasons. For BF, infants who were not exclusively breastfed and received formula other than the study product were excluded from the analysis set. All analyses of growth, tolerance, and satisfaction in this paper were conducted in the analysis set. AEs were reported for all enrolled infants.

Results

Subject disposition and demographics

In this study, 117 infants were enrolled including 51 FF, 22 MF, and 44 BF infants (Fig. 1). The number of subjects in the analysis set was 46, 22, and 38, respectively, in FF, MF, and BF, with primary exclusion reasons being major protocol deviations (namely, breastfed infants who were fed a non-study formula at least once during the study) and intolerance issues.

Fig. 1 Flow chart of subject disposition

The demographics and baseline characteristics of the infants included in the analysis set are shown in Table 1. Infants in MF were slightly younger at enrollment than those in FF and BF. The gender distribution was comparable between groups. There were no differences between the three groups in terms of mothers’ ethnicity or educational attainment. Mothers of the infants in FF were the youngest. Fathers in FF had a slightly lower level of education than those in MF and BF. A significantly higher proportion of parents in FF smoked compared to those in BF.

Table 1 Demographics and baseline characteristics by feeding groups in the analysis set (N = 106)

Growth

Age-appropriate growth was observed in all three feeding groups. Baseline weight and length were slightly lower in MF. By week 8, there were no significant differences between any feeding groups for any of the anthropometric measures (all ANCOVA p-values between feeding groups > 0.05). Mean Z-scores for weight, length, head circumference, and BMI at baseline and week 8 are shown in Fig. 2. Weight-for-age, length-for-age, and BMI-for-age z-scores were comparable between all feeding groups at week 8. The mean z-scores were within ± 0.5 of the WHO medians at week 8. Head circumference-for-age z-scores were also comparable between groups and tracked closely with the WHO standards.

Fig. 2 Anthropometric mean z-scores at baseline and week 8, by feeding group, analysis set (N = 106). Bars represent 95% confidence intervals (two sided). BF, exclusively breastfed group; MF, mixed-fed group; FF, formula-fed group (exclusively)

Gastrointestinal tolerance

Table 2 shows descriptive characteristics for the five IGSQ domains and the overall composite IGSQ score. Composite IGSQ scores demonstrated low GI distress in all feeding groups at all time points. At baseline, FF had significantly greater GI distress compared to BF (mean difference [95% confidence interval (CI)] FF-BF = 5.18 [2.44, 7.91], p = 0.0003). However, there were no significant differences in the composite IGSQ score between any of the feeding groups at week 4 or week 8, suggesting that the GI tolerance in FF improved after introduction of study formula. For one of the five domains of the IGSQ (gassiness), there were no significant differences in scores between the groups at baseline, week 4, or week 8. There was a minor difference in fussiness between FF and BF at week 4 (mean difference [95% CI] = 0.821 [0.089, 1.554], p = 0.028), but no differences were observed at baseline or week 8. For the stooling domain, FF had significantly higher scores (i.e., more stooling issues) than BF at baseline (mean difference [95% CI] = 1.335 [0.543, 2.126], p = 0.001), but the difference was no longer significant at week 8 (mean difference [95% CI] = −0.151 [−0.704, 0.403], p = 0.59), with FF moving closer to the stooling profile of BF. For the spitting-up/vomiting domain, FF again had significantly greater distress compared with BF at baseline (mean difference [95% CI] = 1.476 [0.457, 2.469], p = 0.005), and the difference was no longer significant at week 8 (mean difference [95% CI] = −0.128 [−0.865, 0.609], p = 0.731), moving closer to the spitting-up/vomiting profile of BF. Lastly, FF had significantly greater distress for the crying domain compared with BF at baseline (mean difference [95% CI] = 0.951 [0.189, 1.713], p = 0.015) and week 4 (mean difference [95% CI] = 0.639 [0.047, 1.232], p = 0.035), but not at week 8 (mean difference [95% CI] = −0.339 [−0.877, 0.199], p = 0.214), becoming more comparable to BF.
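For readers who want to relate the reported confidence intervals to the p-values, the following Python sketch recovers an approximate standard error and two-sided p-value from a mean difference and its 95% CI. It assumes a symmetric interval and a normal approximation, so the results will differ slightly from the ANCOVA-based p-values reported above; the function name is hypothetical.

```python
from statistics import NormalDist


def approx_p_from_ci(estimate, lower, upper, level=0.95):
    """Approximate two-sided p-value from an estimate and its symmetric CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    se = (upper - lower) / (2 * z_crit)              # CI half-width / z_crit
    z = estimate / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Composite IGSQ score, FF vs BF at baseline: 5.18 [2.44, 7.91]
print(approx_p_from_ci(5.18, 2.44, 7.91))
# Stooling domain, FF vs BF at week 8: -0.151 [-0.704, 0.403]
print(approx_p_from_ci(-0.151, -0.704, 0.403))
```

An interval that excludes zero corresponds to p < 0.05 under this approximation, and an interval that spans zero corresponds to p > 0.05.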

Table 2 Composite IGSQ and domain scores at baseline, week 4, and week 8, by feeding group, analysis set (N = 106)

For the individual items in the 13-item IGSQ, FF passed more hard stools than BF and MF and had more difficulty passing bowel movements than BF at baseline. Also at baseline, compared with BF, FF had a higher number of times the baby arched its back in pain when spitting up, cried during feeding, or could not be stopped from crying, as well as a higher number of fussy days in the past week. In FF, total time spent crying and the number of times parents were unable to soothe the baby’s fussiness were higher than in MF. There were no significant differences between the groups for the other IGSQ individual items at baseline. At week 8, there were no significant differences in any of the individual items between BF, MF, and FF; the only item approaching significance was the total time the baby spent crying, with FF crying less than BF (mean difference [95% CI] = −0.303 [−0.614, 0.007], p = 0.056). Total crying time improved significantly in FF between baseline and week 8 (mean difference [95% CI] = −0.444 [−0.799, −0.090], p = 0.015), while there was no change between baseline and week 8 for BF and MF.

Formula satisfaction

Most parents in MF and FF reported that their child liked what he/she consumed, and that they would continue to provide the study formula to their child at both weeks 4 and 8 (Table 3).

Table 3 Formula satisfaction questionnaire results [N (%)], at weeks 4 and 8 among parents of infants receiving study formula, by feeding group, analysis set (N = 68 infants receiving formula)

Adverse events

A total of 46 subjects experienced 69 AEs during the course of the study, and no serious AEs were reported (Table 4). A total of 7.8% (n = 4) of the AEs were potentially formula related and were only reported in the FF Group.

Table 4 Overview of adverse events, by feeding group, in the safety data set (n = 117)

Discussion

The results indicate that formula-fed infants, either exclusively or mixed fed, receiving the formula supplemented with 2′FL and LNnT, had age-appropriate growth in line with the WHO standards and comparable to BF infants. Growth was also comparable to that seen in previous studies with West and South European infant populations [35]. By week 8, GI tolerance as indicated by low IGSQ scores was comparable between the formula-fed infants and BF infants, indicating that the formula is well tolerated. The incidence of adverse events in all groups was low. As shown in Table 4, 7.8% (n = 4) of the AEs were potentially formula related. Despite the season of the year (fall-winter), cases of bronchitis were lower than expected from the literature [36]. Therefore, these results were not added as secondary outcomes.

The results of this RWE study are comparable to those from previous RCTs that have examined anthropometry and GI tolerance for term infant formulas supplemented with HMOs. One RCT was a multicenter, double-blind trial that enrolled 175 healthy term infants in Italy and Belgium at less than 14 days of age who were fed study formula for 6 months [20]. The HMO-supplemented formula (2′FL + LNnT) demonstrated age-appropriate growth as well as good tolerance as measured by parents. Another RCT included 189 term infants in the USA who were exclusively formula fed until 4 months of age [21]. Formulas with 2′FL (at two different dosages) and GOS were well-tolerated based on parental reports, and no significant differences were observed for growth compared to a control group. Notably, neither of those trials utilized a validated tool to assess tolerance, and thus, tolerance outcomes cannot easily be compared across studies. A recent RCT used the same validated IGSQ tool as in the current study to assess tolerance [22]. The HMO-supplemented formula was well tolerated as evidenced by similar IGSQ scores at week 6 between the groups with (mean [SD] = 20.9 [4.8]) and without (mean [SD] = 20.7 [4.3]) the addition of 2′FL. These scores are similar to those observed in FF in the current study at week 8 (mean [SD] = 19.1[4.5]). Additionally, a single-arm study of a formula supplemented with 2′FL fed to fussy infants showed significant improvement in IGSQ scores after 3 weeks of feeding (baseline mean [SD] = 34.1 [10.0]; week 3 mean [SD] = 21.4 [7.0]; p < 0.001) [37]. Although we did not limit our study to fussy infants, we also saw an improvement in IGSQ in FF in our study from baseline to week 8 (mean difference [95% confidence interval] =  − 6.639 [− 9.497, − 3.782], p < 0.0001). 
The improvement in GI tolerance in our study might be partially related to the natural improvement in GI tolerance with increasing age but could potentially also be attributed to the composition of the study formula including the two HMOs. Almost all infants switched to the study formula at the beginning of the study, i.e., they were receiving a different formula prior to enrollment (44 of 46 FF infants and 21 of 22 MF infants), suggesting that the HMO-containing study formula has better GI tolerance than the formulas without HMOs consumed prior to enrollment. To our knowledge, only one other real-world study has been conducted; it had a similar design to the current study, was conducted in Spain, and had very similar results for both growth and IGSQ scores [38]. The agreement among the previous RCTs, the Spanish real-world study, and the current real-world study provides reassurance that growth, safety, and tolerance of HMO-supplemented formula are consistent and robust across different geographical populations.

This study has several strengths. First, GI burden was measured using a validated instrument, the 13-item IGSQ based on five separate domains of feeding tolerance. The use of a validated instrument provides information that is interpretable and meaningful to practicing clinicians. Second, this was an RWE study, a design distinct from the RCT, simpler, less restrictive, but still in line with current clinical practices, enhancing the generalizability of the results and providing complementary data to RCTs. The published prevalence of infants who are mixed fed indicates that at age 1 month, 30% of infants receive mixed feedings [30, 31], similar to that observed in this study. Thus, the demonstration of appropriate growth and good tolerance in the mixed feeding group of infants in this study provides important evidence not found in the RCTs conducted to date. Some limitations of the study should also be noted. An open-label, non-randomized design increases the risk of bias, in particular response bias (even for validated questionnaires), and of higher attrition rates and missing data. In a study with specifically defined feeding regimens such as ours, however, randomization is not possible. The main aim of randomization is to have study groups with comparable characteristics. We therefore compared the baseline characteristics in our three groups, and there were no substantial differences except for infant age at enrollment and parents’ smoking status. Infant age at enrollment was slightly lower in MF compared with both BF and FF. As it may take longer for mothers to establish exclusive breastfeeding or formula-feeding patterns, the younger age in MF at enrollment can be expected. To account for the slight difference in baseline age between the groups, we included baseline age in the statistical models to reduce the potential risk of bias arising from the non-randomized nature of our trial.
Smoking was higher among the parents in FF compared with BF, which for the fathers could be linked to the slightly lower education level in FF compared with BF. Alternatively, the parents in BF may have underreported their smoking habits to make a positive impression, a form of social-desirability bias. The study formula was supplemented with just a single level of 2′FL and LNnT, and thus, this study cannot assess whether the observed growth and tolerance effects might differ over a wider range of levels of these HMOs. Additionally, this study, while multicenter, took place within only two countries (Germany and Austria), and its results may not be generalizable outside of Northwest Europe. Furthermore, the authors want to point out that even if supplemented with HMOs, infant formula is not comparable with the gold standard “human breast milk” in multiple aspects. The aim of research in infant formula is of course not to replace human breast milk, but to have the best possible substitute available in case breastfeeding fails or breast milk is not available.

In conclusion, this is one of the first studies to use real-world evidence to examine the supplementation of infant formula with HMOs. The results obtained were similar to those found in more tightly controlled RCTs, indicating robust effects for growth, safety, and tolerance in association with HMO-supplemented infant formulas.