Introduction

Bronchopulmonary dysplasia (BPD) is one of the most common and severe sequelae in very preterm infants, and several studies have developed clinical prediction models to guide patient management [1, 2]. A web-based prediction model from the National Institutes of Health (NIH) reported areas under the receiver operating characteristic (ROC) curve (AUROC curve) of 0.81, 0.82, and 0.83 on days 3, 7, and 14 of postnatal age, respectively, for BPD prediction in infants born at 23–30 weeks [3]. In recent years, the application of lung ultrasonography in neonatal intensive care units (NICUs) has considerably increased due to its rapid acquisition, portability, lack of radiation exposure, excellent diagnostic accuracy, and reproducibility in following disease progression [4,5,6,7,8,9]. Previous studies have revealed that lung ultrasound scores (LUSs) in the first 14 days after birth are useful for predicting BPD, with an AUROC curve of 0.7–0.9 in preterm infants [6, 9, 10]. Extremely preterm infants born before 28 weeks of gestation benefit the most from accurate early prediction of BPD; however, recent studies have expressed concerns about the accuracy of LUSs for predicting BPD in this population. The evolution of LUSs in this population is significantly different from that in infants with gestational ages (GAs) of 28–30 weeks and 31–33 weeks, and a recent study found that the predictive value of LUSs may be insignificant in multivariate models in which GA was the main predictor in this population [11, 12].

Therefore, this study aimed to evaluate the predictive value of LUS for the development of moderate-to-severe BPD (msBPD) at different time points in infants born before and after 28 weeks. Additionally, we constructed multivariate regression models using available clinical variables to assess the utility of LUSs.

Materials and methods

This prospective, observational, longitudinal diagnostic accuracy cohort study was conducted between January 2020 and July 2021 at Jilin University First Hospital, which has extensive experience in the use of neonatal ultrasound in China. This study was approved by the hospital’s institutional review board (clinical trial registered with www.chictr.org.cn [chiCTR1900023869]), and parental consent was obtained upon NICU admission. All procedures were performed in accordance with the Declaration of Helsinki.

This study enrolled preterm infants with GA < 32 weeks who were admitted to the NICU on the day of birth. The exclusion criteria were the following: (1) complex congenital malformations or chromosomal abnormalities, (2) congenital lung diseases or congenital heart defects, and (3) death before 36 weeks postmenstrual age.

All enrolled infants underwent lung ultrasonography performed by two ultrasound physicians at day 3 (D3), day 7 (D7, ± 1 day), day 14 (D14, ± 1 day), and day 21 (D21, ± 2 days) of postnatal age. The procedure was performed after at least 30 min in the supine position, following a standardized protocol [13]. All ultrasonographic images and videos were digitally recorded, anonymized, and reviewed by a senior independent ultrasonographer blinded to patients’ clinical information. Lung ultrasonography was performed with a linear probe (9–12 MHz) according to availability. The LUS calculation was based on the description of Brat et al. [6, 9], and the total score ranged from 0 to 18.

Respiratory data (PtcO2 and type of respiratory support) were collected on the same day in which the LUS was calculated. The ventilatory policy is shown in the supplementary material. PtcO2 was estimated with adequately calibrated transcutaneous oxygen tension monitoring devices (TCM4, Radiometer, Denmark). Respiratory support was defined as (i) invasive mechanical ventilation, (ii) non-invasive ventilation, (iii) oxygen (high-flow or low-flow oxygen), or (iv) none. If more than one respiratory support was recorded during the day, the highest form of ventilation was used. Additionally, given that the duration of hemodynamically significant patent ductus arteriosus (hsPDA) is associated with an increased risk of BPD [14, 15], we dynamically recorded the presence of hsPDA using echocardiography within 1 day of lung ultrasonography: left atrial-aortic root ratio > 1.3, PDA size > 1.5 mm, and a ductal left-to-right shunt [16].
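The two recording rules above (keeping the highest form of respiratory support per day, and flagging hsPDA from echocardiographic measurements) can be sketched as follows. This is a minimal illustration, not the study's own code: the names `Support`, `highest_support`, and `is_hspda` are hypothetical, the ordering of support levels is inferred from the list in the text, and the sketch assumes that all three echocardiographic criteria must be met together.

```python
from enum import IntEnum


class Support(IntEnum):
    """Respiratory support levels, ordered from lowest to highest (assumed order)."""
    NONE = 0
    OXYGEN = 1        # high-flow or low-flow oxygen
    NON_INVASIVE = 2  # non-invasive ventilation
    INVASIVE = 3      # invasive mechanical ventilation


def highest_support(records):
    """Return the highest support level recorded during one day."""
    return max(records, default=Support.NONE)


def is_hspda(la_ao_ratio, pda_size_mm, left_to_right_shunt):
    """Flag hsPDA; assumption: all three criteria are required at once."""
    return bool(la_ao_ratio > 1.3 and pda_size_mm > 1.5 and left_to_right_shunt)


# Hypothetical day on which an infant was weaned from the ventilator.
day = [Support.INVASIVE, Support.NON_INVASIVE, Support.OXYGEN]
print(highest_support(day).name)
print(is_hspda(1.4, 1.8, True))
```

Encoding the levels as an ordered enumeration keeps the "highest form" rule to a single `max` call and makes the assumed ranking explicit.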

The study’s primary outcome was msBPD, defined as the requirement for oxygen or respiratory support at 36 weeks postmenstrual age or at the time of discharge, and the secondary outcome was any grade of BPD defined by the NIH in 2001 [17].

Statistical analysis

The sample size was calculated using PASS software (NCSS, LLC, Kaysville, UT, USA). Approximately 35% of preterm infants with GA < 32 weeks developed msBPD in our NICU during the first 6 months of the study. An AUROC curve of ≥ 0.7, as previously published, and an α error of 0.05 and 90% power were targeted [18]. Based on this calculation, the required sample size was 94 infants.

Normally distributed continuous variables were expressed as the mean ± standard deviation (SD) and compared with Student’s t test, while non-normally distributed continuous variables were summarized as medians and interquartile ranges and compared with the Mann–Whitney U test. Categorical variables were summarized as counts and percentages and compared with the chi-square (χ2) or Fisher’s exact test, as appropriate. LUSs were compared between groups using repeated-measures analysis of variance. ROC analysis was used to assess the reliability of LUSs for predicting msBPD at different time points. The AUROCs were compared using DeLong’s test, and the optimal cut-off points were determined using Youden’s method.
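To make the ROC and Youden steps concrete, the sketch below computes an AUROC as the Mann–Whitney probability that a case scores higher than a non-case, and picks the cut-off that maximizes J = sensitivity + specificity − 1. It is an illustration only: the function names are hypothetical, the scores are made-up values rather than study data, and it assumes that a score at or above the cut-off counts as a positive prediction.

```python
def auroc(scores_pos, scores_neg):
    """AUROC as P(positive score > negative score), ties counted as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def youden_cutoff(scores_pos, scores_neg):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    Assumption: a score >= cutoff is called positive.
    """
    best = None
    for c in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(s >= c for s in scores_pos) / len(scores_pos)
        spec = sum(s < c for s in scores_neg) / len(scores_neg)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Made-up LUS values on the 0-18 scale, for illustration only.
msbpd = [5, 6, 7, 8, 9, 6]
no_mild = [1, 2, 2, 3, 4, 6]
print(auroc(msbpd, no_mild))
print(youden_cutoff(msbpd, no_mild))
```

The pairwise loop is quadratic in the number of infants, which is acceptable for a cohort of this size; DeLong's comparison of two AUROCs is not covered by this sketch.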

Data analysis was based on different GA groups. First, the predictive performance of LUSs for msBPD was evaluated according to different GA groups. Second, the enrolled infants were allocated to one of two GA groups (23–27 weeks and 28–32 weeks) to evaluate the LUS evolution and the contribution of different predictors of msBPD. Linear multilevel mixed-effects regression was used to model LUS evolution over time, with msBPD as an effect-modifying and interaction variable. To evaluate the utility of LUSs as predictive models, multivariate logistic regression was applied for msBPD prediction using available clinical covariates without multicollinearity and compared with LUSs and GA-adjusted LUSs. Model 1 included GA alone, model 2 included LUS alone, model 3 included GA and LUS (GA-adjusted LUS), and model 4 included GA, sex, presence of hsPDA, type of respiratory support, and PtcO2 on the day of lung ultrasonography. Multicollinearity between clinical covariates was examined according to tolerance and variance inflation factor. GA was chosen as a covariate because it plays a prominent role in the development of BPD [19], and birth weight was excluded because it creates relevant multicollinearity with GA. The predictive performances of the selected models were assessed using the AUROC curve and 95% confidence interval (CI), and goodness-of-fit was evaluated using the Hosmer–Lemeshow test.
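The multicollinearity check can be illustrated for the simplest case of two covariates, where tolerance equals 1 − r² and the variance inflation factor (VIF) is its reciprocal, with r the Pearson correlation. The sketch below is a hedged example with hypothetical function names and made-up GA and birth weight values, not study data; with more than two covariates, r² would be replaced by the R² from regressing each covariate on all the others.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def tolerance_and_vif(x, y):
    """Two-covariate case: tolerance = 1 - r^2 and VIF = 1 / tolerance."""
    r = pearson_r(x, y)
    tol = 1.0 - r * r
    return tol, 1.0 / tol


# Made-up values for illustration only: GA in weeks, birth weight in grams.
ga_weeks = [25, 26, 27, 28, 29, 30]
birth_weight_g = [700, 850, 900, 1100, 1200, 1400]
tol, vif = tolerance_and_vif(ga_weeks, birth_weight_g)
print(tol, vif)
```

In this invented example the two covariates are almost perfectly correlated, so tolerance falls below 0.1 and the VIF exceeds 10, which is the kind of pattern that would justify keeping only one of them in a model.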

Analyses were performed using Stata version 16.0 (Stata Corp., College Station, TX, USA), MedCalc version 19.0 (MedCalc bvba, Ostend, Belgium), and GraphPad Prism V8.0 (San Diego, CA, USA).

Results

There were 190 eligible infants during the study period (Fig. 1). The clinical characteristics of the enrolled infants (n = 150) are described in Table 1; 63 (42%) were diagnosed with msBPD, and 87 (58%) had no/mild BPD. The mean GA and birth weight of enrolled infants were 28.70 (SD: 1.59) weeks and 1176.5 (SD: 224.6) g, respectively. Infants who developed msBPD were more preterm, more likely to have hsPDA, had higher CRIB-II scores, and required longer respiratory support. There were 43 (28.7%) and 107 (71.3%) infants born at 23–27 weeks and ≥ 28 weeks of gestation, respectively, and their clinical characteristics are described in Supplemental Table 1.

Fig. 1 Study chart

Table 1 Basic population details

Diagnostic accuracy of the LUSs in all infants

This cohort had significant differences in LUSs between infants with and without BPD and between infants with msBPD and with no/mild BPD from D3 to D21 (Fig. 2A, B; P < 0.0001, between-participant comparisons). LUSs had good diagnostic accuracy for predicting any BPD and msBPD on D3, D7, D14, and D21, as shown in Table 2 and Supplemental Table 2. The optimal time for BPD prediction was D14 with a cut-off point of 4 (sensitivity: 72.9%, specificity: 90.7%, AUROC curve: 0.87, 95% CI: 0.81–0.92). In comparison, the optimal times for msBPD prediction were D7 and D14, with the cut-off points of 5 and 4 for each time (sensitivity: 71.4% and 85.7%, specificity: 73.6% and 67.8%; AUROC curve: 0.78 [95% CI: 0.71–0.84] and 0.82 [95% CI: 0.75–0.88], respectively).
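As a worked check of the reported cut-offs, Youden's index J = sensitivity + specificity − 1 can be computed from the sensitivities and specificities quoted above. The symbol J and the subscripts are notation introduced here for illustration.

```latex
\begin{align*}
J &= \text{sensitivity} + \text{specificity} - 1 \\
J_{\text{any BPD, D14}} &= 0.729 + 0.907 - 1 = 0.636 \\
J_{\text{msBPD, D7}}    &= 0.714 + 0.736 - 1 = 0.450 \\
J_{\text{msBPD, D14}}   &= 0.857 + 0.678 - 1 = 0.535
\end{align*}
```

For msBPD, the D14 cut-off yields the larger index, in line with its higher AUROC curve.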

Fig. 2 Lung ultrasound scores (LUSs) at different time points. Statistical significance: *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. A Boxplot showing LUSs at D3, at D7, at D14, and at D21 in infants with BPD and without BPD. B Boxplot showing LUSs at D3, at D7, at D14, and at D21 in infants with msBPD and no/mild BPD. C, D Boxplot and predicted LUS evolution over time in infants born at 23–27 weeks of gestation. E, F Boxplot and predicted LUS evolution over time in infants born at 28–32 weeks of gestation

Table 2 The predictive ability of LUS for moderate-to-severe bronchopulmonary dysplasia (msBPD) in different GA groups

Diagnostic accuracies of LUSs and LUS evolution in infants born at 28–32 weeks

Among infants with GA ≥ 28 weeks, LUS showed a similar moderate predictive accuracy for msBPD on D3 (AUROC curve: 0.77, 95% CI: 0.67–0.84), D7 (AUROC curve: 0.74, 95% CI: 0.65–0.82), D14 (AUROC curve: 0.77, 95% CI: 0.68–0.85), and D21 (AUROC curve: 0.78, 95% CI: 0.68–0.85) (Table 2). Infants who developed msBPD had significantly higher LUSs than those with no/mild BPD from D3 to D21 (Fig. 2C). The predicted LUS evolution is displayed in Fig. 2D: in infants with no/mild BPD, LUSs decreased rapidly during the first week of life and then decreased slowly. In contrast, LUSs remained high and stable in infants who developed msBPD. The statistics of the linear multilevel mixed-effects regression models are shown in Supplemental Table 2.

There was no multicollinearity between independent variables in model 4 (tolerance > 0.1 and variance inflation factor < 10 for GA, type of respiratory support, and PtcO2) (Supplemental Table 3). PtcO2 values were allocated to one of three groups: (i) hypoxemia (PtcO2 < 50 mmHg), (ii) normoxemia (PtcO2 50–80 mmHg), or (iii) hyperoxemia (PtcO2 > 80 mmHg). In the five-factor model, the variables most associated with an increased risk of msBPD changed depending on the postnatal day (Table 3). Low GA (odds ratio (OR): 0.43; 95% CI: 0.24–0.78; P < 0.01) and the presence of hsPDA (OR: 3.94; 95% CI: 1.42–10.9; P < 0.01) were associated with the development of msBPD on D3, while higher levels of respiratory support (OR: 0.14; 95% CI: 0.04–0.45; P < 0.01), lower PtcO2 values (OR: 0.24; 95% CI: 0.08–0.76; P < 0.01), and the presence of hsPDA (OR: 5.96; 95% CI: 1.55–22.9; P < 0.01) were associated with msBPD on D14. LUS provided a diagnostic accuracy similar to that of GA-adjusted LUS and the clinical five-factor model (model 4) on D3 (AUROC curve: 0.77 [95% CI: 0.67–0.84] vs 0.81 [95% CI: 0.72–0.88], P = 0.18; AUROC curve: 0.77 [95% CI: 0.67–0.84] vs 0.78 [95% CI: 0.69–0.86], P = 0.84) and D7 (AUROC curve: 0.74 [95% CI: 0.65–0.82] vs 0.77 [95% CI: 0.67–0.84], P = 0.42; AUROC curve: 0.74 [95% CI: 0.65–0.82] vs 0.82 [95% CI: 0.74–0.89], P = 0.19). The AUROC curve for the clinical five-factor model on D14 was 0.91 (95% CI: 0.84–0.96), which was significantly higher than that of LUSs (AUROC curve: 0.77, 95% CI: 0.68–0.85, P < 0.05) and GA-adjusted LUSs (AUROC curve: 0.80, 95% CI: 0.71–0.87, P < 0.05) on the same day (Table 4).
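The three-level grouping of PtcO2 described above can be written as a small helper. This is a sketch with a hypothetical function name; the thresholds are those stated in the text, with values of exactly 50 and 80 mmHg treated as normoxemia.

```python
def ptco2_group(ptco2_mmhg):
    """Map a PtcO2 value in mmHg to the group used in the five-factor model."""
    if ptco2_mmhg < 50:
        return "hypoxemia"
    if ptco2_mmhg <= 80:
        return "normoxemia"
    return "hyperoxemia"


# Hypothetical readings for illustration only.
for value in (42, 50, 65, 80, 91):
    print(value, ptco2_group(value))
```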

Table 3 Multivariate analysis to predict msBPD in infants born at 23–27 weeks and 28–32 weeks
Table 4 AUROC curves from different models to predict msBPD in infants born before 32 weeks of gestation

Diagnostic accuracies of LUSs and LUS evolution in infants born at 23–27 weeks

In infants born at 23–27 weeks, LUS showed a low diagnostic accuracy with higher cut-off scores to predict msBPD on D3 (AUROC curve: 0.66, 95% CI: 0.51–0.79), D7 (AUROC curve: 0.69, 95% CI: 0.54–0.83), D14 (AUROC curve: 0.70, 95% CI: 0.54–0.83), and D21 (AUROC curve: 0.70, 95% CI: 0.54–0.83) (Table 2). Higher LUSs were observed in infants who developed msBPD from D3 onwards, but the differences from infants with no/mild BPD were not significant (Fig. 2E). The LUS evolution predicted using the mixed-effects regression model is shown in Fig. 2F; LUSs remained persistently high within 3 weeks after birth, regardless of subsequent evolution to msBPD.

In the five-factor model, lower GA was consistently associated with the severity of BPD regardless of the postnatal day (D3, OR: 0.19, 95% CI: 0.04–0.85, P < 0.05; D7, OR: 0.14, 95% CI: 0.03–0.68, P < 0.05) (Table 3). The AUROC curve for GA to predict msBPD was 0.75 (95% CI: 0.59–0.85), and it provided diagnostic accuracy similar to that of LUSs, GA-adjusted LUSs, and the clinical five-factor model for all three time points (DeLong’s test, overall P > 0.05; Table 4).

Discussion

This study found that an early LUS, in the first 2 weeks of postnatal age, can predict any BPD and msBPD in infants with GA < 32 weeks. The study also showed that the contribution of LUS differed between extremely preterm infants and preterm infants born at 28–32 weeks of gestation. LUSs provided similar moderate predictive performance as GA-adjusted LUS and clinical multivariate models in infants born after 28 weeks, while LUSs seem to be less helpful in infants born before 28 weeks.

We followed a protocol and BPD definition similar to those of Loi et al. [9] and Alonso-Ojembarrena and Lubián-López [18]. Alonso-Ojembarrena and Lubián-López [18] and Alonso-Ojembarrena et al. [20] reported that an LUS of ≥ 5 on D14 was the best predictor of any form of BPD (sensitivity: 74%, specificity: 100%, AUROC curve: 0.93, 95% CI: 0.80–0.99), and the optimal time for msBPD prediction was at 1 week of age, with a cut-off point of 8 (sensitivity: 70%, specificity: 79%, AUROC curve: 0.79, 95% CI: 0.74–0.84). In comparison, the present study had a similar AUROC curve to that of Alonso-Ojembarrena et al. [20] but a lower cut-off point. This difference may be due to either the intercenter variability in oxygen administration and the decision for respiratory support or the higher proportion of infants with msBPD in the total sample (42% vs 24.5%).

In infants born at 28–32 weeks of gestation, LUSs performed well in predicting the development of msBPD, similar to that observed in the total sample. LUSs decreased rapidly during the first week of life in infants with no/mild BPD but remained high and stable in infants with evolving msBPD, as previously reported [12, 21]. Interestingly, the present study found that the predictive performance of the clinical five-factor model was superior to that of LUSs on D14. Thus, reducing the frequency of lung ultrasonography in these infants may be possible if other studies verify this result. However, Alonso-Ojembarrena et al. [20] found that the diagnostic performance of their LUSs was comparable to that of a clinical multivariate model that included GA, sex, prenatal corticosteroids, surfactant, invasive mechanical ventilation at D7, and late-onset sepsis before D7, and Loi et al. [9] showed that GA-adjusted LUS had slightly higher reliability than the NIH calculator. We think that a possible explanation is the enrolment of relatively mature preterm infants (≥ 28 weeks) in the current analysis. In addition, we believe that this difference may be due in part to the choice of expanded LUS protocols that considered the posterior lung fields. BPD is a heterogeneous lung disease characterized by impaired alveolar structure and lung vascular growth [22]. Theoretically, with impaired ventilation in the gravity-dependent posterior lung areas, the probability of atelectasis is significantly increased, thereby increasing LUS. Therefore, a fuller representation of the lateral and posterior fields of the lung may be more accurate for the prediction of msBPD in very preterm infants. However, the additional value of adding posterior lung fields to the LUS in predicting BPD remains controversial. Loi et al. [9] and Liu et al. [8, 23] demonstrated that the predictive performance of expanded LUS protocols (10-region LUS and 12-region LUS) was superior to that of the 6-region protocol, whereas a meta-analysis and two prospective multicenter studies concluded that the diagnostic accuracies of LUS and expanded LUS for BPD and msBPD were similar [20, 24]. We believe it is valuable to question whether scanning of the posterior lung zones improves diagnostic accuracy, and national multicenter LUS data should be integrated to explore and optimize lung scanning protocols.

Alonso-Ojembarrena et al. [12] and Raimondi et al. [21] reported different LUS patterns from birth until 6–14 weeks in infants born before 28 weeks. They described persistently high LUSs in the first 4 weeks after birth in infants with GA < 28 weeks, regardless of their subsequent evolution to msBPD. This may be related to the non-specificity of LUSs and the impact of frequent acute clinical events (inflammation and complications associated with respiratory support) due to respiratory insufficiency of prematurity. These results were consistent with those of our study. However, in our study, we observed no significant differences in LUSs on D3 and D7 between infants born at 23–27 weeks of gestation with no/mild BPD and those with msBPD. The effect of GA on LUSs and the high incidence of BPD in this group may have influenced the results. In infants born before 28 weeks, the predictive performance of LUSs was lower than that in infants with GA ≥ 28 weeks, and this difference was due to the persistently high LUSs in the more immature infants. GA remained the dominant predictor of msBPD, and adding LUSs or other clinical predictors in the first 2 weeks only marginally increased the AUROC curve; the increases did not reach statistical significance at any time point. Woods et al. [11] prospectively enrolled 100 infants born at < 28 weeks of gestation and reported that LUS on D3–D4 and D7 accurately predicted BPD (according to Jensen et al. [19]). However, they found that lung ultrasonography performed within the first week of life added little value to GA-based models [11]. The AUROC curve for GA in predicting BPD in our cohort was 0.75; this was in alignment with their results, albeit with slightly higher AUROC curve values. Additionally, our results could suggest that the contribution of LUSs in this population with a higher BPD prevalence remains controversial. Mohamed et al. [10] and Abdelmawla et al. [25] reported that the AUROC curve for LUSs in the first 2 weeks after birth was as high as 0.95. However, the generalizability of these studies is limited owing to the impact of variability in respiratory management among different NICUs on the incidence of msBPD and the small sample size.

This study had certain limitations. First, although our data are in close agreement with those of other reports, this was only a single-center pragmatic study, which lacks the scientific rigor required to support widespread changes in practice. In addition, the intercenter variability in oxygen administration and decisions for respiratory support can significantly affect the generalizability of our results; thus, these findings need to be interpreted cautiously. Second, 6-h pronation has been reported to significantly improve lung aeration and gas exchange in infants with evolving BPD, resulting in decreased LUSs [26]. In our study, all infants were placed supine for lung scanning, but the precise time spent in the supine or prone position before lung ultrasonography was not recorded for each infant, which was likely to affect LUS results. Third, the proportion of infants born before 28 weeks was small, and future studies should focus on the predictive value of LUS in this population.

Conclusion

The collective results of available studies have shown that LUS has great promise for clinical research and the management of preterm infants. However, we must recognize that the contribution of LUS to predicting msBPD in infants born before and after 28 weeks is uncertain. Hence, multicenter prospective studies that focus on the predictive performance of LUS data in infants of different GAs are necessary.