Comparison of lung ultrasound scores with clinical models for predicting bronchopulmonary dysplasia

Lung ultrasound scores (LUSs) have been demonstrated to accurately predict moderate-to-severe bronchopulmonary dysplasia (msBPD). This study attempted to explore the additional value of LUSs for predicting msBPD compared to clinical multivariate models in different gestational age (GA) groups. The study prospectively recruited preterm infants with GA < 32 weeks. Lung ultrasound was performed on days 3, 7, 14, and 21 after birth. A linear mixed-effects regression model was used to evaluate LUS evolution in infants born before and after 28 weeks. The receiver operating characteristic (ROC) procedure was used to analyze the reliability of LUS and clinical multivariable models for predicting msBPD. The optimal time to predict msBPD in all infants was 7 days with a cut-off point of 5 (area under the ROC (AUROC) curve: 0.78, 95% confidence interval (CI): 0.71–0.84). In infants with GA ≥ 28 weeks, LUSs provided a moderate diagnostic accuracy for all four time points (AUROC curve: 0.74–0.78), and the AUROC curve for the clinical multivariable model on day 14 was 0.91 (95% CI: 0.84–0.96), which was significantly higher than that of LUSs (AUROC curve: 0.77, 95% CI: 0.68–0.85, P < 0.05). In infants born at 23–27 weeks, LUSs showed a low diagnostic accuracy with higher cut-off points to predict msBPD, and the AUROC curve for GA to predict msBPD was 0.75 (95% CI: 0.59–0.85), providing diagnostic accuracy similar to that of LUSs.
Conclusion: The contribution of LUSs to predict msBPD in infants with different GAs remains controversial and requires further investigation.
What is Known:
• Lung ultrasound scores (LUSs) have been demonstrated to accurately predict moderate-to-severe bronchopulmonary dysplasia in infants with gestational age (GA) < 32 weeks.
What is New:
• The evolution of LUSs differed between extremely preterm infants born before 28 weeks and preterm infants born at 28–32 weeks of gestation.
• LUSs provided similar moderate predictive performance as GA-adjusted LUS and clinical multivariate models in infants born after 28 weeks, while LUSs seem to be less helpful in infants born before 28 weeks.
Supplementary Information: The online version contains supplementary material available at 10.1007/s00431-023-04847-y.


Introduction
Bronchopulmonary dysplasia (BPD) is one of the most common and severe sequelae in very preterm infants, and several studies have developed clinical prediction models to guide patient management [1,2]. A web-based prediction model from the National Institutes of Health (NIH) reported areas under the receiver operating characteristic (ROC) (AUROC) curve of 0.81, 0.82, and 0.83 on days 3, 7, and 14 of postnatal age, respectively, for BPD prediction in infants born at 23-30 weeks [3]. In recent years, the application of lung ultrasonography in neonatal intensive care units (NICUs) has considerably increased due to its rapid acquisition, portability, lack of radiation exposure, excellent diagnostic accuracy, and reproducibility in following the disease progress [4][5][6][7][8][9]. Previous studies have revealed that lung ultrasound scores (LUSs) in the first 14 days after birth are useful for predicting BPD, with an AUROC curve of 0.7-0.9 in preterm infants [6,9,10]. Extremely preterm infants born before 28 weeks of gestation benefit the most from accurate early prediction of BPD; however, recent studies have expressed concerns about the accuracy of LUSs for predicting BPD in this population. The evolution of LUSs in this population is significantly different from that in infants with gestational ages (GAs) of 28-30 weeks and 31-33 weeks, and a recent study found that the predictive value of LUSs may be insignificant in the multivariate models in which GA was the main predictor in this population [11,12].
Therefore, this study aimed to evaluate the predictive value of LUS for the development of moderate-to-severe BPD (msBPD) at different time points in infants born before and after 28 weeks. Additionally, we constructed multivariate regression models using available clinical variables to assess the utility of LUSs.

Materials and methods
This prospective, observational, longitudinal diagnostic accuracy cohort study was conducted between January 2020 and July 2021 at Jilin University First Hospital, which has extensive experience in the use of neonatal ultrasound in China. This study was approved by the hospital's institutional review board (clinical trial registered with www.chictr.org.cn [chiCTR1900023869]), and parental consent was obtained upon NICU admission. All procedures were performed in accordance with the Declaration of Helsinki.
This study enrolled preterm infants with GA < 32 weeks who were admitted to the NICU on the day of birth. The exclusion criteria were the following: (1) complex congenital malformations or chromosomal abnormalities, (2) congenital lung diseases or congenital heart defects, and (3) death before 36 weeks postmenstrual age.
All enrolled infants underwent lung ultrasonography performed by two ultrasound physicians at day 3 (D3), day 7 (D7, ± 1 day), day 14 (D14, ± 1 day), and day 21 (D21, ± 2 days) of postnatal age. The procedure was performed after at least 30 min in the supine position, following a standardized protocol [13]. All ultrasonographic images and videos were digitally recorded, anonymized, and reviewed by a senior independent ultrasonographer blinded to patients' clinical information. Lung ultrasonography was performed with a linear probe (9-12 MHz) according to availability. The LUS calculation was based on the description of Brat et al. [6,9], and the total score ranged from 0 to 18.
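As a minimal sketch of how a total in the 0 to 18 range can arise, the following assumes the six-region scheme commonly attributed to Brat et al., with three regions per lung each scored from 0 to 3. The region labels and the function name are hypothetical and are not taken from the study protocol; the example values are invented.

```python
# Hypothetical region labels: three regions per lung (assumed six-region scheme).
REGIONS = (
    "right_upper_anterior", "right_lower_anterior", "right_lateral",
    "left_upper_anterior", "left_lower_anterior", "left_lateral",
)


def total_lus(region_scores):
    """Sum per-region scores (each 0-3) into a total LUS between 0 and 18."""
    total = 0
    for region in REGIONS:
        if region not in region_scores:
            raise ValueError(f"missing score for region: {region}")
        score = region_scores[region]
        if score not in (0, 1, 2, 3):
            raise ValueError(f"score for {region} must be 0, 1, 2, or 3")
        total += score
    return total


# Invented example scan, for illustration only.
example_scan = {
    "right_upper_anterior": 1, "right_lower_anterior": 2, "right_lateral": 1,
    "left_upper_anterior": 0, "left_lower_anterior": 1, "left_lateral": 2,
}
print(total_lus(example_scan))  # 7
```

Validating each region before summing keeps an incomplete or mis-keyed scan from silently producing a low total.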
Respiratory data (PtcO2 and type of respiratory support) were collected on the same day on which the LUS was calculated. The ventilatory policy is shown in the supplementary material. PtcO2 was estimated with adequately calibrated transcutaneous oxygen tension monitoring devices (TCM4, Radiometer, Denmark). Respiratory support was defined as (i) invasive mechanical ventilation, (ii) non-invasive ventilation, (iii) oxygen (high-flow or low-flow oxygen), or (iv) none. If more than one type of respiratory support was recorded during the day, the highest form of ventilation was used. Additionally, given that the duration of hemodynamically significant patent ductus arteriosus (hsPDA) is associated with an increased risk of BPD [14,15], we dynamically recorded the presence of hsPDA using echocardiography within 1 day of lung ultrasonography. hsPDA was defined as a left atrial-aortic root ratio > 1.3, PDA size > 1.5 mm, and a ductal left-to-right shunt [16].
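The two recording rules above can be sketched in code. This is an illustration under stated assumptions: the ordering invasive > non-invasive > oxygen > none follows the order in which the categories are listed, and the hsPDA check assumes that all three echocardiographic criteria must be met together. The names `Support`, `daily_support`, and `is_hspda` are hypothetical.

```python
from enum import IntEnum


class Support(IntEnum):
    """Respiratory support categories, ordered from lowest to highest (assumed)."""
    NONE = 0
    OXYGEN = 1          # high-flow or low-flow oxygen
    NON_INVASIVE = 2    # non-invasive ventilation
    INVASIVE = 3        # invasive mechanical ventilation


def daily_support(recorded):
    """Return the highest form of support recorded during the day."""
    return max(recorded, default=Support.NONE)


def is_hspda(la_ao_ratio, pda_size_mm, left_to_right_shunt):
    """Apply the three stated criteria jointly (assumption: all must hold)."""
    return bool(la_ao_ratio > 1.3 and pda_size_mm > 1.5 and left_to_right_shunt)


# Invented example: an infant weaned from invasive ventilation to oxygen in one day.
print(daily_support([Support.OXYGEN, Support.INVASIVE]).name)  # INVASIVE
print(is_hspda(1.4, 1.6, True))  # True
```

Using an ordered enumeration makes the "highest form" rule a single `max` call and keeps the category ranking explicit in one place.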
The study's primary outcome was msBPD, defined as the requirement for oxygen or respiratory support at 36 weeks postmenstrual age or at the time of discharge, and the secondary outcome was any grade of BPD defined by the NIH in 2001 [17].

Statistical analysis
The sample size was calculated using PASS software (NCSS, LLC, Kaysville, UT, USA). Approximately 35% of preterm infants with GA < 32 weeks developed msBPD in our NICU during the first 6 months of the study. An AUROC curve of ≥ 0.7, as previously published, and an α error of 0.05 and 90% power were targeted [18]. Based on this calculation, the required sample size was 94 infants.
Normally distributed continuous variables were expressed as the mean ± standard deviation (SD) and compared with Student's t test, while non-normally distributed continuous variables were summarized as medians and interquartile ranges and compared with the Mann-Whitney test. Categorical variables were summarized as counts and percentages and compared with the chi-square (χ²) or Fisher's exact test, as appropriate. LUSs were compared between groups using repeated-measures analysis of variance. The ROC procedure was used to analyze the reliability of LUSs for predicting msBPD at different time points. The AUROCs were compared using DeLong's test, and the optimal cut-off points were determined using Youden's method.
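To make the ROC step concrete, the sketch below computes an AUROC as the proportion of case-control pairs in which the case has the higher score (ties count one half) and picks the cut-off that maximizes Youden's index J = sensitivity + specificity − 1. It assumes that a score at or above the cut-off counts as a positive test; the function names and the data are invented for illustration, and DeLong's test is not implemented here.

```python
def auroc(scores_pos, scores_neg):
    """AUROC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def youden_cutoff(scores_pos, scores_neg):
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    Assumption: a score >= cutoff is treated as a positive test.
    """
    best = None
    for cutoff in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(s >= cutoff for s in scores_pos) / len(scores_pos)
        spec = sum(s < cutoff for s in scores_neg) / len(scores_neg)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Invented scores: infants with the outcome (cases) and without it (controls).
cases = [5, 6, 7, 4, 8, 9]
controls = [1, 2, 3, 4]
print(round(auroc(cases, controls), 3))
print(youden_cutoff(cases, controls))
```

The pairwise definition is equivalent to the Mann-Whitney U statistic divided by the number of pairs, which is why it needs no explicit ROC curve.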
Data analysis was based on different GA groups. First, the predictive performance of LUSs for msBPD was evaluated according to different GA groups. Second, the enrolled infants were allocated to one of two GA groups (23-27 weeks and 28-32 weeks) to evaluate the LUS evolution and the contribution of different predictors of msBPD. Linear multilevel mixed-effects regression was used to model LUS evolution, with msBPD entered as an effect-modifying and interaction variable. To evaluate the utility of LUSs as predictive models, multivariate logistic regression was applied for msBPD prediction using available clinical covariates without multicollinearity and compared with LUSs and GA-adjusted LUSs. Model 1 included GA alone, model 2 included LUS alone, model 3 included GA and LUS (GA-adjusted LUS), and model 4 included GA, sex, presence of hsPDA, type of respiratory support, and PtcO2 on the day of lung ultrasonography. Multicollinearity between clinical covariates was examined according to tolerance and variance inflation factor. GA was chosen as a covariate because it plays a prominent role in the development of BPD [19], and birth weight was excluded because it creates relevant multicollinearity with GA. The predictive performances of the selected models were assessed using the AUROC curve and 95% confidence interval (CI), and goodness-of-fit was evaluated using the Hosmer-Lemeshow test.
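The multicollinearity check can be illustrated for the simplest case of two covariates, where tolerance equals 1 − r² and the variance inflation factor (VIF) is its reciprocal, with r the Pearson correlation. This is a simplified sketch: with more than two covariates, each VIF comes from regressing one covariate on all the others. The function names and the GA and birth-weight values below are invented and are not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def tolerance_and_vif(x, y):
    """Two-covariate case only: tolerance = 1 - r^2, VIF = 1 / tolerance."""
    tolerance = 1 - pearson_r(x, y) ** 2
    return tolerance, 1 / tolerance


# Invented values: gestational age (weeks) and birth weight (g).
ga_weeks = [24, 26, 28, 30, 31]
birth_weight_g = [650, 800, 1100, 1350, 1500]
tolerance, vif = tolerance_and_vif(ga_weeks, birth_weight_g)
print(round(tolerance, 4), round(vif, 1))
```

A tolerance near zero (and a correspondingly large VIF) indicates that one covariate is almost a linear function of the other, which is the situation described for birth weight and GA.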

Results
There were 190 eligible infants during the study period (Fig. 1). The clinical characteristics of the enrolled infants (n = 150) are described in Table 1; 63 (42%) were diagnosed with msBPD, and 87 (58%) had no/mild BPD. The mean GA and birth weight of enrolled infants were 28.70 (SD: 1.59) weeks and 1176.5 (SD: 224.6) g, respectively. Infants who developed msBPD were more preterm, more likely to have hsPDA, had higher CRIB-II scores, and required longer respiratory support. There were 43 (28.7%) and 107 (71.3%) infants born at 23-27 weeks and ≥ 28 weeks of gestation, respectively, and their clinical characteristics are described in Supplemental Table 1.

Diagnostic accuracy of the LUSs in all infants
This cohort had significant differences in LUSs between infants with and without BPD and between infants with msBPD and with no/mild BPD from D3 to D21 (Fig. 2A, B; P < 0.0001, between-participant comparisons). LUSs had good diagnostic accuracy for predicting any BPD and msBPD on D3, D7, D14, and D21, as shown in Table 2 and Supplemental Table 2. The optimal time for BPD prediction was D14 with a cut-off point of 4 (sensitivity: 72.9%, specificity: 90.7%, AUROC curve: 0.87, 95% CI: 0.81-0.92). In comparison, the optimal times for msBPD prediction were D7 and D14, with the cut-off points of 5
In the five-factor model, lower GA was consistently associated with the severity of BPD regardless of the postnatal day (Table 3). The AUROC curve for GA to predict msBPD was 0.75 (95% CI: 0.59-0.85), and it provided diagnostic accuracy similar to that of LUSs, GA-adjusted LUSs, and the clinical five-factor model for all three time points (DeLong's test, overall P > 0.05; Table 4).

Discussion
This study found that an early LUS, in the first 2 weeks of postnatal age, can predict any BPD and msBPD in infants with GA < 32 weeks. The study also showed that the contribution of LUS differed between extremely preterm infants and preterm infants born at 28-32 weeks of gestation. LUSs provided similar moderate predictive performance as GA-adjusted LUS and clinical multivariate models in infants born after 28 weeks, while LUSs seem to be less helpful in infants born before 28 weeks. We followed a protocol and BPD definition similar to those of Loi et al. [9] and Alonso-Ojembarrena and Lubián-López [18]. Alonso-Ojembarrena and Lubián-López [18] and Alonso-Ojembarrena et al. [20] reported that an LUS of ≥ 5 on D14 was the best predictor of any form of BPD (sensitivity: 74%, specificity: 100%, AUROC curve: 0.93, 95% CI: 0.80-0.99),   This difference may be due to either the intercenter variability in oxygen administration and the decision for respiratory support or the higher proportion of infants with msBPD in the total sample (42% vs 24.5%).
In infants born at 28-32 weeks of gestation, LUSs performed well in predicting the development of msBPD, similar to that observed in the total sample. LUSs decreased rapidly during the first week of life but remained high and stable in infants with evolving msBPD, as previously reported [12,21]. Interestingly, the present study found that the predictive performance of the clinical five-factor model was superior to that of LUSs on D14. Thus, reducing the frequency of lung ultrasonography in these infants is possible if other studies verify this result. However, Alonso-Ojembarrena et al. [20] found that the diagnostic performance of their LUSs was comparable to that of the clinical multivariate model that included GA, sex, prenatal corticosteroids, surfactant, and invasive mechanical ventilation at D7, and late-onset sepsis before D7, and Loi et al. [9] showed that GA-adjusted LUS had slightly higher reliability than the NIH calculator. We believe that a possible explanation is the enrolment of relatively mature preterm infants (≥ 28 weeks) in the current analysis. In addition, this difference may have been due in part to the choice of expanded LUS protocols that considered the posterior lung fields. BPD is an inhomogeneous lung disease characterized by impaired alveolar structure and lung vascular growth [22]. Theoretically, with impaired ventilation in the gravity-dependent posterior lung areas, the probability of atelectasis is significantly increased, thereby increasing LUS. Therefore, a fuller representation of the lateral and posterior fields of the lung may be more accurate for the prediction of msBPD in very preterm infants. However, the additional value of adding posterior lung fields to the LUS in predicting BPD remains controversial. Loi et al. [9] and Liu et al. [8,23] demonstrated that the predictive performance of expanded LUS protocols (10-region LUS and 12-region LUS) was superior to that of the 6-region protocol, whereas a meta-analysis and two prospective multicenter studies concluded that the diagnostic accuracies of LUS and expanded LUS for BPD and msBPD were similar [20,24]. We believe it is valuable to question whether scanning of the posterior lung zones improves diagnostic accuracy, and national multicenter LUS data should be integrated to explore and optimize lung scanning protocols.
Alonso-Ojembarrena et al. [12] and Raimondi et al. [21] reported different LUS patterns from birth until 6-14 weeks in infants born before 28 weeks. Persistently high LUSs in the first 4 weeks after birth for infants with GA < 28 weeks were described, regardless of their subsequent evolution to msBPD. This may be related to the non-specificity of LUSs and the impact of frequent acute clinical events (inflammation and complications associated with respiratory support) due to respiratory insufficiency of prematurity. These results were consistent with those of our study. However, we observed no significant differences between infants born at 23-27 weeks of gestation with no/mild BPD and msBPD on D3 and D7 in our study. The effect of GA on LUSs and the high incidence of BPD in this group may have influenced the results. In infants born before 28 weeks, the predictive performance of LUSs was lower than that of infants with GA ≥ 28 weeks, and this difference was due to the persistently high LUSs in the more immature infants. GA remained the dominant predictor of msBPD, and adding LUSs or other clinical predictors in the first 2 weeks only marginally increased the AUROC curve; they were insufficient to reach statistical significance at any time point. Woods et al. [11] prospectively enrolled 100 infants born at < 28 weeks of gestation and reported that LUS on D3-D4 and D7 accurately predicted BPD (according to Jensen et al. [19]). However, lung ultrasonography performed within the first week of life has little value for GA-based models [11]. The AUROC curve for GA in predicting BPD in our cohort was 0.75; this was in alignment with their results, albeit with slightly higher AUROC curve values. Additionally, our results could suggest that the contribution of LUSs in this population with a higher BPD prevalence remains controversial. Mohamed et al. [10] and Abdelmawla et al. [25] reported that the AUROC curve for LUSs in the first 2 weeks after birth was as high as 0.95. 
The generalizability of these studies is limited owing to the impact of variability in respiratory management among different NICUs on the incidence of msBPD and the small sample size.
This study had certain limitations. First, although our data are in close agreement with those of other reports, this was only a single-center pragmatic study, which lacks the scientific rigor required to support widespread changes in practice. In addition, the intercenter variability in oxygen administration and decisions for respiratory support can significantly affect the generalizability of our results; thus, these findings need to be interpreted cautiously. Second, 6-h pronation has been reported to significantly improve lung aeration and gas exchange in infants with evolving BPD, resulting in decreased LUSs [26]. In our study, all infants were placed supine for the lung scanning, and the precise time spent in the supine or prone position prior to lung ultrasonography was not recorded for each infant, which was likely to affect LUS results. Another limitation of our study is the small proportion of infants born before 28 weeks, and future studies should focus on the predictive value of LUS in this population.

Conclusion
The collective results of available studies have shown that LUS has great promise for clinical research and the management of preterm infants. However, we must recognize that the contribution of LUS to predict msBPD in infants born before and after 28 weeks is uncertain. Hence, multicenter prospective studies that focus on the predictive performance of LUS data in infants of different GAs are necessary.