Introduction

Bronchopulmonary dysplasia (BPD) is a major chronic respiratory complication of preterm infants [1]. Since BPD was described for the first time more than five decades ago [2], perinatal care has seen major changes. In the 1960s, invasive mechanical ventilation was the only way to manage respiratory distress syndrome (RDS), and such exposure was shown to injure premature lungs. Different therapeutic or preventive strategies like antenatal steroids, surfactant, ventilation strategies and postnatal steroids have been studied to reduce the incidence of BPD [3, 4].

Although the administration of antenatal steroids is associated with a reduction of serious adverse outcomes related to prematurity (perinatal and neonatal death, RDS, intraventricular haemorrhage, necrotising enterocolitis), no reduction in chronic lung disease has been shown [5]. Since the 1990s, prophylactic surfactant therapy has dramatically improved survival [6]. However, the analysis of studies on the routine use of nasal continuous positive airway pressure (nCPAP) showed a decrease in ‘BPD or death’ in infants stabilised on nCPAP [7], rendering the practice of general prophylactic surfactant application obsolete.

Already 20 to 30 years ago, many research groups identified risk factors and developed scoring methods to predict BPD [8, 9]. However, those scores were rapidly outdated due to significant changes in the management of very low birth weight (VLBW) infants or due to changes in BPD definitions. More recent publications used other variables to build new predictive scoring systems [10,11,12]. A 2013 systematic review concluded that existing clinical predication models for BPD cannot be used in practice due to low statistical quality, and suggested that future predictive scores should be validated before implementation [13]. A new BDP imaging score performed with lung ultrasound was recently evaluated and showed promising results [14, 15]. Risk stratification of BPD might be useful for counselling families and clinical trials. More importantly, it may help to identify patients at high risk for BPD by pre-selecting them for specific treatment approaches like postnatal steroid administration, even if the optimal type, dosage and timing are still not well defined [16].

Our objective was to develop and validate a predictive BPD risk score by identifying risk factors based on a national database.

Methods

Design and data collection

This was a national multicentre registry study with prospective data collection for the development and validation of a predictive model for BPD. Patient data were extracted from the national database of the Swiss Neonatal Network (SwissNeoNet), which prospectively collects data from each of the nine Swiss neonatal intensive care units (NICUs) providing tertiary-level neonatal care. All admitted live-born infants with a birth weight < 1501 g and/or gestational age between 23 0/7 and 31 6/7 weeks postmenstrual age (PMA) were included. Data from patients born between January 1, 2009 and December 31, 2010 were analysed as the derivation cohort. Exclusion criteria were missing or incomplete data sets, congenital malformations or syndromes with a potential impact on the respiratory system, and death before fulfilling the BPD definition, therefore not allowing the analysis of the combined outcome ‘BPD or death’. The following variables, considered to have a potential impact on development of BPD, were extracted from the registry and patient records [17]: mode of delivery, gestational age (GA), birth weight (BW) and corresponding z-score, singleton or multiple pregnancy, antenatal steroid treatment, intubation in the delivery room, significant (requiring medical or surgical treatment but not if treated only with fluid restriction and diuretics or if treated prophylactically in the absence of symptoms) patent ductus arteriosus (PDA), proven infection (clinical evidence of infection as well as at least one relevant positive blood or cerebrospinal fluid culture), surfactant therapy, number of cumulative days of mechanical ventilation (MV) and anonymised NICUs. The outcomes predicted by the models were BPD28 and BPD36. BPD28 was defined as the need for supplementary oxygen (> 12 h per day) during ≥ 28 cumulative days between birth and 36 weeks PMA. The criterion for BPD36 was the need for supplementary oxygen at 36 weeks PMA. Data from patients born between January 1, 2014 and December 31, 2015 and included in the registry were analysed as the validation cohort, using the same exclusion criteria and extracting the same items as for the derivation cohort.

Statistical analysis

Data analysis was performed using STATA 14 software (StataCorp. 2015. Stata Statistical Software: Release 14. College Station, TX, USA). Clinical characteristics were described by median and range for continuous variables, and numbers and percentages for categorical variables. The association between risk factors and BPD was examined by univariable logistic regression. To build the multivariable model, a backward deletion procedure was performed. All the variables with significant (p < 0.1) association with the outcome from univariable analysis were entered in the initial model. Then, variables with p value >0.05 were deleted one at a time. Only variables with p value <0.05 were retained in the final model. The BPD risk score was then built as the sum of these factors and weighted according to their ORs. Internal validation of the score was performed using the bootstrap method (repeated 1000 times), as described by Harrell et al. [18]. Multicollinearity was examined using commonly used diagnostic measures of collinearity: the Variance Inflation Factor (VIF) and tolerance. To assess the performance of the derived prognostic score, we examined two indices of accuracy: discrimination and calibration. Discrimination, i.e. the degree to which the prognostic score enables the discrimination between patients with favourable and unfavourable outcome, was assessed by calculation of the area under the receiver operating characteristic (ROC) curves (AUCs) in the derivation and validation cohorts. ROC analyses were performed at 36 weeks PMA and on day of life (DOL) 1. Calibration, i.e. the agreement between predicted and actual outcome, was assessed in the derivation and validation cohorts with the use of the Hosmer–Lemeshow goodness-of-fit test. Furthermore, sensitivity and specificity for the best cut-off value were assessed. The best threshold value of the score was chosen manually so as to simultaneously maximise the sensitivity and specificity. This procedure was performed for the two definitions of BPD.

Results

During the observed period for the derivation cohort (2009 and 2010), 1488 patients were born in the SwissNeoNet. Among them, 34 were excluded because of missing data and 49 because of malformations or syndromes with potential impact on the respiratory system. There were 71 deaths in the delivery room. Finally, 102 and 109 were excluded because they died before having reached the BPD28 and BPD36 criteria, respectively (Fig. 1a). The incidence of BPD28 among this population was 266/1232 (21.6%) and 138/1225 (11.3%) for BPD36.

Fig. 1
figure 1

a Derivation cohort profile. b Validation cohort profile. SwissNeoNet: national database of the Swiss Neonatal Network; DR: delivery room; PMA: postmenstrual age

The cohort used for validation (SwissNeoNet 2014–2015) included 2006 patients. Twenty-four were excluded due to incomplete data, 86 due to syndromes or malformations with pulmonary impact, and 60 which died in the delivery room. Of this cohort, 103 and 112 patients did not reach the age permitting a diagnosis of BPD28 or BPD36, respectively (Fig. 1b). The incidence of BPD28 was 437/1733 (25.2%) and 191/1724 (11.1%) for BPD36.

Calculated VIFs varied between 1.01 and 3.12 (no VIF >10) which indicate the absence of multicollinearity between the retained variables in the BPD28 and BPD36 BPD risk score models.

Seven variables were identified as independent predictors of BPD28 in a multivariable logistic regression analysis: GA, BW, antenatal corticosteroid treatment, surfactant administration, proven infection, PDA requiring medical or surgical treatment, and duration of MV in days. For each covariate in the model, the β-coefficient was directly related to the corresponding OR (Table 1).

Table 1 Univariate and multivariate logistic regression analysis for the BPD28 definition

The BPD28 risk score derived from the β-coefficients was

−0.40 × GA* − 0.12 × BW* − 0.51 × antenatal steroid** + 0.64 × PDA** + 0.85 × proven infection** + 1.15 × surfactant** + 1.03 × days of MV* − 2.45 (*continuous variables were corrected and centrated as explained in Annex 1; **for categorical variables—yes = 1 and no = 0).

The association between the scores and the probability to develop BPD28 is presented in Table 2 and Fig. 2.

Table 2 Probability of developing BPD28 according to the BPD risk score
Fig. 2
figure 2

Correlation between the BPD risk score and the probability of developing BPD28. Frequency: number of patients in the validation cohort; risk of BPD: probability of developing BPD28

In the derivation cohort, maximum sensitivity (82%) and specificity (82%) were calculated at a cut-off of − 1.4 (range − 5 to + 4) at 36 weeks PMA. At DOL 1, a cut off of − 2.2 (range − 5.5 to + 0.8) for maximum sensitivity (81%) and specificity (81%) was found. The BPD risk score had an excellent discriminatory power, as shown by its AUC of 0.90 at 36 weeks PMA and 0.88 at DOL 1 and was well calibrated as confirmed by the Hosmer–Lemeshow test (Hosmer–Lemeshow χ2(28) = 24.40, Prob > χ2 = 0.66).

For the BPD36 definition, five variables were found to be statisticaly significant: BW, surfactant administration, proven infection, significant PDA and the sum of days of MV (Table 3). The model built from the β-coefficients was

Table 3 Univariate and multivariate logistic regression analysis for the BPD36 definition

− 0.17 × BW* + 0.52 × PDA** + 1.29 × proven infection + 0.63 × surfactant** + 1.06 × days of MV* − 3.81 (* continous variables were corrected and centrated as explained in Annex 2; **for categorical variables—yes = 1 and no = 0). The AUC of 0.89 at 36 weeks PMA and 0.84 at DOL 1 showed an excellent discriminatory power of the model, and good calibration was confirmed by the Hosmer–Lemeshow test (Hosmer–Lemeshow χ2(28) = 26.48, Prob > χ2 = 0.55). Maximum sensitivity (83%) and specificity (83%) were calculated at a cut-off of − 2.2 (range − 6 to + 2) in the derivation cohort at 36 weeks PMA. At DOL 1, a cut off of − 3.6 (range − 6 to + 2) for maximum sensitivity (77%) and specificity (77%) was found.

In the validation cohort, the performance of the score showed also an excellent discriminatory power as shown by the corresponding AUC values of 0.92 and 0.88 for BPD28 and BPD36, respectively.

As an example, for a preterm infant born at 26 3/7 weeks PMA (GA* = − 2.81), with an incomplete course of antenatal corticoids (antenatal steroid** = 1), a BW of 600 g (BW* = − 6.33), with 3 days of MV (MV* = −0.035), who received surfactant (surfactant** = 1), was treated for PDA (PDA** = 1) and had no proven infection (infection** = 0), the score would be 0.7 and the risk of developing BPD28 would be 66.3%. If the score was calculated for the same patient 1 week later after an additional 7 days of MV, the score would increase to 1.6 and therefore the probability to develop BPD would be 83%. As another example, the BPD36 risk score for this same premature infant born with 3 days of MV (MV* = 0.35), who received surfactant (surfactant** = 1), and was treated for PDA (PDA** = 1) without infection (infection** = 0) would be − 1.21 and the risk of developing BPD36 would be 22.89%. This risk would increase to 48.65% if the duration of MV increased to 10 days.

Discussion

We developed a BPD risk score based on a large nationwide cohort and provide validation. This score allows excellent prediction of the risk of BPD early in life and at any time during hospitalisation, using relatively simple data. All predictors identified by the multivariable regression are assumed risk factors [17], except for surfactant administration and antenatal steroids. The fact that surfactant was found to be a risk rather than a protective factor in this study can be explained by the clinical reality that its administration is a marker of disease severity. To compare the differences in RDS support between the nine national NICUs, we sent them a questionnaire. The responses made it possible to consider a quite homogeneous approach in the respiratory support of VLBW infants in Switzerland. In time frame of the study in the vast majority of the Swiss units, only patients with severe RDS effectively were intubated and underwent surfactant administration, which is in line with international guidelines for the management of RDS [19]. Then, in the most recent Cochrane analysis [5], antenatal steroids did not affect the risk of BPD while a recent publication by Deshmukh and Patole [20] found that the risk was increased in extremely preterm infants below 25 WG. In the present study, antenatal steroids only were associated with reduction of BPD28 but not BPD36. Finally, lower GA and BW, proven infection, PDA and MV were associated with increased risk of BPD28 while only lower BW, proven infection, PDA and MV were for BPD36. As in any prediction model, accuracy of our BPD score was highest at the timepoint when the predicted event was supposed to occur. As the majority of our model variables are available at birth, and our sensitivity analysis revealed reliable prediction at DOL 1, we were able to demonstrate the validity of our score as of DOL 1 until 36 weeks PMA, with growing accuracy over time when more information on MV, PDA and infection become available.

In this national cohort of preterm infants < 32 weeks PMA and/or BW < 1501 g, the incidence of BPD varied widely according to the definition used. The clinical diagnosis of BPD is often used for quality control and benchmarking between NICUs [21, 22], and for the evaluation of new preventive and therapeutic strategies [23, 24]. Unfortunately, the latest consensus definition of BPD [25] is still not uniformly used in the recent literature. This is probably due to two of its shortcomings: (1) the definition requires an accurate number of hours (>12 h/day) and days of oxygen dependency, data which may be difficult to obtain in clinical practice and may not be systematically collected in patient records; and (2) in the National Institute of Child Health and Human Development (NICHD) consensus definition of BPD, nCPAP use is not considered as a possible substitute for supplemental oxygen administration and, therefore, the duration of nCPAP use is not considered in our algorithm. However, a higher mean airway pressure reduces oxygen requirement. It should be noted that nCPAP was the favoured primary mode of respiratory support in most Swiss units during the study period. According to the BPD36 definition, all infants treated with nCPAP at 36 weeks PMA would not be considered as having BPD if FiO2 is 0.21 (i.e. room air). If such an infant received more than 28 cumulative days of oxygen between birth and 36 weeks PMA, it would however fullfil the criteria of BPD28. Such incoherencies may in part explain the large differences between BPD incidences in the literature [26] and in our study (25.2% vs. 11.1% in the validation cohort for BPD28 and BPD36, respectively). The definition of BPD36 as used in this study may therefore be considered as a surrogate for moderate to severe forms of BPD28, according to the NICHD consensus definition.

Until now, as mentioned by Onland et al. [13], most published BPD scoring systems are only poor to moderate predictors of BPD because most of them lack validation [8, 12, 27, 28]. Furthermore, there is no published study presenting a calibration assessment of the model. In the present study, two indices of accuracy were used to assess the performance of the score. The discrimination power evaluated by the AUC was excellent and good calibration was confirmed by the Hosmer–Lemeshow test for both definitions in the derivation and validation cohorts. Otherwise, in most studies, scores assess the risk of developing BPD only at a particular moment. For example, Sinkin et al. [8] suggested a prediction at 12 h or 10 DOL, whereas Rozycki et al. [9] proposed an evaluation at 8 h of life or 14 DOL. In the study by Laughon et al. [12], prediction was made at the specific DOL 1, 3, 7, 14, 21 or 28. Compared to the Laughon score [12], our BPD score has several advantages: (1) it is based on easily assembled parameters available at any DOL until 36 weeks PMA; (2) infants of derivation and validation cohorts received comparable RDS management strategy promoting nCPAP over MV and are thus closer to typical modern day neonates; (3) the performance of the score is not only assessed by discrimination power but also by a calibration method. Promising new predicting imaging strategies using lung ultrasound are emerging [14, 15]. It would be interesting to compare this method with our model or to evaluate if the combination of both tools adds value to BPD prediction. The advantage of our score is that it is quickly achieved with existing data, and that it does not require performing lung ultrasounds, which may not be available in every NICU. Another strength of our study is its high number of patients, representing a whole population. SwissNeoNet includes > 97% of very preterm babies born in the country where patients are managed according to the best current standards with high rate of antenatal steroid treatments and preventive nCPAP. AUCs above 0.88 achieved in derivation and validation cohorts for both definitions allow an excellent prediction. Moreover, external validation brings more value to the results.

Our study presents some limitations. First, the data are extracted from a registry. Then, the scoring system was based on data coming from a country with lower BPD rate than reported internationally [29] with a quite homogeneous approach regarding respiratory support among the included tertiary centers. This model may not be applicable with the same accuracy in different settings. In addition, other variables with a potential impact on BPD such as gender or ethnicity were not included in the analysis. A further drawback is that the formula, due to its complexity, requires a calculator. To facilitate the use of the BPD risk score, we provide a web-based calculator at the following link: http://calc.chuv.ch/.

In conclusion, the presented BPD risk score allows to predict early in life and at any DOL the risk for a preterm infant to develop BPD and its potential to be moderate to severe by using seven or five easily available variables, respectively. This BPD risk score may be a useful tool in clinical practice and in neonatal research for the early identification and stratification of patients with a high risk of BPD.