Background

Retinopathy of prematurity (ROP) is a disease that causes vision impairment and blindness, most frequently manifesting among infants with low birth weight (BW) and poor health status. The survival of preterm infants has increased in the last few decades due to the rapid improvement in neonatal intensive care. Consequently, the incidence of ROP has increased, particularly in newly industrialized countries, constituting a “third epidemic” [1]. The reported incidence of ROP that requires treatment varies from 0 to 34.8% [2,3,4,5], depending on local neonatal care quality and the characteristics of each individual patient.

The development of ROP is associated with multiple risk factors. Low gestational age (GA) and low BW are two of the most important. Other factors include blood transfusion, mechanical ventilation, anemia, respiratory distress, dyspnea, and poor health. Several ROP screening guidelines based on GA and BW have been introduced to help neonatologists identify at-risk preterm neonates; the usual risk criteria are GA ≤32 weeks or BW ≤1500 g. An infant with a very unstable clinical course can also be identified as being at high risk of developing ROP, indicating a need for ophthalmologic screening [6]. Challenges in identifying ROP in preterm neonates include compliance with screening guidelines, the expense of timely screenings, potential neurologic and cardiopulmonary side effects of dilated fundus examinations, and the large amount of work required of health professionals. Therefore, a more feasible method is needed to identify infants who require ROP screening.

The ROPScore proposed by Eckert et al. is a scoring system that can be used to predict the severity of ROP [7]. This algorithm uses the following predictive variables: birth weight, gestational age, blood transfusion, mechanical ventilation, and proportional weight gain at the sixth week of life. The score is calculated in the sixth week of life using a spreadsheet. A high score indicates that the infant has a high risk of developing severe ROP [8]. The CHOP ROP (Children’s Hospital of Philadelphia ROP) model uses postnatal weight gain, BW, and GA to predict ROP and was developed in a cohort of infants who met current ROP screening guidelines [9].

To our knowledge, only a few studies have validated these screening tools [8,9,10,11]. These studies were retrospective analyses of the efficacy of the ROPScore in American, Canadian, Italian, and Brazilian populations. The purpose of this study was to evaluate the use of the ROPScore and CHOP ROP models to predict ROP in a Chinese population.

Methods

A retrospective cohort study was conducted from January 2009 to December 2019 in NICUs in Henan Province, China. The Life Science Ethics Committee of Children’s Hospital Affiliated to Zhengzhou University approved the study (IRB number 20081227).

Patient population

Patients eligible for enrollment included infants admitted to the NICU at GA ≤32 weeks or BW ≤1500 g. Infants were excluded from this study if they had genetic metabolic diseases or major congenital abnormalities, or if they died before the sixth week after birth.

Weight measurements

Standard clinical procedures were followed for all infants, and weight measurements were conducted weekly from birth to discharge. These measurements were repeated at a GA of 40 weeks [12].

ROPScore screening

ROPScore screening was conducted in the sixth week of life with a Microsoft Excel spreadsheet (Microsoft, Redmond, WA, USA), as suggested by Eckert et al. [7]. This algorithm uses the following predictive variables: birth weight, gestational age, blood transfusion, mechanical ventilation, and proportional weight gain at the sixth week of life [7]. The score is determined by linear regression, which takes into account the effect of each variable on the onset of ROP.

CHOP ROP screening

Binenbaum et al. developed a simpler logistic regression-based model named PINT ROP [13]. The PINT ROP cohort was at high risk for ROP. Therefore, the investigators applied the same modeling approach to a lower-risk cohort, which is more representative of the current US ROP screening criteria (BW < 1501 g), to develop an updated model called CHOP ROP [9]. Data were collected from medical records and entered into a web-based database, and consisted of BW, GA, weight gain rate measurements, detailed demographics, and ophthalmologic and medical data. Data quality was ensured through data input verification rules, data review and discrepancy checking algorithms, and investigation and analysis of all tag values [11, 14].

ROP screening and classification

ROP screening was performed for all extremely preterm infants by qualified ophthalmologists with expertise in ROP in accordance with the Chinese guidelines for the examination and treatment of ROP [15]. The choice to conduct additional ROP screening was determined according to the results of the initial screening. Termination of ROP screening was determined according to vascular development in the retina or up to 45 weeks of corrected GA [15]. ROP was subdivided into stages 1–5 based on the International Classification of ROP [16]. Mild ROP was defined as having stage 1 or stage 2 ROP in zone II or III without plus disease [12]. Type 1 ROP was defined as any stage ROP in zone I with plus disease; stage 3 ROP in zone I without plus disease; or stage 2 or 3 ROP in zone II with plus disease [17]. Type 2 ROP was defined as stage 1 or 2 ROP in zone I without plus disease; or stage 3 ROP in zone II without plus disease [17]. Severe ROP was defined as any prethreshold, any stage 3, or any threshold ROP [12].

Clinical data collection

The following clinical data were collected: age, sex, gestational age, birth weight, number of blood transfusions, weekly weight measurements, days of mechanical ventilation and oxygen administration, ROP examination results, and the incidences of necrotizing enterocolitis (NEC), bronchopulmonary dysplasia (BPD), intraventricular hemorrhage (IVH), and sepsis. Diagnosis of ROP was conducted by pediatric ophthalmologists. Retinal vascularization was graded as none, immature, or mature. Staging of disease was performed in accordance with the International Classification of ROP [18, 19].

Statistical analysis

SPSS software version 19.0 (SPSS, Inc., Chicago, IL, USA) was used for statistical analysis and data management. Maternal and infant characteristics were analyzed using descriptive methods and compared using t-test or one-way ANOVA (> 2 levels) for continuous variables and the chi-squared test for categorical variables. Receiver operating characteristic (ROC) curves were used to assess the accuracy of the continuous values of the ROPScore and CHOP ROP model to predict severe ROP. ROPScore was used as a dependent variable in conducting multiple linear regressions. The independent variables used in multiple logistic regression analysis were based on significant correlations and significant non-parametric univariate analyses. For severe ROP, these variables were: BW, GA, duration of ventilation, sepsis, and weight gain at the sixth week of life. The statistical significance level was set at p < 0.05.
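The ROC quantities used in this analysis can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in the study. The function names and the toy data are hypothetical, and the use of Youden's J to choose an "optimal" cut-off is an assumption, since the criterion applied in the study is not stated.

```python
from typing import Optional, Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], cutoff: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff counts as a positive prediction."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in pairs if y == 1 and s < cutoff)
    tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)
    fp = sum(1 for s, y in pairs if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> Optional[float]:
    """Cut-off maximizing Youden's J = sensitivity + specificity - 1.
    ASSUMPTION: the study does not say which criterion defined 'optimal'."""
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, cut)
        j = sens + spec - 1
        if j > best_j:  # strict comparison keeps the lowest cut-off on ties
            best_cut, best_j = cut, j
    return best_cut


# Hypothetical toy data, NOT study data: score per infant and outcome (1 = severe ROP)
toy_scores = [8.0, 9.5, 11.0, 12.5, 13.0, 14.0, 15.5, 17.0]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]

if __name__ == "__main__":
    cut = youden_cutoff(toy_scores, toy_labels)
    print("AUC:", auc(toy_scores, toy_labels))
    print("cut-off:", cut, sensitivity_specificity(toy_scores, toy_labels, cut))
```

The pairwise AUC is quadratic in the number of infants, which is acceptable for illustration; a rank-based computation would be preferable for a cohort of several thousand.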

Results

Baseline characteristics

In this study, 3624 infants were screened for ROP and underwent weekly weight measurements. The ROPScore and CHOP ROP models were developed for infants with GA ≤32 weeks at birth or BW ≤1500 g. Thirty-seven infants were excluded due to incomplete weight data or because they had pathological conditions. Thus, 3587 infants born at GA ≤32 weeks or with BW ≤1500 g were included in this study. The prevalence of ROP at any stage was 372/3587 infants (10.4%). Of these, 192/3587 infants developed type 2 ROP (5.4%) and 180/3587 developed type 1 ROP that required treatment (5.0%). The baseline demographics and clinical characteristics for this cohort are shown in Table 1. The weight gain rate was much lower in the type 1 and type 2 ROP groups than in the group with no ROP (p < 0.001 for both).
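As a quick consistency check, the cohort counts and percentages reported above can be reproduced in a few lines of Python. The variable and function names are illustrative only; the counts are those given in the text.

```python
def percent(count: int, total: int) -> float:
    """Percentage of total, rounded to one decimal place as in the text."""
    return round(100 * count / total, 1)


screened = 3624                 # infants screened for ROP
excluded = 37                   # incomplete weight data or pathological conditions
included = screened - excluded  # cohort analyzed

any_rop = 372  # ROP at any stage
type2 = 192    # type 2 ROP
type1 = 180    # type 1 ROP requiring treatment

if __name__ == "__main__":
    print(included)                   # cohort size
    print(percent(any_rop, included))
    print(percent(type2, included))
    print(percent(type1, included))
    print(type1 + type2 == any_rop)   # the two subgroups account for all ROP cases
```

The type 1 and type 2 counts sum to the any-stage count, so every ROP case in this cohort falls into one of the two categories.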

Table 1 Demographics of the 3587 very preterm infants included in the study

ROPScore outcomes

The accuracy of ROPScore in predicting ROP in our participants was determined by the ROC curve (Fig. 1). Sensitivity and specificity were obtained for continuous score values by using cut-off points. The range of ROPScore values was 7.2 to 19.6. The optimal cut-off point established for any stage of ROP was 12.3 (55.8% sensitivity and 77.8% specificity), whereas the optimal cut-off point for severe ROP was 13.3 (50.0% sensitivity and 87.0% specificity).

Fig. 1 Receiver operating characteristic (ROC) curves for the detection of any stage of retinopathy of prematurity (ROP) (a) and of severe ROP (b), according to the ROPScore algorithm

The areas under the ROC curve for the ROPScore were 0.70 and 0.76 for predicting any stage of ROP and severe ROP, respectively. For severe ROP, the area under the curve for ROPScore was significantly higher than the areas for BW (0.60), GA (0.73), and duration of ventilation (0.63) when each was assessed separately (Table 2).

Table 2 Area under the ROC curve for ROPScore compared with other predictors for severe ROP

ROPScore and infant characteristics

Multivariate logistic regression analysis showed that BW, GA, duration of ventilation, number of blood transfusions, and weight gain at the sixth week of life were risk factors for ROP. ROPScore showed a weaker tendency to predict ROP. The unadjusted coefficient was 0.064, with an odds ratio of 1.07 (95% confidence interval [CI], 1.03 to 1.11). The adjusted coefficient was 1.088, with an odds ratio of 2.97 (95% CI, 0.84 to 10.45), an interval that includes 1 (Table 3).
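In logistic regression, the odds ratio is the exponential of the coefficient, so the reported values can be checked directly. The short Python sketch below does this; the function names are illustrative, and the coefficients and interval bounds are those reported in the text.

```python
import math


def odds_ratio(coefficient: float) -> float:
    """Odds ratio implied by a logistic regression coefficient: OR = exp(beta)."""
    return math.exp(coefficient)


def ci_excludes_one(lower: float, upper: float) -> bool:
    """True when a confidence interval for an odds ratio does not contain 1,
    i.e. the association is significant at the corresponding level."""
    return lower > 1 or upper < 1


if __name__ == "__main__":
    # Unadjusted and adjusted coefficients for ROPScore as reported in the text
    print(round(odds_ratio(0.064), 2), ci_excludes_one(1.03, 1.11))
    print(round(odds_ratio(1.088), 2), ci_excludes_one(0.84, 10.45))
```

Both reported odds ratios match their coefficients. Only the unadjusted interval excludes 1.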

Table 3 Multiple logistic regression analysis of the predictive factors of ROPScore for severe ROP

CHOP ROP model outcomes

The CHOP ROP model correctly predicted all infants who developed type 1 ROP (sensitivity, 100%), but with low specificity: 21.4% from birth to six weeks of life, 41.2% in the third week, 36.9% in the fourth week, 32.6% in the fifth week, and 38.0% in the sixth week. These results are summarized in Table 4.

Table 4 Prediction of Type 1 ROP by the CHOP ROP Model Based on Birth Weight, Gestational Age, and Daily Weight Gain Rate

Discussion

Eckert et al. developed a relatively uncomplicated model for predicting ROP in preterm infants, known as ROPScore [7]. The model is implemented in an Excel spreadsheet that contains a logistic regression equation used to calculate risk. The model includes continuous rather than dichotomized terms for BW and GA, weight gain at a single time point (6 weeks postnatal age) as a proportion of BW, and dichotomous terms for blood transfusion and the use of oxygen in mechanical ventilation during the first 6 weeks of life. Assuming a specific cut-off level for low or high risk cases, ROPScore had a sensitivity of 98% and specificity of 56% for predicting ROP cases that required treatment in a cohort of 474 Brazilian infants [7]. In the present study, ROPScore had a sensitivity of 50% and specificity of 87% for predicting ROP cases that required treatment in a cohort of 3587 Chinese infants. These findings suggest that ROPScore should not be used to determine overall screening criteria. Instead, it should be used to reduce the frequency of exams in low-risk infants [7].

The poor performance of postnatal weight gain ROP models in countries with developing neonatal care systems may be related to differences in ROP pathophysiology, particularly in older GA infants. At older post-menstrual ages, endogenous production of insulin-like growth factor-1 (IGF-1) has already increased, such that low IGF-1 may play a smaller role in the pathogenesis of severe ROP [20]. In contrast, ROP in such infants might be driven primarily by high oxygen exposure, which has been shown to cause inhibition of vascular endothelial growth factor and retinal blood vessel destruction in oxygen-induced animal models of ROP. Notably, other predictive models currently undergoing testing in ROP also have limitations. For example, WINROP [21] was proposed for use in European populations and has been validated by several studies [12, 22,23,24], which have shown robust effectiveness in predicting ROP. However, some studies have shown that this score does not perform well in underdeveloped countries, in which moderate and late preterm infants can also develop ROP [25, 26].

We validated the CHOP ROP model in a large cohort of Chinese infants. The size of the cohort, including 180 infants who developed severe ROP, allowed us to estimate the sensitivity of the model with a high degree of precision. This study showed that the CHOP ROP model can be applied clinically to reduce the number of infants requiring examinations by one-third. No infants with type 1 ROP were excluded (sensitivity, 100%) using this model, which showed higher sensitivity than in the evaluation of North American infants (sensitivity, 98.5%) [11]. Therefore, the CHOP ROP model could be used with confidence, ensuring that all infants with type 1 ROP are identified. The model can also be used to guide a modified screening schedule to reduce the number of examinations for lower-risk, older-GA infants.

In China, the prevalence of ROP varies according to the region, level of neonatal care, and access to ophthalmologic screening programs. Importantly, blindness caused by ROP can be prevented with timely screening [27]. The CHOP ROP and ROPScore models are useful for predicting ROP. Scoring systems have become widely used in neonatology, including neonatal intensive care, in order to aid in the detection of comorbidities. Predictive algorithms represent promising and appropriate tools that can be used to identify preterm infants at risk of developing severe ROP, as well as to reduce the excessive number of examinations performed for each preterm infant [28]. The CHOP ROP model was more sensitive than ROPScore for predicting type 1 ROP. The introduction of predictive algorithms remains in the preliminary phase and it should be emphasized that the goal is not to replace current screening guidelines. Rather, these tools can be used to help reduce the incidence of missed diagnoses of ROP [29, 30].

Despite the positive aspects of these predictive algorithms, there are also limitations in clinical application. First, the ROPScore calculation uses the infant's weight only at the sixth week of life. Hence, this test may be unable to detect high-risk preterm infants in whom aggressive posterior ROP begins before the weight measurement and then evolves rapidly [30]. Moreover, early hospital discharge of preterm infants who show robust growth can prevent weight data from being collected at the correct time, which makes it impossible to apply the ROPScore and CHOP ROP models.

Conclusion

We demonstrated that the ROPScore and CHOP ROP models were effective, promising, and noninvasive screening tools for the prediction of ROP in a Chinese population of preterm infants. The results obtained by Eckert et al. [7] were compatible with the results obtained in the present cohort regarding high sensitivity. With regard to ROPScore cut-off points, we adjusted the values for use in a Chinese population (12.3 and 13.3 for any stage of ROP and severe ROP, respectively), similar to the cut-off points used in the original study [7]. This suggests that the cut-off points would have been sufficient to detect all preterm infants with severe ROP. However, the sensitivity was lower than that reported by Eckert et al. [7]. Thus, the ROPScore may need optimization for the Chinese population. The sensitivity of the CHOP ROP model was higher in our study than that reported by Binenbaum et al. for North American infants. Therefore, the CHOP ROP model may be more appropriate for the Chinese population.