Introduction

Hypoxic–ischemic encephalopathy (HIE) is an important form of neonatal encephalopathy with high mortality and morbidity; it occurs in 1–8 cases per 1,000 live births [1]. HIE can cause serious brain injury that progresses rapidly and carries a poor prognosis, although the degree of neurological sequelae varies among patients. Magnetic resonance imaging (MRI) is the optimal method for evaluating suspected HIE [2]. Because examination timing, injury patterns and imaging protocols differ, some patients with HIE have normal or pseudonormal MRI findings [3]. In these patients, any accompanying brain injury is particularly challenging to identify and define. Furthermore, neurologic deficits have been reported in some neonates with normal brain MRI findings [4].

The Sarnat Grading Scale assesses the severity of HIE [5]. Normal or pseudonormal MRI findings can affect Sarnat grading, and most such cases are classified as mild or moderate HIE. Infants graded as having mild HIE on the Sarnat scale were previously considered to have a good prognosis without long-term disability. However, recent analyses suggest an increased risk of behavioral dysregulation and attention deficit among school-aged children with a history of HIE at birth, even among those who had mild HIE [6,7,8]. The prevalence of abnormalities in infants with mild HIE or normal brain MRI findings ranges from 20% to 60% [9]. Infants with mild HIE, or with HIE and normal brain MRI findings, who were cooled had a lower incidence of brain injury than non-cooled infants [10]. Therefore, it is necessary to identify and treat HIE in infants with normal brain MRI findings.

Although conventional MRI is widely used in neonatal HIE, visual assessment alone yields limited information. Radiomics refers to the high-throughput computational extraction and analysis of features from digital medical images and the conversion of this information into mineable data [11]. After feature selection, these data can be used to construct biomarkers for disease prediction and diagnosis. Radiomics can thus provide potentially valuable information beyond the limits of human observation [12]. Several studies have investigated its clinical applicability and have shown the value of radiomics features in predicting disease, treatment response and prognosis in various cancers [13,14,15,16,17]. However, only a few studies have used texture analysis to evaluate ischemic changes in neonates, and no study has assessed normal MRI findings in neonates with HIE to predict potential brain injury. The current study aimed to develop and validate radiomics models to differentiate brain injury from normal MRI findings in neonates with HIE.

Materials and methods

Patients and data collection

This retrospective study was approved by the Medical Ethics Committee of the Hunan Children’s Hospital of South China University. Owing to the retrospective analysis of anonymized data, the need for informed consent was waived.

The database of the Neonatology Department was reviewed to identify neonates with perinatal asphyxia and HIE between January 2018 and April 2022. The inclusion criteria were as follows: (1) term neonates (gestational age ≥37 weeks) who underwent MRI; (2) normal MRI findings; and (3) available demographic, clinical and laboratory data. The exclusion criteria were as follows: (1) premature infants (gestational age <37 weeks); (2) infants with motion artifacts on MRI; and (3) infants with metabolic diseases. A total of 38 patients who met these criteria were included in the positive group.

The control group consisted of neonates who underwent brain MRI within the first 2 weeks of life to investigate the possibility of a congenital central nervous system malformation. Infants without abnormalities observed on brain MRI were included. Based on the consensus of radiologists, the control group included 89 neonates with normal brain MRI findings. The identification and selection of the study cohort and the exclusion criteria are presented in Fig. 1.

Fig. 1

Flow chart summarizing enrolment of the study population. HIE hypoxic–ischemic encephalopathy, MRI magnetic resonance imaging

Magnetic resonance imaging acquisition and processing

All patients underwent brain MRI, including diffusion-weighted imaging and susceptibility-weighted imaging (SWI) scans. All brain MRI scans were performed at our hospital using a 3.0-tesla (T) MRI scanner (MAGNETOM Skyra, Siemens Healthcare, Erlangen, Germany or MAGNETOM Prisma, Siemens Healthcare, Erlangen, Germany) with an eight-channel head coil with the same MRI parameters. Additional details can be found in Supplementary Material 1.

MR images were preprocessed before segmentation and feature extraction to remove potential differences between the studies acquired from the two different scanners. The image preprocessing was performed by X.D., a pediatric radiologist with 5 years of experience. Additional details can be found in Supplementary Material 2.

Model construction, evaluation and validation

MR images were imported into 3D Slicer software for segmentation and then saved for subsequent radiomics feature extraction. The deep medullary veins were assessed and quantified using regions of interest (ROIs) placed close to the lateral ventricles [18]. The ROIs of the basal ganglia, thalami and deep medullary veins were manually delineated by two pediatric radiologists (Y.Y., with 10 years of experience, and X.D.) in consensus. The radiologists anonymized the clinical data. Additional details can be found in Supplementary Material 3.

Patients were randomly divided into training and validation cohorts in a ratio of 7:3 for each sequence. Because some features contribute to classification performance whereas others only add noise, the minimum redundancy maximum relevance (mRMR) method was first used to filter out redundant and irrelevant features and retain those with the greatest predictive value. Thereafter, the least absolute shrinkage and selection operator (LASSO), which is suited to high-dimensional, low-sample-size data with collinearity, was applied with 10-fold cross-validation, and the features with non-zero coefficients were retained. The most predictive radiomics features were then used to construct the radiomics model, and the selected features in each model were combined into the corresponding Rad-score. The radiomics prediction models comprised the T1-weighted image (T1WI) basal ganglia model (T1WI-BG model), T1WI thalami model (T1WI-TH model), T2-weighted image (T2WI) basal ganglia model (T2WI-BG model), T2WI thalami model (T2WI-TH model), apparent diffusion coefficient (ADC) basal ganglia model (ADC-BG model), ADC thalami model (ADC-TH model) and SWI model. Based on the results of univariate and multivariate logistic regression analyses, the independent clinical predictors were combined with the Rad-score of the best-performing radiomics model to establish the combined nomogram model.
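As an illustration only, the LASSO selection step and the resulting Rad-score can be sketched in Python. This is not the implementation used in the study, which relied on the R package glmnet with a logistic model and 10-fold cross-validation. The sketch swaps in a squared-error LASSO solved by cyclic coordinate descent with a fixed penalty (lam), and every function name, data value and coefficient below is hypothetical.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Covariance of feature j with the residual that leaves feature j out.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, lam) / z
    return b


def rad_score(features, coefs, intercept=0.0):
    """Linear combination of the features that kept a non-zero coefficient."""
    return intercept + sum(c * f for c, f in zip(coefs, features) if c != 0.0)


# Hypothetical toy data: four cases, two standardised features.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [3.1, 2.9, -2.9, -3.1]

coefs = lasso_coordinate_descent(X, y, lam=0.5)
print("coefficients:", coefs)
print("Rad-score of first case:", rad_score(X[0], coefs))
```

In this toy example the weakly informative second feature is shrunk to exactly zero, which is the behaviour that makes LASSO act as a feature selector; only the surviving feature contributes to the score.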

Clinical features from the univariate analysis (with statistical significance P<0.05) were used in the multivariate regression analysis. Features with P<0.05 in the multivariate regression analysis were included in the clinical model.

The diagnostic efficiency of the different models was measured using receiver operating characteristic (ROC) curves and the area under the curve (AUC) in the training and validation cohorts. The DeLong test was used to compare the ROC curves. Calibration curves were used to assess the agreement between predicted and observed probabilities in the training and validation cohorts, and the Hosmer−Lemeshow test was used to evaluate the calibration. Finally, decision curve analysis (DCA) was used to evaluate the clinical value of the different models.
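For readers less familiar with the AUC, it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The following Python sketch computes it directly from that definition; it is not the pROC routine used in the study, and the scores are hypothetical.

```python
def auc_mann_whitney(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical Rad-scores for three HIE cases and four controls.
hie_scores = [0.9, 0.8, 0.4]
control_scores = [0.7, 0.3, 0.2, 0.4]

auc = auc_mann_whitney(hie_scores, control_scores)
print("AUC:", auc)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of this size; rank-based formulas give the same value more efficiently.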

All statistical analyses were performed using IBM SPSS Statistics for Windows (Version 26.0; IBM Corp., Armonk, NY) and R software (Version 4.1.0; R Foundation for Statistical Computing, Vienna, Austria). Quantitative data were compared using Student’s t-test or the Wilcoxon test. Categorical data were compared using the χ² test. The “mRMRe” package was used to perform the mRMR analysis, and the “glmnet” package was used to execute the LASSO algorithm. The “pROC” package was used to plot the ROC curves. All statistical tests were two-sided, with statistical significance set at P<0.05.

Results

Clinical characteristics

The study group comprised 38 neonates with mild HIE and normal MRI findings. Based on the consensus of the radiologists, the control group included 89 neonates with normal brain MRI findings. Patients were randomly assigned to the training (n=90) or validation (n=37) cohort in a ratio of 7:3. Details of the clinical characteristics and comparison between the HIE and control groups are presented in Table 1. Significant differences between the groups were found in some clinical manifestations (asphyxia, resuscitation, dyspnea and cyanosis), laboratory markers (alanine aminotransferase, aspartate aminotransferase, creatinine, creatine kinase isoenzyme, procalcitonin and lactic acid) and blood gas analysis (CO2, pH and base excess). No variables differed significantly between the training and validation cohorts, suggesting that the random allocation was reasonable.
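The cohort sizes can be checked with a short Python sketch of a 7:3 random split. The text does not state whether the randomization was stratified by group or how fractional counts were rounded; the sketch assumes a split within each group with rounding up, which is one way to arrive at 90 training and 37 validation cases from 38 HIE and 89 control neonates. The function name, seed and group labels are hypothetical.

```python
import random


def stratified_split(ids_by_group, train_parts=7, total_parts=10, seed=0):
    """Randomly split each group train_parts:(total_parts - train_parts), rounding up."""
    rng = random.Random(seed)
    train, valid = [], []
    for group, ids in ids_by_group.items():
        shuffled = list(ids)
        rng.shuffle(shuffled)
        # Integer ceiling of len * 7/10, avoiding floating-point rounding issues.
        n_train = (len(shuffled) * train_parts + total_parts - 1) // total_parts
        train += [(group, case_id) for case_id in shuffled[:n_train]]
        valid += [(group, case_id) for case_id in shuffled[n_train:]]
    return train, valid


# Group sizes taken from the text; the case identifiers are placeholders.
cohort = {"HIE": range(38), "control": range(89)}
train, valid = stratified_split(cohort)
print(len(train), len(valid))
```

Splitting within each group keeps the proportion of HIE cases similar in both cohorts, which matters when the positive group is as small as it is here.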

Table 1 Demographic, clinical and laboratory features

The multivariate regression analysis included all parameters with P<0.05 from the univariate analysis. The final results showed that creatinine and lactic acid levels were independent predictors of HIE (Table 2). A clinical model was established using independent predictors.

Table 2 Positive results of univariate and multivariate regression analysis of clinical characteristics

Radiomic feature selection and construction of the Rad-score

All radiomics features with non-zero coefficients in the LASSO logistic regression model were selected to build the differentiation model. After dimensionality reduction, the potential predictors were selected from the 1,316 features identified from the training cohort for each sequence, and the ROIs are shown in Fig. 2 and Supplementary Material 4. The equation for each Rad-score is presented in Supplementary Material 5. After screening the features extracted from the T1WI-BG, T1WI-TH, T2WI-BG, T2WI-TH, ADC-BG, ADC-TH and SWI models, a total of 7, 11, 7, 11, 8, 11 and 10 radiomics features were retained, respectively.

Fig. 2

Radiomics analysis on axial magnetic resonance images in a 37-week-old male neonate with mild hypoxic–ischemic encephalopathy. a–c Regions of interest were placed on the basal ganglia (dark blue) and thalami (light blue) on the axial T1-weighted (W) images (a), T2-W images (b) and apparent diffusion coefficient maps (c). d Susceptibility-weighted image with regions of interest placed on the deep medullary veins (dark blue)

For the training and validation cohorts, the AUCs of the SWI model were 1.00 and 0.98, respectively. Thus, the Rad-score obtained from the SWI model was combined with the independent clinical predictors to establish the combined model (Supplementary Material 5).

Performance and validation of different prediction models

The Wilcoxon test was used to compare the distribution of Rad-scores between the two groups in the training and validation cohorts (Fig. 3). In the training cohort, HIE patients with normal brain MRI findings had higher Rad-scores than the control group in each MR sequence radiomics model and in the combined model (P<0.05). This finding was confirmed in the validation cohort (P<0.05).

Fig. 3

Rad-score scatterplots. a–c T1-weighted images (T1-W) (a), T2-weighted images (T2-W) (b) and apparent diffusion coefficient maps (c) with regions of interest placed on the thalami. d–f T1-W (d), T2-W (e) and apparent diffusion coefficient maps (f) with regions of interest placed on the basal ganglia. g Susceptibility-weighted imaging with regions of interest placed on the deep medullary veins. All plots show significantly higher Rad-scores in the group with hypoxic–ischemic encephalopathy and normal MRI findings (Label=1) than in the control group (Label=0), in both the training and validation cohorts

The SWI model exhibited the best predictive performance among the seven single-sequence models, with AUCs of 1.00 (95% confidence interval [CI], 0.94−1.00) in the training cohort and 0.98 (95% CI, 0.82−0.99) in the validation cohort. In the training cohort, the AUCs of the T1WI-BG, T1WI-TH, T2WI-BG, T2WI-TH, ADC-BG, ADC-TH and clinical models were 0.98 (95% CI, 0.85−0.97), 0.98 (95% CI, 0.79−0.94), 0.98 (95% CI, 0.83–0.96), 0.98 (95% CI, 0.89−0.99), 0.97 (95% CI, 0.77−0.92), 0.97 (95% CI, 0.79−0.94) and 0.82 (95% CI, 0.71−0.89), respectively, all lower than that of the SWI model. The AUC of the nomogram combining creatinine, lactic acid and the SWI Rad-score in the training cohort was 1.00 (95% CI, 0.94−1.00). The details are presented in Table 3 and Fig. 5.

Table 3 Accuracy and predictive value of different models
Fig. 4

Calibration curves of the three models for the training (a) and validation (b) cohorts

No significant differences were found between the ROC curves of the SWI model and the combined nomogram model in the training (P=0.38) and validation (P=1.00) cohorts. The calibration curves showed that the predicted probabilities of each model were in good agreement with the observed values (Fig. 4).

Fig. 5

Receiver operating characteristic curves for the training and validation cohorts for different models. a–c T1-weighted images (T1-W) (a), T2-weighted images (T2-W) (b) and apparent diffusion coefficient (ADC) maps (c) with regions of interest placed on basal ganglia. d–f T1-W (d), T2-W (e) and ADC maps (f) with regions of interest placed on thalami. g, h Susceptibility-weighted imaging with regions of interest placed on deep medullary veins. The graphs represent the susceptibility-weighted model versus the clinical model versus the combined model for the training (g) and validation (h) cohorts. ADC apparent diffusion coefficient, AUC area under the curve, BG basal ganglia, DMV deep medullary veins, SWI susceptibility-weighted imaging, TH thalami

The DCA based on the clinical, SWI and combined nomogram models is shown in Fig. 6. The decision curve showed that the SWI and combined nomogram models had better predictive performance than the clinical model.

Fig. 6

Decision curve analysis for the three models. The light blue solid line and dark blue broken line represent the susceptibility-weighted imaging radiomics model and combined model, respectively. The black line represents the clinical model. Decision curves showed that the susceptibility-weighted imaging radiomics and combined models achieved more clinical utility than the clinical model. DMV deep medullary veins, SWI susceptibility-weighted imaging
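Decision curves plot net benefit against the threshold probability at which a clinician would act. Net benefit is conventionally defined as TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability, and is compared with the "treat all" strategy. The Python sketch below evaluates both quantities at a single threshold; it is a minimal illustration with hypothetical predicted probabilities and labels, not the analysis code used in this study.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of acting on cases whose predicted probability is >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating every case regardless of the model."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted probabilities and true labels (1 = HIE, 0 = control).
probs = [0.9, 0.8, 0.6, 0.3, 0.2, 0.1, 0.7, 0.05]
labels = [1, 1, 0, 0, 0, 0, 1, 0]

nb_model = net_benefit(probs, labels, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
print("model:", nb_model, "treat all:", nb_all)
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both the treat-all value and zero (the treat-none strategy); repeating the calculation over a grid of thresholds yields the full curve.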

Discussion

To the best of our knowledge, this is the first study to assess the utility of normal MRI in neonates with HIE to predict potential brain injury. MRI-based radiomics models of the basal ganglia, thalami and deep medullary veins allowed accurate diagnosis of brain injury associated with HIE in neonates even when conventional brain MRI findings were normal. Radiomics features obtained from the basal ganglia and thalami on T1WIs, T2WIs and ADC maps had high diagnostic accuracy, with AUCs >0.90. The SWI model based on the deep medullary veins had excellent diagnostic performance in the training cohort, with an AUC of 1.00, an accuracy of 0.99, a sensitivity of 0.96 and a specificity of 1.00. Adding creatinine and lactic acid levels to the SWI Rad-score in the combined nomogram model did not significantly improve the differentiation between neonates with HIE and normal MRI findings and the control group; the combined model had an AUC of 1.00, an accuracy of 0.99, a sensitivity of 1.00 and a specificity of 0.98 in the training cohort. The Wilcoxon test, calibration curves and the Hosmer−Lemeshow test were used to evaluate the predictive models. The Rad-scores of every model differed significantly between groups in the Wilcoxon test, indicating that radiomics is useful in each sequence and that radiomics features from different sequences share the ability to distinguish neonates with HIE and normal MRI findings from controls. The predictions of all models agreed well with the observed data.

Most studies have used conventional MRI features to predict brain injury in neonates with perinatal asphyxia. Conventional and advanced brain MRI techniques have depicted the features of neonatal HIE, and Machie et al. used an MRI score to define abnormalities in HIE [19,20,21]. In texture analysis, parameters such as entropy, skewness and kurtosis are commonly used. For example, Sarioglu et al. [22] used MRI-based texture features from the basal ganglia and thalami on ADC maps and T1- and T2-weighted images. The histogram entropy log-10 value was used as an indicator to differentiate between moderate-to-severe and mild HIE (P<0.001; odds ratio [OR], 266). The absence of hyperintensity in the posterior limb of the internal capsule on T1WIs was an independent predictor of moderate-to-severe HIE (P=0.012; OR, 17.11). Kim et al. [18] analyzed the value of the texture features of the deep medullary veins on SWI as a potential biomarker according to age and the presence of ischemic injury. Among these parameters, entropy differed significantly between the age groups (P=0.001), and skewness differentiated infants with ischemic injury with an AUC of 0.87. The current study analyzed the relationship between traditional imaging signs and HIE brain injury even when MRI findings were normal. A series of quantitative imaging features was extracted to build the T1WI-BG, T1WI-TH, T2WI-BG, T2WI-TH, ADC-BG, ADC-TH and SWI models. Among the seven single-sequence models, the SWI model was superior to the other models in predicting potential brain injury with HIE in neonates without MRI abnormalities. The combined nomogram based on the SWI Rad-score and clinical factors may be used as a quantitative tool; however, it did not outperform the SWI model.

According to the general pathophysiology, early pathological changes in HIE mainly include nerve cell degeneration, necrosis, brain edema, intracranial hemorrhage and cerebellar injury.
After hypoxia occurs in brain tissue, cerebral blood flow perfusion decreases, arterioles show reactive dilation, oxygen uptake by hypoxic brain tissue increases and hemodynamics at the microvascular level are consequently impaired, thus increasing the proportion of deoxyhemoglobin in venules. SWI is more sensitive than conventional MRI in detecting abnormal venous dilatation in the brains of neonates with HIE [23]. Therefore, neonates with HIE may have brain injuries even without abnormalities on conventional MRI. In the proposed SWI model, wavelet_HHH_glszm_GrayLevelNonUniformity (from the gray-level size-zone matrix [GLSZM]) and the wavelet_HLH_glcm maximal correlation coefficient (from the gray-level co-occurrence matrix [GLCM]) were considered the key parameters according to their corresponding coefficients. The GLSZM records, for each gray level, the number of connected zones of each size; its features describe the distribution of small/large zones and low/high gray levels. The GLCM is a matrix whose rows and columns index gray levels and whose entries count the number of times a pair of gray values occurs in a given spatial relationship (angle, distance), i.e. a second-order histogram. The features calculated on the GLCM include entropy, energy, contrast, homogeneity, dissimilarity and correlation [24,25,26]. The GLSZM and GLCM features thus reflect differences in gray level, uniformity, contrast and homogeneity on the SWI sequence.
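To make the GLCM definition concrete, the following Python sketch builds a symmetric co-occurrence matrix for one offset (distance 1, angle 0°) on a toy 4 × 4 image with four gray levels and derives a few of the features named above. It is only an illustration under those assumptions: the image is hypothetical, the function names are not taken from any radiomics package, and the features in the SWI model were computed on wavelet-filtered images rather than on raw intensities.

```python
import math


def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Count how often gray level i has gray level j at the given (row, col) offset."""
    counts = [[0] * levels for _ in range(levels)]
    d_row, d_col = offset
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # also count the pair in the reverse direction
    return counts


def glcm_features(counts):
    """Normalise the counts to probabilities and compute common GLCM features."""
    total = sum(sum(row) for row in counts)
    n = len(counts)
    cells = [(i, j, counts[i][j] / total) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(p * (i - j) ** 2 for i, j, p in cells),
        "energy": sum(p * p for _, _, p in cells),  # angular second moment
        "homogeneity": sum(p / (1 + (i - j) ** 2) for i, j, p in cells),
        "entropy": -sum(p * math.log2(p) for _, _, p in cells if p > 0),
    }


# Hypothetical 4 x 4 image with gray levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

counts = glcm(image, levels=4)
features = glcm_features(counts)
print(counts)
print(features)
```

In practice the matrices from several angles and distances are usually computed and their features averaged, so that the result does not depend on the orientation of the structure being analyzed.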

This study has several limitations. First, it was a single-center retrospective study without external validation, which might have led to case selection bias and limited generalizability. Second, although a relatively large number of neonates were included, the cohort is small compared with those of other radiomics studies, particularly the group of neonates with HIE and no MRI abnormalities; this might affect the general applicability of our results. A large-scale, prospective, multicenter study is required to validate our results. Third, manual segmentation was used to delineate the ROIs; automatic or semiautomatic segmentation, which is more objective, was not used for comparison and verification. Fourth, differences in imaging protocols may affect the radiomics features. To address this issue, image preprocessing was performed before segmentation and feature extraction to improve the robustness of the radiomics features. However, the variability of some imaging parameters that could not be normalized might have affected our results. The use of standardized imaging protocols is important to avoid low-quality, unreliable results. Further research is needed to address these deficiencies.

Conclusion

This study developed and compared eight models to assess the utility of normal MRI in neonates with HIE to predict potential brain injury. The results suggest that the SWI and combined nomogram models have potential for use in differentiating brain injury from normal MRI in HIE, with the SWI model offering the greatest diagnostic value.