Abstract
We evaluated two non-machine learning (non-ML)-based algorithmic approaches for detecting early ischemic infarcts on brain CT images of patients with acute ischemic stroke symptoms, tailored to our local population, to be incorporated in our telestroke software. One hundred and thirteen acute stroke patients, excluding hemorrhagic, subacute, and chronic patients, with accessible brain CT images were divided into calibration and test sets. The gold standard was determined through consensus among three neuroradiologists. Four neuroradiologists independently reported Alberta Stroke Program Early CT Scores (ASPECTSs). ASPECTSs were also obtained using a commercial ML solution (CMLS) and our two methods, namely the mean Hounsfield unit (HU) relative difference (RELDIF) and the density distribution equivalence test (DDET), which used statistical analysis of the HUs of each region and its contralateral side. Automated segmentation was perfect for cortical regions, while minimal adjustment was required for basal ganglia regions. For dichotomized-ASPECTSs (ASPECTS < 6) in the test set, the area under the receiver operating characteristic curve (AUC) was 0.85 for the DDET method, 0.84 for the RELDIF approach, and 0.64 for the CMLS, and ranged from 0.71–0.89 for the neuroradiologists. The accuracy was 0.85 for the DDET method and 0.88 for the RELDIF approach, and ranged from 0.83–0.96 for the neuroradiologists. Equivalence at a margin of 5% was documented among the DDET, RELDIF, and gold standard on mean ASPECTSs. Noninferiority tests of the AUC and accuracy of infarct detection revealed similarities between both DDET and RELDIF, and the CMLS, and with at least one neuroradiologist.
The alignment of our methods with the evaluations of the neuroradiologists and the CMLS indicates the potential of our methods to serve as supportive tools in clinical settings, facilitating prompt and accurate stroke diagnosis, especially in health care settings, such as Colombia, where neuroradiologists are limited.
Introduction
Acute stroke is a significant cause of mortality and morbidity in both developed [1,2,3] and undeveloped countries [4]. Stroke episodes can manifest as either ischemic or hemorrhagic, necessitating precise diagnostic evaluation through imaging expertise prior to initiating treatment [5]. Surgical intervention may be considered in select cases of hemorrhagic stroke, whereas acute reperfusion therapies, such as thrombolytic therapy and endovascular thrombectomy, are viable options for managing ischemic stroke [5].
Recent large core trials, including SELECT 2, RESCUE-Japan, and ANGEL ASPECT, have demonstrated that among individuals with severe ischemic strokes, endovascular thrombectomy yields superior functional outcomes compared to those achieved with standard medical care, albeit with an association with vascular complications. Noncontrast head computed tomography (NCCT) serves as the primary imaging modality for promptly detecting both hemorrhagic and ischemic strokes in patients presenting with acute neurological deficits [6,7,8]. Ischemic infarct size within the middle cerebral artery (MCA) territory is estimated with the Alberta Stroke Program Early CT Score (ASPECTS) [9]. The ASPECTS is associated with the features of ten regions in the vascular territory of the MCA: six cortical areas and four basal ganglia regions [9]. Typically, a patient with an ASPECTS < 6, as determined through nonadvanced imaging techniques under usual circumstances, is not eligible for acute reperfusion therapies [6, 7, 10, 11].
The advancement of optimal machine learning technologies in the context of stroke is a complex undertaking contingent upon the utilization of extensive and high-quality datasets. The accessibility of such datasets is frequently constrained by factors such as the absence of data sharing agreements, apprehensions regarding patient privacy, and the substantial expenses associated with data sharing, storage, and quality control. These challenges consistently impede the establishment of robust open-access stroke registries [12, 13]. A significant limitation to the progress of machine learning-based algorithms in stroke treatment lies in the inherent difficulties required for follow-up assessments, which are particularly pronounced in developing countries where the absence of consistent and robust health systems hampers sustained long-term follow-up. Several researchers, conducting performance analyses and comparisons of diverse machine learning algorithms, have noted that approximately fifty percent of participants were excluded from their analyses due to missing outcomes and predictive variables. This exclusion introduces a level of uncertainty into the results [14].
ASPECTSs can be calculated by a specialized neuroradiologist; nevertheless, when someone with such expertise is not available in person or by teleradiology networks, an automated ASPECTS calculation may provide helpful information for neuroradiologists or neurologists during patient evaluation. However, methods based on AI require a large amount of data for tuning or optimization of the algorithms [15]. Commercial solutions that include automated ASPECTS modules [16] can be expensive, provide limited information on how they process the data, and were not tailored to our local population.
The automated calculation of the ASPECTS involves two primary steps: segmenting the ASPECTS regions and subsequently evaluating the presence or not of an ischemic infarct within each region before computing the overall score. ASPECTS regions can be automatically identified using standardized coordinate spaces for patient volume alignment [17, 18], convolutional neural networks (CNNs) [19], registration algorithms using a reference template or an atlas [20,21,22,23], and deep learning networks (DLNs) [24, 25]. Methods for ischemic infarct evaluation using artificial intelligence (AI) are split into automatic algorithms using DLNs [25, 26] or CNNs [22,23,24], and explicit feature-based methods focusing on intensity or texture. Intensity features include the average and standard deviation (SD) [27]; the brain density shift comparing intensity histograms [20, 21]; and pixel intensity differences [17, 18, 27]. These features are compared against thresholds, contralateral regions, or used in binary classifiers [18, 28, 29]. Unfortunately, those AI methods require large amounts of well-labeled datasets, which are scarce for acute ischemic stroke [15].
The aim of this study was to develop two non-ML-based methods for detecting ischemic infarcts in ASPECTS-identified regions, tailored to the local population, and to validate their diagnostic performance; these methods will be integrated into our web-based telestroke network system [30]. To avoid the large datasets that AI methods for ischemic infarct evaluation require, statistical parameters of the intensities of the ASPECTS regions, measured in Hounsfield units (HUs), were evaluated. In addition, the patients used for tuning the algorithms were drawn from our local population. Hence, a low-cost and tailored solution was achieved.
Materials and Methods
Our institutional review board (IRB) approved this retrospective study and waived the requirement for informed consent.
Study Design and Dataset
Our hospital is a private primary stroke center certified by the Joint Commission International and is one of the few centers in Colombia with 24/7 endovascular thrombectomy capabilities. Patients with acute stroke symptoms who presented to our hospital between 2013 and 2018 and for whom the stroke code was activated were eligible for the study. Patients were randomly selected without repetition. The diagnostic NCCT images were acquired using a standard protocol with a General Electric LightSpeed 64 slice CT scanner (General Electric Healthcare, GE Medical Systems, Milwaukee, WI, USA) with the following parameters: 100 kV, 10 mA, axial: 5 mm, sagittal: 3 mm, FOV: 26 cm, pixel spacing: 0.5 × 0.5 mm, and matrix: 512 × 512. The images were stored in standard Digital Imaging and Communication in Medicine (DICOM) format. Initially, CT images with artifacts and patients younger than 18 years were excluded.
This was a repeated measures study. All NCCT images were interpreted by four neuroradiologists, three with more than ten years of experience and one with four years of experience in neuroradiology, who reported the presence of acute ischemic infarcts and calculated the ASPECTS for each patient. The ASPECTS for each patient was also obtained using three automated ASPECTS systems: a commercial machine learning solution (CMLS), i.e., RapidAI ASPECTS (iSchema View, Menlo Park, CA, USA) [31], and the two algorithms proposed in this study.
Neuroradiologist interpretations were carried out using a DICOM-compliant medical workstation. The medical workstation used the viewer software Agfa IMPAX 6.5 (AGFA HealthCare, Mortsel, Belgium). Images were displayed using an E-2620 BARCO monitor (BARCO N. V, Kortrijk, Belgium), which has a 2-megapixel (MPx) LCD medical grayscale display, is DICOM-compliant, has a dot pitch of 0.249 mm, has a spatial resolution of 1600 × 1200 pixels, has a maximum luminance of 700 cd/m², and displays 8-bit grayscale images. Relevant clinical data, such as sex, age, neurological symptoms, and medical history (e.g., diabetes, hypertension, and cardiac arrhythmia), were available to the neuroradiologists. We then compared the neuroradiologists’ interpretations to the results of the automated methods.
A detailed description of the sample, observers, reading systems, interpretation procedure, and data acquisition approach used in the present evaluation was reported in previous studies, in which the diagnostic performance of readers using different display systems was evaluated [32,33,34]. At the time of these studies, no automated ASPECTSs were available in our system.
Gold Standard
The true status of the presence of ischemic infarcts in the ASPECTS-identified regions was established by the three more experienced neuroradiologists at our hospital (with at least two of the three in agreement). Our routine practice is to perform a follow-up CT scan 24 h after the initial CT scan. Therefore, the neuroradiologists reviewed the initial and 24-h follow-up CT images when establishing the gold standard. In patients who experienced infarct evolution in the first 24 h, we could not perform a follow-up CT scan because the treatment decision was based on the initial CT scan.
Image Standardization
The initial processing of the images involved intensity normalization, where high- and low-intensity values were clipped to a predefined range. Additionally, noise reduction processes [35] were employed to enhance image clarity. Only pixels with intensities ranging from 10 to 55 HU were considered, to exclude cerebrospinal fluid, chronic infarcts, bone, and calcifications. For skull stripping, our algorithm first identified skull regions based on intensity thresholds. The largest skull contour was detected [36], and a mask was applied to isolate the brain area, effectively removing the skull from the image. The cerebral area was then trimmed by creating a mask to distinguish the brain area from the background, and the image was cropped to focus on the region of interest [37]. Postprocessing included resizing images to a standard dimension and adding uniform space around the image to maintain the brain's central position [37].
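The HU windowing described in this section can be illustrated with a short sketch. It is not the authors' implementation: the function names `retain_tissue_hu` and `region_stats` are hypothetical, the pixel values are made up, and treating the 10 and 55 HU bounds as inclusive is an assumption.

```python
from statistics import fmean, stdev


def retain_tissue_hu(hu_values, low=10.0, high=55.0):
    """Keep only HU values inside [low, high] (bounds assumed inclusive)."""
    return [hu for hu in hu_values if low <= hu <= high]


def region_stats(hu_values):
    """Return (mean, SD) of the HU values retained for one region."""
    kept = retain_tissue_hu(hu_values)
    return fmean(kept), stdev(kept)


# Illustrative values: one CSF-like pixel (5 HU), one bone-like pixel
# (900 HU), and three brain-tissue pixels.
pixels = [5, 28, 30, 32, 900]
mean_hu, sd_hu = region_stats(pixels)
```

Only the three tissue-range pixels contribute to the statistics, which is the effect the 10–55 HU window is meant to achieve.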
Image Analysis and Processing
Our methodology integrates advanced imaging and machine learning techniques to assess the impact of ischemic stroke within the MCA of the brain. For this algorithm, the first step is to standardize brain CT images using the ASPECTS-identified regions, focusing on segmentation of the basal ganglia and cortical regions. This process involves employing a CT-brain atlas [38] for alignment through both rigid and nonrigid registration techniques [39]. Our approach includes the use of a CNN to classify the images into distinct categories. Each step of our methodology, from image processing to the application of machine learning models, is designed to provide a comprehensive assessment of ischemic stroke effects, contributing to improved diagnostic and treatment strategies. Figure 1 summarizes the operational flowchart of our proposed algorithm.
Detection of Slices at the Basal Ganglia and Cortical Levels
The first step in segmentation was the detection of two slices, at the basal ganglia and cortical levels, in which ASPECTS-related regions must be identified. We employed a CNN utilizing transfer learning, specifically adopting the InceptionResNetV2 architecture [40] with pretrained ImageNet weights, excluding the top layer. Additional layers included convolutional, max pooling, flattening, and dense layers, culminating with an activation function for classification.
Registration of the Brain CT Atlas with Standardized Images
Rigid registration involved thresholding both atlas slices and patient images to enhance structural visibility, applying a 2D similarity transformation model for alignment [41], and resampling the atlas images [42] for initial overlay. Nonrigid registration started with an identity transformation [43] and was optimized using a mutual information metric [44]. Additional image processing techniques, such as dilation of specific regions [45], were used to enhance the visibility and distinction of important anatomical features. The atlas was resampled again, considering the nonrigid transformation, for precise overlay, with a specific focus on internal basal ganglia regions.
Detection of Ischemic Infarcts
For metric calculation, the mean intensity difference between corresponding regions in the left and right hemispheres was calculated within specific HU ranges [45] to target relevant tissue densities. The relative hemispheric difference was also computed as a measure of asymmetry in tissue density. In the region-specific analysis, regions of interest were extracted from preprocessed images for detailed analysis, including the calculation of the mean and SD of the HU intensity for each region.
Two methods for automatic ischemic infarct detection that do not use AI were proposed in this pilot study.
Mean HU relative difference (RELDIF)
This method calculates the relative difference Δ of the mean HU as a percentage of the hypodensity observed between a region and its contralateral side. If the relative difference Δ is greater than or equal to a calibration threshold, the region is considered to have an ischemic infarct. For each ASPECTS region, the calibration threshold was determined by calculating the area under the receiver operating characteristic (ROC) curve (AUC) for a large range of thresholds (0.05–50%, with steps of 0.05%), and the threshold producing the maximum AUC, with a specificity of at least 0.88, was selected as the calibration threshold for the region. This method was named RELDIF. A similar method was used in a previous study [46].
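A minimal sketch of the RELDIF decision rule and its calibration sweep is given below, under stated assumptions. The function names are hypothetical, the contralateral mean is assumed to be the denominator of Δ, and the AUC of a single threshold is assumed to be the one-point trapezoidal value (sensitivity + specificity) / 2; the text does not specify either detail. The calibration data are illustrative only.

```python
def relative_difference(mean_region, mean_contra):
    """Percent hypodensity of a region relative to its contralateral side.

    Assumption: the contralateral mean HU is used as the denominator.
    """
    return 100.0 * (mean_contra - mean_region) / mean_contra


def has_infarct(delta_pct, threshold_pct):
    """A region is flagged when delta reaches the calibration threshold."""
    return delta_pct >= threshold_pct


def calibrate_threshold(deltas, labels, min_specificity=0.88):
    """Sweep thresholds 0.05-50 % (step 0.05 %) and return (auc, threshold).

    Assumption: the AUC of one operating point is (sens + spec) / 2.
    Returns None when no threshold reaches the required specificity.
    """
    positives = [d for d, y in zip(deltas, labels) if y]
    negatives = [d for d, y in zip(deltas, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need both infarcted and non-infarcted regions")
    best = None
    for step in range(1, 1001):
        threshold = step / 20.0  # 0.05, 0.10, ..., 50.0
        sensitivity = sum(d >= threshold for d in positives) / len(positives)
        specificity = sum(d < threshold for d in negatives) / len(negatives)
        if specificity < min_specificity:
            continue
        auc = (sensitivity + specificity) / 2.0
        if best is None or auc > best[0]:
            best = (auc, threshold)  # ties keep the lowest threshold
    return best


# Illustrative calibration data (not from the study)
deltas = [0.5, 1.0, 1.5, 6.0, 7.0, 8.0]
labels = [False, False, False, True, True, True]
best_auc, best_threshold = calibrate_threshold(deltas, labels)
```

In this sketch ties are resolved in favor of the lowest threshold; the study does not state how ties were handled.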
Density Distribution Equivalence Test (DDET)
This method is based on statistical Z test comparisons of two means with different standard deviations. In this method, the mean and standard error (SE) of a region were calculated over the HUs of all the pixels in the region. Then, an equivalence test was performed between the mean HUs of the right and left regions. If significant results allowed us to claim equivalence of the two regions, no ischemic infarct was determined. In contrast, if nonequivalence was observed, a superiority test was performed to determine the region with the ischemic infarct. In the equivalence test, the null hypothesis was |difference (I − J)| − δ = 0, and the alternative hypothesis was |difference (I − J)| − δ < 0, where I and J are the mean HUs of the compared regions and δ is an HU margin representing the maximum difference indicating equivalence [47,48,49,50]. To find the optimal margin δm for each region, the AUC was calculated for a large range of HU margins (0.05–16 HU, with steps of 0.05 HU), and the margin producing the maximum AUC, with a specificity of at least 0.88, was selected as the calibration margin for the region. This method was named DDET. A method comparing the contralateral mean and SD was proposed by Shieh Y et al. [27], but margin thresholds were not used in that work, whereas they are used in the present study.
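The DDET decision for one pair of contralateral regions can be sketched as follows. This is an illustration under assumptions, not the authors' code: the equivalence test is implemented as two one-sided Z tests, the superiority step is a one-sided Z test on the sign of the difference, and the "inconclusive" outcome and all names are hypothetical. The HU samples are made up.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev


def mean_and_se(hu_values):
    """Mean HU and standard error of the mean for one region."""
    return fmean(hu_values), stdev(hu_values) / sqrt(len(hu_values))


def ddet(right_hu, left_hu, margin_hu, alpha=0.05):
    """Return 'none', 'right', 'left', or 'inconclusive' for a region pair."""
    mean_r, se_r = mean_and_se(right_hu)
    mean_l, se_l = mean_and_se(left_hu)
    diff = mean_r - mean_l
    se = sqrt(se_r ** 2 + se_l ** 2)  # unequal-variance Z test
    phi = NormalDist().cdf

    # Equivalence: both one-sided tests must reject (assumed TOST form).
    p_lower = 1.0 - phi((diff + margin_hu) / se)  # H0: diff <= -margin
    p_upper = phi((diff - margin_hu) / se)        # H0: diff >= +margin
    if max(p_lower, p_upper) < alpha:
        return "none"

    # Superiority: the significantly hypodense side carries the infarct.
    if phi(diff / se) < alpha:
        return "right"
    if 1.0 - phi(diff / se) < alpha:
        return "left"
    return "inconclusive"  # hypothetical outcome, not described in the text


# Illustrative HU samples (not from the study)
normal_right = [30, 31, 29, 30, 31, 29, 30, 30]
normal_left = [30, 30, 31, 29, 30, 31, 29, 30]
hypodense_right = [25, 26, 24, 25, 26, 24, 25, 25]

result_normal = ddet(normal_right, normal_left, margin_hu=2.0)
result_infarct = ddet(hypodense_right, normal_left, margin_hu=2.0)
```

The sketch assumes each region has at least two pixels with nonzero variance; a production version would need to guard against a zero standard error.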
In accordance with actual clinical procedures, the neuroradiologists were provided with information regarding the side (laterality) of the focal acute neurological deficit prior to interpreting head CT images. Subsequently, either the DDET or RELDIF method was utilized to assess the presence of a lesion within each ASPECTS-identified region on the affected hemisphere. Conversely, the CMLS (machine learning system) does not utilize these data; therefore, determination of the lesion side is solely based on imaging analysis and a training dataset.
Variables
The “Ischemic infarct detected” variable for the right and left hemispheres of each ASPECTS-identified region was established using two proposed methods (DDET and RELDIF). Thereafter, using the “Ischemic infarct detected” variable, two derived variables were calculated: 1) the ASPECTS for each patient and 2) the “Dichotomized-ASPECTS”, indicating that the ASPECTS was less than 6 (ASPECTS < 6), which was the main variable in our evaluation, as this was the main imaging contraindication for IV r-TPA administration.
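The two derived variables can be expressed compactly. The sketch below assumes the conventional ASPECTS scoring, in which one point is subtracted from 10 for each affected region; the region abbreviations follow the usual ASPECTS naming, and the function names are hypothetical.

```python
# Conventional ASPECTS regions: caudate, lentiform nucleus, internal capsule,
# insular ribbon, and the six cortical MCA regions M1-M6.
ASPECTS_REGIONS = ("C", "L", "IC", "I", "M1", "M2", "M3", "M4", "M5", "M6")


def aspects_score(infarct_by_region):
    """ASPECTS = 10 minus the number of regions flagged as infarcted."""
    affected = sum(
        1 for region in ASPECTS_REGIONS if infarct_by_region.get(region, False)
    )
    return 10 - affected


def dichotomized_aspects(score, cutoff=6):
    """True when the score falls below the cutoff (ASPECTS < 6)."""
    return score < cutoff


# Illustrative patient: three affected regions on the symptomatic hemisphere
flags = {"I": True, "M1": True, "M2": True}
score = aspects_score(flags)
low_aspects = dichotomized_aspects(score)
```

Regions missing from the dictionary are treated as not infarcted, which is a convenience of this sketch rather than something stated in the study.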
Statistical Analysis
To evaluate and compare the diagnostic performance of our automatic ASPECTS calculation methods, the CMLS, and four neuroradiologists’ readings, several validity indicators were calculated for the three variables defined above, including nonparametric receiver operating characteristic (ROC) curves and the AUC, accuracy, specificity, and sensitivity. To calculate the AUC of each rater and the differences between two raters, DBM-MRMC 2.51 software (Medical Image Perception Laboratory, Iowa University, USA) was used [51]. Sensitivity, specificity, and accuracy were evaluated using generalized estimating equations (GEEs) in IBM SPSS Statistics 29 software (IBM Corp., Armonk, NY, USA); this software was also used to evaluate reliability with the Kappa coefficient [52]. The Kappa coefficients were ranked as defined by Altman [53]: “very good”, (1–0.81); “good”, (0.8–0.61); “moderate”, (0.6–0.41); “fair”, (0.4–0.21); and “poor”, < 0.2.
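For readers who want to reproduce the simpler indicators, the sketch below computes sensitivity, specificity, accuracy, and Cohen's Kappa for two binary raters and ranks Kappa on the Altman scale quoted above. It does not replicate the GEE or DBM-MRMC analyses performed in SPSS and DBM-MRMC; the function names and the example ratings are hypothetical.

```python
def validity(pred, truth):
    """Return (sensitivity, specificity, accuracy) for binary ratings."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(pairs)


def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two binary raters."""
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if bool(a) == bool(b)) / n
    rate_a = sum(1 for a in rater_a if a) / n
    rate_b = sum(1 for b in rater_b if b) / n
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    return (observed - expected) / (1 - expected)


def altman_rank(kappa):
    """Rank a Kappa value on the Altman scale used in this study."""
    if kappa > 0.80:
        return "very good"
    if kappa > 0.60:
        return "good"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    return "poor"


# Illustrative ratings for ten regions (not from the study)
truth = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
sensitivity, specificity, accuracy = validity(pred, truth)
kappa = cohen_kappa(pred, truth)
```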
Noninferiority tests were performed for the AUC, accuracy, specificity, and sensitivity using the mean differences of paired comparisons and their SEs. The null hypothesis for the noninferiority test was difference (I − J) = −δ, and the alternative hypothesis was difference (I − J) > −δ, where I and J are the compared systems and δ is the maximum difference permitted to claim noninferiority, while for the equivalence test, the null hypothesis was |difference (I − J)| − δ = 0, and the alternative hypothesis was |difference (I − J)| − δ < 0 [47,48,49,50]. The significance level was set to α = 0.05.
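The noninferiority test on a paired mean difference and its SE can be sketched as a one-sided Z test. This is an assumption about the test statistic, since the text gives only the hypotheses; the function names and the numeric inputs are hypothetical.

```python
from statistics import NormalDist


def noninferiority_p(diff, se, margin):
    """One-sided p-value for H0: diff = -margin against H1: diff > -margin.

    diff is the mean paired difference (I - J) and se its standard error.
    """
    z = (diff + margin) / se
    return 1.0 - NormalDist().cdf(z)


def is_noninferior(diff, se, margin=0.1, alpha=0.05):
    """Claim noninferiority when the one-sided p-value is below alpha."""
    return noninferiority_p(diff, se, margin) < alpha


# Illustrative inputs: equal mean performance, SE of 0.05, margin of 0.1
p_value = noninferiority_p(diff=0.0, se=0.05, margin=0.1)
```

With these illustrative inputs the Z statistic is 2, so noninferiority would be claimed at α = 0.05; a difference of −0.08 with the same SE would not support the claim.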
Results
To determine the validity of our proposed methods for automatic ischemic infarct detection, optimized for our local population, we evaluated their diagnostic performance by comparing validity indicators (AUC, accuracy, specificity, and sensitivity) between the two methods and against the four neuroradiologists and the CMLS.
A total of 188 patients for whom the stroke code was activated were eligible. Among these 188 patients, those with hemorrhagic (n = 25), subacute (n = 31), and chronic (n = 19) stroke were excluded, resulting in a final sample of 113 patients. Patients were aged 30 to 94 years, with a mean age of 69.7 years (SD = 15.1), and 58 (51%) were males.
The 113 patients were separated into two sets. The calibration set (n = 65) was used to find the optimized parameters required in our automatic ASPECTS calculations. The test set (n = 48) was used to test and compare our automatic ASPECTSs with those calculated by the four neuroradiologists and the CMLS.
According to the gold standard, there were 52 patients with nonvisible ischemic infarcts (i.e., ASPECTS = 10) and 66 with ASPECTSs < 10. At basal ganglia regions, the most frequent infarcts were at the internal capsule (49), and at cortical regions the most frequent infarcts were at the MCA cortex lateral to insular ribbon (Table 1).
Detection of Slices at the Basal Ganglia and Cortical Levels
A dataset of 1484 brain CT slices (from 85/113 patients) was categorized into three classes: basal ganglia, cortical, and control. This dataset was partitioned such that 80% went into the training set and 20% went into the validation set. Model training over 50 epochs resulted in a loss of 18%, a precision (positive predictive value, PPV) of 93.7%, and a recall (sensitivity) of 93.3% during training and a loss of 25.6%, a precision of 92.9%, and a recall of 92.9% during validation.
ASPECTS Region Identification
Within the two slices detected, the contours of the ASPECTS-identified regions were plotted, and a neuroradiologist established whether the contours required additional adjustment before evaluating the presence of an ischemic infarct. The quality results of this segmentation for cortical regions showed a perfect match, with no adjustments needed in 100% of regions, while at basal ganglia regions, it ranged from 85 to 96% (Table 2).
Optimization of the Parameters for DDET and RELDIF Algorithms
The 65 patients of the calibration set were used to find the optimized parameters required in our automatic ASPECTS calculations. The equivalence margin δm required for the DDET method ranged from 1.95–3.9 HU in basal ganglia regions and from 0.25–1.65 HU in cortical regions. The threshold Δ required for the RELDIF method ranged from 4–9.1% in basal ganglia regions and from 1.3–5.05% in cortical regions (Table 3). The AUC achieved with those parameters in the ten regions ranged from 0.57–0.86 for DDET; higher values were observed for the accuracy, which ranged from 0.83–0.94 for DDET and from 0.78–0.92 for RELDIF (Table 3).
Overall ASPECTS
The mean ASPECTS on the test set (n = 48) was 7.56 for the gold standard, 7.6 for RELDIF, and a lower mean of 7.4 was observed for DDET (Table 4). The mean ASPECTS for the other methods was greater, ranging from 8.23–8.67. Equivalence, at a margin of 5%, was documented among the gold standard, DDET, and RELDIF (all P < 0.001) (Table 4).
Dichotomized-ASPECTS
Using the calibration parameters on the test set (n = 48), the AUC, sensitivity, specificity, and accuracy were calculated for the Dichotomized-ASPECTS with our two automatic methods (DDET and RELDIF). The mean AUCs were 0.82 for DDET, 0.84 for RELDIF, and 0.64 for CMLS, and ranged from 0.71–0.89 for the neuroradiologists. The accuracy was 0.85 for DDET and 0.88 for RELDIF, and ranged from 0.83–0.96 for the other raters. The specificity was 0.87 for DDET and 0.90 for RELDIF, and ranged from 0.95–1.00 for the other raters. The sensitivity was 0.78 for DDET and 0.78 for RELDIF, and ranged from 0.33–0.78 for the other raters (Table 5).
To compare the performance of the DDET and RELDIF methods with that of the other raters, paired noninferiority statistical tests with a noninferiority margin of 0.1 were performed (Table 5). For the AUC, the DDET method was noninferior to the CMLS (P = 0.006), Neuroradiologist 2 (P = 0.04), and Neuroradiologist 4 (P = 0.03), and the RELDIF method was noninferior to the CMLS (P = 0.005), Neuroradiologist 2 (P = 0.03), and Neuroradiologist 4 (P = 0.02). For the sensitivity, the DDET method was noninferior to the CMLS (P = 0.007), Neuroradiologist 2 (P = 0.02), and Neuroradiologist 4 (P = 0.02), and the RELDIF method was noninferior to the CMLS (P = 0.007), Neuroradiologist 2 (P = 0.02), and Neuroradiologist 4 (P = 0.02). For the accuracy, noninferiority was confirmed for DDET versus CMLS (P = 0.03) and for RELDIF versus CMLS (P = 0.02). For the specificity, noninferiority was observed for DDET versus RELDIF (P = 0.005).
Detection of Ischemic Infarcts by Region
The AUC for all ten regions pooled was 0.7 for DDET, 0.71 for RELDIF, and 0.66 for CMLS, and ranged from 0.71–0.8 for the neuroradiologists. The accuracy was 0.77 for DDET, 0.79 for RELDIF, and 0.81 for CMLS, and ranged from 0.84–0.89 for the neuroradiologists. The specificity was 0.84 for the DDET and 0.86 for the RELDIF but ranged from 0.94–0.97 for the other methods. In contrast, the sensitivity was 0.56 for both the DDET and the RELDIF and 0.38 for the CMLS, and ranged from 0.44–0.63 for the neuroradiologists (Table 6).
For the AUC, equivalence at a margin of 0.1 was confirmed between the DDET and the RELDIF (P = 0.004), between the DDET and the CMLS (P = 0.04), between Neuroradiologist 2 and both the DDET and the RELDIF (both P = 0.003), and between Neuroradiologist 4 and both the DDET (P = 0.01) and the RELDIF (P = 0.008); in addition, noninferiority was observed for the RELDIF against the CMLS (P < 0.001) (Table 6).
For the sensitivity, superiority was confirmed for the DDET over the CMLS (P = 0.003) and for the RELDIF over the CMLS (P = 0.004); in addition, superiority was confirmed for both the DDET and the RELDIF over Neuroradiologist 2 (P = 0.03 and P = 0.046, respectively). Other comparisons for accuracy, specificity, and sensitivity are included in Table 6. Detailed results grouped by basal ganglia regions (Table 7) and cortical regions (Table 8) were also evaluated.
The overall agreement on infarct detection between the DDET and RELDIF on the test set was “Almost perfect” (Kappa = 0.8); agreement was “Almost perfect” for the pooled basal ganglia regions (Kappa = 0.88) and “Substantial” for the pooled cortical regions (Kappa = 0.75). For individual basal ganglia regions, Kappa ranged from 0.77–0.92, and for individual cortical regions, it ranged from 0.38–1.0, with the lowest value for M4 (anterior MCA territory) (Table 9).
Discussion
Principal Results
Based on the detection of slices at the basal ganglia and cortical levels, high values for recall and precision were observed for the training and validation sets (all > 92%). Within these slices, a perfect match was observed (100%) in cortical ASPECTS-identified regions, and few adjustments were required for basal ganglia regions (matches ranged from 85–96%). For the Dichotomized-ASPECTS (ASPECTSs < 6), which is a contraindication to thrombolysis treatment, the mean AUCs observed using DDET (0.85) and RELDIF (0.88) were noninferior to those of two neuroradiologists, with a margin of 0.1. The detection of individual ischemic infarcts in the ASPECTS-identified regions showed higher sensitivity and lower specificity for our methods than for the neuroradiologists, resulting in final ASPECTSs for DDET and RELDIF lower than those of the neuroradiologists but statistically equivalent to the gold standard. Therefore, the neuroradiologists and the CMLS showed less conservative behavior, leading to more patients being considered for thrombolysis when assessed by them than by our methods.
The overall agreement on infarct detection between the DDET and RELDIF shows their reliability, allowing us to integrate either or both in our telestroke system.
Comparison with Prior Work
Several studies have evaluated automated ASPECTSs. Wolf et al. evaluated the syngo.via Frontier ASPECTS software (Siemens Healthcare GmbH, Erlangen, Germany) [46], which is based on the relative density difference between the affected and contralateral regions, and obtained a mean AUC of 0.713 for the detection of an affected ASPECTS region (overall ten regions). A similar study was performed by Ayobi et al. using CINA-ASPECTS software (Avicenna. AI, La Ciotat, France), which uses deep learning and obtained higher AUC, sensitivity, and specificity values. Nagel et al. evaluated e-ASPECTS software (Brainomix®, Oxford, UK) and concluded that e-ASPECTS was noninferior to neuroradiologists in determining ASPECTSs. However, those studies are not comparable to our study, as they calculated scores for both hemispheres as if they were from different patients (20 ASPECTS-identified regions, instead of 10 in our study).
Chen et al. [54] evaluated the overall incidence of ischemic infarcts in NBC (NeuBrainCARE) software (Neusoft Medical Systems, China) and RapidAI ASPECTS software (iSchema View, Menlo Park, CA, USA) [31], obtaining AUCs of 0.71 and 0.76 for NBC and the CMLS, respectively; however, this study included only patients who had already received intravenous thrombolysis or mechanical thrombectomy and did not compare dichotomized-ASPECTSs, as in our study. However, the study of Ferreti et al. [55], also conducted using e-ASPECTS, reported dichotomized-ASPECTS evaluations, and obtained a mean AUC for this variable of 0.78 and a sensitivity and specificity of 0.75 and 0.73, respectively; these values are lower than the means observed in our study for DDET (0.82, 0.78 and 0.87, respectively) and for RELDIF (0.84, 0.78 and 0.9, respectively).
Limitations
A limitation of this pilot study is the lack of chronic patients. Although these patients were included in our original neuroradiologists’ interpretation procedure, the ASPECTS regions were not evaluated for ischemic infarcts in these patients. Therefore, chronic patients were excluded from our comparisons. The same was true for subacute cases more than 4.5 h after onset, as IV r-TPA is also contraindicated in those patients. Nevertheless, if chronic lesions are included in our algorithms, their pixels are excluded, as only pixels ranging from 10–55 HU are retained. Subacute lesions are not a problem either, as ischemic tissues exhibit decreased intensity over time [56], resulting in more accurate detection. In addition, in our sample, there were few ischemic infarcts in the internal capsule or in the caudate head. Including more positive patients would improve the optimization of the margin for all regions, therefore improving the diagnostic performance of our methods. Our results promise statistically significant conclusions with lower margins and a small sample size, while AI methods require a greater sample size.
Future Work
Our two methods were validated, so that all 113 patients could be used in the calibration process, instead of the initial 65, to improve the performance of the algorithms, as we have the gold standard, the neuroradiologists' interpretations, and the CMLS evaluations, for all these cases.
The evaluation of ischemic infarct detection was performed using the slice selection and region segmentation provided by automated algorithms; however, after integrating this tool into our telestroke software, neuroradiologists could select better slices or adjust the ASPECTS-detected regions if desired, hence improving the performance of our system.
The algorithms developed in this work could be used with other populations after appropriate calibration. In the first instance, the optimization of the algorithm parameters was carried out with patients from the capital of the country (8 million inhabitants). In order to generalize it to other populations, several hospitals in different regions of the country, including indigenous and Afro-descendant populations, will be included in a next phase of development. This will allow us to evaluate whether the optimization can be done with the aggregate of patients or whether a separate optimization is necessary for each group of patients, according to their ethnicity and origin. This would also be the case for applications in other countries, especially in Latin America, which do not have their own stroke management tools.
Finally, as we have the patients’ images of chronic and subacute patients, the gold standard for the lesions at the ASPECTS-regions will be the established, allowing to increase the sample size and to better calibrate the algorithms for those cases. Once this is done, it will be put into production as a plug-in to the image viewing tools of our telestroke system [30], which provides support for in situ patient management, advanced hospital referral recommendations, teleradiology and teleneurology. This software is the basis of a new development called “appremia®” [57] which allows among others, integration with geo-referenced ambulance networks, WhatsApp communications for notifications to specialists and patient's relatives, as well as specialist schedule management.
Conclusions
The algorithm shows promise as a helpful diagnostic tool, particularly in settings, such as Colombia, where health care resources are limited. We were able to obtain an algorithm adjusted to the local population by calibrating it with our own patients.
The alignment of our methods with the evaluations of the neuroradiologists indicates the potential of our methods to serve as supportive tools in clinical settings, facilitating prompt and accurate stroke diagnosis. This approach is crucial for early stroke detection, where timely interventions are vital for patient outcomes.
Additionally, the algorithm's consistent performance across patient groups, including those not involved in the initial optimization, highlights its generalizability and applicability in real-world clinical scenarios. Our study thus offers insights into the application of automated, non-ML image analysis in medical diagnostics, presenting a viable approach to stroke assessment and treatment in resource-limited environments.
Data Availability
The data is stored in a private database and can be shared upon request.
Abbreviations
- AI: Artificial intelligence
- ASPECTS: Alberta Stroke Program Early CT Score
- AUC: Area under ROC curve
- CI: Confidence interval
- CMLS: Commercial machine-learning solution
- CNN: Convolutional neural networks
- DDET: Density Distribution Equivalence Test
- DICOM: Digital Imaging and Communication in Medicine standard
- DLN: Deep learning networks
- HU: Hounsfield Units
- LB: Lower bound
- MCA: Middle cerebral artery
- ML: Machine learning
- NCCT: Non-contrast head computed tomography
- RELDIF: Relative difference
- ROC: Receiver operating characteristic
- SD: Standard deviation
- SE: Standard error
- UB: Upper bound
References
World Health Organization: Stroke--1989. Recommendations on stroke prevention, diagnosis, and therapy. Report of the WHO task force on stroke and other cerebrovascular disorders. Stroke. 20:1407–31, 1989.
Taylor TN, Davis PH, Torner JC, Holmes J, Meyer JW, Jacobson MF: Lifetime Cost of Stroke in the United States. Stroke. 27:1459–66, 1996.
Bonita R: Epidemiology of stroke. The Lancet. 339:342–4, 1992.
Saposnik G, Del Brutto OH, for the Iberoamerican Society of Cerebrovascular Diseases: Stroke in South America: A Systematic Review of Incidence, Prevalence, and Stroke Subtypes. Stroke. 34:2103–7, 2003.
Kulcsar M, Gilchrist S, George MG: Improving Stroke Outcomes in Rural Areas Through Telestroke Programs: An Examination of Barriers, Facilitators, and State Policies. Telemed J E Health. 20:3–10, 2013.
Jauch EC, Saver JL, Adams Jr. HP, Bruno A, Connors JJ, Demaerschalk BM, et al.: Guidelines for the early management of patients with acute ischemic stroke: a guideline for healthcare professionals from the American Heart Association/American Stroke Association. Stroke. 44:870–947, 2013.
Demchuk AM, Hill MD, Barber PA, Silver B, Patel SC, Levine SR: Importance of early ischemic computed tomography changes using ASPECTS in NINDS rtPA Stroke Study. Stroke. 36:2110–5, 2005.
McLaughlin PD, Moloney F, O’Neill SB, James K, Crush L, Flanagan O, et al.: CT of the head for acute stroke: Diagnostic performance of a tablet computer prior to intravenous thrombolysis. J Med Imaging Radiat Oncol. 61:334–8, 2017.
Pexman JHW, Barber PA, Hill MD, Sevick RJ, Demchuk AM, Hudon ME, et al.: Use of the Alberta Stroke Program Early CT Score (ASPECTS) for Assessing CT Scans in Patients with Acute Stroke. AJNR Am J Neuroradiol. 22:1534–42, 2001.
Tan BY, Wan-Yee K, Paliwal P, Gopinathan A, Nadarajah M, Ting E, et al.: Good Intracranial Collaterals Trump Poor ASPECTS (Alberta Stroke Program Early CT Score) for Intravenous Thrombolysis in Anterior Circulation Acute Ischemic Stroke. Stroke. 47:2292–8, 2016.
Barber PA, Demchuk AM, Zhang J, Buchan AM: Validity and reliability of a quantitative computed tomography score in predicting outcome of hyperacute stroke before thrombolytic therapy. The Lancet. 355:1670–4, 2000.
Mainali S, Darsie ME, Smetana KS: Machine Learning in Action: Stroke Diagnosis and Outcome Prediction. Front Neurol. 12:734345, 2021.
Vayena E, Blasimme A, Cohen IG: Machine learning in medicine: Addressing ethical challenges. PLoS Med. 15:e1002689, 2018.
Padimi V, Telu VS, Ningombam DD: Performance analysis and comparison of various machine learning algorithms for early stroke prediction. ETRI Journal. 45:1007–21, 2023.
Kamal H, Lopez V, Sheth SA: Machine Learning in Acute Ischemic Stroke Neuroimaging. Front Neurol. 9, 2018.
Murray NM, Unberath M, Hager GD, Hui FK: Artificial intelligence to diagnose ischemic stroke and identify large vessel occlusions: a systematic review. J Neurointerv Surg. 12:156–64, 2019.
Takahashi N, Tsai D-Y, Lee Y, Kinoshita T, Ishii K: Z-score Mapping Method for Extracting Hypoattenuation Areas of Hyperacute Stroke in Unenhanced CT. Acad Radiol. 17:84–92, 2010.
Takahashi N, Lee Y, Tsai D-Y, Kinoshita T, Ouchi N, Ishii K: Computer-aided detection scheme for identification of hypoattenuation of acute stroke in unenhanced CT. Radiol Phys Technol. 5:98–104, 2011.
Wan S, Lu W, Fu Y, Wang M, Liu K, Chen S, et al.: Automated ASPECTS calculation may equal the performance of experienced clinicians: a machine learning study based on a large cohort. Eur Radiol. 34:1624–34, 2024.
Stoel BC, Marquering HA, Staring M, Beenen LF, Slump CH, Roos YB, et al.: Automated brain computed tomographic densitometry of early ischemic changes in acute stroke. J Med Imaging (Bellingham). 2:14004, 2015.
Yu Z, Chen Z, Yu Y, Zhu H, Tong D, Chen Y: An automated ASPECTS method with atlas-based segmentation. Comput Methods Programs Biomed. 210:106376, 2021.
Chiang P-L, Lin S-Y, Chen M-H, Chen Y-S, Wang C-K, Wu M-C, et al.: Deep Learning-Based Automatic Detection of ASPECTS in Acute Ischemic Stroke: Improving Stroke Assessment on CT Scans. J Clin Med. 11:5159, 2022.
Kuang H, Menon BK, Sohn SIL, Qiu W: EIS-Net: Segmenting early infarct and scoring ASPECTS simultaneously on non-contrast CT of patients with acute ischemic stroke. Med Image Anal. 70:101984, 2021.
Naganuma M, Tachibana A, Fuchigami T, Akahori S, Okumura S, Yi K, et al.: Alberta Stroke Program Early CT Score Calculation Using the Deep Learning-Based Brain Hemisphere Comparison Algorithm. J Stroke Cerebrovasc Dis. 30:105791, 2021.
Cao Z, Xu J, Song B, Chen L, Sun T, He Y, et al.: Deep learning derived automated ASPECTS on non-contrast CT scans of acute ischemic stroke patients. Hum Brain Mapp. 43:3023–36, 2022.
Jung S, Whangbo T: Evaluating a deep-learning system for automatically calculating the stroke ASPECT score. In 2018 International Conference on Information and Communication Technology Convergence (ICTC), Jeju, Korea (South), October 17–19, 2018. IEEE, 2018.
Shieh Y, Chang C-H, Shieh M, Lee T-H, Chang YJ, Wong H-F, et al.: Computer-aided diagnosis of hyperacute stroke with thrombolysis decision support using a contralateral comparative method of CT image analysis. J Digit Imaging. 27:392–406, 2014.
Kniep HC, Sporns PB, Broocks G, Kemmling A, Nawabi J, Rusche T, et al.: Posterior circulation stroke: machine learning-based detection of early ischemic changes in acute non-contrast CT scans. J Neurol. 267:2632–41, 2020.
Kuang H, Najm M, Chakraborty D, Maraj N, Sohn SI, Goyal M, et al.: Automated ASPECTS on Noncontrast CT Scans in Patients with Acute Ischemic Stroke Using Machine Learning. AJNR Am J Neuroradiol. 40:33–8, 2019.
Bayona H, Ropero B, Salazar AJ, Pérez JC, Granja MF, Martínez CF, et al.: Comprehensive telestroke network to optimize health care delivery for cerebrovascular diseases: algorithm development. J Med Internet Res. 22(7):e18058, 2020.
Albers GW, Wald MJ, Mlynash M, Endres J, Bammer R, Straka M, et al.: Automated Calculation of Alberta Stroke Program Early CT Score: Validation in Patients With Large Hemispheric Infarct. Stroke. 50:3277–9, 2019.
Salazar AJ, Granja M, Useche N, Bermúdez S, Morillo AJ, Torres O, et al.: Evaluation of the accuracy equivalence of head CT interpretations in acute stroke patients using a smartphone, a laptop or a medical workstation. J Am Coll Radiol. 16(11):1561–71, 2019.
Salazar AJ, Useche N, Bermúdez S, Morillo AJ, Tórres O, Granja MF, et al.: Accuracy and reliability of the recommendation for IV thrombolysis in acute ischemic stroke based on interpretation of head CT on a smartphone or a laptop. AJR Am J Roentgenol. 214(4):877–84, 2020.
Salazar AJ, Useche N, Granja M, Bermúdez S, Morillo AJ, Torres O, et al.: Reliability and accuracy of individual Alberta Stroke Program Early CT Score regions using a medical and a smartphone reading system in a telestroke network. J Telemed Telecare. 27(7):436–43, 2019.
Bağcı U, Udupa JK, Bai L: The role of intensity standardization in medical image registration. Pattern Recognit Lett. 31:315–23, 2010.
Kalavathi P, Prasath VBS: Methods on Skull Stripping of MRI Head Scan Images-a Review. J Digit Imaging. 29:365–79, 2016.
Parmar C, Barry JD, Hosny A, Quackenbush J, Aerts HJWL: Data Analysis Strategies in Medical Imaging. Clin Cancer Res. 24:3492–9, 2018.
Muschelli J: A publicly available, high resolution, unbiased CT brain template. In Proceedings of the 18th International Conference, IPMU 2020. Lisbon, Portugal, June 15–19, 2020. Springer, 2020.
Andrade N, Faria FA, Cappabianco FAM: A practical review on medical image registration: from rigid to deep learning based approaches. In the Proceedings of the 31st SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI). Paraná, Brazil, 29 October - 01 November, 2018. IEEE, 2018.
Szegedy C, Ioffe S, Vanhoucke V, Alemi A: Inception-v4, inception-resnet and the impact of residual connections on learning. In Proceedings of the AAAI Conference on Artificial Intelligence. San Francisco, California USA, February 4–9, 2017. AAAI Press, 2017.
Vajda S, Godfrey KR, Rabitz H: Similarity transformation approach to identifiability analysis of nonlinear compartmental models. Math Biosci. 93:217–48, 1989.
Lehmann TM, Gonner C, Spitzer K: Survey: interpolation methods in medical image processing. IEEE Trans Med Imaging. 18:1049–75, 1999.
Sotiras A, Davatzikos C, Paragios N: Deformable medical image registration: a survey. IEEE Trans Med Imaging. 32:1153–90, 2013.
Maes F, Vandermeulen D, Suetens P: Medical image registration using mutual information. Proceedings of the IEEE. 91:1699–722, 2003.
Klein S, Staring M, Pluim JPW: Evaluation of Optimization Methods for Nonrigid Medical Image Registration Using Mutual Information and B-Splines. IEEE Trans Image Process. 16:2879–90, 2007.
Wolff L, Berkhemer OA, van Es ACGM, van Zwam WH, Dippel DWJ, Majoie CBLM, et al.: Validation of automated Alberta Stroke Program Early CT Score (ASPECTS) software for detection of early ischemic changes on non-contrast brain CT scans. Neuroradiology. 63:491–8, 2021.
Chen W, Petrick NA, Sahiner B: Hypothesis Testing in Noninferiority and Equivalence MRMC ROC Studies. Acad Radiol. 19:1158–65, 2012.
Jin H, Lu Y: A non-inferiority test of areas under two parametric ROC curves. Contemp Clin Trials. 30:375–9, 2009.
Liu J-P, Ma M-C, Wu C, Tai J-Y: Tests of equivalence and non-inferiority for diagnostic accuracy based on the paired areas under ROC curves. Stat Med. 25:1219–38, 2006.
Obuchowski NA: Testing for equivalence of diagnostic tests. AJR Am J Roentgenol. 168:13–7, 1997.
Hillis SL, Obuchowski NA, Schartz KM, Berbaum KS: A comparison of the Dorfman–Berbaum–Metz and Obuchowski–Rockette methods for receiver operating characteristic (ROC) data. Stat Med. 24:1579–607, 2005.
Nelson JC, Pepe MS: Statistical description of interrater variability in ordinal ratings. Stat Methods Med Res. 9:475–96, 2000.
Altman DG: Practical Statistics for Medical Research. 1st edition. London: Chapman and Hall/CRC; 1990.
Chen Z, Shi Z, Lu F, Li L, Li M, Wang S, et al.: Validation of two automated ASPECTS software on non-contrast computed tomography scans of patients with acute ischemic stroke. Front Neurol. 14, 2023.
Ferreti LA, Leitao CA, Teixeira BCA, Lopes Neto FDN, Zétola VF, Lange MC: The use of e-ASPECTS in acute stroke care: validation of method performance compared to the performance of specialists. Arq Neuropsiquiatr. 78:757–61, 2020.
Muir KW, Santosh C: Imaging of acute stroke and transient ischaemic attack. J Neurol Neurosurg Psychiatry. 76 Suppl 3:iii19–28, 2005.
Salazar AJ, Hernández Hoyos M, Bayona H, Suarez JM, Torres JS, Rivera JM, et al.: Appremia. Available at https://www.appremia.com/. Accessed 16 June 2024.
Acknowledgements
We thank the neuroradiologists who interpreted the images, as well as our institutions for supporting this study.
Funding
Open Access funding provided by Colombia Consortium. This study was funded by the National Ministry of Science, Technology, and Innovation of Colombia—Minciencias (Grant 926–93362).
Author information
Contributions
Conception, design, and funding proposal for this study were performed by Marcela Hernández Hoyos and Antonio Salazar. Algorithm design was performed by Esteban Ortiz, Juan Rivera, Marcela Hernández Hoyos, Manuel Granja, and Antonio Salazar. Data collection and statistical analysis were performed by Esteban Ortiz and Antonio Salazar. All authors contributed to the analysis and interpretation of the results. All authors contributed to writing and reviewing the article. All authors read and approved the final manuscript.
Ethics declarations
Competing Interests
This study was funded by the National Ministry of Science, Technology, and Innovation of Colombia—Minciencias (Grant 926–93362). The authors claim that they have no competing interests with respect to Minciencias or any other third party.
Ethics Approval
The Ethics Committee of University of Los Andes approved this retrospective study and waived the requirement for informed consent. This committee confirmed that this research is without risk (1556–2022, 06–03-2022).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ortiz, E., Rivera, J., Granja, M. et al. Automated ASPECTS Segmentation and Scoring Tool: a Method Tailored for a Colombian Telestroke Network. J Imaging Inform Med (2024). https://doi.org/10.1007/s10278-024-01258-9
DOI: https://doi.org/10.1007/s10278-024-01258-9