Introduction

An aneurysm is a permanent localized dilation of the artery wall caused by local weakness or structural damage. Cerebral aneurysms are the most common type, and their main complication is rupture with bleeding. Among all types of aneurysms, those located in the anterior part of the circle of Willis are prone to rupture [1, 2]. The most important factors for aneurysm rupture are location, size, and shape, with an annual rupture risk of 0.95% [3]. Cerebral aneurysm rupture is the most significant cause of non-traumatic subarachnoid hemorrhage, with a mortality rate of 23 to 51% [4,5,6,7], and the majority of patients have a poor prognosis. Therefore, rapid and accurate diagnosis, as well as active follow-up treatment of aneurysms, is urgently needed to reduce mortality and improve the prognosis of patients [8, 9].

Many imaging techniques can be used to identify aneurysms, including ultrasound, computed tomography angiography (CTA), and digital subtraction angiography (DSA). However, ultrasound is not suitable for detecting head and neck aneurysms because the skull blocks its signals. Although DSA diagnoses aneurysms accurately and is considered the gold standard for intracranial aneurysm diagnosis, it is not widely applied in clinical practice due to its complexity and invasiveness [10]. CTA, a rapid noninvasive examination, is commonly used for identifying aneurysms in clinical practice, with its accuracy reaching as high as 98% [11,12,13]. Under certain circumstances, CTA can be used to confirm the presence of an aneurysm instead of DSA [14].

Currently, certain AI-assisted diagnostic software packages have been permitted for routine clinical use owing to their excellent performance in medical imaging tasks [15]. AI has made significant progress in certain aspects of routine medical care by automatically integrating and processing the data provided by clinics [16,17,18]. Many believe that AI-assisted systems can improve diagnostic efficiency and reduce workload as an auxiliary tool in clinical work [19, 20]. The AI-assisted diagnosis system for head and neck blood vessels (CerebralDoc®), developed by Shukun (Beijing, China), has recently been put into clinical use. It is AI-aided diagnostic software based on enhanced CTA scans that provides an end-to-end solution, including automatic bone subtraction, vessel segmentation, volume rendering (VR) reconstruction, curved planar reformation (CPR), multiplanar reformation (MPR), maximum intensity projection (MIP), and other post-processed images. Aneurysms are detected, and the corresponding morphological parameters are extracted and displayed in the user interface. Several studies have reported results on the clinical application of similar software [21,22,23,24,25]; for instance, certain software is capable of automatic segmentation and image diagnosis. The focus of this study, however, is to explore whether the software has reference value for diagnosing head and neck aneurysms and whether artificial intelligence image reconstruction is clinically feasible.

Therefore, this study aimed to examine the performance of AI in detecting aneurysms and determining morphological parameters, as well as its consistency with experienced radiologists, thereby exploring the potential clinical application value of the system.

Materials and methods

Study design and patients

The hospital ethics committee approved the retrospective study and waived the requirement for written informed consent. All patients undergoing head and neck angiography had signed informed consent to undergo the CTA examination.

A total of 354 cases of head and neck CT angiography performed in our hospital from August 2018 to October 2021 were collected for the study. There were 173 males and 181 females with an average age of 61.62 ± 12.45 years (range, 16 to 89 years). Grouping and inclusion/exclusion criteria are presented in Table 1 and Fig. 1.

Table 1 Grouping situations and inclusion-exclusion criteria
Fig. 1 Flowchart of case collection and grouping

CTA protocols and image postprocessing

A total of 354 head and neck angiography cases were scanned using a 256-row wide-body CT scanner (Revolution CT, GE Healthcare). The scanning parameters were as follows: tube voltage of 100 kV, tube current of 200–720 mA (automatic), slice thickness of 5 mm, and a pitch of 0.992. The non-ionic contrast agent iodixanol was injected through the right antecubital vein at a concentration of 320 mgI/mL, with a dosage of 0.8–1 mL/kg and an injection flow rate of 5 mL/s. All images were then uploaded to a GE ADW 4.7 post-processing workstation for image reconstruction and subsequent measurements.

DSA examination protocol and image postprocessing

All DSA examinations were performed on a Siemens Artis Zee digital subtraction angiography machine, and subsequent processing was carried out using a GE ADW 4.6 workstation. The Seldinger technique [26] was employed for puncturing, and the catheter was inserted through the femoral artery. Standard frontal, lateral, and oblique 2D-DSA images were acquired by injecting iodixanol solution into both the internal carotid and vertebral arteries. Then, the workstation was used to reconstruct the entire brain vasculature using 3D-VR, MIP, and other techniques, followed by examination of the brain vessels through rotational DSA. If an aneurysm was found, it was displayed in its optimal view on the workstation, and relevant information such as its location, neck width, and maximum diameter was recorded.

Intracranial aneurysm interpretation

The interpretation of intracranial aneurysms was carried out in two groups: the DSA group and the non-DSA group. To evaluate AI performance, DSA findings served as the gold standard for the DSA group, while the diagnostic findings of radiologists were used as the reference standard for the non-DSA group.

AI interpretation pipeline: CerebralDoc® is a comprehensive system designed for vessels in the head and neck region. It reconstructs and displays images such as VR, MIP, Cerebral CPR, and vessel straightening through a user interface, providing a clearer illustration of the vessels. Aneurysms are automatically segmented, marked, and measured across various morphological dimensions, based on anatomical structure partitions to ensure accurate identification of aneurysms.

CerebralDoc® incorporates multiple networks designed to remove bones, segment vessels, detect aneurysms, and segment aneurysms, all based on several cascaded ResU-Net models. ResU-Net is a modified U-Net framework that includes an additional residual block, which can optimize the network and improve accuracy due to its increased depth. The bone and main vessels are segmented consecutively using models described in [22]. Afterwards, aneurysms are detected and segmented using two cascaded ResU-Net networks, where the first network (ResUNet1) is designed for detection and the second (ResUNet2) for segmentation. Original CTA images are split into cubes of 128 × 128 × 128 voxels, as are the vessel-segmented images. The vessel-segmented images are input as an additional channel into both ResUNet1 and ResUNet2 to preserve semantic information. The detailed diagram is presented in Fig. 2.
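The patching step described above can be sketched as follows. This is only a minimal illustration and not the CerebralDoc® implementation: the function names, the example volume shape, and the choice to shift the last cube back so that it stays inside the volume are assumptions, because the text does not state how volume borders or overlaps are handled.

```python
from itertools import product

PATCH = 128  # cube edge length stated in the text


def patch_starts(length, patch=PATCH):
    """Start indices of cubes along one axis.

    Assumption: the last cube is shifted back so that it ends at the volume
    border. Axes shorter than `patch` yield a single start at 0 and would
    need padding, which is not handled in this sketch.
    """
    if length <= patch:
        return [0]
    starts = list(range(0, length - patch, patch))
    starts.append(length - patch)
    return starts


def cube_origins(shape, patch=PATCH):
    """(z, y, x) origins of all cubes that tile a volume of the given shape."""
    return list(product(*(patch_starts(n, patch) for n in shape)))


# Hypothetical CTA volume: 300 slices of 512 x 512 pixels.
origins = cube_origins((300, 512, 512))
print(len(origins), origins[0], origins[-1])
```

Each origin would index the same cube in both the original CTA volume and the vessel-segmented volume, so that the two cubes can be stacked as input channels for the detection and segmentation networks.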

Fig. 2 Schematic diagram of aneurysm segmentation in CerebralDoc®

Radiologists’ interpretation: All scans were independently reviewed by two radiologists with 10 years of diagnostic experience, and image reconstruction was performed using VR, MIP, CPR, and MPR on a GE post-processing workstation [21]. To ensure consistency in diagnostic criteria, a third senior radiologist with 20 years of diagnostic experience was consulted in case of any disagreement.

The location of aneurysms was named after the parent main blood vessels, and this study included eight branches of vessels: internal carotid artery (ICA), anterior cerebral artery (ACA), middle cerebral artery (MCA), posterior cerebral artery (PCA), basilar artery (BA), vertebral artery (VA), posterior communicating artery (PComA), and anterior communicating artery (AComA).

Given the subjectivity and inconsistency in the definition of vessel boundaries, both strict and loose criteria were employed to evaluate the consistency of aneurysm locations. The strict criteria require the AI to output the same location as the radiologists, whereas the loose criteria also accept an AI output that refers to a branch adjacent to or intersecting with the one indicated by the radiologists. Percentages were calculated under both criteria for comparison. Finally, the agreement in location, neck width, and maximum diameter of aneurysms between the two groups was evaluated statistically.
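The difference between the strict and loose criteria can be illustrated with a short sketch. The adjacency list below is hypothetical: it contains only two branch pairs suggested by the intersecting positions mentioned in this paper (ICA with PComA, ACA with AComA) and is not the full rule set used in the study. The example readings are likewise invented for illustration.

```python
# Hypothetical pairs of branches treated as adjacent or intersecting.
ADJACENT = {
    frozenset(("ICA", "PComA")),
    frozenset(("ACA", "AComA")),
}


def location_match(ai_branch, ref_branch, loose=False):
    """True if the AI location agrees with the reference location.

    Strict: the branch labels must be identical.
    Loose: an adjacent or intersecting branch is also accepted.
    """
    if ai_branch == ref_branch:
        return True
    return loose and frozenset((ai_branch, ref_branch)) in ADJACENT


def concordance(pairs, loose=False):
    """Percentage of (AI, reference) pairs that match under the chosen criteria."""
    hits = sum(location_match(ai, ref, loose) for ai, ref in pairs)
    return 100.0 * hits / len(pairs)


# Invented (AI, reference) readings for four aneurysms.
readings = [("ICA", "ICA"), ("PComA", "ICA"), ("MCA", "ACA"), ("AComA", "ACA")]
print(concordance(readings), concordance(readings, loose=True))
```

Under this sketch, relaxing the criteria can only keep or raise the concordance, since every strict match is also a loose match.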

Subjective evaluation of image quality and image reconstruction time

The quality of the reconstructed images was evaluated using a five-point scale, which is detailed in Table 2. The time taken by the two radiologists and by the AI software to reconstruct each head and neck CTA image was recorded.

Table 2 Scale for subjective assessment of image quality

Statistical analysis

Continuous variables were reported as the mean ± standard deviation (SD), and categorical variables were presented as numbers and percentages. The intraclass correlation coefficient (ICC) and Bland-Altman plots were used to evaluate the consistency of the neck width and maximum diameter measurements of aneurysms by the AI system compared to those by radiologists. The kappa statistic was calculated to evaluate the agreement on the location of aneurysms identified by AI and radiologists. The ICC and kappa values were categorized as indicating poor (< 0.40), moderate (0.40 to 0.75), or good (> 0.75) agreement. We used the χ² test to analyze the differences in sensitivity, specificity, and accuracy, and the rank sum test to analyze the difference in subjective image ratings. All statistical analyses were performed using SPSS version 25.0 (IBM Corp., Armonk, NY, USA), with p < 0.05 considered statistically significant. Bland-Altman plots were generated using GraphPad Prism 9 software and are presented as mean ± 1.96 × SD.
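The agreement measures named above can be sketched in a few lines. This is an illustration only: the analyses in this study were run in SPSS and GraphPad Prism, the function names are hypothetical, the kappa shown is the unweighted Cohen's kappa, and the Bland-Altman limits use the sample SD of the paired differences, which are assumptions about details the text does not specify. The toy inputs are invented.

```python
from collections import Counter
from statistics import mean, stdev


def agreement_category(value):
    """Thresholds from the text: poor < 0.40, moderate 0.40-0.75, good > 0.75."""
    if value < 0.40:
        return "poor"
    if value <= 0.75:
        return "moderate"
    return "good"


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two lists of categorical labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the marginal label frequencies of each rater.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


def bland_altman(x, y):
    """Mean difference (bias) and limits of agreement, bias +/- 1.96 * SD."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Invented toy data, not study measurements.
kappa = cohen_kappa(["ICA", "ICA", "MCA", "MCA"], ["ICA", "MCA", "MCA", "MCA"])
bias, lower, upper = bland_altman([3.0, 4.0, 5.0], [3.0, 3.0, 3.0])
print(kappa, agreement_category(kappa), bias, lower, upper)
```

Applied to the kappa and ICC values reported in the Results, `agreement_category` reproduces the poor, moderate, and good labels defined in this section.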

Results

Patient population

Among the 280 cases diagnosed with aneurysms (either single or multiple), there were 115 cases of ICA, 28 cases of ACA, 71 cases of MCA, 9 cases of PCA, 3 cases of BA, 5 cases of VA, 30 cases of PComA, and 38 cases of AComA aneurysms. The cases were divided into the DSA group (102 cases) and the non-DSA group (178 cases).

AI performance in aneurysm detection

AI detected 235 out of 280 cases of aneurysms, with 31 false positives and 45 false negatives. These data are pooled from both groups; group-specific results are detailed in Table 3. The cases of aneurysms missed by AI are illustrated in Fig. 3.
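As a worked illustration of the pooled counts above, the following sketch derives a sensitivity and a positive predictive value from them. It assumes that the 31 false positives are counted per case on the same footing as the 235 detections, which the text does not state explicitly; the resulting pooled figures are illustrative and do not replace the group-specific results in Table 3.

```python
def sensitivity(tp, fn):
    """Fraction of true aneurysm cases that were detected."""
    return tp / (tp + fn)


def positive_predictive_value(tp, fp):
    """Fraction of positive calls that were true aneurysm cases."""
    return tp / (tp + fp)


# Pooled counts quoted in the text: 235 detected, 45 missed, 31 false positives.
tp, fn, fp = 235, 45, 31
pooled_sensitivity = sensitivity(tp, fn)
pooled_ppv = positive_predictive_value(tp, fp)
print(f"sensitivity = {pooled_sensitivity:.1%}, PPV = {pooled_ppv:.1%}")
```

Specificity and accuracy cannot be derived this way because the number of true negatives is not given in the running text.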

Table 3 Detection rate of aneurysms
Fig. 3 Typical cases in which aneurysms recognized by radiologists were not reconstructed or diagnosed by AI (1, radiologists; 2, AI). (a) In case 1, the AI failed to detect the dissecting aneurysm protrusion in the left C3 segment (L-C3). (b) In case 2, radiologists detected a small aneurysm in the P2 segment of the R-PCA. (c) In case 3, the radiologists detected a small aneurysm at the junction of the L-M1 and L-M2.

In the DSA group, the consistency of location between DSA and radiologists was moderate (K = 0.722), whereas the consistency between DSA and AI was poor (K = 0.365). When the criteria were relaxed from strict to loose, the concordance for aneurysm location rose from 61.0% to 92.7% between DSA and radiologists and from 33.9% to 83.9% between DSA and AI. The reliability of neck width and maximum diameter was good for DSA compared with radiologists (ICC = 0.816 and 0.872, respectively) and for DSA compared with AI (ICC = 0.858 and 0.835, respectively) (Tables 4 and 5).

Table 4 Consistency of location
Table 5 Consistency of neck width and maximum diameter

In the DSA group, the mean differences between DSA and AI measurements were −0.11 mm (limits of agreement, −2.02 mm to 1.80 mm) for neck width and 0.58 mm (limits of agreement, −2.95 mm to 4.10 mm) for the maximum diameter of the aneurysm. DSA and AI showed good consistency in neck width and maximum diameter, as demonstrated in Fig. 4.

Fig. 4 Bland-Altman plots of neck width and maximum diameter for the DSA group (a,b) and the non-DSA group (c). Red lines represent the mean differences, and blue lines indicate 1.96 standard deviations below and above the mean differences. Points lying outside this range are marked in red.

In the non-DSA group, the kappa test revealed that the location of aneurysms was moderately consistent between AI and radiologists (K = 0.537). Furthermore, when the criteria were relaxed from strict to loose, the accuracy of AI increased from 47.5% to 82.2%. The reliability of the neck width and maximum diameter measurements was good, with ICC values of 0.802 and 0.872, respectively (Tables 4 and 5).

In the non-DSA group, the mean differences between radiologists and AI were −0.42 mm (limits of agreement, −2.06 mm to 1.21 mm) for neck width and 0.31 mm (limits of agreement, −1.82 mm to 2.44 mm) for the maximum diameter.

Reconstruction time and subjective image quality evaluation

The average time taken by the two radiologists to reconstruct head and neck CTA images was 141.1 ± 52.6 s and 113.2 ± 42.5 s, respectively. The average reconstruction time of the AI system was 6.9 ± 3.6 s, which was significantly shorter than that of either radiologist (P < 0.001). In addition, the software is capable of simultaneously generating the corresponding diagnostic reports.

The quality of the radiologist-reconstructed and AI-generated images was assessed by two radiologists with 10 years of reading experience, based on the criteria outlined in Table 2. Radiologists 1 and 2 achieved good consistency in scoring the radiologists’ images (K = 0.845) and also showed good inter-rater consistency for the AI images (K = 0.750). The Wilcoxon rank sum test showed a statistically significant difference between the two groups of scores (P < 0.001). The average subjective scores of the radiologists’ and AI images were 3.59 ± 0.55 and 4.68 ± 0.44, respectively, indicating that the quality of the AI images was better than that of the radiologists’ images (Figs. 5 and 6).

Fig. 5 Cases reconstructed by radiologists and by AI. Panels a1 and b1 present the coronal views of two patients with different aneurysms. Panels a2 and b2 were reconstructed by radiologists, and panels a3 and b3 by AI. The two cases represent aneurysms located at different sites (case a, AComA; case b, the junction of R-M1 and M2).

Fig. 6 MIP images by radiologists and AI for two cases (a and b). Panels a1 and b1 were produced by radiologists; panels a2 and b2 were produced by AI.

Discussion

Due to the recent popularization of CT in clinical diagnosis, the workload of radiologists has soared, leading to fatigue and the ‘satisfaction of search’ phenomenon, which has resulted in a considerable reduction in the accuracy of CTA diagnosis [27, 28]. This phenomenon refers to the situation where certain lesions are overlooked because the radiologist's visual attention is captured by another lesion within the same image, causing the search of the image to cease before the remaining lesions are detected. To address this problem, we integrated AI as an auxiliary diagnostic tool to automatically detect, localize, segment, and measure aneurysms, as well as to pre-fill detailed information into the reports, thereby reducing the workload and improving consistency. Recent studies have demonstrated that AI contributes to every stage of aneurysm management [14, 29], including detection, rupture risk assessment, complication prediction, diagnosis, treatment, and recurrence prediction. Therefore, AI has the potential to significantly improve the accuracy of diagnosing head and neck aneurysms, making it beneficial to incorporate into daily clinical routines. Additionally, AI is more consistent than radiologists, with no inter-observer variability, and it is more efficient and saves labor [30,31,32]. In clinical practice, it is sometimes necessary to use AI results for the rapid evaluation of aneurysms, which is of great significance for patients at acute risk and for radiologists with less experience.

In our study, the CerebralDoc® system demonstrated reliable performance in the automatic detection of aneurysms and the measurement of morphological parameters. The sensitivity, specificity, and accuracy of the AI system in detecting aneurysms were 88.24%, 50.00%, and 81.97%, respectively, which were similar to those of radiologists at 95.10%, 30.00%, and 84.43%, respectively. Comparable results have been reported elsewhere. In the study by Claux et al., the algorithm demonstrated satisfactory diagnostic performance, with a sensitivity of 78% and a positive predictive value of 62%; the sensitivity matched that of radiology residents [33]. In the study by Park et al., a deep learning-based model (HeadXNet) was able to assist radiologists, increasing their detection rate from 83% to 89% [34]. Based on these results, we hypothesize that artificial intelligence can assist in pre-diagnosing CTA images, which may help radiologists screen cases and allocate more time to critical patients.

The consistency of aneurysm localization was moderate between DSA and radiologists in the DSA group (K = 0.722) and between radiologists and AI in the non-DSA group (K = 0.537), whereas the consistency between DSA and AI was poor (K < 0.4). Previous studies have demonstrated that it is typically challenging to accurately identify the parent artery of an aneurysm [35]. Regardless of the imaging technique employed, the appearance of the vessels bearing the aneurysm can sometimes hinder the diagnostic process. Neither CTA nor DSA can identify the origin of blood vessels with 100% reliability, but the differences between them are acceptable. CTA may provide better visualization of surgical or vascular anatomy [36]. Although the examination methods used by the two parties differ, the final decision is made by radiologists. Therefore, we suspect that the primary reason for the poor agreement between AI and radiologists regarding the location was the complex anatomical structure. There are many intersecting positions (such as the R-C7 segment and R-PComA segment, or the L-A1 segment and AComA segment) and adjacent positions (such as the L-M1 segment and L-M2 segment, or the R-P2 segment and R-P3 segment) among the head and neck arteries, especially the multiple arteries composing the circle of Willis. Radiologists often fail to accurately judge the location of aneurysms when they occur at these sites, either due to subjectivity or because the neck width is too small. To eliminate this interference, we employed strict and loose criteria to compare the consistency of location results. In the DSA group, the agreement in location determination between DSA and AI improved by 50.0 percentage points (from 33.9% to 83.9%). In the non-DSA group, the agreement between radiologists and AI improved by 34.7 percentage points (from 47.5% to 82.2%). This result indicated that, after accounting for the differences caused by anatomical factors, the diagnostic consistency between AI and radiologists for the location of aneurysms was quite high.

In this study, the neck width and maximum diameter of the aneurysm were used as indicators for morphological measurement. In both groups, the agreement between AI and DSA measurements, as well as between AI and radiologist measurements, was satisfactory (ICC > 0.800). Various shapes of aneurysms or overlapping parent arteries can make it difficult to locate the neck, leading to disagreement among radiologists in measurement, an issue that AI is able to circumvent. In practice, it is impossible to guarantee that every case has DSA available as the gold standard, which makes AI more valuable in emergencies. The Bland-Altman plots from both groups showed that almost all differences fell within the mean ± 1.96 × SD.

The subjective evaluation of post-processing image quality demonstrated that AI produces images of superior quality compared with the standard tools on a workstation. The advantages of AI-based automatic image processing include clearer detail restoration, reconstruction of longer and more distal vessels, and better automatic bone removal in MIP images [22, 24, 37]. Unlike the automatic bone subtraction of a workstation, AI-reconstructed images displayed a cleaner and smoother arterial wall, particularly in the C4-6 segment of the ICA as it runs through the skull structure [22]. In addition, the MIP images from the AI clearly depicted the calcifications, stenoses, and plaques in the vessel wall. Incomplete removal of bone structures external to the vessel wall may lead to the misjudgment of some small aneurysms. There are nuanced differences that are challenging for trained eyes and traditional AI to discern but can be more accurately captured by current AI image processing techniques. Diagnosing tiny aneurysms nevertheless remains challenging, and the accuracy of AI in this task still needs improvement. AI can assist in eliminating artifacts caused by veins or metallic objects, resulting in images that are smoother, more delicate, and cleaner. In addition to serving as an auxiliary diagnostic tool for aneurysms, it also holds significant value for the detection of other diseases originating from head and neck vessels. It is foreseeable that AI will be increasingly utilized in clinical practice in the future.

This article has several limitations. First, the non-DSA group lacks a DSA gold standard reference for diagnosis, and the retrospective design involves inevitable selection bias and variations in imaging protocols. Future prospective studies should unify the study protocol to obtain reproducible results with stable performance. Second, aneurysm samples were collected on the premise that the maximum diameter was ≥ 2 mm, and both the radiologist and DSA measurements carry some error. Finally, this paper only evaluates the efficiency of AI in aneurysm detection in CTA images and does not cover other common diseases, which are topics for future discussion.

Conclusion

The AI-assisted software (CerebralDoc®) exhibits high sensitivity and accuracy in the detection of aneurysms. The outcomes of its automatic aneurysm localization, neck width measurement, and maximum diameter calculations correspond well with those of radiologists. Additionally, AI offers superior image quality in the post-processing of head and neck CTA images compared with radiologists, and it operates more efficiently.