Background

Giant cell arteritis (GCA) is a granulomatous large-vessel vasculitis that affects the aorta and its main branches, with a predilection for the cranial arteries [1]. GCA is the most common systemic vasculitis in individuals aged over 50 years, and its incidence increases with age [2]. GCA is considered a medical emergency, and prompt administration of high-dose glucocorticoids is the mainstay of therapy to prevent ischemic cranial complications such as visual loss and stroke [3].

An accurate diagnosis of GCA is essential in clinical practice, since patients need long-term glucocorticoid therapy, which carries a significant burden of adverse events. In patients presenting with GCA manifestations, the diagnosis can be confirmed by temporal artery biopsy (TAB) or by imaging studies such as color Doppler ultrasound (CDU) of temporal and axillary arteries, high-resolution magnetic resonance angiography (MRA) of cranial arteries, or large-vessel imaging with positron emission tomography (PET) or computed tomography angiography (CTA) [4]. Although TAB has been considered the gold standard method for GCA diagnosis [5] and the American College of Rheumatology (ACR) guidelines for GCA recommend TAB as the first diagnostic test [6], CDU has replaced TAB in some scenarios, as this imaging modality can detect the halo sign, i.e., a dark area surrounding the vessel lumen, which is regarded as the most important sign of vasculitis in temporal arteries [7, 8]. Recently, the European Alliance of Associations for Rheumatology (EULAR) and the British Society for Rheumatology (BSR) guidelines have recommended CDU of temporal and axillary arteries as the first diagnostic test to confirm GCA [9, 10]. In one of the guidelines, assessment of the axillary arteries is recommended only when CDU of the temporal arteries is normal [9].

Historically, studies and systematic reviews evaluating the performance of the halo sign in temporal arteries for GCA diagnosis have focused on patients presenting the cranial phenotype of the disease [11,12,13,14,15,16]. On the other hand, inflammatory findings in the aorta and proximal branches such as axillary arteries have been increasingly recognized by imaging methods at disease presentation in GCA patients [17]. Large-vessel involvement in GCA has been shown to be associated with a higher relapse rate, increased mortality, higher levels of acute phase reactants, and an increased cumulative glucocorticoid dose compared to cranial GCA patients [18,19,20]. Therefore, it is essential to include large vessels such as the aorta and axillary arteries in the initial assessment of patients with suspected GCA [4].

The CDU technique has improved over the last few years, and the use of high-resolution devices with high-frequency B-mode probes has allowed a better assessment of cranial arteries for inflammatory signs [21], thereby increasing the chance of detecting vasculitis in cranial arteries in patients with suspected GCA. This systematic review with meta-analysis aims to evaluate the accuracy of the halo sign in temporal arteries for GCA diagnosis. Moreover, the performance of blood flow abnormalities (i.e., stenosis and occlusions) of temporal arteries, the compression sign in temporal arteries, and the assessment of the axillary arteries for GCA diagnosis were also analyzed.

Methods

Study’s protocol and registry

This systematic review with meta-analysis was registered at PROSPERO (International Prospective Register of Systematic Reviews) under the title “Accuracy of Doppler ultrasonography in the diagnosis of Giant Cells Arteritis: a systematic review” in 2016 (registry CRD42016046860), available at https://www.crd.york.ac.uk/prospero/#recordDetail.

This systematic literature review was conducted according to the Cochrane Handbook and reported following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [22].

Eligibility criteria

Inclusion criteria for this systematic review with meta-analysis were as follows: prospective or retrospective observational studies evaluating individuals over 50 years of age with suspected GCA who underwent CDU of temporal arteries; CDU performed no later than two weeks after starting glucocorticoid therapy; and a description of the hypoechoic halo sign. The following were accepted as a reference standard for the clinical diagnosis of GCA: fulfillment of the 1990 ACR criteria for GCA, or clinical manifestations of GCA with the diagnosis confirmed by TAB or by imaging studies such as MRA of cranial arteries or large-vessel CTA, PET or MRA. No language restriction or publication date filters were applied. Exclusion criteria were case–control studies, letters to the editor and studies using a B-mode probe frequency < 10 MHz for the assessment of temporal arteries.

Selection of studies

We performed a comprehensive literature review of studies published up to December 2022 using the following platforms: LILACS (Literatura Latino Americana em Ciências da Saúde e do Caribe); PubMed; Cochrane Central Register of Controlled Trials (CENTRAL, by Wiley Cochrane Library, Issue 9); Cochrane Register of Diagnostic Test Accuracy Studies (CRDTAS); Embase (by Elsevier); ClinicalTrials.gov (www.clinicaltrials.gov); Health Technology Assessment Database (HTDA) in the Cochrane Library; and World Health Organization (WHO) International Clinical Trials Registry Platform (ICTRP) (www.who.int/ictrp). Additional literature searches were performed in the gray literature (http://www.opengrey.eu/).

Data collection

Two authors independently extracted data from the included studies. Data regarding study identification and eligibility criteria were included. In case of disagreement, we reached a consensus through discussion.

Risk of bias assessment (quality assessment)

Two authors independently assessed the quality of the studies using the QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies) [23] assessment tool. Disagreements were resolved in a consensus discussion.

Statistics

For each data analysis, we created a forest plot and a summary receiver operating characteristic (sROC) curve. The meta-analysis was performed using the bivariate model to calculate the pooled sensitivity and specificity, in accordance with the recommendations of the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. We also calculated the positive likelihood ratio (LR+), negative likelihood ratio (LR−), and diagnostic odds ratio (diagnostic OR) with their respective 95% confidence intervals (95% CI). The heterogeneity of included studies was investigated by subgroup analysis regarding the year of publication and disease prevalence. The sensitivity analysis evaluated study design and risk of bias in studies using a probe frequency ≥ 15 MHz to select studies with a low risk of bias. Then, the meta-analysis of the halo sign for GCA diagnosis was performed only in those high-quality studies using a probe ≥ 15 MHz. The I2 statistic was applied to quantify inconsistency between studies, and I2 values over 50% indicated substantial heterogeneity. The statistical analysis was performed using the R 4.2.1 software for Windows. We assessed the certainty of evidence using the GRADE tool (Grading of Recommendations Assessment, Development, and Evaluation Working Group) for diagnostic studies [24,25,26].
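The per-study quantities named above can be illustrated with a minimal Python sketch. This is not the bivariate model or the R code used in this review; it only shows the standard definitions of sensitivity, specificity, LR+, LR−, diagnostic OR, and the I2 statistic derived from Cochran's Q. The function names and the example 2 × 2 counts are hypothetical and chosen purely for illustration.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Accuracy measures for one study from its 2x2 table.

    No continuity correction is applied, so every cell is assumed
    to be non-zero (an assumption of this sketch).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),   # positive likelihood ratio
        "lr_neg": (1 - sensitivity) / specificity,   # negative likelihood ratio
        "dor": (tp * tn) / (fp * fn),                # diagnostic odds ratio
    }


def i_squared(q, df):
    """I2 (in percent) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical study: 100 GCA patients and 100 controls (illustrative counts only).
m = accuracy_measures(tp=80, fp=10, fn=20, tn=90)
for name, value in m.items():
    print(f"{name}: {value:.2f}")

# Hypothetical Q statistic of 40 across 11 studies (10 degrees of freedom).
print(f"I2: {i_squared(40.0, 10):.1f}%")
```

In this sketch, I2 values above 50% would correspond to the threshold for substantial heterogeneity used in this review.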

Results

We found 4501 titles in the PubMed, Embase, CENTRAL, LILACS, and Cochrane databases. After excluding duplicates, we screened 3365 publications, and 78 selected references were read in full. Fifty-six studies were excluded for the following reasons: the reference standard did not meet the inclusion criteria of this systematic review (n = 30); case–control studies or case reports (n = 7); probe frequency below 10 MHz (n = 5); studies including patients with a previous GCA diagnosis (n = 6); and studies monitoring the halo sign or with a study design different from those stated in the inclusion criteria of this systematic review (n = 8). Figure 1 describes the flow diagram for the selection of studies.

Fig. 1

Flow chart of studies’ selection in the systematic review. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2021;372:n71. 10.1136/bmj.n71. For more information, visit: http://www.prisma-statement.org/

Characteristics of included studies

Twenty-two studies published up to December 2022, with a total of 2893 participants, met the inclusion criteria. All subjects had suspected GCA, underwent assessment for the presence of the hypoechoic halo sign on CDU of temporal arteries, and were classified as GCA by the 1990 ACR criteria or had GCA confirmed by TAB or imaging studies (i.e., large-vessel PET or CTA) as the reference standard. Studies were analyzed using the QUADAS-2 tool, and each criterion was scored as yes, no, or unclear. The characteristics of the included studies are depicted in Table 1.

Table 1 Characteristics of GCA diagnostic studies by CDU of temporal arteries using clinical diagnosis as a gold standard

Methodological quality of included studies

Figure 2 shows the results of the assessment of the methodological quality of included studies. Forty-one percent (9/22) of studies were considered at high or unclear risk of bias in two or more domains of the QUADAS-2 tool. The most relevant risk of bias was the lack of blinding of the reference standard in five prospective studies. Fifty-nine percent (13/22) of studies were considered at low risk of bias. Only the studies performed by Diamantopoulos et al. [27] and Nesher et al. [28] showed a low risk of bias in all four domains. Concerns about the applicability of evidence were low in virtually all domains for nearly all included studies. In general, the studies were reasonably well reported. We tried to contact the corresponding authors of some studies included in this systematic review to clarify minor issues, but no responses were obtained.

Fig. 2

Assessment of methodological quality of the studies included in this systematic review. Green signals indicate low risk of bias, yellow signals indicate uncertain risk of bias, and red signals indicate high risk of bias. The domains include relevant QUADAS questions

Analysis of the halo sign in temporal arteries for GCA diagnosis

We calculated the diagnostic accuracy of 22 studies [27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48] including 2893 participants. The pooled sensitivity and specificity of all studies were 0.76 (95% CI 0.69–0.81) and 0.93 (95% CI 0.89–0.95), respectively, and the quality of the evidence was moderate. Heterogeneity was statistically significant for both sensitivity and specificity (I2 = 88.7%, p < 0.05 and I2 = 82.9%, respectively) (Fig. 4A). The LR+ was 10.15 (95% CI 6.42–16.31) and the LR− was 0.26 (95% CI 0.20–0.35). In the sROC curve, the area under the curve (AUC) was 0.91 and the diagnostic OR was 38.45 (95% CI 19.28–76.74). The sROC curve of primary studies evaluating the halo sign by CDU of temporal arteries for GCA diagnosis is depicted in Fig. 3. We analyzed the impact of the improvements in CDU devices by evaluating only studies using ultrasound probes with a frequency ≥ 15 MHz. The meta-analysis of these studies yielded a pooled sensitivity of 0.84 (95% CI 0.75–0.89) and pooled specificity of 0.93 (95% CI 0.85–0.97).
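To illustrate how the pooled likelihood ratios reported above could be read at the bedside, the following minimal Python sketch converts a pre-test probability into a post-test probability through odds. It is an illustration and not part of the analysis performed in this review. The pre-test probability of 0.43 is an assumption taken from the median prevalence of GCA among suspected cases in the included studies, and the function name is hypothetical.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


PRE_TEST = 0.43   # assumption: median prevalence among suspected cases
LR_POS = 10.15    # pooled LR+ of the halo sign in temporal arteries
LR_NEG = 0.26     # pooled LR- of the halo sign in temporal arteries

p_pos = post_test_probability(PRE_TEST, LR_POS)
p_neg = post_test_probability(PRE_TEST, LR_NEG)

print(f"Probability of GCA after a positive halo sign: {p_pos:.2f}")  # about 0.88
print(f"Probability of GCA after a negative halo sign: {p_neg:.2f}")  # about 0.16
```

Under these assumptions, a negative CDU of the temporal arteries lowers the probability of GCA but does not exclude it.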

Fig. 3

sROC curve of the halo sign in temporal arteries for GCA diagnosis

The median prevalence of GCA was 43% among suspected cases included in the studies analyzed in this systematic review. Applying the pooled sensitivity (0.76) and specificity (0.93) to a hypothetical cohort of 1000 individuals with suspected GCA, we can estimate that 857 individuals will have a correct CDU result (true positive or true negative) and will be managed appropriately, whereas 103 individuals will have a false negative CDU result. The latter group will be misdiagnosed by CDU; these GCA patients would not receive treatment for GCA and may develop disease complications such as irreversible blindness. In addition, 40 patients may have a false positive CDU result and may be exposed to invasive diagnostic tests such as TAB or be treated unnecessarily with long-term glucocorticoid therapy, with its potential adverse events.
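The arithmetic behind the hypothetical cohort above can be reproduced with a short Python sketch, assuming the pooled sensitivity of 0.76, the pooled specificity of 0.93, and the median prevalence of 43% reported in this review. The function name is hypothetical and the counts are rounded to whole individuals.

```python
def expected_counts(cohort_size, prevalence, sensitivity, specificity):
    """Expected 2x2 counts for a cohort, rounded to whole individuals."""
    with_gca = cohort_size * prevalence
    without_gca = cohort_size - with_gca
    true_pos = with_gca * sensitivity
    true_neg = without_gca * specificity
    return {
        "TP": round(true_pos),
        "FN": round(with_gca - true_pos),
        "TN": round(true_neg),
        "FP": round(without_gca - true_neg),
    }


counts = expected_counts(1000, prevalence=0.43, sensitivity=0.76, specificity=0.93)
correct = counts["TP"] + counts["TN"]

print(counts)   # {'TP': 327, 'FN': 103, 'TN': 530, 'FP': 40}
print(correct)  # 857
```

The 857 correct results therefore comprise 327 true positives and 530 true negatives, while the 103 false negatives and 40 false positives account for the remaining 143 individuals.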

The halo sign and flow abnormalities in temporal arteries for GCA diagnosis

In four studies [29, 31, 37, 41] including 662 participants, the halo sign in temporal arteries with flow abnormalities (i.e., stenosis and/or occlusion) resulted in a pooled sensitivity of 0.71 (95% CI 0.56–0.82; I2 = 84.9%) and a pooled specificity of 0.89 (95% CI 0.82–0.94; I2 = 75.5%). However, these studies showed statistically significant heterogeneity and very low quality of evidence. The LR+ was 6.40 (95% CI 3.12–11.0) and the LR− was 0.34 (95% CI 0.20–0.52). The AUC calculated by the sROC curve was 0.92 and the diagnostic OR was 21.20 (95% CI 6.16–51.1).

The compression sign in temporal arteries in GCA diagnosis

In five studies [35, 38, 43, 45, 48] including 1037 participants, the halo sign in temporal arteries was evaluated with the compression sign. This analysis yielded a pooled sensitivity of 0.84 (95% CI 0.72–0.92; I2 = 87.7%) and a pooled specificity of 0.95 (95% CI 0.88–0.98; I2 = 86.3%), and the quality of evidence was moderate. The AUC in the sROC curve was 0.97 and the diagnostic OR was 286.6 (95% CI 42.6–2014.2). A forest plot with the sensitivity and specificity of the compression sign in temporal arteries for GCA diagnosis is depicted in Fig. 4B.

Fig. 4

Forest plots describing the performance of arterial abnormalities in GCA diagnosis. A The halo sign in temporal arteries; B the halo sign and the compression sign in temporal arteries; C the halo sign in the temporal and axillary arteries

The halo sign in temporal and axillary arteries

We analyzed four studies [27, 41, 47, 48] evaluating the halo sign in temporal and axillary arteries in 603 participants. These studies had a moderate quality of evidence. The highest sensitivity for GCA diagnosis (i.e., 0.98) was reported by Diamantopoulos et al. in 2014 [27], and the highest specificity (i.e., 0.99) was reported by Skoog et al. [47]. When both temporal and axillary arteries were scanned, the pooled sensitivity of the halo sign for GCA diagnosis was 0.86 (95% CI 0.78–0.91; I2 = 69.6%) and the pooled specificity was 0.95 (95% CI 0.89–0.98; I2 = 65.7%). The AUC was 0.94 and the diagnostic OR was 56.8 (95% CI 23.3–137.0). A forest plot depicting studies that assessed the sensitivity and specificity of the temporal and axillary artery halo sign for GCA diagnosis is presented in Fig. 4C.

Meta-regression analysis to assess heterogeneity in studies

A multivariate meta-regression model was built to analyze potential sources of heterogeneity in studies assessing the halo sign in temporal arteries for GCA diagnosis. The model included the study design (i.e., retrospective vs. prospective study), the quality of the evidence (i.e., low risk of bias vs. moderate to high risk of bias), and the use of probes ≥ 15 MHz vs. lower than 15 MHz. Meta-regression was also performed to verify the influence of the year of publication and the prevalence of GCA in the studies. In the sensitivity analysis, studies with a low risk of bias had a significantly higher sensitivity and specificity of the halo sign for GCA diagnosis compared to studies with a moderate to high risk of bias. No other covariate was identified in the meta-regression analysis as a potential source of heterogeneity among studies (Table 2).

Table 2 Bivariate meta-regression analysis to analyze the influence of source of heterogeneity on the diagnosis of GCA

Discussion

This systematic literature review analyzed the performance of CDU findings in temporal arteries for GCA diagnosis, including the halo sign, the halo sign associated with flow abnormalities such as stenosis or occlusions, and the compression sign. Moreover, the detection of the halo sign by CDU when both temporal and axillary arteries were scanned was also analyzed. Although most analyses demonstrated substantial heterogeneity among studies, the detection of the halo sign in temporal arteries alone or in combination with axillary arteries, and the compression sign, had a good diagnostic performance for GCA and a moderate quality of evidence. The improvement in CDU devices seemed to play a role in improving the detection of the halo sign, since the sub-analysis of studies using probes ≥ 15 MHz showed a better diagnostic performance for GCA than the analysis of all studies using probes with a frequency ≥ 10 MHz. Despite the good diagnostic performance for GCA observed in studies analyzing the combination of the halo sign and flow abnormalities in temporal arteries, the quality of the evidence was rather low, and studies had very high heterogeneity regarding sensitivity and specificity.

To date, other systematic reviews have analyzed the performance of the halo sign by CDU for GCA diagnosis, including the studies published by Karassa et al. [11], Arida et al. [12], Ball et al. [13], Duftner et al. [14], Rinagel et al. [15] and Sebastian et al. [16]. In comparison to the above-mentioned systematic reviews, our study covers a more recent research period and addresses the research question more broadly, including an investigation of heterogeneity across studies and meta-regression. We also report the sensitivity and specificity of the halo sign with the compression sign and of the halo sign in the temporal arteries with extension to the axillary arteries. These variables were not described in the systematic reviews performed by Duftner et al., Rinagel et al., and Sebastian et al. [14,15,16]. Notably, the meta-analysis results of this systematic review showed slightly better sensitivity and specificity compared to the most recent systematic reviews [15, 16].

An optimal image resolution of temporal and axillary arteries by the CDU device is essential to obtain a reliable test result when investigating GCA. Technological development has improved the resolution of surface plane imaging through better CDU devices and higher-frequency ultrasound probes. A clue to the impact of these improvements was provided by Sebastian et al. in 2021, when they compared studies performed before and after 2010: the sensitivity of the halo sign increased from 63 to 71%, while specificity remained high [16]. In this systematic review, we confirm the impact of technology in improving the sensitivity of the halo sign for GCA diagnosis. In the sensitivity analysis, the pooled sensitivity of the halo sign for GCA diagnosis increased in studies using probes with a frequency ≥ 15 MHz compared to the main analysis that included all studies with probes ≥ 10 MHz, while the pooled specificity remained at 0.93. The evidence of recent improvements in imaging techniques supports the latest EULAR recommendation to use B-mode probes with a frequency ≥ 15 MHz to scan temporal arteries as technical and operational parameters for CDU in the investigation of GCA [9].

In clinical practice, the confirmation of GCA diagnosis is of paramount importance, and misdiagnosis might result in a significant burden for the individual patient: a patient with a false negative CDU result would not be treated, whereas a patient with a false positive CDU result in temporal arteries would be unnecessarily treated with long-term high-dose glucocorticoids [2, 10]. In this study, we demonstrate that even with the relatively high sensitivity and specificity of the halo sign in temporal arteries for GCA diagnosis, a significant number of patients are at risk of misdiagnosis by this diagnostic test. Conversely, we demonstrate that this risk of misdiagnosis seems to be reduced by using the compression sign and by assessing both temporal and axillary arteries, as the analyses of these variables raise sensitivity to values above 80% while specificity remains high.

Another substantial contribution of this systematic review is the classification of evidence using the GRADE methodology. This structure allowed us to assess the quality of data regarding the body of evidence. The certainty of the evidence was moderate for the summary sensitivity estimates in the primary analysis, the compression sign analysis, and the evaluation of the temporal and axillary arteries, whereas for flow abnormalities the certainty of the evidence was low. The quality of the evidence was compromised by the presence of unexplained heterogeneity in the sensitivity and specificity results of the studies included in the primary analysis of this systematic review (i.e., halo sign in temporal arteries). The pooled sensitivity and specificity analyses of the halo sign with flow abnormalities, and of the halo sign when axillary arteries were analyzed with temporal arteries, also showed high heterogeneity. Studies differed in several characteristics that may affect test performance, such as patient inclusion criteria and the frequencies of the B-mode ultrasound probes used to detect the halo sign. Nonetheless, we were unable to identify the reason for the heterogeneity, despite the investigation of several potential factors such as study design, prevalence, year of publication, quality of studies, and high-resolution probes.

This systematic review and meta-analysis has strengths and limitations. A strength is that we employed a comprehensive literature search without search filters or date or language restrictions, which resulted in the retrieval of full-text articles from 22 studies. We are confident that the search strategy detected most eligible studies, with a low probability of undetected relevant studies. The main limitations of this study are the high heterogeneity observed in the analyses and the potential impact of publication bias on these results. Unfortunately, the influence of publication bias on the results could not be inferred, since the determinants of publication bias in studies of the accuracy of diagnostic tests are not well known [49]. In addition, not all information was available in published reports, especially in older studies. Although we tried to contact the corresponding authors of some publications to retrieve additional information, they were not available to provide it.

Conclusion

In conclusion, this systematic review and meta-analysis showed that the halo sign detected by CDU of the temporal arteries has a good diagnostic performance for GCA. The accuracy for the diagnosis of GCA is improved by the non-compressible halo sign in temporal arteries (i.e., the compression sign) and by extending the CDU examination to the axillary arteries. Adding blood flow abnormalities in temporal arteries to the halo sign does not improve sensitivity or specificity for GCA diagnosis over the halo sign alone. In addition, a substantial improvement in diagnostic performance for GCA is achieved when using B-mode probes ≥ 15 MHz.