Introduction

Over the past three decades, rapid technological advances have led to increased access to and application of magnetic resonance imaging (MRI) and computed tomography (CT) in clinical practice and research. As a consequence of improved image resolution and a rapid rise in demand, the detection of incidental findings has increased in both clinical and research contexts [9, 56]. Intracranial incidental findings are asymptomatic abnormalities diagnosed unintentionally, such as brain neoplasms, aneurysms and vascular malformations [72]. Their clinical significance ranges from normal anatomical variants to pathologies that may require urgent medical or surgical interventions [49].

The prevalence of incidental findings was reported to be 18% in the first large-scale study in 1999, performed on 1000 asymptomatic volunteers (age, 3–83 years) [41]. Previous meta-analyses have reported the prevalence of incidental findings on high-resolution MRI to be 2.7% in adults and 16.4% in children [16]. Although there are some guidelines in place for managing these incidentalomas, clinicians have expressed ambivalence about the ideal management [20, 30]. When participants in the study are healthy volunteers, incidental findings can pose various practical and ethical concerns [31, 34]. The detection of these findings is potentially detrimental, as treatments are often not benign, with potentially harmful consequences [55].

A systematic review was designed to investigate estimates of the prevalence of incidental findings on brain MRI, with or without intravenous contrast, performed for clinical, commercial or research purposes in the general population. We also explored the demographic characteristics, imaging parameters and their influence on the findings. Relatively few studies have explored incidental findings on MRI in a diverse demographic including children and adults. In this review, we explore the nature, incidence and implications of intracranial incidental findings across various imaging modalities to inform patient counselling and further investigation.

Methods

A systematic review and meta-analysis was performed according to the PRISMA guidelines [59] to determine the rate of incidental findings on brain MRI scans.

Inclusion criteria

Studies reporting the prevalence of incidental findings on MR imaging were eligible for inclusion. Studies with significant confounding populations were excluded, such as patients with neurosurgical referral, evidence of focal neurologic deficit or neuropsychiatric disorder. Given the well-documented associations between many incidental findings and age/gender, we excluded studies not reporting the proportions of males and females scanned and their mean/median age. Studies with scanning indications unlikely to be confounding were included, such as patients referred for assessment of headache or head trauma. In these studies, only definite incidental findings were included in our analysis. Studies which involved only healthy volunteers were examined separately as a sensitivity analysis, to test the effect of including studies comprising patients with a clinical indication for brain MRI. Healthy volunteers were defined as individuals with no overt neurological complaint who were scanned as part of research or a screening process.

Search strategy and selection process

PubMed, Ovid MEDLINE, EMBASE and SCOPUS were searched from inception until May 24, 2021, using the strategies in Supplementary Methods I. Citations were deduplicated using fuzzy logic matching in revtools [78] for R v4.0.2. Abstracts were then independently screened by three authors (DES, EDWT, MAB) using Rayyan QCRI [58], with conflicts resolved by discussion with a fourth author (MA or JH). Data were extracted by three authors (DES, EDWT, MAB).
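The deduplication step can be illustrated with a minimal sketch of fuzzy title matching. It uses Python's standard `difflib` rather than revtools, and the function names, the example records and the 0.9 similarity threshold are all hypothetical choices for illustration, not the settings used in this review.

```python
from difflib import SequenceMatcher


def normalise(title):
    """Lower-case a title and collapse punctuation/whitespace for comparison."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(cleaned.split())


def deduplicate(titles, threshold=0.9):
    """Keep the first record of each group of near-identical titles.

    `threshold` is a hypothetical similarity cut-off, not the value used
    in the review.
    """
    kept = []
    for title in titles:
        candidate = normalise(title)
        is_duplicate = any(
            SequenceMatcher(None, candidate, normalise(other)).ratio() >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(title)
    return kept


# Hypothetical citation titles, for illustration only.
records = [
    "Incidental findings on brain MRI in the general population",
    "Incidental Findings on Brain MRI in the General Population.",
    "Prevalence of intracranial aneurysms: a systematic review",
]
unique_records = deduplicate(records)
```

Pairwise comparison is quadratic in the number of citations, which is acceptable for a sketch but would likely need blocking (for example by publication year) at the scale of several thousand records.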

Data abstraction

We sought data on all neoplastic, vascular or other findings identified in the included studies. Lesions identified as meningiomas by the study authors or lesions with a description consistent with a meningioma such as “calcified dural-based lesion” were considered meningiomas. Pituitary lesions include any lesion considered radiographically consistent with an adenoma, which includes micro- and macroadenomas as identified by the study authors. Undifferentiated sellar or suprasellar lesions were recorded separately. Meningiomas, gliomas and pituitary adenomas were enumerated separately given their commonality. The total number of neoplasms includes the above and any other neoplastic findings identified by the study authors, reported in the “any neoplastic” category. Neoplasms not fitting the above categories were enumerated in the “other neoplastic” category.

Cerebral aneurysms were those identified as such by the study authors and included “probable” aneurysms and thrombosed aneurysms where these were reported. We additionally enumerated cavernomas separately. The vascular category includes all vascular malformations, including aneurysms and cavernomas, reported as “any vascular”. Vascular findings not fitting the above were reported as “other vascular”. These included arteriovenous malformations (AVMs) and dural arteriovenous fistulas (dAVFs). Developmental venous anomalies/venous angiomas were excluded. Stenosis of a major vessel was included. In addition to vascular and neoplastic findings, we quantified the number of pineal cysts, arachnoid cysts, and Chiari malformations identified in the included studies.

Morris et al. [55] additionally found that white matter hyperintensities were the most common finding. We did not assess these because the threshold for their reporting appeared to vary and was often poorly described. White matter hyperintensities exist on a spectrum from a clinically insignificant finding to pathologic white matter disease [24], but reporting thresholds are poorly standardised. This issue is exemplified in the findings of Wang et al. [76], who reported the prevalence of hyperintensities stratified into grades in 579 patients. At the lowest threshold, white matter hyperintensities were observed in 566/579 patients (97.8%) [76]. Thus, we did not pool these findings.

Risk of bias assessment

We assessed risk of bias within the included studies by adapting the tool proposed by Hoy et al. [32] for prevalence studies. We assessed the risk of bias in 4 domains relating to external validity and 4 domains relating to internal validity, as shown in Table 1. We assessed between-study bias using funnel plots. Conventional funnel plots using the standard error as a measure of precision may produce false-positive identification of publication bias, and thus, we generated funnel plots of log odds against sample size as suggested by Hunter et al. [33].
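As an illustration of the funnel plot construction described above, the following sketch computes the plotting coordinates (log odds on one axis, sample size on the other). The 0.5 continuity correction and the example counts are assumptions of this sketch, and the function names are hypothetical.

```python
import math


def log_odds(events, total, correction=0.5):
    """Log odds of a finding, with a continuity correction so that
    studies with zero events still give a finite value (an assumption
    of this sketch)."""
    return math.log((events + correction) / (total - events + correction))


def funnel_coordinates(studies):
    """Return (log odds, sample size) pairs, one per study."""
    return [(log_odds(x, n), n) for x, n in studies]


# Hypothetical (events, scans) pairs, not data from the included studies.
example_studies = [(3, 500), (12, 1000), (40, 5800)]
points = funnel_coordinates(example_studies)
```

Plotting these pairs and inspecting them for asymmetry around the pooled log odds would give the kind of plot described; the plotting call itself is omitted to keep the sketch free of external dependencies.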

Table 1 Risk of bias classification by which studies were assessed, adopted directly from Hoy et al. [32]. Overall judgements are shown in Table 2
Table 2 Characteristics of the included studies. Domain-level risk of bias findings are shown in Fig. 2

Statistical analysis

All statistical analysis was performed using R v4.0.2 [61]. As the primary outcome, we report the age-stratified prevalence (findings per 1000 scans) of the various categories of findings described above. We report crude prevalence as the secondary outcome but consider these estimates of limited value because of the extreme variation in prevalence across ages. Crude estimates were derived from random effects meta-analyses of proportions with inverse variance weighting. We did not report the total prevalence of all incidental findings as an aggregate, because studies frequently described the total without fully describing the findings that contribute to it, with varying thresholds for inclusion and consequently incomparable data. We fitted non-linear mixed effects meta-regression models with restricted cubic splines, using the restricted maximum likelihood (REML) estimator in metafor [73]. Age-stratified estimates were then derived as fitted estimates for each age point. Fitted estimates were derived for ages 1, 2, 5 and 10 years and each decade thereafter. The continuous relationship between age and effect size was also depicted by graphing age versus the fitted spline estimates.
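To make the spline term concrete, the sketch below builds a restricted cubic spline basis for age using the standard truncated-power construction, which is constrained to be linear beyond the outer knots. The knot locations shown are hypothetical (the knots used in this review are not stated here), and the function name is illustrative rather than part of metafor.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value `x`.

    Returns [x, s_1, ..., s_{k-2}] for k knots, following the usual
    truncated-power construction that is linear beyond the outer knots.
    """
    k = len(knots)
    if k < 3:
        raise ValueError("at least three knots are required")
    t_last, t_prev = knots[-1], knots[-2]
    scale = (t_last - knots[0]) ** 2  # keeps terms on the scale of x

    def cube_plus(value):
        return max(value, 0.0) ** 3

    basis = [float(x)]
    for t_j in knots[: k - 2]:
        term = (
            cube_plus(x - t_j)
            - cube_plus(x - t_prev) * (t_last - t_j) / (t_last - t_prev)
            + cube_plus(x - t_last) * (t_prev - t_j) / (t_last - t_prev)
        )
        basis.append(term / scale)
    return basis


# Hypothetical knot ages; the review does not state its knot locations.
example_knots = [10, 40, 70]
# Age points at which fitted estimates are reported in the text.
ages = (1, 2, 5, 10, 20, 30, 40, 50, 60, 70, 80)
design_rows = [rcs_basis(age, example_knots) for age in ages]
```

Each row of `design_rows` would serve as the moderator values for one age point; multiplying a row by the fitted coefficients (plus the intercept) gives the fitted estimate at that age.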

We also fitted linear models and reported the regression coefficient (β) and its 95% confidence interval (95%CI). The summary measure was the back-transformed Freeman-Tukey double arcsine-transformed proportion [50], which was chosen to stabilise variance given that findings are rare. Heterogeneity was quantified by τ² and its impact by I²; both were derived from the random effects unadjusted crude estimates.
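The crude pooling can be sketched as follows: each study's count is transformed with the Freeman-Tukey double arcsine, pooled with inverse variance weights under a random effects model, and summarised with τ² and I². This sketch assumes the DerSimonian-Laird estimator of τ² and a simplified back-transformation for brevity; neither is claimed to match the settings used in metafor for this review, and the counts are hypothetical.

```python
import math


def freeman_tukey(events, total):
    """Freeman-Tukey double arcsine transform and its sampling variance."""
    y = 0.5 * (
        math.asin(math.sqrt(events / (total + 1)))
        + math.asin(math.sqrt((events + 1) / (total + 1)))
    )
    return y, 1.0 / (4 * total + 2)


def pool_random_effects(studies):
    """Pool (events, total) pairs with inverse variance weights.

    Uses the DerSimonian-Laird estimator of tau^2 for brevity; this is
    an assumption of the sketch, not necessarily the review's estimator.
    """
    transformed = [freeman_tukey(x, n) for x, n in studies]
    y = [t[0] for t in transformed]
    v = [t[1] for t in transformed]
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_star = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    # Simple back-transformation; ignores the sample-size correction
    # used by exact inversions, so it is only approximate for small n.
    proportion = math.sin(pooled) ** 2
    return {"proportion": proportion, "tau2": tau2, "i2": i2}


# Hypothetical (findings, scans) counts, not data from the review.
result = pool_random_effects([(5, 1000), (40, 1000), (12, 2000)])
per_1000 = 1000 * result["proportion"]
```

With widely differing study proportions, as in the hypothetical input, I² is high even though τ² is numerically small on the transformed scale, which mirrors the pattern of small τ² and large I² reported for rare findings.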

Additional analyses

To assess the impact of including studies in patients with potentially neurological complaints or indications for scanning, a sensitivity analysis was performed wherein estimates were derived as described above from studies recruiting exclusively healthy volunteers/controls. As further sensitivity analyses, we also repeated the analysis using conventional weighted least squares linear regression models, both univariable models and multivariable models including both age and gender. For multivariable models, we computed fitted estimates under the assumption of an equal number of male and female participants. We also examined the effect of publication year and the use of intravenous contrast using meta-regression models, which were additionally adjusted for age. Proportions discussed in relation to these models relate to the median age in the analysis.
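The univariable weighted least squares sensitivity analysis can be sketched in closed form. Using study sample size as the weight is an assumption of this sketch (the weighting scheme is not specified above), and the study-level values are hypothetical.

```python
def weighted_least_squares(x, y, weights):
    """Fit y = intercept + slope * x by weighted least squares."""
    total_w = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total_w
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_w
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical study-level data: mean age, proportion with a finding,
# and sample size used as the weight (an assumption of this sketch).
mean_age = [8, 35, 52, 67, 75]
proportion = [0.001, 0.004, 0.009, 0.015, 0.021]
sample_size = [400, 1200, 900, 5800, 2500]

intercept, slope = weighted_least_squares(mean_age, proportion, sample_size)
fitted_at_60 = intercept + slope * 60
```

A multivariable version adding the proportion of male participants would require solving the weighted normal equations for two predictors; fitted estimates would then be evaluated at a male proportion of 0.5, in line with the equal-sex assumption described above.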

Assessment of certainty

We assessed our certainty in the included findings using the GRADE framework, which downgrades certainty qualitatively based on a number of factors including the statistical effect size, its precision, the presence of heterogeneity and within-study bias.

Results

A total of 6356 citations were identified, of which 106 full texts were assessed and 35 reports of 34 studies were included (Fig. 1), comprising 40,777 participants. Some studies which may appear to meet the inclusion criteria were excluded because they were published in a non-English language [5, 75] or did not report sufficient demographic detail [27, 39, 45, 46, 67, 69, 83]. Characteristics of the included studies are provided in Table 2.

Fig. 1 PRISMA flowchart detailing article screening and selection

Characteristics of the included studies

Risk of bias in each domain is summarised in Fig. 2. Studies were typically at high risk of bias in domains relating to external validity, as samples were often convenience samples or based upon self-referral. Bias relating to internal validity was generally low, because scans were by nature directly sampled from patients and MRI is sensitive and specific for the detection of intracranial abnormalities. Findings for each study are shown in Supplementary Figure I.

Fig. 2 Risk of bias in each domain in the included studies

Substantial heterogeneity is apparent in our analysis, as shown by the vertical dispersion of effect sizes in Figs. 3, 4, 5 and 6 and the forest plots in Supplementary Figure I. This may be a result of rare findings examined in relatively small sample sizes but may also be a result of differences in methodology and enrolled populations. Magnet strength and sequence differed between studies. The most frequently used strengths were 1.0 T [57, 70, 77], 1.5 T [9, 10, 14, 26, 31, 44, 51, 64, 65, 68, 71, 72, 81], and 3.0 T [3, 12, 25, 28, 29, 40, 42, 48, 49, 76], while some studies used more than one strength (1.5 T and 3 T) [15, 41, 43, 52, 62]. Imaging scans were read by senior radiologists and/or neuroradiologists and/or neurologists in a large number of the included studies [3, 12, 14, 15, 28, 40, 41, 42, 43, 48, 49, 52, 60, 68, 71, 72, 76, 77, 81]. In one study, 2 neuropsychiatrists co-assessed the MRI scans with a neuroradiologist. Only one study [9] had no radiologist, neuroradiologist, or neurologist in the team reading the scans; their team included researchers with a doctor of medicine training in neuropsychology. T1-weighted imaging (T1WI) [25, 28], T2-weighted imaging (T2WI) and fluid-attenuated inversion recovery (FLAIR) [57] were the most commonly used MRI sequences, either alone or combined [3, 9, 10, 12, 14, 15, 26, 31, 40, 41, 42, 44, 48, 49, 52, 62, 65, 68, 71, 72, 76, 77, 81]. Other sequences used in a smaller number of studies were gradient recalled echo T2WI (GRE T2WI) [12, 14, 48, 72, 77], proton density weighted imaging (PDWI) [48], perfusion weighted imaging (PWI) [48], diffusion tensor imaging (DTI) [48], time-of-flight (TOF) angiography [48] and T1/T2 spin echo (SE) [43]. Of the 35 included studies, 11 used a contrast agent [3, 14, 15, 26, 57, 62, 65, 68, 71, 77, 81], which improves the sensitivity of imaging [13].

Fig. 3 Crude estimates of the number of findings per 1000 scans in each category. (+), number of positive scans

Fig. 4 Relationship between proportion of each vascular finding and age, derived from restricted cubic spline meta-regression models. Red dots show the findings of individual studies, with the size of the point relative to study sample size. Black lines are fitted estimates, while the shaded area is the 95% confidence interval of the fitted estimate

Fig. 5 Relationship between proportion of each neoplastic finding and age, derived from restricted cubic spline meta-regression models. Purple dots show the findings of individual studies, with the size of the point relative to study sample size. Black lines are fitted estimates, while the shaded area is the 95% confidence interval of the fitted estimate

Fig. 6 Relationship between proportion of each other finding and age, derived from restricted cubic spline meta-regression models. Green dots show the findings of individual studies, with the size of the point relative to study sample size. Black lines are fitted estimates, while the shaded area is the 95% confidence interval of the fitted estimate

There was also substantial variation in patient age, with studies focusing on children [15, 25, 40, 43, 49, 52, 60, 81], adults [3, 14, 28, 29, 31, 41, 62, 68, 77] or the elderly population [9, 10, 22, 26, 42, 44, 48, 57, 64, 65, 71, 72, 76]. This poses substantial analytic challenges. While meta-regression can approximate the relationship between age and prevalence, it is important to note that this approximation may be less valid in the context of rare findings. Ideally, future studies should consider pooling individual patient data to better characterise the relationship between findings and age. Most studies focused on healthy elderly participants [9, 10, 26, 42, 44, 48, 65, 72, 76], healthy adults [14, 29, 31, 41, 57, 62, 64, 68, 77] and healthy children [25, 40, 49, 60]. Three papers were on children who presented with headaches [43, 52, 81], while one was on adults with headaches [3]. Children with traumatic brain injury (TBI) [15], lead-exposed patients [2], patients with diabetes [71] and patients with early Alzheimer’s disease [12] were also the focus of some studies. While their inclusion may be considered confounding, excluding these studies in a sensitivity analysis did not substantially influence our results (Supplementary Figure II; Supplementary Table III). An element of selection bias is also possible in that populations were often self-selected volunteers or commercial screening populations, which may be reflective of socio-economic status or other social determinants of health.

Findings

Crude estimates for each category are shown in Fig. 3. Age-stratified estimates of the prevalence of findings per 1000 scans are shown in Table 3. Findings comprising the “other” category are shown in Table 4.

Table 3 Age-stratified findings per 1000 scans. Numbers in parentheses represent the 95% confidence interval. Findings derived from univariable restricted cubic spline meta-regression models
Table 4 Findings comprising the “other” category in each analysis

Vascular findings

Cavernomas were the most common vascular finding, observed in 111/34,777 participants (3.9/1000 scans, 95%CI 2.3–5.8), with a range of 0.00001/1000 scans (95%CI 0–6) in 1-year-olds to 7/1000 scans (95%CI 0.3–20) in 80-year-olds (Fig. 4). There appeared to be a linear relationship between the proportion of vascular findings observed and increasing patient age (β = 0.002, p < 0.0001; Supplementary Table I). Substantial heterogeneity was observed in effect sizes for vascular findings (τ² = 0.004, I² = 94%) (Table 3), likely due to the substantial range of demographics of the included studies.

Neoplastic findings

Meningiomas were the most common neoplastic finding, observed in 246/38,076 participants (5.3/1000 scans, 95%CI 2.3–9.5), ranging from 0/1000 scans (95%CI 0–2) in 1-year-olds to 17/1000 scans (95%CI 4–37) in 80-year-olds (Fig. 5). There was a linear association with age (β = 0.002, p < 0.0001; Supplementary Table I). The rate of findings for aggregated neoplasms was 456/39,040 (11.9/1000 scans, 95%CI 7.5–17.2), ranging from 0.2/1000 scans (95%CI 0–10) in 1-year-olds to 34/1000 scans (95%CI 12–66) in 80-year-olds. Substantial heterogeneity was observed (τ² = 0.003, I² = 94%) (Table 3). The proportions of Chiari malformations, pineal cysts and arachnoid cysts in relation to age are shown in Fig. 6.

Additional analyses

Results of multivariable regressions are shown in Supplementary Table II. A sensitivity analysis limited to healthy volunteers did not appear to alter the relationships between age and findings (Supplementary Figure II) and did not substantially alter our age-stratified estimates (Supplementary Table III). Using linear models appeared to alter the apparent relationship between age and effect size in some analyses, particularly for those appearing to have bimodal relationships in non-linear models (Supplementary Figure III). However, this did not substantially alter our pooled estimates (Supplementary Table IV). Adjusting linear models for gender did not appear to have a substantial effect on the observed relationships (Supplementary Figure IV) or pooled estimates (Supplementary Table V). Funnel plots did not show evidence of publication bias (Supplementary Figure V).

We found that the prevalence of neoplastic findings appeared to increase substantially in newer versus older studies, after adjustment for age (β = 0.004, p < 0.001) but not vascular findings (β = 0.00004, p = 0.98) (Fig. 7). Regression coefficients for all analyses are shown in Supplementary Table VI. The use of intravenous contrast did not appear to influence the age-adjusted proportion of neoplastic (13.2/1000 scans vs. 12.7/1000 scans, p = 0.92) or vascular findings (8.3/1000 scans vs. 8.8/1000 scans, p = 0.91) (Supplementary Table VII).

Fig. 7 Relationship between proportion of findings and publication year, derived from multivariable meta-regression models additionally adjusted for age. Proportions on the y-axis relate to median age in the given analysis. Points show the findings of individual studies, with the size of the point relative to study sample size. Black lines are the fitted estimates, while the shaded area is the 95% confidence interval of the fitted estimate

Summary of findings

GRADE judgements of certainty are provided in Table 3 for each analysis. We had predominantly low certainty, with estimates downgraded primarily for imprecision, indirectness and within-study bias.

Discussion

This meta-analysis identified a substantial prevalence of various incidental findings on brain MRI, which were most commonly meningiomas. Our analysis included predominantly healthy populations and thus may reasonably approximate the prevalence of these findings in the healthy population. In keeping with a previous analysis by Morris et al. [55], we identified a point prevalence of neoplastic findings of approximately 10 per 1000 scans, with a significant increase with older age. In their study [55], aneurysms were the most common vascular finding. We found a similar prevalence of aneurysms but a higher prevalence of cavernomas, which were the most common vascular finding in our study. Our analysis includes an additional 18 studies (21,218 patients) with a high prevalence of cavernomas in some large studies [9, 12, 42, 68, 76], which may account for this discrepancy. The underlying reason for the higher prevalence in these studies is unclear but is likely to be age-related, given that we observed substantial correlation with age. This underscores the importance of interpreting crude prevalence with caution in the presence of a strong covariate. Like Morris et al. [55], we also identified a substantial correlation with age for most findings. However, we observed a bimodal relationship with age for arachnoid cysts and a decreasing incidence of Chiari malformations with age. Gliomas were insufficiently common to reliably determine any relationship with age, but an increasing incidence with age is well described in the population [80].

The clinical approach to incidental intracranial findings is uncertain. In particular, the management of common findings such as meningiomas, pituitary adenomas, cavernomas and aneurysms has been the subject of much debate. In the case of meningiomas, treatment for symptomatic presentations is clear in that maximal safe resection is typically the first-line option where feasible [23]. Conversely, the majority of asymptomatic, incidental meningiomas will not require surgical intervention, and the natural history of these lesions is uncertain [36, 37]. Studies have attempted to develop prognostic schema [36], but there are currently no well-validated tools. The management of incidental intracranial aneurysms is also subject to debate [4, 7, 17, 54, 79]. The incidence of aneurysms [74] appears to be significantly larger than that of aneurysmal subarachnoid haemorrhage (aSAH) [18]. Approximately two thirds of aneurysms in the population are < 5 mm in size [74] which, in most cases, appear to have a low risk of rupture [4, 7, 17]. Thus, even in the endovascular era, the risks of treatment may often outweigh the risk of rupture [79].

This then poses an important ethical question as to whether patients should be informed of incidental findings of questionable significance, especially when those findings are unlikely to require treatment [31]. Knowledge of these findings may cause substantial anxiety [21, 38], and thus this question is of particular importance in the context of imaging performed for research purposes in otherwise healthy individuals [31]. In addition, an increase in imaging utilisation [1] and fidelity may lead to a larger volume of incidental findings. Another consideration is an unprecedented level of access to radiology reports by patients [47, 53]. Arguably, explicit consent should be obtained with careful consideration of the implications of incidental findings, and clear thresholds at which findings are considered reportable should be determined [31]. This is particularly important when imaging studies are read by non-radiologists for research purposes [31]. In these situations, review by specialists should ideally be obtained before findings are disclosed to patients and/or further action is taken [31]. Inter-rater reliability should also be accounted for, given that disagreement may occur as to what warrants further evaluation. This is exemplified by incidental pituitary lesions: one study identified pituitary abnormalities in over 40% of patients, but with significant inter-rater variability [27]. The majority of small (≤ 1 cm) non-functioning adenomas will never enlarge [19], and thus it is unclear how lesions on which evaluators disagree should be approached. As such, clear guidelines and algorithms should arguably be enacted to facilitate consistent decision-making. These may also be of benefit when considering medico-legal implications [6], as decisions to forego treatment may be scrutinised in the event of preventable complications of a known, but untreated, pathology.

Incidental findings also carry economic implications [21]. It is important to note that there is no evidence of a concrete, patient-centred benefit to their identification. In a study of 5800 patients, Bos et al. [9] identified 143 meningiomas. Of these, 91 (63.6%) were referred for further assessment and only 15 (10.4%) required intervention [9]. In total, 188 of 549 findings (32%) required specialist referral and only 44 (8%) required further follow-up or intervention, or one for every 131 scans performed [9]. Thus, it appears that approximately one third of incidental findings create additional workload in the form of specialist referral, but only a minority require further intervention or follow-up. This carries cost and anxiety for the patient, but may infrequently result in intervention, and thus the risk–benefit balance is uncertain. As an example, the utility of empiric screening for intracranial aneurysms has been highly debated given their commonality, the devastating outcomes of aSAH and the existence of an effective treatment [63]. However, even in a Japanese population known to be at particularly high risk of aSAH [18], Yoshimoto et al. found that empiric screening is not cost-effective [82].

For patients, there are also implications for underwriting of personal insurance and bank loans. The presence of an intracranial finding may increase premium rates or even exclude the applicant entirely. For example, the presence of multiple cerebral aneurysms, suggesting a genetic component, or untreated aneurysms with high risk features may preclude underwriting [66]. This may seriously affect patients, especially younger individuals yet to obtain life insurance or mortgages. These consequences are often overlooked by patients and clinicians, but it is recommended they are explicitly discussed with patients undergoing imaging for screening or research purposes [8].

Limitations

There are several limitations to this study. Firstly, we observed substantial variation in methodological components of included studies, such as magnet strengths and sequences. This was reflected in substantial heterogeneity in many analyses, which limits the certainty of our findings. Perhaps more importantly, the reporting threshold for incidental findings was not standardised in our analysis and was often undefined or unclear in the included studies. As imaging fidelity improves, the size threshold for the visual detection of lesions is becoming smaller, and thus, it is possible and even likely that this threshold varied across the studies, which may bias estimates. Given the very strong relationship between mean age in the study and the prevalence of findings identified, raw proportions for each finding may be less informative. The enrolled populations in included studies also varied substantially, which may influence baseline risk for the various findings assessed and bias our estimates. This analysis assesses predominantly outwardly healthy individuals and thus generalisability to hospital populations, in which most incidental findings are identified in practice, may be limited.

Conclusion

We identified a substantial prevalence of incidental findings on brain MRI in predominantly healthy volunteers. Meningiomas appear to be the most common of these, though their prevalence is highly age-dependent. The significance and optimal management of incidental findings are uncertain, and future studies should consider reporting their natural history and clinical course. Future reviews should consider obtaining individual patient data to better describe the relationship between age and prevalence.