Introduction

Microcalcifications in the breast can be challenging to diagnose [1, 2]. Until now, magnification mammography has been required in diagnostic mammography units [3]. Modern equipment with digital zooming is now also used in the diagnostic process [4]. This paper seeks to establish whether choosing one technique over the other makes any difference when detecting and diagnosing microcalcifications.

Magnification mammography, hereafter referred to as ‘magnification’, is commonly used as complementary imaging on suspicion of microcalcifications. Complementary imaging increases sensitivity and specificity [5], preventing unnecessary biopsies of benign lesions [5, 6].

Increased contrast-to-noise ratio (CNR) between microcalcifications and surrounding tissue, signal-to-noise ratio (SNR) and spatial resolution improve the visual perception of microcalcifications [7,8,9,10,11,12]. The following factors can significantly affect these quantities: absorption characteristics of the detector, detector pixel size and depth, focus size, monitor size, monitor pixel size and depth, properties of the X-ray spectrum, detector dose, properties of irradiated objects and removal of scattered radiation from the object. The use of post-processing algorithms also affects the image quality [13, 14].

The larger breast-to-detector distance used in magnification reduces the effective pixel size and, when combined with a smaller focus size, yields better spatial resolution than conventional full-field digital mammography (FFDM) [15, 16]. However, studies show that the average glandular dose (AGD) with magnification is about twice that of breast imaging without magnification [17,18,19]. Digital zoom of conventional FFDM, hereafter referred to as ‘zoom’, is a post-processing method that neither increases the AGD nor improves spatial resolution [4].
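The geometry described above can be illustrated with a minimal sketch. It assumes the simple relation that the effective pixel size equals the detector pixel size divided by the magnification factor, and the standard approximation that focal spot blur referred to the object plane is f(M − 1)/M. The function names and the example magnification factors are illustrative assumptions, not values taken from the cited studies.

```python
def effective_pixel_size_um(detector_pixel_um, magnification):
    """Detector pixel size projected back to the object plane (simple geometric model)."""
    if magnification < 1.0:
        raise ValueError("magnification must be >= 1")
    return detector_pixel_um / magnification


def focal_spot_blur_um(focal_spot_um, magnification):
    """Geometric unsharpness referred to the object plane: f * (M - 1) / M."""
    if magnification < 1.0:
        raise ValueError("magnification must be >= 1")
    return focal_spot_um * (magnification - 1.0) / magnification


# Hypothetical example: a 100 um detector pixel and a 0.1 mm (100 um) focal spot.
for m in (1.0, 1.5, 1.8):
    pixel = effective_pixel_size_um(100.0, m)
    blur = focal_spot_blur_um(100.0, m)
    print(f"M = {m:.1f}: effective pixel {pixel:.1f} um, focal spot blur {blur:.1f} um")
```

The sketch shows the trade-off behind the choice of a small focal spot for magnification: the effective pixel shrinks as M grows, while focal spot blur grows with M and is kept small only if f is small.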

When women are recalled due to suspicion of microcalcifications, the use of magnification leads to more image acquisitions with potentially painful compression and a longer examination time. This raises the question of whether zoom could replace magnification without leading to more undetected microcalcifications or reducing sensitivity and specificity. The added value would be fewer painful compressions and a reduction in AGD. It could also streamline workflow [20].

The aim of this study is to review the literature to:

  • Summarise and compare the ability to detect microcalcifications utilising magnification and zoom.

  • Summarise and compare the sensitivity and specificity of diagnosing microcalcifications utilising magnification and zoom in connection with recall due to suspicion of microcalcifications.

Method

This study follows the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [21]. The protocol is registered in PROSPERO [22], registration number CRD42017057193.

Literature search strategy

A computerised search was performed to identify original studies on detecting or diagnosing microcalcifications utilising magnification, zoom or both. The studies included were located by searching MEDLINE (Ovid), EMBASE (Ovid), CINAHL (EBSCO), Engineering Village: Compendex and Web of Science (last search date 10.09.2019). The literature search included controlled vocabulary terms and free-text terms in the following combination: (mammography OR microcalcification) AND (digital magnification OR geometric magnification).

There were no restrictions on language or publication dates. Reference lists of included articles were screened for additional references. Abstracts and posters from relevant conferences and grey literature databases were also screened. (The search strategy is described in detail in Electronic Supplementary Material.)

All references were exported to EndNote [23] for duplicate removal. Rayyan [24] was used for study selection.

Study selection

Inclusion criteria: (1) experimental studies with physical or Monte Carlo simulated phantoms of digital zoom or magnification mammograms or both, focusing on microcalcifications; (2) studies of mammograms from non-symptomatic women recalled after screening for diagnostic mammography where zoom, magnification or both were used or compared for diagnosis of microcalcifications. Exclusion criteria: (1) studies based on analogue film-screen, computed radiography (CR) mammograms, print-out/hard copies of digital mammograms, modalities other than FFDM and computer-aided detection/diagnostics; (2) studies based on imaging palpable tumours; (3) studies with patients with previous cancer disease or BRCA1/BRCA2; (4) studies with male patients, animals or specimens; (5) case reports, review articles, editorials, letters, consensus statements and studies focusing on cost.

Two reviewers (M.Ø. and B.S.) aligned their assessments on 20 randomly selected articles before they independently reviewed the titles and abstracts against the inclusion and exclusion criteria. Any disagreement over the eligibility of particular studies was resolved by consensus. The full text of the studies retained after screening was retrieved and independently assessed for eligibility. Any disagreement was resolved through discussion until the reviewers reached consensus.

Data extraction and quality assessment

The two reviewers extracted relevant data from the studies included. Standardised data forms were used: (a) study characteristics: authors, year of publication, study period, affiliation and study design; (b) clinical characteristics: number of readers and their level of experience, diagnostics scale and threshold, pre-test probability, case characteristics and reference standard, number of cases, patient age and numbers of true and false positives and negatives; (c) technical characteristics: type of detector technology, pixel size and depth, exposure factors, magnification and zoom factors, focal spot size, monitor size and depth, characteristics of phantoms and outcome measures for phantom studies.

The two reviewers independently assessed the methodological quality of included studies. For diagnostic studies, QUADAS 2 [25] was used. For phantom experiment studies, risk of bias and applicability were assessed using an adapted version of QUADAS 2, where the ‘patient selection’ domain was replaced with questions about controlling confounding variables and the reference standard domain was omitted. Disagreements between the reviewers were resolved by discussion and consensus.

Data synthesis and analysis

The outcomes of this systematic review were detection of microcalcifications, and sensitivity and specificity for diagnosing microcalcifications from images obtained with magnification techniques or using zoom. To assess the detection of microcalcifications, results from the phantom studies were used, while diagnostic performance was assessed from the results of the diagnostic test studies. Analysis of detection and diagnosis was performed separately.

Different measures of detectability were expected. The authors therefore decided to draw up a narrative explanation to summarise and compare findings regarding the detection of microcalcifications, rather than calculating pooled values.

To assess the sensitivity and specificity, the hierarchical model for meta-analysis of sensitivity and specificity with bivariate analysis was used [26,27,28]. Numbers of true positives, false negatives, false positives and true negatives from the diagnostic test studies were entered in the calculations. Sensitivities and specificities of the individual studies as well as the pooled sensitivity and specificity were calculated and presented in forest plots.
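As a minimal sketch of the per-study quantities that enter such an analysis, the following computes sensitivity and specificity from a 2 × 2 table. The Wilson score interval is an assumption chosen for illustration; it is not necessarily the interval used in the forest plots, and the counts in the example are hypothetical. The pooled estimates in this review come from the bivariate hierarchical model, which this sketch does not reproduce.

```python
import math


def sensitivity(tp, fn):
    """Proportion of diseased cases correctly called positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of non-diseased cases correctly called negative."""
    return tn / (tn + fp)


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (95% by default)."""
    p = successes / total
    denom = 1.0 + z ** 2 / total
    centre = (p + z ** 2 / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z ** 2 / (4.0 * total ** 2)) / denom
    return centre - half, centre + half


# Hypothetical counts for a single study (not taken from Table 4).
tp, fp, fn, tn = 90, 45, 10, 55
low, high = wilson_interval(tp, tp + fn)
print(f"sensitivity {sensitivity(tp, fn):.2f} (95% CI {low:.2f}-{high:.2f})")
low, high = wilson_interval(tn, tn + fp)
print(f"specificity {specificity(tn, fp):.2f} (95% CI {low:.2f}-{high:.2f})")
```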

Heterogeneity among studies included in the meta-analysis was assessed using both the Cochran Q test [29], where p < 0.05 indicates the presence of heterogeneity, and the inconsistency index (I²) [30]. The I² ranges overlap and are interpreted as follows: 0–40%, heterogeneity might not be important; 30–60%, moderate heterogeneity; 50–90%, substantial heterogeneity; and 75–100%, considerable heterogeneity [31].
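The relation between the two heterogeneity measures can be sketched as follows, assuming the usual definitions: Q is the inverse-variance weighted sum of squared deviations from the pooled estimate, and the inconsistency index is max(0, (Q − df)/Q) × 100 with df equal to the number of studies minus one. The function names are illustrative, and the scale on which the statistical software computes Q is not assumed here.

```python
def cochran_q(estimates, variances):
    """Cochran's Q: inverse-variance weighted squared deviations from the pooled mean."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))


def i_squared(q, df):
    """Inconsistency index in percent: max(0, (Q - df) / Q) * 100."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Interpretation bands as listed in the text; they overlap by design.
BANDS = [
    (0.0, 40.0, "might not be important"),
    (30.0, 60.0, "moderate"),
    (50.0, 90.0, "substantial"),
    (75.0, 100.0, "considerable"),
]


def interpret_i_squared(i2):
    """Return every band label whose range contains the given value."""
    return [label for low, high, label in BANDS if low <= i2 <= high]


# Illustrative input: Q = 4.89 from four studies (df = 3, an assumption).
value = i_squared(4.89, 3)
print(round(value, 2), interpret_i_squared(value))
```

Because the bands overlap, a single value can carry two labels, which is why the interpretation is best reported together with the Q test rather than on its own.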

There were too few studies to perform a test of publication bias using a funnel plot [32].

The module ‘midas’ [33] and the built-in function ‘xtmelogit’ in Stata 15.1 [34] were used for the statistical analysis. The use of ‘xtmelogit’ is based entirely on the tutorial of Takwoingi [35]. A p value < 0.05 was considered statistically significant.

Results

Study selection

A flowchart of the study selection was generated as a PRISMA [21] diagram, included here as Fig. 1. The initial search found 6630 articles. A search in grey literature yielded no additional articles. A total of 1827 articles were identified as duplicates, and the titles and abstracts of the remaining articles were screened against the inclusion and exclusion criteria. Screening excluded 4782 articles, leaving 21 to be read in full by the two reviewers.

Fig. 1 Flowchart (PRISMA diagram) of the study selection process

Following the full-text reading, a further 12 articles were excluded and 9 articles were finally selected for inclusion: five experimental phantom studies [36,37,38,39,40] and four retrospective diagnostic test studies [41,42,43,44].
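The counts reported in the study selection can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are those stated in the text.

```python
records_identified = 6630
duplicates_removed = 1827
excluded_on_title_abstract = 4782
excluded_on_full_text = 12

# Records left for title and abstract screening after duplicate removal.
screened = records_identified - duplicates_removed
# Articles read in full after screening.
full_text_assessed = screened - excluded_on_title_abstract
# Articles finally included in the review.
included = full_text_assessed - excluded_on_full_text

print(screened, full_text_assessed, included)
```

The result agrees with the five phantom studies and four diagnostic test studies listed as included.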

Characteristics of included studies

Study characteristics, number of readers, experience of readers and technical characteristics of experimental phantom studies are listed in Table 1. In the included studies, monitors were 5 megapixels, and focal spot size was 0.1 mm for magnification and 0.3 mm for zoom. Readers were allowed to adjust window width and window levels of the images in two studies [36, 38]. Otherwise, post-processing algorithms other than zoom were not mentioned in the studies.

Table 1 Experimental phantom studies: study characteristics, clinical characteristics and technical characteristics

Detection was studied under varying current-time products (mAs), tube potentials (kVp), anode/filter combinations, detector technologies and magnification/zoom factors, as indicated in Table 1. One study [36] used the ACR phantom [45], two studies [38, 40] the CDMAM phantom [46], one study [37] an aluminium square of 0.2-mm thickness embedded in polymethyl methacrylate (PMMA) and one study [39] a simulated phantom utilising the MASTOS model [47]. In the latter, the simulation was performed for a range of glandularities. The sizes of microcalcifications varied according to what was available in the phantoms. Outcome measures for detection also varied: RANK sum score for visibility [36], normalised performance index (PI) [37], image quality figure (IQF) [38], contrast-detail detection (CDD) [38, 40], contrast-to-noise ratio (CNR) [39] and correct observation ratio (COR) [40]. Comments on the main findings relating to these quantities are listed in Table 2.

Table 2 Results from the experimental phantom studies

Diagnostic test studies [41,42,43,44] reported clinical characteristics and diagnostic accuracy data for magnification and zoom at a threshold equal to or equivalent to BIRADS ≥ 4a [48], see Table 3. The ranges of sensitivity and specificity were 85–100% and 50–57%, respectively, for magnification. For zoom, the ranges of sensitivity and specificity were 59–98% and 43–62%, respectively. The total number of true positives, false positives, false negatives and true negatives from each diagnostic test study is listed in Table 4.

Table 3 Retrospective diagnostic test studies: Study characteristics, technical characteristics and clinical characteristics
Table 4 Numbers extracted from the retrospective diagnostic test studies: number of true positives (TP), number of false positives (FP), number of false negatives (FN) and number of true negatives (TN) from zoom and magnification

Figure 2 shows the summarised result of the quality assessment of the included studies. One of the studies was rated as at ‘high risk of bias’ because the readers were not blinded with regard to the use of magnification and zoom. We also determined that it was ‘unclear’ whether some of the studies met certain quality criteria. The specific reasons were as follows: two gold standards were used in three out of four diagnostic test studies, certain diagnostic test studies had a retrospective design, and three of the phantom studies [37, 39, 40] did not state standard deviations, 95% confidence intervals or p values.

Fig. 2 Risk of bias and applicability. Grouped bar charts showing risk of bias (left) and concerns regarding applicability (right) for the included studies, using the QUADAS 2 domains for the diagnostic test studies, and the modified version of QUADAS 2 for the phantom studies

Detection of microcalcifications

Findings concerning the detection of microcalcifications from the phantom studies are summarised in Table 2.

Detection of the smallest microcalcifications (diameters < 200 μm) was higher when using magnification than zoom, whereas detection was more comparable for larger microcalcifications [38, 39]. According to Vahey et al [38], there are also statistically significant differences in detection in the range of 200–630 μm. Koutalonis et al [39] found that microcalcifications of radii 50–100 μm are only visible when utilising magnification.

The detection of microcalcifications rose with increased current-time product (mAs) [40], decreased glandularity and increased magnification or zoom factor [39]. Changing the tube voltage (kVp) while the current-time product was controlled by automatic exposure control (AEC) did not have a statistically significant effect on the detection of microcalcifications, regardless of whether magnification or zoom was used [36]. Magnification yielded higher detection of microcalcifications for the anode/filter combinations Mo/Mo and Rh/Rh, while zoom yielded higher detection for the anode/filter combination Mo/Rh [36].

According to Egan et al [37], normalised PI was higher for mass detection than microcalcification detection when using standard AEC. Optimising the exposure factors improved detection both for conventional FFDM magnification mammography and photon counting FFDM without magnification, and the value of normalised PI for photon counting FFDM was comparable to the conventional FFDM with magnification [37].

Diagnosing microcalcifications

Coupled forest plots of sensitivity and specificity for magnification and zoom are shown in Fig. 3. Pooled sensitivity was 0.93 (95% CI 0.84–0.97) and 0.85 (95% CI 0.70–0.94) for magnification and zoom, respectively. The pooled specificity was similar for both: 0.55 (95% CI 0.51–0.58) and 0.56 (95% CI 0.50–0.62) for magnification and zoom, respectively.

Fig. 3 Coupled forest plots of pooled sensitivity and specificity for diagnosing microcalcifications using magnification images (above) and zoom (below). The squares represent the sensitivities and specificities for individual studies, while the horizontal lines plot their 95% confidence interval. The pooled sensitivities and specificities are indicated with a red dotted vertical line and a diamond, while the horizontal size of the diamonds indicates their 95% confidence interval. Results from the heterogeneity tests are also listed in the lower right corner of the plots

A likelihood ratio test was performed, comparing a bivariate model without a covariate for diagnostic test type (magnification or zoom) with a bivariate model that included a covariate for diagnostic test type and assumed equal variance for each diagnostic test. There was no statistical evidence that sensitivity and/or specificity differed between magnification and zoom (p = 0.42). Likelihood ratio tests for sensitivity and specificity alone gave non-significant differences for both sensitivity (p = 0.20) and specificity (p = 0.57) between magnification and zoom.
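The mechanics of a likelihood ratio test between two nested models can be sketched as follows. The statistic is twice the difference in log-likelihood, referred to a chi-square distribution whose degrees of freedom equal the number of extra parameters. The log-likelihood values below are hypothetical, and the degrees of freedom (2 for a joint test of sensitivity and specificity, 1 for either alone) are an assumption about the models compared, not something stated in the text. Closed-form tail probabilities are used so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, for df = 1 or df = 2 only."""
    if x < 0:
        raise ValueError("statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or df = 2 are supported in this sketch")


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return (statistic, p value) for nested models fitted by maximum likelihood."""
    statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods: model without and with the test-type covariate.
stat, p = likelihood_ratio_test(loglik_reduced=-100.0, loglik_full=-99.2, df=2)
print(f"LR statistic {stat:.2f}, p = {p:.2f}")
```

A small gain in log-likelihood from adding the covariate gives a small statistic and a large p value, which is the pattern reported for magnification versus zoom.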

The Q test did not indicate any heterogeneity among the specificities from individual studies of magnification (Q = 4.89, p = 0.18), and the I² value was consistent with this (I² = 38.64%). However, heterogeneity was detected for the sensitivity of magnification and for both the sensitivity and specificity of zoom (Q tests yielded p < 0.05), and it could be substantial to considerable (I² = 77.96%, 95.59% and 74.79%, respectively).

Discussion

In this systematic review, we have compared the use of magnification and zoom for detecting and diagnosing microcalcifications. Our review of phantom studies found that the size of microcalcifications, exposure factors and detector technology determine whether or not digital zoom is equivalent to magnification in the detection of microcalcifications. Our meta-analysis of sensitivity and specificity from the diagnostic test studies found high sensitivities for magnification and zoom, 0.93 (95% CI 0.84–0.97) and 0.85 (95% CI 0.70–0.94), respectively, but low specificities, 0.55 (95% CI 0.51–0.58) and 0.56 (95% CI 0.50–0.62), respectively. No statistically significant differences were found between the sensitivities or the specificities.

Malignant microcalcifications are likely to occur in the range 50–500 μm, while benign calcifications are often larger than 1 mm [1]. The observed differences in detection between magnification and zoom apply to the size corresponding to the smallest malignant microcalcifications. Other phantom studies [8, 9] revealed that pixel sizes below 100 μm enhance the visual perception of small objects corresponding to typical microcalcifications, and that detection of microcalcifications increases as pixel size decreases. A detector pixel size of 100 μm, as used in three out of five phantom studies in our review, would then require magnification to achieve a more favourable effective pixel size. A detector pixel size of 100 μm was also used in one of the diagnostic test studies [41]. This study showed that microcalcifications were more visible and more microcalcifications were detected when using geometric magnification than digital zoom.

Optimising the exposure factors also improves detection of microcalcifications: an increase in mAs and entrance surface air kerma (ESAK) improved detection [40], in line with another study, which demonstrated that reduced noise improved reader performance when detecting microcalcifications [10]. Increasing tube potential (kVp) decreases contrast, but a phantom study [36] could not demonstrate a statistically significant decrease in detection. However, zoom yielded better detection using the anode/filter combination Mo/Rh, while magnification yielded better detection for Mo/Mo and Rh/Rh [36]. The author of this study proposed further investigations on how X-ray spectra influence the visibility of structures [36].

Newer photon-counting detector technology in combination with optimised exposure factors may generate non-magnified images with microcalcification detection equivalent to magnified images from FFDM with flat panel detectors [37]. Standard AEC seems to be best suited to imaging masses, and exposure factors should always be optimised for the purpose of detecting microcalcifications, whether magnification or photon-counting technology is used. However, this was not tested on different sizes of microcalcifications.

Our meta-analysis of sensitivity and specificity for magnification and zoom based on diagnostic test studies did not find any statistically significant differences. A partial explanation may be that most of the diagnostic studies included were performed using detectors of smaller pixel sizes (70 μm) than most of the phantom studies, which offers better visual perception of microcalcifications in zoomed images as well. The absence of any significant difference is in agreement with earlier studies based on digitised analogue images [19] or hard-copy prints of digital images [20], which suggests that zooming provides valuable information about microcalcifications [19]. If zoom could replace magnification in recalls due to suspicion of microcalcifications, it would reduce AGD, the number of potentially painful compressions and examination time, thereby improving workflow [20].

The meta-analysis revealed heterogeneity between the diagnostic test studies. Heterogeneity widens the 95% confidence intervals of the pooled values, which can produce a non-significant p value and obscure real differences; this could also explain why no statistically significant difference was found between the sensitivities of magnification and zoom. Some conditions cause more subtle microcalcifications than others [1, 2], and differences in patient age, case characteristics and pre-test probabilities, as listed in Table 3, may contribute to heterogeneity. Differences in imaging detectors introduce differences in image noise and resolution and could also contribute to heterogeneity. According to the Nyquist sampling theorem, objects smaller than twice the detector pixel size will either not be visualised or will be incorrectly visualised due to aliasing [10]. Varying reader experience is also a potential factor; a study has shown that experienced readers perform better in the detection of microcalcifications than inexperienced readers [14].
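The sampling argument above can be made concrete with a short sketch. It assumes the simplified rule stated in the text, that the smallest reliably sampled object is twice the pixel size, and it assumes that geometric magnification divides the detector pixel size by the magnification factor. The function name and the magnification factor of 1.8 are illustrative assumptions; the pixel sizes of 100 μm and 70 μm are those mentioned in this review.

```python
def smallest_sampled_object_um(detector_pixel_um, magnification=1.0):
    """Twice the effective pixel size, following the simplified Nyquist rule in the text."""
    if magnification < 1.0:
        raise ValueError("magnification must be >= 1")
    effective_pixel_um = detector_pixel_um / magnification
    return 2.0 * effective_pixel_um


for pixel_um in (100.0, 70.0):
    without = smallest_sampled_object_um(pixel_um)
    magnified = smallest_sampled_object_um(pixel_um, magnification=1.8)
    print(f"{pixel_um:.0f} um pixel: {without:.0f} um without magnification, "
          f"{magnified:.0f} um at M = 1.8 (assumed)")
```

Under these assumptions, a 100 μm detector without magnification reaches its sampling limit at about 200 μm, which is close to the size range where the phantom studies reported the clearest advantage for magnification. Zoom does not change this limit because it is applied after acquisition.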

The retrospective design of the diagnostic test studies and the different reference standards for benign lesions may be limitations of this review. The studies used images available from clinical practice where both magnification and conventional FFDM were available; this may have led to a selection bias in the sensitivity and specificity of the individual studies. It should be mentioned that some of the experimental phantom studies did not provide standard deviations and/or confidence intervals. More diagnostic studies, studies using newer detector technology and studies considering post-processing of magnified and zoomed images would have strengthened this review. The phantom images in the included experimental studies have a uniform background, in contrast to the background of real mammograms, where the anatomic noise could be a limiting factor [49]. Experimental studies comparing magnification and zoom with anthropomorphic breast phantoms could be an option for further investigations. Nevertheless, this systematic review provides an overview of studies using FFDM magnification and zoom to compare detection and diagnosis of microcalcifications in diagnostic mammography.

In conclusion, zoom may be equivalent to magnification in many cases given that optimised procedures and newer detector technologies are now available. This finding has the potential to reduce AGD and improve examination workflow. Both diagnostic test studies and phantom studies using newer detectors would contribute additional knowledge on this topic.