Probability of malignancy for lesions detected on breast MRI: a predictive model incorporating BI-RADS imaging features and patient characteristics
- Cite this article as:
- DeMartini, W.B., Kurland, B.F., Gutierrez, R.L. et al. Eur Radiol (2011) 21: 1609. doi:10.1007/s00330-011-2094-6
To predict the probability of malignancy for MRI-detected breast lesions with a multivariate model incorporating patient and lesion characteristics.
Retrospective review of 2565 breast MR examinations from 1/03–11/06. BI-RADS 3, 4 and 5 lesions initially detected on MRI for new cancer or high-risk screening were included and outcomes determined by imaging, biopsy or tumor registry linkage. Variables were indication for MRI, age, lesion size, BI-RADS lesion type and kinetics. Associations with malignancy were assessed using generalized estimating equations and lesion probabilities of malignancy were calculated.
855 lesions (155 malignant, 700 benign) were included. Strongest associations with malignancy were for kinetics (washout versus persistent; OR 4.2, 95% CI 2.5–7.1) and clinical indication (new cancer versus high-risk screening; OR 3.0, 95% CI 1.7–5.1). Also significant were age ≥50 years, size ≥10 mm and lesion-type mass. The most predictive model (AUC 0.70) incorporated indication, size and kinetics. The highest probability of malignancy (41.1%) was for lesions on MRI for new cancer, ≥10 mm with washout. The lowest (1.2%) was for lesions on high-risk screening, <10 mm with persistent kinetics.
A multivariate model shows promise as a decision support tool in predicting malignancy for MRI-detected breast lesions.
Breast MRI is an important new tool for the detection and characterization of breast carcinoma. Current applications for breast MRI include evaluation of extent of ipsilateral malignancy [1–17] and screening of the contralateral breast [4, 11, 13, 15, 16, 18–25] in patients with newly diagnosed breast cancer, screening of women at high risk for breast cancer [25–34] and evaluation of patients with metastatic axillary adenopathy and an unknown primary cancer [35–40]. In 2003 the First Edition of the American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) MRI lexicon was published, reflecting the significance of this tool in the breast imaging armamentarium.
The value of breast MRI is attributable to its high sensitivity, ranging from 94 to 99% [42–46]. This is tempered by more variable and potentially lower specificity. In practice, there are multiple complex imaging and patient parameters which the radiologist must integrate when assessing the likelihood of malignancy for breast MRI findings. While there is broad agreement that both morphological and kinetic characteristics should be considered when evaluating breast MRI lesions, there is continued debate as to the relative importance of these features in predicting malignancy. Existing studies of the predictive values of lesion MRI characteristics have generally investigated either morphological or kinetic features, often using solely univariate analyses [47–54]. Multivariate predictive models, which allow the integration of lesion and patient characteristics to predict the probability of malignancy for MRI lesions with various combinations of features, have the potential to improve the diagnostic accuracy of MRI. Such tools offer promise to provide decision support in the management of breast MRI findings. However, the few multivariate studies to date have typically included lesions identified and deemed suspicious before MRI or used only lesions with MRI features associated with higher likelihoods of malignancy. Existing models have also yet to incorporate patient characteristics known to impact the risk of breast carcinoma in combination with MRI features defined using the BI-RADS lexicon. Thus, further analyses are warranted to develop a clinically applicable predictive model with the capacity to discriminate benign from malignant MRI findings in a variety of patients with a spectrum of lesions.
The purpose of our study was to predict the probability of malignancy for MRI-detected breast lesions through development of a multivariate model incorporating patient characteristics and BI-RADS features. In particular, we sought to identify the constellation of features in lesions currently recommended for imaging follow-up or biopsy which may confer sufficiently low predicted probabilities of malignancy to obviate the need for additional work-up. This is the first investigation to include BI-RADS 3 (probably benign) in addition to BI-RADS 4 and 5 MRI-detected lesions, and the first to incorporate both clinical indication for breast MRI and BI-RADS lesion features including kinetics in a multivariate predictive model.
Materials and methods
This study was approved by our institutional review board and was HIPAA-compliant. A retrospective review of our prospectively populated MRI database was performed to identify all consecutive breast MR examinations performed at our institution between January 1, 2003 and November 30, 2006. Lesions included in this study were initially detected on breast MR examinations performed for the clinical indication of a new cancer diagnosis or for high-risk screening, and received a final MRI BI-RADS assessment of 3 (probably benign), 4 (suspicious) or 5 (highly suggestive of malignancy). All lesions were clinically and mammographically occult at the time of MRI. BI-RADS 3, 4 and 5 MRI lesions were evaluated because improved discrimination of benign from malignant lesions in this group could prevent unnecessary follow-up or biopsy for false-positive lesions. Lesions with a BI-RADS 6 assessment (biopsy-proven, known malignant lesions) were excluded. We do not routinely record data for BI-RADS 2 assessment (benign) lesions and they were not included in this study.
All study data were obtained from the Consortium Oncology Data Integration (CODI) project (IR #5586, protocol 1833E). CODI is a solid tumor clinical research database developed and maintained by the Fred Hutchinson Cancer Research Center in collaboration with the University of Washington. Data in CODI have been obtained in accordance with all applicable human subjects’ laws and regulations. Sources of data for CODI include our prospectively recorded breast MRI data forms, our institutional pathology database and the regional Cancer Surveillance System (CSS) tumor registry. CSS is a part of the National Cancer Institute’s SEER program (Surveillance, Epidemiology, and End Results) and collects population-based data on the incidence, treatment and follow-up on all newly-diagnosed cancers (except non-melanoma skin cancers) occurring in residents of the 13 counties in western Washington State.
Breast MRI technique
All examinations were performed on a GE LX 1.5T (General Electric Medical Systems, Milwaukee, WI, USA) using a dedicated bilateral breast coil (MRI Devices, Waukesha, WI, USA). During the study period, three MRI protocols were used as clinical practice and technology evolved. All protocols were consistent with the guidelines established by the International Breast MRI Consortium (IBMC) and by the ACR Imaging Network (ACRIN) MRI trials. Imaging sequences included pre- and at least two post-contrast T1-weighted fat-suppressed 3D fast spoiled gradient recalled series. Before October 2005, imaging was performed in the sagittal plane with TR/TE 6.7/4.2 ms, flip angle 10°, FOV 18–22 cm, slice thickness 3 mm and matrix size 256 × 192. From October 2005 through June 2006, imaging was performed in the axial plane with TR/TE 6.2/3 ms, flip angle 10°, FOV 32–38 cm, slice thickness 2.2 mm and matrix size 350 × 350. From July 2006, imaging was performed in the axial plane with TR/TE 5.5/2.7 ms, flip angle 10°, FOV 32–38 cm, slice thickness 1.6 mm and matrix size 420 × 420. For all protocols, initial post-contrast acquisitions were centered at 1.5 min after contrast medium administration. Delayed acquisitions were centered between 4.5 and 7.5 min after contrast medium administration depending upon protocol. Omniscan (Mallinckrodt, Inc., Hazelwood, MO, USA) gadolinium contrast medium was used (20 cc hand-injected from January 2003 to February 2004; 0.1 mmol/kg power-injected at 2 cc/second from February 2004 to November 2006) followed by a saline flush. Each sequence required between 90 and 180 s, with post-contrast medium imaging completed by 9 min in most patients, allowing for differences in breast size.
Magnetic resonance examinations were prospectively interpreted, without knowledge of pathological outcomes, by one of five fellowship-trained breast imaging specialists (including W.B.D., R.L.G., and C.D.L.) with one to nine years of breast MRI experience. The examinations were interpreted in conjunction with clinical history and available correlative breast imaging studies including mammograms and ultrasounds. All MR examinations were processed by a computer-assisted evaluation (CAE) system (CADstream™ version 3.0, Confirma, Inc., Kirkland, WA, USA) and were reviewed on high-resolution PACS monitors (General Electric Medical Systems, Milwaukee, WI, USA). A published survey of members of the USA Society of Breast Imaging indicated that such commercially available CAE systems were used by the majority (50.7%) of practices for breast MRI interpretation. Lesion morphological characteristics, kinetic features, assessments and recommendations were recorded at the time of interpretation using the BI-RADS MRI lexicon. Lesion kinetics were obtained from CAE synopses of initial and delayed enhancement, with synopses available for those lesions with enhancement meeting a specified minimum threshold of a ≥50% increase in pixel signal intensity on the initial post-contrast compared with the pre-contrast series. The CAE data include the proportions of delayed phase persistent, plateau and washout enhancement comprising each lesion. Kinetics for those findings with enhancement below the CAE minimum threshold were based on the radiologist’s qualitative assessment of the most suspicious delayed phase enhancement. All lesion data were recorded on an MRI data form at the time of interpretation, and the data were subsequently entered into the CODI database.
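The enhancement threshold described above can be illustrated with a minimal sketch. It assumes initial enhancement is the percent increase in signal intensity relative to the pre-contrast value, applied here to a single representative measurement; the function and constant names are hypothetical and do not reflect the commercial CAE software's actual implementation.

```python
CAE_THRESHOLD_PERCENT = 50.0  # minimum initial enhancement stated in the text


def initial_enhancement_percent(pre: float, post: float) -> float:
    """Percent increase in signal intensity from the pre-contrast series
    to the initial post-contrast series (assumed definition)."""
    if pre <= 0:
        raise ValueError("pre-contrast signal intensity must be positive")
    return 100.0 * (post - pre) / pre


def kinetics_source(pre: float, post: float) -> str:
    """Return which assessment supplies delayed kinetics: the CAE synopsis
    when the threshold is met, otherwise the radiologist's qualitative read."""
    if initial_enhancement_percent(pre, post) >= CAE_THRESHOLD_PERCENT:
        return "CAE"
    return "radiologist"
```

For example, a signal rising from 100 to 150 is a 50% increase and would fall to the CAE synopsis, whereas a rise from 100 to 149 would fall to the radiologist's assessment.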
Definitions of variables and cancer outcomes
Patient characteristic variables were clinical indication for breast MRI and patient age. Clinical indication was dichotomized to new cancer diagnosis (evaluation for extent of disease) or high-risk screening. Patient age was dichotomized to <50 or ≥50 years. Lesion variables were maximum lesion size, BI-RADS lesion type and delayed kinetics defined by most suspicious curve. Lesion size was dichotomized to <10 or ≥10 mm. BI-RADS lesion type was categorized as focus/foci, mass or non-mass-like enhancement (NMLE). Delayed phase kinetics were categorized as persistent, plateau or washout based on the single most suspicious type of lesion kinetics (any washout > any plateau > any persistent). Delayed kinetics were further considered as all delayed kinetics (all study lesions with CAE-assessed or radiologist-assessed kinetics) versus CAE-assessed (only lesions with CAE-assessed kinetics).
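The variable definitions above can be sketched as a small coding routine. This is an illustration only: the class, function and field names are hypothetical, the kinetics input is assumed to be the three CAE proportions (persistent, plateau, washout), and a missing size is carried through as `None` because size was not recorded for every lesion.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


def most_suspicious_kinetics(persistent: float, plateau: float, washout: float) -> str:
    """Pick the single most suspicious delayed curve type present in a lesion.

    Ranking follows the text: any washout > any plateau > any persistent.
    """
    if washout > 0:
        return "washout"
    if plateau > 0:
        return "plateau"
    if persistent > 0:
        return "persistent"
    raise ValueError("lesion has no recorded delayed enhancement")


@dataclass
class CodedLesion:
    new_cancer_indication: bool      # False means high-risk screening
    age_ge_50: bool
    size_ge_10mm: Optional[bool]     # None when size was not recorded
    lesion_type: str                 # "focus/foci", "mass" or "NMLE"
    delayed_kinetics: str            # "persistent", "plateau" or "washout"


def code_lesion(indication: str, age_years: float, size_mm: Optional[float],
                lesion_type: str, kinetics: Tuple[float, float, float]) -> CodedLesion:
    """Apply the dichotomizations and categories described in the text."""
    if indication not in ("new cancer", "high-risk screening"):
        raise ValueError("unknown clinical indication")
    if lesion_type not in ("focus/foci", "mass", "NMLE"):
        raise ValueError("unknown BI-RADS lesion type")
    return CodedLesion(
        new_cancer_indication=(indication == "new cancer"),
        age_ge_50=(age_years >= 50),
        size_ge_10mm=None if size_mm is None else (size_mm >= 10),
        lesion_type=lesion_type,
        delayed_kinetics=most_suspicious_kinetics(*kinetics),
    )
```

Under this coding, a lesion that is 95% persistent and 5% washout is classified as washout, which is what the most-suspicious-curve rule implies.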
Histopathology outcomes were classified as benign or malignant based on follow-up at our institution and linkage to the regional tumor registry. A final diagnosis of invasive breast carcinoma or ductal carcinoma in situ (DCIS) at our institution or in the tumor registry within 12 months of the index MR examination constituted a malignant outcome. A benign outcome was defined by a benign biopsy or ultrasound correlate at our institution within 12 months, a benign follow-up MRI at our institution within 18 months or the absence of a malignant diagnosis in the tumor registry within 12 months of the index MR examination.
The primary analysis aim was to develop a multivariate model to predict the probability of malignancy using patient clinical characteristics and lesion MRI features. The model was meant to inform clinical practice, rather than to reveal mechanisms of malignancy. Therefore, variables were generally dichotomized based on results from previous research and on patterns observed in clinical practice. Predictive models were fit using generalized estimating equations (GEE) as multiple lesions could belong to a single patient. Using GEE with independence working correlation, estimated probabilities of malignancy were the same as for logistic regression models, but standard errors and confidence intervals were computed using sandwich standard errors.
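The relationship between GEE with independence working correlation and logistic regression can be illustrated with a minimal sketch. It is not the authors' code: it assumes a single binary predictor, fits ordinary logistic regression by Newton-Raphson, and then computes a cluster-robust (sandwich) standard error by summing score contributions within each patient. The function name and the data layout are hypothetical.

```python
import math
from collections import defaultdict


def fit_logistic_cluster_robust(rows, iters=25):
    """Fit logit P(y=1) = b0 + b1*x and return (b0, b1, robust_se_b1).

    rows: iterable of (cluster_id, x, y) with x and y in {0, 1}.
    Point estimates are those of ordinary logistic regression; the standard
    error of b1 uses a sandwich estimator clustered on cluster_id.
    """
    rows = list(rows)
    b0 = b1 = 0.0
    for _ in range(iters):  # Newton-Raphson on the logistic log-likelihood
        g0 = g1 = h00 = h01 = h11 = 0.0
        for _, x, y in rows:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det

    # "Bread": inverse information matrix; "meat": cluster-summed score products.
    h00 = h01 = h11 = 0.0
    score = defaultdict(lambda: [0.0, 0.0])
    for cid, x, y in rows:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = p * (1.0 - p)
        h00 += w
        h01 += w * x
        h11 += w * x * x
        score[cid][0] += y - p
        score[cid][1] += (y - p) * x
    m00 = sum(s[0] * s[0] for s in score.values())
    m01 = sum(s[0] * s[1] for s in score.values())
    m11 = sum(s[1] * s[1] for s in score.values())
    det = h00 * h11 - h01 * h01
    a01, a11 = -h01 / det, h00 / det  # second row of the inverse information
    var_b1 = a01 * a01 * m00 + 2.0 * a01 * a11 * m01 + a11 * a11 * m11
    return b0, b1, math.sqrt(var_b1)
```

The odds ratio is `exp(b1)` and an approximate 95% CI is `exp(b1 ± 1.96 * robust_se_b1)`. In this sketch, duplicating every lesion within its patient leaves the robust standard error unchanged, whereas a model-based standard error would shrink; that is the practical reason for using sandwich errors when lesions cluster within patients.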
Initially, study variables’ associations with risk of malignancy were assessed using univariate GEE with calculation of odds ratios (OR) and 95% confidence intervals (CI). Multivariate models were constructed based on univariate analyses. Clinical indication for breast MRI, lesion size, BI-RADS lesion type and their interactions were the first variables considered for the multivariate model. Patient age was considered next, followed by lesion delayed kinetics. Interactions with clinical indication were investigated because the relatively large main (additive) effect for clinical indication suggested that the patient populations for two MRI indications (new cancer and screening) could be fundamentally different. Interactions with size were also explored. At each level of variable selection, models were evaluated by Wald and Score tests for individual variables. Additionally, the predictive abilities of the models were compared using the area under the receiver operating characteristic curve (AUC), which is equivalent to the concordance measure c for logistic regression (or GEE with independence working correlation). The final multivariate model was evaluated using cluster-level deletion diagnostics. Furthermore, the AUC for the final multivariate model was computed using predicted values from 5-fold cross-validation, to ensure that model assessment was not conducted on the same data as that used for model development.
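The two assessment tools named above, the AUC as a concordance measure and 5-fold cross-validation, can be sketched as follows. The function names are hypothetical. The text does not state whether folds were formed by patient or by lesion; assigning whole patients to folds is an assumption made here so that lesions from one patient never appear in both the training and the assessment data.

```python
import random
from typing import Hashable, List, Sequence


def concordance_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the concordance measure c: the fraction of (malignant, benign)
    pairs in which the malignant lesion has the higher predicted probability,
    counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign lesion")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def patient_level_folds(patient_ids: Sequence[Hashable], k: int = 5,
                        seed: int = 0) -> List[int]:
    """Assign each lesion a fold index 0..k-1 so that all lesions from the
    same patient share a fold (assumed design, see lead-in)."""
    unique = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique)
    fold_of = {pid: i % k for i, pid in enumerate(unique)}
    return [fold_of[pid] for pid in patient_ids]
```

In use, the model would be refit k times, each time leaving one fold out, and `concordance_auc` would be applied to the pooled out-of-fold predictions rather than to predictions from the model fitted on all lesions.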
Patient and lesion characteristics
Patients (N = 528)
Mean age, range: 51 years, 20–85
Clinical indication for patient MR examinations (N = 551 examinations)
New cancer diagnosis
High-risk screening
Number of lesions per patient
Lesions (N = 855)
Mean size, range (N = 810, missing for 45 lesions): 15 mm, 2–130
Non-mass-like enhancement (NMLE)
Univariate models: associations with malignancy for patient and lesion variables
Univariate generalized estimating equations (GEE) models
Columns: malignant (N = 155), N (PPV**); benign (N = 700), N (NPV**); odds ratio (95% CI***); area under the curve (AUC) (95% CI)
AUC (95% CI) values, in extracted order: 0.60 (0.55, 0.64); 0.59 (0.56, 0.62); 0.59 (0.54, 0.63); 0.56 (0.52, 0.61); 0.66 (0.61, 0.70); 0.60 (0.51, 0.70)
Lesion type and size were strongly correlated (Chi-squared statistic with 2 degrees of freedom = 286.1, p < 0.001), so size served as a proxy for lesion type in the multivariate analysis. Prediction based on both size and clinical indication was stronger than with either alone (N = 810, AUC = 0.64). The interaction of size and clinical indication was statistically significant (p = 0.01): the odds ratio describing the association between size (≥10 mm versus <10 mm) and malignancy was much greater for a screening indication (OR = 9.3, 95% CI = 2.8–31.1) than for new cancer (OR = 1.8, 95% CI 1.2–2.7). Women ≥50 years were more likely to have malignant lesions than women <50, but age was not an important predictor in multivariate models.
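The size-by-indication interaction can be written out explicitly. The following is a sketch under an assumed coding, with S = 1 for size ≥10 mm and I = 1 for a new cancer indication (high-risk screening as the reference level); the symbols are introduced here for illustration, and the text does not report the fitted coefficients themselves.

```latex
\operatorname{logit} P(\text{malignant}) = \beta_0 + \beta_S S + \beta_I I + \beta_{SI}\, S I

\mathrm{OR}_{\text{size} \mid \text{screening}} = e^{\beta_S} = 9.3, \qquad
\mathrm{OR}_{\text{size} \mid \text{new cancer}} = e^{\beta_S + \beta_{SI}} = 1.8

e^{\beta_{SI}} = \frac{1.8}{9.3} \approx 0.19
```

Under this coding, the interaction term multiplies the size odds ratio by roughly 0.19 when moving from a screening to a new cancer indication, which is the sense in which size carries much more information in the screening population.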
When added to a base multivariate model with size, indication and the size:indication interaction, all delayed kinetics (by CAE and by the radiologist) contributed to predictions of malignancy (N = 644, p < 0.001 for kinetics). The odds of malignancy were greater for plateau versus persistent enhancement (OR = 3.3, 95% CI = 1.6–6.7) and for washout versus persistent enhancement (OR = 3.9, 95% CI = 2.2–6.7). The AUC computed using 5-fold cross-validation was 0.70 (95% CI 0.65–0.74), lower than the naïve AUC of 0.73 computed when the predicted probabilities from model development were also used for model assessment.
Sensitivity analyses were conducted by limiting delayed kinetics to those assessed by CAE and through deletion diagnostics investigating the influence of individual patients’ data on model selection and fitted odds ratios. When only CAE assessed delayed kinetics were added as a predictor to the base multivariate model, the magnitude of the effect of kinetics was similar to the full delayed kinetics analysis (for example, the odds ratio was 1.4 for washout versus persistent kinetics, holding size and indication constant). However, for this subset (N = 126), the p value would not be considered statistically significant (p = 0.10 for CAE-assessed kinetics). Cluster-level deletion diagnostics did not identify any patients with undue influence on model selection, but did identify three patients (each with malignant lesions detected by high-risk screening) whose exclusion enhanced the magnitude of the size:indication interaction.
The radiologist must consider multiple patient and lesion characteristics when assessing the likelihood of malignancy for a finding detected on breast MRI. For example, a lesion characterized as a mass on MRI will have up to five additional BI-RADS MRI descriptors when shape, margin, internal enhancement, initial and delayed phase enhancement are designated. Previous studies of the predictive values for MRI variables have typically assessed either imaging features alone or patient characteristics alone [42, 47–53, 61–63], and most have been univariate analyses. Multivariate predictive models for malignancy which allow the integration of lesion morphology, lesion kinetics and patient characteristics have advantages over these analyses. By considering combinations of features, such models may identify lesions currently recommended for imaging follow-up or biopsies which have sufficiently low predicted probabilities of malignancy that further work-up is not warranted.
We developed a multivariate model incorporating patient characteristics and lesion features to predict the probability of malignancy for MRI-detected breast lesions. Our study design improves upon the methodology of the previous multivariate predictive models for MRI lesions, which have included only BI-RADS 4 or 5 MRI lesions [62, 63], or lesions identified and known to warrant biopsy before MRI. Due to an over-representation of suspicious MRI features, the results of these studies are not generalizable to the broader range of MRI lesions encountered in practice. Our study is the first to include BI-RADS 3 MRI lesions, which allows less suspicious MRI features to be included in the statistical model. Further, our cohort of BI-RADS 3, 4 and 5 lesions is the group for which improved assessment of the likelihood of malignancy may most affect clinical care, by potentially identifying lesions that may not require further follow-up imaging or biopsy. Our study is also the first to incorporate clinical indication for breast MRI and lesion characteristics of BI-RADS MRI types and delayed kinetics in a multivariate tool.
Among our study variables, we found the most predictive multivariate model incorporated clinical indication for breast MRI, lesion size and delayed kinetics. The increased likelihood of malignancy for lesions on MRI performed for a new cancer diagnosis compared with MRI for high-risk screening has previously been demonstrated by Han et al (36% vs. 14%, respectively), Liberman et al (32% vs. 16%, respectively) and Gutierrez et al (42% vs. 22%, respectively). Lesion size has also previously been shown to be a significant predictor of malignancy, with Liberman et al demonstrating a greater frequency of malignancy in larger MRI findings (3% of lesions less than 5 mm vs. 31% of lesions 20 mm or larger), and Gutierrez et al showing increased risk of malignancy for findings 1 cm or greater compared with those less than 1 cm (34% vs. 20%, respectively). The importance of delayed kinetics based on most suspicious curve, the classification method that is recommended by the BI-RADS MRI Atlas, was confirmed by a recent investigation which demonstrated significantly greater frequencies of malignancy in lesions with any washout (45.7%) compared with those with plateau (20.0%) or persistent (13.3%) as the most suspicious delayed enhancement type.
In this investigation, our predictive model (AUC 0.70) was more accurate than individual variables in assessing the likelihood of malignancy. Similarly, in their study of 995 MRI lesions initially found to be suspicious by conventional imaging or clinical examination, Schnall et al found that a multivariate model (AUC 0.88) for focal masses was more predictive of malignancy than were individual features. Our highest predicted probability of malignancy (41.1%) was for lesions detected on MRI performed for a new cancer diagnosis, measuring ≥10 mm with washout as the most suspicious delayed curve. The lowest predicted probability of malignancy (1.2%) was for lesions on high-risk screening, measuring <10 mm with persistent delayed kinetics. Because our results demonstrate that lesions with this combination of features have a likelihood of malignancy of less than 2%, they suggest that short-interval follow-up for such findings would be reasonable rather than biopsy. Of note, because we found no combination of features that conferred a probability of malignancy close to zero, we did not identify a group of lesions in our own BI-RADS 3, 4 and 5 cohort for which neither biopsy nor follow-up imaging was indicated.
Our study has limitations. It is a single-institution investigation, using specific MRI acquisition protocols, and the breast MRI examinations were interpreted by fellowship-trained radiologists specialized in breast imaging. A multi-site validation study is necessary to determine if our statistical model and results regarding probabilities of malignancy are generalizable to other practices.
In summary, our multivariate model incorporating patient and imaging characteristics shows promise in predicting the likelihood of malignancy for MRI-detected breast lesions. We identified a group of lesions with a probability of malignancy of less than 2%, for which short-term follow-up over biopsy is likely reasonable. If our model is validated, it may provide important decision support for radiologists assessing findings detected on breast MRI.
This research was supported by the GE-Radiology Research Academic Fellowship (GERRAF) Program, sponsored by the Association of University Radiologists (AUR).
Support was also provided by a National Cancer Institute (NCI) Cancer Center Support Grant for biostatistics as a shared resource (P30 CA015704). This paper was presented at the Radiological Society of North America (RSNA) 2008 Annual Meeting.