Selection of diagnostic features on breast MRI to differentiate between malignant and benign lesions using computer-aided diagnosis: differences in lesions presenting as mass and non-mass-like enhancement
- Cite this article as:
- Newell, D., Nie, K., Chen, JH. et al. Eur Radiol (2010) 20: 771. doi:10.1007/s00330-009-1616-y
To investigate methods developed for the characterisation of the morphology and enhancement kinetic features of both mass and non-mass lesions, and to determine their diagnostic performance to differentiate between malignant and benign lesions that present as mass versus non-mass types.
Quantitative analysis of morphological features and enhancement kinetic parameters of breast lesions was used to differentiate among four groups of lesions: 88 malignant (43 mass, 45 non-mass) and 28 benign (19 mass, 9 non-mass). The enhancement kinetics was measured and analysed to obtain the transfer constant (Ktrans) and the rate constant (kep). For each mass, eight shape/margin parameters and 10 enhancement texture features were obtained. For the lesions presenting as non-mass-like enhancement, only the texture parameters were obtained. An artificial neural network (ANN) was used to build the diagnostic model.
For lesions presenting as mass, the four selected morphological features could reach an area under the ROC curve (AUC) of 0.87 in differentiating between malignant and benign lesions. The kinetic parameter (kep) analysed from the hot spot of the tumour reached a comparable AUC of 0.88. The combined morphological and kinetic features improved the AUC to 0.93, with a sensitivity of 0.97 and a specificity of 0.80. For lesions presenting as non-mass-like enhancement, four texture features were selected by the ANN and achieved an AUC of 0.76. The kinetic parameter kep from the hot spot only achieved an AUC of 0.59, with a low added diagnostic value.
The results suggest that the quantitative diagnostic features can be used for developing automated breast CAD (computer-aided diagnosis) for mass lesions to achieve a high diagnostic performance, but more advanced algorithms are needed for diagnosis of lesions presenting as non-mass-like enhancement.
Keywords: Diagnostic performance; Mass and non-mass breast lesions; Pharmacokinetic enhancement parameters; Quantitative morphology and texture features; Computer-aided diagnosis; Artificial neural network
MR technology has advanced tremendously in the 30 years since the first breast MR imaging was performed. Dynamic contrast-enhanced MRI (DCE-MRI) is now a well-established clinical imaging technique. It is recommended for the screening of women with an aggregate lifetime breast cancer risk of more than 20%. Breast MRI is also employed throughout all stages of management, including detection, diagnosis, pre-operative staging, therapy response monitoring and surveillance [2, 3, 4, 5]. This trend is demonstrated by the nearly 40% per year increase in breast MR studies performed in the USA over the past 10 years.
Currently, breast MR demonstrates a high sensitivity in the range of 93–100%. As many benign lesions also show enhancement or other atypical features on MRI, the primary weakness of DCE-MRI remains its low specificity, reported to be in the range of 37–97% [1, 2, 3, 4, 5, 6, 7, 8, 9]. The cost of MRI itself, in addition to the cost of invasive follow-up procedures associated with such mediocre specificity, has limited its broader implementation as a screening tool for the general population.
For evaluation, traditionally, diagnostic impressions are generated by visual examination of morphology features and contrast enhancement kinetics using descriptors established in the BI-RADS (Breast Imaging-Reporting and Data System) lexicon [10, 11, 12]. The commercially available computer-aided diagnosis (CAD) systems presently in use display the suspicious lesions based on enhancement above a threshold level, and the enhancement kinetics from the lesion is also shown [13, 14, 15, 16]. Analyses of morphological features are left to the radiologist, and then all information needs to be integrated by the radiologist to make a final diagnostic impression. The interpretation of morphological features is subject to high inter-observer variability, resulting in differential diagnostic performance being highly dependent on the level of experience of the radiologist [17, 18]. Therefore, these commercial systems are in fact “computer-aided display systems”, not a true CAD that gives an intellectual impression about the suspicion level of the lesion. However, they did provide a very efficient way to extract the most essential information. Further efforts to add capabilities for quantitative characterisation of morphological features into the CAD systems will be very helpful, particularly to mammographers who are not experienced in interpreting breast MRI [19, 20].
Typically, the first step in evaluating lesion morphology on breast MRI is to classify the lesion as a mass, a focal lesion, or a non-mass-like enhancement. The BI-RADS breast MRI lexicon gives the following clear definitions for mass and non-mass-like enhancement: “Mass: A mass is a three-dimensional space-occupying lesion that comprises one process, usually round, oval, lobular, or irregular in shape”; “Non-mass-like enhancement: Enhancement of an area that is not a mass. This includes enhancement patterns that may extend over small or larger regions, and whose internal enhancement characteristics can be described as a pattern discrete from normal surrounding breast parenchyma.” In the case of mass-type lesions there are several parameters that can be used for constructing the differential diagnosis. For example, spiculation (morphology), rim enhancement (texture) and the wash-out kinetic pattern are typical features of malignant lesions; whereas smooth margin (morphology), low and homogeneous enhancement (texture) and a persistent kinetic pattern typically indicate a benign mass. The diagnostic features to differentiate between mass-type malignant and benign lesions are readily available. On the other hand, diagnosis of non-mass-like enhancement lesions is much more challenging. Malignant lesions such as ductal carcinoma in situ (DCIS) and invasive lobular cancer (ILC) are likely to present as non-mass-like enhancement [11, 12, 21, 22]. Benign fibrocystic changes, which also appear as non-mass-like enhancement, are a frequent finding on DCE-MRI. Unlike mass lesions, non-mass-like enhancement lesions exhibit poorly defined boundaries, leading to difficulty in the analysis of morphology [13, 24]. Furthermore, the malignant non-mass lesions often do not show the typical wash-out pattern in enhancement kinetics, so this very useful diagnostic criterion for mass lesions has a limited diagnostic value for non-mass lesions [25, 26].
Extensive research has been undertaken to build quantitative diagnostic models for breast MRI, similar to the mammography CAD system that gives a diagnostic impression. Typically the analysis system will characterise the morphological features as well as the enhancement kinetics of the lesion (using either automated or manual lesion segmentation), then build a classifier based on those features that yield the highest diagnostic performance [27, 28, 29]. However, most studies in the past have only focused on analysing mass lesions. Although the efficacy of kinetic analysis in the diagnosis of non-mass lesions has been investigated, little was done to investigate the morphological features. In a previous work, we developed a quantitative morphology and texture analysis method to select features for the diagnosis of mass lesions. In this study we apply the developed methods to characterise the morphology and enhancement kinetic features of both mass and non-mass lesions, and to investigate the diagnostic performance to differentiate between malignant and benign lesions that present as mass versus non-mass types. The diagnostic features analysed using the presented method may potentially be used to build a true breast MRI CAD that can give an intellectual impression, such as a BI-RADS score, for each lesion.
Materials and methods
Table 1 Histological types of lesions in the four groups: malignant mass, benign mass, malignant non-mass and benign non-mass. Malignant types listed for the mass and non-mass groups: IDC (invasive ductal carcinoma), DCIS (ductal carcinoma in situ), ILC (invasive lobular carcinoma)
MRI acquisition and lesion ROI drawing
All MR studies were conducted at 1.5 T (Eclipse, Philips Medical Systems, Cleveland, OH). Patients were positioned prone into the dedicated breast coil. Dynamic imaging was performed utilising a T1-weighted 3D gradient echo (RF-FAST) pulse sequence, with TR = 8.1 ms, TE = 4.0 ms, flip angle = 20°, matrix size = 256 × 128, field of view (FOV) between 32 and 38 cm for bilateral axial view imaging. The slice thickness was 4 mm, and a total of 32 slices were used to cover the entire breast. Temporal resolution was 42 s for each dynamic acquisition. After acquiring four sets of unenhanced baseline images, the contrast medium (Omniscan®, GE Healthcare, New Jersey, USA, 0.1 mmol/kg) was administered as a bolus injection at the beginning of the fifth acquisition. Twelve sets of post-contrast enhanced images were obtained.
For quantitative evaluation, the lesion ROI (region of interest) was manually outlined based on the subtraction images at 1-min post-injection (sixth frame–third frame), by a well-trained operator (DN) using an in-house program written in MATLAB. The ROI for each case was confirmed by an experienced radiologist (JHC). The resulting ROIs from all slices of one lesion were combined in order to analyse 3D information for the entire lesion.
Quantitative analysis of lesion shape features and enhancement texture
Eight features were used to describe the shape of a lesion: volume, surface area, compactness, NRL (normalised radial length) mean, sphericity, NRL entropy, NRL ratio and roughness. Compactness is defined as the ratio of the square of the surface area to the volume of the lesion—with a sphere having the lowest compactness index and an irregular undulating shape, such as a spiculated lesion, having a higher compactness index. The features based on the normalised radial length (NRL) describe contours and the finer shape of the lesion. NRL is defined as the Euclidean distance from the object’s centre (centre of mass) to each of its contour pixels and normalised relative to the maximum radial length of the lesion [27, 29]. For non-mass lesions, as there were no clearly defined boundaries, these shape parameters could not be reliably analysed.
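The NRL description above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the in-house MATLAB program used in the study: the function name `nrl_features`, the 2D contour input, the use of the contour-point mean as the centre, and the histogram bin count for the entropy are all hypothetical choices.

```python
import math

def nrl_features(contour, bins=10):
    """Return (nrl_mean, nrl_entropy) for a list of (x, y) contour points."""
    # Assumption: the centre is approximated by the mean of the contour points.
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    # Euclidean distance from the centre to each contour point.
    radial = [math.hypot(x - cx, y - cy) for x, y in contour]
    max_r = max(radial)
    nrl = [r / max_r for r in radial]  # normalised to the maximum radial length
    nrl_mean = sum(nrl) / len(nrl)
    # Entropy of the NRL histogram (the bin count is an illustrative choice).
    counts = [0] * bins
    for v in nrl:
        counts[min(int(v * bins), bins - 1)] += 1
    probs = [c / len(nrl) for c in counts if c > 0]
    nrl_entropy = -sum(p * math.log2(p) for p in probs)
    return nrl_mean, nrl_entropy
```

In this sketch, a perfectly round contour gives an NRL mean of 1 and zero entropy, while an elongated or undulating contour lowers the mean and raises the entropy.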
Radiographically, texture is defined as a repeating pattern of local variations in image intensity, and is characterised by the spatial distribution of intensity levels in a particular area. Haralick et al. defined 10 grey-level co-occurrence matrix (GLCM) enhancement features (energy, maximum probability, contrast, homogeneity, entropy, correlation, sum average, sum variance, difference average and difference variance) to describe texture, and these features were used to characterise lesions with and without mass effect.
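To make the GLCM idea concrete, the following sketch builds a co-occurrence matrix for a single pixel offset and derives five of the ten listed features. It is an illustration under assumptions rather than the study's implementation: the function names, the single horizontal offset, the symmetric counting, and the particular homogeneity formula (one of several in common use) are choices made here, and the image is assumed to be already quantised to integer grey levels.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalised, symmetric grey-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count both directions (symmetric GLCM)
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Five GLCM features computed from a normalised matrix p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "max_probability": max(v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        # Assumption: homogeneity weighted by 1 / (1 + |i - j|).
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }
```

A uniform patch yields maximal energy and homogeneity with zero contrast, whereas a checkerboard patch yields higher contrast and entropy, which is the kind of difference these features are meant to capture.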
The contrast enhancement kinetics was measured from the whole tumour ROI, as well as from the hot spot within the ROI. The hot spot was automatically searched within the whole tumour ROI as the area of 3 × 3 pixels showing the strongest enhancement on the subtraction image at 1-min after contrast injection. A mean signal intensity was calculated by averaging over the nine pixels of the hot spot ROI, or all pixels within the whole tumour ROI. The percentage enhancement was calculated as the increased signal intensity at each post-contrast frame normalised by the pre-contrast signal intensity (the averaged signal from all four time frames before the injection of contrast agent). The percentage enhancement time course was fitted to a two-compartmental pharmacokinetic Tofts model to characterise the uptake of contrast material in the lesion. The transfer constant (Ktrans) represents contrast uptake in percentage/minute, and the rate constant (kep) captures the washout rate in units of 1/min.
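The hot spot search and the percentage enhancement calculation can be sketched as follows. This is a simplified 2D illustration with hypothetical function names; it assumes that only 3 × 3 windows lying entirely inside the ROI are considered, which the text does not state explicitly. The subsequent Tofts model fit is not reproduced here.

```python
def find_hot_spot(subtraction, roi_mask):
    """Return (row, col) of the centre of the 3x3 window with the highest mean
    subtraction signal, considering only windows fully inside the ROI mask."""
    best, best_pos = None, None
    rows, cols = len(subtraction), len(subtraction[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            if not all(roi_mask[y][x] for y, x in window):
                continue
            mean = sum(subtraction[y][x] for y, x in window) / 9.0
            if best is None or mean > best:
                best, best_pos = mean, (r, c)
    return best_pos

def percent_enhancement(signal, n_baseline=4):
    """Percentage enhancement of each post-contrast frame relative to the
    mean of the pre-contrast frames (four baseline frames in this protocol)."""
    baseline = sum(signal[:n_baseline]) / n_baseline
    return [100.0 * (s - baseline) / baseline for s in signal[n_baseline:]]
```

For example, a mean signal time course of `[100, 100, 100, 100, 150, 200]` gives a percentage enhancement of 50% and 100% for the two post-contrast frames.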
An artificial neural network (ANN) was used to select the optimal feature set to differentiate between malignant and benign tumours. The structure contains one input layer with the number of nodes corresponding to the number of input variables, one hidden layer and one output node whose value ranges from 0 to 1 and indicates the level of malignancy, where 0 means absolutely benign and 1 means absolutely malignant. Different neural network architectures with the number of hidden nodes ranging from 2 to the number of input nodes were tested. Stochastic gradient descent with a mean squared error function was used as the learning algorithm. The optimal architecture was chosen as the one for which the validation error was lowest.
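A network of the kind described, with one hidden layer, a single output between 0 and 1, and stochastic gradient descent on a squared error, can be sketched in plain Python as below. This is not the LNKnet implementation used in the study; the class name `TinyANN`, the sigmoid activations, the weight initialisation range, the learning rate and the epoch count are all assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyANN:
    """One hidden layer, one sigmoid output in [0, 1] (0 = benign, 1 = malignant)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight row carries a trailing bias term.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in self.w1]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, h)) + self.w2[-1])
        return h, y

    def train(self, data, epochs=500, lr=0.5, seed=0):
        rng = random.Random(seed)
        data = list(data)
        for _ in range(epochs):
            rng.shuffle(data)  # stochastic: one sample per update, random order
            for x, t in data:
                h, y = self.forward(x)
                # Gradient of 0.5 * (y - t)^2 through the output sigmoid.
                d_out = (y - t) * y * (1.0 - y)
                d_hid = [d_out * self.w2[k] * h[k] * (1.0 - h[k])
                         for k in range(len(h))]
                for k in range(len(h)):
                    self.w2[k] -= lr * d_out * h[k]
                self.w2[-1] -= lr * d_out
                for k, row in enumerate(self.w1):
                    for i in range(len(x)):
                        row[i] -= lr * d_hid[k] * x[i]
                    row[-1] -= lr * d_hid[k]

    def mse(self, data):
        return sum((self.forward(x)[1] - t) ** 2 for x, t in data) / len(data)
```

In use, one such network would be trained for each candidate number of hidden nodes, and the architecture with the lowest error on held-out validation cases would be kept.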
After the topology was chosen, the diagnostic features in each category (shape, texture, kinetic parameters) were selected to identify those yielding the highest discrimination, thus achieving the optimal diagnostic performance. The analysed parameters had different values and ranges; to avoid bias, the values of each parameter from all lesions were normalised to have zero mean and unit variance before training. A forward search strategy was applied to find the optimal feature subset, which was obtained when the trained classifier produced the lowest error rate. To control for over-fitting, the potential feature set was limited to no more than four in the shape and texture category. The selected features from all categories were then considered in a combined model.
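The normalisation and forward search steps can be sketched as follows. The sketch is hedged: `error_fn` is a hypothetical callable standing in for "train the classifier on this feature subset and return its error rate", the stopping rule (stop when no candidate lowers the error) is an assumption, and the population standard deviation is used for the unit-variance scaling.

```python
import statistics

def zscore_columns(rows):
    """Normalise each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def forward_select(n_features, error_fn, max_features=4):
    """Greedy forward search: repeatedly add the feature that lowers error_fn
    the most; stop when nothing improves or max_features is reached."""
    selected, best_err = [], float("inf")
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        err, feat = min((error_fn(selected + [f]), f) for f in candidates)
        if err >= best_err:
            break
        selected.append(feat)
        best_err = err
    return selected, best_err
```

The `max_features=4` default mirrors the cap of four features per category described above; the greedy search does not guarantee the globally best subset, only a good one at modest cost.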
Four-fold cross validation was used to evaluate the generated classifier. All cases were first randomly assigned into four sub-cohorts, with each sub-cohort containing approximately the same proportion of benign and malignant cases. Three sub-cohorts were combined as the training set and the remaining sub-cohort was used as the validation set. This process was repeated by randomising the cases assigned to the training and validation sets to find the optimised diagnostic classifier. The resulting classifier could then be used to classify a lesion as malignant or benign based on a threshold applied to its output. The sensitivity and specificity in the entire dataset were calculated over the full range of thresholds (0.0–1.0 at intervals of 0.05). The ROC curve was then constructed from all data points at different thresholds by plotting sensitivity versus 1-specificity. The ROC curves for differentiating between (i) malignant and benign mass lesions, (ii) malignant and benign non-mass-like enhancement lesions, and finally (iii) all malignant and all benign lesions, were generated. The area under the ROC curve (AUC) was calculated for comparison. All analyses were performed using the LNKnet package (http://www.ll.mit.edu/IST/lnknet/).
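The threshold sweep and AUC calculation described above can be illustrated with a short sketch. The function names are hypothetical, the rule that a case is called malignant when its score is greater than or equal to the threshold is an assumption, and the trapezoidal rule is one common way to integrate the curve (the text does not say which method was used).

```python
def roc_points(scores, labels, step=0.05):
    """Return (1 - specificity, sensitivity) at thresholds 0.0, 0.05, ..., 1.0.
    Labels are 1 for malignant and 0 for benign."""
    n_steps = int(round(1.0 / step))
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for k in range(n_steps + 1):
        thr = k * step
        # Assumption: a score at or above the threshold is called malignant.
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= thr)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= thr)
        points.append((fp / neg, tp / pos))
    return points

def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule, anchored at (0, 0) and (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x2 - x1) * (y1 + y2) / 2.0 for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
```

With a 0.05 threshold interval the curve has at most 21 distinct operating points, so the resulting AUC is a piecewise-linear approximation.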
Differentiation between malignant and benign lesions presenting as mass
Table 2 Diagnostic performance for differentiating between malignant and benign lesions. Three comparisons are reported (mass type malignant vs. benign; non-mass type malignant vs. benign; all malignant vs. all benign), each evaluated using morphology (shape + texture for masses), kinetics (hot spot), kinetics (whole lesion), and morphology + kinetics (hot spot)
Differentiation between malignant and benign lesions presenting as non-mass-like enhancement
Differentiation between all malignant and all benign lesions
For differentiating between all malignant and benign lesions, the following three texture features were selected: homogeneity, grey-level max probability and grey-level sum average, which achieved an AUC of 0.81. After a kinetic parameter (hot spot kep) was added, the AUC was improved to 0.86. The results are also summarised in Table 2, and the ROC curves are shown in Fig. 4c.
Previous investigations have reported the selection of quantitative morphological and kinetic characteristics for building the computer-aided diagnosis models for lesions shown on breast MRI. Most reported works were for masses and rarely for lesions presenting as non-mass-like enhancement, primarily because of the challenges in defining the lesion extent for computer-based analysis. Nevertheless, given the very limited utility of kinetic enhancement data for the non-mass lesions, identifying diagnostic morphological features is even more important [14, 22, 25, 26, 31]. In this study, we compared four breast lesion groups: malignant mass, benign mass, malignant non-mass and benign non-mass, and investigated how they could be differentiated. Quantitative analysis was used to characterise the shape (only for masses), and the texture and kinetic features (for all lesions). An artificial neural network was used to search features to form diagnostic classifiers that can best differentiate between malignant and benign lesions.
The development of automated CAD for breast MRI is in the early stages in comparison with the well-established CAD systems for mammography. Most research has focused on the classification of mass-type lesions. Chen et al. published four papers using a dataset of 77 malignant and 44 benign lesions. One study used region growing for lesion segmentation for the analysis of enhancement kinetic features. The remaining three studies used fuzzy c-means (FCM)-based lesion segmentation: one reported methodology alone, one analysed kinetic features, and the other analysed texture. The differentiation between the malignant and benign groups was analysed using Student’s t test. Meinel et al. analysed the kinetic data and a limited set of morphological features, and demonstrated that providing these features to radiologists may enhance their diagnostic performance, regardless of their experience level. Gibbs et al. analysed enhancement and texture features based on manually drawn ROIs in 45 malignant and 34 benign lesions, and found that ‘grey-level entropy’ and ‘homogeneity’ were the most important features for lesion differentiation. In one study published by our group, we used neural networks to select shape and texture features to differentiate between 43 malignant and 28 benign mass lesions, and further attempted to establish the link between selected quantitative features and the descriptors defined in the BI-RADS lexicon. Szabo et al. reported the selection of diagnostic features by neural network using a database of 75 malignant and 30 benign lesions. The morphology features were analysed visually by radiologists based on manually drawn ROIs. Leinsinger et al. used neural network clustering to characterise 92 diagnostically challenging breast lesions in DCE-MRI which were categorised as BI-RADS III lesions in mammography, and found improvement in the discrimination between malignant and benign indeterminate lesions in comparison with a standard evaluation method. Overall, these results demonstrated that it is feasible to build a quantitative diagnostic model, particularly for mass lesions. In the present study, we used eight shape, 10 GLCM texture and two kinetic parameters to characterise each mass. The diagnostic performance based on two shape features (compactness and NRL entropy) and two texture features (homogeneity and grey-level sum average) could reach AUC = 0.87. When using the hot spot kinetic parameter kep, it could reach a comparable AUC = 0.88, and when using these five parameters together the AUC was further improved to 0.93. This finding demonstrates that the combination of the kinetic enhancement data and morphology information in a systematic model is the most effective and comprehensive approach to the diagnosis of masses.
Masses typically represent invasive ductal cancers and solid benign tumours (such as fibroadenoma and adenosis). Lesions presenting as non-mass-like enhancement have long been recognised as an important manifestation of certain breast cancers, in particular for DCIS and ILC [14, 22]. Diagnosis of these lesions is challenging because the enhancement of normal tissues and some benign processes, such as fibrocystic change, might show similar appearances [12, 21, 38, 39]. Radiological diagnosis of these non-mass-like enhancement lesions relies on the common descriptors defined in the BI-RADS lexicon. The distribution patterns are diverse and can be described as focal, linear, ductal, segmental, regional, multiple regions and diffuse. These lesions usually have fat or normal glandular tissues interspersed between the enhancing malignant tissues, making the definition of boundaries difficult. A literature review of breast MRI diagnosis for non-mass lesions based on the BI-RADS lexicon shows wide variation in the reported predictive values. For example, ductal enhancement is considered suspicious for cancer with a positive predictive value (PPV) ranging from 26% to 58.5% [40, 41]. Segmental enhancement has a PPV ranging from 67% to 100% for carcinoma [41, 42, 43]. While these distribution patterns may be easily assessed by visual examination, they are difficult to assess by using quantitative evaluation methods. Furthermore, when the boundary cannot be defined well, the shape parameters can still be calculated with the mathematical formulae used for masses, but the results might not be reliable. Therefore, we chose not to analyse the shape features.
Apart from the distribution pattern, the internal enhancement patterns within the enhanced area defined in the BI-RADS lexicon may also be used for diagnosis, including homogeneous, heterogeneous, stippled/punctate, clumped and reticular/dendritic. Stippled/punctate enhancements are more likely to represent normal breast tissue or fibrocystic changes, and thus indicate a low likelihood of malignancy, while clumped enhancement has a higher chance of being malignant [16, 42, 43]. The internal enhancement patterns may be quantitatively characterised based on the texture features, as demonstrated in our previous publication.
The sensitivity and specificity of DCE-MRI are, in general, much lower for the diagnosis of non-mass-like enhancement lesions compared with masses [26, 44]. Approximately 30% of invasive lobular cancer and DCIS [22, 45] show low enhancements with the persistent kinetic pattern. A study published by Jansen et al. examined 34 benign and 78 malignant lesions and investigated whether enhancement kinetics could improve diagnosis by considering lesions with and without mass effect separately. The enhancement kinetics was measured from manually drawn ROIs, and analysed using a three-parameter empirical mathematical model. It was found that for non-mass-like enhancement lesions there was no statistical difference in the kinetic features between malignant and benign lesions. Another study published by Goto et al. analysed 60 benign and 144 malignant breast lesions, and the lesions were also separated into mass and non-mass types. The morphology and enhancement kinetic data were evaluated by radiologists based on BI-RADS descriptors. It was reported that the presence of early enhancement added no diagnostic value to the standard morphological analysis. In fact, in the breast cancer case review session of the 2008 annual meeting of the RSNA (Radiological Society of North America), one speaker suggested that there is no need to analyse the enhancement kinetics of non-mass-like enhancement lesions, because it does not add diagnostic value.
In the present study a quantitative analysis method was used to extract the enhancement texture and the kinetic features of non-mass-like enhancement lesions. The AUC based on four selected texture features reached 0.76, which was worse than the AUC of 0.87 for mass lesions using two shape and two texture features. The enhancement kinetic data could only achieve an AUC of 0.59 using the hot spot analysis and 0.55 using the whole tumour ROI analysis. Regardless of the analysis method, the AUC was only slightly higher than the random guess of 0.5, suggesting a very low diagnostic value. Overall, our results were consistent with findings reported in the literature [24, 26, 38]. Nonetheless, there are other features that may be used to improve the diagnosis of non-mass-like enhancement lesions that were not considered in the present study. For example, evaluation based on symmetry between the two breasts is commonly used by radiologists, and symmetry is also a feature defined in the BI-RADS MRI lexicon. Similar quantitative analysis strategies using computer algorithms to analyse symmetry can be considered. Schmitz et al. proposed to analyse the vascularity score, and found that when the score was added to the standard morphologic and kinetic data analysis the diagnostic accuracy was increased significantly. Further improvements in imaging techniques to obtain more information might help as well. Veltman et al. combined the high temporal resolution images during initial enhancement (fast dynamic analysis) with the high spatial resolution images (slow dynamic analysis) and showed that combined analyses resulted in a significant improvement of diagnostic performance. MR spectroscopy (MRS) has also been shown to have a high diagnostic value for non-mass-like enhancement lesions. If MRS can be added into the image acquisition protocol, the diagnostic performance may be improved. The signal to noise ratio of the choline peak (or the concentration if using a quantitative method) may be built into the diagnostic model in the CAD system.
One limitation that should be clearly noted is the relatively low spatial resolution of images analysed in the present study. The cases were collected several years ago, and the imaging protocol used at that time was not comparable to the current recommendation of 1 mm × 1 mm in-plane resolution, and slice thickness of less than 2.5 mm [49, 50]. The inadequate spatial resolution was clearly demonstrated in Fig. 2 in that no spiculation was observed in this typical malignant mass-type lesion. In our quantitative analysis, the parameter “roughness” is sensitive to the spiculated margin, but it was not selected as a diagnostic feature. As seen in Fig. 2, when the spiculation was not revealed in the ROI, these two parameters might not truly capture the margin feature, and thus could not contribute to the diagnosis. Therefore, the value of this work lies more in the presented method than in the results. If another diagnostic dataset collected using the currently recommended imaging protocol is available, a similar analysis can be applied, and the generated diagnostic features may be used in the current clinical setting. They may be used to give a likelihood of a lesion being malignant or benign, or to further relate this likelihood to a BI-RADS score. This approach will provide the basis for developing a true CAD system, similar to the CAD for mammography that gives an intellectual impression.
In summary, our study demonstrated that it is possible to build a quantitative diagnostic model for diagnosing mass-type lesions with a high sensitivity (0.97) and a reasonable specificity (0.80). However, further improvement is needed for diagnosis of lesions that present as non-mass-like enhancement. In this study we have shown that the texture features may be used to characterise the internal enhancement pattern, and to build a diagnostic model. However, the performance was inferior compared with the diagnosis of masses. Other shape-based analysis not relying on the precise boundary of the lesion, for example based on symmetrical analysis with respect to the contralateral breast tissue enhancements, may be developed to evaluate the distribution pattern; also MRS or other adjunct imaging techniques may provide additional helpful information.
This work was supported in part by NIH/NCI CA90437, CA121568 and CBCRP 9WB-0020, 14GB-0148, as well as a grant from the National Science Council in Taiwan, NSC-97-2314-B-039-031.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.