Introduction

Phyllodes tumors (PTs) are rare fibroepithelial tumors of the breast that are classified as benign, borderline or malignant according to the characteristics of their stromal components [1]. Each of these three grades is considered to have very different biological behavior. Surgery is the fundamental treatment for PTs, and the surgical approach is commonly selected based on the histologic grade of the tumor [2,3,4]. In the clinic, benign PTs (BPTs) are usually treated with local excision, while extensive excision, including mastectomy, is necessary to reduce recurrence for borderline and malignant PTs (BMPTs) [5, 6]. Therefore, an accurate preoperative diagnosis and grading are conducive to the selection of the surgical procedure. Biopsy is the basis of an accurate preoperative diagnosis for breast diseases. However, due to the complex composition and marked heterogeneity of PTs, it is difficult to obtain representative tissue by percutaneous needle biopsy alone, which can lead to low accuracy in the pathological diagnosis [7].

Magnetic resonance imaging (MRI), with its high sensitivity and relatively high specificity, has become an important imaging method in the diagnosis of breast diseases. However, previous studies have found that the conventional MRI findings of BPTs and BMPTs overlap [8, 9], and it is very difficult for physicians to distinguish them subjectively when obvious morphological features are absent. Current studies on the ability of functional MRI parameters, such as the apparent diffusion coefficient (ADC) value, to grade PTs are contradictory [10, 11], and dynamic contrast-enhanced MRI (DCE-MRI) findings, such as the enhancement pattern or the time-signal intensity curve (TIC), are not of great value in predicting the histologic grade of PTs [11, 12]. In addition, MR spectroscopy, which can reflect tumor metabolites, has been unable to conclusively distinguish benign from borderline or malignant PTs to date [8]. Therefore, it is important to improve the diagnostic performance of MRI by changing the existing image analysis methods.

Texture analysis (TA) is a method used for the quantitative analysis of image grayscale distribution features and the relationship between pixels and spatial features. Compared with conventional imaging methods, TA can provide objective, additional quantitative image information on lesions that is independent of the subjective judgment and experience of clinicians or radiologists, adding potential clinical value [13]. Recently, computer-aided TA has been used for diagnosis, treatment response assessment and prognostic evaluation in cancer patients [14]. However, few studies have used conventional MRI TA to grade PTs. The purpose of this study was to determine the diagnostic performance of conventional MRI TA in differentiating between BPTs and BMPTs.

Methods

Patients

We retrospectively reviewed the MRI data of 51 patients with surgically proven primary PTs admitted to our hospital between January 2013 and March 2020. Of these, 44 patients were enrolled in this study. The exclusion criteria were (1) MRI images of poor quality; (2) implants in one or both breasts; and (3) MRI images acquired after surgery, chemotherapy or radiotherapy. The patient ages ranged from 31 to 75 years (mean 48.55 ± 10.75 years). Based on the pathological results, there were 25 cases of BPTs, 16 cases of borderline PTs, and 3 cases of malignant PTs. The patient flowchart is illustrated in Fig. 1.

Fig. 1 Flowchart for the case accrual process

Imaging protocol

All patients were examined with a 1.5 T MRI scanner (Siemens Magnetom Aera, Germany) in the standard prone position using an 8-channel dedicated breast coil. Axial T1WI (SE, TR = 8.6 ms, TE = 4.7 ms) and fat-suppressed T2WI (FSE, TR = 5600 ms, TE = 57 ms) were obtained. DWI was performed with a spin echo-echo planar imaging (SE-EPI) sequence with two b values (0 and 1000 s/mm²) in 3 orthogonal directions. The imaging parameters were as follows: TR = 3300 ms, TE = 94 ms, flip angle = 90°, slice thickness = 5 mm, matrix = 128 × 128, and FOV = 320 mm × 320 mm. Following DWI, DCE-MRI was performed with a 3D fat-suppressed T1 fast-field echo sequence (TR = 4.62 ms, TE = 1.75 ms, slice thickness = 1.5 mm, interslice gap = 0, FOV = 360 mm × 360 mm, and matrix = 384 × 320) before and five times after the injection of 0.1 mmol/kg gadopentetate dimeglumine (Omniscan, GE Healthcare, Ireland). Subsequently, 7-phase DCE-T1W images were acquired.
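For readers unfamiliar with how an ADC image relates to a two-b-value DWI acquisition, the following is a minimal sketch under the commonly used mono-exponential model, S_b = S_0 · exp(−b · ADC). The protocol above does not state how the ADC maps were generated (they are typically produced by the scanner software), so the function name `adc_map`, the `eps` clamp and the mono-exponential model itself are illustrative assumptions rather than the procedure used in this study.

```python
import numpy as np

def adc_map(s0, sb, b0=0.0, b=1000.0, eps=1e-6):
    """Voxel-wise ADC (mm^2/s) from two DWI signals (hypothetical helper).

    Assumes the mono-exponential model S_b = S_0 * exp(-(b - b0) * ADC),
    with b values given in s/mm^2.
    """
    s0 = np.asarray(s0, dtype=float)
    sb = np.asarray(sb, dtype=float)
    # Clamp signals away from zero so the logarithm stays finite in background voxels.
    ratio = np.maximum(s0, eps) / np.maximum(sb, eps)
    return np.log(ratio) / (b - b0)

# Synthetic example: two voxels whose true ADC is 1.0e-3 mm^2/s.
s0 = np.array([1000.0, 500.0])
sb = s0 * np.exp(-1000.0 * 1.0e-3)
adc = adc_map(s0, sb)
```

Under this model, stronger signal loss between b = 0 and b = 1000 s/mm² gives a larger ADC, which is why low ADC values are usually read as restricted diffusion.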

Imaging analysis

All FS-T2WI, DWI, ADC, DCE-T1WI2min and DCE-T1WI7min tumor images were exported in DICOM format from the PACS system and then imported into RadiAnt software (https://www.radiantviewer.com/) to render them in BMP format with identical window widths and window levels. ImageJ software (https://rsb.info.nih.gov/ij/) was used for image TA (Fig. 1), and all the data were analyzed separately by two radiologists (CL.Z. and XG.L., with 10 and 15 years of experience in breast imaging, respectively). The regions of interest (ROIs) were extracted as follows: ROIs were placed on the image containing the maximal tumor area for all MR images and included necrotic, cystic, and hemorrhagic areas. The tumor solid region was delineated on DCE-T1WI7min (Fig. 2). Finally, the gray-level histogram and gray-level co-occurrence matrix (GLCM) parameters of all the ROIs were automatically measured by the software [15]. Definitions and formulas for the histogram and GLCM parameters are shown in Table 1. The GLCM is a spatial domain statistical technique that calculates second- and higher-order statistics from the number of paired (i, j) occurrences in which a pixel of gray level i is separated from a pixel of gray level j by a distance (d) along a direction (θ) [16]. In this study, the relationship between the pixels of the GLCMs was set with d = 1 and θ = 0° [17, 18]. The final histogram and GLCM parameter values of each lesion were the mean of the results measured by the two radiologists.
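The GLCM construction with d = 1 and θ = 0° can be illustrated with a short sketch. This is not the ImageJ implementation used in the study: the function name `glcm_features`, the rectangular input patch, the pre-quantized gray levels, the symmetric counting of pairs and the base-2 logarithm for entropy are all assumptions made for illustration, and the definitions actually used are those given in Table 1.

```python
import numpy as np

def glcm_features(image, levels=8):
    """ASM, contrast, correlation and entropy from a GLCM with d = 1, theta = 0 degrees.

    `image` is a 2D array of integer gray levels in [0, levels) (hypothetical input).
    """
    img = np.asarray(image, dtype=np.int64)
    if img.min() < 0 or img.max() >= levels:
        raise ValueError("gray levels must lie in [0, levels)")
    # d = 1, theta = 0: pair every pixel with its right-hand neighbor.
    left = img[:, :-1].ravel()
    right = img[:, 1:].ravel()
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (left, right), 1.0)
    counts += counts.T           # assumption: count (i, j) and (j, i) symmetrically
    p = counts / counts.sum()    # normalize counts to joint probabilities
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * p).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * p).sum())
    nonzero = p[p > 0]
    if sd_i > 0 and sd_j > 0:
        correlation = ((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j)
    else:
        correlation = float("nan")  # undefined for a perfectly uniform patch
    return {
        "asm": (p ** 2).sum(),                 # high for homogeneous patches
        "contrast": ((i - j) ** 2 * p).sum(),  # high for large local gray-level jumps
        "correlation": correlation,
        "entropy": -(nonzero * np.log2(nonzero)).sum(),
    }

flat = glcm_features(np.zeros((4, 4), dtype=int), levels=2)
stripes = glcm_features([[0, 1, 0, 1], [0, 1, 0, 1]], levels=2)
```

In this toy comparison the uniform patch yields the maximal ASM with zero contrast and entropy, whereas the striped patch has lower ASM and higher contrast and entropy, which is the direction of difference the study looks for between more and less homogeneous lesions.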

Fig. 2 Schematic diagram of the ROI. ImageJ software was used to select the slice with the maximum tumor area on FS-T2WI (a) and to manually delineate the tumor boundary as closely as possible; the ROI was then automatically copied to the DWI (b), ADC image (c), DCE-T1WI2min (d), and DCE-T1WI7min (e) of the same tumor slice. Note that the red part in the lower right corner of the DCE-T1WI7min (e) represents the solid tumor region

Table 1 Representative gray-level histogram and gray-level co-occurrence matrix texture features

Statistical analysis

Statistical analyses were performed using IBM SPSS version 21.0 (IBM Corporation, New York). Kolmogorov-Smirnov and Levene tests were used to assess the normality and homogeneity of variance, respectively, of all measurement data. The independent-samples t-test and the Mann-Whitney U test were used for data with normal and nonnormal distributions, respectively. Bonferroni’s correction was used to adjust p values for multiple parameter comparisons [19]. For texture parameters with significant differences, logistic regression analysis was performed for multiparameter joint analysis with the group as the dependent variable, and the predicted values of the model were used to draw the receiver operating characteristic (ROC) curve. The efficacy (sensitivity, specificity, 95% confidence interval and p value) of each individual texture parameter and of the combined parameters in distinguishing the two groups was determined using the parameter value that maximized the Youden index (sensitivity + specificity − 1) as the threshold. p < 0.05 indicated a statistically significant difference.
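The threshold selection and multiple-comparison adjustment described above can be sketched as follows. The analyses in this study were run in SPSS; the helper names `bonferroni`, `youden_threshold` and `auc`, the convention that scores at or above the threshold are called positive, and the toy data are all illustrative assumptions.

```python
import numpy as np

def bonferroni(p_values):
    """Bonferroni-adjusted p values: multiply by the number of tests, cap at 1."""
    p = np.asarray(p_values, dtype=float)
    return np.minimum(p * p.size, 1.0)

def youden_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximizing sens + spec - 1.

    Assumption: a case is called positive when its score is >= threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best = None
    for t in np.unique(scores):          # candidate thresholds in ascending order
        pred = scores >= t
        sens = np.mean(pred[labels])     # true-positive rate
        spec = np.mean(~pred[~labels])   # true-negative rate
        j = sens + spec - 1.0
        if best is None or j > best[0]:  # ties keep the lowest threshold
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]

def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    diff = scores[labels][:, None] - scores[~labels][None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

# Toy data (not from the study): six cases, label 1 = positive group.
toy_scores = [1, 2, 3, 4, 5, 6]
toy_labels = [0, 0, 1, 0, 1, 1]
threshold, sensitivity, specificity = youden_threshold(toy_scores, toy_labels)
toy_auc = auc(toy_scores, toy_labels)
adjusted = bonferroni([0.01, 0.04, 0.5])
```

When several thresholds share the same Youden index, this sketch keeps the lowest one, which favors sensitivity; other tie-breaking rules are equally defensible.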

Results

Comparisons of texture parameters of different images between the BPT and BMPT groups

As illustrated in Tables 2 and 3, for FS-T2WI, the GLCM texture parameters ASM and entropy were significantly different between the two groups (both p < 0.05). However, no histogram parameters showed significant intergroup differences. For ADC images, the GLCM parameters ASM, correlation, contrast and entropy and the histogram parameter ADCMinimum showed significant differences (all p < 0.05). For DWI and DCE-T1WI2min, none of the histogram or GLCM parameters showed significant differences (all p > 0.05). For DCE-T1WI7min, none of the histogram or GLCM parameters of the overall tumor region showed significant differences. The maximum gray value and kurtosis of the tumor solid region showed significant differences (both p < 0.05); however, no GLCM parameters showed significant differences for this region of the tumor (all p > 0.05).

Table 2 Comparisons of histogram parameters of ADC images, FS-T2WI, DWI, DCE-T1WI2min and DCE-T1WI7min between the two groups, respectively
Table 3 Comparisons of GLCM parameters of ADC images, FS-T2WI, DWI, DCE-T1WI2min and DCE-T1WI7min between the two groups, respectively

Diagnostic efficacy of MRI texture analysis in differentiating between BPTs and BMPTs

The parameters with significant differences between the two groups were further analyzed by ROC curve analysis. The parameters with an area under the ROC curve (AUC) > 0.75 were ADCASM, ADCContrast, ADCCorrelation, ADCEntropy, FS-T2WIEntropy, and the kurtosis of DCE-T1WI7min (DCE-T1WI7min-Kurtosis) of the tumor solid region. Among them, ADCContrast had the highest differential diagnostic efficiency, with an AUC of 0.815, a sensitivity of 84.2% and a specificity of 76.0%. Binary logistic regression analysis revealed that both ADCContrast and FS-T2WIEntropy showed significant differences between the two groups (p < 0.05) and were thus regarded as independent variables. The following regression equation (RE) was then obtained: P = -13.616 + 0.067 × ADCContrast + 1.341 × FS-T2WIEntropy. The ROC curve of the combined texture parameters from the logistic regression model was plotted, and its discriminative performance was better than that of each individual texture parameter. The AUC was 0.891 (95% CI: 0.793–0.988), with a sensitivity of 84.2% and a specificity of up to 89.0% (Table 4, Fig. 3).
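To make the use of the regression equation concrete, the sketch below evaluates it for a pair of texture values. Two assumptions are made that the text does not state explicitly: that the right-hand side of the equation is the linear predictor (logit) of the logistic model, so that a probability is obtained through the logistic function, and that the modeled outcome is membership in the BMPT group (consistent with both coefficients being positive and both parameters being higher in BMPTs). The input values in the example are hypothetical.

```python
import math

# Coefficients copied from the regression equation reported above.
INTERCEPT = -13.616
COEF_ADC_CONTRAST = 0.067
COEF_T2_ENTROPY = 1.341

def linear_predictor(adc_contrast, t2_entropy):
    """Right-hand side of the reported regression equation."""
    return (INTERCEPT
            + COEF_ADC_CONTRAST * adc_contrast
            + COEF_T2_ENTROPY * t2_entropy)

def predicted_probability(adc_contrast, t2_entropy):
    """Assumption: the equation is a logit, mapped to a probability by the logistic function."""
    z = linear_predictor(adc_contrast, t2_entropy)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical lesion: ADC contrast of 100 and FS-T2WI entropy of 5.
z_example = linear_predictor(100.0, 5.0)
p_example = predicted_probability(100.0, 5.0)
```

Because both coefficients are positive, increasing either ADCContrast or FS-T2WIEntropy raises the predicted value, and the ROC analysis of the combined model amounts to thresholding this value.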

Table 4 Receiver operating characteristic curve analysis for the positive texture variables between the BPTs and BMPTs
Fig. 3 ROC curves of independent variables and the combination of texture parameters from the logistic regression model for differentiating BPTs from BMPTs

Discussion

TA is a radiomics technique that can help reveal the potential heterogeneity within tumor lesions and provide quantitative and objective information on conventional MR images in the clinic [20]. First-order TA is performed through the gray-level histogram, which mainly describes the distribution of individual gray intensity values. Generally, the ROI is summarized by single values describing its gray-signal intensity: the mean value, maximum value, minimum value, skewness, and kurtosis. A higher gray value indicates a brighter ROI area. The ADC image histogram is the most popular method for analyzing MRI tumor histograms, and a series of parameters obtained from the ADC image histogram for retrospective analysis have good repeatability [21]. In our study, both the ADCMean and ADCMaximum gray values of BPTs were larger than those of BMPTs; however, the differences between the two groups were not significant, which is similar to the conclusion of the study by Guo et al. [12]. In their study, there was no significant difference in the ADC values between the BPT and BMPT groups for the mean ROI-w (the whole-tumor ROI) values or for the 10th, 25th, 50th and 75th percentile values from the ROI-w histogram.

Previous studies have found that the minimum ADC value has the best accuracy in differentiating between malignant and benign breast masses [22, 23]. In this study, we found that the ADCMinimum gray value of BPTs was significantly higher than that of BMPTs, which suggests that the ADCMinimum gray value can better display areas with higher cellular density than the ADCMean gray value. The mean ADC value based on conventional hot spot ROIs or the histogram ROI only represents the average level of the data, which might be of limited value for PTs with considerable heterogeneity. However, it should be emphasized that the ADCMinimum gray value may be more susceptible to outliers from noise, artifacts, adjacent structures and the partial volume effect; therefore, great care should be taken when delineating ROIs [24]. Kurtosis and skewness are statistics reflecting the shape of the distribution of the image gray values. The greater the kurtosis, the more peaked the distribution is compared with the normal distribution; the greater the absolute value of skewness, the more asymmetric the distribution is [25, 26]. In our study, neither of these two parameters was able to distinguish BPTs from BMPTs, indicating that the shape of the ADC gray histogram is of little value in distinguishing the two groups.
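The first-order histogram parameters discussed here can be computed directly from the gray values inside an ROI. The sketch below is illustrative only: the function name `histogram_features` is hypothetical, the moments are population (not sample-corrected) moments, and kurtosis is reported as excess kurtosis (0 for a normal distribution), which may differ from the convention of the software used in the study.

```python
import numpy as np

def histogram_features(roi_values):
    """Mean, minimum, maximum, skewness and excess kurtosis of ROI gray values."""
    x = np.asarray(roi_values, dtype=float).ravel()
    mean = x.mean()
    sd = x.std()  # population standard deviation
    if sd == 0:
        skewness = kurtosis = float("nan")  # undefined for a constant ROI
    else:
        z = (x - mean) / sd
        skewness = (z ** 3).mean()        # 0 for a symmetric distribution
        kurtosis = (z ** 4).mean() - 3.0  # assumption: excess kurtosis
    return {
        "mean": mean,
        "minimum": x.min(),
        "maximum": x.max(),
        "skewness": skewness,
        "kurtosis": kurtosis,
    }

symmetric = histogram_features([1, 2, 3, 4, 5])
right_tailed = histogram_features([0, 0, 0, 10])
```

The minimum and maximum depend on single pixels, which is why they are more sensitive to noise and ROI placement than the mean, as noted above for the ADCMinimum gray value.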

The GLCM is one of the important methods used in second-order TA. The GLCM describes the spatial relationship between pixels by analyzing how often pairs of gray levels occur at a given spatial offset [27]. The texture parameter ASM reflects homogeneity; its value is high when the image is perfectly homogeneous or when the pixel intensities are very similar. Correlation reflects the linear dependency of the gray levels of neighboring pixels, and high values can be obtained for regions of similar gray levels [28]. In this study, the GLCM derived from ADC images showed that the ASM and correlation values of BPTs were significantly higher than those of BMPTs. This indicates that BPTs have a more uniform gray distribution and stronger texture regularity than BMPTs on ADC images. Contrast reflects the amount of local gray-level variation in an image, where a high contrast value indicates the presence of noise or a wrinkled texture. The increased contrast values of BMPTs suggest more noise or wrinkled textures in these lesions, which may be associated with local heterogeneity in signal intensity. Entropy measures the randomness of the gray-level distribution, that is, the amount of information needed to encode the image. A higher entropy value indicates a less predictable and more complex image texture [29]. In this study, BMPTs had a higher entropy value than BPTs, suggesting that BMPTs have greater textural complexity and heterogeneity.

FS-T2WI is one of the more important sequences for MRI TA, which may be related to the relatively long echo time (TE) of the sequence, which increases the contrast between tissues and makes the image contain more texture features of diagnostic significance [30]. In this study, we did not find any FS-T2WI gray histogram parameter that could distinguish BPTs from BMPTs. However, we did find significant differences in the GLCM parameters ASM and entropy between the two groups. The entropy of BPTs was significantly lower than that of BMPTs, indicating that the FS-T2WI texture of BMPTs is more complex and heterogeneous than that of BPTs. This may be related to the fact that BMPTs are more prone to heterologous metaplasia, which further complicates their internal composition. To some extent, these heterogeneous structures can also explain why the ASM of BMPTs was significantly lower than that of BPTs. It is worth mentioning that in empirical image interpretation, malignant PTs tend to be regarded as showing more obvious diffusion restriction and higher signal on DWI than benign PTs. A previous study showed that the accuracy of DWI in characterizing breast lesions on 1.5-T MRI was best when b values of 0 and 1000 s/mm² were used [31]. Therefore, we attempted to verify whether TA of high b value (b = 1000 s/mm²) DWI could help differentiate between BPTs and BMPTs. However, the results were disappointing: none of the histogram or GLCM parameters could differentiate between BPTs and BMPTs. We conjecture that this might be related to the nature of high b value DWI, in which the signal-to-noise ratio (SNR) is reduced, along with the image information.

Of the 7 phases of DCE-MRI scanning performed, we selected DCE-T1WI2min and DCE-T1WI7min for study, mainly because at 2 min the contrast agent had just entered the tumor and the texture contrast was substantial; furthermore, at 7 min, all components of the tumor showed marked contrast with the surrounding glandular tissue [32]. The results showed that the histogram parameters mean, minimum and maximum gray value of DCE-T1WI2min and DCE-T1WI7min (both solid and overall regions) were higher in the BPT group than in the BMPT group, but only the maximum gray value of the tumor solid region on DCE-T1WI7min showed a significant difference between the two groups after Bonferroni’s correction. This indicates a higher degree of enhancement of the tumor solid region in BPTs than in BMPTs. Additionally, the kurtosis of the tumor solid region in the BPT group was significantly lower than that in the BMPT group, which suggests a more uniform signal from the tumor solid region in BPTs on DCE-T1WI7min. Previous studies [33] have shown that the GLCM based on DCE-MRI can better reflect tumor heterogeneity, and texture differences may reflect the potential pathological subtypes of breast cancer. However, we found no significant difference in GLCM parameters between the BPT and BMPT groups, either in the overall tumor region on DCE-T1WI2min and DCE-T1WI7min or in the tumor solid region on DCE-T1WI7min.

In this study, although multiple texture parameters differed significantly between BPTs and BMPTs, the ROC curves showed that the GLCM parameters derived from ADC images had better diagnostic performance; among them, contrast had an AUC of more than 0.8, with the highest sensitivity (84.2%) and specificity (76.0%). Furthermore, binary logistic regression analysis showed that the texture parameters ADCContrast and FS-T2WIEntropy were independent variables in differentiating BPTs from BMPTs. ROC curve analysis showed that the combination of these two texture parameters had excellent diagnostic efficiency, with an AUC of 0.891, a sensitivity of 84.2% and a specificity of 89.0%, which was better than the diagnostic efficiency of individual sequences and single parameters.

There are several limitations to our study that deserve discussion. First, the sample size was inevitably small, especially for malignant PTs. Second, this was a single-center retrospective study with no external data validation, and in particular, differences in MRI scanning protocols between studies may lead to deviations in the image data. Third, only DWI with b = 1000 s/mm² was used for texture analysis, and DWI with other b values should be examined in the future. Fourth, our study was performed using a 1.5 T MRI system for acquiring DWI, and the possibility of differing results at a higher magnetic field strength (3 T) cannot be excluded. Fifth, we analyzed only the two-dimensional features of the maximum cross-section of the tumor and did not obtain the three-dimensional features of the whole tumor. Sixth, only first-order histogram and second-order GLCM parameters were used for the differential diagnosis of PTs; whether higher-order texture parameters, such as the run-length matrix (RLM) and absolute gradient matrix (ARM), are helpful in the identification of BPTs and BMPTs is worth further discussion.

Conclusion

This study conducted a texture analysis based on FS-T2WI, DWI (b = 1000 s/mm²), ADC images, DCE-T1WI2min and DCE-T1WI7min to explore their diagnostic value in the preoperative classification of PTs. The results showed that the texture parameters that could aid in the differentiation between BPTs and BMPTs were mainly derived from the GLCM analysis of ADC images, among which ADCContrast had the highest differential efficacy. In addition, we found that combined multiparameter TA from multiple images could greatly improve the efficiency of the identification of BPTs and BMPTs. Thus, MRI texture analysis may be used as an image-processing tool that is worthy of further evaluation in the differential diagnosis of BPTs and BMPTs.