Background

Breast cancer has become the most prevalent cancer worldwide with an estimated 2.3 million new cases in 2020 [1], the most common route of breast cancer metastasis is through the axillary lymphatic network. Therefore the status of the axillary lymph nodes (ALN) plays an important role in tumour staging, postoperative therapy and tumour prognosis. An accurate assessment of the ALN burden before surgery can avoid unnecessary axillary surgery, consequently preventing surgical complications such as lymphedema, sensory abnormalities, and limitation of upper limb movement [2, 3]. The possibility of exempting axillary surgery in early breast cancer have been widely explored in several clinical trials [4,5,6]. However, with the limited randomized, multicenter clinical trials and strict inclusion criteria, proper selection of axillary surgeries for patients who fail to meet the criteria has become a priority of many clinicians. Previous studies [7, 8] have attempted to develop models to assess the ALN burden individually to facilitate clinical decision making.

Ultrasonography (US), mammography and magnetic resonance imaging (MRI) are three commonly used imaging modalities for detecting breast cancer, with all techniques relying on the anatomic differences between cancer and normal breast parenchyma [9]. Mammography is the golden standard in cancer screening. However, it remains limited as it only provides information about the anterior axilla when assess the ALN status and its sensitivity is significantly reduced in women with dense breast tissue [10]. MRI shows its superiority in terms of sensitivity and specificity for breast cancer, but some patients are unable to undergo MRI evaluation due to implantable devices, body habitus, renal insufficiency, and claustrophobia. US has the advantage of being convenient, radiation-free, inexpensive and non-invasive. However, in view of the variations between different operators, US has a high false positive rate which limits the clinical application. Therefore, supplementary methods of breast screening must be found. Breast-specific gamma imaging (BSGI) is a recently emerging molecular breast imaging modality that can detect breast cancers at the sub-centimetre level and at various breast tissue densities, with a sensitivity estimate of 90–96% [11, 12]. In BSGI, a radiotracer such as Technetium-99 m Sestamibi is injected into the patient’s bloodstream and the breast is visualized using a special camera. The radiotracer uptake is commensurate with blood flow and mitochondrial activity within tumour cells [13], which enables us to diagnose breast cancer by distinguishing the biological behaviour between tumour cells and normal cells. In addition, The BSGI presents an excellent specificity for the diagnosis of ALN metastasis [14]. This prompted us to wonder the possibility of incorporating BSGI, a modality with accuracy comparable to MRI, to improve the accuracy of ALN status prediction. We reviewed the previous literature and found none exploring this possibility yet.

Machine learning (ML) is an emerging tool for cancer prediction and prognosis that is making significant contributions in different cancer fields [15, 16]. It is a learning process capable of providing excellent accuracy through a continuous mechanical learning approach using techniques like decision trees (DTs), artificial neural networks (ANN), and support vector machines (SVM), ultimately developing prediction tools which in some cases outperform traditional statistical modeling.

Thus, the aim of our study was to employ ML-based statistical methods to select variables from non-invasive preoperative modalities, such as BSGI and US, ultimately establish a prediction model to assess the ALN burden, which can guide clinicians for better choice of cancer treatment options for different patients.

Methods

Patients

The clinical data of patients who underwent surgery between January 2012 and May 2021 at Zhongshan Hospital (an affiliate of Fudan University) were collected consecutively and analyzed retrospectively. The inclusion criteria were: 1. Patients has received preoperative BSGI as well as US; 2. The patient has received both breast tumour resection and axillary surgery in our hospital; 3. Postoperative pathologically confirmed diagnosis of breast cancer without neoadjuvant therapy; 4. No history of other tumours. Normal breast and lymph node ultrasounds imagines were excluded. The ethical approval of this study was granted by ethics committee of Zhongshan Hospital. All methods were carried out in accordance with relevant guidelines and regulations. Zhongshan Hospital Ethics Committee waived the need of informed consent from patients since it was a retrospective study. The working flow of our study was showed in Fig. 1.

Fig. 1
figure 1

Working flow of this study

BSGI and ultrasonographic results

Patients underwent BSGI (Dilon 6800; Dilon Technologies) at high-resolution and a small field-of-view. Imaging was performed 10–15 min after the intravenous administration of 740 MBq Technetium-99 m Sestamibi (GMS Pharmaceutical Co. Ltd) through an antecubital vein contralateral to the suspicious breast side. Craniocaudal (CC) and mediolateral oblique (MLO) images of both breasts were obtained. A low-energy general-purpose collimator was used, with a photopeak focused at 140 keV with a symmetric 10% window. The acquisition time was approximately 6 min per image and a value of 100 000 counts per image was defined as the minimum range.

Two experienced nuclear medicine physicians analyzed the images and were blinded to the patients’ clinical information and pathological results. According to the 2010 guidelines of the Breast Imaging Reporting and Data System (BIRADS) of the Society and Nuclear Medicine and Molecular Imaging [17], lesions with homogeneous and small patchy uptake were considered to be negative, lesions with patchy uptake, mild focal uptake and definite focal uptake were considered to be positive. The tumour-to-normal lesion ratio (TNR) was calculated by dividing the maximal pixel counts of the tumor lesion by that of normal background breast tissue on both CC and MLO view. A positive axillary mass was considered to be patchy, mild focal and definite focal uptake of Technetium-99 m Sestamibi in axilla. An example of BSGI imagines analysis was showed in Fig. 2.

Fig. 2
figure 2

An example of breast-specific gamma imaging analysis. (A) showed a left-sided breast cancer without axillary lymph node metastasis, the yellow rectangle showed the uptake of Technetium-99 m Sestamibi in breast. (B) showed a left-sided breast cancer with axillary lymph node metastasis, the red circle showed a positive axillary mass

The ultrasonographic images were obtained using an HDI 5000 scanner (Philips Medical Systems) and analyzed by 2 experienced operators. According to Adler’s method [18], the degree of blood flow signal within the breast carcinomas and axillary lymph nodes was subjectively classified into 1 of 4 levels: absent (grade 0), minimal (grade 1), moderate (grade 2), or abundant (grade 3). The echogenicity was classified as cystic (no echo), hypoechoic, isoechoic, hyperechoic and mixed echoic. When a mass showed echogenicity minimally less than that of subcutaneous fat, it was defined as hypoechoic. An example of ultrasonographic imagines analysis was showed in Fig. 3.

Fig. 3
figure 3

An example of ultrasonographic lymph node imagines analysis. (A) showed an ultrasound image of the right axilla exhibiting no signs of lymph node metastasis. (B) showed a color Doppler flow imaging (CDFI) image of the right axilla, indicating the absence of lymph node metastasis. (C) showed an ultrasound image of the left axilla showcasing lymph node metastasis, featuring a 24.0*16.0 mm mass identified as a lymph node. (D) showed a CDFI ultrasound image of lymph node metastasis in the left axilla, revealing discernible colored blood flow within the affected lymph node

Data collection

The collected clinical and medical information of patients included the patients’ gender, age, breast tumour location, BSGI features (tumour TNR and axillary mass status), postoperative pathological features (estrogen receptor (ER) status, proliferation index (Ki-67), progesterone receptor (PR) status, Her-2 overexpression, lymphovascular invasion (LVI), Scarff-Bloom-Richardson (SBR) grade, T stage, N stage, infiltration depth, histologic type, molecular subtype, and multifocality) and ultrasonographic parameters of tumour and axillary lymph nodes (sizes, echogenicity, margin, lymph node hilum status, color Doppler flow imaging (CDFI) grade, and resistance index (RI)).

Statistical analysis

Clinical and pathological variables associated with the risk of lymph node metastasis were assessed on the basis of their clinical importance and predictors identified in previously published articles [19, 20]. Categorical variables, such as patients’ gender, tumor location, axillary mass status (BSGI), tumor echogenicity, tumor margin, tumor CDFI, lymphatic echogenicity, absence of lymph node hilum, lymphatic CDFI, infiltration depth, histologic type, SBR grade, ER status, PR status, Ki-67, Her-2 overexpression, molecular subtype, LVI, multifocality, T stage, and N stage, were reported as integers and proportions. On the other hand, continuous variables, including patients’ age, tumor TNR (CC), tumor TNR (MLO), transverse diameter of tumor, longitudinal diameter of tumor, tumor RI, transverse diameter of lymph nodes, and longitudinal diameter of lymph nodes, were reported as means with standard deviations. The association between clinicopathological characteristics and ALN status was analyzed using X2 test or t-test as appropriate. Collinearity for all explanatory variables were assessed using correlation matrix and plausible interaction terms were also tested. To relax the assumption of a linear relationship between continuous predictors and the risk of ALNs metastasis, continuous variables (such as the patients’ age, tumor TNR (CC), tumor TNR (MLO), transverse diameter of tumor, longitudinal diameter of tumor, tumor RI, transverse diameter of lymph nodes, and longitudinal diameter of lymph nodespatient) were converted into categorical variables after valuation using restricted cubic splines (RCS) [21]. Patients were randomly sampling into the training and test sets by ratio 7 : 3. Six machine learning (ML) methods were trained in, including generalized linear model (GLM), random forest (RF), support vector machines (SVM), neural network (NNET), gradient boosting machine (GBM), extreme boosting machine (XGB) [22,23,24]. To select the strongest predictive variables, recursive feature elimination (REF) was used for each algorithm. To avoid overfitting in training, the best hyper-parameter for ML models was 10-fold cross-validation. Comparison of ML methods performance, the best classification model was select. Based on the best-performing model, we created an online calculator that can make predictor in patients easily accessible to clinicians. Finally, the receiver operating characteristic (ROC) analysis was used to assess the role of BSGI feature in this prediction model [25]. All statistical analyses were determined using the R software (version 3.6.3, http://www.r-project.org). The R packages “caret”, “rms”, “glmnet”, “randomForest”, ‘nnet’, “e1071”, “kernlab”, “pROC”, “gbm”, and “xgboost” were used. The “shiny” package was used for web application. We used t-test for continuous variables and chi-square test for classified variables. A two sided P < 0.05 was considered statistically significant.

Results

Demographic and clinicopathological characteristics

A total of 1672 ultrasound images and 1336 BSGI images from 334 patients were screened between January 2012 to May 2021. Patients were grouped into two groups according to the presence/absence of axillary lymph metastasis. The clinicopathological characteristics between the metastasis and non-metastasis patients were shown in Table 1. The median patient age was 58 years, and 52.1% patients’ tumour located on upper-outer quadrant. Patients showed no difference in gender, age, location, tumour echogenicity, tumour margin, tumour CDFI grade, lymph node hilum status, histological type, multifocality, ER, PR, or Her-2 status. However, a higher value of tumour TNR either on CC or MLO view, a positive axillary mass on BSGI, a lower lymphatic echogenicity, a longer transverse or longitudinal diameter of tumour or lymph node, a higher CDFI grade of lymph node, a higher SBR grade, a higher proliferation index (Ki-67), a deeper infiltration, Her-2 positive subtype and the presence of tumour LVI each showed a higher likelihood for ALN metastasis (P < 0.05).

Table 1 Differences of clinicopathological characteristics between the patients with and without axillary lymph metastasis

Prediction model and factors selection

Patients were randomly divided into the training and test groups (group ratio 7 : 3). The training set comprised 1672 ultrasound images and 940 BSGI images sourced from 235 patients. Conversely, the test set consisted of 568 ultrasound images and 396 BSGI images collected from 99 patients. All explanatory variables were turned into categorical form and the cutoffs of continuous variables after RCS processing. All pathological characteristics were excluded since we intended to build a non-invasive model. No statistical difference of variables between training set and test set was found in Table S1. We use six ML methods and combine them with REF to select the optimal combination of variables within each algorithm (Figure S1). The optimal sets of variables for each algorithm in the training set were selected, these variables sets were then passed into each ML method to tune and validated the model in the test set. In Fig. 4A; Table 2, the model of SVM showed best predictive ability in the test set (AUC = 0.794, sensitivity = 0.641, specificity = 0.8, PPV = 0.676, NPV = 0.774 and accuracy = 0.737).

Table 2 Predictive performance comparison of different machine learning models in the test set

Relative importance of variables in machine learning algorithms

The relative weights of each optimal variables set in each model were shown in Fig. 4B-G. The best parameters of each model for their optimal variables were shown in Table S2.

Fig. 4
figure 4

Model selection. (A) The line graph shows the predictive values of each model in test set. (B-G) The relative weights of selected variables in each model. GLM: generalized linear model; RF: random forest; SVM: support vector machine; NNET: neural network; GBM: gradient boosting machine; XGB: extreme boosting machine; RI: resistance index; CDFI: the color Doppler flow imaging grade; TNR: tumour-to-normal lesion ratio; MLO: mediolateral oblique; BSGI: breast specific gamma image; US: Ultrasonography

Although obvious differences were shown in the importance of variables among those ML algorithms, factors including lymphatic CDFI grade, lymphatic echogenicity and BSGI axillary mass status rank top three without fail. However, if only these three variables were selected it would affect the classification results of the ML algorithms. The importance of high-ranking variables in the SVM model is arranged as follows in a descending order: lymphatic CDFI grade, lymphatic echogenicity, longitudinal diameter of lymph nodes, axillary mass status, transverse diameter of lymph nodes, longitudinal diameter of tumour, transverse diameter of tumour.

Web-based calculator

An online calculator based on the best-performing model was established for clinicians to predict patients’ risk of ALN metastasis by simply inputing preoperative clinicopathological variables (https://wuqian.shinyapps.io/shinybsgi/) (Fig. 5).

Fig. 5
figure 5

An online calculator for predcting ALN metastasis. SVM: support vector machine; ALN: axillary lymph node; CDFI: the CDFI grade of lymph node; BSGI: breast-specific gamma imaging

Clinical application evaluation of BSGI

In Fig. 6, a ROC analysis showed that using the final SVM model with BSGI features provided additional benefit from only ultrasonographic features. Thus, We believed the inclusion of BSGI was important to improve preoperative prediction of ALN metastasis in breast cancer patients, especially when lymph node metastases cannot be identified by ultrasonography. This can facilitate early clinical intervention and thus support personalized postoperative cancer rehabilitation.

Fig. 6
figure 6

Evaluation of the model effectiveness with and without the BSGI features by ROC curves. US: Ultrasonography; BSGI: breast-specific gamma imaging; AUC: area under the curve

Discussion

The shift toward less invasive local treatment for axillary sugery is inevitable with the increased use of systemic and radiation therapy. The NSABP b-32 [4] trial found that sentinel lymph nodes biopsy (SLNB) alone without further ALN dissection (ALND) is an appropriate therapy for the targeted patients and has become a routine surgical procedure. Although the incidence of complications after SLNB is relatively low compared to ALND [26], given that dissection of the mammary lymphatic network is unavoidable during SLNB, patients undergoing SLNB may similarity experience subsequent complications such as limited range of motion, lymphoedema, pain, and sensory defects [2, 3]. It prompted us to ponder whether we could exempt axillary surgery for patients with a particularly small probability of metastasis, whether we could develop a completely non-invasive model to predict the probability of ALN metastasis to improve the patients’ quality of life.

In this study, we attempted for the first time to combine BSGI features and ultrasonographic parameters to predict ALN metastasis, six different ML methods were analysed and compared to derive optimal solutions for the combination of variables, and the SVM model was found to have the best performance. Six ultrasonographic parameters (transverse diameter of tumour, longitudinal diameter of tumour, lymphatic echogenicity, transverse diameter of lymph nodes, longitudinal diameter of lymph nodes, lymphatic CDFI grade) and one BSGI features (axillary mass status) were eventually incorporated into the final model. The final AUC value obtained for the test set was 0.794, which showed a comparatively high prediction ability. While SVM exhibited the highest AUC values, we observed closely clustered AUC values among the six ML methods, ranging from 0.76 to 0.79. In a study by Zhu et al. [27], six ML methods were applied to predict postoperative central lymph node metastasis in T1-T2 thyroid cancers, yielding AUCs between 0.69 and 0.75. In another study by Wu et al. [28], seven ML methods were used to predict central lymph node metastasis in thyroid cancer, with AUCs ranging from 0.68 to 0.73. Although a slightly larger discrepancy in AUC values exists between these studies compared to ours, it is apparent that the variation in AUC values across all ML methods is not substantial. We attribute this lack of significant difference to the fact that both the training and validation cohorts were sourced from the same institution without external validation, a limitation we’ll address in our forthcoming sections.

An online calculator was created to facilitate individualized surgical treatment by calculating the risk for each patient of having a positive lymph node (https://wuqian.shinyapps.io/shinybsgi/). For instance, a women with the preoperative US showed a hyperechoic lymph node measuring 9*4 mm, with the CDFI grading absent, the tumour size of 14*10 mm, and the BSGI showed a negative mass in the axilla might be considered to have a approximately 16% risk of ALN metastasis, which implied that axillary surgery could be omitted clinically. Futhermore, the result in ROC analysis showed that the model could benefit from including BSGI features. To the best of our knowledge, no studies have incorporated BSGI features into the prediction model until now, however, the study result is limited and requires much more validation before it can be applied to clinical reasoning.

The advantage of BSGI lies in its combination of the physiological differentiation of the nuclear imaging and the anatomical differentiation of mammography. The semi-quantitative index TNR is traditionally applied in the diagnosis of breast cancer, with a value above 1.65 being considered to be highly suspicious [29], Studies have investigated the association between TNR and clinicopathological characteristics of breast cancer and found a positive correlation with the presence of ALN metastasis [30, 31], which is consistent with our study. In the comparison of the six ML approaches, we found that the tumour TNR was included in the NNET model, but unfortunately the SVM model was the final adoption. Nevertheless, there is no implication that the TNR value has no guidance in clinical practice. For the detection of metastatic ALN, studies showed the sensitivity of BSGI ranges between 67% and 100%, with an average of 81%, and the specificity between 64% and 100% [32]. A positive mass in the axilla indicates the possibility of ALN metastasis more graphically. Although obvious differences were shown in the importance of variables among six ML algorithms, axillary mass status held a relatively higher weight in all models.

US is a common preoperative imaging modality for breast cancer, previous studies have suggested that some US features may correlate with the status of ALN, such as cortical thickness of ALN, transverse diameter of ALN, and lymph node hilum status [8, 33]. In our study, both the transverse/longitudinal diameter of the tumour and the lymph node, the echogenicity of the lymph node, and the CDFI grade of the lymph node were all found to be statistically correlated with ALN status. The CDFI grade has been employed in the US to evaluate tumoral angiogenesis, which is a growing trend in breast cancer [34, 35]. Studies also showed that tumour vascularity was correlated with lymph node involvement [36], Chao [37] reported that ALN metastasis had a greater tendency to be present in carcinomas with neovascularisation based on an analyses of 368 patients. However, few researches have investigated the relationship between lymphatic CDFI grade and ALN burden, while our study obtained a positive correlation. Traditionally, hyperechogenicity of a mass are associated with benignity [38] and the same conclusion was reached in this study.

Compared with previous studies attempting to predict the risk of ALN metastases in breast cancer, our work has several strengths. First, few studies have included BSGI-related variables in the model, while a considerable number of studies have compared the specificity and sensitivity of BSGI with MRI and reached favourable conclusions, which confirms the rationality of BSGI as a preoperative modality. The ROC analysis in our study showed that using model with BSGI features provided additional benefit from only ultrasonographic features. Furthermore, all variables included were non-invasive, which might reduce the complications associated with invasive procedures such as SLNB. Finally, we applied ML approach and established an online calculator, which certainly provides the clinical decision making with greater convenience.

However, some limitations of the study were noted. Firstly, despite comparable accuracy to MRI, BSGI remains to be an uncommon modality owing to its high cost and radiation exposure, limiting the promotion of the model. What’s more, The relatively low sensitivity of the model reflected the inclusion of fewer patients with ALN metastases in the training data, a limitation that results in showing higher specificity in predicting the absence of ALN metastases. Finally, the model lacks valid external validation given that it was built and validated in the same institution and the nature of a retrospective study all might resulted in selection bias and a slight discrepancy in AUC.

Conclusions

Overall, the trend of de-escalation of axillary surgery is inevitable. In this study, we sought to develop a non-invasive preoperative prediction model to facilitate the subsequent clinical management of patients using ML approach, and for the first time incorporated BSGI-related variables into the model in the hope of provoking subsequent researchers to explore the wider possibilities of BSGI in the management of breast cancer. This study revealed more areas that need future research to validate our findings.