Introduction

Breast cancer is the most common malignancy and the leading cause of cancer-related death in women worldwide [1, 2]. Despite early detection and improved treatment, breast cancer still accounts for 15% of cancer-related deaths [3]. Treatment effects differ between patients and breast tumours are known to become therapy-resistant, necessitating new treatment modalities [4, 5].

It is increasingly clear that tumour-associated macrophages (TAMs) in the tumour microenvironment are involved in breast cancer development, progression and therapy response [6,7,8]. They may therefore become a target for therapy [9,10,11]. Macrophages can be characterized as classically activated anti-tumour M1-like macrophages and alternatively activated pro-tumour M2-like macrophages [12, 13]. TAMs in the breast carcinoma microenvironment predominantly display the M2-like phenotype. In preclinical models, they promote tumour growth, invasion, metastasis, angiogenesis and therapy resistance [14,15,16,17,18,19,20,21].

In an in silico analysis of publicly available gene expression profiles of 7270 primary tumours of patients with non-metastatic breast cancer (prior to any treatment), we previously found that a higher fraction of M0 macrophages was associated with shorter disease-free survival (DFS) and overall survival (OS) in oestrogen receptor (ER)-positive disease, while a higher fraction of M1 macrophages was associated with a higher pathological complete response rate and prolonged OS [22]. Assessing the ratio between TAM subset numbers in breast cancer subtypes is therefore likely of importance.

Unfortunately, the optimal immunohistochemical method for detecting TAM subsets is unknown, and the clinical implications of immunohistochemically defined subsets are unclear. Moreover, most studies to date do not distinguish between M1-like and M2-like TAMs, and studies assessing their ratio or comparing multiple M2-like macrophage markers across breast cancer subtypes are lacking [23, 24].

Manual counting of TAM subsets is labour intensive and prone to inter- and intra-observer variability. Digital image analysis (DIA) is an efficient method for quantifying macrophages and other immune cell types in breast cancer [25]. DIA may therefore aid standardized, objective quantitative TAM assessment.

In light of the above, we aimed to optimize the assessment of TAM subsets in breast cancer subtypes. To this end, we related TAM subset numbers and their ratios to clinicopathological characteristics and clinical outcome. We used DIA to quantify CD68 (pan-macrophage marker), CD163 (M2-like TAM marker), CSF-1R (colony-stimulating factor 1 receptor; M2-like TAM marker) and CD206 (M2-like TAM marker) positive cells in a large, well-characterized series of Luminal-A, Luminal-B, human epidermal growth factor receptor 2 (HER2)-positive and triple-negative breast cancers (TNBC).

Materials and methods

Patients

Consecutive resection specimens of HER2-positive, triple-negative and the first 200 ER-positive/HER2-negative primary, non-metastasized, breast carcinomas diagnosed in the University Medical Center Groningen (The Netherlands) between 2006 and 2017 were retrospectively collected. Samples of 57 patients were excluded, resulting in a study population of 347 patients with primary invasive breast carcinoma (Fig. 1). All tumours were reviewed for diagnosis and tumour grade on diagnostic haematoxylin and eosin (H&E)-stained slides by two of the authors (MZ and BvdV). Retrospective collection of clinicopathological characteristics and overall survival data from patient charts and the Personal Records Database was approved by the Local Ethics Review Board Pathology non-WMO studies (UMCG research register number 201900243, approved on 18-8-2020) and according to UMCG security guidelines, in line with Dutch law. Information was retrieved on age, treatment regimen, tumour size, lymph node status, lymphovascular invasion, DFS and OS. DFS was defined as the interval between date of diagnosis and date of local recurrence, regional recurrence, distant metastasis, second primary breast cancer or death by any cause. OS was defined as the interval between date of diagnosis and date of death by any cause.
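The DFS and OS definitions above can be illustrated with a minimal sketch. It assumes that DFS is measured to the earliest of the listed events, which the text does not state explicitly, and it does not model censoring at last follow-up. The function names and the example dates are hypothetical.

```python
from datetime import date
from typing import Iterable, Optional


def interval_days(diagnosis: date, event: date) -> int:
    """Number of days between the date of diagnosis and an event date."""
    return (event - diagnosis).days


def dfs_days(diagnosis: date, event_dates: Iterable[Optional[date]]) -> Optional[int]:
    """DFS interval: diagnosis to the earliest recorded DFS event.

    event_dates holds the dates of local recurrence, regional recurrence,
    distant metastasis, second primary breast cancer and death (None if
    the event did not occur). Taking the earliest event is an assumption.
    """
    observed = [d for d in event_dates if d is not None]
    if not observed:
        return None  # no event recorded; censoring is not modelled here
    return interval_days(diagnosis, min(observed))


def os_days(diagnosis: date, death: Optional[date]) -> Optional[int]:
    """OS interval: diagnosis to death by any cause, or None if alive."""
    return None if death is None else interval_days(diagnosis, death)


# Hypothetical patient: distant metastasis one year after diagnosis,
# death two years after diagnosis.
dx = date(2010, 1, 1)
print(dfs_days(dx, [None, None, date(2011, 1, 1), None, date(2012, 1, 1)]))
print(os_days(dx, date(2012, 1, 1)))
```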

Fig. 1 CONSORT flow diagram describing sample selection. After initial patient selection, 44 cases were excluded. After histological and immunohistochemical evaluation of the TMA sections, 13 cases were excluded

The specimens used in this study were obtained from redundant diagnostic material stored at the Department of Pathology. No objection to research on redundant tissue was recorded from these patients in the institutional record of objection.

Tissue microarray

Tissue microarrays (TMAs) were assembled using the formalin-fixed, paraffin-embedded tumour blocks. Three representative 0.6 mm tumour cores of each donor block were transferred into recipient paraffin blocks using a Manual Tissue Arrayer (Beecher Instruments, WI, USA) to account for tumour heterogeneity. In total, seven TMAs were constructed, each containing tumour samples from 42–71 patients and healthy control tissue samples. Serial sections of 3 μm were cut with a standard microtome.

Immunohistochemistry

Immunohistochemistry of ER (SP-1, Ventana), progesterone receptor (PR) (1E2, Ventana), HER2 (SP-3, Thermo Fisher Scientific), Ki67 (30-9, Ventana), CD68 (KP-1, Roche), CD163 (MRQ26, Ventana), CSF-1R (5c11, Sigma-Aldrich) and CD206 (SP211, Sigma-Aldrich) was performed. For ER, PR, HER2, Ki67, CD68 and CD163, antibodies were pre-diluted by the manufacturer and sections were stained on a Ventana Benchmark Ultra immunostainer (Ventana) according to the manufacturer’s protocols.

For CSF-1R and CD206, the following immunostaining protocol was used: sections were deparaffinized, rehydrated in a series of decreasing concentrations of alcohol and washed with demineralized water. Antigen retrieval was performed by heating the sections in a microwave for 15 min in 10 mM citrate buffer (pH 6.0). Endogenous peroxidase activity was blocked by incubating the sections for 30 min in 0.3% H2O2 in 50 ml phosphate buffered saline (PBS; 0.15 M NaCl, 8.0 mM Na2HPO4·2H2O, 1.5 mM KH2PO4; pH 7.4).

The primary antibodies were diluted (1:200) in PBS containing 1% bovine serum albumin (BSA) and incubated for 1 h. The secondary antibodies for CSF-1R (polyclonal goat anti-rabbit [GARPO], DAKO, 1:100 diluted in PBS containing 1% BSA and 1% AB-serum) and CD206 (polyclonal rabbit anti-mouse [RAMPO], DAKO, 1:100 diluted in PBS containing 1% BSA and 1% AB-serum) were incubated for 30 min, after which the tertiary antibodies for CSF-1R (polyclonal rabbit anti-goat [RAGPO], DAKO, 1:100 diluted in PBS containing 1% BSA and 1% AB-serum) and CD206 (GARPO) were incubated for 30 min. Visualization was performed using the diaminobenzidine peroxidase reaction. Sections were counterstained with haematoxylin and dehydrated in a series of increasing concentrations of alcohol.

Evaluation of immunohistochemistry

Scoring of ER and PR was based on the percentage of tumour cells with positive nuclear staining. A score of > 1% was considered positive [26]. HER2 was graded according to the ASCO/CAP HER2 testing guideline [27].

Based on expression of ER, PR, HER2 and Ki67, tumours were divided into four intrinsic molecular subtypes: Luminal-A, Luminal-B, HER2-positive and TNBC (basal-like), according to surrogate definitions of the ESMO guideline [28].

Digital image analysis

Digital images of the stained TMA slides were obtained by a Philips UltraFast Scanner (Philips, The Netherlands). The DIA platform used was Visiopharm Integrator System (VIS) version 7.0.1.318 (Visiopharm, Denmark). Ki67 was scored using a CE-IVD-approved DIA algorithm. An application-based algorithm was developed to detect the percentage and number of macrophages based on positive cytoplasmic staining for CD68, CD163, CSF-1R and CD206. The algorithm detects cells and classifies positive cells based on the size of the nuclei and the amount of surrounding staining (Fig. 2).

Fig. 2 Digital image analysis of macrophage infiltration in representative TMA cores. Immunohistochemical staining for CD68 (a), digital image analysis of CD68-positive TAMs (b)

First, the TMAs were de-arrayed using the Tissue array module. A grid was superimposed on the digitalized TMA slides and manually and automatically adjusted to fit all cores. Next, the individual applications for the specific stains were run to detect and count the total number of macrophages and the number of M2-like macrophages per tumour core. An average score per tumour was determined and used to calculate the number of M1-like macrophages by subtracting the number of M2-like macrophages from the total number of macrophages. Additionally, the M2:M1 ratio was calculated.
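The derivation of M1-like counts and the M2:M1 ratio described above can be sketched as follows. This is an illustration, not the Visiopharm application itself. The function names and example counts are hypothetical, and leaving the ratio undefined when the derived M1-like count is zero or negative is an assumption, since the text does not say how such cases were handled.

```python
from statistics import mean
from typing import Optional, Sequence


def tumour_score(core_counts: Sequence[float]) -> float:
    """Average the per-core positive-cell counts of one tumour."""
    return mean(core_counts)


def m1_like(cd68_total: float, m2_like: float) -> float:
    """Derived M1-like count: total CD68+ macrophages minus M2-like cells."""
    return cd68_total - m2_like


def m2_m1_ratio(cd68_total: float, m2_like: float) -> Optional[float]:
    """M2-like:M1-like ratio, or None if the derived M1-like count is <= 0."""
    m1 = m1_like(cd68_total, m2_like)
    if m1 <= 0:
        return None  # assumption: ratio left undefined rather than clipped
    return m2_like / m1


# Hypothetical counts for the three cores of one tumour.
cd68 = tumour_score([120, 90, 150])   # total macrophages (CD68)
cd163 = tumour_score([60, 30, 90])    # M2-like macrophages (CD163)
print(m1_like(cd68, cd163), m2_m1_ratio(cd68, cd163))
```

Because the M1-like count is a difference between two separately stained sections, it can become negative whenever an M2-like marker labels more cells than CD68 does, which is why the sketch guards the division.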

Definitions of terms

M2-like TAMs, detected by one of the M2-like macrophage markers, will be referred to as CD163M2-like, CSF-1RM2-like or CD206M2-like TAMs, with the marker name written as a prefix. Similarly, M1-like macrophages, calculated by subtracting the number of M2-like TAMs detected by CD163, CSF-1R or CD206 from the total number of CD68+ macrophages, will be referred to as CD163M1-like, CSF-1RM1-like or CD206M1-like TAMs. M2-like:M1-like TAM ratios will be described as CD163ratio, CSF-1Rratio or CD206ratio.

Statistical analysis

Statistical analyses were performed with IBM SPSS Statistics 25. Differences in TAM subsets between breast cancer subtypes were assessed with the Kruskal–Wallis test. Correlations between the different TAM markers were evaluated using Spearman’s correlation. Correlations between TAM subset counts and clinicopathological parameters were studied with Spearman’s correlation test for linear variables, a Mann–Whitney U test for binary variables and the Kruskal–Wallis test for categorical variables. Survival analyses were performed with univariate Cox regression analyses. Two-sided P values ≤ 0.05 were considered statistically significant. We used a Bonferroni correction to adjust for multiple testing when we studied correlations between TAM subsets and clinicopathological features, and when assessing the prognostic value of TAM subsets. In these cases, P values ≤ 0.005 were considered statistically significant.
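The Bonferroni adjustment described above can be written as a short sketch. The number of comparisons is not stated in the text. Ten is an assumption, chosen because it is the count that reproduces the reported 0.005 threshold from a nominal 0.05. The function names and example P values are hypothetical.

```python
from typing import List, Sequence


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold: nominal alpha divided by the test count."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Flag each P value that passes the Bonferroni-adjusted threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Assumed: ten comparisons per family, giving a threshold of about 0.005.
print(bonferroni_threshold(0.05, 10))

# Hypothetical P values: only the first survives the correction,
# although the second would pass the unadjusted 0.05 level.
print(significant([0.001, 0.02] + [0.5] * 8))
```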

Results

Patient characteristics

Clinicopathological characteristics are shown in Table 1. The median follow-up was 153 months (interquartile range [IQR] 100–174 months) for the full cohort, 169 months (IQR 136–180 months) for the Luminal-A group, 140 months (IQR 110–171 months) for the Luminal-B group, 129 months (IQR 94–157 months) for the HER2-positive group and 109 months (IQR 48–152 months) for the TNBC group.

Table 1 Patient and tumour characteristics

TAM subset numbers vary across breast cancer subtypes

Distributions of TAM subset numbers per breast cancer subtype are shown in Figs. 3, 4 and 5. CD68+ TAM counts were higher in the HER2-positive group than in the Luminal-A, Luminal-B and TNBC groups.

Fig. 3 Distribution of CD68+ TAM numbers per breast cancer subtype. P values > 0.05 not shown

Fig. 4 Distribution of M2-like TAM numbers per breast cancer subtype. P values > 0.05 not shown

Fig. 5 Distribution of M1-like TAM numbers per breast cancer subtype. P values > 0.05 not shown

CD163M2-like TAMs were more abundant in the HER2-positive tumours than in the Luminal-A or TNBC samples. CSF-1RM2-like TAM numbers were highest in the TNBC and HER2-positive samples. In the Luminal-A group, CD206M2-like TAM numbers were higher than in the Luminal-B and TNBC groups but did not differ from those in the HER2-positive group.

CD163M1-like TAM numbers were highest in the Luminal-B group. CSF-1RM1-like and CD206M1-like TAM numbers were higher in the HER2-positive group than in the Luminal-A and TNBC groups.

CD68, CD163 and CSF-1R TAM numbers are strongly correlated with each other

CD68+ TAM numbers correlated strongly and positively with CD163M2-like TAM numbers (r = 0.67, P < 0.001) and CSF-1RM2-like TAM numbers (r = 0.47, P < 0.001), but not significantly with CD206M2-like TAM numbers (r = 0.06, P = 0.260). CD163M2-like and CSF-1RM2-like TAM numbers also correlated strongly (r = 0.50, P < 0.001). CD206M2-like TAM numbers correlated weakly with both CD163M2-like TAM numbers (r = 0.22, P < 0.001) and CSF-1RM2-like TAM numbers (r = 0.22, P < 0.001).

Total number of CD68+ macrophages and M2-like TAM numbers positively correlate with unfavourable tumour characteristics in Luminal-B breast cancer

In the Luminal-B group, the number of CD68+ macrophages positively correlated with tumour grade (P < 0.001) (Table 2). The numbers of CD163M2-like and CSF-1RM2-like TAMs (P < 0.001) and the CSF-1Rratio (P = 0.001) were also related to tumour grade.

Table 2 Correlations between TAM subset numbers and clinicopathological characteristics in Luminal-B breast cancer

TAM subset numbers do not correlate with tumour characteristics in Luminal-A, HER2-positive and triple-negative breast cancer

In the Luminal-A, HER2-positive and TNBC groups, CD68+ pan-macrophage numbers did not correlate with any clinicopathological parameters (Tables 3, 4 and 5). Similarly, no correlations with clinicopathological parameters were found for M2-like or M1-like TAM numbers or for ratios of M2-like to M1-like TAMs for any of the markers.

Table 3 Correlations between TAM subset numbers and clinicopathological characteristics in Luminal-A breast cancer
Table 4 Correlations between TAM subset numbers and clinicopathological characteristics in HER2-positive breast cancer
Table 5 Correlations between TAM subset numbers and clinicopathological characteristics in TNBC

TAM subset numbers are not associated with disease outcome

Univariate survival analyses of the total study cohort did not show associations between TAM subset numbers and DFS or OS (Supplementary Table 1). Similarly, no associations between TAM subset counts and DFS or OS were found in patients with Luminal-A, Luminal-B, HER2-positive or triple-negative breast cancer (Supplementary Tables 2–5).

Discussion

In this study, we found positive associations between high CD68+ TAM and CD163M2-like TAM numbers and higher tumour grade in the Luminal-B group. Furthermore, CSF-1RM2-like TAM numbers and the CSF-1Rratio were related to high tumour grade in the Luminal-B group.

To our knowledge, we are the first to compare CD68, CD163, CSF-1R and CD206 for TAM detection and to assess their relation with clinicopathological characteristics in a large, well-characterized series of intrinsic breast cancer subtypes. Furthermore, the long-term follow-up of the patients (median 153 months) allowed for adequate assessment of the relation between TAMs and patient outcome.

A meta-analysis of 16 studies assessing the relation of CD68+, CD163+ and/or CD206+ TAMs with survival in primary, adjuvant- and neoadjuvant-treated patients with breast cancer found that CD68 was a better outcome predictor than CD163 and CD206 [29]. However, none of the included studies compared all three markers in one breast cancer set. At the same time, it is increasingly recognized that the tumour microenvironment contains many inflammatory cell types that may contribute to tumour behaviour and prognosis in contradictory ways [6]. In addition, while early clinical trials with CSF-1R-targeting compounds are being conducted, studies on CD68+ or CD163+ macrophages as a therapeutic target are lacking [30,31,32,33,34]. A direct comparison of the clinical relevance of these three M2-like TAM markers in breast cancer had thus not yet been performed, but may be important, as it may provide a rationale for selective macrophage subset targeting in patients with breast cancer.

We hypothesized that CD163, CSF-1R and CD206 are specific M2-like macrophage markers that would provide accurate and comparable M2-like TAM numbers. However, the TAM numbers detected by the M2-like markers in our study varied considerably between and within breast cancer subtypes, and the markers did not identify a similar M2-like subset. We did find a moderate correlation between CD68, CD163 and CSF-1R, but only a weak correlation with CD206. Furthermore, in some cases, the number of CD163M2-like TAMs exceeded the total number of CD68+ macrophages. These findings suggest that these markers are not specific for macrophages. Moreover, macrophages are highly plastic and exhibit functional and phenotypical diversity, depending on environmental stimuli [35, 36]. It seems likely that macrophages play multifunctional roles in the development and progression of breast cancer. This functional heterogeneity is reflected by heterogeneous expression of TAM markers [37,38,39,40]. Therefore, selectively identifying specific TAM subsets remains difficult, which may complicate TAM-targeted therapy [35]. Immunohistochemical staining for multiple TAM markers, for example double or triple staining, may overcome this difficulty.

In contrast to previous studies, neither total macrophages nor TAM subsets correlated with DFS or OS for the whole group or per breast cancer subtype. Others reported that high expression of CD68, CD163 and CD206 was predictive of poor OS, breast cancer-specific survival, or recurrence-free survival [23, 41,42,43,44,45]. These studies were smaller than our study, comprising between 107 and 278 patients, but showed relatively high recurrence and death rates [23, 41,42,43]. Selection bias might therefore have played a role in the results of these studies. One large study of 562 patients with breast cancer, in which TAMs were quantified with DIA as the number of positive cells per TMA core, is most comparable to our study. That study did not find an association between CD68+ and CD163+ TAMs and survival [24].

This study has strengths and limitations. Strengths are the large breast cancer series with long-term follow-up and the availability of well-characterized clinical data. Furthermore, instead of manual counting we used DIA to objectively quantify TAM numbers in a standardized manner. Limitations are the small breast cancer subtype groups and the fact that other immune microenvironment markers, e.g. tumour-infiltrating lymphocytes, were not considered. We would have preferred to detect the M1-like TAM subset with a specific marker. Unfortunately, such a marker is currently not available [46].

In conclusion, we found that total TAMs, M2-like TAM subsets and their ratios were related to clinicopathological characteristics in Luminal-B breast cancer in our series. Based on these data, we suggest that it is important to consider TAM subsets and their ratio per specific breast cancer subtype. Prospective series are needed to identify the clinically most relevant marker for M2-like macrophages and other (immune) environment markers. Ultimately, this may support rational macrophage targeting in breast cancer.