Introduction

After the initial treatment of primary breast cancer, 20–30 % of patients develop distant metastases [17, 47]. The survival outcomes and sites at which distant metastases develop differ greatly among patients [24, 25, 29, 55]. Several studies have already reported gene expression profiles correlated with risk of distant metastasis, which are in the process of being validated in prospective studies [16, 45, 61]. Moreover, the propensity of breast cancer to spread to certain organs, so-called “non-random organ-specific metastasis”, has also been investigated [3, 10, 14, 30, 41]. Several important studies have used animal models to unravel the mechanism of site-specific distant metastasis in breast cancer [6, 7, 33, 36, 43, 44, 65, 66]. These studies of the organotropism of metastatic breast cancer used human breast cancer cell lines injected into immune-compromised mice. By combining genomic profiling of organ-tropic metastatic variants selected in vivo from these animal models with clinical genomic studies, Massague and colleagues identified gene expression signatures associated with metastasis to bone, lung and brain [7, 33, 43]. They further explored the association between specific patterns of gene expression and metastatic pattern. The candidate genes discovered were then investigated further, and their role in metastasis was confirmed by overexpressing or inactivating them. The authors subsequently validated these gene expression signatures in several cohorts of primary breast tumours with known metastatic disease.

We recently described the immunophenotypic findings of primary tumours in relation to metastatic behaviour (organ-specific metastasis) in a retrospective study of 263 primary breast tumours with known metastatic disease [52]. We showed that the time to distant metastasis was less than 5 years in 90 % of hormone receptor-negative breast cancer patients, compared with 66 % of hormone receptor-positive patients. Estrogen receptor (ER) positivity was closely associated with the development of bone metastasis, including bone-only and bone-first metastasis in the course of the disease, whereas ER negativity was related to visceral (liver, lung or brain) metastasis. Along with the hormone status, tumour size and tumour grade, we found that patients who developed visceral metastasis had worse survival outcomes in terms of metastasis-specific survival and overall survival, and that they frequently developed multiple metastases during the course of the disease. We concluded that tumour types were associated with survival and with the pattern of distant metastasis during the course of the disease. Gene expression profiling patterns predicting site-specific metastasis may aid in better understanding the mechanisms underlying the development of distant metastases.

In this study, we analysed the gene expression profiles of 157 primary tumours, all of which had metastasized. To identify and validate tumour factors that are predictive of metastatic behaviour in breast cancer, gene expression profiles of the primary tumours were correlated with the metastasis pattern, and gene expression signatures were subsequently investigated for prediction of site-specific distant metastasis.

Materials and methods

The methodology for selection of patient and tumour samples, gene expression profiling experiments, microarray data analysis/bioinformatics, and identification and validation of the site-specific metastasis signature is described in detail in a supplementary file (Supplementary file 1).

Results

For 157 primary invasive breast carcinomas from patients who all developed metastatic disease, mRNA expression signatures were assessed using microarray analysis. The patient characteristics and metastasis patterns are described in Table 1. Tumours were subdivided into 5 molecular subtypes using the PAM50 classifier [48]. Out of 157 cases, 67 (42.7 %) were identified as Luminal A, 46 (29.3 %) as Luminal B, 18 (11.5 %) as HER2-like and 25 (15.9 %) as basal type. One (0.6 %) of these tumours was identified as normal-like. For statistical purposes, the normal-like breast tumour was excluded from the multivariate analysis. Median follow-up time for patients who were alive was 11.5 years (range 6.2–17.3 years). 79.4 % of the patients with Luminal A, 72.5 % of Luminal B, 78.6 % HER2-like and 87.5 % of basal-type tumours received adjuvant therapy. None of the patients received trastuzumab as adjuvant therapy; a subgroup of patients (n = 10) received trastuzumab for treatment of metastatic disease.

Table 1 Clinical and pathological characteristics of metastatic breast cancer patients

Bone was the most frequent site of distant metastasis (71.5 %) followed by liver (51.7 %) and lung (34.4 %). 74.2 % of the patients developed visceral organ metastasis (lung, liver or brain).

Survival analysis revealed that luminal-type tumours had better outcomes in terms of metastasis-specific and overall survival compared to basal-type tumours and HER2-like tumours (p < 0.001). Median time to develop metastasis was 37, 27, 19 and 15 months for Luminal A, Luminal B, HER2-like and basal-type tumours, respectively. Metastases developed within 5 years in 88.3 % of basal-type and HER2-like tumours, versus 72.7 % of Luminal A and 76.7 % of Luminal B tumours.

Among luminal-subtype tumours, 80.5 % developed bone metastasis, as opposed to 41.7 % of basal-type and 55.6 % of HER2-like tumours (p 0.001). Luminal tumours also accounted for 81.8 % of the tumours that metastasized to bone as the initial site of metastasis (p 0.001). The rates of visceral metastasis were 70.4 % in luminal-type tumours, 87.5 % in basal-type tumours and 77.8 % in HER2-like tumours. Of the basal-type tumours, 66.7 % developed visceral metastasis as the first metastasis site and 29.2 % had only visceral metastasis during the course of disease (p 0.061 and p 0.034, respectively).

The tumour samples from all patients were assigned to the poor prognostic group according to the 70-gene signature [61]. Based on recently published epithelial mesenchymal transition (EMT) gene classifiers [26], 100 of the tumours were classified as EMT-activated and the remaining 51 as EMT-non-activated.

Validation of a previously identified gene signature for bone-specific metastasis

First, we studied the predictive value of the previously published bone metastasis signature of Kang et al. [33]. This signature was assessed as positive in 110 of the tumours in the current study set. All (100 %) Luminal A tumours and 90.7 % of the Luminal B tumours were found to be positive for the signature, whereas 33 % of the HER2-like tumours were positive. None of the basal-type tumours were positive for this site-specific metastasis signature. Within this signature-positive subgroup of tumours, 80 % had clinically identified bone metastasis (n = 88, p 4.26e−04). The 102-gene expression signature of Kang et al. for bone metastasis identified 81.5 % of the tumours with bone metastasis, 84.1 % of the tumours which had bone as initial site of metastasis and 100 % (n = 18) of the tumours which had bone-only metastasis in the training set (p < 0.001, p < 0.001 and p 0.002, respectively; sensitivity: 81.5 % and specificity: 48.8 %). When tested in ER-positive (n = 108) and ER-negative (n = 43) groups separately, 61.1 % (n = 66) of the ER-positive tumours and 60.4 % (n = 26) of the ER-negative tumours tested positive for this 102-gene expression signature. Of the signature-positive ER-positive tumours (n = 66), 83.3 % had clinically evident bone metastasis (p 0.456). Of the 26 signature-positive ER-negative tumours, 50 % had bone metastatic disease (p 1.000).

Supervised classification of bone (specific) metastasis-related genes

To identify site-specific metastasis genes, we explored genes differentially expressed between tumours with bone metastasis and those without. A t test was conducted with a significance threshold of p < 0.01. After application of filtering criteria, differentially expressed genes were identified between the two subgroups of tumours with and without bone metastasis. This group of differentially expressed genes was subsequently validated in the training dataset as well as in the independent dataset using K-means clustering and t testing.
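The gene-wise comparison described above can be illustrated with a minimal sketch. It assumes a small, made-up expression table (the gene names GENE_A and GENE_B and all values are hypothetical) and, to stay within the Python standard library, replaces the parametric t test p value with an exact two-sided permutation p value computed on Welch's t statistic. The filtering criteria and software actually used are those of Supplementary file 1 and are not reproduced here.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def permutation_p(a, b):
    """Exact two-sided permutation p value for |t| (feasible for small groups only)."""
    pooled = list(a) + list(b)
    observed = abs(welch_t(a, b))
    indices = range(len(pooled))
    hits = total = 0
    # Enumerate every way of relabelling len(a) samples as the first group.
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(welch_t(group_a, group_b)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical log2 expression values: (tumours with bone metastasis, tumours without).
expression = {
    "GENE_A": ([8.1, 8.4, 8.3, 8.6, 8.2], [6.9, 7.1, 7.0, 6.8, 7.2]),
    "GENE_B": ([7.0, 7.9, 6.5, 8.2, 7.4], [7.3, 6.8, 8.0, 7.1, 7.7]),
}

THRESHOLD = 0.01  # significance threshold stated in the text
p_values = {gene: permutation_p(bone, no_bone) for gene, (bone, no_bone) in expression.items()}
selected = [gene for gene, p in p_values.items() if p < THRESHOLD]
print(selected)
```

With five samples per group the smallest attainable two-sided permutation p value is 2/252, so only a gene whose two groups are completely separated can pass the 0.01 threshold in this toy setting.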

We identified 15 differentially expressed genes between tumours with bone metastasis and those without (Table 2). The heat map showing the expression pattern of these 15 genes is displayed in Fig. 1. None of the genes in this set overlapped with the bone signature of Kang et al. Three genes, namely NAT1, PH-4 and BBS1, were up-regulated and the other genes were down-regulated. Mapping to the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes databases showed an overrepresentation of membrane-bound molecules with the molecular function of protein binding (APOBEC3B, ATL2, BBS1, MMS22L, KCNS1, MFAP3L, NIP7, NUP155, PALM2, PH-4 and STEAP3).

Table 2 The list of differentially expressed genes in bone metastatic disease
Fig. 1

Gene expression pattern of the 15 genes of the bone metastasis gene signature. The heat map shows the expression of the 15 genes among 151 patients. Primary tumours with clinically evident bone metastasis are shown in blue and those without bone metastasis in yellow. For each primary tumour, the expression level of each gene is shown in red if up-regulated and in green if down-regulated

To validate this gene expression signature, we analysed, in addition to our training set, a large independent microarray dataset combining four studies, previously published by Harrell et al. [27]. Using K-means clustering, we divided both the training dataset and the independent dataset into two groups based on their expression levels for our newly developed bone metastasis gene expression signature, and these two groups were subsequently compared using a t test.
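As a rough illustration of this two-step procedure, the sketch below clusters hypothetical expression profiles into two groups with a plain K-means (k = 2) and then compares the clusters with Welch's t statistic. The data, the deterministic seeding rule and the choice to compare the first gene between clusters are all assumptions made for the example; the text does not specify which quantity was compared in the t test, and the actual implementation is described in Supplementary file 1.

```python
from math import sqrt
from statistics import mean, variance


def sq_dist(u, v):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def two_means(profiles, max_iter=100):
    """K-means with k = 2; seeds are the first profile and the one farthest from it."""
    first = profiles[0]
    farthest = max(profiles, key=lambda p: sq_dist(p, first))
    centres = [list(first), list(farthest)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            0 if sq_dist(p, centres[0]) <= sq_dist(p, centres[1]) else 1
            for p in profiles
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:
                centres[k] = [mean(column) for column in zip(*members)]
    return labels


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Hypothetical log2 expression of three signature genes (columns) in eight tumours (rows).
profiles = [
    [9.1, 4.2, 5.0], [8.8, 4.5, 5.3], [9.4, 4.0, 4.8], [9.0, 4.4, 5.1],
    [6.2, 7.1, 7.9], [6.5, 6.8, 8.2], [6.0, 7.3, 7.7], [6.4, 7.0, 8.0],
]

labels = two_means(profiles)
cluster0 = [p[0] for p, lab in zip(profiles, labels) if lab == 0]
cluster1 = [p[0] for p, lab in zip(profiles, labels) if lab == 1]
t_first_gene = welch_t(cluster0, cluster1)
print(labels, round(t_first_gene, 2))
```

A deterministic seeding rule is used here only so that the example gives the same clusters on every run; production K-means implementations typically use randomized initialisation with several restarts.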

The 15-gene bone metastasis signature was found to be present in 103 tumours in the training dataset. This signature identified 82.4 % of the tumours with known bone metastatic disease, 85.2 % of the tumours which had bone metastasis as first metastasis site and 100 % of those with bone metastasis only (p 9.99e−09, sensitivity: 82.4 % and specificity: 67.4 %). In the independent dataset, the 15-gene expression signature was found to be present in 160 tumours (total n = 376), and 81.2 % of these signature-positive tumours also had clinically evident bone metastatic disease (p 4.28e−10, sensitivity: 54.6 % and specificity: 78.2 %). The independent dataset of Harrell et al. was also used to test the bone-specific metastasis signature of Kang et al. The 102-gene expression signature was assessed as present in 201 tumours (total n = 376), and 72.6 % of these tumours were reported to have bone metastasis (p 6.92e−05, sensitivity: 61.3 % and specificity: 60.1 %).
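For readers who wish to see how sensitivity, specificity and the proportion of signature-positive tumours with bone metastasis relate to one another, the following sketch computes them from a 2 × 2 table. The four cell counts are not reported in the text; they are an assumption, back-calculated from the percentages given for the 15-gene signature in the independent dataset (160 signature-positive tumours out of 376), and they reproduce the reported values only to within rounding.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and positive predictive value from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # signature-positive among tumours with bone metastasis
        "specificity": tn / (tn + fp),  # signature-negative among tumours without bone metastasis
        "ppv": tp / (tp + fp),          # bone metastasis among signature-positive tumours
    }


# Assumed counts, back-calculated from the reported percentages (not given in the text):
# tp + fp = 160 signature-positive tumours, and all four cells sum to 376.
counts = {"tp": 130, "fp": 30, "fn": 108, "tn": 108}

metrics = diagnostic_metrics(**counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The same function can be applied to any of the other signature and subgroup combinations once the corresponding cell counts are available.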

In addition, the independent dataset was analysed separately in ER-positive and ER-negative tumours. Among ER-positive tumours (n = 245), the 15-gene expression signature was found to be present in 136 tumours and 83.1 % of these tumours had known bone metastasis; 38.5 % of the signature-negative tumours had no bone metastasis (p 2.38e−04, sensitivity: 79.3 % and specificity: 57.1 %). Of the 139 ER-positive tumours that tested positive for the 102-gene expression signature, 75.5 % had bone metastatic disease, and 29.2 % of the signature-negative tumours had no bone metastasis (p 0.466, sensitivity: 63.2 % and specificity: 47.6 %). Within the ER-negative subgroup (n = 128), 74 tumours tested positive for the 15-gene expression signature and 56.8 % of these tumours had bone metastasis; 70.4 % of the signature-negative tumours had no evidence of bone metastasis (p 3.83e−03, sensitivity: 72.4 % and specificity: 56.8 %). Of the 56 ER-negative tumours that tested positive for the 102-gene expression signature, 55.4 % had clinically evident bone metastasis; 62.5 % of the signature-negative tumours had no bone metastasis (p 0.05, sensitivity: 53.5 % and specificity: 64.3 %). Table 3 summarizes the validation of the gene signatures in the training and independent datasets.

Table 3 Performance of the gene expression signatures

In addition, in a subsequent study a subset of 50 genes (out of the initially identified 102 genes) was selected by Massague’s group [44]; this subset of 50 genes was also analysed in our training dataset and in the independent dataset for its predictive value for bone-specific metastasis. The 50-gene signature was able to identify the patients with bone metastasis in the training set (p 1.14e−03) and in the independent dataset (p 0.014). When tested in the ER-positive and the ER-negative tumours separately, this 50-gene signature was not predictive of bone metastatic disease.

When tested among all patients with metastatic and non-metastatic disease in the independent dataset (n = 855), the 15-gene signature was able to identify the patients with bone metastasis (p 5.48e−04, sensitivity: 54.6 % and specificity: 58.7 %). This gene expression signature remained statistically significant for identification of bone metastasis when analysed separately in ER-positive and ER-negative tumours (p 3.45e−04, sensitivity: 63.9 % and specificity: 52.2 %; p 3.82e−03, sensitivity: 75.9 % and specificity: 45.5 %, respectively).

The up-regulated genes and their correlation with molecular subtypes and known prognostic gene signatures were further explored. NAT1 was expressed at the highest levels in Luminal A tumours, followed by Luminal B and HER2-like tumours, and at the lowest levels in basal-type tumours. NAT1 expression was also correlated with EMT status, being higher in the EMT-activated group than in the EMT-non-activated group (p 5.7e−05) (Fig. 2). Similarly, the other up-regulated genes, BBS1 and PH-4, were also significantly correlated with the EMT-activated group of tumours (p 5.8e−04 and p 0.01, respectively).

Fig. 2

The expression levels (log2) of NAT1 among molecular subtypes (a) and in EMT-activated and EMT-non-activated group (b). The box plots show that NAT1 expression was higher in Luminal-type tumours compared to the other molecular subtypes (p 7.2e−20). NAT1 expression was also found to be higher in the EMT (epithelial to mesenchymal transition)-activated group (p 5.7e−05)

The 15-gene bone metastasis signature was positive in 96.9 % of the Luminal A tumours, in 76.7 % of Luminal B tumours and in 38.9 % of HER2-like tumours. As with Kang’s bone metastasis signature, none of the basal-type tumours were positive for this signature.

Univariate analyses showed that our bone metastasis signature was significantly correlated with the development of bone metastasis, especially in the group of patients who developed only bone metastasis in the course of their disease (p < 0.001). As expected, ER status and molecular subtype were also closely related to bone metastasis status (p < 0.001). Multivariate analyses were subsequently applied to further explore the link between our gene signature and these parameters. Table 4 displays the results of the multivariate analyses for ER status, molecular subtypes and the gene signature in the two separate datasets (training and independent) for bone-specific metastasis. As shown, the 15-gene signature was the only parameter that was significantly correlated with bone metastasis status in the training dataset (p < 0.001, 95 % CI 3.86–48.02). In the independent dataset, the 15-gene signature, together with the molecular subtype, was significantly correlated with bone metastasis status (p 0.001, 95 % CI 1.54–5.00).

Table 4 Multivariate analyses results of predictive factors among the gene datasets

Discussion

The metastatic potential of the primary tumour revolves around multistep biological processes within host tissue and microenvironment of the distant organ site [20]. In addition to the early origin of genetic instability [4, 19, 20] and hence the metastatic potential of the tumour cells, several intrinsic and extrinsic factors are recognized as potential promoters of metastatic relapse [11, 46, 53]. Upon sustaining the elementary steps of dissemination, the circulating tumour cells can colonize a new organ, forming a detectable metastasis [10, 20].

Experimental models of metastasis yielded distinct sets of genes that mediated site-specific metastasis in breast cancer [7, 33, 36, 43]. Kang et al. identified a bone metastasis signature composed of 102 genes mostly encoding cell surface and secretory proteins, with functions including bone marrow homing and extravasation, pericellular proteolysis and invasion, angiogenesis, osteoclastogenesis, growth factor regulation and extracellular matrix alteration [33]. The authors concluded that this gene set was superimposed on a poor-prognosis gene signature to provide additional functions in order to achieve an overt bone-specific metastasis.

Despite these interesting findings from mouse model systems and their validation in human breast cancer, no clinical application or follow-up research has emerged since these first findings. Here we present the results of the largest study to date on the association between gene expression profiling of primary breast cancer and the development of bone metastases, and the first study in which supervised classification has been used to identify a bone metastasis-associated gene expression signature. This gene expression signature was composed of 15 genes, 3 of which (NAT1, PH-4 and BBS1) were up-regulated in the primary tumour samples. The overexpressed genes in this bone-specific metastasis signature were associated with metabolic (NAT1) and oxidation–reduction (PH-4) processes and with protein transport (BBS1), in agreement with previous work hypothesizing their potential role in altering the host tissue environment in order to achieve bone metastasis [11, 28, 49, 53].

N-acetyltransferase 1 (NAT1) was first reported to be associated with enhanced growth and survival of breast epithelial cells by Adam et al. [1], and was later reported to be a potential biomarker for breast cancer [15, 18, 37, 59, 60]. In several studies, inhibiting NAT1 resulted in a change in cell morphology, a loss of surface filopodia and a subsequent reduction of invasive potential both in vitro and in vivo [60]. Likewise, knockdown of this gene led to inhibition of invasion and metastasis by means of modification/rearrangement of intracellular filopodial actin [58, 59]. In agreement with other gene expression profiling studies in human cancer samples, we showed here that NAT1 clusters close to the estrogen receptor, with higher expression levels in luminal-type tumours [1, 5, 56]. Tiang et al. also showed that the loss of NAT1 resulted in alteration of cell-to-cell contact and up-regulation of E-cadherin. Based on the aforementioned cell-line studies, a possible association between this gene and EMT/MET has been speculated [58]. Interestingly, in our dataset overexpression of this gene was significantly correlated with the so-called EMT-activated group (p = 5.7e−05). To our knowledge, this is the first study pointing to an association between NAT1 and EMT in human female breast cancer samples. Given the potential of this gene as a drug target [57, 58], we believe that further studies in human breast cancer samples are indicated to explore this link.

The extracellular matrix (ECM) plays an important role in diverse pathological and physiological processes, including cancer invasion and metastasis [22, 32]. Collagens are the major component of the ECM. Increased expression of collagens, and the resulting increase in deposition and stiffening of the ECM, is associated with tumour progression [38, 50]. Collagen prolyl 4-hydroxylase (PH-4), a member of a family of post-translational modification enzymes, is required for collagen biosynthesis and angiogenesis. Hypoxia-induced collagen prolyl 4-hydroxylase expression is reported to be associated with increased progression and mortality in breast cancer [12, 21, 50]. Indeed, animal studies showed that knockdown of PH-4 resulted in inhibition of tumour growth and lung metastasis [23, 62]. With gene expression profiling of breast cancer samples, we found that PH-4 was positively correlated with site-specific metastasis to bone. This finding is consistent with the observations of others [22, 32, 38, 50] and supports the importance of extracellular matrix alterations in disease progression.

Twelve of the 15 genes were down-regulated in the primary tumours of breast cancer patients who developed bone metastasis. One of these genes, apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like-3B (APOBEC3B), is reported to be up-regulated in a large proportion of breast tumours, and high levels of APOBEC3B were found to be associated with worse disease-free and overall survival [8, 9, 51, 54]. Recently, several independent genome-wide association studies have shown a deletion resulting in complete elimination of the APOBEC3B coding region [34, 40, 63]. This deletion has been reported to be associated with decreased expression of APOBEC3B in breast cancer cells [34]. In this study, we also showed that APOBEC3B was significantly down-regulated in the group of tumours with bone metastatic disease (p 3.55e−03). We believe that further copy number variation studies are required to explore a possible association between APOBEC3B deletion and site-specific metastasis. Six-transmembrane epithelial antigen of prostate 3 (STEAP3), which is thought to be involved in apoptosis and cell-cycle progression [2, 39, 64], was also down-regulated in the bone metastatic group of primary breast tumours in our study. STEAP3 expression has been shown to be diminished in hepatocellular carcinoma nodules compared to cirrhotic peritumoral tissue and healthy liver [13]. Another member of this protein family, STEAP1, has already been shown to be overexpressed in breast cancer cells [31, 35, 42]. However, we could not find any similar data on STEAP3 expression levels in breast cancer tissues.

To determine the validity of the experimentally derived 102-gene bone metastasis signature, Kang et al. used a cohort of 63 primary breast carcinomas. The authors selected a subset of 50 genes for their validation studies and showed that this gene set was not able to identify the group of tumours with bone metastasis. When the authors restricted their analyses to 25 breast tumours with known metastatic disease, they were able to distinguish the tumours that preferentially metastasized to bone rather than to other distant organs [44]. In the current study, in addition to the newly identified 15-gene expression signature, we have shown that the 102-gene expression signature and the subset of 50 genes reported by Kang et al. were informative in identifying the likelihood of developing bone metastasis in the training and the independent datasets. However, when the datasets were subdivided into two groups according to ER status, neither the 102-gene expression signature nor the 50-gene signature was effective in predicting bone metastasis, whereas the 15-gene expression signature identified here remained associated with the likelihood of bone metastasis development in both the ER-positive and the ER-negative tumour groups.

Notably, the bone-specific metastasis signature presented in this study did not include any of the genes from the previously published bone signature of Kang et al. [33]. The absence of overlap between these gene sets could be explained by the fact that in the former study tumour cells from the metastatic site were used to generate gene signatures, in contrast to the primary tumours used in the current study. Considering that tumour progression and the development of metastasis require multiple steps of modification, we may assume that these two gene signature sets play complementary roles at separate levels of this multistep process.

Notwithstanding several well-received studies focusing on the biology of metastatic breast cancer, little progress has been made over the past years to identify a robust gene expression signature for site-specific metastasis. Moreover, the experimentally derived gene expression signatures when tested in human breast carcinomas were not as strongly associated with site-specific metastasis as in the experimental conditions. A reproducible gene expression signature associated with the development of bone metastases in breast cancer will have clinical utility in two ways: first, the knowledge of the specific gene expressed at higher or lower levels in the metastatic disease will lead to the investigation of targeted therapy options directed to the altered mechanism related to this gene, and second, reliable identification of the patients at high risk of developing bone metastases may lead to therapeutic interventions specifically aimed at preventing the development of bone metastases, for example treatment with bisphosphonates.

In summary, we present the largest study to date revealing the association between gene expression profiling patterns and bone-specific metastasis in breast carcinomas. The identification of a novel 15-gene expression signature will advance this area of research, including subsequent exploration of the underlying mechanisms of metastatic behaviour, and ultimately help improve outcomes for breast cancer patients.