Introduction

Breast cancer is a heterogeneous disease with significant mortality associated with metastatic progression. Classification subdivides human breast cancer into six categories including Luminal A, Luminal B, HER2+, Basal, Claudin-low and normal-like [1]. Recent work suggests additional subclasses exist within each intrinsic subtype including three basal subtypes with striking differences in overall survival [2]. Further, The Cancer Genome Atlas (TCGA) and the Encyclopedia of DNA Elements (ENCODE) projects show remarkable variability in genetic alterations beyond gene expression both across and within subtypes of human breast cancer. Together these genomic analyses demonstrate the complex nature of human breast cancer.

To more readily study mechanisms leading to breast cancer, research has turned to the mouse as a model. Mouse models of breast cancer have employed various methods of initiation, including mouse mammary tumor virus (MMTV) infection, chemical mutagenesis and genetically engineered mice (GEM). This pioneering work identified and tested the role of many oncogenes in breast cancer. With the insertion of MMTV into the genome, numerous key oncogenes were uncovered [3, 4]. The later development of MMTV driven transgenics allowed for development of spontaneous models. With the identification of human epidermal growth factor receptor 2 (HER2) amplification in human breast cancer [5, 6], the observation that MMTV driven expression of the activated rat form of HER2 (NeuNT) resulted in breast cancer reinforced the importance of HER2 as a driving oncogene [7]. More recently, models have been refined to include tissue specific activation resulting in gene amplification, analogous to human HER2+ breast cancer [8], as well as temporal control where transgene expression can be activated or inactivated [9].

Individual mouse models have been used to model aspects of human breast cancer and the selection of the appropriate model to compare to human breast cancer has been directed by phenotype or known genetic events. For instance, the MMTV-PyMT model is widely used to examine metastasis [10] while P53 knockout mammary epithelium transplanted into wild type hosts results in tumors with various genetic mutations [11]. Another aspect is the histological subtype associated with various tumors in GEM models, and metastatic ability can be altered by genetic background [12]. Indeed, similarities between mouse models such as Neu and Wnt as well as their human counterparts have been previously noted [13, 14]. Importantly, in both human breast cancer and in many GEM models, there is significant histological heterogeneity [15-17]. These attributes illustrate the importance and utility of mouse models to examine breast cancer.

With the number and variety of GEM models, it is important to consider how accurately these various systems model human breast cancer. Initial studies using intrinsic clustering revealed similarities between mouse models and human breast cancer, albeit in a limited number of samples [18]. Yet, a more detailed characterization of a larger number of p53 null tumors revealed a variety of subtypes with strong similarities to human breast cancer [11], revealing the importance of examining a large number of samples to capture tumor heterogeneity and variability. Further, expanding the number of Myc induced tumors revealed that a subpopulation of Myc induced tumors had similarities to claudin-low human breast cancer [19]. Taken together, recent comparative studies [11, 17, 19-22] highlighted a clear need for a comprehensive examination of the genomic features of mouse models of breast cancer and their relation to human breast cancer. To this end, we assembled an expansive dataset of mouse models of breast cancer. This dataset reveals the genomic heterogeneity of mouse models and offers a predictive resource for essential cell signaling pathways. Importantly, all comparisons between all models are made available with our report. These data demonstrate the similarities and differences of the various subtypes of mouse models to the key subtypes of human breast cancer and underscore the necessity for an informed choice of the appropriate mouse model for studying specific types of human breast cancer.

Methods

Combination of datasets

Datasets (GSE10450, GSE11259, GSE13221, GSE13231, GSE13259, GSE13553, GSE13916, GSE14226, GSE14457, GSE14753, GSE15119, GSE15263, GSE15632, GSE15904, GSE16110, GSE17916, GSE18996, GSE20465, GSE20614, GSE21444, GSE22150, GSE22406, GSE23938, GSE24594, GSE25488, GSE27101, GSE30805, GSE30866, GSE3165, GSE31942, GSE32152, GSE34146, GSE34479, GSE6453, GSE6581, GSE6772, GSE7595, GSE8516, GSE8828, GSE8863, GSE9343, GSE9355, GSE37954, GSE2034, GSE2603, GSE4922, GSE6532, and GSE14020) were downloaded from Gene Expression Omnibus. E-TABM-683 and E-TABM-684 were downloaded from ArrayExpress. For Affymetrix data, Bayesian Factor Regression Methods (BFRM) [23] were used to combine datasets and remove batch effects [24]. Agilent data were merged with Affymetrix data using Chip Comparer [25] and Filemerger [26]. To remove platform effects between Affymetrix and Agilent data and batch effects between individual Agilent studies we used COMBAT [27, 28]. Batch effects and batch correction were visualized by principal component analysis in Matlab (for code see Additional file 1).
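As an illustration of how principal component analysis can reveal a batch effect and confirm its removal, the following minimal Python sketch may be helpful. It is not the Matlab code of Additional file 1. The simulated data, the function names (pca_scores, center_by_batch, batch_gap) and the use of per-batch mean centering are all assumptions made for illustration; per-batch centering is only a crude stand-in for BFRM or COMBAT, which model batch effects more carefully.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project samples (rows of X) onto the top principal components."""
    Xc = X - X.mean(axis=0)                      # center each gene
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def center_by_batch(X, batches):
    """Subtract each batch's per-gene mean (simplified stand-in for COMBAT)."""
    X = X.copy()
    for b in np.unique(batches):
        mask = batches == b
        X[mask] -= X[mask].mean(axis=0)
    return X

def batch_gap(scores, batches):
    """Spread of batch centroids on PC1, in units of the PC1 standard deviation."""
    pc1 = scores[:, 0]
    means = [pc1[batches == b].mean() for b in np.unique(batches)]
    return (max(means) - min(means)) / pc1.std()

# Simulated expression matrix: 40 samples x 100 genes, two hypothetical batches.
rng = np.random.default_rng(0)
batches = np.array([0] * 20 + [1] * 20)
X = rng.normal(size=(40, 100))
X[batches == 1] += 2.0                           # artificial platform shift

gap_before = batch_gap(pca_scores(X), batches)
gap_after = batch_gap(pca_scores(center_by_batch(X, batches)), batches)
print(f"PC1 batch gap before: {gap_before:.2f}, after: {gap_after:.2e}")
```

In this toy setting the batches separate along the first principal component before correction and overlap afterwards, which is the qualitative pattern one would look for in a PCA plot of real data.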

Data analysis

Unsupervised hierarchical clustering was done using Cluster 3.0 and exported using Java Tree View. The color scheme for the heatmap and sample legends were made using Matlab. Human breast cancer sample intrinsic subtypes were classified according to protocol [1]. Prior to clustering mouse models with human breast cancer, we clustered the human breast tumor samples on their own, to identify genes that would organize the breast tumors according to their intrinsic subtype in the combined dataset. We used these genes to filter the mouse and human combined gene expression dataset for unsupervised hierarchical clustering.
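For readers unfamiliar with agglomerative clustering, a minimal Python sketch of the general idea follows. The distance measure (one minus the Pearson correlation), the average-linkage rule, the toy profiles and the function names are assumptions chosen for illustration; the settings actually used in Cluster 3.0 are not stated here.

```python
import math
from itertools import combinations

def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1.0 - cov / (sd_a * sd_b)

def average_linkage(profiles, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]

    def dist(c1, c2):
        # Average pairwise distance between members of the two clusters.
        pairs = [(i, j) for i in c1 for j in c2]
        total = sum(pearson_distance(profiles[i], profiles[j]) for i, j in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]   # a < b, so deleting b is safe
        del clusters[b]
    return [sorted(c) for c in clusters]

# Toy data: samples 0 and 1 rise across four genes, samples 2 and 3 fall.
profiles = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 3, 2, 1], [9, 6, 4, 2]]
clusters = average_linkage(profiles, n_clusters=2)
print(clusters)
```

The same routine could be applied to the transposed matrix to group genes instead of samples, which is how the two dendrograms of a clustered heatmap are typically obtained.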

Significance analysis of microarrays [29] was used for fold change analysis. Settings for each comparison can be found in the Excel download for each model (Additional files 2, 3). Gene ontology and TRANSFAC predictions were made using GATHER [30]. Gene set enrichment analysis was conducted using Genepattern [31]. The gene-set describing mammary cell-types was derived from [32].
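To make the fold change statistic concrete, the sketch below computes, for a single gene, the log2 fold change and the SAM relative difference, d = (mean1 - mean2) / (s + s0), where s is the pooled standard error and s0 is a small stabilizing constant. The expression values, the fixed s0 of 0.1 and the function names are assumptions for illustration only. The SAM software chooses s0 from the data and estimates q-values by permuting sample labels, neither of which is reproduced here.

```python
import math
from statistics import mean

def sam_statistic(group_a, group_b, s0=0.1):
    """SAM relative difference for one gene (s0 fixed here by assumption)."""
    n1, n2 = len(group_a), len(group_b)
    m1, m2 = mean(group_a), mean(group_b)
    # Pooled sum of squared deviations across both groups.
    ss = sum((x - m1) ** 2 for x in group_a) + sum((x - m2) ** 2 for x in group_b)
    s = math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
    return (m1 - m2) / (s + s0)

def log2_fold_change(group_a, group_b):
    """Difference of means, assuming values are already on a log2 scale."""
    return mean(group_a) - mean(group_b)

# Hypothetical log2 expression of one gene in one model versus all other models.
model_tumors = [8.0, 9.0, 10.0]
other_tumors = [4.0, 5.0, 6.0]

d = sam_statistic(model_tumors, other_tumors)
lfc = log2_fold_change(model_tumors, other_tumors)
print(f"d = {d:.3f}, log2 fold change = {lfc:.1f}")
```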

Pathway activation was predicted according to previous studies [2, 33]. For mouse samples, specific conditions for each pathway signature can be found in Additional file 4. For human breast tumor samples, pathway activation was predicted using Score Signatures [34] and conditions can be found in [2]. Mixture modeling was implemented according to [2].

Results

Database assembly

We assembled a database containing 1,172 samples from mouse mammary tumor models, cell types and normal mammary gland. The major mouse models and descriptions are listed in Table 1. Within a number of these models, variants exist with different alleles, promoters, and genetic backgrounds. In assembling the database, we assessed the non-biological variance between gene expression studies, and the effect of batch correction, with principal components analysis (PCA) (Additional file 5A-D). PCA demonstrated that normalization successfully removed artificial variance between datasets (Additional file 5B, D). As a control, we confirmed batch correction utilizing Neu-initiated tumors spanning the Affymetrix and Agilent platforms from several studies. Prior to normalization (Additional file 5E) PCA demonstrated that Neu tumors varied by platform. After correction, Neu tumors clustered together in PCA, demonstrating that artifactual variance had been removed (Additional file 5F). With platform and batch effects eliminated, we began to explore relationships in the mouse model database.

Table 1 List of mouse models in the dataset

Gene expression heterogeneity in mouse models

Using unsupervised hierarchical clustering, we examined mouse mammary tumors initiated by various oncogenes. Unsupervised hierarchical clustering generated four major clusters (Figure 1A). We observed remarkable variability in gene expression profiles, including within model heterogeneity. For example, Myc initiated tumors span each of the major clusters in the dendrogram. In contrast, some models show uniformity in gene expression from tumor to tumor, including Ras initiated tumors that ordered into a single cluster. Interestingly, there was significant interrelatedness between tumor models initiated with different oncogenes. Annotations for individual tumors revealed that similarities in tumor histology correlated with relationships in gene expression profiles. For example, MMTV-Myc, MMTV-Met and a subset of 7,12-dimethylbenz[a]anthracene (DMBA) induced tumors of the adenosquamous histology shared gene expression profiles. These data reveal mouse models with various levels of heterogeneity and illustrate some of the tumor phenotypes that drive relationships between different mouse models.

Figure 1

Analysis of relationships between mouse mammary tumor models. (A) The unsupervised hierarchical clustering analysis of gene expression data for mouse mammary tumors, cell types and normal mammary gland is shown. The dendrogram across the top illustrates relationships between samples and is color-coded to itemize the four main clusters. Below the dendrogram, black bars label samples from each corresponding model on the same line. Gene expression values are illustrated with the heatmap, according to the scale shown. The vertical dendrogram beside the heatmap illustrates genes with similar patterns of expression across the samples in the dataset. (B) The pie chart illustrates the gene ontologies of the genes that are significantly (q = 0, fdr = 0) over-expressed as identified by SAM in the blue cluster of tumors compared to tumors in other clusters. (C) The gene set enrichment plot comparing tumors from cluster 4 (black) to tumors in the other clusters shows significant enrichment for high expression of a gene set that defines mesenchymal breast cancer (P = .004). SAM, significance analysis of microarrays.

To define the characteristics of each cluster, we used Significance Analysis of Microarrays (SAM) to identify differentially regulated genes that define tumors within each cluster (Additional file 6). We interrogated gene lists for gene ontologies (Additional file 6). For instance, Figure 1B shows the gene ontologies for the upregulated genes in the blue cluster in Figure 1A. Ontological categories included genes involved in biological processes and metabolism. To refine these results, tumors from each cluster were examined with Gene Set Enrichment Analysis (GSEA) (Additional file 7). Focusing on tumors in the black cluster, GSEA showed enrichment for gene sets separating mesenchymal cells from luminal cells (Figure 1C, Additional file 8A), including low expression of Zeb1 target genes (Additional file 8B). Gene lists that define mammary stem cells demonstrated that this cluster also had a gene expression profile enriched for mammary stem cell-like features (Additional file 8C, D). In agreement, the majority of epithelial to mesenchymal transition (EMT) like tumors were observed in the black cluster (Figure 1A, Additional file 9). GSEA also demonstrated that tumors from the other clusters had gene expression profiles consistent with luminal cells (Additional file 10). For example, tumors within the blue cluster correlated with gene signatures for luminal progenitor cells and the orange cluster had similarities in gene expression to mature luminal cells. Together, these results define the characteristics of the tumors contained in the major clusters.

Fold change analysis

Given that unique initiating events in the tumor models should cause characteristic responses associated with the tumor initiating event, we used SAM to identify genes significantly altered within each model compared to all other models (Additional file 2). Fold change differences were also calculated between the tumors within a model and normal mammary glands in the corresponding genetic background (Additional file 3). As an example, we determined fold change gene expression differences for Neu initiated tumors (Figure 2A). Collectively, SAM analysis provided a collection of genes that are differentially expressed in each model.

Figure 2

Fold change analysis of Neu induced tumors compared to other tumor models. (A) The expression pattern for the top 50 significantly (q = 0, fdr = 0) upregulated and down regulated genes for Neu-induced tumors as identified by SAM is illustrated with the heatmap. Above the heatmap, black bars denote the model each sample corresponds to. Expression levels are depicted according to the colorbar beside the heatmap. (B) The bar graph shows the Bayes factor measuring the enrichment of predicted binding sites for the Krox family of transcription factors within upregulated genes from each model. The dotted line indicates a Bayes factor of 2.0. (C) Gene ontologies for upregulated genes in Neu induced tumors are depicted in the pie chart according to the color-coded categories. (D) Gene ontologies for upregulated genes in TAG induced tumors are depicted in the pie chart according to the listed color-coded categories. SAM, significance analysis of microarrays; TAG, large T antigen.

To identify possible transcription factors that could be active in mediating these gene expression changes, we annotated fold change results for each model using TRANSFAC (Additional files 2, 3). For example, for genes regulated by Neu (Figure 2A), we found that a significant number of genes had predicted binding sites for the Krox family of transcription factors (Figure 2B). The complete results for the transcription factor binding predictions are included in the additional data for each of the models.

We also annotated fold change differences between each model using gene ontologies (Additional files 2, 3). As an example of the utility of the method, we examined the similarities and differences in gene ontologies in the Neu and TAG models (Figure 2C, D). Both Neu and TAG tumors featured biological processes, metabolism and nucleic acid-related metabolism as major ontological categories. Key differences included Neu tumors with genes related to transport, ion transport and biosynthesis, categories not found with TAG gene expression changes. TAG tumors had major ontologies representing genes involved in cell cycle, cell organization, cytoskeleton organization and biogenesis, and cell organization and biogenesis. To expand upon gene ontology results we compared each model to all other models and separately to normal mammary gland using GSEA (Additional file 11). This analysis predicted unique features for all models including specific information on metabolism, microenvironment, metastasis and possible pathway activation (Figure 3). For example, TAG tumors had down regulation of genes significantly enriched for the citric acid cycle (TCA) (Figure 3A). Wnt tumors were predicted to have upregulation of tumor angiogenesis (Figure 3B). Not surprisingly, polyoma middle T (PyMT) tumors show enrichment for gene sets that predict metastasis (Figure 3C). Finally, GSEA results predicted that p53 mutant tumors may have increased TNF signaling activity (Figure 3D). Together, these results provide a catalogue of possible important features corresponding to the transcriptional outcomes of an initiating oncogene event.
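The enrichment score underlying a GSEA plot can be sketched as a running sum over a ranked gene list: the sum increases when a gene belongs to the gene set and decreases otherwise, and the score is the largest deviation from zero. The Python sketch below illustrates this with toy gene names and scores, all of which are hypothetical. The function name and the example data are assumptions, and the permutation-based P values reported by the Genepattern implementation are not reproduced.

```python
def enrichment_score(ranked, gene_set, p=1.0):
    """Signed maximum deviation of the GSEA-style running sum.

    ranked: list of (gene, score) pairs ordered from the most up-regulated
    to the most down-regulated gene; gene_set: set of gene names;
    p: exponent weighting each hit by |score| (p=0 gives equal weights).
    """
    hits = [abs(score) ** p if gene in gene_set else 0.0 for gene, score in ranked]
    hit_total = sum(hits)
    n_miss = sum(1 for gene, _ in ranked if gene not in gene_set)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list without covering it")

    running, best = 0.0, 0.0
    for (gene, _), weight in zip(ranked, hits):
        if gene in gene_set:
            running += weight / hit_total     # step up on a hit
        else:
            running -= 1.0 / n_miss           # step down on a miss
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical ranked list for one model compared with all other models.
ranked = [("a", 3.0), ("b", 2.0), ("c", 1.0), ("d", -1.0), ("e", -2.0), ("f", -3.0)]

es_high = enrichment_score(ranked, {"a", "b"}, p=0)   # set at the top of the list
es_low = enrichment_score(ranked, {"e", "f"}, p=0)    # set at the bottom of the list
print(es_high, es_low)
```

A positive score indicates that the gene set is concentrated among highly expressed genes, as described for the angiogenesis set in Wnt tumors, while a negative score corresponds to enrichment for low expression, as described for the TCA set in TAG tumors.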

Figure 3

Gene set enrichment analysis of mouse mammary tumor models. (A) A gene set for genes involved in the TCA cycle is significantly enriched (P < .0001) for low expression in TAG tumors. (B) A gene set for genes upregulated during tumor angiogenesis is significantly enriched (P = .019) for high expression in Wnt induced tumors. (C) A gene set for genes upregulated in breast cancer metastasis is significantly enriched (P = .02) for high expression in PyMT induced tumors. (D) A gene set for genes that are upregulated as a result of TNF signaling is significantly enriched (P < .0001) for high expression in p53 mutant tumors. PyMT, polyoma middle T; TAG, large T antigen; TCA, the citric acid cycle.

Pathway analysis

To expand the predictive analysis, we utilized a gene signature approach to predict pathway activation across mouse mammary tumors. The pathway prediction relationships between the various models were organized with unsupervised hierarchical clustering (Figure 4). Using this approach, we noted a large degree of heterogeneity within models. Myc tumors showed extensive variation in pathway activation profiles, spanning the spectrum of clusters. To understand better the heterogeneity and pathway activity within each model, we viewed the pathway predictions on a model-by-model basis (Additional file 12). For example, in PyMT induced tumors, there is a significant difference in predicted pathway activity between tumors from a FVB and AKXD genetic background (Additional file 13). Myc induced tumors with an EMT or squamous histology had distinct predicted pathway activities relative to tumors with a papillary or microacinar histology (Additional file 14). In Neu-induced tumors, we observed a major difference in predicted pathway activity between Neu tumors using the MMTV promoter and a Tet-on system to drive oncogene expression (Additional file 15). Taken together, these data demonstrate that tumor type, genetic background, and promoter result in key differences in pathway activity.

Figure 4

Unsupervised hierarchical clustering of pathway activation predictions in mouse mammary tumors. The dendrogram across the top illustrates the relationship between samples based on predicted pathway activation profiles. Below the dendrogram, the black bars mark tumor samples corresponding to the model listed on the same line. The heatmap illustrates the probability of pathway activation according to the color bar provided below the heatmap. The vertical dendrogram beside the heatmap illustrates pathways with similar predicted activity across the samples in the dataset.

To validate and illustrate the utility of pathway activation predictions for developing hypotheses about pathways that function in tumor progression, we identified models with clear pathway activity predictions. Previous genetic studies that correlate with these predictions are noted (Table 2). Demonstrating the validity of the gene signatures, we observe a large degree of agreement between pathways with predicted activity and results from previous investigations.

Table 2 Validation of pathway predictions

Comparisons to human breast cancer

With identification of pathways that function in tumor progression in mouse models, it is important to understand whether the given model is reflective of human breast cancer. To this end, we combined datasets for human breast cancer and the mouse mammary tumors in our database, removing both batch and platform effects (Additional file 16). To investigate the relationships between the mouse mammary tumors and human breast tumors, we used unsupervised hierarchical clustering. We identified a large number of mouse mammary tumor models that had similarities in gene expression profiles to human breast cancer (Figure 5). Importantly, Myc and Met induced tumors both recapitulate the heterogeneity observed in human breast cancer. Using histological annotations, specific relationships between Myc tumor types and human breast cancer subtypes were observed (Additional file 17). For example, Myc tumors with an EMT histology clustered together with human claudin low breast cancer. Extending this to the cluster of tumors predicted to have mesenchymal gene expression features (Figure 1C), we observed that a large majority of these tumors also clustered with claudin low breast cancer. Importantly, further investigation of these tumors matched marker expression for claudin low tumors (Additional file 18A-K). Together these data demonstrated that there are mouse models that share human breast cancer heterogeneity with individual tumor types that are closely related to subsets of human breast cancer at the level of gene expression.

Figure 5

Unsupervised hierarchical clustering of mouse mammary tumor and human breast cancer gene expression data. Across the top, the dendrogram illustrates the relationship between human and mouse tumor samples on the basis of gene expression profiles. The red bars mark the intrinsic subtype of each human tumor sample according to the annotation on the same line. The blue bars correspond to the mouse mammary tumor type. Below this, a heatmap shows the gene expression patterns for each sample, with expression values illustrated according to the color bar on the right. The dendrogram beside the heatmap shows the correlation between genes based on expression patterns across the samples in the dataset.

In addition to comparing mouse mammary tumors and human breast cancer with gene expression, we tested relationships using pathway activation predictions. Using a mixture modeling approach, we clustered human breast cancer into ten different groups based on pathway activation profiles (Figure 6). The pie chart above each heatmap shows the spectrum of the intrinsically annotated samples in each group. No single group was made up of one intrinsic subtype, illustrating the heterogeneity of pathway activation within and between intrinsic subtypes of breast cancer. After groups of human tumors were identified, the probability that an individual mouse mammary tumor belonged to a group of human breast cancer was calculated using the pathway activation profile of the mouse mammary tumor sample. Observing these probabilities with a heatmap, we noted that no single group of human breast cancer was modeled by a single mouse mammary tumor type at the pathway level. Instead, for each group of human breast cancer, multiple mouse models showed similar predicted pathway activation profiles. Further, these results demonstrated that mouse model relationships to human breast cancer extended beyond the initiating oncogene. For example, mouse tumors initiated by Myc overexpression contained several different tumor types, each modeling a different group of human breast cancer including those groups that have lower predicted Myc activity. Moreover, Neu initiated tumors using an inducible promoter frequently model a single group of human breast cancer (Additional file 19), while other Neu models have diverse pathway activation profiles leading to relationships with several different groups of human breast cancer. These results considered together highlight the similarities and differences between mouse models and human breast cancers.
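The step of assigning a mouse tumor to human tumor groups can be illustrated with a small Python sketch of posterior membership probabilities under a mixture model. The details of the mixture model in [2] are not given here, so the form used below (independent Gaussian components per pathway, with known weights, means and variances) is an assumption, as are the two hypothetical groups, the two-pathway profile and the function names.

```python
import math

def log_gaussian(x, mu, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

def group_posteriors(profile, groups):
    """Posterior probability that a pathway profile belongs to each group.

    Each group is a dict with a mixing 'weight' and per-pathway 'means'
    and 'variances' (assumed here to have been fitted on human tumors).
    """
    log_scores = []
    for g in groups:
        score = math.log(g["weight"])
        score += sum(log_gaussian(x, m, v)
                     for x, m, v in zip(profile, g["means"], g["variances"]))
        log_scores.append(score)
    top = max(log_scores)                       # subtract max for numerical stability
    unnorm = [math.exp(s - top) for s in log_scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Two hypothetical human groups described by two pathway probabilities each.
groups = [
    {"weight": 0.5, "means": [0.9, 0.1], "variances": [0.01, 0.01]},
    {"weight": 0.5, "means": [0.1, 0.9], "variances": [0.01, 0.01]},
]
mouse_profile = [0.8, 0.2]                      # hypothetical mouse tumor

posteriors = group_posteriors(mouse_profile, groups)
print([round(p, 3) for p in posteriors])
```

Under these assumptions, the hypothetical mouse tumor is assigned almost entirely to the first group, whose pathway profile it resembles; a heatmap of such probabilities across many tumors would show which human groups each mouse model resembles at the pathway level.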

Figure 6

Mixture modeling analysis of human breast cancer pathway heterogeneity and relationships to mouse models of breast cancer. Pie charts above each heatmap illustrate the distribution of the intrinsic subtype of samples in each group, according to the color-coded legend. The heatmap for groups 1 to 10 shows predicted pathway activity with probabilities corresponding to the color bar at the bottom of the figure. Below this, black bars mark the samples corresponding to annotations on the same line. Following the samples down to the heatmap below the black bars, the probability that a mouse model has similar pathway activation profiles is shown for each group. Probabilities for this heatmap are shown according to the color bar at the bottom of the figure.

Discussion

Here we have described the genomic analysis of a dataset composed of publicly available gene expression data for mouse models of breast cancer. These data have been analyzed through a variety of mechanisms to ask how mouse models are distinct, what properties they share and how they reflect human breast cancer. These data indicate that great care should be taken to appropriately choose the mouse model to use and that a genomic and histological characterization of tumors should be completed following experimentation.

In the examination of mouse models in the database, unsupervised hierarchical clustering revealed significant heterogeneity both between models and within models; this heterogeneity was pronounced in tumor models with a large number of samples. Between model differences were fully expected given the unique initiating events causing tumor formation. However, prior studies with relatively few samples for each model did not demonstrate extensive within model heterogeneity [18]. In comparison, we have demonstrated extensive heterogeneity within many models. In part this is due to differences between intrinsic clustering methods [80] and unsupervised hierarchical clustering. However, given that we have noted corresponding differences in fold change, GSEA predictions and pathway signature probabilities, it is likely that this is a reflection of the number of samples used in the analysis. As such, this provides an important caution to characterize a sufficiently large population of tumors to capture heterogeneity in the analysis.

Given that there is typically a predominant histological pattern associated with a given GEM tumor type [81], it is not surprising that there is a predominant genomic pattern. Indeed, we noted for many models that histology is predictive of the genomic subtype. Interestingly, this histological and genomic interaction is capable of spanning tumor initiating events from different mouse models. Indeed, EMT and spindle-type tumors from diverse models clustered together and were distinct from the non-EMT samples originating in the same model system. Thus, it is also critical for investigators to analyze all tumors from a given model for both histological and genomic patterns.

Mouse models were also investigated individually in comparison to the entire dataset using a variety of methods. This revealed characteristic gene expression patterns at the fold change level, specific GSEA enrichment effects and key pathway signature differences. In many cases, these results correlated with prior studies. For instance, annotation of fold change results predicted that Neu induced tumors upregulated Krox 20, which is consistent with previous chromatin immunoprecipitation (ChIP) results [82]. When pathway signatures were examined, there were a large number of predictions that could be made for pathways used in specific GEM tumor models. Importantly, while these pathway signatures have previously been validated [2], the model by model pathway predictions shown in Table 2 are highly consistent with previously published tests. For instance, the pathway signatures predicted a high probability of Src activation in PyMT tumors in the FVB background and recent work has demonstrated the necessity for c-Src in PyMT induced tumors [76]. Collectively, for the pathways listed in Table 2, we note agreement between the pathway signature predictions and the reported genetic crosses. Moreover, the pathway signature predictions are also reflective of additional mutations that accumulate in the samples. This was noted in the Myc and TAG induced tumors where the Ras signature was predicted to be elevated, consistent with the large number of Ras activating mutations in these strains [15, 77]. Given that numerous published genetic tests are in agreement with the pathway predictions, the remaining cell signaling pathway predictions offer a large number of testable hypotheses. In the future, pathway predictions in the various models should prove to be an important resource for initiating studies into the importance of various signaling pathways in tumor biology.

One of the key aspects of this study was the comparison between mouse models and human breast cancer. These data demonstrated similarities and differences between the two groups and should serve as an important consideration when attempting to extend the comparison of mouse models to human cancer. Taking into account the clustering data, we readily noted that the heterogeneity between human breast cancer samples was present within individual mouse models. Despite capturing the genomic diversity of the samples, we noted several samples with no genomic similarity to human breast cancer, including tumors from strains with other samples that had clear similarity to human breast cancer. This clearly suggests that if conclusions are to be drawn from mouse models of breast cancer, the mouse samples should be compared and clustered with a variety of human tumors.

In addition to clustering of genomic data, we compared mouse models to human breast cancer through signaling pathway activation predictions. These results showed that for any given group of human breast cancer samples, there was a mouse model with similar pathway activation profiles. Using these results, it is possible to select the mouse model that most closely represents a group of human breast cancer for the signaling pathways of interest. However, it is critical to consider both clustering and pathway activation and to combine these methods to choose the most appropriate model to mimic human breast cancer. For example, to model HER2+ breast cancer and to study the role of HER2 in tumor development, research initially used the MMTV-Neu mice [7]. However, the gene expression data reveal that this strain does not associate with the HER2+ human samples through genomic clustering. Nevertheless, mixture modeling indicated that a proportion of HER2+ human cancers did group with the MMTV-Neu samples at the level of pathway activation. This indicates that in some aspects the mouse model is appropriately related to human HER2+ breast cancer. Further, recent reports demonstrate that a strain of mice with conditional activation of Neu under the control of the endogenous promoter, which undergoes amplification [8], far more closely recapitulates human HER2+ breast cancer [21]. Taken together, these data illustrate the importance of fully characterizing and using all genomic information to select the appropriate model for examination.

Recent reports have described the development of serially transplantable human breast cancer samples that are grown in a murine host with clear genomic similarity to the primary human breast cancer samples [83] and obviously this is an optimal model for specific studies. However, there is clear utility for GEM models, especially with regard to the ability to ask defined genetic questions with regard to key signaling pathways in tumor biology. As such, the prior characterization of mouse and human breast cancer similarities was a critical development [18]. The expanded number of samples and methods of analysis in this report have clearly illustrated additional components of mouse breast cancer biology that require careful consideration. Indeed, the extent of genomic heterogeneity was only appreciated previously for select models [11, 15-17], but our work indicates that this is a general characteristic across the majority of breast cancer model systems. As such, this work underscores the requirement to fully characterize mouse tumor biology at histological and genomic levels before a valid comparison to human breast cancer may be drawn. Thus, we have provided the complete files for all of the comparisons made in this manuscript, from fold change between models to GSEA and pathway predictions, with the intent of this being used as a resource to choose and compare mouse models in breast cancer research.

Conclusions

Collectively, our work demonstrates genomic heterogeneity in mouse mammary tumor models. As an additional outcome of this research, we have provided a large scale predictive resource for each of the mouse models in the database. With heterogeneity driving a variety of relationships between individual mouse mammary tumors and human breast cancer, this work highlights the necessity of fully characterizing mouse tumor biology at molecular, histological and genomic levels before a valid comparison to human breast cancer may be drawn.

Additional files

Additional files 1, 2, 3, 6, 7, 11 and 12 are available for download at: https://www.msu.edu/~andrech1/BCR_Supplemental/BCR_Supplemental.html. The results of each analysis are provided as links to zipped folders as described below and are numbered according to their reference in the manuscript. Clicking on a link will begin the download of the zipped material.