Epigenetic transgenerational inheritance involves the germ line transmission of epigenetic marks between generations that alter genome activity and phenotype [1-3]. Environmental factors (for example, toxicants or nutrition) at a critical time during fetal gonadal sex determination have been shown to alter DNA methylation programming of the germline to promote the presence of imprinted-like sites that can be transmitted through the sperm to subsequent generations [1, 4]. Animals derived from a germ line with an altered epigenome have been shown to develop adult-onset disease or abnormalities such as spermatogenic cell defects, mammary tumors, prostate disease, kidney disease, immune abnormalities and ovarian defects [5-7]. The epigenetic transgenerational inheritance of such abnormal phenotypes has been shown to develop in F1 to F4 generations after environmental exposure of only an individual F0 generation gestating female [1]. Recently, we have found a variety of environmental toxicants (plastics, pesticides, dioxin (TCDD), hydrocarbons, and vinclozolin) can promote the epigenetic transgenerational inheritance of adult-onset disease phenotypes [8]. Similar observations of epigenetic transgenerational inheritance of altered phenotypes have been shown in worms [9], flies [10], plants [11], rodents [1, 5] and humans [12]. Environmentally induced epigenetic transgenerational inheritance provides an additional mechanism to consider in disease etiology and areas of biology such as evolution [2, 13]. The current study was designed to provide insights into how a male germ line with an altered epigenome can transmit a variety of altered disease states and phenotypes.

During migration down the genital ridge to colonize the fetal gonad, the primordial germ cells undergo an erasure of DNA methylation to allow a pluripotent state for the stem cell; then, at the onset of gonadal sex determination, DNA re-methylation is initiated in a sex-specific manner to generate the male or female germ line [2, 14, 15]. The germ line re-methylation is completed later in gonadal development. This developmental period in the mammal is the most sensitive to environmental insults for altering the epigenome (DNA methylation) of the male germ line [1, 2, 16]. After fertilization the paternal and maternal alleles are demethylated to, in part, develop the pluripotent state of the embryonic stem cells; re-methylation of these is then initiated at the blastula stage of embryonic development [2, 14]. A set of imprinted genes escapes this de-methylation to allow a specific DNA methylation pattern to be maintained and transferred between generations [17, 18]. The ability of environmentally induced epigenetic transgenerational inheritance to transmit specific epigenetic changes between generations suggests the germ line epimutations act similarly to imprinted-like sites that, although they undergo developmental programming, develop a permanently programmed DNA methylation pattern [2, 4]. Observations suggest environmentally induced epigenetic transgenerational inheritance involves the development of programmed epimutations in the germ line (sperm) that then escape the de-methylation after fertilization to transmit an altered epigenome between generations.

After fertilization the gametes transmit their genetics and epigenetics into the developing embryo and subsequently to all somatic cell types derived from the embryo. The altered sperm epigenome can then promote a cascade of altered epigenetic and genetic transcriptome changes into the developing cell types and tissues [19]. Therefore, the speculation is that all cells and tissues will have an altered transcriptome. These altered transcriptomes would appear throughout development to generate an adult tissue or cell type with an altered differentiated state associated with this transgenerational transcriptome [16, 19]. Previously, epigenetic transgenerational inheritance of an altered testis transcriptome [20] and ovarian granulosa cell transcriptome [7] has been observed. Although some tissues may be resistant to dramatic alterations in physiology due to these transcriptome changes, other tissues that are sensitive will have an increased susceptibility to develop disease [2, 7, 16, 20]. The current study was designed to investigate the epigenetic transgenerational inheritance of transcriptomes in a variety of different tissues and investigate potential gene bionetworks involved.

Gene expression of a specific cell type or tissue goes through a continuous cascade of changes from a stem cell through development to a stable adult differentiated state [7]. Similarly, the epigenome goes through a cascade of developmental changes to reach a stable epigenome in the adult associated with specific cell types [19]. The genetic and epigenetic components interact throughout development to promote the developmental and subsequent adult state of differentiation [16]. The classic paradigm for the regulation of gene expression involves the ability to alter promoter activity to regulate the expression of the adjacent gene. The epigenome plays an important role in this mechanism through histone modifications that fine tune the expression of the adjacent gene [21]. In contrast to histones, DNA methylation can be distal and not correlated with promoter regions, yet appears to regulate genome activity [22, 23]. Although major alterations in DNA methylation of promoters clearly can alter gene expression, distal regulatory sites also have an important role in gene regulation [22, 24]. One of the best examples of such a mechanism involves imprinted genes such as H19 and IGF2 [17]. The DNA methylation region of the imprinted gene in the promoter of the adjacent gene regulates allele-specific gene regulation for a wide number of genes. An additional role for these epigenetic DNA methylation sites can also be to influence distal gene expression through an imprinting control region (ICR) [23].

The ICR for IGF2 and H19 [17, 25] has been shown to act through long non-coding RNA (lncRNA) and distally for over a megabase in either direction to regulate the expression of multiple genes [26, 27]. Therefore, an epigenetic DNA methylation region can regulate the expression of a number of distal genes [17, 28]. Similar observations have also been made in plant systems [29, 30]. The speculation is made that a large family of epigenetic sites will have the ability to regulate the expression of multiple genes distally. These regions we term 'epigenetic control regions' (ECRs). The ICR previously identified will likely be a subset of a larger family of such regions not required to have an imprinted gene characteristic, but use a variety of mechanisms from non-coding RNA to chromatin structural changes. The current study was designed to identify the potential presence of such ECRs in the epigenetic transgenerational inheritance model investigated. The existence of such ECRs can help explain how subtle changes in the epigenome may have dramatic effects on the transcriptome of a cell type or tissue.

Environmentally induced epigenetic transgenerational inheritance of adult-onset disease and phenotypic variation [2] involves the germ line transmission of an imprinted-like epigenome (for example, DNA methylation) [4] that subsequently affects the transcriptomes of all cell types and tissues throughout the life of the individual derived from that germ line. The current study identifies transgenerational transcriptomes in all the tissues investigated in both female and male progeny. A systems biology approach was used to investigate the molecular and cellular pathways and processes common to the epigenetic transgenerational inheritance of the tissue transcriptomes identified. Gene bionetwork analysis was used to identify underlying gene networks that may provide insight into the epigenetic control of the differential gene expression. Combined observations identified potential ECRs that help explain, in part, how a tissue-specific transgenerational transcriptome was generated and how a subtle alteration in the germ line epigenome may promote adult-onset disease phenotypes.


Transgenerational transcriptomes

The experimental design involved developing F3 generation Harlan Sprague Dawley rat control and vinclozolin lineage male and female adult animals as previously described [1, 5]. The F0 generation gestating females were transiently exposed to vinclozolin or vehicle (DMSO) control during embryonic day 8 to 14 (E8 to E14), and the F1 generation offspring were then bred to produce the F2 generation, followed by production of the F3 generation as described in the Materials and methods. No sibling or cousin breedings were used to avoid any inbreeding artifacts. Animals were aged to 4 months and then sacrificed to collect from males the testis, seminal vesicle, prostate, liver, kidney and heart; and from females the ovary, uterus, liver, kidney and heart. A total of six different control and six different vinclozolin F3 generation lineage animals, each one from different litters, were used and microarrays were run on each tissue using three pools of two animals each. A total of 66 microarrays were run on F3 generation control and vinclozolin lineage male and female rat tissues (11 tissues, two lineages and three pools per lineage). The microarray data were obtained and compared for quality control as shown in Additional file 1. All microarrays within a tissue set compared well with no outliers, so all were used in subsequent data analysis. A comparison of control lineage and vinclozolin lineage tissues was made to identify the differentially expressed genes consistent between all animals and microarrays with a minimum of a 1.2-fold change in expression and mean difference of raw signal >10 as previously described [31]. As outlined in the Materials and methods, since a 20% alteration in gene expression can have cellular and biological impacts, particularly for transcription factors, the gene expression analysis used a 1.2-fold cutoff that had a statistical difference rather than minimizing the list with a more stringent cutoff value. The mean difference cutoff was used to eliminate background level signal expression changes.
Differential gene expression with a statistical significance of P < 0.05 was used to identify the differentially expressed gene sets for each tissue; these are termed the 'signature list'. These less stringent criteria led to a relatively larger number of genes for the subsequent network analysis, which can further filter out noisy signal using advanced soft thresholding techniques. The signature lists for all tissues are presented in Additional file 5, with the genes categorized functionally. A summary of the signature list gene sets is presented in Figure 1.
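The three cutoffs described above (fold change of at least 1.2, mean raw signal difference above 10, and P < 0.05) can be sketched as a simple per-gene filter. This is a minimal illustration only: the function name, its arguments and the use of group means are assumptions made for the example, the P-value is taken as already computed by whatever test was applied upstream, and the additional requirement of consistency between all animals and microarrays is not modeled.

```python
from statistics import mean

def in_signature_list(control, treated, p_value,
                      min_fold=1.2, min_mean_diff=10.0, alpha=0.05):
    """Return True if a gene passes the three cutoffs described in the text.

    control, treated: raw signal values for the control and vinclozolin
    lineage pools of one tissue (hypothetical input layout).
    p_value: P-value from an upstream statistical test (assumed given).
    """
    c, t = mean(control), mean(treated)
    if c <= 0 or t <= 0:
        return False  # a fold change is undefined for non-positive signals
    fold = max(c, t) / min(c, t)  # symmetric fold change, always >= 1
    return (fold >= min_fold
            and abs(t - c) > min_mean_diff
            and p_value < alpha)

# A gene at 100 vs 125 signal units passes; one at 10 vs 14 fails the
# mean difference cutoff even though its fold change is 1.4.
print(in_signature_list([100, 102, 98], [125, 130, 120], 0.01))
print(in_signature_list([10, 11, 9], [14, 15, 13], 0.01))
```

The second call illustrates why the mean difference cutoff matters: it removes genes whose apparent fold change arises from background level signal.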

Figure 1

Number of differentially expressed genes and pathways that overlap between signature lists. The total number of genes or pathways for a signature list is shown in bold and only pathways with three or more affected genes are counted. F, female; M, male; SV, seminal vesicle.

The general overlap of genes between the tissues and between males and females is shown in Figure 1. These differentially expressed genes in the various tissues represent transgenerational transcriptomes in the F3 generation. No predominant overlap with large numbers of differentially expressed genes was found between the different tissues or between male and female lists (Figure 1). A specific comparison of genes between the tissues for male and female is presented in Figure 2. Venn diagrams show the majority of differentially expressed genes are tissue-specific with negligible overlap among all tissues. Therefore, each tissue had a predominantly unique transgenerational transcriptome and negligible overlap was observed between male and female tissues.
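The overlap comparison behind such Venn diagrams amounts to set intersections between signature lists. The sketch below assumes each signature list is held as a set of gene identifiers keyed by a tissue label; the labels and gene names are hypothetical placeholders, not data from the study.

```python
from itertools import combinations

def pairwise_overlap(signature_lists):
    """Map each pair of tissue labels to the number of shared genes."""
    return {
        (a, b): len(signature_lists[a] & signature_lists[b])
        for a, b in combinations(sorted(signature_lists), 2)
    }

def shared_by_all(signature_lists):
    """Return the genes present in every signature list."""
    sets = list(signature_lists.values())
    return set.intersection(*sets) if sets else set()

# Hypothetical toy lists; real lists would hold the signature list genes.
lists = {
    "F-heart": {"g1", "g2", "g3"},
    "M-heart": {"g3", "g4"},
    "M-testis": {"g1", "g3", "g5"},
}
print(pairwise_overlap(lists))
print(shared_by_all(lists))
```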

Figure 2

Venn diagrams of male and female tissue signature lists of F3 generation vinclozolin lineage differentially expressed genes. (a) Female (F) heart, kidney, liver, uterus, and ovary. (b) Male (M) heart, kidney, liver, testis, and prostate. (c) Male kidney, testis, seminal vesicle (SV), and prostate. (d) Female heart and kidney and male heart and kidney. Numbers in brackets are the total number of genes in the signature list.

The specific differentially expressed genes were placed in Gene Ontology (GO) functional categories from Affymetrix annotations and similar trends were found among the different tissue signature lists and between the male and female lists. Therefore, no specific functional categories were predominant in any of the individual lists and no major differences exist. The categories are shown in Figure 3 for all tissues. Further analysis of specific cellular pathways and processes determined the number of genes associated with the various tissue signature lists. A list of the top 30 pathways or processes containing the highest number of altered genes is provided in Table 1. A more extensive list of differentially expressed genes correlating to specific pathways and processes is provided in Additional file 6. Observations demonstrate no predominant pathways or cellular processes were associated with the various signature lists. In contrast, a relatively large number of pathways and processes were influenced by all the tissue signature lists (Figure 1).

Figure 3

Number of genes differentially expressed in F3 generation vinclozolin lineage tissues and their distribution among main functional categories. (a) Male (M) heart, kidney, liver, testis, seminal vesicle (SV), and prostate. (b) Female (F) heart, kidney, liver, uterus, and ovary. ECM, extracellular matrix.

Table 1 Pathway enrichment for 11 male and female rat tissue signature lists

Gene bionetwork analysis

Gene networks were investigated using a previously described bionetwork analysis method [31] that utilizes all the array data to examine coordinated gene expression and connectivity between specific genes [32, 33]. Initially, cluster analysis of the differential gene expression lists was used to identify gene modules, which were then used to identify gene networks and functional categories. The connectivity index for individual genes is shown in Additional file 5 and the number of connections for each gene with a cluster coefficient for male and female list comparisons is shown in Additional file 2. A cluster analysis was performed on the combined male tissue signature lists, the combined female tissue signature lists and a combination of all female and male signature lists (Figure 4). Gene modules were identified that involved coordinated gene expression and connectivity between the genes assessed. The modules are shown in colors on the axes, with white indicating no connectivity and red highest connectivity (Figure 4). The heat diagram identified modules as boxed gene sets and assigned them a specific color. The combined male and female cluster analysis demonstrates strong modularity (Figure 4c), but the sexually dimorphic transgenerational transcriptomes identified in Figure 2 suggest that sex-specific cluster analysis and modules will be more informative, and these were used in all subsequent analyses. A list of sex-specific modules and represented gene sets is shown in Table 2. Identification of co-expressed gene modules is in effect a process to enhance the signal by filtering out noisy candidates using advanced soft thresholding and network techniques.
To assess the robustness of the approach with respect to different cutoffs for detecting differentially expressed genes, we also constructed additional male and female co-expression networks based on a more stringent mean difference cutoff of a 1.5-fold change in gene expression. The 1.5-fold networks have a smaller number of modules than their counterparts, but all the modules from the 1.5-fold networks significantly overlapped (Fisher's exact test P-values < 1.6e-7) with the modules identified in the previous networks based on a mean difference cutoff of 1.2-fold change in gene expression.
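The module overlap comparison can be illustrated with a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The sketch below is an assumption about how such a comparison could be set up, not the procedure used in the study: the function name is hypothetical, and the choice of gene universe (here passed in as `universe_size`) is left to the analyst.

```python
from math import comb

def overlap_p_value(module_a, module_b, universe_size):
    """One-sided Fisher's exact test: P(overlap >= observed).

    module_a, module_b: collections of gene identifiers.
    universe_size: number of genes that could have been assigned to a
    module (an assumption the analyst must choose).
    """
    a, b = set(module_a), set(module_b)
    k = len(a & b)                      # observed overlap
    n_a, n_b = len(a), len(b)
    total = comb(universe_size, n_b)    # all ways to draw module_b
    upper = min(n_a, n_b)
    tail = sum(comb(n_a, i) * comb(universe_size - n_a, n_b - i)
               for i in range(k, upper + 1))
    return tail / total

# Two 5-gene modules that coincide exactly within a 10-gene universe:
# only 1 of the 252 possible draws gives this overlap.
p = overlap_p_value(range(5), range(5), 10)
print(p)
```

A small P-value indicates that two modules share more genes than expected by chance, which is the sense in which modules from the 1.5-fold and 1.2-fold networks are said to overlap significantly.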

Figure 4

Gene bionetwork cluster analysis of 11 male and female tissues with corresponding gene modules. Topological overlap matrices of the gene co-expression network consisting of genes differentially expressed in 11 tissues of the F3 vinclozolin lineage compared to F3 control lineage animals. Genes in the rows and columns are sorted by an agglomerative hierarchical clustering algorithm. The different shades of color signify the strength of the connections between the nodes (from white signifying not significantly correlated to red signifying highly significantly correlated). The topological overlap matrix strongly indicates highly interconnected subsets of genes (modules). Modules identified are colored along both column and row and are boxed. (a) Matrices of the combined network for six male tissues. (b) Matrices of the combined network for five female tissues. (c) Matrices of the combined network for 11 male and female tissues.

Table 2 Overlap of male and female signature list genes with network modules

The correlation of the gene modules with cellular pathways and processes is shown in Additional file 7. A relatively even distribution is observed for the various pathways with no significant over-representation. As observed with the tissue signature lists, similar pathways with the largest numbers of genes affected are represented (Additional file 7). Therefore, no predominant cellular pathway or process was observed within the gene modules identified.

Gene network analysis was performed to potentially identify the distinct or common connections between the various tissue signature lists and gene modules identified. A direct connection indicates a functional and/or binding interaction between genes while indirect connections indicate the association of a gene with a cellular process or function. This analysis used the literature-based Pathway Studio software described in the Materials and methods. Analysis of the female gene modules identified only one module (turquoise) that had a direct connection network (Additional file 3A). The gene network analysis of the male modules found that the yellow, brown and turquoise modules have direct connections (Additional file 3). None of the other female or male modules had direct connection gene networks. Therefore, no specific gene networks were common between the gene modules. The possibility that the tissue signature lists of differentially expressed genes may contain gene networks was also investigated. The majority of tissue signature lists confirmed the direct connection gene networks (Additional file 4). Analysis of the individual tissue gene networks did not show any major overlap or common regulatory gene sets within the different gene networks. Therefore, each tissue acquires a different and unique gene network that is also distinct between the sexes (sexually dimorphic; Additional file 4).

The cluster analysis (Figure 4) identified gene modules containing genes with coordinated gene regulation, and a connectivity index was determined for each gene (Additional files 2 and 5). The top 10% of genes from each module with the highest connectivity index were combined for male (258 total genes) and female (75 total genes) gene modules, and gene networks identified for the male and female gene sets (Figure 5). The combined female gene module top 10% connectivity gene network identified only five directly connected genes as critical components of the network. This indicates the general lack of an underlying gene network in the female tissue modules. The combined male gene module network identified over 30 directly connected genes as critical components (Figure 5b). Although the tissue-specific gene networks are different and unique (Additional file 4), a combined gene network of the most highly connected and critical genes in the gene modules was identified for the male. Although a common gene network among the various tissues does not appear to be involved in the epigenetic transgenerational inheritance mechanism, a network involving the most connected genes between the tissues was identified for the male (Figure 5). Observations suggest additional molecular mechanisms may be involved.
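The selection of the most highly connected genes can be sketched as follows. The example assumes a weighted co-expression network in which the connection strength between two genes is the absolute Pearson correlation raised to a soft-thresholding power, and a gene's connectivity is the sum of its connection strengths. The power `beta=6`, the function names and the toy expression profiles are illustrative assumptions; the study's own connectivity index may have been computed differently.

```python
from math import sqrt, ceil

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx and syy else 0.0

def connectivity(expr, beta=6):
    """Sum of soft-thresholded connection strengths for each gene."""
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            strength = abs(pearson(expr[g], expr[h])) ** beta
            k[g] += strength
            k[h] += strength
    return k

def top_fraction(k, fraction=0.10):
    """Genes in the top fraction by connectivity (at least one gene)."""
    n_keep = max(1, ceil(len(k) * fraction))
    return sorted(k, key=k.get, reverse=True)[:n_keep]

# Hypothetical profiles: g1 and g2 are tightly co-expressed, g3 is not.
expr = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, -1, 1, -1]}
k = connectivity(expr)
print(k)
print(top_fraction(k))
```

Raising correlations to a power suppresses weak, likely noisy connections while preserving strong ones, which is the sense in which soft thresholding filters the network.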

Figure 5

Direct connection gene sub-networks for the top 10% interconnected genes from each module of the separate networks for female and male obtained by global literature analysis. (a) Female; (b) male. Directly connected genes only are shown according to their location in the cell (on the membrane, in the Golgi apparatus, nucleus, or cytoplasm or outside the cell). Node shapes: oval and circle, protein; diamond, ligand; circle/oval on tripod platform, transcription factor; ice cream cone, receptor; crescent, kinase or protein kinase; irregular polygon, phosphatase. Color code: red, up-regulated genes; blue, down-regulated genes. Arrows with a plus sign indicate positive regulation/activation; arrows with a minus sign indicate negative regulation/inhibition; grey arrows represent regulation; lilac arrows represent expression; purple arrows represent binding; green arrows represent promoter binding; yellow arrows represent protein modification.

Epigenetic control regions

The total number of all differentially expressed genes in the tissue signature lists was 1,298 for female and 3,046 for male (Figure 1). The possibility that the chromosomal location of these genes may identify potential regulatory sites was investigated. All the genes for the female and male were mapped to their chromosomal locations and then a sliding window of 2 Mb was used to determine the regions with a statistically significant (Z test, P < 0.05) over-representation of regulated genes (Figure 6a,b). The analysis identified gene clusters in regions 2 to 5 Mb in size on nearly all chromosomes that have a statistically significant over-representation of regulated genes (Table 3). Several ECRs are up to 10 Mb, which we suspect involves adjacent ECRs. As these regions were associated with the epigenetic transgenerational inheritance of these tissue-specific transcriptomes, we termed them 'epigenetic control regions'. The specific ECRs are presented in Figure 7 for the female and male combined signature lists. A comparison of the female and male tissue ECRs demonstrated many were in common. The common and sex-specific ECRs are shown in Figure 7. The number of differentially regulated genes associated with these ECRs ranged from 5 to 70 (Table 3). Selected ECRs from the male and female were mapped to demonstrate the differentially expressed genes in the ECRs (Figure 8). An ECR common between male and female in chromosome 10 is shown in Figure 8a. The ECRs may provide a coordinated mechanism to regulate a set of functionally related genes that are expressed in different tissues (Additional file 8). Therefore, a limited number of regulatory sites such as the identified ECRs could regulate tissue-specific and sexually dimorphic gene expression from similar regions. However, the current study was designed simply to identify the ECRs, and their functional role remains to be established. The genes within the male and female ECRs were used to generate gene networks.
The female ECR-associated genes generated a network with connection to cellular differentiation, cellular acidification and endocytosis (Figure 9a). The male ECR-associated genes generated a network linked with a larger number of cellular processes (Figure 9b). Therefore, no predominant gene network or cellular process was associated with the identified ECRs.
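The sliding window search for over-represented regions can be sketched for a single chromosome as below. This is a minimal illustration under stated assumptions: the null model treats each gene in a window as regulated with the genome-wide probability `p`, the Z statistic uses the normal approximation to the binomial, the one-sided cutoff 1.645 corresponds to P < 0.05, and the 50 kb step size is a hypothetical choice. The text specifies only the 2 Mb window and the Z test, so the remaining details are not the authors' stated procedure.

```python
from math import sqrt

def enriched_windows(all_pos, regulated_pos, window=2_000_000,
                     step=50_000, z_cut=1.645):
    """Return (start, end, z) for windows over-represented in regulated genes.

    all_pos: positions (bp) of all genes on one chromosome.
    regulated_pos: positions of the differentially expressed genes
    (assumed to be a subset of all_pos).
    """
    p = len(regulated_pos) / len(all_pos)  # background fraction regulated
    hits = []
    start, last = 0, max(all_pos)
    while start <= last:
        stop = start + window
        n = sum(start <= x < stop for x in all_pos)        # genes in window
        k = sum(start <= x < stop for x in regulated_pos)  # regulated genes
        if n > 0 and 0 < p < 1:
            z = (k - n * p) / sqrt(n * p * (1 - p))
            if z > z_cut:
                hits.append((start, stop, z))
        start += step
    return hits

# Toy chromosome: one gene per Mb, two adjacent regulated genes.
genes = [i * 1_000_000 for i in range(100)]
regulated = [10_000_000, 11_000_000]
for start, stop, z in enriched_windows(genes, regulated, step=1_000_000):
    print(start, stop, round(z, 2))
```

Overlapping significant windows would then be merged into a single region, which is one way that clusters larger than the 2 Mb window could arise. The normal approximation is rough when a window holds few genes, so a real analysis would want an exact or permutation-based check.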

Figure 6

Chromosomal locations of differentially expressed genes. (a) Chromosomal plot of differential gene expression (arrow head) and ECRs (box) for five female tissue types (heart, kidney, liver, ovaries and uterus). (b) Chromosomal plot of ECRs for six male tissue types (heart, kidney, liver, prostate, seminal vesicle and testis). (c) Chromosomal plot showing clustering of male tissues and female tissues. Insets show tissue identification color code. F, female; M, male; SV, seminal vesicle.

Table 3 Gene clusters and epigenetic control regions
Figure 7

Chromosomal plot showing gene clustering in epigenetic control regions of male tissues and female tissues overlapped with rat long non-coding RNA (arrowheads). Inset provides color code.

Figure 8

Representative epigenetic control regions (ECRs) identifying regulated genes in single ECRs. (a) ECR selected from male and female tissues combined (overlapped). (b) ECR selected from female only tissues. (c) ECR selected from male only tissues. The locations of all genes (total genes) on chromosomes 1, 2 and 10 are shown in megabases and regulated genes are named. The arrowhead identifies the location of a known rat long non-coding RNA.

Figure 9

Shortest cell processes connection gene sub-networks for genes of selected female ECR chr2-188.8 and male ECR chr1-204.75. (a) Female ECR chr2-188.8. (b) Male ECR chr1-204.75. Node shapes: oval and circle, protein; diamond, ligand; circle/oval on tripod platform, transcription factor; ice cream cone, receptor; crescent, kinase or protein kinase; irregular polygon, phosphatase. Color code: red, up-regulated genes; blue, down-regulated genes. Arrows with a plus sign indicate positive regulation/activation; arrows with a minus sign indicate negative regulation/inhibition; grey arrows represent regulation; lilac arrows represent expression; purple arrows represent binding; green arrows represent promoter binding; yellow arrows represent protein modification. AA, amino acid; FAS; FGF, fibroblast growth factor; INS, insulin; LRP2, low density lipoprotein receptor-related protein 2; ROS, reactive oxygen species.

Previously, the ICRs identified have been shown to be associated with lncRNAs. Similar distal regulation involving lncRNAs has also been shown in plants [29, 30]. The rat genome lncRNAs have not been fully characterized [34], but 20 rat lncRNAs have been reported. The possibility that these known rat lncRNAs may correlate with the identified ECRs was investigated (Figures 7 and 8). Interestingly, over half the known rat lncRNAs did correlate with the male and female ECRs. A full list of all these lncRNAs is provided in Additional file 9. Although more extensive characterization of the rat lncRNAs is required, those few known rat lncRNAs did correlate strongly with the identified ECRs. The functional role of these lncRNAs within the ECRs remains to be elucidated.
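One simple way to express the correlation between lncRNA locations and ECRs is a positional overlap check: does a lncRNA lie inside any ECR on the same chromosome? The sketch below assumes this definition, which the text does not spell out. The data layout, function name, and all coordinates and identifiers are hypothetical placeholders.

```python
def lncrnas_in_ecrs(lncrnas, ecrs):
    """Return names of lncRNAs located inside an ECR on the same chromosome.

    lncrnas: dict mapping name -> (chromosome, position in bp).
    ecrs: list of (chromosome, start bp, end bp) tuples.
    """
    hits = []
    for name, (chrom, pos) in lncrnas.items():
        if any(c == chrom and start <= pos <= end for c, start, end in ecrs):
            hits.append(name)
    return hits

# Hypothetical coordinates for illustration only.
lncrnas = {"lnc_a": ("chr10", 5_000_000), "lnc_b": ("chr2", 1_000_000)}
ecrs = [("chr10", 4_000_000, 7_000_000), ("chr2", 188_000_000, 190_000_000)]
print(lncrnas_in_ecrs(lncrnas, ecrs))
```

The same check applies to any other set of genomic features that is compared against the ECRs by chromosomal location.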

Vinclozolin-induced sperm epimutations associated with epigenetic transgenerational inheritance of adult-onset disease phenotypes have been reported [4]. Comparison of the chromosomal locations of 21 F3 generation sperm epimutations with the identified ECRs showed that they are correlated. Although specific sperm epigenetic alterations and clustered gene expression may be functionally related, the specific epigenetic modifications within the ECRs remain to be investigated.


Environmentally induced epigenetic transgenerational inheritance of adult-onset disease requires an epigenetically modified germline to transmit an altered baseline epigenome between generations [1, 2]. The current study utilized the commonly used agricultural fungicide vinclozolin [35], which has been shown to induce epigenetic transgenerational inheritance of disease [1, 5] and permanently alter the sperm epigenome (DNA methylation) [4]. Vinclozolin has been shown to promote in F3 generation lineage animals a number of adult-onset diseases, including of testis, prostate, kidneys, the immune system, and behavior and cancer [5, 36]. This high degree of a variety of adult-onset disease states suggests that baseline alteration of the sperm epigenome influences the subsequent development and function of most tissues and cell types [16]. Other factors shown to promote epigenetic transgenerational inheritance of disease include bisphenol A [8, 37], dioxin [8, 38], pesticides [1, 8], hydrocarbons (jet fuel) [8] and nutrition [39, 40]. Therefore, a number of environmental factors have been shown to promote epigenetic transgenerational inheritance of phenotypic variation and this occurs in most species [2]. The current study was designed to investigate how an altered germline epigenome promotes transgenerational adult-onset disease in a variety of different tissues.

Upon fertilization, the germline (egg or sperm) forms the zygote and the developing embryo undergoes a de-methylation of DNA to create the totipotent embryonic stem cell. As the early blastula embryo develops, DNA re-methylation is initiated, promoting tissue- and cell-specific differentiation [14, 15]. A set of imprinted gene DNA methylation regions are protected from this de-methylation event to allow the specific DNA methylation pattern/programming to be transmitted between generations [17, 41]. The identified vinclozolin-induced transgenerational alterations in the sperm epigenome (epimutations) [4] appear to be imprinted and transmit the altered DNA methylation regions between generations [2]. The mechanisms that allow a differential DNA methylation region to be protected from DNA de-methylation in the early embryo are not known, but are speculated to involve specific protein associations and/or other epigenetic factors. In addition, during early fetal gonadal development, the primordial germ cell DNA is de-methylated, which also involves imprinted genes. The imprinted sites are then re-methylated to maintain their original DNA methylation pattern/programming through unknown mechanisms. Therefore, how both imprinted sites and the transgenerational epimutations escape and/or reprogram to their original state remains to be elucidated and is a critical mechanism to be investigated in future studies. The epigenetic transgenerational inheritance of the altered sperm epigenome results in a modified baseline epigenome in the early embryo that will subsequently affect the epigenetic programming of all somatic cells and tissues [16, 19]. The epigenome directly influences genome activity such that an altered baseline epigenome will promote altered transcriptomes in all somatic cells and tissues [16]. The current study was designed to test this hypothesis and examine the transcriptomes of a variety of tissues.

The previously observed epigenetic transgenerational inheritance of adult-onset disease involved disease in a variety of different tissues (prostate, kidney, testis, ovary), but no apparent disease in other tissues (liver, heart) [5]. Previous clinical observations have demonstrated that some tissues are more highly susceptible to develop disease than others. An alteration in the baseline epigenome and transcriptome in certain tissues may increase susceptibility or promote disease, while other tissues can tolerate the alterations and maintain normal function. The environmentally induced epigenetic transgenerational inheritance of adult-onset disease may be due to a baseline alteration in epigenomes and transcriptomes in somatic cells of tissues susceptible to these changes and disease.

The experimental design involved the isolation of six different tissues from males and five tissues from females. These tissues were obtained from young adult rats prior to any disease onset. The F3 generation control and vinclozolin lineage animals from different litters were used and tissues obtained from six different animals for each sex, tissue and lineage. A microarray analysis was used to assess transgenerational alterations in the tissue-specific transcriptomes between control versus vinclozolin lineage animals. The differentially expressed genes for a specific tissue are referred to as a signature list. Analysis of the various tissue signature lists demonstrated negligible overlap among tissues or between sexes. Therefore, the transgenerational transcriptomes were observed in all tissues, but each tissue had a sexually dimorphic tissue-specific transgenerational transcriptome. The hypothesis that an altered transgenerational germline epigenome would promote transgenerational alterations in all somatic transcriptomes is supported by the observations of the current study. The initial bioinformatics analysis involved examination of the various tissue signature lists to correlate the involvement of cellular signaling pathways or processes among the various signature lists. The majority of pathways included genes from each signature list, but none were predominant among the signature lists. Gene functional categories that were generally predominant in the cell, such as signaling or metabolism, were also the most predominant among the signature lists. Therefore, a common pathway or process was not present among the observed transgenerational transcriptomes.

A more extensive analysis of the differentially expressed genes in all the tissues involved a previously described gene bionetwork analysis [31, 42]. The coordinated gene expression and connectivity between the regulated genes were considered in a cluster analysis (Figure 4). Gene modules of interconnected genes with coordinated gene expression were identified in both a combined male and female signature list analysis, and separate male and female analyses. Although defined modularity was identified in the combined analysis, the sexually dimorphic transgenerational transcriptomes and distinct tissue physiology suggested the separate male and female analyses would be more informative. The sex-specific modules were used to determine if any over-represented gene sets were present in specific tissues. Generally, each tissue had a specific module of differentially regulated genes (Table 2). For example, prostate was predominant in the male turquoise module and female heart in the female turquoise module. In contrast, in the analysis of cellular signaling pathways or processes, the gene modules did not have over-represented pathways (Additional file 7). The tissue-specific modules did not generally reflect a specific pathway or process. Therefore, the gene bionetwork analysis identified gene modules associated with specific tissues, but the modules did not generally contain predominant cellular pathways or processes.

The transgenerational transcriptome data analysis was extended with a literature-based gene network analysis. Direct connection networks (DCNs), involving genes with direct functional and/or binding links, were identified for a number of the male and female gene modules, but the majority did not have specific gene networks. Each DCN corresponds to a previously identified co-expressed gene module. Specifically, the nodes of a DCN were the members of the corresponding co-expressed gene module, but the links in the DCN were based on the literature and known databases. The modules with an identified gene network suggest that those specific tissues and abnormal physiology are potentially regulated by the network (Table 2; Additional file 3). These were the female turquoise module associated with the heart, the male yellow module associated with testis, the male brown module associated with kidney, liver and seminal vesicle, and the male turquoise module associated with prostate. Each of these gene networks is unique and provides a potential regulated gene set associated with abnormal tissue pathology. Future studies will need to consider these gene networks with regard to the pathophysiology of the specific tissues. An alternative gene network analysis involved the different tissue signature lists and tissue-specific direct connection gene network analysis (Additional file 4). Tissue-specific gene networks were identified for female heart, kidney, ovary and uterus, and for male heart, kidney and liver. Similar to the observed lack of overlap between the tissue-specific signature lists (Figure 2), negligible overlap was found between the tissue-specific gene networks (Additional file 4). These tissue-specific direct connection gene networks also provide regulated sub-networks of genes associated with the previously identified abnormal transgenerational tissue pathologies [5]. Interestingly, the gene network associated with the female turquoise module was similar to the female heart tissue-specific gene network. This regulated female heart network provides an interconnected gene set that could be investigated in future studies on heart pathophysiology. The final direct connection gene network analysis involved the combined male tissue and combined female tissue regulated gene sets. The combined female tissue network involved a small network of six genes, suggesting a gene network was not common among the different female tissues. The combined male tissue network involved a larger gene set of over 30 genes (Figure 5), which had elements similar to the male kidney network (Additional file 4). The similarities suggest this gene network may be associated with the observed kidney pathophysiology and needs to be investigated in future studies [5]. Although this combined male tissue direct connection gene network suggests a potential common regulatory gene set among the tissues, the tissue-specific transgenerational transcriptomes have negligible overlap (Figure 2) and distinct tissue-specific gene networks (Additional file 4). Observations suggest the transgenerational somatic transcriptomes are primarily tissue-specific without common gene networks or specific pathways associated with the adult-onset disease that developed in the specific tissues.

To understand how a limited number of sperm epimutations can lead to such a diverse gene expression profile between tissues, an epigenetic mechanism needs to be considered. As discussed, somatic cells and tissues will have a shift in the baseline epigenome derived from sperm that promotes distinct cellular and tissue differentiation [16, 19]. Therefore, it is not surprising each cell type has a distinct epigenome and transcriptome to promote cell-specific differentiated functions. The classic dogma that a gene's promoter is the central regulatory site involved in regulating its expression is not sufficient to explain the over 4,000 genes differentially regulated between the different tissues examined (Figure 1). A potential alternative epigenetic mechanism involves an ECR that can regulate gene expression within a greater than 2 Mb region together with, for example, lncRNAs and chromatin structure. An example of such a mechanism has been previously described as an ICR, where an imprinted DNA methylation site (for example, H19 and IGF2) influences a lncRNA to regulate gene expression for over a megabase in either direction [17, 22, 23, 27]. The imprinted H19 and IGF2 loci together with a lncRNA have been shown to distally regulate the expression of multiple different genes [17, 25, 26, 28]. These ICRs are likely a small subset of a larger set of ECRs, most not involving imprinted gene sites. Another example has been shown in plants where lncRNAs regulate distal gene expression associated with specific plant physiological phenotypes [29, 30]. The current study used the various tissue transgenerational transcriptomes to identify the potential presence of ECRs.

The ECRs were defined as having a statistically significant (Z test) over-representation of gene expression within an approximately 2 Mb region. The male and female sets of differentially expressed genes were used separately to identify regions with statistically significant (Z test) over-representation (P < 0.05). The differentially expressed genes were mapped to the chromosomes and then a 2 Mb sliding window was used to identify potential ECRs (Figures 6 and 7). For the male, over 40 ECRs were identified, and for the female, approximately 30 ECRs were identified. Approximately half the ECRs were found to be in common between male and female (Figure 7). The ECRs identified ranged from 2 to 5 Mb in size and the numbers of genes regulated ranged from 5 to 50 (Table 3). Interestingly, different genes in different tissues were found to be expressed within these ECRs (Additional file 8). The majority of the expression sites of currently known rat lncRNAs correlated with the identified ECRs (Figure 7; Additional file 9). Therefore, it is proposed that a single ECR could regulate tissue-specific gene expression that has been programmed during differentiation to express a specific set of genes within the ECR. This could explain how a limited number of epimutations could have a much broader effect on genome activity and clarify how tissue-specific transgenerational transcriptomes develop. The current study outlines the association of gene expression with the potential ECRs, but does not provide a functional link between epigenetic differential DNA methylation regions or lncRNAs and gene expression regulation within them. Therefore, future studies are now critical to assess the functional role of these ECRs and underlying epigenetic mechanisms.


A systems biology approach was taken to elucidate the molecular mechanism(s) involved in environmentally induced epigenetic transgenerational inheritance of adult-onset disease. The current study identifies tissue-specific transgenerational transcriptomes with tissue-specific gene networks. A combination of epigenetic and genetic mechanisms is required to reach these differentiated tissue states that cannot be explained through genetic or epigenetic mechanisms alone. The identification of potential epigenetic control regions that regulate regions of the genome in a coordinated manner may help explain in part the mechanism behind the process of emergence [43]. In a revolutionary systems biology consideration the emergence of a phenotype or process involves the coordinated and tissue-specific development of unique networks (modules) of gene expression [44]. Since the initial identification of epigenetics [45], its role in system development at the molecular level has been appreciated. The current study suggests a more genome-wide consideration involving ECRs and tissue-specific transcriptomes may contribute, in part, to our understanding of how environmental factors can influence biology and promote disease states.

Combined observations demonstrate that environmentally induced epigenetic transgenerational inheritance of adult-onset disease [2] involves germline (sperm) transmission of an altered epigenome [4] and these epimutations shift the baseline epigenomes in all somatic tissues and cells derived from this germline [16]. This generates tissue-specific transgenerational transcriptomes that associate with the adult-onset disease in the tissues but do not involve common gene networks or pathways. All tissues develop a transgenerational transcriptome, which helps explain the phenotypic variation observed. Some tissues are sensitive to shifts in their transcriptomes and develop disease, while others are resistant to disease development. The observation that all tissues develop a specific transgenerational transcriptome can help explain the mechanism behind complex disease syndromes. Those tissues sensitive to developing disease will be linked into a complex disease association due to these transgenerational transcriptome modifications. This epigenetic mechanism involves ECRs that can have dramatic effects on genome activity and promote tissue-specific phenomena. Although the functional roles of these ECRs remain to be investigated, their potential impact on expanding our concepts of gene regulation, the elucidation of emergent properties of unique gene networks, and providing links to various tissue functions and diseases are anticipated to be significant. The observations provided help elucidate the molecular mechanisms involved in environmentally induced epigenetic transgenerational inheritance of adult-onset disease and the phenotypic variation identified.

Materials and methods

Animal procedures

All experimental protocols involving rats were pre-approved by the Washington State University Animal Care and Use Committee. Hsd:Sprague Dawley®™ SD®™ female and male rats of an outbred strain (Harlan, Indianapolis, IN, USA) were maintained in ventilated (up to 50 air exchanges per hour) isolator cages containing Aspen Sani chips (pinewood shavings from Harlan) as bedding, on a 14 h light: 10 h dark regimen, at a temperature of 70°F and humidity of 25% to 35%. Rats were fed ad libitum with standard rat diet (8640 Teklad 22/5 Rodent Diet; Harlan) and ad libitum tap water for drinking.

At proestrus as determined by daily vaginal smears, the female rats (90 days of age) were pair-mated with male rats (120 days). On the next day, the pairs were separated and vaginal smears were examined microscopically. In the event sperm were detected (day 0) the rats were tentatively considered pregnant. Vaginal smears were continued for monitoring diestrus status until day 7. Pregnant rats were then given daily intraperitoneal injections of vinclozolin (100 mg/kg/day) with an equal volume of sesame oil (Sigma, St. Louis, MO, USA) on days E8 through E14 of gestation [6]. Treatment groups were Control (DMSO vehicle) and Vinclozolin. The pregnant female rats treated with DMSO or vinclozolin were designated as the F0 generation.

The offspring of the F0 generation were the F1 generation. The F1 generation offspring were bred to other F1 animals of the same treatment group to generate an F2 generation, and then F2 generation animals were bred similarly to generate the F3 generation animals. No sibling or cousin breedings were performed so as to avoid inbreeding. Note that only the original F0 generation pregnant females were injected with the DMSO or vinclozolin.

Six female and six male rats of the F3 generation Control and Vinclozolin lineages at 120 days of age were euthanized by CO2 inhalation and cervical dislocation. Tissues, including testis, prostate, seminal vesicle, kidney, liver, heart, ovary and uterus, were dissected from rats and were processed and stored in TRIZOL (Invitrogen, Grand Island, NY, USA) at -80°C until RNA extraction. High quality RNA samples were assessed with gel electrophoresis and required a minimum OD260/280 ratio of 1.8. Three samples each of control and treated ovaries were applied to microarrays. For each of three Vinclozolin or Control microarray samples, RNA from two rats was pooled. The same pair of rats was used for each tissue type.

Microarray analysis

The microarray hybridization and scanning were performed by the Genomics Core Laboratory, Center for Reproductive Biology, Washington State University, Pullman, WA using standard Affymetrix reagents and protocol. Briefly, mRNA was transcribed into cDNA with random primers, cRNA was transcribed, and single-stranded sense DNA was synthesized, which was fragmented and labeled with biotin. Biotin-labeled single-stranded DNA was then hybridized to the Rat Gene 1.0 ST microarrays containing more than 30,000 transcripts (Affymetrix, Santa Clara, CA, USA). Hybridized chips were scanned on an Affymetrix Scanner 3000. CEL files containing raw data were then pre-processed and analyzed with Partek Genomic Suite 6.5 software (Partek Incorporated, St Louis, MO, USA) using an RMA (Robust Multiarray Average), GC-content adjusted algorithm. Raw data pre-processing was performed in 11 groups, one for each male or female tissue. Comparison of array sample histogram graphs for each group showed that data for all chips were similar and appropriate for further analysis (Additional file 1).

The microarray quantitative data involve signals from an average 28 different oligonucleotides (probes) arrayed for each transcript and many genes are represented on the chip by several transcripts. The hybridization to each probe must be consistent to allow a statistically significant quantitative measure of the resulting gene expression signal. In contrast, a quantitative PCR procedure uses only two oligonucleotides and primer bias is a major factor in this type of analysis. Therefore, we did not attempt to use PCR-based approaches as we feel the microarray analysis is more accurate and reproducible without primer bias.

All microarray CEL files from this study have been deposited with the NCBI gene expression and hybridization array data repository Gene Expression Omnibus (GEO series accession number [GSE35839]) and can also be accessed through the Skinner Laboratory website [46]. For gene annotation, Affymetrix annotation file RaGene1_0stv1.na32.rn4.transcript.csv was used.

Network analysis

The network analysis was restricted to genes differentially expressed between the control and the treatment groups based on previously established criteria of fold change of group means ≥1.2, a mean difference >10, and P-value ≤ 0.05. A change in gene expression of 20% for many genes, particularly transcription factors, has been shown to have important cellular and biological effects. Therefore, the 1.2-fold cutoff was selected to maintain all expression information, rather than a more stringent cutoff chosen simply to reduce the gene list size. To eliminate baseline signal gene expression changes, a mean difference >10 was used. All genes required a statistical difference P < 0.05 to be selected. The union of the differentially expressed genes from the tissues resulted in 5,266 genes for males and 1,909 for females being identified and used for constructing a weighted gene co-expression network [47, 48]. Unlike traditional un-weighted gene co-expression networks in which two genes (nodes) are either connected or disconnected, the weighted gene co-expression network analysis assigns a connection weight to each gene pair using soft-thresholding and thus is robust to parameter selection. The weighted network analysis begins with a matrix of the Pearson correlations between all gene pairs, then converts the correlation matrix into an adjacency matrix using a power function: f(x) = x^β. The parameter β of the power function is determined in such a way that the resulting adjacency matrix (that is, the weighted co-expression network) is approximately scale-free. To measure how well a network satisfies a scale-free topology, we use the fitting index proposed by Zhang and Horvath [47] (that is, the model fitting index R^2 of the linear model that regresses log(p(k)) on log(k), where k is connectivity and p(k) is the frequency distribution of connectivity). The fitting index of a perfect scale-free network is 1.
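The soft-thresholding step and the scale-free fitting index described above can be illustrated with a minimal sketch. It is not the code used in the study. The function names, the use of the absolute correlation (an unsigned network), and the equal-width binning of connectivity are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles.
    Returns 0.0 when either profile is constant (a convention assumed here)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0

def soft_threshold_adjacency(expr, beta):
    """Weighted adjacency a_ij = |cor(i, j)|^beta with a zero diagonal.
    expr is a list of per-gene expression profiles across samples."""
    n = len(expr)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = abs(pearson(expr[i], expr[j])) ** beta
    return adj

def connectivity(adj):
    """Connectivity k of each gene: the sum of its connection weights."""
    return [sum(row) for row in adj]

def scale_free_fit(k, n_bins=10):
    """R^2 of the linear regression of log10 p(k) on log10 k.
    Connectivity is discretized into equal-width bins (assumed binning)."""
    lo, hi = min(k), max(k)
    width = (hi - lo) / n_bins or 1.0
    counts, sums = [0] * n_bins, [0.0] * n_bins
    for v in k:
        b = min(int((v - lo) / width), n_bins - 1)
        counts[b] += 1
        sums[b] += v
    xs, ys = [], []
    for c, s in zip(counts, sums):
        if c > 0 and s > 0:
            xs.append(math.log10(s / c))       # mean connectivity in the bin
            ys.append(math.log10(c / len(k)))  # frequency p(k) of the bin
    if len(xs) < 3:
        return float("nan")  # too few occupied bins to fit a line
    r = pearson(xs, ys)
    return r * r
```

In practice one would scan a range of candidate β values, compute `scale_free_fit(connectivity(adj))` for each, and prefer a β whose fitting index is close to 1.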

To explore the modular structures of the co-expression network, the adjacency matrix is further transformed into a topological overlap matrix [49]. The topological overlap between two genes reflects not only their direct interaction but also their indirect interactions through all the other genes in the network. Previous studies [47, 49] have shown that topological overlap leads to more cohesive and biologically meaningful modules. To identify modules of highly co-regulated genes, we used average linkage hierarchical clustering to group genes based on the topological overlap of their connectivity, followed by a dynamic cut-tree algorithm to dynamically cut clustering dendrogram branches into gene modules [50]. Such networks were generated from the combined 6 male or 5 female differentially expressed gene sets (2 networks) or from the combined male and female 11-tissue signature lists. From 9 to 20 modules were identified in each of the 3 networks, and the module size ranged from 7 to 1,040 genes.
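As an illustration of this step, the sketch below computes a topological overlap matrix from a weighted adjacency matrix and groups genes by average linkage clustering on the dissimilarity 1 - TOM. The overlap formula is the standard weighted form, which is assumed to match the cited method. The fixed number of modules is a deliberate simplification that stands in for the dynamic cut-tree algorithm, and the function names are hypothetical.

```python
def topological_overlap(adj):
    """Topological overlap for a symmetric weighted adjacency matrix:
    TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij),
    where u runs over all genes other than i and j."""
    n = len(adj)
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            t = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = t
    return tom

def average_linkage_modules(tom, n_modules):
    """Agglomerative average-linkage clustering on the distance 1 - TOM.
    Merging stops at a fixed number of modules, a simplification used here
    in place of a dynamic cut of the dendrogram."""
    clusters = [[i] for i in range(len(tom))]

    def dist(a, b):
        # Average pairwise dissimilarity between two clusters
        return sum(1.0 - tom[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_modules:
        _, p, q = min((dist(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]
```

With a block-structured adjacency matrix, genes in the same block have a high overlap because they share strong neighbors, so they merge early and end up in the same module.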

To distinguish between modules, each module was assigned a unique color identifier, with the remaining, poorly connected genes colored grey. The hierarchical clustering over the topological overlap matrix (TOM) and the identified modules is shown (Figure 4). In this type of map, the rows and the columns represent genes in a symmetric fashion, and the color intensity represents the interaction strength between genes. This connectivity map highlights that genes in the transcriptional network fall into distinct network modules, where genes within a given module are more interconnected with each other (blocks along the diagonal of the matrix) than with genes in other modules. Several network connectivity measures exist, but a particularly important one is the within-module connectivity. The within-module connectivity of a gene was determined by taking the sum of its connection strengths (co-expression similarity) with all other genes in the module to which the gene belonged.
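The within-module connectivity can be sketched as follows. This is a minimal illustration rather than the study's implementation: the function names and the dictionary-based module assignment are assumptions, and the most connected gene per module is reported as a candidate hub gene.

```python
def within_module_connectivity(adj, modules):
    """Sum of a gene's connection strengths with the other genes of its module.
    adj is a symmetric weighted adjacency matrix; modules maps a gene index
    to its module label (for example a color name)."""
    k_in = {}
    for g, label in modules.items():
        k_in[g] = sum(adj[g][h] for h, other in modules.items()
                      if h != g and other == label)
    return k_in

def hub_genes(adj, modules):
    """Gene with the highest within-module connectivity in each module."""
    k_in = within_module_connectivity(adj, modules)
    hubs = {}
    for g, label in modules.items():
        if label not in hubs or k_in[g] > k_in[hubs[label]]:
            hubs[label] = g
    return hubs
```

Connections to genes outside the module are ignored, so a gene that is strongly connected only across modules receives a low within-module connectivity.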

Gene co-expression cluster analysis clarification

Gene networks provide a convenient framework for exploring the context within which single genes operate. Networks are simply graphical models composed of nodes and edges. For gene co-expression clustering, an edge between two genes may indicate that the corresponding expression traits are correlated in a given population of interest. Depending on whether the interaction strength of two genes is considered, there are two different approaches for analyzing gene co-expression networks: 1) an unweighted network analysis that involves setting hard thresholds on the significance of the interactions; and 2) a weighted approach that avoids hard thresholds. Weighted gene co-expression networks preserve the continuous nature of gene-gene interactions at the transcriptional level and are robust to parameter selection. An important end product from the gene co-expression network analysis is a set of gene modules in which member genes are more highly correlated with each other than with genes outside a module. Most gene co-expression modules are enriched for GO functional annotations and are informative for identifying the functional components of the network that are associated with disease [51].

This gene co-expression clustering/network analysis (GCENA) has been increasingly used to identify gene sub-networks for prioritizing gene targets associated with a variety of common human diseases such as cancer and obesity [52-56]. One important end product of GCENA is the construction of gene modules composed of highly interconnected genes. A number of studies have demonstrated that co-expression network modules are generally enriched for known biological pathways, for genes that are linked to common genetic loci and for genes associated with disease [42, 47, 51-55, 57, 58]. In this way, one can identify key groups of genes that are perturbed by genetic loci that lead to disease, and that define disease states at the molecular level. Furthermore, these studies have also shown the importance of the hub genes in the modules associated with various phenotypes. For example, GCENA identified ASPM, a hub gene in the cell cycle module, as a molecular target of glioblastoma [55] and MGC4504, a hub gene in the unfolded protein response module, as a target potentially involved in susceptibility to atherosclerosis [53].

Pathway and functional category analysis

Resulting lists of differentially expressed genes for each male or female tissue were analyzed for gene functional categories with GO categories from the Affymetrix annotation site. Each module generated in the male or female network analysis was analyzed for KEGG (Kyoto Encyclopedia of Genes and Genomes, Kyoto University, Japan) pathway enrichment using the KEGG website 'Search Pathway' tool. Global literature analysis of various gene lists was performed using Pathway Studio 8.0 software (Ariadne Genomics, Inc., Rockville, MD, USA) and used to generate the direct and indirect gene connection networks.

Chromosomal location of ECRs

R code was developed to find the chromosomal locations of ECRs. A 2 Mb sliding window with 50,000 base intervals was used to find the associated genes in each window. A Z-test statistical analysis with P < 0.05 was used on these windows to find the ones with over-representation of differentially expressed genes. The consecutive windows with over-represented genes were merged together to form clusters of genes termed ECRs. Typical ECR regions range from 2 to 5 Mb, with the largest being 10 Mb.
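The sliding-window procedure can be sketched as below, written in Python rather than R and not taken from the study's code. The exact form of the Z test is not given in the text, so the sketch assumes a one-sided z-score of each window's gene count against the mean and standard deviation of counts across all windows on the chromosome, with a cutoff of 1.645 (one-sided P < 0.05). The function name and parameters are hypothetical.

```python
import math

def find_ecrs(positions, chrom_length, window=2_000_000, step=50_000,
              z_cut=1.645):
    """Return merged (start, end) regions enriched for differentially
    expressed genes on one chromosome.

    positions: base-pair positions of differentially expressed genes.
    Assumed test: z = (count - mean) / sd over all windows, one-sided.
    """
    starts = list(range(0, max(chrom_length - window, 0) + 1, step))
    counts = [sum(1 for p in positions if s <= p < s + window)
              for s in starts]
    n = len(counts)
    mean = sum(counts) / n
    sd = math.sqrt(sum((c - mean) ** 2 for c in counts) / n)
    if sd == 0:
        return []  # no variation between windows, nothing is enriched

    ecrs = []
    prev_idx = None  # index of the last significant window
    for idx, (s, c) in enumerate(zip(starts, counts)):
        if (c - mean) / sd > z_cut:
            if ecrs and prev_idx == idx - 1:
                # Consecutive significant window: extend the current region
                ecrs[-1] = (ecrs[-1][0], s + window)
            else:
                ecrs.append((s, s + window))
            prev_idx = idx
    return ecrs
```

Because adjacent windows overlap by all but 50,000 bases, a dense cluster of genes makes a run of consecutive windows significant, and merging that run yields a single region larger than the 2 Mb window.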