Background

Pregnancy and parturition require an intricate interplay between maternal and fetal factors, orchestrated by the placenta, which lies at the interface between mother and fetus. The placenta performs multiple functions critical for fetal survival, growth, and development, including transport of gases, nutrients, and waste products, hormone production, protection of the fetus from maternal immune attack, and anchorage of the fetus to the uterus [1]. The role of the placenta as a key organ of pregnancy is well demonstrated by the fact that placental pathology is associated with adverse maternal and fetal outcomes such as preterm birth (PTB), intrauterine growth restriction (IUGR), and preeclampsia (PE) [1–3].

The value of placental examination is well recognized in the setting of PTB, for instance, which complicates over 12% of all pregnancies in the U.S. [3–5]. Histological examination of the placenta, which is frequently carried out to explore possible causes of preterm delivery, has been a useful tool for identifying lesions commonly associated with PTB, such as chorioamnionitis [3]. In cases where no remarkable histologic abnormalities are found, investigation into molecular alterations causing placental dysfunction could provide insight into the pathogenesis of prematurity.

The normal function of the placenta depends on its structural integrity, and the proper growth and development of its structural components require the finely tuned regulation of relevant genes. Thus, alterations in gene expression and RNA processing may represent one of the major molecular mechanisms underlying pathological pregnancies. Previously, numerous studies have investigated changes in global human placental gene expression associated with gestational age [6], physiologic labor [7, 8] or pathological conditions [9]. The two most comprehensive gene expression profiling studies related to the placenta used microarray analysis to characterize four different components of the human placenta in 76 individuals [10] and the mouse placenta over the whole course of pregnancy [11]. Although those microarray studies have provided useful insights into the placental transcriptome, they were limited in depth in that they only examined gene-level expression changes, and did not have the resolution to investigate the complexity of the placental transcriptome that arises from changes in RNA processing.

Alternative splicing (AS) is a common mechanism of gene regulation in higher eukaryotes, occurring in over 90% of multi-exon genes in the human genome [12, 13]. AS is regulated by complex interactions between cis-acting splicing elements and trans-acting factors [14]. Many splicing regulators have tissue-specific expression patterns, resulting in widespread differences in AS patterns across different tissues. In addition to playing a critical role in regulating normal gene functions, AS is also frequently involved in diseases [15, 16]. Previous studies have revealed associations between AS of individual genes and human pregnancy complications [17–19]. For example, the soluble isoform of the fms-like tyrosine kinase-1 (sFlt1) arising from AS and polyadenylation is significantly up-regulated in placentas of women with PE [19], and encodes a potent inhibitor of the vascular endothelial growth factor (VEGF) [18]. Despite such interesting anecdotal examples, the global patterns of AS of human genes have not been examined systematically in the placenta.

In this study, we used high-throughput RNA-Seq to conduct a genome-wide analysis of the normal placental transcriptome. RNA-Seq is a powerful technology for transcriptome analysis that allows global characterization of gene expression and AS at the nucleotide resolution [20]. Given the heterogeneity in tissue composition of the placenta and the importance of both fetal and maternal factors in normal and pathological pregnancy, we separately examined three placental tissue components: the amnion and chorion of fetal origin, and the maternally derived decidua [1]. The amnion and chorion were obtained from the extraplacental membranes (reflected membranes), which provide a purer source of the fetal membranes compared with those overlying the chorionic plate. The decidua was dissected from the surface of the basal plate of the placenta, which has close relevance to normal placental physiology. We observed a wide spectrum of gene-level and exon-level transcriptome differences both between placenta and other human tissues and between distinct compartments of the placenta. Our work provides the first high-resolution profiles of gene expression and AS characteristic of different parts of the normal human placenta.

Results

Overview of the RNA-Seq data

We sequenced pooled mRNA of amnion, chorion, and decidua separately taken from five normal term placentas (3 from male infants and 2 from female infants). For each of the placental tissues, we generated 2 lanes of paired-end Illumina RNA-Seq data with read lengths of 54 bp and 72 bp and 23-33 million reads per lane, for a total of 50-60 million paired-end reads per tissue. Based on the sequencing error profile, we used only 50 bp of each end for mapping and analysis. In addition, we obtained the Illumina Human Body Map 2.0 (HBM2.0) data with 73-83 million 50 bp paired-end reads from 16 normal non-placental human tissues (adipose, adrenal, brain, breast, colon, heart, kidney, liver, lung, lymph node, ovary, prostate, skeletal muscle, testes, thyroid and white blood cells). We mapped the sequence reads of each tissue to the reference human genome sequence (hg19) as well as all possible exon-exon junctions (Ensembl genes, r57). We obtained a high mapping rate with 70-90% and 7-10% of reads mapped to the reference genome and exon-exon junctions, respectively (Table S1 in Additional file 1). Of the mapped paired-end reads, 70-80% were uniquely mapped pairs and were used for subsequent analysis.

Global analysis of gene expression in placenta and other human tissues

Using the uniquely mapped read pairs, we estimated the expression levels of 22,523 protein-coding genes (Ensembl genes, r57) in each tissue using the "Fragments Per Kilobase of gene per Million mapped fragments" (FPKM) metric [21] in a way similar to RPKM [22] (see details in Methods). With a coverage depth ranging from 50 to 80 million paired-end reads per tissue, we detected the expression (i.e. FPKM > 0) of the majority of the protein-coding genes (66-84% for each of the 19 tissues). Approximately half of the genes were expressed with FPKM > 1 (Table S2 and Figure S1 in Additional file 1). We investigated the similarity in the global gene expression profiles among the three placental compartments and 16 HBM2.0 tissues using average linkage hierarchical clustering of the top 1,000 most divergent genes (Figure 1). The three placental tissues clustered more closely with one another than with the other 16 tissues, suggesting the existence of a placenta-specific gene expression signature. In addition, we also observed genes with distinct expression patterns among amnion, chorion, and decidua, indicating that each compartment of the placenta has its unique expression signature, possibly reflecting differences in their functions and/or biological activities.
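As a rough illustration of the FPKM metric referred to above, the following Python sketch computes FPKM from a fragment count, a gene length, and a library size. The function name and the example numbers are hypothetical and are not taken from this study; the actual procedure is the one described in Methods.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of gene per million mapped fragments.

    Illustrative helper (not the authors' code):
    FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments).
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical gene: 1,500 fragments over 3,000 bp in a library of
# 50 million uniquely mapped fragments.
example_fpkm = fpkm(1500, 3000, 50_000_000)

# A gene counts as "detected" when FPKM > 0, as in the text above.
detected = example_fpkm > 0
```

Normalizing by both gene length and library size is what makes values comparable across genes and across tissues sequenced to different depths.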

Figure 1

Hierarchical clustering analysis of differentially expressed genes among placental and other human tissues. We calculated expression levels of 51,682 Ensembl genes in each tissue and selected those expressed with FPKM > 5 in 8 or more tissues, which were then ranked based on their coefficient of variation (CV). The heat map was generated by average linkage hierarchical clustering of the top 1,000 differentially expressed genes, using 1-Pearson correlation coefficient as the distance metric. Scaled expression values are color-coded according to the legend in the top left corner.
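The gene selection and distance metric described in this legend can be sketched as follows. This is a minimal standard-library illustration, not the authors' implementation: the function names and toy data are hypothetical, the use of the population standard deviation for the CV is an assumption, and the average-linkage clustering step itself is left to a clustering library.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0 if the mean is 0)."""
    m = mean(values)
    return pstdev(values) / m if m > 0 else 0.0


def select_variable_genes(expr, min_fpkm=5.0, min_tissues=8, top_n=1000):
    """Keep genes above min_fpkm in at least min_tissues tissues, ranked by CV."""
    kept = [g for g, v in expr.items()
            if sum(x > min_fpkm for x in v) >= min_tissues]
    kept.sort(key=lambda g: coefficient_of_variation(expr[g]), reverse=True)
    return kept[:top_n]


def pearson_distance(x, y):
    """1 - Pearson correlation; returns 1.0 when either profile is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return 1.0
    return 1.0 - cov / (sx * sy)


# Toy expression table (gene -> FPKM per tissue); thresholds are lowered
# so that the four-tissue example is meaningful.
toy = {
    "A": [10, 10, 10, 10],   # expressed but invariant
    "B": [6, 60, 6, 60],     # expressed and highly variable
    "C": [1, 1, 1, 100],     # fails the expression filter
    "D": [6, 8, 10, 12],     # expressed, mildly variable
}
selected = select_variable_genes(toy, min_tissues=3, top_n=2)
```

With this distance, two tissues whose expression profiles rise and fall together are close even if their absolute expression levels differ.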

To obtain a more detailed picture of genes potentially important for normal placental function, we compared the RNA-Seq gene expression profiles between the three placental compartments and the 16 HBM2.0 tissues to identify two types of genes with preferential expression in the placenta: (1) placenta-enriched genes, defined as genes with an FPKM value of at least 1 and greater than 4-fold difference in FPKM between any of the three placental tissues and the average of the 16 non-placental tissues as similarly defined in another study [11]; and (2) placenta-specific genes, defined as genes whose RNA-Seq reads were only detected in the placenta but not in any of the 16 non-placental tissues.
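The two definitions above can be expressed as simple predicates. The sketch below is illustrative only: the function names, argument layout, and example values are hypothetical, and the handling of a zero background is an assumption.

```python
from statistics import mean


def enriched_tissues(placental_fpkm, other_fpkm, min_fpkm=1.0, min_fold=4.0):
    """Return placental tissues where a gene is 'placenta-enriched'.

    placental_fpkm: dict mapping tissue name -> FPKM of the gene.
    other_fpkm: FPKM values of the gene in the non-placental tissues.
    A tissue qualifies if FPKM >= min_fpkm and FPKM exceeds min_fold times
    the non-placental average (a zero average therefore passes the fold test).
    """
    background = mean(other_fpkm)
    return [t for t, v in placental_fpkm.items()
            if v >= min_fpkm and v > min_fold * background]


def is_placenta_specific(placental_reads, other_reads):
    """True if reads are seen in some placental tissue and in no other tissue."""
    return (any(n > 0 for n in placental_reads.values())
            and all(n == 0 for n in other_reads))


# Hypothetical gene with a non-placental average FPKM of 1.0.
hits = enriched_tissues(
    {"amnion": 8.0, "chorion": 2.0, "decidua": 0.5}, [1.0] * 16)
specific = is_placenta_specific(
    {"amnion": 12, "chorion": 0, "decidua": 3}, [0] * 16)
```

Note that the two categories are not mutually exclusive: a placenta-specific gene with sufficient expression would also satisfy the enrichment criterion.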

We identified 938, 865, and 944 genes with at least 4-fold enriched expression in amnion, chorion, and decidua, respectively, as compared to non-placental tissues, including 216 genes shared among the three compartments of the placenta. We also used a similar strategy to generate a list of 758 placenta-enriched genes using the GeneAtlas microarray data set covering whole placental and other human tissues [23] (see Methods for further details). Among the 758 array-based placenta-enriched genes, 297 were found to be enriched in one of the 3 placental tissues according to our RNA-Seq data, representing a significant overlap between the array and RNA-Seq results (p = 2.2e-119, Fisher's exact test). The difference between the array and RNA-Seq based gene lists could be due to the difference in platforms as well as in tissue samples used for expression profiling. We also used a similar approach to identify tissue-enriched genes in each of the 16 HBM2.0 tissues (15 other HBM2.0 tissues were used as the background). Of all 19 tissues, the three placental tissues were among the tissues with the highest number of tissue-enriched genes, with only testes, brain and white blood cells topping the placental tissues (Figure 2a).
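The significance of an overlap between two gene lists, as tested above, can be sketched with a one-sided Fisher's exact (hypergeometric) test. The gene universe used for the reported p-value is not restated in this section, so the example relies on small made-up numbers; whether the authors used a one- or two-sided test is likewise an assumption here.

```python
from math import comb


def overlap_p_value(universe, list_a, list_b, overlap):
    """One-sided Fisher's exact test for the overlap of two gene lists.

    Returns P(X >= overlap), where X is hypergeometric: list_b genes are
    drawn from a universe that contains list_a 'marked' genes.
    """
    upper = min(list_a, list_b)
    numerator = sum(comb(list_a, i) * comb(universe - list_a, list_b - i)
                    for i in range(overlap, upper + 1))
    return numerator / comb(universe, list_b)


# Toy example: 20 genes in total, two lists of 5 genes sharing 3 genes.
p_toy = overlap_p_value(universe=20, list_a=5, list_b=5, overlap=3)
```

Working with exact integer binomial coefficients avoids floating-point underflow until the final division, which matters when p-values are as small as those reported in the text.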

Figure 2

Analysis of placenta-enriched and -specific genes. (a) Number of tissue-enriched (blue bar) and tissue-specific (red bar) genes. Tissue-enriched genes were defined as genes with more than 4-fold change in expression and minimum FPKM of 1. (b) Proportions of overlapping genes between the placenta-enriched gene list and the MGI or PTB gene list (see text and Methods for details). The lighter shade indicates the proportion of non-placenta-enriched genes while the darker shade indicates the proportion of placenta-enriched genes. P-values were determined by Fisher's exact test. (c) Expression profile of the 70 placenta-enriched MGI list genes. Gene expression values were normalized for each gene and color-coded using the same scheme depicted in Figure 1. (d) Expression patterns of placenta-specific genes in amnion, chorion, and decidua. Color scheme is based on log10(FPKM value).

The RNA-Seq data also allowed us to identify genes whose expression was restricted to the placenta (i.e. not a single read detected in any of the 16 non-placental tissues). We identified a total of 170 placenta-specific genes in the three placental compartments combined. We also used the same criteria to identify tissue-specific genes within the 16 HBM2.0 tissues. Consistent with the pattern observed for the tissue-enriched genes, the three placental tissues were among the tissues with the highest number of tissue-specific genes, only after testes and brain. Taken together, these data indicate abundant tissue-specific activation of gene transcription in the placenta.

Genes enriched in or specific to the placenta play important roles in placental function and pregnancy-related diseases

In order to understand the functional significance of the genes with enriched expression (EE) in the placenta, we asked whether these genes have been implicated in placental biology and/or pregnancy disorders. We compiled two lists of human genes using the Mouse Genome Informatics (MGI) database [24, 25] and the PTBGene database [26, 27]. The MGI list consisted of human genes whose mouse orthologs are associated with abnormal placental phenotypes when disrupted. The PTB list consisted of genes collected from the literature on genetic association studies on preterm birth (PTB). We found that the placenta-enriched genes overlapped with 70 genes (19%, p = 1.9e-9) in the MGI list and 20 genes (24%, p = 1.7e-5) in the PTB list, significantly overrepresented compared to random expectation (Figure 2b).

Many of the genes associated with placental abnormalities in mice (see the heat map of their expression patterns in Figure 2c) were previously known to be involved in physiological and pathological processes related to pregnancy, with examples including prolactin receptor (PRLR) and insulin-like growth factor 2 (IGF2). The PTB list was particularly enriched with interleukin-1 (IL1)-related genes, including IL1R1, IL1RN, IL1B, and IL1A. We also found genes overlapping with both the MGI and PTB lists, such as coagulation factor II (thrombin) receptor (F2R) and vascular endothelial growth factor A (VEGFA).

To gain more insight into key processes that may possibly explain functional differences among the three placental tissues, we carried out functional annotation analysis of placenta-enriched genes identified in each of the three placental tissues compared with the other 16 human tissues using DAVID [28, 29]. The analysis revealed significant enrichment (p < 0.05 after Bonferroni correction) of Gene Ontology (GO) terms and KEGG pathways involved in a wide range of biological processes, including focal adhesion, vasculature development, wound healing, and extracellular matrix (ECM)-receptor interaction (Table 1). Of particular note is that there was no significantly enriched GO term shared among all three placental tissues, indicating that each compartment of the placenta has its unique profile of active genes involved in different biological processes.
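For readers unfamiliar with the multiple-testing threshold mentioned above, a Bonferroni adjustment can be sketched as follows. The term names and p-values are invented for illustration, and DAVID's own implementation may differ in detail.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three annotation terms.
raw = {"term_A": 0.001, "term_B": 0.02, "term_C": 0.5}
adjusted = dict(zip(raw, bonferroni(list(raw.values()))))

# Terms that remain significant at the 0.05 level after correction.
significant = [term for term, p in adjusted.items() if p < 0.05]
```

The correction is conservative, so terms that survive it are unlikely to be false positives, at the cost of missing weaker but real enrichments.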

Table 1 Gene Ontology (GO) analysis of placenta-enriched genes

Although there was no GO annotation shared by all three compartments, we identified several biologically relevant enriched categories that overlap between the two membranous compartments amnion and chorion. For example, epithelium development, one of those categories, explains a common compositional feature that exists between the two tissues with both at least partially consisting of a layer of cells that are epithelial in origin (the amniotic epithelium and extravillous cytotrophoblast) [30]. The enrichment of cell/biological adhesion-related genes supports the role of the two membranes as a barrier protecting the fetus from external mechanical force, which requires substantial involvement of cell adhesion molecules. Of note is that we also observed an overrepresentation of mesoderm development in both tissues when we performed our analysis using a different annotation system PANTHER [31, 32], which reflects a common structural feature shared by the two membranes.

Among the non-overlapping GO terms, it was noted that there was significant overrepresentation of vascular-related GO terms such as blood vessel development, vasculature development, blood vessel morphogenesis, and angiogenesis in the chorion, while these terms were absent from the amnion, an avascular tissue. One of the genes belonging to these categories is VEGFA, which is an extensively studied gene that acts as a signal triggering the induction of angiogenesis [33] and has been implicated in pregnancy complications [34–36].

We found that three GO terms are significantly enriched for the decidua with female pregnancy being the most enriched category, consistent with the role of decidua as a principal source of hormones and cytokines pivotal in the maintenance of pregnancy. It was noted that many of the genes associated with female pregnancy have also been implicated in pregnancy-related disorders. These genes include transforming growth factor beta 1 (TGFB1) and placental growth factor (PGF) in PE [37–39] and corticotropin releasing hormone (CRH) in preterm labor or delivery [40, 41].

For placenta-specific genes, we further removed genes with extremely low FPKM values (< 0.3) in the placental tissues, which could represent genes with universal low expression in all tissues but sampled by RNA-Seq in the placenta by chance. This led to a final set of 24 placenta-specific well-annotated protein-coding genes with FPKM > 0.3 in at least one placental tissue. The placenta-specific genes are highly enriched for genes encoding pregnancy-related hormones, including pregnancy-specific glycoproteins (PSGs), chorionic somatomammotropin hormones (CSHs), and chorionic gonadotropin, beta polypeptides (CGBs) (Figure 2d).

Expression profiles of splicing factors (SFs) in placental and other human tissues

The deep RNA-Seq data also allowed us to go beyond whole transcript level changes, to identify transcript isoform changes due to pre-mRNA alternative splicing (AS). Splicing factors (SFs) are RNA binding proteins that play key roles in AS regulation [14]. Tissue- and cell-type specific expression of SFs is a major mechanism that drives AS differences among human tissues [42]. For example, brain-specific SFs NOVA1, NOVA2, and FOX1 control a large number of brain-specific AS events [43]. The epithelial-specific splicing factor ESRP1 is transcriptionally silenced during the epithelial-to-mesenchymal transition, which flips the switch off for a genome-wide epithelial splicing regulatory network [44].

To identify SFs with a placenta-specific increase or decrease in expression levels, we compiled a list of sixty well-studied SFs [14, 45], and analyzed their RNA-Seq FPKM gene expression levels in the placenta and 16 other human tissues. Hierarchical clustering of the 60 SFs revealed a sub-cluster among the three placental compartments (Figure 3a), consistent with the clustering pattern based on all genes (Figure 1). This cluster analysis recapitulated the known tissue-specific expression patterns of SFs, such as the brain-specific expression of NOVA1, NOVA2, FOX1 (also known as A2BP1), and BRUNOL4. Interestingly, we identified several SFs with compartment-specific changes in expression levels in the placenta, most notably ESRP1 (in amnion) and MBNL3 (in decidua) (Figure 3b), which we confirmed by qRT-PCR (Figure S2 in Additional file 1). ESRP1 and MBNL3 are known to regulate splicing of a large number of genes in epithelial cells [46] and during myogenic differentiation [47], respectively, suggesting a unique set of AS events in individual placental compartments downstream of these master splicing regulators. We also identified several ubiquitously expressed SFs with a significant difference in expression levels among the three placental compartments. For example, FOX2 (also known as RBM9), an important splicing regulator in the heart, muscle, and neurons [14], was expressed two-fold higher in amnion compared to chorion and decidua. Together, the expression profiles of SFs suggest tissue-specific regulation of AS between the placenta and other tissues and between different compartments of the placenta.

Figure 3

Expression profile of splicing factors in placental and other human tissues. (a) Heat map showing the expression levels of 60 selected splicing factors across all 19 tissues. Scaled expression values are color-coded according to the legend in the top left corner. Genes and tissues were both clustered by average linkage hierarchical clustering using 1-Pearson correlation coefficient as the distance metric. (b) Expression levels of 3 splicing factors differentially expressed between placental and other human tissues. Each bar labeled HBM2.0 (in blue) represents the mean expression value of all 16 HBM2.0 tissues.

RNA-Seq and RT-PCR analysis of exon skipping events in placental and other human tissues

To directly identify AS differences between the placenta and other human tissues, we calculated the exon inclusion level (Ψ) of alternatively spliced cassette exons in each tissue using RNA-Seq reads that are uniquely mapped to the upstream, downstream, and skipping exon-exon junctions of alternatively spliced exons as previously described [13]. We used a Bayesian approach, MATS (Multivariate Analysis of Transcript Splicing) [48], to perform pairwise comparisons of tissue pairs to test if the difference in Ψ of any alternatively spliced exon between two tissues exceeds 10% (see Methods for details). Between the three compartments of the placenta, approximately 0.1% of exons were found to be differentially spliced (FDR < 0.1). In contrast, there was a much greater degree of splicing difference between placental and other human tissues, with 1.6% of exons, on average, being differentially spliced between one of the placental tissues and one of the 16 HBM2.0 tissues (Figure 4a). It should be noted that given the moderate sequencing depth of 50-83 million reads per tissue, this analysis is expected to have an appreciable level of false negatives. The true extent of splicing differences among these tissues could be considerably larger.
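A junction-count estimate of the exon inclusion level can be sketched as below. This uses a commonly applied form in which the two inclusion junctions are averaged before being compared with the skipping junction; the exact normalization used in [13] and in MATS may differ, and the function names and counts are hypothetical. The sketch computes point estimates only and does not reproduce the Bayesian test.

```python
def exon_inclusion_level(ujc, djc, sjc):
    """Estimate Ψ from upstream (UJC), downstream (DJC) and skipping (SJC)
    junction read counts. Returns None when no junction reads are available.
    """
    inclusion = (ujc + djc) / 2  # inclusion is supported by two junctions
    total = inclusion + sjc
    if total == 0:
        return None
    return inclusion / total


def exceeds_threshold(psi_a, psi_b, cutoff=0.1):
    """True if two inclusion levels differ by more than the cutoff."""
    return abs(psi_a - psi_b) > cutoff


# Hypothetical counts for one cassette exon in two tissues.
psi_tissue_1 = exon_inclusion_level(ujc=90, djc=90, sjc=10)   # mostly included
psi_tissue_2 = exon_inclusion_level(ujc=40, djc=60, sjc=50)   # half included
differs = exceeds_threshold(psi_tissue_1, psi_tissue_2)
```

Because Ψ depends on a handful of junction reads per exon, exons in lowly expressed genes yield noisy estimates, which is one reason a statistical model is used rather than a direct comparison of point estimates.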

Figure 4

Summary of differential splicing events identified by RNA-Seq. (a) Number (percentage in parentheses) of exons with differential inclusion levels (|ΔΨ| > 0.1, FDR < 0.1) between given tissue pairs. (b) Venn diagrams showing the distribution and overlap of exons (left) and genes (right) in the three placental tissues that are differentially spliced between the placental and non-placental tissues.

In order to boost the power of RNA-Seq splicing analysis and obtain a robust set of splicing differences between the placental and non-placental tissues, we pooled the RNA-Seq data of all HBM2.0 tissues. We then compared the pooled data to that of each placental tissue. We identified 393, 637, and 402 differentially spliced exons (in 275, 464, and 289 genes) when comparing the pooled non-placental tissues to amnion, chorion, and decidua, respectively (Figure 4b). 129 exons (in 76 genes) were shared among the three placental tissues. On the other hand, the majority (74%) of differentially spliced exons identified were restricted to only one of the three placental tissues as compared to the non-placental tissues (Figure 4b). Importantly, among the 744 genes containing differentially spliced exons between placental and non-placental tissues, we observed a significant enrichment for genes in the MGI list (2.8% over 1.4% for the genome background, p = 0.001 based on Fisher's exact test), indicating the importance of tissue-specific AS in placental function and development. For example, one of these exons (ENSE00000882762) was in integrin, alpha 6 (ITGA6), which forms heterodimers with other integrin components and plays a crucial role in cell adhesion and migration [49, 50]. We observed a high inclusion level of this exon in amnion and chorion compared to most of the other tissues, with close to 100% exon inclusion in amnion as validated by fluorescently labeled RT-PCR (Figure 5a). Exon (ENSE00001385284) in another integrin gene ITGB4 was frequently skipped in the placental tissues (Figure 5b). TCIRG1 (T-cell, immune regulator 1, ATPase, H + transporting, lysosomal V0 subunit A3) is another differentially spliced gene with multiple known isoforms produced by AS [51, 52]. As shown in Figure 5c, the inclusion level of one of its exons (ENSE00000736978) was significantly lower in amnion.

Figure 5

Examples of exons with splicing differences between placental and HBM2.0 tissues. (a) Exon ENSE00000882762 in ITGA6. (b) Exon ENSE00001385284 in ITGB4. (c) Exon ENSE00000736978 in TCIRG1. Shown on the left-hand side are wiggle plots of RNA-Seq read coverage and RT-PCR gel images for validation of differential splicing events generated for placental and HBM2.0 tissues. UJC, DJC, and SJC indicate upstream, downstream, and skipping junction counts, respectively. Star mark in (c) indicates an additional alternatively spliced product detected by using the given primer pairs. Represented on the right-hand side are histograms showing exon inclusion levels obtained from RNA-Seq (blue bar) and RT-PCR (red bar) experiments. The values represented by red bars correspond to the numbers shown on the top of the gel pictures.

To further confirm the RNA-Seq results of exon splicing, we randomly selected 34 exons in total (including the 3 aforementioned exons) for fluorescently labeled RT-PCR. Using an independent set of term placental samples (N = 4) that were not used in the RNA-Seq experiments, we validated the predicted differential splicing events of 27 exons, yielding a validation rate of 79%. The RNA-Seq difference in exon inclusion levels between the placental tissues and the pooled non-placental tissues strongly matched the RT-PCR results (Pearson's correlation coefficient = 0.78) (Figure 6a).

Figure 6

Validation of differentially spliced exons between placental and other tissues. (a) Correlation of exon inclusion level differences between placental and HBM2.0 tissues estimated by RNA-Seq (x-axis) and by RT-PCR (y-axis). The dots are color-coded based on the placental compartment to which the values for other tissues were compared. The grey line indicates y = x. Two dashed lines indicate the 0.1 inclusion level difference, which was used to select target exons for validation. (b) Significant enrichment of ESRP1 targets among exons that are differentially spliced between amnion and other tissues. The darker and lighter shades indicate the proportions of exons with and without splicing differences (according to RNA-Seq) between amnion and other tissues, respectively. P-value was determined by Fisher's exact test. (c) An example of ESRP1 target exons differentially spliced in amnion. Shown are a wiggle plot of RNA-Seq read coverage for MINK1 (top) and a gel image of RT-PCR products (bottom). Exon inclusion level for each tissue is shown on the top of the gel picture. Star mark in gel picture (c) denotes PCR products of unexpected sizes possibly resulting from the usage of cryptic splice sites.

The splicing factor ESRP1 regulates tissue-specific splicing in amnion

The placenta-specific increase in the expression levels of certain master splicing regulators such as ESRP1 and MBNL3 raises the possibility that downstream exon targets of these regulators may have altered splicing activities in the placenta compared with non-placental tissues. To test this, we studied the splicing factor ESRP1, which had 5.4-fold higher expression in amnion over the average of the 16 HBM2.0 tissues (Figure 3b). Of note, among the exons validated by RT-PCR as differentially spliced between amnion and non-placental tissues, several were known ESRP1 targets (such as those in ITGA6, LAS1L, MAP3K7, LRRFIP2 and KIF13A; see Figure 6a). To assess the overall enrichment of ESRP1 target exons among differentially spliced exons in amnion, we collected 167 RT-PCR validated ESRP1 target exons from our previous genome-wide analysis of ESRP1-regulated splicing events in epithelial and mesenchymal cells [46]. Of the 167 known ESRP1 target exons, 131 were expressed and detectable in our data. Among them, a significantly enriched set of 20 exons exhibited differential splicing in amnion compared to other human tissues according to RNA-Seq data (Fisher's exact test, p = 4.3e-33) (Figure 6b).

Given our moderate sequencing depth in the placental tissues, it is possible that additional ESRP1 target exons with differential splicing in amnion were missed by RNA-Seq. We therefore selected an additional 21 ESRP1 target exons besides the aforementioned 5 validated exons for RT-PCR analysis, resulting in 26 exons tested in total. Seven of those exons did not have any RNA-Seq reads, presumably due to their relatively low expression levels and the limited coverage depth of our sequencing data. We confirmed that 12 of the 26 ESRP1 target exons showed more than 10% changes in splicing in amnion, with known ESRP1-enhanced exons having increased splicing activities, and known ESRP1-silenced exons having decreased splicing activities. One of the validated ESRP1 target exons was in misshapen-like kinase 1 (MINK1), which has an important role in cell adhesion and motility [53]. The exon (ENSE00001213221) in MINK1, a known ESRP1 target, had an inclusion level of > 90% in amnion, approximately 20-30% higher than those observed for other human tissues (Figure 6c). The increased splicing activity of this MINK1 exon was consistent with the previous observation that ESRP1 positively regulates the splicing of this exon [46].

Analysis of pathways influenced by tissue-enriched expression and differential splicing in placenta

The differential gene- and exon-level expression patterns observed between the placental and non-placental tissues may underlie gene pathways that have key roles in the normal biology of the placenta. To identify pathways and molecular networks influenced by placenta-specific gene expression and splicing, we constructed functional interaction (FI) networks [54] covering genes with enriched expression (EE) and genes with differential splicing (DS) in amnion, chorion and decidua compared to other human tissues. These genes were used as query sets and projected onto a functional interaction network of human genes constructed from diverse genomic data sources [54]. We used the edge betweenness algorithm [55] to find functional modules in the network, each of which contained enriched functional annotation terms (pathways) that describe the biological roles of genes that are grouped together.

The results of our analysis performed on each of the three placental tissues showed significant enrichment of many functional pathways (Table S3 in Additional file 2), including those involved in the regulation of SMAD2/3 signaling, TGF-beta receptor signaling, and HIF-1 alpha TF network, which were significantly overrepresented in module 0 of each of the amnion, chorion, and decidua FI networks (module 0 of the chorion FI network is shown in Figure 7).

Figure 7

Functional interaction network analysis of genes with enriched expression (EE) and differential splicing (DS) in the placenta. Shown is module 0 of the interaction network constructed for chorion. Circular node: a query gene. Diamond-shaped node: a linker gene. Node color was determined based on whether the query gene shows EE (green), DS (pink), or both (red). Two large circular clusters represent highlighted significantly enriched pathways in chorion: glypican, SMAD2/3, and TGF-beta receptor signaling pathways (left-hand side; red lines) and HIF-1 alpha transcription factor signaling pathway (right-hand side; blue lines). FOS and SERPINE1 are shared by both groups of enriched pathways, but only shown in the right cluster. Several linker hub genes with dense connections with the highlighted pathways are also shown in bigger nodes.

The analysis performed on genes abundantly expressed and/or differentially spliced in all three placental tissues revealed strong overrepresentation of pathways related to integrin signaling and focal adhesion (Figure S3 in Additional file 1). These pathways were enriched with genes encoding collagens (COL17A1, COL7A1, COL5A1), laminins (LAMA3, LAMA5), filamins (FLNC, FLNA), integrin (ITGB4), and actinin (ACTN1), all of which are structural components of extracellular matrix (ECM). These results suggest the critical role of ECM in processes involved in normal placental biology. It is interesting to note that the network module contained an appreciable number of both differentially expressed and differentially spliced genes, suggesting that AS and gene transcription act in a coordinated manner to control the overall pathway activity in the placenta.

Novel transcriptional active regions (TARs)

One major advantage of RNA-Seq compared to microarray technology is its capability to detect un-annotated novel transcripts. To identify novel transcriptional active regions (TARs) in placental tissues, we used the software Scripture [56] for ab initio reconstruction of transcripts for each tissue after sequence mapping with TopHat [21] (see details in Methods). We identified approximately 100,000 transcripts in each of the placental tissues with more than 70% of them being multi-exon transcripts (Table 2). To reduce false signals, only multi-exon transcripts were used in the following analysis. After overlapping transcripts were merged into one single TAR, a total of 13,469, 16,987, and 15,158 TARs were found in amnion, chorion, and decidua, respectively. We filtered out the ones overlapping with the annotated transcripts from the NCBI RefSeq, UCSC, Ensembl, and Vega databases and identified 604, 1,007, and 896 novel TARs in amnion, chorion, and decidua, respectively. The expression levels of the identified novel TARs are listed in Table S4 in Additional file 3. Importantly, a large proportion of these novel TARs (285, 456, and 468 in the corresponding placental tissues) are placenta-specific or more than 4-fold enriched compared to non-placental tissues. Shown in Figure 8 is one example of a novel TAR on chromosome 16 (chr16:50424807-50430893) expressed in amnion with a high FPKM value of 7.1. Of note, this transcript is not documented in any human gene databases, although the existence of human expressed sequence tags (ESTs) at this locus further supports the validity of this TAR (Figure 8).
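The merging and filtering steps described above amount to interval operations on genomic coordinates. The following sketch illustrates them on toy coordinates; it is not the authors' pipeline, the function names are hypothetical, and it assumes inclusive coordinates and ignores strand.

```python
def merge_into_tars(transcripts):
    """Merge overlapping transcripts on the same chromosome into single TARs.

    transcripts: iterable of (chrom, start, end) tuples, end inclusive.
    """
    merged = []
    for chrom, start, end in sorted(transcripts):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            last = merged[-1]
            merged[-1] = (chrom, last[1], max(last[2], end))  # extend the TAR
        else:
            merged.append((chrom, start, end))                # start a new TAR
    return merged


def overlaps(a, b):
    """True if two (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def novel_tars(tars, annotated):
    """Keep TARs that overlap none of the annotated transcripts."""
    return [t for t in tars if not any(overlaps(t, g) for g in annotated)]


# Toy reconstructed transcripts and a toy annotation.
transcripts = [("chr1", 100, 200), ("chr1", 150, 300),
               ("chr1", 400, 500), ("chr2", 100, 200)]
annotation = [("chr1", 250, 260)]

tars = merge_into_tars(transcripts)
novel = novel_tars(tars, annotation)
```

The pairwise overlap check is quadratic and is adequate only for small inputs; at the scale of tens of thousands of TARs, a sorted sweep or an interval tree would be the usual choice.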

Table 2 Novel transcriptional active regions (TARs) and exons discovered in placental tissues
Figure 8

An example of the novel transcriptional active regions (TARs) identified in the present study. Shown is a novel TAR on chromosome 16 found in amnion. A wiggle plot of RNA-Seq read coverage, the structures of 3 alternatively spliced transcripts, and ESTs are shown from top to bottom. Note that there is no gene annotated in this region in the indicated annotation databases.

We also used RNA-Seq data to identify novel exons in annotated genes. There are a total of between 93 and 103 thousand exons identified in the TARs overlapping with annotated genes. Although more than 80% of these exons were well annotated with the same 5' and 3' ends, we detected between 494 and 585 totally new exons with no sequence overlap with any annotated exons in the placental tissues. These novel TARs and exons provide a valuable resource for novel transcripts with potential functional significance in the placenta.

Discussion

With the emergence of new high-throughput technologies such as RNA sequencing, we have recently witnessed a remarkable increase in our knowledge of mammalian transcriptome content and diversity. There has been a particular surge in our understanding of the transcriptome diversity between different tissues and cell types. For example, Wang et al. performed an RNA-Seq analysis of 15 human tissues and cell lines and identified over 22,000 tissue-specific AS events [13]. Other studies have established the association between tissue-specific expression of SFs and genome-wide changes in tissue-specific splicing patterns [42, 45], which underscores a critical role of AS regulation in tissue differentiation and specialization.

The majority of previous gene expression studies of human placental tissue have only provided gene-level insights [6-10], driving the need for higher-resolution analysis to enable a better understanding of the complexity of the placental transcriptome at the level of exon splicing. AS, which has a well-established role in cell differentiation [57, 58], may be critical for the proper functioning of the placenta, an organ composed of a variety of differentiated cell types, each with its own specific functions during pregnancy. Thus, uncovering the complexity of AS in the placental transcriptome will provide a valuable basis for understanding genes with functional and clinical relevance in placental biology and pathophysiology.

In the present study, we used RNA-Seq to characterize the transcriptome of selected compartments of the human placenta from normal term pregnancies. RNA-Seq allows an unbiased and sensitive interrogation of the full repertoire of placental mRNA transcripts. We took a two-step approach to analyze the RNA-Seq data at both the gene-level and the exon-level. First, we investigated differential gene expression between the placental and other human tissues to identify genes that are specifically or abundantly expressed in the placenta. Second, we carried out exon profiling as well as SF expression profiling to find AS events and their potential regulators that are differentially present in the placental versus non-placental tissues.

We have compared placenta-enriched genes to genes with putative functional significance in the placenta using the mouse phenotype data and human PTB association study data. We observed that genes implicated in placental abnormalities and PTB are enriched among the genes with placenta-enriched expression profiles. We note that the mouse phenotype data from MGI were generated independent of any previously known gene expression pattern in the placenta. Among such genes are PRLR and F2R, genes encoding receptors for prolactin and thrombin, respectively, whose levels are precisely regulated during pregnancy [59, 60]. The enrichment of IL1-related genes was also noted, suggesting the importance of IL1 signaling in normal placental function and pregnancy. IGF2, one of the genes associated with abnormal placental phenotypes in mice, is known for its active role in placental and fetal growth [61, 62]. Together, these provide a link between highly expressed placenta-enriched genes and their functional importance in the placenta. Similarly, our work provides evidence suggesting the importance of genes uniquely expressed in the placenta in diverse pregnancy-related processes, with examples including CSH1 in the regulation of fetal growth [63], CGB in the maintenance of early pregnancy [64, 65], and human leukocyte antigen-G (HLA-G) in feto-maternal immune tolerance [66, 67]. In addition, we observed a significant enrichment of differentially spliced genes in the placenta among genes with placental phenotypes in the mouse, suggesting the importance of tissue-specific AS in placental development and function.

Because the HBM2.0 data all came from adult tissues, it is possible that some placenta-enriched genes identified in our study reflect age-specific expression signatures. Because of the unavailability of RNA-Seq data from other fetal tissues, we assessed this possibility using the GeneAtlas array data [23]. There were 4 fetal tissues (brain, liver, lung, and thyroid) included in the GeneAtlas data. Of the 297 genes with at least 4-fold enrichment in the placenta over adult tissues in both the GeneAtlas array data and our RNA-Seq data, the vast majority (281 genes) were more than 4-fold enriched in the placenta compared with the 4 fetal tissues according to the GeneAtlas array data. This suggests that the placenta-enriched genes identified in our study reflect genuine placenta-associated gene expression signatures. In addition, the strong association of placental expression enrichment with placental disease-related gene sets further supports that most of the placenta-enriched genes found here reflect tissue effect rather than age effect.

Given the heterogeneous tissue composition of the placenta, we have characterized the transcriptome profiles of the placenta not only at the whole-organ level, but also at the sub-organ level. It should be noted that the placental samples used in our study (amnion, chorion, and decidua) may not be completely pure, containing minor contamination with other placental components. Nonetheless, our study demonstrated that they are highly enriched for the corresponding tissue types, displaying compartment-specific expression profiles and splicing patterns. The amnion is the innermost layer of the fetal membranes lining the amniotic cavity and is composed of an epithelial cell layer on top of a basement membrane and an avascular matrix [68, 69]. Consistent with these histological properties of the amnion, we have detected enrichment of genes involved in cell/focal adhesion and observed that the epithelial splicing regulator ESRP1 was highly expressed. Our splicing analysis of the amnion using RNA-Seq and RT-PCR revealed 20 and 12 known ESRP1 target exons, respectively, with differential splicing activities in the amnion. It should be noted that ESRP1 is a master cell-type-specific splicing regulator critical for maintaining epithelial cell identity and has been implicated in a variety of developmental and disease processes [46]. The ESRP1 target exons are strongly enriched in genes involved in the regulation of cell adhesion, such as the exon in MINK1 [53], which was found to be differentially spliced in the amnion compared to other human tissues by RNA-Seq and validated by RT-PCR. These data support a role of the ESRP1 splicing regulatory network in the amnion. The chorion, the outer layer of the fetal membranes in contact with the decidua, consists of the reticular layer, the basement membrane, and the trophoblast layer [30].
Similar to the amnion, genes with a role in cell/biological adhesion are also enriched in the chorion, which may be important for the adherence of the trophoblast layer to the decidua [70]. The enrichment of genes involved in vascular-related processes in the chorion may be explained by velamentous vessels traversing the extraplacental membranes or maternal vessels in interdigitating decidua processed along with the chorion. Unlike the two fetal membranes, the decidua is of maternal origin [1, 10]. It is noteworthy that genes related to female pregnancy were significantly enriched in this compartment of the placenta, further supporting the crucial role of this tissue in pregnancy. Of note, we observed significant differential expression of a splicing factor MBNL3 in the decidua. In future studies, it would be useful to examine how MBNL3 globally impacts gene splicing and function in the decidua.

We also examined potential interactions among genes highly expressed and differentially spliced in the placenta compared to other human tissues by constructing FI networks composed of sub-network modules enriched for specific gene categories and functional pathways. Analysis performed separately on each of the three placental tissues revealed a set of pathways commonly enriched in all three compartments, for example, regulation of cytoplasmic and nuclear SMAD2/3 signaling and TGF-beta receptor signaling. These pathways are known to be involved in a wide range of cellular processes [71], which reflects the versatile function of the placenta that can be achieved through diverse cellular activities occurring in different parts of the placenta. Among its other main functions, the placenta plays an important role as an immune barrier, protecting the fetus from the mother's immune system [1]. This function is reflected by the enriched expression of transcription factors (TFs) involved in immune regulation, such as GATA3 and IRF7, as well as the differential splicing of REL, a member of the Rel/NFKB family, and NFATC2, a member of the nuclear factors of activated T cells transcription complex. The HIF-1 alpha TF network is another pathway that was enriched in module 0 of all three FI networks. The placenta, during its development, is exposed to different oxygen environments, and tight regulation of oxygen homeostasis is necessary for proper placental development and function, which requires active involvement of the HIF-1 alpha TF network [72]. These findings suggest: (1) the common importance of these pathways in the functioning of the different parts of the placenta examined in the present study; and (2) the importance of the regulation of gene expression and AS as critical mechanisms underlying anatomical, developmental, and functional specialization of the placenta.
When the analysis was performed on all of the tissues combined, we observed the overrepresentation of ECM-related gene sets such as integrin signaling pathway, ECM-receptor interaction, focal adhesion, and integrin cell surface interactions. These results provide evidence for the role of ECM in placental development and placental cell proliferation as demonstrated in earlier studies [73, 74].

Conclusions

Our study provides the first comprehensive view of the placental transcriptome at exon-level resolution, and reveals that tissue-specific gene regulation in the placenta involves complex changes in both gene transcription and exon splicing. Our data should serve as a valuable resource for future in-depth investigations into which genes contribute to specification of the placenta. All of the RNA-Seq data can be accessed as raw RNA-Seq reads and as a processed UCSC Genome Browser custom track at http://intron.healthcare.uiowa.edu/placenta/. Furthermore, the findings of this work may provide useful clues on how those genes/pathways, when altered at either the gene level or exon level, could lead to pregnancy-related diseases. Future research using tissues from abnormal conditions will help expand our knowledge of the transcriptome alterations and pathological processes involved in maternal and fetal complications.

Methods

Tissue collection

Fresh human placentas were obtained within one hour of normal vaginal delivery at term with signed informed consent under protocols approved by the University of Iowa Institutional Review Board (200506792, 200411759). The placentas were received largely intact when visually inspected. Each placenta was dissected into the fetal (amnion, chorion) and maternal (decidua) portions. The amnion and chorion were taken from the reflected membranes and separated by blunt dissection. Decidual tissue samples were macroscopically isolated from the maternal-facing surface of the placenta. The dissected tissues were cut into small pieces and placed in RNAlater® solution (Applied Biosystems, Foster City, CA). To ensure that our results better reflect the true nature of the normal term placental transcriptome, we used placentas from term (≥ 37 weeks of gestation) deliveries with spontaneous onset of labor.

RNA extraction

Total RNA was extracted from each tissue using the TRIzol® reagent (Invitrogen, Carlsbad, CA) according to manufacturer's instructions and stored at -80°C until used. For RNA-Seq, we prepared pooled amnion, chorion, and decidua samples, using an identical set of RNA from five different individuals. The pooled samples were of high quality with an RNA integrity number (RIN) > 8. For validation of differential splicing events and splicing factor expression, we generated RNA pools, each for amnion, chorion, and decidua, consisting of 4 biological replicates that are independent from those used in the RNA-Seq experiments. For validation experiments, we purchased total RNA representing all HBM2.0 tissues except white blood cells from Applied Biosystems (Foster City, CA) or Clontech (Mountain View, CA).

Library construction and sequencing

Library preparation and paired-end sequencing were performed by Ambry Genetics (Aliso Viejo, CA). Double-stranded cDNA fragments were synthesized from mRNA, ligated with adapters, and size-selected for library construction according to the manufacturer's protocol (Illumina, San Diego, CA). Each of the three libraries generated was loaded onto one lane of the flow cell at 8 pM concentration. Two paired-end runs (72 bp and 54 bp runs) of sequencing were carried out on the Illumina Genome Analyzer IIx. Initial data processing was performed using RTA 1.6.47.1 (SCS version 2.6.26). Sequence quality filtering script was executed in the Illumina CASAVA version 1.6.0 software (Illumina, Hayward, CA).

Sequence alignment

For each end (forward or reverse) of the paired-end reads from placenta, we trimmed the sequence to 50 bp based on the sequencing error profile. The HBM2.0 data consist of the following tissues: adipose, adrenal, brain, breast, colon, heart, kidney, liver, lung, lymph node, ovary, prostate, skeletal muscle, testes, thyroid and white blood cells. Each tissue came from a single adult donor with ages ranging from 19 to 86. The HBM2.0 data are accessible from EBI ArrayExpress track: http://www.ebi.ac.uk/arrayexpress/browse.html?keywords=E-MTAB-513. For HBM2.0, we used all the 50 bp from the paired end data. Each read was mapped to the reference human genome (hg19) as well as all possible exon-exon junctions (Ensembl genes, r57) as previously described [75]. Each exon-exon junction is 84 bp in length, containing the last 42 bp of the upstream exon and the first 42 bp of the downstream exon. We used Bowtie [76] to map those reads, allowing up to three mismatches and also required that each read has at most three possible mapped locations in either the human genome or all possible exon-exon junctions. For each pair of forward and reverse reads, we enumerated all possible combinations of mapped forward and reverse reads. We required that the two ends from the same read pair should be on the same chromosome but in the opposite orientation. Since 98.4% of human introns have length less than 50 kb (data not shown), we also required that the two ends should be within 50 kb of each other in the mapped genomic locations. Based on these criteria, we collected a set of uniquely mapped pairs to do the subsequent analysis.
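
The pairing criteria above (same chromosome, opposite orientation, within 50 kb) can be sketched as a small filter. This is only an illustration, not the authors' pipeline: the `Hit` record, the function names, the use of leftmost coordinates to measure distance, and the reading of "uniquely mapped pair" as "exactly one concordant combination" are all assumptions.

```python
from dataclasses import dataclass

MAX_PAIR_DISTANCE = 50_000  # 50 kb limit stated in the text


@dataclass(frozen=True)
class Hit:
    """One candidate alignment of a read end (hypothetical record layout)."""
    chrom: str
    pos: int      # leftmost mapped genomic coordinate (assumed convention)
    strand: str   # "+" or "-"


def concordant_pairs(forward_hits, reverse_hits, max_dist=MAX_PAIR_DISTANCE):
    """Enumerate all forward/reverse combinations passing the three criteria."""
    pairs = []
    for f in forward_hits:
        for r in reverse_hits:
            if f.chrom != r.chrom:
                continue  # ends must lie on the same chromosome
            if f.strand == r.strand:
                continue  # ends must be in opposite orientation
            if abs(f.pos - r.pos) > max_dist:
                continue  # ends must be within 50 kb of each other
            pairs.append((f, r))
    return pairs


def unique_pair(forward_hits, reverse_hits):
    """Return the pair if exactly one combination is concordant, else None."""
    pairs = concordant_pairs(forward_hits, reverse_hits)
    return pairs[0] if len(pairs) == 1 else None
```

A read end with up to three candidate locations contributes up to three `Hit` objects, so at most nine combinations are examined per read pair.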

Gene expression quantification using RNA-Seq data

We estimated the gene expression level using RNA-Seq by the Fragments Per Kilobase of gene per Million mapped fragments (FPKM). Ensembl release r57 was used for gene annotation. To avoid the ambiguity of assigning reads to different isoforms of the same gene and obtain a robust estimate of the overall gene expression levels, we used an exon union method by counting all reads mapped to any exon in any of the gene's isoforms. This approach is similar to the original RPKM definition [22] instead of the transcript isoform level estimate as in Cufflinks [21].
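
As a rough sketch of the exon union idea, the gene length can be taken as the length of the merged exon intervals across all isoforms, and FPKM computed from the fragment count against that length. The function names and the half-open interval convention are assumptions made for illustration.

```python
def union_length(exons):
    """Total length (bp) of the union of half-open (start, end) exon intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(exons):
        if cur_end is None or start > cur_end:
            # Close the previous merged block and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent exon: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def fpkm(fragments_in_gene, exons, total_mapped_fragments):
    """Fragments per kilobase of exon-union length per million mapped fragments."""
    length_kb = union_length(exons) / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments_in_gene / (length_kb * millions)
```

For example, exons (0, 1000), (500, 1500) and (2000, 2500) pooled from several isoforms give a union length of 2,000 bp; 200 fragments in a library of 10 million mapped fragments then correspond to an FPKM of 10.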

Placenta-enriched genes based on GeneAtlas array data

We selected 16 tissues from the Human GeneAtlas array dataset [23], consisting of whole placental tissue and 15 non-placental tissues. These 15 tissues are identical to those examined in the Human Body Map 2.0 project except that breast tissue is not included in the GeneAtlas data set. We compared the expression values from the whole placental tissue to the average values from 15 other human tissues and generated a list of 758 genes with at least 4-fold enrichment in the whole placenta.
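
The fold-enrichment filter described here amounts to comparing the placental value with the mean of the other tissues. A minimal sketch follows; the data layout, the function name, and the decision to skip genes whose mean in other tissues is zero are assumptions, and the gene names in the usage note are placeholders.

```python
def placenta_enriched(expr, placenta="placenta", min_fold=4.0):
    """Return genes whose placental value is >= min_fold times the mean of other tissues.

    expr maps gene -> {tissue: expression value} (hypothetical layout).
    """
    enriched = []
    for gene, values in expr.items():
        others = [v for tissue, v in values.items() if tissue != placenta]
        mean_other = sum(others) / len(others)
        # Genes with zero mean elsewhere are skipped here; a pseudocount
        # would be another reasonable choice.
        if mean_other > 0 and values[placenta] / mean_other >= min_fold:
            enriched.append(gene)
    return sorted(enriched)
```

With toy values such as `{"GENE_A": {"placenta": 80, "liver": 10, "lung": 10}, "GENE_B": {"placenta": 30, "liver": 10, "lung": 10}}`, only `GENE_A` (8-fold) passes the 4-fold cutoff.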

Enrichment of GO functional categories and pathways

To identify the overrepresented functional categories among the genes with enriched expression or differential alternative splicing in the placenta compared to the 16 HBM2.0 tissues, we used the online functional annotation tool DAVID [28, 29, 77]. All the expressed protein-coding genes in the combined placenta and HBM2.0 data were used as the background. We used the GO_BP_FAT categories for GO biological process categories and KEGG and PANTHER annotation for pathway analysis. A modified Fisher's exact test (EASE score) from DAVID was used for testing the significance of functional category enrichment. The significant categories with a p-value < 0.05 after Bonferroni correction were reported.
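
To make the enrichment test concrete, the sketch below computes a one-sided hypergeometric (Fisher) tail probability, a conservative variant that removes one hit from the gene list before testing, and a Bonferroni adjustment. It is a simplified stand-in: DAVID's exact contingency-table layout for the EASE score may differ, and all function names are illustrative.

```python
from math import comb


def hypergeom_upper_tail(k, pop_total, pop_hits, list_total):
    """P(X >= k) when list_total genes are drawn from a background of pop_total
    genes, pop_hits of which carry the annotation."""
    k = max(k, 0)
    upper = min(pop_hits, list_total)
    numer = sum(
        comb(pop_hits, x) * comb(pop_total - pop_hits, list_total - x)
        for x in range(k, upper + 1)
    )
    return numer / comb(pop_total, list_total)


def ease_like_score(list_hits, list_total, pop_hits, pop_total):
    """Conservative p-value: test as if the list had one fewer annotated gene.
    This mimics the idea of the EASE score under an assumed table layout."""
    return hypergeom_upper_tail(list_hits - 1, pop_total, pop_hits, list_total)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]
```

Removing one hit penalizes categories supported by very few genes, which is the stated motivation for using a modified test rather than the plain Fisher p-value.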

Placental abnormality- and preterm birth-related genes

To obtain genes associated with abnormal placental phenotypes, we searched the MGI database [24, 25] for 4 MGI phenotypes (abnormal amnion morphology, MP:0005029; abnormal chorion morphology, MP:0002836; abnormal placenta morphology, MP:0001711; abnormal maternal decidual layer morphology, MP:0004256). These mouse genes were mapped to human based on the Human and Mouse Orthology in the MGI database to obtain the orthologous human genes. Preterm birth-related genes were taken from the preterm birth genetics knowledge base PTBGene [26, 27], a regularly updated and manually curated collection of genes implicated in published association studies of PTB.

Alternative splicing analysis

We focused on an exon-centric analysis. We only used those reads that uniquely mapped to the splicing junctions to estimate the exon inclusion level (Ψ) of alternatively spliced exons. We used the same formula as in [13]: Ψ = I/(I + S). Suppose UJC, DJC, and SJC represent the read counts of the upstream junction, downstream junction, and skipping junction, respectively; then the junction read count from the exon-inclusion transcript (I) equals (UJC + DJC)/2 and the read count from the exon-skipping transcript (S) equals SJC. To find placenta-specific exon inclusion/skipping, we also pooled all the reads from the 16 HBM2.0 tissues to get a mean inclusion level of non-placental tissues. Utilizing the read count information on the 3 types of junctions of each exon, we applied the multivariate Bayesian algorithm MATS (Multivariate Analysis of Transcript Splicing) [48]. Briefly, MATS uses a multivariate uniform prior to model the between-sample correlation in exon splicing patterns, and a Markov chain Monte Carlo (MCMC) method coupled with a simulation-based adaptive sampling procedure to calculate the P value and false discovery rate (FDR) of differential AS. Importantly, the MATS approach provides the flexibility to identify differential AS events that match a given user-defined pattern. Suppose Ψ1 and Ψ2 are the exon inclusion levels of 2 tissues and we want to test if |Ψ1-Ψ2| > 10%; we can obtain the Bayesian posterior probability P = P(|Ψ1-Ψ2| > 10%) and the subsequent P value and FDR. The MATS software can be downloaded from http://intron.healthcare.uiowa.edu/mats/.
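
A minimal sketch of the point estimate, assuming the inclusion level is the fraction I/(I + S) with I = (UJC + DJC)/2 and S = SJC. The function names are illustrative, pooling is done by summing junction counts across tissues, and the simple 10% check is only a threshold on point estimates, not the MATS posterior probability.

```python
def inclusion_level(ujc, djc, sjc):
    """Exon inclusion level from upstream, downstream and skipping junction counts."""
    inclusion = (ujc + djc) / 2   # I: average of the two inclusion junctions
    skipping = sjc                # S: skipping junction count
    total = inclusion + skipping
    if total == 0:
        return None               # no junction reads, level undefined
    return inclusion / total


def pooled_inclusion_level(junction_counts):
    """Pool (UJC, DJC, SJC) tuples from several tissues by summing counts."""
    ujc = sum(c[0] for c in junction_counts)
    djc = sum(c[1] for c in junction_counts)
    sjc = sum(c[2] for c in junction_counts)
    return inclusion_level(ujc, djc, sjc)


def exceeds_threshold(psi1, psi2, cutoff=0.10):
    """True when the point estimates differ by more than the cutoff."""
    return abs(psi1 - psi2) > cutoff
```

For instance, counts of 30, 50 and 10 reads on the upstream, downstream and skipping junctions give I = 40 and S = 10, so the inclusion level is 0.8.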

Fluorescently labeled RT-PCR and qRT-PCR

We validated 2 sets of exons using RT-PCR. One set includes 34 exons that showed significant differential splicing (> 10% inclusion level difference with FDR < 0.1 between one of the three placental tissues and the pooled HBM2.0 tissues). Another set includes 21 known ESRP1 target exons that are predicted to have differential splicing due to differential expression of ESRP1 in amnion. Single-strand cDNA was synthesized from total RNA using the High Capacity cDNA Reverse Transcription Kit (Applied Biosystems) according to the manufacturer's protocol. Fluorescently labeled RT-PCR was performed as described [78]. Briefly, for each tested exon, we designed a pair of primers targeting flanking constitutive exons. Fluorescent labeling of PCR products was carried out according to a method modified from that of Schuelke [79]. PCR products were separated on a polyacrylamide gel and the fluorescence signal was captured and quantified using a Typhoon 9200 scanner (Molecular Dynamics, Sunnyvale, CA) and the Quantity One 4.6.2 software (Bio-Rad, Hercules, CA). To validate the expression levels of ESRP1 and MBNL3, qRT-PCR was performed using the Power SYBR Green PCR Master Mix (Applied Biosystems) and the 7900HT Fast Real-Time PCR System (Applied Biosystems). In each experiment, HPRT1 was used as an endogenous reference. Three technical replicates were included for each sample. Data were generated using the SDS 2.3 software (Applied Biosystems) and analyzed using the comparative CT method [80]. All primer sequences used for this study and exon inclusion levels from both RNA-Seq and RT-PCR are provided in Table S5 in Additional file 4, and gel pictures are shown in Figures S4 and S5 in Additional file 5.
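
The comparative CT calculation can be sketched as follows, assuming the standard 2^(-ΔΔCT) form with roughly 100% amplification efficiency. The function name, the choice of calibrator sample, and the CT values in the usage note are illustrative and not taken from the study.

```python
from statistics import mean


def relative_expression(target_ct, ref_ct, calib_target_ct, calib_ref_ct):
    """Fold change of a target gene relative to a calibrator sample.

    Each argument is a list of technical-replicate CT values. The reference
    gene (HPRT1 in the text) normalizes for input amount in each sample.
    """
    delta_sample = mean(target_ct) - mean(ref_ct)             # delta CT, test sample
    delta_calib = mean(calib_target_ct) - mean(calib_ref_ct)  # delta CT, calibrator
    delta_delta = delta_sample - delta_calib
    return 2 ** (-delta_delta)
```

With made-up triplicates of 22.0 for the target and 20.0 for the reference in the test sample, and 25.0 and 20.0 in the calibrator, ΔΔCT is -3 and the estimated fold change is 8.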

Functional interaction networks of genes with placenta-enriched expression or differential splicing

We combined the genes with placenta-enriched expression or differential splicing into 4 query gene sets: the combination of placenta-enriched genes with FPKM > 1 and > 4-fold enrichment (EE) and genes significantly differentially spliced with FDR < 0.1 and |Ψ1-Ψ2| > 10% (DS) compared to the HBM2.0 tissues in each of the three placental tissues individually, and the intersection set of all three tissues. We projected each query gene set onto the functional interaction network of human genes from the Reactome database [54] using the Reactome FI network plug-in in Cytoscape [81]. An edge-betweenness algorithm was used to cluster the network into modules [82]. Pathway enrichment analysis was done on the whole network and within each of the sub-network modules. The networks from representative modules were visualized in Cytoscape [81]. Enriched pathways with FDR < 0.05 in modules with a size of at least 40 are listed in Table S3 in Additional file 2.
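
The construction of the four query gene sets can be illustrated with plain set operations. In this sketch the input layout is an assumption: each gene is keyed to one (FPKM, fold enrichment) pair and one (FDR, inclusion level difference) pair, whereas a real gene may contain several tested exons.

```python
def query_gene_set(expression, splicing):
    """Union of expression-enriched (EE) and differentially spliced (DS) genes.

    expression maps gene -> (fpkm, fold_enrichment);
    splicing maps gene -> (fdr, delta_psi). Both layouts are hypothetical.
    """
    ee = {g for g, (fpkm, fold) in expression.items() if fpkm > 1 and fold > 4}
    ds = {g for g, (fdr, dpsi) in splicing.items() if fdr < 0.1 and abs(dpsi) > 0.10}
    return ee | ds


def shared_query_set(per_tissue_sets):
    """Genes present in the query sets of all tissues (the fourth query set)."""
    return set.intersection(*per_tissue_sets.values())
```

Calling `query_gene_set` once per tissue (amnion, chorion, decidua) and then `shared_query_set` on the three results yields the four sets that are projected onto the interaction network.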

Discovery of novel transcriptional active regions (TARs)

Scripture software [56] was used for ab initio reconstruction of the transcripts for each tissue after mapping with Tophat [21]. As in the expression analyses, reads of the three placental tissues were trimmed at the 3' end to 50 nt before mapping. As reported, starts and ends of reconstructed transcripts were usually not as accurate as splice sites; thus, single-exon transcripts were removed in the analyses. The reconstructed transcripts were clustered into TARs when there were any overlaps between transcripts. Two transcripts were defined as overlapping when they are on the same strand and have at least one common internal exon boundary, which means that they have at least one common exon start site or exon end site. Novel TARs were determined by comparison with a combination of annotated transcripts from the NCBI RefSeq, UCSC, Ensembl, and Vega databases. A TAR was considered novel if it had no overlap with any annotated transcript using the above definition. We also examined the exon distributions within the TARs overlapping with annotated transcripts. Because the start and end of transcript annotations usually vary greatly, to compare the reconstructed exons within TARs overlapping annotated transcripts with the exon annotations, we only focused on the internal exons in our analysis. To compare in all tissues the expression levels of novel TARs identified in placental tissues, we first used the exons identified in the novel TARs, and then calculated FPKM values in the same way as in the analysis of known gene expression for all three placental tissues and 16 HBM2.0 tissues.
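
The clustering rule above (same strand, at least one shared internal exon boundary) can be sketched with a union-find structure. This is an illustration under assumptions: transcripts are given as (chromosome, strand, exon list) tuples, internal boundaries are taken to be the splice-site coordinates (every exon end except the last and every exon start except the first), and the coordinates used in any example are made up.

```python
from collections import defaultdict


def internal_boundaries(exons):
    """Splice-site boundaries of a transcript given (start, end) exons."""
    exons = sorted(exons)
    starts = {("start", s) for s, _ in exons[1:]}
    ends = {("end", e) for _, e in exons[:-1]}
    return starts | ends


def boundary_keys(chrom, strand, exons):
    """Boundaries qualified by chromosome and strand."""
    return {(chrom, strand, b) for b in internal_boundaries(exons)}


def cluster_into_tars(transcripts):
    """Group multi-exon transcripts sharing any boundary; returns lists of indices."""
    parent = list(range(len(transcripts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    first_seen = {}
    for idx, (chrom, strand, exons) in enumerate(transcripts):
        if len(exons) < 2:
            continue  # single-exon transcripts are excluded
        for key in boundary_keys(chrom, strand, exons):
            if key in first_seen:
                parent[find(idx)] = find(first_seen[key])
            else:
                first_seen[key] = idx

    clusters = defaultdict(list)
    for idx, (_, _, exons) in enumerate(transcripts):
        if len(exons) >= 2:
            clusters[find(idx)].append(idx)
    return sorted(clusters.values())


def novel_tars(tars, transcripts, annotated):
    """Keep TARs in which no transcript shares a boundary with an annotation."""
    known = set()
    for chrom, strand, exons in annotated:
        known |= boundary_keys(chrom, strand, exons)
    return [
        tar for tar in tars
        if not any(boundary_keys(*transcripts[i]) & known for i in tar)
    ]
```

Because membership is transitive, two transcripts that share no boundary directly still end up in the same TAR when a third transcript links them.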

Data accessibility

All data described here can be accessed from: http://intron.healthcare.uiowa.edu/placenta/.