Background & Summary

The placenta is essential for the maintenance of pregnancy and the regulation of fetal growth and development1. Regulation of gene expression through epigenetic modifications2 and transcription factor availability3 is critical for healthy placental functioning4,5,6.

Small non-coding RNAs (sncRNAs) have the ability to regulate gene expression through a variety of epigenetic and post-transcriptional mechanisms7. Certain sncRNA subtypes, including microRNAs (miRNAs), have been associated with gene deregulation in pregnancy-associated diseases, including preeclampsia8. Specifically, miRNAs originating from the chromosome 14 and 19 miRNA clusters (C14MC and C19MC, respectively) have established contributions to placental gene regulation9,10,11.

However, studies of placental sncRNAs have focused almost exclusively on canonical miRNAs12,13, rather than on other RNA species, such as PIWI-interacting RNAs (piRNAs), small nucleolar RNAs (snoRNAs), and miRNA variants (isomiRs), that can also influence the genetic and epigenetic regulation of transcription14,15,16. Furthermore, past efforts to identify human miRNAs have typically prioritized those that are highly expressed across multiple tissues17. MiRNA discovery efforts focused on individual tissues have identified novel, tissue-specific miRNAs that had previously been overlooked18,19, but the placenta has yet to be studied in this fashion. As a resource for future investigations of placental biology, we have profiled and quantified the expression of annotated sncRNAs and determined the expression pattern of novel (previously unannotated) miRNAs within the human placenta.

We isolated and sequenced the small RNA fractions of 32 placental (chorionic villi) samples of varying fetal sexes and gestational ages, 30 of which met our threshold for total high-quality reads (Table 1). Trimmed sequencing reads were input into the miRMaster platform, which performs quality filtering, aligns reads to annotated sncRNAs, quantifies sncRNA expression, and predicts novel miRNA sequences20. We considered a sncRNA to be ‘placentally expressed’ if it was present at ≥ 1 read per million (RPM) in at least 10% (3/30) of the placental samples. We considered a sncRNA to be expressed during a given trimester if it was present at ≥ 1 RPM in at least 10% of samples from that trimester. Raw data are made available for investigating sncRNA expression related to specific biological features21.

Table 1 Clinical data for analyzed placental samples.

A total of 1544 distinct sncRNAs were placentally expressed, 81% of which met our expression threshold across all trimesters (Fig. 1a)21. Due to this similarity, all subsequent characterization considers only these 1544 placentally expressed sncRNAs, which include miRNAs, piRNAs, snoRNAs, small nuclear RNAs (snRNAs), and transfer RNAs (tRNAs).

Fig. 1

Summary of the quantity and expression levels of placentally expressed sncRNAs. (a) Count of sncRNAs, divided by subtype, that meet the expression cutoff (≥ 1 RPM in ≥ 10% of samples), across the entire sample cohort (n = 30), as well as across samples from a particular trimester. nov-miRNA: novel miRNA. (b) Mean total expression of all placentally expressed sncRNAs, divided by subtype, by trimester. (c) Mean total expression of all placentally expressed sncRNAs in each trimester, divided by subtype, and normalized relative to the mean total expression of sncRNAs of that subtype across all samples.

A total of 654 miRNAs were placentally expressed, along with 277 piRNAs, 231 snoRNAs, 161 snRNAs, and 221 tRNAs (Fig. 1a). For all trimesters, miRNA reads made up a large majority (88–93%) of the total sncRNA reads (Fig. 1b). Of the 654 identified miRNAs, 48 were novel miRNA sequences (Online-only Table 1, GSE164178)21. In both individual features, such as length and GC content, and composite metrics, such as novoMiRank score22 and miRMaster-computed probability of their precursor being a true precursor, these novel miRNAs closely resemble the annotated placentally expressed miRNAs (Table 2).

Table 2 Comparison between annotated and novel placentally expressed miRNAs.

In addition, 18,003 distinct isomiRs, which are natural variants of canonical miRNAs23, were placentally expressed21. Additions or deletions of nucleotides at the 3′ end of isomiRs were far more prevalent (72% of all placentally expressed isomiRs) than at the 5′ end (18%) (Table 3). The relative proportions of isomiR types were similar across trimesters, but more isomiRs were expressed in first and second trimester samples (17,870 and 16,541, respectively) than in term samples (12,444) (Table 3).

Table 3 Count of placentally expressed isomiRs by modifications and trimester.

For each sncRNA subtype, we calculated the fold change in RPM values for each trimester relative to the average RPM for all samples. Expression of miRNAs peaks in second trimester samples, while expression of piRNAs, snRNAs, snoRNAs, and tRNAs is lowest in second trimester samples (Fig. 1c). PiRNA expression is highest in first trimester samples, while snRNA, snoRNA, and tRNA expression is highest in term samples (Fig. 1c).
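The normalization described above can be sketched as follows. This is a minimal illustration, assuming that each sample has a single total RPM value for a given sncRNA subtype and a trimester label; the function name `trimester_fold_change` and the input layout are hypothetical and are not taken from the deposited data files.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def trimester_fold_change(total_rpm: List[float], trimester: List[str]) -> Dict[str, float]:
    """Mean total RPM per trimester divided by the mean total RPM over all samples.

    total_rpm  : one value per sample (summed RPM of one sncRNA subtype).
    trimester  : one label per sample, in the same order as total_rpm.
    """
    overall_mean = mean(total_rpm)  # average across the whole cohort

    # Group the per-sample totals by trimester label.
    groups = defaultdict(list)
    for value, label in zip(total_rpm, trimester):
        groups[label].append(value)

    # Fold change > 1 means the subtype is above its cohort-wide average.
    return {label: mean(values) / overall_mean for label, values in groups.items()}
```

A value above 1 for a trimester indicates that the subtype is expressed above its cohort-wide average in that trimester, which is the quantity plotted in Fig. 1c.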

SncRNAs of all subtypes, including novel miRNAs, were found to be expressed from almost all chromosomes (Fig. 2a). Regions of high miRNA expression were found on chromosomes 14 and 19, at the well-characterized C14MC and C19MC clusters (Fig. 2b).

Fig. 2

Illustration of the genomic locations of placentally expressed sncRNAs. (a) Circos plot29 depicting the genomic location and mean log(1+x)-scaled expression level of all placentally expressed sncRNAs, including novel miRNAs. SncRNAs expressed from multiple genomic loci are shown at all such loci. Radial black lines within the chromosome 14 and 19 sectors indicate the positions of the C14MC and C19MC sncRNA clusters, respectively. (b) Heatmap displaying the log(1+x)-scaled RPM expression values of all placentally expressed known sncRNAs, divided by sample. SncRNAs expressed from multiple genomic loci are shown at all such loci. Samples are numbered in identical order to Table 1. M: mitochondrial chromosome.

Methods

Sample acquisition

This study used 32 de-identified chorionic villi samples collected at the BC Women’s Hospital and Health Centre from first trimester, second trimester, and term pregnancies, including 14 that had been previously collected24. Thirty of the samples had sufficient (> 1 million) high-quality sequencing reads to be included in further analysis. First trimester samples were obtained from elective terminations, while second trimester samples were obtained from terminations due to various fetal demise conditions, including but not limited to anencephaly, spina bifida, and preterm membrane rupture. Three of the nine term samples were delivered by caesarean section. Cases with known chromosome abnormalities were excluded. Six of the samples were from fetuses that were phenotypically classified as having neural tube defects (Table 1).

For all cases ascertained before the termination of pregnancy, written consent was obtained. For all cases obtained retrospectively from pathological autopsy specimens, biospecimens were de-identified and all links to clinical data were removed. No identifiable information for any cases is presented in this publication. Ethics approval was obtained from the joint University of British Columbia/Children’s Hospital and Women’s Health Centre of British Columbia Research Ethics Board (H10-01028, H16-02280, and H04-70488).

After placental membrane removal, 30 mg of chorionic villi was sampled from the fetal-facing side of the placenta. Processing time after delivery ranged from 1 to 192 hours, and samples were preserved in RNAlater. At the time of extraction, excess RNAlater was removed by blotting with Kimwipes (Kimberly-Clark, USA), after which samples were homogenized in a Bullet Blender Tissue Homogenizer (Next Advance, USA) using 3.2 mm stainless steel beads (Next Advance, USA) and 1 ml of TRIzol reagent (ThermoFisher Scientific, USA). Samples were incubated for five minutes at room temperature (RT) and then centrifuged at 7000 rpm for three minutes. Chloroform (200 μl; ThermoFisher Scientific, USA) was added to the supernatant, which was mixed thoroughly by inversion and incubated for five minutes at RT. Samples were centrifuged at 4 °C at 9000 rpm for 20 minutes, and 500 μl of isopropanol (ThermoFisher Scientific, USA) was added to the aqueous phase, which was mixed by inversion and incubated for 10 minutes at RT. Samples were centrifuged again at 4 °C at 9000 rpm for 15 minutes. The supernatant was discarded and the RNA pellet was washed in 75% ethanol (Commercial Alcohols, diluted with Ultrapure Distilled Water-RNAse/DNAse free, Gibco-LifeTechnologies) by gentle inversion. Samples were centrifuged at 4 °C at 9000 rpm for 10 minutes. The supernatant was discarded, and the RNA pellet was air-dried for five minutes at RT. RNA was eluted in 50 μl of nuclease-free water (Ultrapure Distilled Water-RNAse/DNAse free, Gibco-LifeTechnologies). Genomic DNA was removed using the RNase-Free DNase Set (Qiagen, Germany). RNA concentration was measured on a Nanodrop 2000 (ThermoFisher Scientific, USA), and RNA quality was assayed on an Agilent Bioanalyzer 2100 (Agilent, USA). Prior to sequencing, small RNA fractions were depleted of ribosomal RNA by hybridization, using the NEBNext rRNA Depletion Kit (New England BioLabs, USA).

Sequencing and quality control

Samples were sequenced at Canada’s Michael Smith Genome Sciences Centre in Vancouver, using their standard ribodepleted strand-specific RNA (ssRNA) sequencing protocol25. This protocol includes plate-based ssRNA library construction, followed by sequencing on an Illumina HiSeq 2500 using the 3′ TruSeq small RNA adapter. No negative sequencing controls or positive spike-in controls were used.

Sequencing reads were subjected to a series of quality control steps to trim adapters and discard reads shorter than 16 nucleotides. Trimmed reads (FASTQ) were processed through the miRMaster platform (accessed in March 2018) under default settings20. Reads with a Phred quality score < 20 were discarded, and samples with < 1 million remaining reads were excluded from further analysis (2/32 samples).
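The length and quality filters described above can be illustrated with a short sketch. It is not the miRMaster implementation: it assumes Phred+33 quality encoding, and it assumes that the Phred threshold is applied to the mean quality of each read, which the text does not specify. The names `mean_phred` and `filter_reads` are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# One FASTQ record reduced to (header, sequence, quality string).
Record = Tuple[str, str, str]


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a read, assuming Phred+33 ASCII encoding."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def filter_reads(
    records: Iterable[Record],
    min_length: int = 16,     # reads shorter than this are discarded
    min_phred: float = 20.0,  # assumed to apply to the per-read mean quality
) -> Iterator[Record]:
    """Yield only the reads that pass both the length and the quality filter."""
    for header, seq, qual in records:
        if len(seq) < min_length:
            continue  # too short after adapter trimming
        if mean_phred(qual) < min_phred:
            continue  # low-confidence base calls
        yield header, seq, qual
```

A sample would then be retained only if more than 1 million reads remain after filtering, for example by counting the records yielded by `filter_reads`.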

Detection of annotated sncRNAs

MiRMaster maps reads to the human genome (hg38) using Bowtie 2 and assigns them to different classes of sncRNAs (miRNA, piRNA, snRNA, snoRNA, or tRNA). Reads that mapped to annotated miRNAs (miRBase v21) with 5′ or 3′ additions or deletions of nucleotides and up to two mismatches were classified as isomiRs. Reads were then quantified by miRMaster and normalized on a per-sample basis to reads per million (RPM). For sncRNA sequences that mapped to multiple locations in the genome, only the reads derived from the locus with the highest mean expression were retained. Sequences expressed at ≥ 1 RPM in ≥ 10% (3/30) of the samples were considered to be ‘placentally expressed’.
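The RPM scaling and the ‘placentally expressed’ threshold can be sketched as below. This is an illustrative reimplementation rather than the miRMaster code: it assumes that the per-sample denominator is the sum of all counts in the input table, which may differ from the read total that miRMaster uses, and the function names `to_rpm` and `expressed` are hypothetical.

```python
from typing import Dict, List


def to_rpm(counts: Dict[str, List[int]]) -> Dict[str, List[float]]:
    """Scale raw counts (sncRNA name -> per-sample counts) to reads per million.

    Assumption: the library size of a sample is the column sum of this table.
    """
    n_samples = len(next(iter(counts.values())))
    totals = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        name: [1e6 * c / totals[j] if totals[j] else 0.0 for j, c in enumerate(row)]
        for name, row in counts.items()
    }


def expressed(
    rpm: Dict[str, List[float]],
    min_rpm: float = 1.0,        # expression cutoff in RPM
    min_fraction: float = 0.10,  # fraction of samples that must reach the cutoff
) -> List[str]:
    """Names of sncRNAs at >= min_rpm in at least min_fraction of the samples."""
    kept = []
    for name, row in rpm.items():
        n_pass = sum(1 for value in row if value >= min_rpm)
        if n_pass / len(row) >= min_fraction:
            kept.append(name)
    return kept
```

With 30 samples and the default arguments, a sequence is kept when at least three samples reach 1 RPM. The same function can be applied to the samples of a single trimester to reproduce the per-trimester definition.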

Discovery of novel (previously-unannotated) miRNAs

All reads not aligning with annotated sncRNAs were assessed by miRMaster, using a machine learning algorithm trained to classify sequences as true or false miRNA precursors. MiRMaster employs the AdaBoost algorithm, trained on a set of 216 miRNA features, including nucleotide ratios, free energy metrics, and folding metrics20. Prospective miRNA precursors were also scored by novoMiRank. These scores represent the extent to which a prospective precursor differs from early miRBase-catalogued precursors in 24 features, including nucleotide composition, loop length, and the genomic proximity of other miRNA precursors22. Prospective miRNA precursors were filtered according to their miRMaster-assigned probability of being a true precursor (≥ 65%) and their novoMiRank score (≤ 1.5), and the corresponding prospective miRNAs were filtered by their expression level (≥ 1 RPM in ≥ 10% of samples). For novel miRNA sequences that could be derived from multiple prospective miRNA precursors, only the sequence derived from the prospective precursor with the highest miRMaster-assigned probability of being a true precursor was retained.
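The filtering steps applied to prospective novel miRNAs can be summarized in a short sketch. The thresholds are those stated above; the `Candidate` record and its field names are hypothetical and do not correspond to the column names of the miRMaster output or of the deposited files.

```python
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    """One prospective miRNA and the precursor it was predicted from (hypothetical layout)."""
    mirna_sequence: str
    precursor_id: str
    precursor_probability: float  # miRMaster-assigned probability, 0 to 1
    novomirank_score: float
    rpm: List[float]              # expression per sample


def filter_candidates(
    candidates: List[Candidate],
    min_prob: float = 0.65,
    max_score: float = 1.5,
    min_rpm: float = 1.0,
    min_fraction: float = 0.10,
) -> List[Candidate]:
    """Apply the precursor, novoMiRank, and expression filters, then deduplicate."""
    best: Dict[str, Candidate] = {}
    for cand in candidates:
        # Precursor-level filters.
        if cand.precursor_probability < min_prob or cand.novomirank_score > max_score:
            continue
        # Expression filter: >= min_rpm in >= min_fraction of samples.
        n_pass = sum(1 for value in cand.rpm if value >= min_rpm)
        if n_pass / len(cand.rpm) < min_fraction:
            continue
        # Keep one entry per mature sequence: the most probable precursor.
        current = best.get(cand.mirna_sequence)
        if current is None or cand.precursor_probability > current.precursor_probability:
            best[cand.mirna_sequence] = cand
    return list(best.values())
```

In this sketch the deduplication is applied after the other filters, so a sequence is dropped only if none of its prospective precursors passes. The text does not state the order in which these steps were applied, so this ordering is an assumption.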

Data Records

FASTQ files containing raw sequencing reads can be accessed through the NCBI Sequence Read Archive26. CSV files detailing the reads per million expression values of sncRNAs in each placental sample can be accessed through the Gene Expression Omnibus21. Data are provided for all sncRNAs and isomiRs with at least one read in one sample, not only those that were placentally expressed. Similarly, data are provided for all candidate novel miRNAs, including those that did not pass filters for expression, novoMiRank score, or miRMaster-assigned probability of being a true precursor. Each sample has a separate file for expression of annotated sncRNAs, novel miRNAs, and isomiRs.

Technical Validation

RNA quality score (RQS) was found to correlate with both sample processing time (Spearman’s ρ = −0.55, p = 1.9 × 10⁻³) and gestational age (Spearman’s ρ = −0.64, p = 1.9 × 10⁻⁴) (Table 1).
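For readers who wish to recompute such rank correlations from Table 1, Spearman’s ρ can be obtained as the Pearson correlation of the ranked values, with tied values given their average rank. The sketch below uses only the Python standard library; the function names are hypothetical, and the computation of the p-value is not included.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)
```

A negative ρ, as reported for both comparisons above, means that samples with longer processing times or later gestational ages tended to have lower RQS values.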

To ensure the accuracy of sequencing reads, reads with Phred scores < 20 were discarded. Prior to this filtering, FastQC v0.11.9 was used to assess the overall sequencing quality of each sample27. The mean ± SD pre-filtering Phred score for individual samples was 31.83 ± 2.99 (Table 4). Samples with the library ID ‘MX1355’ (n = 5) had a lower mean ± SD Phred score of 25.37 ± 0.67, which most likely reflects a batch effect (Table 4). The median sample had a mean Phred score > 20 at positions 1–27 of the trimmed reads, indicating that mature miRNAs (length 18–25 nt) were accurately quantified (Fig. 3a). Reads with mean Phred scores of 35–37 were the most abundant in all samples (Fig. 3b). The mean ± SD GC content for individual samples was 48.7 ± 1.5% (Table 4).

Table 4 Sequencing quality metrics for placental samples.
Fig. 3

Summary of sequencing quality metrics for all analyzed placental samples (n = 30). (a) Boxplot of the mean Phred scores for each sample at each position of a sequencing read. (b) Boxplot of the percentage of reads within each sample that have a given mean Phred score. (c) Plot of all placental samples with respect to the first two principal components derived from the expression levels of all placentally expressed sncRNAs. NTD: neural tube defect.

To confirm that the six second trimester samples from fetuses with neural tube defects did not have dramatically different non-coding transcriptomes from the other second trimester samples, principal component analysis (PCA) was performed on the expression values of placentally expressed sncRNAs across all samples. When plotted on the first two principal components, the samples with neural tube defects were not distinct from the other second trimester samples (Fig. 3c). The possibility that some preterm samples would have developed observable neural tube defects or other placental or fetal dysfunctions had the pregnancies progressed further cannot be excluded. However, the lack of outliers in the first two principal components suggests, based on the non-coding transcriptomic data that we present, that none of the samples were substantially altered at the time of sampling (Fig. 3c).