Genome-wide identification and characterization of transfer RNA-derived small RNAs in Plasmodium falciparum
Transfer RNA (tRNA)-derived fragments (tRFs) have been widely identified in nature, functioning in diverse biological and pathological situations. Yet, the presence of these small RNAs in Plasmodium spp. remains unknown. Systematic identification and characterization of tRFs is therefore highly needed to understand further their roles in Plasmodium parasites, particularly in the virulent Plasmodium falciparum parasite.
Genome-wide small RNAs with sizes ranging from 18–30 nucleotides from P. falciparum were deep-sequenced via Illumina HiSeq 2000 technology. In-depth analysis revealed the presence of a vast number of small RNAs originating from tRNA-coding genes, responsible for 22.4% of the total reads as the second predominant group. Three P. falciparum-derived tRF types (ptRFs) were identified as 5'ptRFs, mid-ptRFs and 3'ptRFs. The majority (90%) of ptRFs were derived from tRNAs that coded eight amino acids: Pro, Phe, Asn, Gly, Cys, Gln, His and Ala. Stem-loop reverse transcription polymerase chain reaction further confirmed the presence of tRFs in the blood stages of P. falciparum. Four new motifs with an enriched G/C feature were determined at cleavage sites that might guide the generation of ptRFs.
To our knowledge, this is the first report of a genome-wide investigation of ptRFs from Plasmodium species. The identification of ptRFs reveals a complex small RNA system manipulated by the malaria parasite, and might promote research on the function of tRFs in the pathogenesis of Plasmodium infections.
KeywordsPlasmodium falciparum Transfer RNA Short RNAs Small RNAs tRFs
Endogenous small interference RNA
P. falciparum laboratory strain 3D7
P-element induced wimpy testis
Red blood cells
Relative number per one million reads
Roswell Park Memorial Institute-1640
Quantitative reverse-transcribed polymerase chain reaction
Small nucleolar RNA
Small nuclear RNA
Transfer RNA-derived fragments
A growing amount of data suggests that small RNAs play critical regulatory roles in many biological processes in both prokaryotes [1, 2, 3] and eukaryotes [4, 5]. Other than the three major types of small RNAs including microRNAs (miRNAs), endogenous small interference RNAs (endo-siRNAs) and PiWi-interacting RNAs (piRNAs) [6, 7], the advance in deep sequencing technology has unveiled new classes of small RNAs with novel features and functions. One such class is those derived from transfer RNAs (tRNAs), commonly referred to as tRFs (tRNA-derived RNA fragments), which were once mis-annotated as miRNAs [8, 9].
tRFs are typically short in length, ranging from 12–30 nucleotides (nt) . Despite the fact that the nomenclature used to refer to the subclass of tRFs varies in the literature, they are generally classified in terms of their origin: 5'tRFs are generated from 5' end cleavage in the D-arm of mature tRNAs, 3'tRFs or 3'CCA tRFs are produced from 3' end processing of mature tRNAs in the TΨC-arm and contain a CCA post-transcriptional modification, 3'UtRFs or tRF-1 are derived from the 3' end of a pre-tRNA, and endogenous tRFs (itRFs or mid-tRFs) originate from a combination of cleavage from the anticodon loop domain and either D-arm or TΨC-arm of mature tRNAs. Single cleavage occurring in the anticodon loop could lead to generation of a conventional class of tRNA-derived RNAs, namely tRNA-halves (~ 30–50 nt). Apart from the yield of tRNA-halves from mature tRNA under stress conditions, the biogenesis of tRFs as well as their functions remain to be fully understood.
With increased research interest in tRFs, mounting evidence has linked the production of tRFs to cancer , neurodegenerative disorders, viral infection and other pathological conditions. They have been involved in cellular proliferation, RNA interference pathway and regulate gene translation at the post-transcriptional level, similar to miRNAs (reviewed in [10, 12]). Moreover, the interactions in many cases between tRFs with the functional proteins in miRNA pathway such as argonaute (AGO) protein families [13, 14], PIWI  and DICER proteins  further imply the possibility of tRFs behaving like miRNAs and siRNAs.
Malaria remains the most grievous protozoan load due to its severe prevalence and lethality in the world, particularly in Southeast Asia and Africa. More than 400,000 deaths were recorded in 2017 by the WHO Malaria Report . Of the five human malaria parasite species, Plasmodium falciparum is the most virulent, causing the majority of malaria-related mortality. In P. falciparum, as well as in other Plasmodium parasites, active RNA interference apparatus  and canonical miRNA molecules were found to be absent [19, 20], as confirmed by computational and experimental evidence. The lack of DICER and AGO homologs in Plasmodium parasites suggests that these organisms might use an alternative pathway to accomplish the intense post-transcriptional regulation needed during the rapid morphological change in the intraerythrocytic cycle. It is possible that small RNAs such as tRFs might be involved. Until now, the existence and characterization of tRFs in Plasmodium parasites has not been evaluated. tRFs have been identified in several protozoan organisms including Tetrahymena , Giardia lamblia , Trypanosoma cruzi  and the exosomes from Leishmania donovani . This provides credence that these deeply conserved molecules are expressed in protozoan species, even though their classifications and functions remain unclear. With this inspiration, we sought to investigate tRFs in the malaria parasite, in which the machinery associated with small RNA biogenesis is thought to be either entirely lost or extensively simplified [18, 19, 20].
Here, we focused on the deadly P. falciparum and took the advantage of deep sequencing technology to comprehensively analyze genome-wide high-throughput sequencing data from P. falciparum-derived small RNAs. To our knowledge, we report for the first time the expression of novel, endogenous Plasmodium tRFs (ptRFs) and their global distribution and characterization in blood stages of P. falciparum.
Maintenance of P. falciparum
The standard P. falciparum laboratory strain 3D7 (Pf3D7) was obtained from ATCC (Manassas, VA, USA) and maintained within fresh human red blood cells (RBCs) at 5% hematocrit in RPMI-1640 medium supplemented with 0.5% (w/v) albumax II (Invitrogen, Carlsbad, CA, USA), as previously described . Parasitemia was monitored by Giemsa staining of blood smears.
Small RNA purification and high-throughput sequencing
The mixed-stage infected RBCs were lysed with 0.05% saponin and washed three times with RPMI-1640 to collect parasites. Total RNA from the parasite pellet was extracted using Trizol reagent (Invitrogen) following the manufacturer’s instructions and was quantified by absorbance at 260 nm. RNA was then separated by urea-denatured 15% polyacrylamide gel electrophoresis, and bands of small RNAs of 18–30 nt in length were extracted and purified. Small RNA library construction and sequencing was performed following commercial protocols. Briefly, the small RNA molecules were ligated to 5' and 3' adaptors sequentially and then converted to cDNA with reverse transcription followed by polymerase chain reaction (PCR) amplifications. Approximately 20 μg of reverse transcription (RT)-PCR products per sample were sequenced using an Illumina Hiseq 2000 platform (Illumina, San Diego, CA, USA) by Beijing Genomics Institute (BGI, Shenzhen, China).
Small RNA data analysis
A total of 14,828,877 reads were initially obtained from the sequencing of small RNA libraries. Raw sequences were pre-cleaned to remove low-quality reads and 10,230,166 high-quality reads were then trimmed to discard the 3' and 5' adaptors, sequences containing the polyA tail, and those smaller than 18 nt or larger than 30 nt. This dataset was mapped to the genome of the P. falciparum 3D7 strain (PlasmoDB, release 36, http://plasmodb.org/plasmo/) using BLASTN to allow the removal of any contamination from human RNA segments to yield a clean data pool. These data were analyzed in both total and unique aspects. Total reads, also referred to as redundant reads in the literature, indicated multiple reads aligned around the same location, representing high abundance of the RNA. To demonstrate the diversity of the small RNA, the total reads were collapsed to remove the redundant ones, resulting in single consensus aligned unique reads. The clean-read counts were normalized as a relative number per one million reads (RPM) in both total and unique categories for further analysis. Information of the 45 P. falciparum tRNA genes was downloaded from the PlasmoDB database. By performing BLAST alignments, data from the mapped Plasmodium tRNA-derived segments were extracted and further analyzed. The expression intensity of the ptRFs against the whole genome was sequentially calculated with the “Build” function of Bowtie2 (v.188.8.131.52) and the pipeline of TopHat (v.2.0.9) and Cufflinks (v.2.2.1). The data extraction of each type ptRFs was performed by compiled Perl (v.5.22) codes. The statistical analysis for the Pearsonʼs correlation coefficients was performed using the “cor” function in R package stats. The plots were drawn mainly with the ggplot2 package in R language (v.3.5.0).
Gapped Local Alignment of Motifs (GLAM2 v1056)  and Motif Comparison Tool (Tomtom v.4.12.0)  were used to predict and compare the motifs around the cleavage sites of the tRFs. For the GLAM2 algorithm, the adopted parameters for motif predication were set as default values, except that only the given strand was aligned. The outputs were directly submitted into the Tomtom algorithm for comparing the predicted motifs against the known RNA motif database. All known motifs were selected to be calculated using an e-value of < 10 followed by the adoption of the function for Pearsonʼs correlation coefficient.
Total RNA templates were reverse-transcribed into cDNAs using the EasyScript One-Step gDNA Removal and cDNA Synthesis SuperMix Kit (TransGen Biotech, Beijing, China) according to the manufacturer’s instructions. Specific stem-loop reverse transcription primers (Additional file 1: Table S1) were designed based on the sequence of each ptRFs as described previously [28, 29]. The PCR cycling conditions were as follows: initial denaturation at 94 °C for 5 min, followed by 30 cycles of denaturation at 94 °C for 5 s, annealing at 50 °C for 15 s and extension at 72 °C for 15 s, with a final extension at 72 °C for 5 min. The PCR products were separated on 12% polyacrylamide gel. The expected sizes of the PCR products were calculated by the length of each ptRF plus 40 base pairs of nucleotides technically introduced in the design of specific stem-loop primers. Negative controls, absent with either DNase I-treated RNA or reverse transcriptase in RT reactions, were performed to verify the accuracy and specificity of the stem-loop RT-PCR. The sequences of RT-PCR primers are listed in Additional file 1: Table S2.
General features of tRNA-derived small RNAs in P. falciparum
Classification of small RNAs mapping to the P. falciparum genome
Unannoted small RNAs
Identification of three types of tRFs from P. falciparum
Given that the biogenesis of tRFs from their parental tRNA molecules was reported to be evolutionarily conserved at the 5' end, 3' end and tRNA precursor, we mapped the ptRF reads to all 45 Plasmodium tRNA-coding genes and, as expected, we observed that a large portion of the reads fell into two groups that mapped to the 5' and 3' ends of their parent tRNAs. Unexpectedly, it was noticed that part of the ptRFs were mapped to the middle regions of the parent tRNAs, which were divergent in size and ends in contrast to the known mid-tRFs (or itRFs). Hence, we named these three ptRF types as “5'ptRFs”, “3'ptRFs” and “mid-ptRFs”. 5'ptRFs predominated (86.23%) in ptRFs, while mid-ptRFs and 3'ptRFs represented only 6.20% and 5.99% of the tRNA-derived fragments, respectively (Fig. 2b). By plotting the RPM values of three types of ptRFs against their sizes, we observed two peaks at ~23/24 and ~29 nt for 5'ptRFs, one peak at ~20 nt for mid-ptRFs, and two peaks at ~20/21 nt and ~25 nt for 3'ptRFs (Fig. 2c), which displayed great difference in the expression levels.
To further understand their expression divergence, we investigated the abundance of ptRFs derived from all 45 P. falciparum tRNAs. The RPM values of ptRFs in both total and unique categories were plotted against their sizes to create 270 plots (Additional files 3, 4 and 5: Figures S2, S3, S4), which exhibited great variance of the ptRFs from different tRNAs. Nine ptRFs with high RPM values over 105 were observed, namely 5'ptRFs-AspGTC, -CysGCA, -GlyGCC, -HisGTG, -LeuCAA, -ProAGG, -ProCGG, -ProTGG and 3'ptRFs-AspGTC; whereas four ptRFs, 5'ptRFs-SecTCA, mid-ptRFs-ArgTCG, -LeuAAG and -LeuTAG were found to have low RPM values less than 5 (Additional files 3, 4 and 5: Figures S2, S3, S4). For a given tRNA gene, the production of three types of ptRFs varied dramatically. For example, tRNA-CysGCA produced more 5'ptRFs than mid-ptRFs or 3'ptRFs with an abundance 3 × 103-fold higher, while tRNA-AlaAGC generated a lower level of 5'ptRFs than mid-ptRFs or 3'ptRFs (Fig. 2d). Consequently, no significant correlation was found within the abundance of these three types in the library. The Pearsonʼs correlation coefficient was -0.0176 (P = 0.6275) between 5'ptRFs and mid-ptRFs, 0.3192 between mid-ptRFs and 3'ptRFs (P = 0.3151), and 0.0218 between 3'ptRFs and 5'ptRFs (P = 0.8068) (Additional file 2: Figure S1b). We then selected the top 20 abundant ptRFs of each type to avoid any possibility of them being as RNA degradation products for further analysis (Additional file 1: Table S3). Interestingly, these three types of ptRFs were generally derived from various tRNAs, sharing only four common parental tRNAs (Additional file 2: Figure S1c). The generation of ptRFs did not seem to be associated with the expression levels of their parental tRNAs in P. falciparum.
Experimental validation of ptRF candidates
New motifs found to process ptRFs at their cleavage sites
In spite of the progress made in post-genomic research on malaria parasites, our knowledge on the function of small non-coding RNAs in these organisms remains insufficient. The limitation in cloning techniques to identify small RNAs in P. falciparum has restricted genome-wide small RNA characterization. Yet, high-throughput sequencing provides advanced depth and massive volumes, thereby enables the exploration of distinct categories of RNAs compared with traditional methods [19, 20]. It is widely accepted that miRNAs are not the only regulators of gene expression; many other types of small RNAs have been identified to contribute to this process. tRFs is one such type, associated with many functions similar to miRNAs at the post-transcriptional level. Thus, it is possible that Plasmodium parasites complete gene expression regulation through molecules that can substitute for miRNAs, e.g. tRFs. Studies have shown that tRF expression is evolutionarily conserved from prokaryotes to mammalian cells. In the present study, we took advantage of high-throughput sequencing to obtain a coverage over 100-fold for 18–30 nt length small RNAs, which provided sufficient depth for systematic analysis. Consistent with tRFs derived from plant or mammalian cell lines [13, 33], we found three types of tRFs in P. falciparum: 5'ptRFs, 3'ptRFs and mid-ptRFs. Of these, 5'ptRFs is the most abundant class, while mid-ptRFs are generated from the 5' end of the anticodon loop of their parental tRNAs, which has not been previously reported [12, 34, 35].
The mechanism of tRF biogenesis has not been understood clearly because cleavage sites of known tRNA-specific nucleases remain ambiguous . Information on the consensus feature(s) of cleavage sites on tRNAs needs to be fully addressed . Therefore, we analyzed the motifs for the P. falciparum tRNAs to produce tRFs, and discovered four new motifs at the ends of the ptRFs that may aid the explanation of the versatile production of ptRFs. Intriguingly, motif 3 at the 3' end of the mid-ptRFs and motif 4 at the 5' end of the 3'ptRFs were found from different tRNA molecules in our analysis, while they shared analogous sequences. We noticed that these motifs are also situated at the same sites in those tRNAs not producing ptRFs. It is possible that the generation of ptRFs might involve multiple factors in addition to motif recognition. Moreover, the D-loop adjacent to motif 1 contains the “GG” conserved dinucleotide, which is required by RNase P and 39-tRNase processing and is highly conserved in Drosophila and humans . Collectively, these findings provide a clue that there might be unidentified Plasmodium-derived endonucleases playing roles in the biogenesis of ptRFs.
The functional proteins closely associated with tRFs are reported to be DICER, AGO1, 2, 4 and 5 in plant cells, and AGO1, 3 and 4 in animal cells [13, 14, 33]. Yet, genome sequencing of P. falciparum has not revealed the existence of DICER and AGO homologs. Thus, the mechanism of function and related proteins for ptRFs remains to be elucidated. Evidence from one previous study indicates that human miRNAs complexed with human AGO2 are translocated into the parasite cytoplasm to function as negative regulators of the infection . Host-derived miRNA molecules have been shown to interfere with the synthesis of Plasmodium proteins . Whether ptRFs function by binding to host AGO2 or other small RNA biogenesis-related apparatus should be further explored to strengthen our understanding of the interaction between malaria parasites and their human host.
In P. berghei and its closely related species T. gondii, half-tRNAs with a size of nearly 35 nt have been detected experimentally . Although stress-induced tRNA-halves are not included in this study, they need to be explored in P. falciparum. Therefore, more work is required to investigate ptRFs and other small RNAs as well as their function both in vitro and in vivo.
To our knowledge, this is the first comprehensive analysis of tRFs in P. falciparum. Three types of tRFs were identified, and four new motifs guiding their cleavage were revealed. The generation and function of these ptRFs need further exploration. Our work provides evidence of tRFs in P. falciparum, which could promote research on the functions of tRFs in the pathogenesis of Plasmodium infections and potentially lead to new therapeutic approaches to malarial disease.
We are grateful to Dr Huiliang Li from University College London for helpful discussions.
This work was supported by Beijing Natural Science Foundation (7173266), National Science and Technology Major Project (2018ZX10101-001) and CAMS Innovation Fund for Medical Sciences (CIFMS) (2016-I2M-3-016).
Availability of data and materials
The datasets generated and analyzed during the present study are available in the GEO repository under the accession number GSE119771.
ZSW, ZLW, HW, XH, CYW and WWD conceived and designed the experiments. ZSW and LHZ performed the experiments. ZSW and ZLW analyzed the data. ZSW, ZLW and HW wrote the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate
This work was approved by the Ethical Inspection Committee of the Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences (no. ACUC-2017-056).
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
- 4.Okamura K. Diversity of animal small RNA pathways and their biological utility. WIREs RNA. 2012;3:351–68.Google Scholar
- 7.Portnoy V, Huang V, Place RF, Li LC. Small RNA and transcriptional upregulation. WIREs RNA. 2011;2:748–60.Google Scholar
- 13.Kumar P, Anaya J, Mudunuri SB, Dutta A. Meta-analysis of tRNA derived RNA fragments reveals that they are evolutionarily conserved and associate with AGO proteins to recognize specific RNA targets. BMC Biology. 2014;12:78.Google Scholar
- 17.WHO. World malaria report 2017. Geneva: World Health Organization; 2017.Google Scholar
- 22.Liao JY, Guo YH, Zheng LL, Li Y, Xu WL, Zhang YC, et al. Both endo-siRNAs and tRNA-derived small RNAs are involved in the differentiation of primitive eukaryote Giardia lamblia. Proc Natl Acad Sci USA. 2014;111:14159–64.Google Scholar
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.