The Trypanosoma rangeli EST project

Trypanosoma rangeli and T. cruzi are hemoflagellate protozoan parasites transmitted by triatomine bugs across a wide, overlapping area of Central and South America [1, 2]. T. cruzi is the etiological agent of Chagas disease, which affects 16 million people [3], whereas T. rangeli infection does not seem to be pathogenic for vertebrate hosts [1]. Beyond their morphological similarity, the two species share several antigenic determinants, which leads to inconclusive or incorrect serological diagnosis of human infection and constitutes a serious problem for the epidemiology of both infections [4, 5]. At present, only a few specialized methods can unequivocally differentiate T. cruzi from T. rangeli, and these are not available for routine diagnosis [2]. Moreover, the taxonomic position of T. rangeli has been controversial for several years. The parasite shows remarkable pleomorphism and biological characteristics of distinct Trypanosoma subgenera, and its biology in vertebrate hosts remains to be elucidated [2]. It is therefore necessary to develop new strategies that allow fast, reliable, easy and economically viable species-specific differentiation, provide information on genetic markers, and support functional studies to clarify the parasite's biology and taxonomic position.

Compared with the current number of T. cruzi entries in GenBank (42,410 nucleotide and 1,602 protein entries), only a small number of T. rangeli nucleotide sequences or genes (68) and proteins (42) have been characterized, and an overall picture of the genome is still lacking. Owing to the absence of such information, little is known about important biological features such as host-parasite relationships at the molecular level and the taxonomic position of the parasite [5, 6]. Considering the ongoing T. cruzi genome initiatives [7] and the usefulness of partial cDNA sequencing to generate expressed sequence tags (ESTs) or ORESTES (ORF ESTs) as a rapid and efficient approach to profile genes and their expression [8-14], we have started the "Trypanosoma rangeli EST Project" under the auspices of CNPq, a Brazilian Government Agency. Given the importance of this parasite and its under-representation in the current genomic/proteomic scenario, the project aims to generate an overview of the T. rangeli transcriptome by sequencing ESTs and ORESTES from both epimastigote and trypomastigote forms of the parasite.

Initially, three distinct normalized, non-directional T. rangeli cDNA libraries were constructed using epimastigote forms of the Choachi strain [15]. Total RNA was obtained using standard protocols [16] followed by phenol-chloroform extraction, and poly(A)+ RNA was prepared using the Oligotex RNA purification kit (Qiagen, Valencia). The cDNA libraries were constructed by cloning size-purified fragments into the pGEM-T Easy plasmid (Promega, Madison) using the Clontech PCR-Select™ cDNA Subtraction kit (BD Biosciences, Franklin Lakes). A total of ~3,360 clones were obtained, and the libraries were validated by sequencing a single end of the inserts on a MegaBACE 1000® instrument (Molecular Dynamics/Amersham Biosciences, Piscataway) using the DYEnamic ET dye terminator kit (Amersham Biosciences). The mean insert size for all cDNA libraries was ~750 bp, as revealed by EcoRI restriction analysis. The sequences obtained were edited and analyzed using the new ESTonSQL system, under development in a collaboration between D.M. Lorenzini and the UFSC Bioinformatics laboratory (http://www.bioinformatica.ufsc.br). ESTs of at least 150 bp with Phred quality > 10 (http://www.phrap.org) were considered valid by ESTonSQL and were compared against the EMBL-EBI, GenBank, Swiss-Prot and InterPro databases.
Sequences < 150 bp or with a Phred value < 10 were labeled as invalid ESTs and stored.

To date, a total of 656 valid ESTs have been generated, of which 386 showed similarity with trypanosomatid sequences (Table 1). Matches were observed with several distinct organisms but, considering only the trypanosomatid databases, 168 sequences were similar to T. cruzi sequences and only 20 showed similarity with T. rangeli sequences (Table 2). Although the analysis is preliminary, several ESTs revealed similarity to sequences not previously reported for T. rangeli, such as a human topoisomerase III (GenBank accession AAH02432) and a putative Leishmania major quinone oxidoreductase (NP_859482). Over 37% of the valid ESTs returned no hits and may represent T. rangeli-specific genes, unknown genes or even 5' or 3' untranslated regions (UTRs), which illustrates both the usefulness of this strategy for gathering new information and the need for genomic studies of this parasite.

Along with continued EST generation by sequencing both ends of all clones obtained, ORESTES (ORF ESTs) profiles [17] will be obtained from the same libraries used for EST generation. Whereas ESTs are generated by sequencing the 5' and/or 3' ends of cDNA clones, the ORESTES technique generates sequences from the middle portion of genes [17]. By combining the EST and ORESTES approaches to study the transcriptome of a target organism, a significant number of coding regions can be obtained rapidly. Having described a method to induce T. rangeli metacyclogenesis in vitro [18], our group is currently working on a cDNA library of this parasite form in order to perform an intra- and inter-specific comparative study of both epimastigote and trypomastigote ESTs and ORESTES.
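
The EST validity rule described above (a length of at least 150 bp and a Phred quality above 10) can be illustrated with a minimal Python sketch. This is not the ESTonSQL implementation, which is not described here. The names `EstRead`, `is_valid_est` and `partition_ests` are hypothetical, and summarizing the Phred quality of a read as the mean of its per-base scores is an assumption. The text does not say how a read with a quality of exactly 10 is handled; the sketch treats it as invalid.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple

MIN_LENGTH_BP = 150   # minimum read length stated in the text
MIN_PHRED = 10        # reads must have quality strictly above this value


@dataclass
class EstRead:
    """Hypothetical container for one single-pass sequencing read."""
    name: str
    sequence: str
    qualities: List[int]  # per-base Phred scores


def is_valid_est(read: EstRead) -> bool:
    """Return True if the read passes the length and quality thresholds.

    Assumption: read quality is summarized as the mean per-base Phred score.
    """
    if len(read.sequence) < MIN_LENGTH_BP or not read.qualities:
        return False
    return mean(read.qualities) > MIN_PHRED


def partition_ests(reads: List[EstRead]) -> Tuple[List[EstRead], List[EstRead]]:
    """Split reads into (valid, invalid); invalid reads are kept, not discarded."""
    valid: List[EstRead] = []
    invalid: List[EstRead] = []
    for read in reads:
        (valid if is_valid_est(read) else invalid).append(read)
    return valid, invalid


# Made-up example reads, for illustration only.
example_reads = [
    EstRead("clone_A", "ACGT" * 50, [30] * 200),  # 200 bp, high quality
    EstRead("clone_B", "ACGT" * 20, [30] * 80),   # 80 bp, too short
    EstRead("clone_C", "ACGT" * 50, [8] * 200),   # 200 bp, low quality
]
valid_reads, invalid_reads = partition_ests(example_reads)
print([r.name for r in valid_reads])    # ['clone_A']
print([r.name for r in invalid_reads])  # ['clone_B', 'clone_C']
```

Returning the invalid reads alongside the valid ones mirrors the statement that invalid ESTs were labeled and stored rather than deleted.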

Table 1 Current status of the Trypanosoma rangeli transcriptome project.
Table 2 Current number of expressed sequence tags (ESTs) generated by the Trypanosoma rangeli transcriptome project, distributed according to their similarity with sequences of distinct species.

Conclusions

The number of studies involving T. rangeli is increasing for several reasons, including i) its importance for Chagas disease diagnosis and epidemiology, ii) its unknown life cycle in vertebrate hosts and iii) the controversial taxonomic position of the parasite. Several molecular approaches have been used during the last few years to address these points. However, a genomic panorama of the parasite must be revealed in order to perform comparative transcriptomics for taxonomic studies, to identify new markers for specific diagnosis, to discover, understand and/or compare biological processes such as the involvement of the 3' UTR in the post-transcriptional regulation of gene expression, or to assess intra-specific variability. Once the T. cruzi, T. brucei and L. major genome databases are released in the near future, well-annotated ESTs and ORESTES from distinct T. rangeli forms will allow comparative transcriptomic studies with state-of-the-art technologies such as microarrays. At the end of the project, free access to the T. rangeli transcriptome data, as well as to tools for comparative genomics/transcriptomics currently under development, will be available through the UFSC Bioinformatics Laboratory website http://www.bioinformatica.ufsc.br.

Authors' contributions

The present work is part of C.Q. Snoeijer's Master's thesis. All other authors contributed equally to this work.