
Molecular Biology, Volume 52, Issue 4, pp 497–509

Algorithm for Physiological Interpretation of Transcriptome Profiling Data for Non-Model Organisms

  • R. F. Gubaev
  • V. Y. Gorshkov
  • L. M. Gapa
  • N. E. Gogoleva
  • E. P. Vetchinkina
  • Y. V. Gogolev
Genomics. Transcriptomics

Abstract

Modern next-generation sequencing (NGS) techniques make it possible to obtain expression profiles of all genes and provide an essential basis for characterizing the metabolism of an organism of interest on a broad scale. Obtaining an informative physiological picture from high-throughput sequencing data requires that the genome sequence of the target organism be available and sufficiently annotated. However, the list of species with properly annotated genomes is limited. Transcriptome profiling is often performed in so-called non-model organisms, that is, those with unknown or poorly assembled and/or annotated genome sequences. The transcriptomes of non-model organisms can be investigated using algorithms for de novo assembly of transcripts from RNA sequencing reads. In this case, physiological interpretation of the data is difficult because the assembled transcripts lack annotation and are not classified by metabolic pathway or functional category. An algorithm for transcriptome profiling in non-model organisms was developed, and a transcriptome analysis was performed for the basidiomycete Lentinus edodes. The algorithm combines open-access software with custom scripts and covers a complete analysis pipeline, from the selection of cDNA reads to the functional classification of differentially expressed genes and the visualization of the results. Using this algorithm, a comparative transcriptome analysis of the nonpigmented mycelium and the brown mycelial mat of L. edodes was performed. The comparison revealed physiological differences between the two morphogenetic stages, including an induction of cell wall biogenesis, intercellular communication, ion transport, and melanization in the brown mycelial mat.
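
The last stage of the pipeline, the functional classification of differentially expressed genes, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the pseudocount, the fold-change threshold, and the toy counts are all assumptions made for illustration, and a real analysis would use normalized counts, replicates, and a statistical test rather than a bare fold-change cutoff.

```python
import math
from collections import Counter


def log2_fold_change(count_a, count_b, pseudocount=1.0):
    """log2 ratio of condition B to condition A.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    return math.log2((count_b + pseudocount) / (count_a + pseudocount))


def classify_degs(counts, categories, threshold=1.0):
    """Tally up- and down-regulated transcripts per functional category.

    counts:     transcript id -> (count in condition A, count in condition B)
    categories: transcript id -> functional category label
    threshold:  minimum absolute log2 fold change to call a transcript
                differential (hypothetical cutoff, no statistical test)
    """
    up, down = Counter(), Counter()
    for transcript, (a, b) in counts.items():
        lfc = log2_fold_change(a, b)
        # Transcripts without an annotation fall into a catch-all bucket.
        category = categories.get(transcript, "unclassified")
        if lfc >= threshold:
            up[category] += 1
        elif lfc <= -threshold:
            down[category] += 1
    return up, down


# Made-up counts for four hypothetical transcripts:
# condition A = nonpigmented mycelium, condition B = brown mycelial mat.
counts = {
    "t1": (10, 100),
    "t2": (50, 55),
    "t3": (200, 20),
    "t4": (5, 80),
}
categories = {
    "t1": "cell wall biogenesis",
    "t3": "ion transport",
    "t4": "cell wall biogenesis",
}

up, down = classify_degs(counts, categories)
for category, n in up.items():
    print(f"up in condition B: {category} ({n})")
for category, n in down.items():
    print(f"down in condition B: {category} ({n})")
```

Counting per category rather than listing individual transcripts reflects the goal stated in the abstract: to turn a list of assembled transcripts into a physiological picture organized by function.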

Keywords

RNA sequencing, de novo transcriptome assembly, transcript annotation, functional classification of expressed genes, visualization of metabolic pathways, morphogenesis of Lentinus edodes

Abbreviations

NGS: next-generation sequencing
ORF: open reading frame
DEG: differentially expressed gene
GO: Gene Ontology
OG: orthologous gene



Copyright information

© Pleiades Publishing, Inc. 2018

Authors and Affiliations

  • R. F. Gubaev (1, 2)
  • V. Y. Gorshkov (1, 2)
  • L. M. Gapa (1, 2)
  • N. E. Gogoleva (1, 2)
  • E. P. Vetchinkina (3)
  • Y. V. Gogolev (1, 2)

  1. Kazan Institute of Biochemistry and Biophysics, Federal Research Center “Kazan Scientific Center of RAS”, Kazan, Russia
  2. Kazan (Volga Region) Federal University, Kazan, Russia
  3. Institute of Biochemistry and Physiology of Plants and Microorganisms, Russian Academy of Sciences, Saratov, Russia
