Abstract
Comparative transcriptomics has gained popularity in genomic research thanks to high-throughput technologies, including microarrays and next-generation RNA sequencing, that have generated large amounts of transcriptomic data. An important goal is to understand the conservation and divergence of biological processes across species. We propose a testing-based method, TROM (Transcriptome Overlap Measure), for comparing transcriptomes within or between species. In contrast to traditional correlation analyses, TROM offers a different perspective on capturing transcriptomic similarity. Specifically, the TROM method first identifies associated genes that capture the molecular characteristics of each biological sample, and then compares samples by testing the overlap of their associated genes. We use simulation and real data studies to demonstrate that TROM is more powerful in identifying similar transcriptomes and more robust to stochastic gene expression noise than Pearson and Spearman correlations. We apply TROM to compare the developmental stages of six Drosophila species, C. elegans, S. purpuratus, D. rerio and mouse liver, and find interesting correspondence patterns that imply conserved gene expression programs in the development of these species. The TROM method is available as an R package on CRAN (https://github.com/Vivianstats/TROM) with manuals and source code available at http://jsb.ucla.edu/trom-transcriptome-overlap-measure.
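To make the idea of "testing the overlap of associated genes" concrete, the following is a minimal Python sketch of an overlap test between two gene sets. It is an illustration under stated assumptions, not the TROM package's implementation: the abstract does not specify how associated genes are selected or which test statistic is used, so the gene sets are taken as given and an upper-tail hypergeometric test is assumed. The function name `overlap_pvalue` and its parameters are hypothetical.

```python
from math import comb


def overlap_pvalue(genes_a, genes_b, population_size):
    """Upper-tail hypergeometric p-value for the overlap of two gene sets.

    Assumption (not stated in the abstract): both sets are drawn from the
    same population of `population_size` genes, and under the null
    hypothesis the overlap size follows a hypergeometric distribution.
    """
    a, b = set(genes_a), set(genes_b)
    k = len(a & b)          # observed overlap
    m, n = len(a), len(b)   # sizes of the two associated-gene sets
    total = comb(population_size, n)
    # P(X >= k): sum the hypergeometric probabilities over all overlaps >= k.
    tail = sum(
        comb(m, i) * comb(population_size - m, n - i)
        for i in range(k, min(m, n) + 1)
    )
    return tail / total


# Hypothetical example: 100 genes in total, two samples with 10 associated
# genes each, 5 of which are shared.
p_shared = overlap_pvalue(range(0, 10), range(5, 15), 100)
# Two samples with no shared associated genes.
p_disjoint = overlap_pvalue(range(0, 10), range(50, 60), 100)
```

A small p-value indicates that two samples share more associated genes than expected by chance, which is the sense in which an overlap test can serve as a similarity measure in place of a correlation coefficient.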
Cite this article
Li, W.V., Chen, Y. & Li, J.J. TROM: A Testing-Based Method for Finding Transcriptomic Similarity of Biological Samples. Stat Biosci 9, 105–136 (2017). https://doi.org/10.1007/s12561-016-9163-y
DOI: https://doi.org/10.1007/s12561-016-9163-y