Tree Genetics & Genomes, Volume 10, Issue 4, pp 1093–1101

Pop’s Pipes: poplar gene expression data analysis pipelines

  • Xiang Li
  • Chathura Gunasekara
  • Yufeng Guo
  • Hang Zhang
  • Liang Lei
  • Sermsawat Tunlaya-Anukit
  • Victor Busov
  • Vincent Chiang
  • Hairong Wei
Original Paper

Abstract

We developed multiple gene expression analysis pipelines and assembled them into a web-based tool called Pop’s Pipes to facilitate preprocessing and analysis of large volumes of poplar gene expression data. The input can be spatiotemporal microarray and RNA-seq data from comparable tissues, time points, or treatment-vs-control conditions. Pop’s Pipes identifies differentially expressed genes (DEGs) between one or multiple paired tissues, time points, or treatment-vs-control conditions in a single in silico analysis. The DEGs obtained for each comparison are automatically analyzed to identify significantly enriched gene ontologies and InterPro protein domains. Significantly changed metabolic pathways across all input data sets are also identified. In addition, we integrated a pipeline into Pop’s Pipes that constructs any of the three types of gene ontology tree (biological process, molecular function, or cellular component) from a short input list of gene ontologies. The resulting information can be scrutinized to build spatiotemporal models and hypotheses about how poplar develops and functions. Pop’s Pipes can analyze a microarray or RNA-seq data set with 10 time points, each containing three replicates of treatments and three controls, in 4–10 h. Such a data set usually takes a bioinformatician a few months to a year to analyze. Pop’s Pipes can thus save users substantial research time when large numbers of comparative data sets need to be analyzed.

Keywords

Poplar, Microarray, RNA-seq data, Differentially expressed genes, Pathway enrichment analysis, Gene ontology enrichment analysis, Protein domain enrichment analysis, Pipeline, Gene ontology tree

Supplementary material

11295_2014_745_MOESM1_ESM.xlsx (6.9 MB)
Supplemental file 1 (XLSX 7086 kb)
11295_2014_745_MOESM2_ESM.xlsx (199 kb)
Supplemental file 2 (XLSX 199 kb)
11295_2014_745_MOESM3_ESM.doc (32 kb)
Supplemental file 3 (DOC 31 kb)


Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Xiang Li (1)
  • Chathura Gunasekara (2, 4)
  • Yufeng Guo (1)
  • Hang Zhang (1)
  • Liang Lei (3)
  • Sermsawat Tunlaya-Anukit (4)
  • Victor Busov (5, 6)
  • Vincent Chiang (4, 7)
  • Hairong Wei (1, 5, 6, 7)

  1. Department of Computer Science, Michigan Technological University, Houghton, USA
  2. Computer Science and Engineering Program, Michigan Technological University, Houghton, USA
  3. Department of Computer Technology, Chongqing University of Science and Technology, Chongqing, People’s Republic of China
  4. Forest Biotechnology Group, Department of Forestry and Environmental Resources, North Carolina State University, Raleigh, USA
  5. School of Forest Resources and Environmental Science, Michigan Technological University, Houghton, USA
  6. Biotechnology Research Center, Michigan Technological University, Houghton, USA
  7. State Key Laboratory of Forest Tree Genetics and Breeding, Northeast Forestry University, Harbin, People’s Republic of China
