Abstract
Most biological processes, including diseases, are multifactorial and determined by a complex interplay of genetic and environmental factors. This chapter provides a user guide to data querying, analysis, and visualization with TargetMine and its associated auxiliary toolkit. We also discuss some commonly used data queries for researchers interested in gene set analysis within a data warehouse framework. Overall, TargetMine provides a convenient web browser-based interface for discovering new hypotheses interactively: users can analyze omics data with complex searches without any scripting or programming effort, and the results are presented in an easy-to-comprehend output format.
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Chen, YA., Tripathi, L.P., Mizuguchi, K. (2019). Data Warehousing with TargetMine for Omics Data Analysis. In: Bolón-Canedo, V., Alonso-Betanzos, A. (eds) Microarray Bioinformatics. Methods in Molecular Biology, vol 1986. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9442-7_3
DOI: https://doi.org/10.1007/978-1-4939-9442-7_3
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-4939-9441-0
Online ISBN: 978-1-4939-9442-7
eBook Packages: Springer Protocols