Abstract
Methods capable of identifying genetic pathways with coordinated expression signatures are critical to advancing our understanding of the functions of biological networks. Currently, the most comprehensive and best-validated biological networks are metabolic networks, and complete metabolic networks are readily available from multiple online databases. These databases reveal metabolic networks to be large, highly complex structures, and this complexity is sufficient to obscure which pathways interact to produce an observed network response. In this chapter we outline a complete framework for identifying the metabolic pathways that relate to an observed phenomenon. To reveal the functional metabolic pathways, we overlay microarray experiments on a complete metabolic network. We then extract the functional components within the network through a combination of novel pathway ranking, clustering, and classification algorithms. This chapter is designed as a simple tutorial that enables the framework to be applied to any metabolic network and microarray data set.
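The idea of overlaying expression data on a network and then ranking pathways can be illustrated with a minimal sketch. It is not the chapter's algorithm: the toy gene network, the expression values, the cost definition (one minus the positive Pearson correlation of adjacent genes), and all function names are assumptions made for illustration, and brute-force path enumeration stands in for a proper k-shortest-paths procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def edge_costs(network, expression):
    """Overlay expression on the network: low cost means strong coexpression.

    Assumption: negative correlations are treated as no coexpression.
    """
    costs = {}
    for u, neighbours in network.items():
        for v in neighbours:
            r = pearson(expression[u], expression[v])
            costs[(u, v)] = 1.0 - max(r, 0.0)
    return costs


def simple_paths(network, source, target, path=None):
    """Yield every loop-free path from source to target (brute force)."""
    path = (path or []) + [source]
    if source == target:
        yield path
        return
    for nxt in network.get(source, []):
        if nxt not in path:
            yield from simple_paths(network, nxt, target, path)


def rank_paths(network, expression, source, target, k=3):
    """Return up to k (total_cost, path) pairs, lowest cost first."""
    costs = edge_costs(network, expression)
    scored = [
        (sum(costs[edge] for edge in zip(p, p[1:])), p)
        for p in simple_paths(network, source, target)
    ]
    return sorted(scored)[:k]


# Hypothetical toy network (gene -> downstream genes) and expression profiles.
network = {"g1": ["g2", "g3"], "g2": ["g4"], "g3": ["g4"], "g4": []}
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],  # rises and falls with g1
    "g3": [4.0, 1.0, 3.0, 2.0],  # does not follow g1
    "g4": [1.5, 2.5, 3.5, 4.5],  # rises and falls with g1 and g2
}

ranked = rank_paths(network, expression, "g1", "g4")
for cost, path in ranked:
    print(f"{cost:.3f}", " -> ".join(path))
```

In this toy example the path through `g2` ranks first because every adjacent gene pair along it is strongly coexpressed. On a real metabolic network, exhaustive enumeration would be infeasible, which is why a k-shortest-paths method is the natural replacement for `simple_paths`.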
References
Kanehisa M, Goto S (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28:27–30
Caspi R, Altman T, Dale JM, Dreher K, Fulcher CA, Gilham F, Kaipa P, Karthikeyan AS, Kothari A, Krummenacker M, Latendresse M, Mueller LA, Paley S, Popescu L, Pujar A, Shearer AG, Zhang P, Karp PD (2010) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res 38(Database issue):D473–9
Matthews L, Gopinath G, Gillespie M, Caudy M, Croft D, de Bono B, Garapati P, Hemish J, Hermjakob H, Jassal B, Kanapin A, Lewis S, Mahajan S, May B, Schmidt E, Vastrik I, Wu G, Birney E, Stein L, D’Eustachio P (2009) Reactome knowledgebase of human biological pathways and processes. Nucleic Acids Res 37:D619–22
Vastrik I, D’Eustachio P, Schmidt E, Gopinath G, Croft D, de Bono B, Gillespie M, Jassal B, Lewis S, Matthews L, Wu G, Birney E, Stein L (2007) Reactome: a knowledge base of biologic pathways and processes. Genome Biol 8(3):R39
Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. PNAS 102(43):15545–15550
Liu Q, Dinu I, Adewale AJ, Potter JD, Yasui Y (2007) Comparative evaluation of gene-set analysis methods. BMC Bioinformatics 8:431
Sanguinetti G, Noirel J, Wright PC (2008) MMG: a probabilistic tool to identify submodules of metabolic pathways. Bioinformatics 24(8):1078–84
Hanisch D, Zien A, Zimmer R, Lengauer T (2002) Co-clustering of biological networks and gene expression data. Bioinformatics 18:S145–54
Ulitsky I, Shamir R (2007) Identification of functional modules using network topology and high-throughput data. BMC Syst Biol 1:8
Ideker T, Ozier O, Schwikowski B, Siegel AF (2002) Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics 18:S233–40
Hancock T, Takigawa I, Mamitsuka H (2010) Mining metabolic pathways through gene expression. Bioinformatics 26(17):2128–2135
Takigawa I, Mamitsuka H (2008) Probabilistic path ranking based on adjacent pairwise coexpression for metabolic transcripts analysis. Bioinformatics 24(2):250–257
Mamitsuka H, Okuno Y, Yamaguchi A (2003) Mining biologically active patterns in metabolic pathways using microarray expression profiles. SIGKDD Explorations 5(2):113–121
Hancock T, Mamitsuka H (2009) A Markov classification model for metabolic pathways. In: Salzberg S, Warnow T (eds) Algorithms in Bioinformatics, 9th International Workshop (WABI), Philadelphia, PA, USA. Proceedings, vol 5724. Springer
Hancock T, Mamitsuka H (2010) A Markov classification model for metabolic pathways. Algorithms Mol Biol 5(1):10
Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, Storz G, Botstein D, Brown PO (2000) Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell 11(12):4241–57
Gentleman R, Carey V, Huber W, Irizarry R, Dudoit S (2005) Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer, Berlin
Gasch A, Carey V (2010) gaschYHS: ExpressionSet for response of yeast to heat shock and other environmental stresses. www.bioconductor.org
Yen J (1971) Finding the k-shortest loopless paths in a network. Manage Sci 17:712–716
Lawler E (1972) A procedure for computing the k best solutions to discrete optimization problems and its application to the shortest path problem. Manage Sci 18:401–405
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J Roy Stat Soc B 39(1):1–38
Park MY, Hastie T (2008) Penalized logistic regression for detecting gene interactions. Biostatistics 9(1):30–50
Jordan M, Jacobs R (1994) Hierarchical mixtures of experts and the EM algorithm. Neural Comput 6(2):181–214
Copyright information
© 2013 Springer Science+Business Media New York
About this protocol
Cite this protocol
Hancock, T., Takigawa, I., Mamitsuka, H. (2013). Identifying Pathways of Coordinated Gene Expression. In: Mamitsuka, H., DeLisi, C., Kanehisa, M. (eds) Data Mining for Systems Biology. Methods in Molecular Biology, vol 939. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-107-3_7
DOI: https://doi.org/10.1007/978-1-62703-107-3_7
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-106-6
Online ISBN: 978-1-62703-107-3
eBook Packages: Springer Protocols