Identifying Pathways of Coordinated Gene Expression

  • Protocol
Data Mining for Systems Biology

Part of the book series: Methods in Molecular Biology ((MIMB,volume 939))

Abstract

Methods capable of identifying genetic pathways with coordinated expression signatures are critical to advancing our understanding of how biological networks function. Currently, the most comprehensive and best-validated biological networks are metabolic networks, and complete metabolic networks can be readily obtained from multiple online databases. These databases show metabolic networks to be large, highly complex structures, and this complexity can obscure which pathways interact to produce an observed network response. In this chapter we outline a complete framework for identifying the metabolic pathways related to an observed phenomenon. To reveal the functional metabolic pathways, we overlay microarray experiments on a complete metabolic network. We then extract the functional components of the network through a combination of novel pathway ranking, clustering, and classification algorithms. The chapter is designed as a simple tutorial that enables the framework to be applied to any metabolic network and microarray data.
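
As a rough illustration of what overlaying expression data on a network and ranking pathways can look like, the following minimal Python sketch may help. It is not the chapter's algorithm: it assumes a toy directed gene network, scores each edge by the Pearson correlation of the two genes' expression profiles, and ranks paths by exhaustive enumeration rather than by a dedicated k-shortest-path method. All names (`pearson`, `edge_costs`, `rank_paths`) and the gene identifiers and expression values are hypothetical and chosen only for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def edge_costs(edges, expression):
    """Overlay expression on the network (illustrative scoring, an assumption).

    The correlation r in [-1, 1] is mapped to a score in (0, 1], and the
    edge cost is -log(score), so strongly coexpressed edges are cheap.
    """
    costs = {}
    for u, v in edges:
        r = pearson(expression[u], expression[v])
        score = min(max((r + 1.0) / 2.0, 1e-9), 1.0)
        costs[(u, v)] = -math.log(score)
    return costs


def simple_paths(adj, node, target, seen):
    """Yield every loop-free path from node to target (exhaustive search)."""
    if node == target:
        yield [node]
        return
    for nxt in adj.get(node, ()):
        if nxt not in seen:
            for rest in simple_paths(adj, nxt, target, seen | {nxt}):
                yield [node] + rest


def rank_paths(edges, expression, source, target, k=3):
    """Return up to k (cost, path) pairs, lowest total cost first."""
    costs = edge_costs(edges, expression)
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    scored = []
    for path in simple_paths(adj, source, target, {source}):
        cost = sum(costs[(a, b)] for a, b in zip(path, path[1:]))
        scored.append((cost, path))
    scored.sort()
    return scored[:k]


# Hypothetical toy network and expression profiles (four arrays per gene).
edges = [("g1", "g2"), ("g1", "g3"), ("g2", "g4"), ("g3", "g4")]
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],  # rises with g1
    "g3": [4.0, 3.0, 2.0, 1.0],  # falls as g1 rises
    "g4": [1.0, 2.0, 3.0, 5.0],
}
ranked = rank_paths(edges, expression, "g1", "g4")
for cost, path in ranked:
    print(" -> ".join(path), round(cost, 3))
```

In this toy example the path through `g2` ranks first because both of its edges connect positively correlated genes. Exhaustive enumeration is only practical for very small networks; on a real metabolic network a k-shortest-paths procedure would be needed instead.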

Corresponding author

Correspondence to Timothy Hancock.

Copyright information

© 2013 Springer Science+Business Media New York

About this protocol

Cite this protocol

Hancock, T., Takigawa, I., Mamitsuka, H. (2013). Identifying Pathways of Coordinated Gene Expression. In: Mamitsuka, H., DeLisi, C., Kanehisa, M. (eds) Data Mining for Systems Biology. Methods in Molecular Biology, vol 939. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-107-3_7

  • DOI: https://doi.org/10.1007/978-1-62703-107-3_7

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-106-6

  • Online ISBN: 978-1-62703-107-3

  • eBook Packages: Springer Protocols
