A Markov Classification Model for Metabolic Pathways

  • Timothy Hancock
  • Hiroshi Mamitsuka
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5724)


The size and complexity of metabolic networks have increased past the point where a researcher can intuitively understand all interacting components. Confronted with this complexity, biologists must now create models of these networks to identify key relationships of specific interest to their experiments. In this paper we focus on the problem of identifying pathways through metabolic networks that relate to a specific biological response. Our proposed model, HME3M, first identifies frequently traversed network paths using a Markov mixture model. Then, by employing a hierarchical mixture of experts, separate classifiers are built using information specific to each path and combined into an ensemble classifier for the response. We compare the performance of HME3M with logistic regression and support vector machines (SVM) in both simulated and realistic environments. These experiments show that HME3M is a highly interpretable model that outperforms common classification methods for large realistic networks and high levels of pathway noise.
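The first stage described above, grouping observed paths with a Markov mixture model, can be illustrated with a minimal sketch. This is not the authors' HME3M implementation: it fits only a plain mixture of first-order Markov chains by EM and omits the hierarchical mixture of experts that ties each component to a classifier. The function name `fit_markov_mixture`, its parameters, and the toy node labels are all hypothetical.

```python
import math
import random


def fit_markov_mixture(paths, n_components, n_iter=50, seed=0, smooth=1e-3):
    """EM for a mixture of first-order Markov chains over node paths.

    Illustrative assumption only: each path is a list of node labels and
    each mixture component has its own transition table.
    """
    rng = random.Random(seed)
    states = sorted({s for p in paths for s in p})

    # Random soft assignment of every path to the components.
    resp = []
    for _ in paths:
        w = [rng.random() + 1e-6 for _ in range(n_components)]
        total = sum(w)
        resp.append([x / total for x in w])

    for _ in range(n_iter):
        # M-step: mixture weights and responsibility-weighted transition counts.
        weights = [sum(r[m] for r in resp) / len(paths)
                   for m in range(n_components)]
        trans = []
        for m in range(n_components):
            counts = {a: {b: smooth for b in states} for a in states}
            for r, p in zip(resp, paths):
                for a, b in zip(p, p[1:]):
                    counts[a][b] += r[m]
            table = {}
            for a, row in counts.items():
                row_total = sum(row.values())
                table[a] = {b: c / row_total for b, c in row.items()}
            trans.append(table)

        # E-step: posterior probability of each component given each path,
        # computed in log space for numerical stability.
        new_resp = []
        for p in paths:
            logs = [math.log(weights[m] + 1e-300)
                    + sum(math.log(trans[m][a][b]) for a, b in zip(p, p[1:]))
                    for m in range(n_components)]
            top = max(logs)
            w = [math.exp(v - top) for v in logs]
            total = sum(w)
            new_resp.append([x / total for x in w])
        resp = new_resp

    return weights, trans, resp


# Toy usage: two alternative routes from node A to node D.
paths = [["A", "B", "D"]] * 5 + [["A", "C", "D"]] * 5
weights, trans, resp = fit_markov_mixture(paths, n_components=2)
```

Each component's transition table can be read as one candidate pathway, which hints at why a model built on this idea stays interpretable. In the full model described in the abstract, the component memberships would additionally be informed by the response through the per-path classifiers.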


Keywords: Support Vector Machine; Metabolic Network; Support Vector Machine Model; Interpretable Model; Dominant Pathway
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Timothy Hancock (1)
  • Hiroshi Mamitsuka (1)
  1. Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan
