Inferring Disease-Related Metabolite Dependencies with a Bayesian Optimization Algorithm

  • Holger Franken
  • Alexander Seitz
  • Rainer Lehmann
  • Hans-Ulrich Häring
  • Norbert Stefan
  • Andreas Zell
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7246)


Understanding disease-related metabolite interactions is a key issue in computational biology. We apply a modified Bayesian Optimization Algorithm to targeted metabolomics data from plasma samples of insulin-sensitive and insulin-resistant subjects, both groups suffering from non-alcoholic fatty liver disease. In addition to improving the classification accuracy by selecting relevant features, we extract the information that led to their selection and reconstruct networks from the detected feature dependencies. We compare the influence of a variety of classifiers and scoring metrics and examine whether the reconstructed networks represent physiological metabolite interconnections. We find that the presented method significantly improves the classification accuracy of otherwise hardly classifiable metabolomics data and that the detected metabolite dependencies can be mapped to physiological pathways, which are in turn supported by literature from the domain.
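The general idea of selecting features with an estimation-of-distribution search and then reading dependencies off the promising feature subsets can be illustrated with a minimal sketch. It is not the authors' modified Bayesian Optimization Algorithm: the Bayesian-network model is replaced here by a simpler univariate (UMDA-style) model, and pairwise mutual information between selection bits stands in for learned network edges. The nearest-centroid classifier, the synthetic data, and all function names and parameters are assumptions made purely for illustration.

```python
import random
from itertools import combinations
from math import log


def make_toy_data(n_samples=40, n_features=6, seed=1):
    """Synthetic data (an assumption): only feature 0 separates the two classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 4 * label
        X.append(row)
        y.append(label)
    return X, y


def nearest_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for i in range(len(X)):
        best_label, best_dist = None, float("inf")
        for label in sorted(set(y)):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            if not rows:
                continue
            centroid = [sum(r[j] for r in rows) / len(rows) for j in cols]
            dist = sum((X[i][j] - c) ** 2 for j, c in zip(cols, centroid))
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / len(X)


def eda_feature_selection(X, y, pop_size=20, generations=10, seed=0):
    """Univariate EDA over feature masks (a simplification, not BOA itself)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    probs = [0.5] * n_features  # per-feature inclusion probabilities
    cache = {}

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = nearest_centroid_accuracy(X, y, mask)
        return cache[key]

    best_mask, best_fit, elite = None, -1.0, []
    for _ in range(generations):
        population = [[int(rng.random() < p) for p in probs] for _ in range(pop_size)]
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]  # the promising half of the population
        if fitness(elite[0]) > best_fit:
            best_mask, best_fit = elite[0], fitness(elite[0])
        # Re-estimate the model from the elite, clamped to keep some diversity.
        probs = [min(0.95, max(0.05, sum(m[j] for m in elite) / len(elite)))
                 for j in range(n_features)]
    return best_mask, best_fit, elite


def mutual_information(a, b):
    """Mutual information (in nats) between two equally long binary sequences."""
    n = len(a)
    mi = 0.0
    for va in (0, 1):
        for vb in (0, 1):
            p_ab = sum(1 for x, z in zip(a, b) if x == va and z == vb) / n
            p_a = sum(1 for x in a if x == va) / n
            p_b = sum(1 for z in b if z == vb) / n
            if p_ab > 0:
                mi += p_ab * log(p_ab / (p_a * p_b))
    return mi


def dependency_edges(elite, threshold=0.05):
    """Feature pairs whose selection bits co-vary in the elite (stand-in for network edges)."""
    n = len(elite[0])
    cols = [[m[j] for m in elite] for j in range(n)]
    return [(i, j) for i, j in combinations(range(n), 2)
            if mutual_information(cols[i], cols[j]) > threshold]
```

Calling `eda_feature_selection(*make_toy_data())` returns the best mask found, its leave-one-out accuracy, and the final elite population, which `dependency_edges` then turns into candidate feature pairs. A full Bayesian Optimization Algorithm would instead learn a Bayesian network over the selection bits in every generation and sample new candidates from it, so dependencies would shape the search rather than being read off afterwards.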


Feature Selection; Bayesian Network; Bayesian Information Criterion; Feature Subset; Nonalcoholic Fatty Liver Disease
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Holger Franken (1)
  • Alexander Seitz (1)
  • Rainer Lehmann (2, 3)
  • Hans-Ulrich Häring (2, 3)
  • Norbert Stefan (2, 3)
  • Andreas Zell (1)

  1. Center for Bioinformatics (ZBIT), University of Tübingen, Tübingen, Germany
  2. Division of Clinical Chemistry and Pathobiochemistry (Central Laboratory), University Hospital Tübingen, Tübingen, Germany
  3. Paul-Langerhans-Institute Tübingen, German Centre for Diabetes Research (DZD), Eberhard Karls University Tübingen, Tübingen, Germany
