Model simplification for supervised classification of metabolic networks

  • Ilaria Granata
  • Mario R. Guarracino
  • Valery A. Kalyagin
  • Lucia Maddalena
  • Ichcha Manipur
  • Panos M. Pardalos


Many real applications require the representation of complex entities and their relations. Networks are frequently the data structure of choice, owing to their ability to highlight topological and qualitative characteristics. In this work, we are interested in supervised classification models for data in the form of networks. Given two or more classes whose members are networks, we build mathematical models to classify them, based on various graph distances. Because the models are complex, comprising tens of thousands of nodes and edges, we focus on model simplification solutions that reduce execution times while maintaining high accuracy. Experimental results on three datasets of biological interest show the achieved performance improvements.
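As a rough illustration of distance-based classification of networks, the sketch below assigns a query network the label of its nearest training network. It is not the method of the paper: the distance used here (the square root of the Jensen-Shannon divergence between degree distributions), the 1-nearest-neighbour rule, the toy edge lists, and the function names `degree_distribution`, `js_distance`, and `classify` are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Map each degree value to the fraction of nodes having that degree.

    `edges` is an undirected edge list; isolated nodes are not represented.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    histogram = Counter(degree.values())
    n_nodes = len(degree)
    return {d: count / n_nodes for d, count in histogram.items()}


def js_distance(p, q):
    """Square root of the base-2 Jensen-Shannon divergence between p and q."""
    divergence = 0.0
    for d in set(p) | set(q):
        pd, qd = p.get(d, 0.0), q.get(d, 0.0)
        md = (pd + qd) / 2  # mixture distribution at degree d
        if pd > 0:
            divergence += 0.5 * pd * math.log2(pd / md)
        if qd > 0:
            divergence += 0.5 * qd * math.log2(qd / md)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


def classify(train, query_edges):
    """1-nearest-neighbour rule: label of the closest training network."""
    q = degree_distribution(query_edges)
    nearest = min(
        train,
        key=lambda item: js_distance(degree_distribution(item[0]), q),
    )
    return nearest[1]


# Hypothetical toy data: a star-shaped and a chain-shaped training network.
star4 = [("c", "a"), ("c", "b"), ("c", "d"), ("c", "e")]
path5 = [(1, 2), (2, 3), (3, 4), (4, 5)]
train = [(star4, "hub"), (path5, "chain")]

# Query networks with the same shapes but different sizes.
star5 = [(0, i) for i in range(1, 6)]
path6 = [(i, i + 1) for i in range(1, 6)]

print(classify(train, star5))
print(classify(train, path6))
```

Summarising each network by its degree distribution is one simple way to cut the cost of comparing large graphs, at the price of discarding most structural detail; richer distances trade execution time for discriminative power.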


Supervised classification, Network data, Metabolic networks, Network model simplification

Mathematics Subject Classification (2010)

62-09 65S05 68R10 92C42 




This work was also carried out within the activities of M.R.G. and L.M. as members of the INdAM Research group GNCS. The authors would like to thank G. Trerotola for technical support.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. ICAR, National Research Council of Italy, Naples, Italy
  2. LATNA, National Research University Higher School of Economics, Nizhny Novgorod, Russia
  3. CAO, University of Florida, Gainesville, USA
