gBoost: a mathematical programming approach to graph classification and regression

Abstract

Graph mining methods enumerate frequently appearing subgraph patterns, which can be used as features for subsequent classification or regression. However, frequent patterns are not necessarily informative for the given learning problem. We propose a mathematical programming boosting method (gBoost) that progressively collects informative patterns. Compared to AdaBoost, gBoost can build the prediction rule with fewer iterations. To apply the boosting method to graph data, a branch-and-bound pattern search algorithm is developed based on the DFS code tree. The constructed search space is reused in later iterations to minimize the computation time. Our method can learn more efficiently than the simpler method based on frequent substructure mining, because the output labels are used as an extra information source for pruning the search space. Furthermore, by engineering the mathematical program, a wide range of machine learning problems can be solved without modifying the pattern search algorithm.
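The pruning idea in the abstract, using the output labels to cut branches of the pattern search, can be illustrated with a small sketch. It is not the authors' implementation: item sets stand in for subgraphs (both share the property that a larger pattern can only match a subset of the examples matched by a smaller one), a plain depth-first enumeration stands in for the DFS code tree, and the names `gain`, `bound` and `best_pattern` are hypothetical. The hypothesis is assumed to be a decision stump that outputs +1 or -1 depending on whether the pattern occurs, and the bound is the usual upper bound on the weighted gain of any superpattern.

```python
def gain(support, u, y):
    """Weighted gain of the better-signed stump for a pattern matching `support`."""
    total = sum(ui * yi for ui, yi in zip(u, y))
    inside = sum(u[n] * y[n] for n in support)
    return abs(2.0 * inside - total)


def bound(support, u, y):
    """Upper bound on the gain of any superpattern (its support is a subset)."""
    total = sum(ui * yi for ui, yi in zip(u, y))
    pos = sum(u[n] for n in support if y[n] > 0)
    neg = sum(u[n] for n in support if y[n] < 0)
    return max(2.0 * pos - total, 2.0 * neg + total)


def best_pattern(data, u, y):
    """Branch-and-bound search for the item set with the largest gain."""
    items = sorted(set().union(*data))
    best = {"gain": -1.0, "pattern": None, "visited": 0}

    def search(pattern, support, start):
        for i in range(start, len(items)):
            new_support = [n for n in support if items[i] in data[n]]
            if not new_support:
                continue  # pattern occurs nowhere, so it is not enumerated
            best["visited"] += 1
            new_pattern = pattern + (items[i],)
            g = gain(new_support, u, y)
            if g > best["gain"]:
                best["gain"], best["pattern"] = g, new_pattern
            # Descend only if some superpattern could still beat the best gain.
            if bound(new_support, u, y) > best["gain"]:
                search(new_pattern, new_support, i + 1)

    search((), list(range(len(data))), 0)
    return best["pattern"], best["gain"], best["visited"]


# Toy data: five "graphs" as item sets, labels in {+1, -1}, uniform weights.
data = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}, {"c"}]
y = [1, 1, -1, 1, -1]
u = [0.2] * 5

pattern, g, visited = best_pattern(data, u, y)
print(pattern, round(g, 6), visited)
```

In a boosting loop the weights `u` would be updated after each iteration (for example from the dual variables of the mathematical program), and the same search would be rerun to find the next informative pattern. A search driven only by frequency could not use the labels `y` to discard branches in this way.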

Author information

Corresponding author

Correspondence to Hiroto Saigo.

Additional information

Editors: Thomas Gärtner and Gemma C. Garriga.

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

Saigo, H., Nowozin, S., Kadowaki, T. et al. gBoost: a mathematical programming approach to graph classification and regression. Mach Learn 75, 69–89 (2009). https://doi.org/10.1007/s10994-008-5089-z

  • DOI: https://doi.org/10.1007/s10994-008-5089-z

Keywords

  • Graph mining
  • Mathematical programming
  • Classification
  • Regression
  • QSAR