Knowledge and Information Systems, Volume 50, Issue 1, pp 53–77

Boosting for graph classification with universum

  • Shirui Pan
  • Jia Wu
  • Xingquan Zhu
  • Guodong Long
  • Chengqi Zhang
Regular Paper


Recent years have witnessed extensive studies of graph classification, driven by the rapid increase in applications involving structural data and complex relationships. To support graph classification, all existing methods require that training graphs be relevant to (or belong to) the target class; they cannot integrate graphs irrelevant to the class of interest into the learning process. In this paper, we study a new universum graph classification framework that leverages additional “non-example” graphs to help improve graph classification accuracy. We argue that although universum graphs do not belong to the target class, they may contain meaningful structural patterns that help enrich the feature space for graph representation and classification. To support universum graph classification, we propose a mathematical programming algorithm, ugBoost, which integrates discriminative subgraph selection and margin maximization into a unified framework to fully exploit the universum. Because informative subgraph exploration in a universum setting requires searching a large space, we derive an upper bound on the discriminative score of each subgraph and employ a branch-and-bound scheme to prune the search space. Using the explored subgraphs, our graph classification model aims to maximize the margin between positive and negative graphs while simultaneously minimizing the loss on the universum graph examples. Subgraph exploration and learning are integrated and performed iteratively so that each benefits the other. Experimental results and comparisons on real-world datasets demonstrate the performance of our algorithm.
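
The combined goal described above (a large margin on labeled graphs, a small loss on universum graphs) can be illustrated with a minimal sketch. This is not the ugBoost formulation itself, which the abstract does not spell out. It assumes each graph is represented by binary subgraph-indicator features, uses a hinge loss for labeled graphs, and borrows the epsilon-insensitive universum loss commonly used in universum learning. All names (`universum_objective`, `c_labeled`, `c_universum`, `epsilon`) and the L1 regularizer are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def decision_value(weights: Sequence[float], features: Sequence[int]) -> float:
    """Linear score of one graph: weighted sum over its subgraph indicators."""
    return sum(w * x for w, x in zip(weights, features))


def hinge_loss(weights: Sequence[float], features: Sequence[int],
               label: int, margin: float = 1.0) -> float:
    """Penalty when a labeled graph (label +1 or -1) falls inside the margin."""
    return max(0.0, margin - label * decision_value(weights, features))


def universum_loss(weights: Sequence[float], features: Sequence[int],
                   epsilon: float = 0.1) -> float:
    """Epsilon-insensitive penalty: universum graphs should score near zero,
    i.e. lie close to the decision boundary (assumed loss, not from the paper)."""
    return max(0.0, abs(decision_value(weights, features)) - epsilon)


def universum_objective(weights: Sequence[float],
                        labeled: List[Tuple[Sequence[int], int]],
                        universum: List[Sequence[int]],
                        c_labeled: float = 1.0,
                        c_universum: float = 0.5,
                        epsilon: float = 0.1) -> float:
    """L1 regularizer + weighted labeled loss + weighted universum loss."""
    regularizer = sum(abs(w) for w in weights)
    labeled_term = sum(hinge_loss(weights, x, y) for x, y in labeled)
    universum_term = sum(universum_loss(weights, x, epsilon) for x in universum)
    return regularizer + c_labeled * labeled_term + c_universum * universum_term


if __name__ == "__main__":
    # Three hypothetical subgraph features; each row marks which ones occur.
    w = [1.0, -1.0, 0.0]
    labeled_graphs = [([1, 0, 1], +1), ([0, 1, 0], -1)]
    universum_graphs = [[1, 1, 0], [1, 0, 0]]
    print(universum_objective(w, labeled_graphs, universum_graphs))
```

In this toy setting, a universum graph containing both the positive-leaning and the negative-leaning subgraph scores zero and incurs no penalty, while one containing only the positive-leaning subgraph is penalized for sitting far from the boundary. A boosting-style method would additionally search for new subgraph features to add to `w`, which this sketch leaves out.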


Graph mining, Boosting, Graph classification, Universum, Supervised learning


Copyright information

© Springer-Verlag London 2016

Authors and Affiliations

  • Shirui Pan (1)
  • Jia Wu (1), corresponding author
  • Xingquan Zhu (2)
  • Guodong Long (1)
  • Chengqi Zhang (1)

  1. Centre for Quantum Computation and Intelligent Systems, FEIT, University of Technology Sydney, Sydney, Australia
  2. Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, USA