Boosting for graph classification with universum

Pan, Shirui; Wu, Jia; Zhu, Xingquan; Long, Guodong; Zhang, Chengqi

doi:10.1007/s10115-016-0934-z

Regular Paper
Published: 19 March 2016

Volume 50, pages 53–77, (2017)
Knowledge and Information Systems

Abstract

Recent years have witnessed extensive studies of graph classification due to the rapid increase in applications involving structural data and complex relationships. To support graph classification, all existing methods require that training graphs be relevant (or belong) to the target class, and cannot integrate graphs irrelevant to the class of interest into the learning process. In this paper, we study a new universum graph classification framework which leverages additional “non-example” graphs to help improve graph classification accuracy. We argue that although universum graphs do not belong to the target class, they may contain meaningful structural patterns that help enrich the feature space for graph representation and classification. To support universum graph classification, we propose a mathematical programming algorithm, ugBoost, which integrates discriminative subgraph selection and margin maximization into a unified framework to fully exploit the universum. Because informative subgraph exploration in a universum setting requires searching a large space, we derive an upper bound on the discriminative score of each subgraph and employ a branch-and-bound scheme to prune the search space. Using the explored subgraphs, our graph classification model aims to maximize the margin between positive and negative graphs while simultaneously minimizing the loss on the universum graph examples. The subgraph exploration and the learning are integrated and performed iteratively so that each benefits the other. Experimental results and comparisons on real-world datasets demonstrate the performance of our algorithm.
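The learning objective described in the abstract can be illustrated with a minimal Python sketch. It evaluates a primal objective of the form that appears in the Lagrangian of the Appendix: a weight penalty, a hinge loss on labeled graphs, and an ε-insensitive loss on universum graphs. Several things here are assumptions made for illustration rather than details confirmed by this section: the function names, the toy data, the use of ±1 subgraph indicator features, and reading the norm as the sum of nonnegative weights.

```python
from typing import Sequence


def decision_value(w: Sequence[float], h: Sequence[int]) -> float:
    """Weighted vote f(G) = sum_k w_k * h_k(G) over subgraph indicator features."""
    return sum(wk * hk for wk, hk in zip(w, h))


def universum_objective(w, labeled, labels, universum, C_l=1.0, C_u=1.0, eps=0.1):
    """Evaluate the (assumed) primal objective for a fixed weight vector w.

    For fixed w the optimal slacks have closed forms:
      xi_i          = max(0, 1 - y_i * f(G_i))     hinge loss on labeled graphs
      psi_j + eta_j = max(0, |f(G_j)| - eps)       eps-insensitive loss on universum
    """
    regularizer = sum(w)  # assumption: w >= 0, so the norm is the sum of weights
    hinge = sum(
        max(0.0, 1.0 - y * decision_value(w, h)) for h, y in zip(labeled, labels)
    )
    insensitive = sum(
        max(0.0, abs(decision_value(w, h)) - eps) for h in universum
    )
    return regularizer + C_l * hinge + C_u * insensitive


# Hypothetical toy data: each row holds +1/-1 indicators for two subgraph features.
labeled = [[1, -1], [1, 1], [-1, 1], [-1, -1]]
labels = [1, 1, -1, -1]
universum = [[1, 1], [-1, 1]]

small = universum_objective([0.5, 0.0], labeled, labels, universum, eps=0.25)
large = universum_objective([1.0, 0.0], labeled, labels, universum, eps=0.25)
print(small, large)
```

In this toy case, raising the first weight removes the hinge loss on the labeled graphs but pushes the universum graphs further from the decision boundary, which is the trade-off the parameters `C_l` and `C_u` control.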

Notes

We use bold-faced letters ($\varvec{w}, \varvec{\xi }$) to denote vectors and normal letters ($w_i$, $\xi _i$) to denote scalar values.
The derivation from Eq. (7) to Eq. (8) is given in the Appendix.
We can obtain the dual solutions of Eq. (8) immediately after solving Eq. (7) using the CVX package, available from http://cvxr.com/cvx/.
Available at http://www.epa.gov/ncct/dsstox/sdf_epafhm.html.
http://arnetminer.org/citation.
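Once dual solutions are available, as the note on CVX above describes, one plausible use is to score candidate subgraphs against the dual constraint derived in the Appendix. The Python sketch below does this in the style of column generation. That the duals are used this way is an assumption, since this section does not spell it out, and the function names, candidate names, and numbers are all hypothetical.

```python
def column_score(alpha, labels, col_labeled, beta, p, col_universum):
    """Left-hand side of the dual constraint for one candidate subgraph g_k:
    sum_i alpha_i * y_i * h_k(G_i) + sum_j (p_j - beta_j) * h_k(G_j).
    """
    labeled_part = sum(a * y * h for a, y, h in zip(alpha, labels, col_labeled))
    universum_part = sum((pj - bj) * h for pj, bj, h in zip(p, beta, col_universum))
    return labeled_part + universum_part


def most_violated(candidates, alpha, labels, beta, p, tol=1e-9):
    """Return the candidate whose score exceeds 1 by the largest amount, or None."""
    best_name, best_score = None, 1.0 + tol
    for name, (col_labeled, col_universum) in candidates.items():
        score = column_score(alpha, labels, col_labeled, beta, p, col_universum)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical dual values for four labeled graphs and two universum graphs.
alpha = [0.5, 0.5, 0.5, 0.5]
labels = [1, 1, -1, -1]
beta = [0.25, 0.0]
p = [0.0, 0.25]

# Each candidate maps to (+1/-1 indicators on labeled graphs, on universum graphs).
candidates = {
    "g1": ([1, 1, -1, -1], [1, -1]),
    "g2": ([1, -1, 1, -1], [1, 1]),
}

picked = most_violated(candidates, alpha, labels, beta, p)
print(picked)
```

A candidate whose score exceeds 1 violates the dual constraint, so adding it as a new feature can improve the current solution. When no candidate exceeds 1, the function returns `None`, which corresponds to the usual stopping condition in column generation.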

Author information

Authors and Affiliations

Centre for Quantum Computation and Intelligent Systems, FEIT, University of Technology Sydney, Sydney, Australia
Shirui Pan, Jia Wu, Guodong Long & Chengqi Zhang
Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL, USA
Xingquan Zhu

Corresponding author

Correspondence to Jia Wu.

Appendix

Lagrangian Dual of Eq. (7). The Lagrangian function of Eq. (7) can be written as:

$$\begin{aligned} L(\varvec{\xi },\varvec{\psi },\varvec{\eta }, \varvec{w}) &= \Vert \varvec{w}\Vert + C_l\sum _{i=1}^{l} \xi _i + C_u\sum _{j=l+1}^{n}(\psi _j+\eta _j) \\ &\quad - \sum _{i=1}^{l}\alpha _i\Big \{y_i \sum _{k=1}^{m}w_k \hbar _{g_k}(G_i) + \xi _i - 1\Big \} \\ &\quad + \sum _{j=l+1}^{n}\beta _j\Big \{\sum _{k=1}^{m}w_k \hbar _{g_k}(G_j) - \varepsilon - \psi _j\Big \} \\ &\quad - \sum _{j=l+1}^{n}p_j\Big \{\sum _{k=1}^{m}w_k \hbar _{g_k}(G_j) + \varepsilon + \eta _j\Big \} \\ &\quad - \varvec{r}^T\varvec{w} - \varvec{s}^T\varvec{\xi } - \varvec{q}^T\varvec{\psi } - \varvec{z}^T\varvec{\eta } \end{aligned}$$

(14)

where the Lagrange multipliers satisfy $\alpha _i \ge 0$, $\beta _j \ge 0$, $p_j \ge 0$, $r_k \ge 0$, $s_i \ge 0$, $q_j \ge 0$ and $z_j \ge 0$.

At the optimum, the first derivatives of the Lagrangian with respect to the primal variables ($\varvec{\xi }$, $\varvec{w}$, $\varvec{\psi }$ and $\varvec{\eta }$) must vanish:

$$\begin{aligned} \frac{\partial L}{\partial \xi _i} &= C_l - \alpha _i - s_i = 0 ~~\Rightarrow ~~ 0 \le \alpha _i \le C_l \\ \frac{\partial L}{\partial \psi _j} &= C_u - \beta _j - q_j = 0 ~~\Rightarrow ~~ 0 \le \beta _j \le C_u \\ \frac{\partial L}{\partial \eta _j} &= C_u - p_j - z_j = 0 ~~\Rightarrow ~~ 0 \le p_j \le C_u \\ \frac{\partial L}{\partial w_k} &= 1 - \sum _{i=1}^{l}\alpha _i y_i \hbar _{g_k}(G_i) + \sum _{j=l+1}^{n} \beta _j \hbar _{g_k}(G_j) - \sum _{j=l+1}^{n} p_j \hbar _{g_k}(G_j) - r_k = 0 \\ &\Rightarrow ~~ \sum _{i=1}^{l}\alpha _i y_i \hbar _{g_k}(G_i) + \sum _{j=l+1}^{n} (p_j-\beta _j) \hbar _{g_k}(G_j) \le 1, \quad \forall k \end{aligned}$$

Substituting these conditions into Eq. (14), we obtain the dual problem given in Eq. (8).
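The substitution can be sketched as follows. This is a reconstruction from Eq. (14) and the stationarity conditions above, and it assumes $\Vert \varvec{w}\Vert = \sum _{k} w_k$ with $\varvec{w} \ge 0$, which is consistent with the leading 1 in $\partial L/\partial w_k$. The form of Eq. (8) in the main text is not reproduced in this section and may be written differently. Grouping Eq. (14) by primal variable, every coefficient of $w_k$, $\xi _i$, $\psi _j$ and $\eta _j$ is one of the derivatives above and therefore vanishes, leaving only the constant terms:

$$\begin{aligned} L = \sum _{i=1}^{l}\alpha _i - \varepsilon \sum _{j=l+1}^{n}(\beta _j + p_j) \end{aligned}$$

Maximizing this over the multipliers, subject to the conditions implied by stationarity, gives a dual of the form:

$$\begin{aligned} \max _{\varvec{\alpha },\varvec{\beta },\varvec{p}} \quad & \sum _{i=1}^{l}\alpha _i - \varepsilon \sum _{j=l+1}^{n}(\beta _j + p_j) \\ \text {s.t.} \quad & \sum _{i=1}^{l}\alpha _i y_i \hbar _{g_k}(G_i) + \sum _{j=l+1}^{n}(p_j - \beta _j)\hbar _{g_k}(G_j) \le 1, \quad \forall k \\ & 0 \le \alpha _i \le C_l, \quad 0 \le \beta _j \le C_u, \quad 0 \le p_j \le C_u \end{aligned}$$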

About this article

Cite this article

Pan, S., Wu, J., Zhu, X. et al. Boosting for graph classification with universum. Knowl Inf Syst 50, 53–77 (2017). https://doi.org/10.1007/s10115-016-0934-z

Received: 31 December 2014
Revised: 16 November 2015
Accepted: 03 March 2016
Published: 19 March 2016
Issue Date: January 2017
DOI: https://doi.org/10.1007/s10115-016-0934-z
