Graph Regularized Transductive Classification on Heterogeneous Information Networks

  • Ming Ji
  • Yizhou Sun
  • Marina Danilevsky
  • Jiawei Han
  • Jing Gao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6321)

Abstract

A heterogeneous information network is a network composed of multiple types of objects and links. Recently, it has been recognized that strongly-typed heterogeneous information networks are prevalent in the real world. Label information is sometimes available for a subset of the objects. Learning from such labeled and unlabeled data via transductive classification can help extract knowledge about the hidden network structure. However, although classification on homogeneous networks has been studied for decades, classification on heterogeneous networks has not been explored until recently.

In this paper, we consider the transductive classification problem on heterogeneous networked data that share a common topic. Only some objects in the given network are labeled, and we aim to predict labels for the remaining objects of all types. A novel graph-based regularization framework, GNetMine, is proposed to model the link structure in information networks with an arbitrary network schema and an arbitrary number of object and link types. Specifically, we explicitly respect the type differences by preserving consistency over each relation graph, one per link type, separately. Efficient computational schemes are then introduced to solve the corresponding optimization problem. Experiments on the DBLP data set show that our algorithm significantly improves the classification accuracy over existing state-of-the-art methods.
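
The idea of keeping one relation graph per link type and propagating labels across all of them can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption for illustration only: the symmetric degree normalization, the averaging update rule, the parameter names `lam` and `alpha`, the function names, and the toy paper/author network. It is a generic label-propagation scheme in the spirit of graph regularization, not the authors' GNetMine formulation.

```python
import numpy as np

def _inv_sqrt(d):
    """Element-wise 1/sqrt(d), leaving zero-degree entries at 0."""
    out = np.zeros_like(d, dtype=float)
    mask = d > 0
    out[mask] = 1.0 / np.sqrt(d[mask])
    return out

def normalize_relation(R):
    """Scale a relation matrix by row and column degrees: D_row^-1/2 R D_col^-1/2."""
    R = np.asarray(R, dtype=float)
    return R * _inv_sqrt(R.sum(axis=1))[:, None] * _inv_sqrt(R.sum(axis=0))[None, :]

def propagate_labels(relations, labels, lam=1.0, alpha=1.0, n_iter=100):
    """Hypothetical propagation over typed relation graphs.

    relations: dict mapping (type_a, type_b), with type_a != type_b,
               to a link matrix of shape (n_a, n_b).
    labels:    dict mapping type -> (n_type, n_classes) array; one-hot rows
               for labeled objects, all-zero rows for unlabeled ones.
    """
    # Keep each relation graph separate, stored in both directions.
    S = {}
    for (a, b), R in relations.items():
        S_ab = normalize_relation(R)
        S[(a, b)] = S_ab
        S[(b, a)] = S_ab.T
    Y = {t: np.asarray(y, dtype=float) for t, y in labels.items()}
    F = {t: y.copy() for t, y in Y.items()}
    for _ in range(n_iter):
        new_F = {}
        for t in F:
            neighbors = [b for (a, b) in S if a == t]
            # Pull scores toward the known labels and toward linked objects.
            total = alpha * Y[t]
            for b in neighbors:
                total = total + lam * (S[(t, b)] @ F[b])
            new_F[t] = total / (alpha + lam * len(neighbors))
        F = new_F
    return {t: f.argmax(axis=1) for t, f in F.items()}

# Toy network (made up): 4 papers, 3 authors, 2 classes.
paper_author = np.array([[1, 0, 0],
                         [1, 0, 0],
                         [0, 1, 0],
                         [0, 1, 1]])
labels = {
    "paper": np.array([[1, 0], [0, 0], [0, 1], [0, 0]]),  # papers 0 and 2 labeled
    "author": np.zeros((3, 2)),                            # no author labels
}
pred = propagate_labels({("paper", "author"): paper_author}, labels)
print(pred["paper"], pred["author"])
```

In this toy example, labels given only to two papers reach the unlabeled papers and all authors through the paper-author links, which is the transductive setting the abstract describes: objects of every type receive a prediction even when only some objects of one type are labeled.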

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Ming Ji (1)
  • Yizhou Sun (1)
  • Marina Danilevsky (1)
  • Jiawei Han (1)
  • Jing Gao (1)

  1. Dept. of Computer Science, University of Illinois at Urbana-Champaign, Urbana
