
Evaluating and Extending Latent Methods for Link-Based Classification

  • Luke K. McDowell
  • Aaron Fleming
  • Zane Markel
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 346)

Abstract

Data describing networks such as social networks, citation graphs, hypertext systems, and communication networks is becoming increasingly common and important for analysis. Research on link-based classification studies methods to leverage connections in such networks to improve accuracy. Recently, a number of such methods have been proposed that first construct a set of latent features or links that summarize the network, then use this information for inference. Some work has claimed that such latent methods improve accuracy, but has not compared against the best non-latent methods. In response, this article provides the first substantial comparison between these two groups. Using six real datasets, a range of synthetic data, and multiple underlying models, we show that (non-latent) collective inference methods usually perform best, but that the dataset’s label sparsity, attribute predictiveness, and link density can dramatically affect the performance trends. Inspired by these findings, we introduce three novel algorithms that combine a latent construction with a latent or non-latent method, and demonstrate that they can sometimes substantially increase accuracy.

Keywords

Link-based classification, relational classification, statistical relational learning, latent methods

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Dept. Computer Science, U.S. Naval Academy, Annapolis, USA
