
Actively learning to infer social ties

Data Mining and Knowledge Discovery

Abstract

We study the extent to which social ties between people can be inferred in large social networks, in particular via active user interactions. In most online social networks, relationships lack meaningful labels (e.g., “colleague” and “intimate friend”) for various reasons. Understanding how different types of social relationships form can provide insights into the micro-level dynamics of the social network. In this work, we precisely define the problem of inferring social ties and propose a Partially-Labeled Pairwise Factor Graph Model (PLP-FGM) for learning to infer the type of social relationships. The model formalizes the problem of inferring social ties as a flexible semi-supervised framework. We test the model on three different genres of data sets and demonstrate its effectiveness. We further study how to leverage user interactions to improve inference accuracy. Two active learning algorithms are proposed to select relationships for which users are queried for labels. Experimental results show that with only a few user corrections, the accuracy of inferring social ties can be significantly improved. Finally, to scale the model to real large networks, we develop a distributed learning algorithm.
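
The abstract does not specify the two active learning algorithms, so the following is only a generic illustration of the idea of selecting relationships to query. It sketches plain uncertainty sampling: given a predicted label distribution for each unlabeled relationship, it picks the relationships whose distribution has the highest entropy. The function names, the `marginals` structure, and the example users and probabilities are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Tuple

# A relationship is a pair of user identifiers (hypothetical representation).
Edge = Tuple[str, str]


def entropy(dist: Dict[str, float]) -> float:
    """Shannon entropy (in nats) of a label distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0.0)


def select_queries(marginals: Dict[Edge, Dict[str, float]], budget: int) -> List[Edge]:
    """Return up to `budget` relationships with the most uncertain predictions.

    `marginals` maps each unlabeled relationship to an assumed predicted
    distribution over relationship types (for example, produced by some
    inference procedure that is not shown here).
    """
    ranked = sorted(marginals, key=lambda e: entropy(marginals[e]), reverse=True)
    return ranked[:budget]


# Made-up predictions for three relationships, purely for illustration.
marginals = {
    ("ann", "bob"): {"colleague": 0.95, "friend": 0.05},
    ("ann", "cat"): {"colleague": 0.50, "friend": 0.50},
    ("bob", "dan"): {"colleague": 0.70, "friend": 0.30},
}

# The two least certain relationships are chosen for user labeling.
print(select_queries(marginals, budget=2))
```

In this sketch, the relationships a user would be asked to label are simply those the model is least sure about; a network-aware strategy would also account for how a corrected label influences neighboring relationships, which this example does not attempt.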


Author information

Corresponding author

Correspondence to Jie Tang.

Additional information

Responsible editors: Dimitrios Gunopulos, Donato Malerba and Michalis Vazirgiannis.

About this article

Cite this article

Zhuang, H., Tang, J., Tang, W. et al. Actively learning to infer social ties. Data Min Knowl Disc 25, 270–297 (2012). https://doi.org/10.1007/s10618-012-0274-x

  • DOI: https://doi.org/10.1007/s10618-012-0274-x
