Predicting neighbor label distributions in dynamic heterogeneous information networks

World Wide Web

Abstract

In Dynamic Heterogeneous Information Networks (DHINs), predicting a node’s neighbor label distribution is important for a variety of applications. For example, when a user changes jobs, the composition of the user’s friends can change, and so can the profession distribution of the user’s social circle. Accurately predicting that change makes it possible to improve the quality of personalized services for the user. The challenges of predicting neighbor label distributions come mainly from four sources: the infinite state space of neighbor label distributions, link sparsity, the complexity of link formation preferences, and the stream of DHIN snapshots. To address these challenges, we propose a Latent Space Evolution Model (LSEM), which builds a Neighbor Label Distribution Matrix (NLDM) for each type of neighbor label of the given nodes. LSEM predicts the next NLDM by reconstructing it from two latent feature matrices, each estimated by its own autoregressive model. Experiments on real datasets verify the effectiveness of LSEM and the efficiency of the proposed algorithm.
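The reconstruction idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify how the latent matrices are obtained or the order of the autoregressive models. The sketch assumes a plain gradient-descent factorization of each snapshot, a first-order autoregressive model with a single scalar coefficient per latent matrix, and a final clipping and row normalization step. All function names and parameters (`row_normalize`, `factorize`, `ar1_forecast`, `predict_next_nldm`, `rank`, `lr`, `reg`) are hypothetical.

```python
import numpy as np

def row_normalize(counts):
    """Turn per-node neighbor-label counts into label distributions (rows sum to 1)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    # Rows with no neighbors stay all-zero instead of dividing by zero.
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

def factorize(M, U0, V0, steps=200, lr=0.05, reg=1e-3):
    """Approximate M by U @ V.T with gradient descent, warm-started from (U0, V0).

    Warm-starting from the previous snapshot's factors keeps the latent
    dimensions roughly aligned over time (an assumption of this sketch).
    """
    U, V = U0.copy(), V0.copy()
    for _ in range(steps):
        E = U @ V.T - M  # reconstruction error
        U, V = U - lr * (E @ V + reg * U), V - lr * (E.T @ U + reg * V)
    return U, V

def ar1_forecast(history):
    """Fit X_t ~ a * X_{t-1} with one scalar a (least squares), then forecast one step."""
    prev = np.stack(history[:-1]).ravel()
    curr = np.stack(history[1:]).ravel()
    a = (prev @ curr) / (prev @ prev)
    return a * history[-1]

def predict_next_nldm(snapshots, rank=2, seed=0):
    """Predict the next NLDM from a list of past NLDM snapshots (nodes x labels)."""
    if len(snapshots) < 2:
        raise ValueError("need at least two snapshots to fit the AR(1) models")
    rng = np.random.default_rng(seed)
    n, m = snapshots[0].shape
    U = rng.normal(scale=0.1, size=(n, rank))
    V = rng.normal(scale=0.1, size=(m, rank))
    Us, Vs = [], []
    for M in snapshots:
        U, V = factorize(M, U, V)
        Us.append(U)
        Vs.append(V)
    # Forecast each latent matrix with its own AR(1) model, then reconstruct.
    pred = ar1_forecast(Us) @ ar1_forecast(Vs).T
    # Clip negatives and renormalize so each row is again a distribution.
    return row_normalize(np.clip(pred, 0.0, None))
```

Calling `predict_next_nldm([M1, M2, M3])` on three row-normalized snapshots of the same shape returns a matrix of that shape whose rows are non-negative and sum to one (or are all zero if the clipped forecast for a node is empty). A per-entry or vector autoregressive model would be an equally plausible reading of the abstract.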

Acknowledgments

This work is supported by the National Science Foundation of China through grant 61173099, the Basic Research Program of Sichuan Province through grant 2014JY0220, and the US National Science Foundation through grants CNS-1115234, DBI-0960443, and OISE-1129076.

Author information

Corresponding authors

Correspondence to Yuchi Ma or Ning Yang.

Additional information

Yuchi Ma and Ning Yang contributed equally to this study and share first authorship.

About this article

Cite this article

Ma, Y., Yang, N., Zhang, L. et al. Predicting neighbor label distributions in dynamic heterogeneous information networks. World Wide Web 20, 1269–1291 (2017). https://doi.org/10.1007/s11280-017-0435-3

  • DOI: https://doi.org/10.1007/s11280-017-0435-3
