Abstract
In Dynamic Heterogeneous Information Networks (DHINs), predicting neighbor label distributions is important for a variety of applications. For example, when a user changes jobs, the composition of the user’s friends can change, and hence the profession distribution of the user’s social circle may change. If we can accurately predict this change, we can improve the quality of the personalized services offered to that user. The challenges of predicting neighbor label distributions come mainly from four aspects: the infinite state space of neighbor label distributions, link sparsity, the complexity of link formation preferences, and the stream of DHIN snapshots. To address these challenges, we propose a Latent Space Evolution Model (LSEM) for predicting neighbor label distributions, which builds a Neighbor Label Distribution Matrix (NLDM) for each type of neighbor label of the given nodes. LSEM predicts the next NLDM by reconstructing it from two latent feature matrices, each estimated by its own autoregressive model. Experiments conducted on real datasets verify the effectiveness of LSEM and the efficiency of the proposed algorithm.
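The idea of reconstructing the next NLDM from two evolving latent feature matrices can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the factorization or the autoregressive order, so the code below assumes a regularized alternating-least-squares factorization (warm-started from the previous snapshot so that factors stay comparable over time) and a single scalar AR(1) coefficient per latent matrix. All function names, the rank, and the toy data are hypothetical.

```python
import numpy as np


def factorize(M, U0, V0, steps=50, reg=1e-3):
    """Regularized alternating least squares for M ≈ U @ V.T (assumed method)."""
    U, V = U0.copy(), V0.copy()
    ridge = reg * np.eye(U.shape[1])
    for _ in range(steps):
        # Closed-form ridge update for each factor with the other held fixed.
        U = M @ V @ np.linalg.inv(V.T @ V + ridge)
        V = M.T @ U @ np.linalg.inv(U.T @ U + ridge)
    return U, V


def ar1_coefficient(series):
    """Least-squares scalar a such that series[t+1] ≈ a * series[t]."""
    num = sum(np.sum(nxt * cur) for cur, nxt in zip(series[:-1], series[1:]))
    den = sum(np.sum(cur * cur) for cur in series[:-1])
    return num / den


def predict_next_nldm(snapshots, rank=2, seed=0):
    """Predict the next NLDM from a list of (nodes x labels) snapshots."""
    rng = np.random.default_rng(seed)
    n, k = snapshots[0].shape
    U, V = rng.random((n, rank)), rng.random((k, rank))
    Us, Vs = [], []
    for M in snapshots:
        # Warm start from the previous factors to keep them aligned in time.
        U, V = factorize(M, U, V)
        Us.append(U)
        Vs.append(V)
    # One AR(1) model per latent feature matrix (a simplifying assumption).
    a, b = ar1_coefficient(Us), ar1_coefficient(Vs)
    M_hat = (a * Us[-1]) @ (b * Vs[-1]).T
    # Project back onto valid distributions: positive rows that sum to one.
    M_hat = np.clip(M_hat, 1e-12, None)
    return M_hat / M_hat.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    # Toy data: 3 nodes, 2 neighbor labels, 3 snapshots (made-up numbers).
    toy = [
        np.array([[0.8, 0.2], [0.5, 0.5], [0.1, 0.9]]),
        np.array([[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]),
        np.array([[0.6, 0.4], [0.5, 0.5], [0.3, 0.7]]),
    ]
    print(predict_next_nldm(toy))
```

Each row of the returned matrix is a predicted label distribution for one node. The final clipping and row normalization are a convenience of this sketch, used so that the reconstruction is always a valid distribution.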
Acknowledgments
This work is supported by the National Science Foundation of China through grant 61173099, the Basic Research Program of Sichuan Province through grant 2014JY0220, and the US National Science Foundation through grants CNS-1115234, DBI-0960443, and OISE-1129076.
Additional information
Yuchi Ma and Ning Yang contributed equally to this study and share first authorship.
About this article
Cite this article
Ma, Y., Yang, N., Zhang, L. et al. Predicting neighbor label distributions in dynamic heterogeneous information networks. World Wide Web 20, 1269–1291 (2017). https://doi.org/10.1007/s11280-017-0435-3
DOI: https://doi.org/10.1007/s11280-017-0435-3