Abstract
The popularity of social networks gives providers an opportunity to target specific user groups for applications such as viral marketing and customized programs. However, the volume and variety of data in a network make it challenging to identify user communities effectively. Sparseness and heterogeneity in the network make it difficult to group users with similar interests, while the high dimensionality and sparseness of text make it difficult to find content-focused groups. We formulate the discovery of user communities with common interests as a multi-type relational data (MTRD) learning problem that uses both content and structural information, and propose a novel solution based on non-negative matrix factorization with added regularization. We empirically evaluate the proposed method on real-world Twitter datasets, benchmarking it against state-of-the-art community discovery and clustering methods.
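To make the general idea of non-negative matrix factorization with added regularization concrete, the following is a minimal sketch of a generic graph-regularized NMF with standard multiplicative updates. It is not the method proposed in the paper, whose exact objective and regularizers are not given in the abstract. The names `X` (a user-by-term content matrix), `W` (a user-to-user interaction matrix), `lam`, and `graph_regularized_nmf` are assumptions made for illustration, and the toy data is invented.

```python
import numpy as np


def graph_regularized_nmf(X, W, k, lam=0.5, n_iter=300, eps=1e-9, seed=0):
    """Approximate X (users x terms) by G @ H with G, H >= 0.

    Illustrative objective (an assumption, not the paper's):
        ||X - G H||_F^2 + lam * Tr(G^T L G),  with L = D - W,
    where the trace term pulls linked users toward similar memberships.
    """
    rng = np.random.default_rng(seed)
    n_users, n_terms = X.shape
    # Strictly positive random start so the multiplicative updates can move.
    G = rng.random((n_users, k)) + 0.1
    H = rng.random((k, n_terms)) + 0.1
    D = np.diag(W.sum(axis=1))  # degree matrix of the user graph
    for _ in range(n_iter):
        # Each update multiplies by (negative gradient part) / (positive part),
        # which keeps every entry non-negative.
        H *= (G.T @ X) / (G.T @ G @ H + eps)
        G *= (X @ H.T + lam * (W @ G)) / (G @ (H @ H.T) + lam * (D @ G) + eps)
    return G, H


# Hypothetical toy data: users 0-2 use terms 0-1, users 3-5 use terms 2-3.
X = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [2, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
    [0, 0, 2, 2],
], dtype=float)

# Hypothetical structure: users interact only within their own group.
W = np.zeros((6, 6))
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

G, H = graph_regularized_nmf(X, W, k=2)
labels = G.argmax(axis=1)  # community = strongest membership column
print(labels)
```

Each row of `G` acts as a soft community membership for one user, and taking the largest entry per row gives a hard assignment. The weight `lam` trades off fitting the content matrix against agreeing with the interaction graph; setting it to zero reduces the sketch to plain NMF on content alone.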
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Gayani Tennakoon, T.M., Luong, K., Mohotti, W., Chakravarthy, S., Nayak, R. (2019). Multi-type Relational Data Clustering for Community Detection by Exploiting Content and Structure Information in Social Networks. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science, vol 11671. Springer, Cham. https://doi.org/10.1007/978-3-030-29911-8_42
DOI: https://doi.org/10.1007/978-3-030-29911-8_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29910-1
Online ISBN: 978-3-030-29911-8
eBook Packages: Computer Science (R0)