Multi-type Relational Data Clustering for Community Detection by Exploiting Content and Structure Information in Social Networks

Gayani Tennakoon, Tennakoon Mudiyanselage; Luong, Khanh; Mohotti, Wathsala; Chakravarthy, Sharma; Nayak, Richi

doi:10.1007/978-3-030-29911-8_42

Conference paper
First Online: 23 August 2019

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11671)

Abstract

The popularity of social networks has given providers an opportunity to target specific user groups for applications such as viral marketing and customized programs. However, the volume and variety of data in a network make it challenging to identify user communities effectively. Sparseness and heterogeneity in the network make it difficult to group users with similar interests, whereas high dimensionality and sparseness in the text make it difficult to find content-focused groups. We cast the problem of discovering user communities with common interests as multi-type relational data (MTRD) learning over content and structural information, and propose a novel solution based on non-negative matrix factorization with added regularization. We empirically evaluate the effectiveness of the proposed method on real-world Twitter datasets, benchmarking it against state-of-the-art community discovery and clustering methods.

Notes

1. https://trisma.org/

Author information

Authors and Affiliations

Queensland University of Technology, 2 George Street, Brisbane, Australia
Tennakoon Mudiyanselage Gayani Tennakoon, Khanh Luong, Wathsala Mohotti & Richi Nayak
University of Texas at Arlington, 701 S Nedderman Dr, Arlington, TX, 76019, USA
Sharma Chakravarthy

Corresponding author

Correspondence to Tennakoon Mudiyanselage Gayani Tennakoon.

Editor information

Editors and Affiliations

Department of Computing, Macquarie University, Sydney, NSW, Australia
Abhaya C. Nayak
RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
Alok Sharma

About this paper

Cite this paper

Gayani Tennakoon, T.M., Luong, K., Mohotti, W., Chakravarthy, S., Nayak, R. (2019). Multi-type Relational Data Clustering for Community Detection by Exploiting Content and Structure Information in Social Networks. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science, vol 11671. Springer, Cham. https://doi.org/10.1007/978-3-030-29911-8_42

DOI: https://doi.org/10.1007/978-3-030-29911-8_42
Published: 23 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29910-1
Online ISBN: 978-3-030-29911-8
eBook Packages: Computer Science, Computer Science (R0)
