Multi-type Relational Data Clustering for Community Detection by Exploiting Content and Structure Information in Social Networks

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11671)

Abstract

The popularity of social networks gives providers an opportunity to target specific user groups for applications such as viral marketing and customized programs. However, the volume and variety of data in a network make it challenging to identify user communities effectively. Sparseness and heterogeneity in the network structure make it difficult to group users with similar interests, while the high dimensionality and sparseness of text make it difficult to find content-focused groups. We formulate the discovery of user communities with common interests as multi-type relational data (MTRD) learning over content and structural information, and propose a novel solution based on non-negative matrix factorization with added regularization. We empirically evaluate the proposed method on real-world Twitter datasets, benchmarking it against state-of-the-art community discovery and clustering methods.
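
To make the general idea concrete, the following is a minimal sketch of a generic graph-regularized non-negative matrix factorization with multiplicative updates. It is not the authors' MTRD model, which this page does not describe in detail. It assumes a user-by-term content matrix `X`, a symmetric user-user adjacency matrix `W` for the structure, and a regularization weight `lam`; these names, the function `graph_regularized_nmf`, and the toy data are all hypothetical.

```python
import numpy as np


def graph_regularized_nmf(X, W, k, lam=0.1, n_iter=300, seed=0, eps=1e-9):
    """Factorize X (users x terms) as H @ F.T with H, F >= 0.

    Minimizes ||X - H F^T||_F^2 + lam * trace(H^T (D - W) H), where W is a
    symmetric non-negative user-user adjacency matrix and D its degree matrix.
    Illustrative only: a generic graph-regularized NMF, not the paper's model.
    """
    rng = np.random.default_rng(seed)
    n_users, n_terms = X.shape
    H = rng.random((n_users, k)) + eps  # user-to-community memberships
    F = rng.random((n_terms, k)) + eps  # term-to-community loadings
    D = np.diag(W.sum(axis=1))          # degree matrix of the user graph

    for _ in range(n_iter):
        # Multiplicative updates keep both factors non-negative.
        F *= (X.T @ H) / (F @ (H.T @ H) + eps)
        H *= (X @ F + lam * (W @ H)) / (H @ (F.T @ F) + lam * (D @ H) + eps)
    return H, F


# Toy data (assumed): two groups of three users, each group using its own
# three terms and connected only within the group.
X = np.kron(np.eye(2), np.ones((3, 3)))
W = np.kron(np.eye(2), np.ones((3, 3))) - np.eye(6)

H, F = graph_regularized_nmf(X, W, k=2)
labels = H.argmax(axis=1)  # assign each user to its strongest community
print(labels)
```

The graph term pulls connected users toward similar rows of `H`, which is one common way to let structure information compensate for sparse text; the paper's actual regularizer may differ.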

Notes

  1. https://trisma.org/

Author information

Corresponding author

Correspondence to Tennakoon Mudiyanselage Gayani Tennakoon.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Gayani Tennakoon, T.M., Luong, K., Mohotti, W., Chakravarthy, S., Nayak, R. (2019). Multi-type Relational Data Clustering for Community Detection by Exploiting Content and Structure Information in Social Networks. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science, vol 11671. Springer, Cham. https://doi.org/10.1007/978-3-030-29911-8_42

  • DOI: https://doi.org/10.1007/978-3-030-29911-8_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29910-1

  • Online ISBN: 978-3-030-29911-8

  • eBook Packages: Computer Science, Computer Science (R0)
