Detecting topics and overlapping communities in question and answer sites

  • Zide Meng
  • Fabien Gandon
  • Catherine Faron-Zucker
  • Ge Song
Original Article


In many social networks, people interact based on their interests. Community detection algorithms are therefore useful for revealing the sub-structures of a network, in particular interest groups. Identifying these user communities and the interests that bind them can help us support them throughout their life-cycle. Certain kinds of online communities, such as question-and-answer (Q&A) sites, have no explicit social network structure, so many traditional community detection techniques do not apply directly. In this paper, we propose an efficient approach for extracting topics from Q&A sites to detect communities of interest. We then compare three detection methods that we applied to a dataset extracted from the popular Q&A site StackOverflow. Our method, based on topic modeling and user membership assignment, is shown to be much simpler and faster while preserving the quality of the detection.
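To make the idea of topic-based membership assignment concrete, the following is a minimal Python sketch and not the implementation evaluated in the paper. It assumes that a topic model has already produced a tag-to-topic probability table; the table `TAG_TOPICS`, the sample `USERS`, the function names, and the `threshold` parameter are all hypothetical and chosen only for illustration. Each user's topic distribution is a count-weighted mix of the topic distributions of the tags they used, and a user joins every topic community whose share reaches the threshold, which is what allows communities to overlap.

```python
from collections import defaultdict

# Hypothetical output of a topic model: P(topic | tag) for a few tags.
TAG_TOPICS = {
    "python": {"scripting": 1.0},
    "django": {"scripting": 0.5, "web": 0.5},
    "html": {"web": 1.0},
    "css": {"web": 1.0},
    "sql": {"databases": 1.0},
}

# Hypothetical activity data: how often each user posted under each tag.
USERS = {
    "alice": {"python": 3, "django": 2},
    "bob": {"html": 2, "css": 2},
    "carol": {"django": 2, "sql": 1},
}


def user_topic_distribution(tag_counts, tag_topics=TAG_TOPICS):
    """Mix the topic distributions of a user's tags, weighted by tag counts."""
    weights = defaultdict(float)
    for tag, count in tag_counts.items():
        # Tags unknown to the topic table contribute nothing.
        for topic, prob in tag_topics.get(tag, {}).items():
            weights[topic] += count * prob
    total = sum(weights.values())
    if total == 0:
        return {}
    return {topic: weight / total for topic, weight in weights.items()}


def detect_communities(users, threshold=0.25):
    """Put each user in every topic community whose share reaches the threshold."""
    communities = defaultdict(set)
    for user, tag_counts in users.items():
        for topic, share in user_topic_distribution(tag_counts).items():
            if share >= threshold:
                communities[topic].add(user)
    return dict(communities)


if __name__ == "__main__":
    for topic, members in sorted(detect_communities(USERS).items()):
        print(topic, sorted(members))
```

In this toy data, a user whose activity is spread evenly over several topics ends up in several communities, while a user with a dominant interest ends up in only one. No graph of user-to-user links is needed, which is the property that matters for Q&A sites without an explicit social network.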


Overlapping community detection · Question–answer sites · Topic modeling



The authors would like to thank StackOverflow for sharing their data. We also sincerely thank the volunteers who helped us label the dataset for evaluation. This research was supported by the ANR ocktopus project (grant ANR-12-CORD-0026). We also appreciate the very helpful advice from the anonymous reviewers.



Copyright information

© Springer-Verlag Wien 2015

Authors and Affiliations

  • Zide Meng (1)
  • Fabien Gandon (1)
  • Catherine Faron-Zucker (2)
  • Ge Song (1, 3)

  1. INRIA Sophia Antipolis Méditerranée, Sophia Antipolis, France
  2. University of Nice Sophia Antipolis, CNRS, I3S, UMR 7271, Sophia Antipolis, France
  3. Ecole Centrale Paris, Châtenay-Malabry, France
