Modeling a Web Forum Ecosystem into an Enriched Social Graph

  • Tarique Anwar
  • Muhammad Abulaish
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8329)


This paper treats community interactions in online social media (OSM) as an OSM ecosystem and addresses the problem of modeling a Web forum ecosystem as a social graph. We propose a text mining method that models the cross-thread interactions and interests of users in a Web forum ecosystem to generate an enriched social graph. In addition to reply-to relationships between users, the proposed method models a message-similarity relationship to keep track of all similar posts that result from discussions drifting across different threads. Although the method treats the reply-to relation as the primary means of linkage, it establishes links between clusters of similar posts rather than between individual users; linkages between users can then be derived from the linkages between clusters. The method first links the posts within each thread by identifying reply-to relationships, and then applies an agglomerative clustering algorithm, based on the similarity of posts across the forum, to group all posts into clusters. Finally, the relation between each pair of individual posts is mapped to a link between the clusters containing those posts. The resulting social graph is a network of clusters that can also be presented at the granularity of the users who authored the posts, yielding a social network of forum users, while retaining information about all other users with similar interests.
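
The three steps described in the abstract (reply-to linking, agglomerative clustering of similar posts, and mapping post-level relations to cluster-level links) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy posts, the token-set Jaccard similarity, the average-linkage merge rule, and the 0.5 threshold are stand-ins, since the abstract does not specify the similarity measure or the clustering parameters.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy forum: (post_id, author, thread, reply_to, text).
# Replies in each thread drift to a topic shared with the other thread.
POSTS = [
    (1, "ann", "t1", None, "best python web framework"),
    (2, "bob", "t1", 1, "gpu drivers crash on linux"),
    (3, "cat", "t2", None, "python web framework choice"),
    (4, "dan", "t2", 3, "linux gpu drivers crash again"),
]


def jaccard(a, b):
    """Token-set Jaccard similarity (an assumed stand-in measure)."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def agglomerate(posts, threshold):
    """Naive average-linkage agglomerative clustering of posts by text."""
    text = {p[0]: p[4] for p in posts}
    clusters = [[p[0]] for p in posts]  # start with one post per cluster
    while len(clusters) > 1:
        best, pair = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            sims = [jaccard(text[x], text[y])
                    for x in clusters[i] for y in clusters[j]]
            avg = sum(sims) / len(sims)
            if avg > best:
                best, pair = avg, (i, j)
        if best < threshold:  # no remaining pair is similar enough
            break
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def cluster_graph(posts, clusters):
    """Map each reply-to relation between posts to a link between clusters."""
    cluster_of = {pid: idx
                  for idx, members in enumerate(clusters)
                  for pid in members}
    edges = Counter()
    for pid, _author, _thread, reply_to, _text in posts:
        if reply_to is not None:
            edges[(cluster_of[pid], cluster_of[reply_to])] += 1
    return edges


def user_graph(posts):
    """Derive user-level reply-to links from the same post relations."""
    author = {p[0]: p[1] for p in posts}
    return Counter((p[1], author[p[3]]) for p in posts if p[3] is not None)


clusters = agglomerate(POSTS, threshold=0.5)
print(clusters)
print(dict(cluster_graph(POSTS, clusters)))
print(dict(user_graph(POSTS)))
```

In this toy run the two replies end up in one cluster and the two opening posts in another, so both reply-to relations collapse into a single weighted cluster-to-cluster link. The user-level view still shows only the direct reply-to pairs, while the cluster membership records that users in different threads posted on the same topic.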


Keywords: Social media mining; Web forum ecosystem; Social graph generation; Agglomerative clustering





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Tarique Anwar (1)
  • Muhammad Abulaish (2)
  1. Centre for Computing and Engineering Software Systems, Swinburne University of Technology, Melbourne, Australia
  2. Department of Computer Science, Jamia Millia Islamia (A Central University), New Delhi, India
