World Wide Web, Volume 16, Issue 5–6, pp 645–675

A likelihood-based framework for the analysis of discussion threads

  • Vicenç Gómez
  • Hilbert J. Kappen
  • Nelly Litvak
  • Andreas Kaltenbrunner
Open Access
Article

Abstract

Online discussion threads are conversational cascades in the form of posted messages, commonly found in social systems that support many-to-many interaction, such as blogs, news aggregators or bulletin board systems. We propose a framework based on generative models of growing trees to analyse the structure and evolution of discussion threads. We consider the growth of a discussion to be determined by an interplay between popularity, novelty and a trend (or bias) to reply to the thread originator. The relevance of these features is estimated using a full likelihood approach, which allows us to characterise the habits and communication patterns of a given platform and/or community. We apply the proposed framework to four popular websites: Slashdot, Barrapunto (a Spanish version of Slashdot), Meneame (a Spanish Digg-clone) and the article discussion pages of the English Wikipedia. Our results provide significant insight into how discussion cascades grow and have potential applications in broader contexts such as community management or the design of communication platforms.
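
To make the idea of a generative growing-tree model with a full likelihood more concrete, the following is a minimal sketch only. The abstract does not state the functional form of the model, so everything specific here is an assumption made for illustration: the additive weight `alpha * replies + beta (root only) + tau ** age`, the parameter names `alpha`, `beta` and `tau`, the use of the number of direct replies as the popularity measure, and the helper names `attachment_weights`, `grow_thread` and `log_likelihood`.

```python
import math
import random


def attachment_weights(replies, alpha, beta, tau):
    """Unnormalised attractiveness of each existing comment for the next reply.

    Assumed form (hypothetical, not taken from the abstract):
      popularity: alpha * number of direct replies received so far
      root bias:  beta, added to node 0 (the thread originator) only
      novelty:    tau ** age with 0 < tau <= 1, so older comments attract less
    """
    t = len(replies)  # number of comments already in the thread
    weights = []
    for k, r in enumerate(replies):
        w = alpha * r + tau ** (t - k)
        if k == 0:
            w += beta
        weights.append(w)
    return weights


def grow_thread(n_nodes, alpha, beta, tau, rng):
    """Sample one thread as a list of parent indices; node 0 is the root."""
    parents = [None]
    replies = [0]
    for _ in range(1, n_nodes):
        w = attachment_weights(replies, alpha, beta, tau)
        parent = rng.choices(range(len(replies)), weights=w)[0]
        parents.append(parent)
        replies[parent] += 1
        replies.append(0)  # the new comment has no replies yet
    return parents


def log_likelihood(parents, alpha, beta, tau):
    """Log-probability of the observed sequence of parent choices."""
    replies = [0]
    ll = 0.0
    for parent in parents[1:]:
        w = attachment_weights(replies, alpha, beta, tau)
        ll += math.log(w[parent] / sum(w))
        replies[parent] += 1
        replies.append(0)
    return ll


if __name__ == "__main__":
    rng = random.Random(0)
    thread = grow_thread(20, alpha=0.5, beta=2.0, tau=0.9, rng=rng)
    print(thread)
    print(log_likelihood(thread, alpha=0.5, beta=2.0, tau=0.9))
```

Under these assumptions, fitting a platform would amount to maximising the summed `log_likelihood` over its observed threads with respect to the three parameters, for example with a generic numerical optimiser; the fitted values could then be compared across communities.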

Keywords

discussion threads, online conversations, information cascades, preferential attachment, novelty, maximum likelihood, Slashdot, Wikipedia

Copyright information

© The Author(s) 2012

Authors and Affiliations

  • Vicenç Gómez (1)
  • Hilbert J. Kappen (1)
  • Nelly Litvak (2)
  • Andreas Kaltenbrunner (3)

  1. Donders Institute for Brain Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, The Netherlands
  2. Faculty of Electrical Engineering, Mathematics and Computer Science, Department of Applied Mathematics, University of Twente, Enschede, The Netherlands
  3. Social Media Research Group, Barcelona Media, Barcelona, Spain
