
On the Relation of Edit Behavior, Link Structure, and Article Quality on Wikipedia

  • Thorsten Ruprechter
  • Tiago Santos
  • Denis Helic
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 882)

Abstract

Arguments between editors occur frequently when articles are edited on Wikipedia. These conflicts occasionally lead to destructive behavior and diminish article quality. The relation between editing behavior, link structure, and article quality is currently not well understood in our community, even though understanding it could facilitate editing processes and improve article quality on Wikipedia. To shed light on this complex relation, we classify edits for 13,045 articles and perform an in-depth analysis of a 4,800-article subsample. Additionally, we build a network of wikilinks (internal Wikipedia hyperlinks) between articles. Using this data, we compute parsimonious metrics to quantify editing and linking behavior. Our analysis reveals that controversial articles differ considerably from others on almost all metrics, while slight trends are also detectable for higher-quality articles. With our work, we assist online collaboration communities, especially Wikipedia, in the long-term improvement of article quality by identifying deviant behavior via simple sequence-based edit metrics and network-based article metrics.

Keywords

Wikipedia, Edit behavior, Link structure, Article quality, Edit wars, Semantic edit types, Multi-label classification, Network analysis

Notes

Acknowledgments

Tiago Santos is a recipient of a DOC Fellowship of the Austrian Academy of Sciences at the Institute of Interactive Systems and Data Science of the Graz University of Technology.


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Thorsten Ruprechter (1)
  • Tiago Santos (1)
  • Denis Helic (1)
  1. Graz University of Technology, Graz, Austria
