Semantic Stability in Wikipedia

Stanisavljevic, Darko; Hasani-Mavriqi, Ilire; Lex, Elisabeth; Strohmaier, Markus; Helic, Denis

doi:10.1007/978-3-319-50901-3_31

Conference paper
First Online: 30 November 2016

Part of the book series: Studies in Computational Intelligence (SCI, volume 693)

Abstract

In this paper we assess the semantic stability of Wikipedia by investigating the dynamics of Wikipedia articles' revisions over time. In a semantically stable system, articles are infrequently edited, whereas in unstable systems, article content changes more frequently. In other words, in a stable system, the Wikipedia community has reached consensus on the majority of articles. We measure semantic stability using the Rank Biased Overlap (RBO) method. To that end, we preprocess Wikipedia dumps to obtain a sequence of plain-text article revisions, in which each revision is represented as a TF-IDF vector. To measure the similarity between consecutive article revisions, we calculate Rank Biased Overlap on their term vectors. We evaluate our approach on 10 Wikipedia language editions: the five largest and five randomly selected small ones. Our experimental results reveal that semantic stability can be achieved even in policy-driven collaboration networks such as Wikipedia. However, the speed of the stabilization process differs between small and large editions: small editions reach stability faster and at a higher level than large ones. In particular, large editions need more successive revisions to reach a given level of semantic stability, whereas small editions need far fewer for the same level.
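The pipeline described in the abstract (plain-text revisions, TF-IDF term vectors, Rank Biased Overlap between consecutive revisions) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The tokenizer, the smoothed IDF computed over the revisions of a single article, the persistence parameter `p=0.9`, the cut-off `depth=100`, and all function names are assumptions made for illustration. The `rbo` function uses the extrapolated form of Rank Biased Overlap for two rankings cut to the same depth.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-word characters (a deliberately crude tokenizer)."""
    return re.findall(r"\w+", text.lower())


def tfidf_rankings(revisions):
    """Return, for each revision, its terms ranked by descending TF-IDF weight.

    Assumption: the revisions of one article form the corpus for the IDF part.
    """
    docs = [Counter(tokenize(r)) for r in revisions]
    n = len(docs)
    # Document frequency: number of revisions in which each term occurs.
    df = Counter(term for doc in docs for term in doc)
    rankings = []
    for doc in docs:
        weights = {
            t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0)  # smoothed IDF
            for t, tf in doc.items()
        }
        # Sort by weight, breaking ties alphabetically for determinism.
        rankings.append(sorted(weights, key=lambda t: (-weights[t], t)))
    return rankings


def rbo(ranking_a, ranking_b, p=0.9, depth=100):
    """Extrapolated RBO for two duplicate-free rankings cut to a common depth k."""
    k = min(len(ranking_a), len(ranking_b), depth)
    if k == 0:
        return 0.0
    seen_a, seen_b = set(), set()
    weighted_sum = 0.0
    agreement = 0.0
    for d in range(1, k + 1):
        seen_a.add(ranking_a[d - 1])
        seen_b.add(ranking_b[d - 1])
        # Agreement at depth d: share of the top-d items common to both lists.
        agreement = len(seen_a & seen_b) / d
        weighted_sum += agreement * p ** d
    # Assume the agreement observed at depth k persists beyond the cut-off.
    return agreement * p ** k + (1 - p) / p * weighted_sum


def stability_series(revisions, p=0.9, depth=100):
    """RBO between each pair of consecutive revisions of one article."""
    ranks = tfidf_rankings(revisions)
    return [rbo(a, b, p, depth) for a, b in zip(ranks, ranks[1:])]
```

Calling `stability_series` on the chronologically ordered revision texts of one article yields one score per consecutive pair; scores close to 1 indicate that the top-weighted terms barely changed between revisions. Values of `p` closer to 1 spread the weight over deeper ranks, while smaller values concentrate it on the first few terms.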

Author information

Authors and Affiliations

VIRTUAL VEHICLE Research Center, Inffeldgasse 21a, Graz, Austria
Darko Stanisavljevic
Graz University of Technology and Know-Center GmbH, Inffeldgasse 13, Graz, Austria
Ilire Hasani-Mavriqi
Graz University of Technology, Inffeldgasse 13, Graz, Austria
Elisabeth Lex & Denis Helic
GESIS and University of Koblenz-Landau, Unter Sachsenhausen 6-8, Cologne, Germany
Markus Strohmaier

Corresponding authors

Correspondence to Darko Stanisavljevic or Ilire Hasani-Mavriqi.

Editor information

Editors and Affiliations

University of Burgundy, Dijon, France
Hocine Cherifi
Computer Science Department, University of Milan, Milan, Italy
Sabrina Gaito
IMT Lucca, Lucca, Italy
Walter Quattrociocchi
Bell Labs-Nokia, Blanchardstown Business and Tech Park, Blanchardstown, Ireland
Alessandra Sala

About this paper

Cite this paper

Stanisavljevic, D., Hasani-Mavriqi, I., Lex, E., Strohmaier, M., Helic, D. (2017). Semantic Stability in Wikipedia. In: Cherifi, H., Gaito, S., Quattrociocchi, W., Sala, A. (eds) Complex Networks & Their Applications V. COMPLEX NETWORKS 2016. Studies in Computational Intelligence, vol 693. Springer, Cham. https://doi.org/10.1007/978-3-319-50901-3_31

DOI: https://doi.org/10.1007/978-3-319-50901-3_31
Published: 30 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50900-6
Online ISBN: 978-3-319-50901-3
eBook Packages: Engineering, Engineering (R0)
