Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and Reputation Features

  • B. Thomas Adler
  • Luca de Alfaro
  • Santiago M. Mola-Velasco
  • Paolo Rosso
  • Andrew G. West
Conference paper

DOI: 10.1007/978-3-642-19437-5_23

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6609)
Cite this paper as:
Adler B.T., de Alfaro L., Mola-Velasco S.M., Rosso P., West A.G. (2011) Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and Reputation Features. In: Gelbukh A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6609. Springer, Berlin, Heidelberg

Abstract

Wikipedia is an online encyclopedia that anyone can edit. While most edits are constructive, about 7% are acts of vandalism. Such behavior is characterized by modifications made in bad faith, such as introducing spam and other inappropriate content.

In this work, we present the results of an effort to integrate three of the leading approaches to Wikipedia vandalism detection: a spatio-temporal analysis of metadata (STiki), a reputation-based system (WikiTrust), and natural language processing features. The resulting joint system improves on the state of the art set by all previous methods and establishes a new baseline for Wikipedia vandalism detection. We examine in detail the contribution of each of the three approaches, both for the task of discovering fresh vandalism and for the task of locating vandalism in the complete set of Wikipedia revisions.
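The integration described above can be pictured as merging the features of each approach into one joint feature vector per edit, which is then handed to a single classifier. The following is a minimal sketch under stated assumptions, not the authors' implementation: the edit fields, the feature names, and the three extractor functions are all hypothetical, chosen only to show one feature family per approach. The abstract does not say which features or which classifier the paper uses.

```python
from typing import Callable, Dict, Tuple

Edit = Dict[str, object]
Features = Dict[str, float]


def metadata_features(edit: Edit) -> Features:
    # Hypothetical metadata signals (the STiki-style family).
    return {
        "meta_is_anonymous": 1.0 if edit["anonymous"] else 0.0,
        "meta_comment_length": float(len(str(edit["comment"]))),
    }


def reputation_features(edit: Edit) -> Features:
    # Hypothetical reputation signal (the WikiTrust-style family).
    return {"rep_author_reputation": float(edit["author_reputation"])}


def language_features(edit: Edit) -> Features:
    # Hypothetical natural language signals computed on the inserted text.
    text = str(edit["inserted_text"])
    letters = [c for c in text if c.isalpha()]
    upper = sum(1 for c in letters if c.isupper())
    return {
        "nlp_upper_ratio": upper / len(letters) if letters else 0.0,
        "nlp_inserted_length": float(len(text)),
    }


# The three feature families, applied in a fixed order.
EXTRACTORS: Tuple[Callable[[Edit], Features], ...] = (
    metadata_features,
    reputation_features,
    language_features,
)


def joint_features(edit: Edit) -> Features:
    """Merge the features of all three families into one vector."""
    combined: Features = {}
    for extractor in EXTRACTORS:
        combined.update(extractor(edit))
    return combined


if __name__ == "__main__":
    example_edit = {
        "anonymous": True,
        "comment": "",
        "author_reputation": 0.0,
        "inserted_text": "LOL!!!",
    }
    print(joint_features(example_edit))
```

Prefixing each feature name with its family (`meta_`, `rep_`, `nlp_`) keeps the merged dictionary free of key collisions and makes it straightforward to drop one family at a time when examining the contribution of each approach.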

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • B. Thomas Adler (1)
  • Luca de Alfaro (2)
  • Santiago M. Mola-Velasco (3)
  • Paolo Rosso (3)
  • Andrew G. West (4)
  1. University of California, Santa Cruz, USA
  2. Google and UC Santa Cruz, USA
  3. NLE Lab. - ELiRF - DSIC, Universidad Politécnica de Valencia, Spain
  4. University of Pennsylvania, Philadelphia, USA
