Machine Learning, Volume 81, Issue 2, pp 207–225

Graph regularization methods for Web spam detection

  • Jacob Abernethy
  • Olivier Chapelle
  • Carlos Castillo

DOI: 10.1007/s10994-010-5171-1

Cite this article as:
Abernethy, J., Chapelle, O. & Castillo, C. Mach Learn (2010) 81: 207. doi:10.1007/s10994-010-5171-1

Abstract

We present an algorithm, WITCH, that learns to detect spam hosts or pages on the Web. Unlike most other approaches, it simultaneously exploits the structure of the Web graph as well as page contents and features. The method is efficient, scalable, and provides state-of-the-art accuracy on a standard Web spam benchmark.

Keywords

Adversarial information retrieval, Spam detection, Web spam, Graph regularization

Copyright information

© The Author(s) 2010

Authors and Affiliations

  • Jacob Abernethy (1)
  • Olivier Chapelle (2)
  • Carlos Castillo (3)

  1. University of California, Berkeley, USA
  2. Yahoo! Research, Santa Clara, USA
  3. Yahoo! Research, Barcelona, Spain