Tracking Concept Drift at Feature Selection Stage in SpamHunting: An Anti-spam Instance-Based Reasoning System

  • J. R. Méndez
  • F. Fdez-Riverola
  • E. L. Iglesias
  • F. Díaz
  • J. M. Corchado
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4106)

Abstract

In this paper we propose a novel feature selection method able to handle concept drift problems in the spam filtering domain. The proposed technique is applied to SpamHunting, a previously successful instance-based reasoning e-mail filtering system. Our achieved information criterion is based on several ideas drawn from the well-known information measure introduced by Shannon. We show that our previous system, combined with the improved feature selection method, outperforms classical machine learning techniques and other well-known lazy learning approaches. To evaluate the performance of all the analysed models, we employ two different corpora and six well-known metrics in various scenarios.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • J. R. Méndez (1)
  • F. Fdez-Riverola (1)
  • E. L. Iglesias (1)
  • F. Díaz (2)
  • J. M. Corchado (3)
  1. Dept. Informática, University of Vigo, Escuela Superior de Ingeniería Informática, Edificio Politécnico, Ourense, Spain
  2. Dept. Informática, University of Valladolid, Escuela Universitaria de Informática, Segovia, Spain
  3. Dept. Informática y Automática, University of Salamanca, Salamanca, Spain
