Improving SVM Text Classification Performance through Threshold Adjustment

  • James G. Shanahan
  • Norbert Roma
Conference paper

DOI: 10.1007/978-3-540-39857-8_33

Part of the Lecture Notes in Computer Science book series (LNCS, volume 2837)
Cite this paper as:
Shanahan J.G., Roma N. (2003) Improving SVM Text Classification Performance through Threshold Adjustment. In: Lavrač N., Gamberger D., Blockeel H., Todorovski L. (eds) Machine Learning: ECML 2003. ECML 2003. Lecture Notes in Computer Science, vol 2837. Springer, Berlin, Heidelberg

Abstract

In general, support vector machines (SVMs), when applied to text classification, provide excellent precision but poor recall. One means of customizing SVMs to improve recall is to adjust the threshold associated with an SVM. We describe an automatic process for adjusting the thresholds of generic SVMs which incorporates a user utility model, an integral part of an information management system. By using thresholds based on utility models and the ranking properties of classifiers, it is possible to overcome the precision bias of SVMs and ensure robust performance in recall across a wide variety of topics, even when training data are sparse. Evaluations on TREC data show that our proposed threshold-adjusting algorithm boosts the performance of baseline SVMs by at least 20% for standard information retrieval measures.
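The abstract does not spell out the adjustment procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a linear utility that rewards each accepted relevant document and penalizes each accepted non-relevant one. The weights `gain_tp` and `cost_fp`, the function names, and the toy scores are all hypothetical. Given SVM decision scores on held-out labeled documents, it walks the ranking from the highest score down and keeps the cutoff with the best cumulative utility.

```python
from typing import Sequence, Tuple


def pick_threshold(scores: Sequence[float], labels: Sequence[int],
                   gain_tp: float = 2.0, cost_fp: float = 1.0) -> Tuple[float, float]:
    """Return (threshold, utility) maximizing gain_tp * TP - cost_fp * FP,
    where a document is accepted when its score >= threshold.
    The utility weights are illustrative assumptions."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    best_threshold = float("inf")  # accepting nothing has utility 0
    best_utility = 0.0
    utility = 0.0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Documents with tied scores are accepted or rejected together.
        while i < len(ranked) and ranked[i][0] == score:
            utility += gain_tp if ranked[i][1] == 1 else -cost_fp
            i += 1
        if utility > best_utility:
            best_utility = utility
            best_threshold = score
    return best_threshold, best_utility


def recall_at(scores: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """Fraction of relevant documents whose score reaches the threshold."""
    positives = sum(1 for y in labels if y == 1)
    hits = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    return hits / positives if positives else 0.0


# Toy held-out data (hypothetical): signed distances from the SVM hyperplane.
scores = [1.3, 0.4, -0.2, -0.6, -1.1, -1.5]
labels = [1, 0, 1, 1, 0, 0]

threshold, utility = pick_threshold(scores, labels)
print(threshold, utility)
print(recall_at(scores, labels, 0.0), recall_at(scores, labels, threshold))
```

On this toy data the default threshold of 0 accepts only one of the three relevant documents. The utility-maximizing cutoff moves below zero and recovers all three at the cost of one false positive, which is the kind of precision-for-recall trade the abstract describes.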

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • James G. Shanahan
    • 1
  • Norbert Roma
    • 1
  1. Clairvoyance Corporation, Pittsburgh, USA
