Measuring Performance when Positives Are Rare: Relative Advantage versus Predictive Accuracy — A Biological Case-Study

  • Stephen H. Muggleton
  • Christopher H. Bryant
  • Ashwin Srinivasan
Conference paper

DOI: 10.1007/3-540-45164-1_32

Part of the Lecture Notes in Computer Science book series (LNCS, volume 1810)
Cite this paper as:
Muggleton S.H., Bryant C.H., Srinivasan A. (2000) Measuring Performance when Positives Are Rare: Relative Advantage versus Predictive Accuracy — A Biological Case-Study. In: López de Mántaras R., Plaza E. (eds) Machine Learning: ECML 2000. ECML 2000. Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence), vol 1810. Springer, Berlin, Heidelberg

Abstract

This paper presents a new method of measuring performance when positives are rare and investigates whether Chomsky-like grammar representations are useful for learning accurate comprehensible predictors of members of biological sequence families. The positive-only learning framework of the Inductive Logic Programming (ILP) system CProgol is used to generate a grammar for recognising a class of proteins known as human neuropeptide precursors (NPPs). Performance is measured using both predictive accuracy and a new cost function, Relative Advantage (RA). The RA results show that searching for NPPs by using our best NPP predictor as a filter is more than 100 times more efficient than randomly selecting proteins for synthesis and testing them for biological activity. Predictive accuracy is not a good measure of performance for this domain because it does not discriminate well between NPP recognition models: despite covering varying numbers of (the rare) positives, all the models are awarded a similar (high) score by predictive accuracy because they all exclude most of the abundant negatives.
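The abstract's last point, that predictive accuracy barely separates models when negatives vastly outnumber positives, can be checked with a little arithmetic. The sketch below is only an illustration: the class sizes and the three models' counts are hypothetical, not figures from the paper, and it does not implement Relative Advantage, whose definition is given in the paper itself rather than in this abstract.

```python
# Illustration only: all counts below are hypothetical, not data from the paper.

def accuracy(tp: int, fp: int, n_pos: int, n_neg: int) -> float:
    """Fraction of all examples classified correctly."""
    tn = n_neg - fp  # negatives correctly rejected
    return (tp + tn) / (n_pos + n_neg)

def recall(tp: int, n_pos: int) -> float:
    """Fraction of the (rare) positives that the model covers."""
    return tp / n_pos

# Assumed class sizes: positives are rare, negatives abundant.
N_POS = 50
N_NEG = 100_000

# Three hypothetical models: (true positives, false positives).
# They cover very different numbers of positives but exclude
# the same large majority of negatives.
models = {"A": (5, 200), "B": (25, 200), "C": (45, 200)}

results = {
    name: (accuracy(tp, fp, N_POS, N_NEG), recall(tp, N_POS))
    for name, (tp, fp) in models.items()
}

for name, (acc, rec) in results.items():
    print(f"model {name}: accuracy={acc:.5f}  recall={rec:.2f}")
```

Under these assumed counts, all three accuracies round to about 0.998, while coverage of the positives ranges from 10% to 90%. This is the behaviour the abstract describes: the score is dominated by the correctly excluded negatives.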

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Stephen H. Muggleton (1)
  • Christopher H. Bryant (1)
  • Ashwin Srinivasan (2)

  1. Computer Science, University of York, York, UK
  2. Computing Laboratory, Oxford University, Oxford, UK
