RADAR: Rare Category Detection via Computation of Boundary Degree

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6635)

Abstract

Rare category detection is an open challenge in active learning: it selects candidate anomalies and queries human experts for their class labels. Unlike traditional anomaly detection, it does not look for individual, isolated instances. Instead, it selects interesting and useful anomalies from small compact clusters. The goal is to find at least one representative data point from each rare class with as few label queries as possible. Previous work falls into three major groups: model-based, density-based, and clustering-based methods. The performance of these approaches is affected by the local densities of the rare classes. In this paper, we develop a density-insensitive method for rare category detection called RADAR. It uses reverse k-nearest neighbors to measure the boundary degree of each data point, and then selects examples with high boundary degree for class-label querying. Experimental results on both synthetic and real-world data sets demonstrate the effectiveness of our algorithm.
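
The reverse k-nearest-neighbor idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how RADAR turns reverse k-nearest-neighbor information into a boundary degree, so the code below only computes the underlying quantity: for each point, how many other points list it among their own k nearest neighbors. The function names `knn_indices` and `reverse_knn_counts`, the Euclidean distance, and the brute-force search are assumptions made for illustration, not the authors' implementation.

```python
import math


def knn_indices(points, k):
    """For each point, return the indices of its k nearest other points.

    Brute-force Euclidean search; ties are broken by the lower index.
    """
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def reverse_knn_counts(points, k):
    """Count, for each point, how many points have it among their k nearest.

    This is the size of each point's reverse k-nearest-neighbor set.
    It is NOT the paper's boundary degree, only the raw ingredient.
    """
    counts = [0] * len(points)
    for nbrs in knn_indices(points, k):
        for j in nbrs:
            counts[j] += 1
    return counts


# Hypothetical 1-D data: three nearby points and one far away.
sample = [(0.0,), (1.0,), (2.5,), (10.0,)]
print(reverse_knn_counts(sample, k=1))
```

In this toy data the far-away point is nobody's nearest neighbor, so its reverse count is zero, while the point in the middle of the group is chosen most often. How such counts are combined into a score and ranked for querying is defined in the paper itself and is not reproduced here.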

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Huang, H., He, Q., He, J., Ma, L. (2011). RADAR: Rare Category Detection via Computation of Boundary Degree. In: Huang, J.Z., Cao, L., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 6635. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20847-8_22

  • DOI: https://doi.org/10.1007/978-3-642-20847-8_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20846-1

  • Online ISBN: 978-3-642-20847-8

  • eBook Packages: Computer Science, Computer Science (R0)
