Class-Based Outlier Detection: Staying Zombies or Awaiting for Resurrection?

  • Conference paper
Advances in Intelligent Data Analysis XIV (IDA 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9385)

Abstract

This paper addresses the task of finding outliers within each class in the context of supervised classification problems. Class-based outliers are cases that deviate too much from the other cases of the same class. We introduce a novel method for outlier detection in labelled data based on Random Forests and compare it with existing methods on both artificial and real-world data. We show that it is competitive with the existing methods and sometimes gives more intuitive results. We also provide an overview of outlier detection in labelled data. The main contributions are two methods for class-based outlier description and interpretation.
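To make the notion of a class-based outlier concrete, the following is a minimal sketch of one well-known proximity-based score, Breiman's outlier measure for Random Forests, computed separately within each class. It is not the method proposed in the paper, which is not described in this abstract. The function name `class_outlier_scores`, the toy proximity matrix and the choice to exclude a case's proximity to itself are assumptions made for illustration; in practice the proximities would come from a trained forest (the fraction of trees in which two cases fall into the same leaf).

```python
from statistics import median

def class_outlier_scores(prox, labels):
    """Within-class outlier score from a symmetric proximity matrix.

    Illustrative sketch of Breiman's proximity-based measure; the name and
    the handling of edge cases are assumptions, not the paper's method.
    """
    n = len(labels)
    raw = []
    for i in range(n):
        # Sum of squared proximities to the other cases of the same class.
        total = sum(prox[i][k] ** 2
                    for k in range(n)
                    if k != i and labels[k] == labels[i])
        # Low within-class proximity gives a large raw score.
        raw.append(n / max(total, 1e-12))
    scores = [0.0] * n
    for c in set(labels):
        idx = [i for i in range(n) if labels[i] == c]
        med = median(raw[i] for i in idx)
        mad = median(abs(raw[i] - med) for i in idx)
        for i in idx:
            # Centre and scale within the class; 0.0 if there is no spread.
            scores[i] = (raw[i] - med) / mad if mad > 0 else 0.0
    return scores

# Toy data (hypothetical): cases 0-3 belong to class "a", cases 4-5 to "b".
# Case 3 carries label "a" but is rarely grouped with the other "a" cases.
prox = [
    [1.0, 0.9, 0.8, 0.1, 0.0, 0.0],
    [0.9, 1.0, 0.7, 0.2, 0.0, 0.0],
    [0.8, 0.7, 1.0, 0.1, 0.0, 0.0],
    [0.1, 0.2, 0.1, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.8],
    [0.0, 0.0, 0.0, 0.0, 0.8, 1.0],
]
labels = ["a", "a", "a", "a", "b", "b"]
scores = class_outlier_scores(prox, labels)
most_outlying = max(range(len(scores)), key=lambda i: scores[i])
```

In this toy example `most_outlying` is case 3, the one that is least similar to the other members of its own class. Because the score is centred and scaled per class, it can be compared across classes of different compactness.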

Notes

  1. https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#outliers

  2. http://docs.rapidminer.com/studio/operators/data_transformation/data_cleansing/outlier_detection/detect_outlier_cof.html

Acknowledgments

We thank the IDA reviewers for their valuable comments and suggestions, and Vaclav Blahut for the implementation of and experiments with CB-ILP. We would also like to thank the members of KDLab FI MU for their help. This work has been partially supported by the Faculty of Informatics, Masaryk University, Brno.

Author information

Corresponding author

Correspondence to Luboš Popelínský.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Nezvalová, L., Popelínský, L., Torgo, L., Vaculík, K. (2015). Class-Based Outlier Detection: Staying Zombies or Awaiting for Resurrection?. In: Fromont, E., De Bie, T., van Leeuwen, M. (eds) Advances in Intelligent Data Analysis XIV. IDA 2015. Lecture Notes in Computer Science, vol 9385. Springer, Cham. https://doi.org/10.1007/978-3-319-24465-5_17

  • DOI: https://doi.org/10.1007/978-3-319-24465-5_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24464-8

  • Online ISBN: 978-3-319-24465-5

  • eBook Packages: Computer Science, Computer Science (R0)
