One-Class Classification for Anomaly Detection with Kernel Density Estimation and Genetic Programming

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 9594)

Abstract

A novel approach is proposed for fast anomaly detection by one-class classification. Standard kernel density estimation is first used to estimate the input probability density function from the one-class input data. This estimate can be used for anomaly detection: query points are classed as anomalies if their density falls below some threshold. The disadvantage is that kernel density estimation is lazy, that is, the bulk of the computation is performed at query time, so it can be slow for large datasets. It is therefore proposed to approximate the density function using genetic programming symbolic regression before imposing the threshold. The runtime of the resulting genetic programming trees does not depend on the size of the training data. The method is tested on several datasets, including datasets from the domain of network security. Results show that the genetic programming approximation is generally very good, and hence classification accuracy approaches or equals that obtained when kernel density estimation is used to carry out one-class classification directly. Results are also generally superior to those of another standard approach, one-class support vector machines.
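The density-thresholding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-dimensional Gaussian kernel, the fixed bandwidth of 0.3, the 5% quantile used to set the threshold, and all function names are assumptions chosen for illustration. The sketch also makes the "lazy" cost visible, since every query loops over the whole training set.

```python
import math
import random


def gaussian_kde_density(query, train, bandwidth):
    """Density estimate at `query` from a 1-D Gaussian KDE over `train`."""
    norm = 1.0 / (len(train) * bandwidth * math.sqrt(2.0 * math.pi))
    # Lazy evaluation: each query visits every training point, so the
    # cost per query grows linearly with the size of the training set.
    total = sum(math.exp(-0.5 * ((query - x) / bandwidth) ** 2) for x in train)
    return norm * total


def fit_threshold(train, bandwidth, quantile=0.05):
    """Density value below which roughly `quantile` of training points fall.

    The quantile rule is an assumption made for this sketch; the abstract
    only says that "some threshold" is imposed.
    """
    densities = sorted(gaussian_kde_density(x, train, bandwidth) for x in train)
    return densities[int(quantile * len(densities))]


def is_anomaly(query, train, bandwidth, threshold):
    """Class a query point as an anomaly if its density is below the threshold."""
    return gaussian_kde_density(query, train, bandwidth) < threshold


# Synthetic "normal" class: samples from a standard Gaussian (illustrative only).
random.seed(0)
normal_data = [random.gauss(0.0, 1.0) for _ in range(500)]
bandwidth = 0.3  # hypothetical value, not taken from the paper
threshold = fit_threshold(normal_data, bandwidth)

print(is_anomaly(0.1, normal_data, bandwidth, threshold))  # point near the mode
print(is_anomaly(8.0, normal_data, bandwidth, threshold))  # point far in the tail
```

In the proposed method, a genetic programming tree fitted by symbolic regression would stand in for `gaussian_kde_density` at query time, which removes the loop over the training data from each query.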

Keywords

  • Anomaly detection
  • One-class classification
  • Kernel density estimation

  • DOI: 10.1007/978-3-319-30668-1_1
  • Chapter length: 16 pages

Acknowledgements

This work is funded by Vietnam International Education Development (VIED) and by agreement with the Irish Universities Association.

Author information

Corresponding author

Correspondence to Van Loi Cao.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Cao, V.L., Nicolau, M., McDermott, J. (2016). One-Class Classification for Anomaly Detection with Kernel Density Estimation and Genetic Programming. In: Heywood, M., McDermott, J., Castelli, M., Costa, E., Sim, K. (eds) Genetic Programming. EuroGP 2016. Lecture Notes in Computer Science, vol 9594. Springer, Cham. https://doi.org/10.1007/978-3-319-30668-1_1

  • DOI: https://doi.org/10.1007/978-3-319-30668-1_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-30667-4

  • Online ISBN: 978-3-319-30668-1

  • eBook Packages: Computer Science, Computer Science (R0)