Knowledge and Information Systems, Volume 14, Issue 3, pp 377–392

The importance of generalizability for anomaly detection

Short Paper

Abstract

In security-related areas there is concern over novel “zero-day” attacks that penetrate system defenses and wreak havoc. The best methods for countering these threats are recognizing “nonself” as in an Artificial Immune System or recognizing “self” through clustering. In either case, the concern remains that an attack appearing similar to self could be missed. Given this situation, one could incorrectly assume that a preference for a tighter fit to self over generalizability is important for false positive reduction in this type of learning problem. This article confirms that in anomaly detection, as in other forms of classification, a tight fit, although important, does not supersede model generality. This is shown using three systems, each with a different geometric bias in the decision space. The first two use spherical and ellipsoid clusters with a k-means algorithm modified to work on the one-class/blind classification problem. The third is based on wrapping the self points with a multidimensional convex hull (polytope) algorithm capable of learning disjunctive concepts via a thresholding constant. All three algorithms are tested using the Voting dataset from the UCI Machine Learning Repository, the MIT Lincoln Labs intrusion detection dataset, and data from the lossy-compressed steganalysis domain.
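
The paper's modified k-means and polytope detectors are not reproduced here, but a minimal sketch of the spherical-cluster idea, assuming a one-class setting where only "self" data is available for training, looks like the following. The helper names (fit_self_model, is_anomaly), the cluster count k, and the radius quantile are illustrative assumptions, not values from the paper.

```python
# A minimal sketch (not the authors' exact algorithm) of one-class anomaly
# detection with spherical clusters: fit k-means to "self" data only, give
# each cluster a radius, and flag test points falling outside every sphere.
import numpy as np
from sklearn.cluster import KMeans

def fit_self_model(X_self, k=5, radius_quantile=0.95):
    # k and radius_quantile are illustrative choices, not the paper's values.
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_self)
    dists = np.min(km.transform(X_self), axis=1)  # distance to nearest center
    # Per-cluster radius: a high quantile of member distances, so each sphere
    # fits "self" tightly without being inflated by a few stragglers.
    radii = np.array([np.quantile(dists[km.labels_ == c], radius_quantile)
                      for c in range(k)])
    return km, radii

def is_anomaly(km, radii, X):
    # A point is anomalous ("nonself") if it lies outside every sphere.
    D = km.transform(X)  # (n_samples, k) distances to all centers
    return np.all(D > radii, axis=1)
```

The radius quantile is the knob that trades a tight fit to self against generality, which is the trade-off the article examines; an ellipsoid variant would replace the Euclidean distance with a per-cluster Mahalanobis distance. The convex-hull bias can be sketched in the same spirit: wrap the self points in a hull and test membership, as below. This uses SciPy's triangulation-based point-location test as a stand-in; the paper's Qhull-based polytope construction and the thresholding constant that splits self into multiple hulls for disjunctive concepts are not shown.

```python
# A hedged sketch of hull-based self modeling: points outside the convex
# hull of the training ("self") points are flagged as anomalies.
from scipy.spatial import Delaunay

class HullDetector:
    def __init__(self, X_self):
        # Triangulating the self points yields a point-in-hull test; note
        # that this grows expensive quickly as dimensionality increases.
        self.tri = Delaunay(X_self)

    def is_anomaly(self, X):
        # find_simplex returns -1 for points outside the hull of X_self.
        return self.tri.find_simplex(X) < 0
```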

Keywords

Clustering · Anomaly detection · Convex polytope · Ellipsoid


References

  1. Avcibas I, Memon N, Sankur B (2002) Image steganalysis with binary similarity measures. In: International conference on image processing, Rochester, NY
  2. Barber CB, Dobkin DP, Huhdanpaa HT (1997) The quickhull algorithm for convex hulls. ACM Trans Math Softw 22:469–483
  3. Barber CB, Huhdanpaa HT (2002) Qhull, version 2002.1. Computer software. Available at: http://www.thesa.com/software/qhull/
  4. Barron AR (1991) Approximation and estimation bounds for artificial neural networks. In: Proceedings of the fourth annual workshop on computational learning theory. Morgan Kaufmann, Palo Alto, CA, pp 243–249
  5. Baum EB, Haussler D (1988) What size net gives valid generalization? In: Proceedings of neural information processing systems, New York, pp 81–90
  6. Blake CL, Merz CJ (1998) UCI repository of machine learning databases. University of California, Department of Information and Computer Science, Irvine, CA. Available at: http://www.ics.uci.edu/~mlearn/MLRepository.html
  7. Brotherton T, Johnson T (2001) Anomaly detection for advanced military aircraft using neural networks. In: IEEE aerospace conference, Big Sky, MT
  8. Chang CI, Chiang SS (2002) Anomaly detection and classification for hyperspectral imagery. IEEE Trans Geosci Remote Sens 40(6):1314–1325
  9. Cho SB, Park HJ (2003) Efficient anomaly detection by modeling privilege flows using hidden Markov model. Comput Secur 22(1):45–55
  10. Cohen WW (1988) Generalizing number and learning from multiple examples in explanation-based learning. Mach Learn 256–269
  11. Coxeter HSM (1973) Regular polytopes, 3rd edn. Dover, New York
  12. Dasgupta D, Gonzales F (2002) An immunity-based technique to characterize intrusions in computer networks. IEEE Trans Evol Comput 6(3):281–291
  13. Delany SJ, Cunningham P (2006) ECUE: a spam filter that uses machine learning to track concept drift. Technical Report TCD-CS-2006-05, Trinity College Dublin, Computer Science Department, Ireland
  14. Denning DE (1987) An intrusion detection model. IEEE Trans Softw Eng SE-13:222–232
  15. Drummond C, Holte R (2005) Learning to live with false alarms. In: KDD-2005 workshop on data mining methods for anomaly detection, Chicago, IL, 21–25 August, pp 21–24
  16. Duda RO, Hart PE, Stork DG (2001) Pattern classification, 2nd edn. Wiley, New York
  17. Eskin E (2000) Anomaly detection over noisy data using learned probability distributions. In: Proceedings of the international conference on machine learning, Stanford University, Stanford, CA
  18. Eskin E, Arnold A, Prerau M, Portnoy L, Stolfo S (2002) A geometric framework for unsupervised anomaly detection: detecting intrusions in unlabeled data. In: Applications of data mining in computer security, pp 82–102
  19. Farid H, Lyu S (2003) Higher-order wavelet statistics and their application to digital forensics. In: IEEE workshop on statistical analysis in computer vision, Madison, WI
  20. Fan W, Miller M, Stolfo S, Lee W, Chan P (2004) Using artificial anomalies to detect unknown and known network intrusions. Knowl Inf Syst 6(5):507–527
  21. Fridrich J, Goljan M, Du R (2001) Detecting LSB steganography in color and gray-scale images. IEEE Multimedia Magazine, special issue on security, October 2001, pp 22–28
  22. Gupta A, Sekar R (2003) An approach for detecting self-propagating email using anomaly detection. In: Recent advances in intrusion detection: 6th international symposium, RAID 2003, Lecture notes in computer science, Pittsburgh, PA, 8–10 September
  23. Haines J, Lippmann R, Fried D, Tran E, Boswell S, Zissman M (1999) DARPA intrusion detection system evaluation: design and procedures. MIT Lincoln Laboratory technical report, Cambridge, MA
  24. Hamerly G, Elkan C (2003) Learning the k in k-means. In: Advances in neural information processing systems (NIPS) 15, pp 289–296
  25. Inoue H, Forrest S (2002) Anomaly intrusion detection in dynamic execution environments. In: New security paradigms workshop, pp 52–60
  26. Jackson J (2003) Targeting covert messages: a unique approach for detecting novel steganography. Masters thesis, Air Force Institute of Technology, Wright-Patterson AFB, OH
  27. Kharrazi M, Sencar T, Memon N (2005) Benchmarking steganographic and steganalysis techniques. In: SPIE electronic imaging, San Jose, CA, 16–20 January
  28. Kubler TL (2006) Ant clustering with locally weighting ant perception and diversified memory. Masters thesis, Air Force Institute of Technology, Wright-Patterson AFB, OH
  29. Lambert T (1998) Convex hull algorithms applet. UNSW School of Computer Science and Engineering. Available at: http://www.cse.unsw.edu.au/~lambert/java/3d/hull.html
  30. Lane T, Brodley C (2003) An empirical study of two approaches to sequence learning for anomaly detection. Mach Learn 51(1):73–107
  31. Lazarevic A, Ertoz L, Ozgur A, Srivastava J, Kumar V (2003) Evaluation of outlier detection schemes for detecting network intrusions. In: Proceedings of the third SIAM international conference on data mining, San Francisco, CA
  32. Lyu S, Farid H (2002) Detecting hidden messages using higher-order statistics and support vector machines. In: Information hiding: 5th international workshop, IH 2002, Noordwijkerhout, The Netherlands, 7–9 October
  33. Lyu S, Farid H (2004) Steganalysis using color wavelet statistics and one-class support vector machines. In: SPIE symposium on electronic imaging, San Jose, CA
  34. MacQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley symposium on mathematical statistics and probability, pp 281–297
  35. Mahoney M, Chan P (2003) An analysis of the 1999 DARPA/Lincoln Laboratory evaluation data for network anomaly detection. In: Proceedings of recent advances in intrusion detection, RAID 2003, Pittsburgh, PA, 8–10 September
  36. McBride B, Peterson G (2004) Blind data classification using hyper-dimensional convex polytopes. In: Proceedings of the 17th international FLAIRS conference, Miami, FL, pp 520–526
  37. McBride BT, Peterson GL, Gustafson SC (2005) A new blind method for detecting novel steganography. Digit Invest 2:50–70
  38. Melnik O (2002) Decision region connectivity analysis: a method for analyzing high-dimensional classifiers. Mach Learn 48(1/2/3)
  39. Mitchell TM (1982) Generalization as search. Artif Intell 18:203–226
  40. Mitchell TM, Keller RM, Kedar-Cabelli ST (1986) Explanation-based generalization: a unifying view. Mach Learn 1(1):47–80
  41. Nguyen H, Melnik O, Nissim K (2003) Explaining high-dimensional data. Unpublished presentation. Available at: http://dimax.rutgers.edu/~hnguyen/GOAL.ppt. Accessed 4 Aug 2003
  42. O'Rourke J (1998) Computational geometry in C, 2nd edn. Cambridge University Press, Cambridge, UK
  43. Pelleg D, Moore A (2000) X-means: extending k-means with efficient estimation of the number of clusters. In: Proceedings of the 17th international conference on machine learning (ICML), pp 727–734
  44. Peterson GL, Mills RF, McBride BT, Alred WC (2005) A comparison of generalizability for anomaly detection. In: KDD-2005 workshop on data mining methods for anomaly detection, Chicago, IL, 21–25 August, pp 53–57
  45. Thrun S (1995) Lifelong learning: a case study. Technical Report CMU-CS-95-208, Carnegie Mellon University, Computer Science Department, Pittsburgh, PA
  46. Wah BW (1999) Generalization and generalizability measures. IEEE Trans Knowl Data Eng 11(1):175–186
  47. Widmer G, Kubat M (1996) Learning in the presence of concept drift and hidden contexts. Mach Learn 23(1):68–101
  48. Wong C, Chen C, Yeh S (2000) K-means-based fuzzy classifier design. In: The ninth IEEE international conference on fuzzy systems, vol 1, pp 48–52

Copyright information

© Springer-Verlag London Limited 2007

Authors and Affiliations

Department of Electrical and Computer Engineering, Air Force Institute of Technology, Wright-Patterson Air Force Base, OH, USA