Mining Common Outliers for Intrusion Detection

  • Goverdhan Singh
  • Florent Masseglia
  • Céline Fiot
  • Alice Marascu
  • Pascal Poncelet
Part of the Studies in Computational Intelligence book series (SCI, volume 292)


Data mining for intrusion detection can be divided into several sub-topics, one of which is unsupervised clustering, an approach with controversial properties. Unsupervised clustering for intrusion detection aims to i) group behaviours together according to their similarity and ii) detect groups containing only one (or very few) behaviour(s). Such isolated behaviours seem to deviate from the model of normality; therefore, they are considered malicious. Obviously, not all atypical behaviours are attacks or intrusion attempts, which is one drawback of intrusion detection methods based on clustering. We consider adding a new feature to isolated behaviours before they are considered malicious. This feature is based on the possible repeated occurrence of the behaviour on many information systems. Based on this feature, we propose a new outlier mining method, which we validate through a set of experiments.
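
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the clustering method or how behaviours are matched across systems, so this sketch assumes exact string equality in place of similarity-based clustering. The names `isolated_behaviours`, `common_outliers`, `max_cluster_size`, `min_sites` and the sample logs are all hypothetical.

```python
from collections import Counter, defaultdict

def isolated_behaviours(behaviours, max_cluster_size=1):
    """Group identical behaviours and return those whose group is tiny.

    Assumption: exact equality stands in for a real similarity-based
    clustering, which the abstract does not describe in detail.
    """
    counts = Counter(behaviours)
    return {b for b, n in counts.items() if n <= max_cluster_size}

def common_outliers(site_logs, min_sites=2, max_cluster_size=1):
    """Return behaviours that are isolated on at least `min_sites` systems."""
    sites_per_outlier = defaultdict(set)
    for site, behaviours in site_logs.items():
        for b in isolated_behaviours(behaviours, max_cluster_size):
            sites_per_outlier[b].add(site)
    return {b for b, sites in sites_per_outlier.items() if len(sites) >= min_sites}

# Hypothetical logs from three independent information systems.
site_logs = {
    "site_a": ["GET /index", "GET /index", "GET /admin.php?cmd=x", "GET /odd-local-page"],
    "site_b": ["GET /home", "GET /home", "GET /home", "GET /admin.php?cmd=x"],
    "site_c": ["GET /shop", "GET /shop", "GET /admin.php?cmd=x", "GET /intranet-tool"],
}

print(common_outliers(site_logs))  # {'GET /admin.php?cmd=x'}
```

In this toy data, `/odd-local-page` and `/intranet-tool` are each atypical on a single system and are therefore not reported, while the request that is isolated on all three systems is kept as a common outlier.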


Keywords: Intrusion Detection, Anomalies, Outliers, Data Streams





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Goverdhan Singh (1)
  • Florent Masseglia (1)
  • Céline Fiot (1)
  • Alice Marascu (1)
  • Pascal Poncelet (2)

  1. INRIA, Sophia Antipolis
  2. LIRMM UMR CNRS 5506, Montpellier Cedex 5, France
