Data Security Analysis Using Unsupervised Learning and Explanations

  • G. Corral
  • E. Armengol
  • A. Fornells
  • E. Golobardes
Part of the Advances in Soft Computing book series (AINSC, volume 44)


Vulnerability assessment is an effective security mechanism for identifying vulnerabilities in systems or networks before they are exploited. However, manual analysis of network test and vulnerability assessment results is time-consuming and demands expertise. This paper presents an improvement of Analia, a security system that uses artificial intelligence techniques to process the results of a vulnerability assessment. The system applies unsupervised clustering techniques to discover hidden patterns and to identify abnormal device behaviour by grouping devices that share similar vulnerabilities. The proposed improvement consists of extracting a symbolic explanation for each cluster, expressed in network security vocabulary, to help security analysts understand the clustering solution.
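The idea of clustering devices and then describing each cluster symbolically can be illustrated with a minimal sketch. It is not Analia's implementation: the device names, the finding labels, the plain k-means with farthest-first initialisation, and the use of set intersection as the "explanation" are all assumptions chosen for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical scan results: device name -> findings reported for that device.
DEVICES: Dict[str, Set[str]] = {
    "web-01": {"port:80", "port:443", "os:linux", "vuln:outdated-apache"},
    "web-02": {"port:80", "port:443", "os:linux", "vuln:outdated-apache"},
    "web-03": {"port:80", "port:443", "os:linux"},
    "pc-01": {"port:139", "port:445", "os:windows", "vuln:smb-signing-off"},
    "pc-02": {"port:139", "port:445", "os:windows"},
    "pc-03": {"port:445", "os:windows", "vuln:smb-signing-off"},
}


def to_vectors(devices: Dict[str, Set[str]]) -> Tuple[List[str], Dict[str, List[float]]]:
    """Encode each device as a binary vector over all observed findings."""
    features = sorted(set().union(*devices.values()))
    vectors = {
        name: [1.0 if f in found else 0.0 for f in features]
        for name, found in devices.items()
    }
    return features, vectors


def sq_dist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors: Dict[str, List[float]], k: int, iterations: int = 20) -> List[List[str]]:
    """Plain k-means with deterministic farthest-first initialisation."""
    names = sorted(vectors)
    centroids = [vectors[names[0]]]
    while len(centroids) < k:
        # Pick the device farthest from every centroid chosen so far.
        far = max(names, key=lambda n: min(sq_dist(vectors[n], c) for c in centroids))
        centroids.append(vectors[far])

    assignment: Dict[str, int] = {}
    for _ in range(iterations):
        new_assignment = {
            n: min(range(k), key=lambda i: sq_dist(vectors[n], centroids[i]))
            for n in names
        }
        if new_assignment == assignment:
            break  # converged: no device changed cluster
        assignment = new_assignment
        for i in range(k):
            members = [vectors[n] for n in names if assignment[n] == i]
            if members:
                # New centroid is the per-feature mean of the members.
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return [[n for n in names if assignment[n] == i] for i in range(k)]


def explain(devices: Dict[str, Set[str]], cluster: List[str]) -> List[str]:
    """Symbolic description of a cluster: findings shared by every member."""
    if not cluster:
        return []
    return sorted(set.intersection(*(devices[n] for n in cluster)))


if __name__ == "__main__":
    _, vecs = to_vectors(DEVICES)
    for members in kmeans(vecs, k=2):
        print(members, "share", explain(DEVICES, members))
```

In this sketch the explanation is expressed in the same vocabulary as the scan findings (ports, operating system, vulnerability names), which is the property the abstract highlights. A device that lacks a finding shared by the rest of its cluster narrows the explanation, so comparing a device against its cluster's description is one simple way to spot unusual configurations.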


Network Security, Unsupervised Learning, Clustering, Explanations





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • G. Corral (1)
  • E. Armengol (2)
  • A. Fornells (1)
  • E. Golobardes (1)

  1. Grup de Recerca en Sistemes Intelligents, Enginyeria i Arquitectura La Salle, Universitat Ramon Llull, Barcelona, Spain
  2. IIIA, Artificial Intelligence Research Institute, CSIC, Spanish Council for Scientific Research, Bellaterra, Barcelona, Spain
