NEtwork Digest Analysis Driven by Association Rule Discoverers

  • Daniele Apiletti
  • Tania Cerquitelli
  • Vincenzo D’Elia
Part of the Studies in Computational Intelligence book series (SCI, volume 299)

Abstract

An important issue in network traffic analysis is to profile communications, detect anomalies or security threats, and identify recurrent patterns. To these ends, the analysis can be performed on (i) packet payloads, (ii) traffic metrics, and (iii) statistical features computed on traffic flows. Data mining techniques play an important role in the network traffic domain, where association rules are successfully exploited for anomaly identification and network traffic characterization. However, to discover potentially relevant knowledge, a very low support threshold needs to be enforced, which generates an unmanageably large number of rules. To address this issue, techniques are needed that reduce traffic volume and discover relevant knowledge efficiently. This paper presents a NEtwork Digest framework, named NED, to efficiently support network traffic analysis. NED exploits continuous queries to perform real-time aggregation of captured network data and supports filtering operations to further reduce traffic volume by focusing on relevant data. Furthermore, NED exploits two efficient algorithms to discover both traditional and generalized association rules. The extracted knowledge provides a high-level abstraction of the network traffic by highlighting unexpected and interesting traffic rules. Experiments performed on different network dumps showed the efficiency and effectiveness of the NED framework in characterizing traffic data and detecting anomalies.
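
To make the support-threshold trade-off mentioned in the abstract concrete, the following is a minimal sketch of brute-force association rule extraction over a handful of toy flow records. It is not the NED implementation and not one of the two algorithms the chapter uses; the records in `FLOWS`, the function names `frequent_itemsets` and `association_rules`, and the attribute encoding (`attribute=value` strings) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy flow records: each flow is a set of attribute=value items.
FLOWS = [
    {"src=10.0.0.1", "dst_port=80", "proto=TCP"},
    {"src=10.0.0.1", "dst_port=80", "proto=TCP"},
    {"src=10.0.0.2", "dst_port=80", "proto=TCP"},
    {"src=10.0.0.2", "dst_port=53", "proto=UDP"},
    {"src=10.0.0.3", "dst_port=4662", "proto=TCP"},
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Brute force: enumerates all non-empty subsets of each transaction,
    which is only feasible for very small examples like this one.
    """
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        for size in range(1, len(transaction) + 1):
            for combo in combinations(sorted(transaction), size):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def association_rules(itemsets, min_confidence):
    """Derive rules (body, head, support, confidence) from frequent itemsets.

    confidence(body -> head) = support(body and head) / support(body).
    Every subset of a frequent itemset is itself frequent, so the lookup
    itemsets[body] always succeeds.
    """
    rules = []
    for itemset, support in itemsets.items():
        if len(itemset) < 2:
            continue
        for size in range(1, len(itemset)):
            for body_items in combinations(sorted(itemset), size):
                body = frozenset(body_items)
                confidence = support / itemsets[body]
                if confidence >= min_confidence:
                    rules.append((body, itemset - body, support, confidence))
    return rules


if __name__ == "__main__":
    for threshold in (0.4, 0.2):
        found = frequent_itemsets(FLOWS, threshold)
        rules = association_rules(found, min_confidence=0.8)
        print(f"min_support={threshold}: {len(found)} itemsets, {len(rules)} rules")
```

Even on five records, lowering the threshold enough to admit the single flow towards port 4662 makes the number of frequent itemsets grow noticeably, which is the effect that aggregation and filtering before mining are meant to contain.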

Keywords

Association Rule; Intrusion Detection; Network Traffic; Association Rule Mining; Support Threshold
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Daniele Apiletti (1)
  • Tania Cerquitelli (1)
  • Vincenzo D’Elia (1)
  1. Dipartimento di Automatica e Informatica, Politecnico di Torino, Torino, Italy
