International Conference on Security and Privacy in Communication Systems

Security and Privacy in Communication Networks, pp. 175-192

TADOOP: Mining Network Traffic Anomalies with Hadoop

  • Geng Tian
  • Zhiliang Wang
  • Xia Yin
  • Zimu Li
  • Xingang Shi
  • Ziyi Lu
  • Chao Zhou
  • Yang Yu
  • Dan Wu
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 164)

Abstract

Today, the variety of anomalies and the large number of flows in a network make traffic anomaly detection a major challenge. In this paper, we propose DTE-FP (Dual qTsallis Entropy for flow Feature with Properties), a more efficient method for traffic anomaly detection. To handle the huge volume of traffic, we implement a Hadoop-based network traffic anomaly detection system named TADOOP, which supports semi-automatic training and both offline and online traffic anomaly detection. TADOOP has been deployed on a cluster of five servers in the Tsinghua University Campus Network. Furthermore, we compare DTE-FP with Tsallis entropy, and the experimental results show that DTE-FP has much better detection capability than Tsallis entropy.
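For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of the standard Tsallis entropy, S_q = (1 - Σ p_i^q) / (q - 1), computed over the empirical distribution of one flow feature. It is not DTE-FP, whose definition does not appear in this abstract. The function name `tsallis_entropy`, the choice of destination ports as the feature, and the sample values are illustrative assumptions only.

```python
import math
from collections import Counter


def tsallis_entropy(values, q):
    """Standard Tsallis entropy of the empirical distribution of `values`.

    Hypothetical helper for illustration; this is the baseline measure,
    not the DTE-FP method proposed in the paper.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    probs = [c / total for c in counts.values()]
    if math.isclose(q, 1.0):
        # In the limit q -> 1, Tsallis entropy reduces to Shannon entropy (nats).
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


# Assumed sample data: destination ports seen in one time window.
spread_ports = [80, 443, 53, 22]      # traffic spread evenly over four ports
concentrated_ports = [80, 80, 80, 80]  # traffic concentrated on a single port

print(tsallis_entropy(spread_ports, q=2.0))        # 0.75
print(tsallis_entropy(concentrated_ports, q=2.0))  # 0.0
```

A sharp change in such an entropy value between time windows is the kind of signal that entropy-based detectors look for; the parameter q controls how strongly frequent or rare feature values are weighted.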

Keywords

Tsallis entropy, Traffic anomaly detection, Hadoop, Big data, MapReduce

Copyright information

© Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2015

Authors and Affiliations

  • Geng Tian (1, 3)
  • Zhiliang Wang (2, 3)
  • Xia Yin (1, 3)
  • Zimu Li (2, 3)
  • Xingang Shi (2, 3)
  • Ziyi Lu (4)
  • Chao Zhou (4)
  • Yang Yu (1, 3)
  • Dan Wu (1, 3)

  1. Department of Computer Science and Technology, Tsinghua University, Beijing, China
  2. Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China
  3. Tsinghua National Laboratory for Information Science and Technology (TNList), Beijing, China
  4. Cisco Systems, Inc., Shanghai, China