Natural Laws (Benford’s Law and Zipf’s Law) for Network Traffic Analysis

  • Aamo Iorliam
Chapter
Part of the SpringerBriefs in Cybersecurity book series (BRIEFSCYBER)

Abstract

Recently, Benford’s law and Zipf’s law, which are both statistical laws, have been effectively used to distinguish between authentic data and fake data. The two laws share some similarities: both are classified as natural laws, both are power laws, and distributions that follow Benford’s law are expected to follow Zipf’s law as well. The laws also differ. Benford’s law establishes a relationship between digit and frequency, whereas Zipf’s law establishes a relationship between rank and frequency. In addition, Benford’s law applies to numeric attributes only, whereas Zipf’s law applies to both numeric and string attributes. In this chapter, we perform a comparative analysis of these two laws on network traffic data to determine whether the data follow these laws and whether the laws can discriminate between non-malicious and malicious network traffic flows. We observe that both laws effectively detected whether the traffic of a particular network was non-malicious or malicious. Furthermore, we observe that the initial Benford’s law chi-square divergence values obtained seem to be inversely proportional to the Zipf’s law P-values, a relationship that can potentially be exploited for intrusion detection system applications. These passive forensic detection methods, when properly deployed to analyse network traffic data in Nigeria, can help protect the Nigerian cyber space from malware and related attacks.
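
The two relationships named in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation. It assumes the numeric attribute is a positive integer (for example, a hypothetical flow size in bytes), uses one common form of the chi-square divergence computed on digit proportions, and, instead of the Zipf's law P-values mentioned above, estimates only the slope of the log-log rank-frequency line (close to -1 for an ideal Zipf distribution). All function names are hypothetical.

```python
import math
from collections import Counter


def benford_expected():
    """Benford's law: P(first digit = d) = log10(1 + 1/d), d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def benford_chi_square(values):
    """Chi-square divergence between observed first-digit proportions
    and the Benford proportions. Assumes nonzero integers (an assumption
    of this sketch, e.g. hypothetical flow sizes in bytes)."""
    digits = [int(str(abs(v))[0]) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) / n - p) ** 2 / p
        for d, p in benford_expected().items()
    )


def zipf_slope(items):
    """Least-squares slope of log(frequency) against log(rank).
    Items may be numbers or strings; needs at least two distinct items."""
    freqs = sorted(Counter(items).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    powers = [2 ** k for k in range(1, 200)]   # known to be close to Benford
    uniform = list(range(1, 1000))             # first digits equally likely
    print("divergence, powers of 2:", benford_chi_square(powers))
    print("divergence, 1..999:     ", benford_chi_square(uniform))
```

A lower divergence means the first-digit distribution is closer to Benford's law. Any decision threshold separating malicious from non-malicious traffic would have to be calibrated on labelled data and is not implied by this sketch.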

Keywords

Benford’s law, Zipf’s law, Network traffic analysis, Cyber space

Notes

Acknowledgements

The author would like to specially thank Prof. Anthony T.S. Ho, Prof. Adrian Waller, Prof. Shujun Li, Dr. Norman Poh and Dr. Santosh Tirunagari for their assistance.

Copyright information

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Mathematics and Computer Science, Benue State University, Makurdi, Nigeria
