Analyzing Statistical Effect of Sampling on Network Traffic Dataset

  • Raman Singh
  • Harish Kumar
  • R. K. Singla
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 248)


When a large network traffic dataset is sampled, only some of the packets are selected, and discarding the remaining packets may change the statistical characteristics of the data. This paper discusses the effect of sampling on those characteristics, using the well-known NSL-KDD benchmark network traffic dataset. Three sampling techniques are applied: random, systematic and under-over sampling. The dataset attributes considered are duration, src_bytes, dst_bytes, wrong_fragment, num_compromised, num_file_creations and srv_count. Range, mean and standard deviation are used as the statistical parameters for the analysis. The results show that sampling has a considerable statistical effect on the network traffic dataset.
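The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic stand-in records, and the interpretation of under-over sampling (undersample large classes and oversample small ones to a common size) are all assumptions made for illustration, and no NSL-KDD data is loaded.

```python
import random
import statistics


def random_sample(rows, k, seed=0):
    """Simple random sampling: k rows drawn without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, k)


def systematic_sample(rows, step, start=0):
    """Systematic sampling: every step-th row, beginning at index start."""
    return rows[start::step]


def under_over_sample(rows, label_key, per_class, seed=0):
    """Under-over sampling (assumed variant): bring every class to per_class rows.

    Classes with at least per_class rows are undersampled without replacement;
    smaller classes are oversampled by drawing with replacement.
    """
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    balanced = []
    for members in by_class.values():
        if len(members) >= per_class:
            balanced.extend(rng.sample(members, per_class))
        else:
            balanced.extend(rng.choices(members, k=per_class))
    return balanced


def summarize(rows, attribute):
    """Range, mean and population standard deviation of one attribute."""
    values = [row[attribute] for row in rows]
    return {
        "range": max(values) - min(values),
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
    }


def make_toy_traffic(n=1000, seed=42):
    """Hypothetical stand-in records; real NSL-KDD rows have many more fields."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        is_attack = rng.random() < 0.1
        upper = 50000 if is_attack else 2000
        rows.append({
            "src_bytes": rng.randint(0, upper),
            "label": "attack" if is_attack else "normal",
        })
    return rows


traffic = make_toy_traffic()
samples = {
    "full": traffic,
    "random": random_sample(traffic, 100),
    "systematic": systematic_sample(traffic, 10),
    "under-over": under_over_sample(traffic, "label", 200),
}
for name, rows in samples.items():
    print(name, summarize(rows, "src_bytes"))
```

Comparing each sample's range, mean and standard deviation against the "full" row gives a rough sense of how far a sampling technique shifts an attribute's statistics. The same comparison could be repeated for the other attributes listed in the abstract.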


Sampling, Network traffic dataset, Intrusion detection system





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. University Institute of Engineering and Technology, Panjab University, Chandigarh, India
  2. DCSA, Panjab University, Chandigarh, India
