Designing an Internet Traffic Predictive Model by Applying a Signal Processing Method

Journal of Network and Systems Management

Abstract

Detection of abnormal internet traffic has become a significant area of research in network security. Because of its importance, many predictive models have been designed using machine learning algorithms. These models achieve high performance in detecting abnormal internet traffic behaviors. However, they may not guarantee reliable detection performance for new incoming abnormal internet traffic, because they are designed using raw features from imbalanced internet traffic data. Since internet traffic is non-stationary time-series data, it is difficult to identify abnormal internet traffic with the raw features. In this study, we propose a new approach to detecting abnormal internet traffic. Our approach begins by extracting hidden but important features using the discrete wavelet transform. Statistical analysis is then performed to filter out irrelevant and less important features. Only statistically significant features are used to design a reliable predictive model with logistic regression. A comparative analysis is conducted to assess the value of our approach by measuring accuracy, sensitivity, and the area under the receiver operating characteristic curve (AUC). From the analysis, we found that our model detects abnormal internet traffic successfully with high accuracy.
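
The feature extraction and evaluation steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the choice of the Haar wavelet, the per-level energy features, the function names, and the toy traffic traces are all assumptions made for illustration, and the statistical filtering and logistic regression steps are left out. The sketch only shows how a discrete wavelet transform can expose a short burst that a flat trace lacks, and how sensitivity, accuracy, and AUC are computed from labels and scores.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def wavelet_energy_features(signal, levels=2):
    """Detail-coefficient energy at each level, then the final approximation energy."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def sensitivity(y_true, y_pred):
    """True positive rate: detected abnormal samples / all abnormal samples."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)

def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical packet-count traces: one flat, one with a single burst.
flat_trace = [4, 4, 4, 4, 4, 4, 4, 4]
bursty_trace = [4, 4, 4, 20, 4, 4, 4, 4]
flat_features = wavelet_energy_features(flat_trace)
bursty_features = wavelet_energy_features(bursty_trace)
```

In this toy case the flat trace has zero detail energy at every level, while the burst shows up as nonzero detail energy, which is the kind of feature a downstream classifier could use. A real pipeline would likely rely on an established wavelet library and a fitted logistic regression model rather than these hand-written helpers.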

Acknowledgments

This study is based on work supported by the US Army Research Office (ARO) Grant W911NF1310143.

Author information

Corresponding author

Correspondence to Soo-Yeon Ji.

Cite this article

Ji, SY., Choi, S. & Jeong, D.H. Designing an Internet Traffic Predictive Model by Applying a Signal Processing Method. J Netw Syst Manage 23, 998–1015 (2015). https://doi.org/10.1007/s10922-014-9335-3


  • DOI: https://doi.org/10.1007/s10922-014-9335-3
