Abstract
Network traffic audit data provide unique and valuable information for network security. Although a comprehensive intrusion detection scheme draws on multiple data sources and multiple measurements, system-level traffic data provide important baseline information on anomalous traffic that could harm the network, and such information can be learned from training data. However, when labeled abnormal data are unavailable, or abnormal events are too rare in the training data, conventional supervised classification methods, such as regression models and neural networks, are not suitable. Using the bootstrap resampling method, we developed a simple probability model trained on an anomaly-free sample. In detecting abnormal events in a testing sample, the model achieved an area under the receiver operating characteristic curve of 0.96, a specificity of 0.96, a sensitivity of 0.96, and a classification agreement rate of 0.96. The model offers a potential approach for classifying network traffic when limited or no abnormal information is available in training data.
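The abstract does not spell out the model itself, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes each traffic record has already been reduced to a single numeric score (a hypothetical simplification), uses bootstrap resampling of an anomaly-free training sample to estimate an upper-quantile threshold, and then computes sensitivity, specificity, and agreement rate on labeled test data. The function names, the 0.99 quantile, and the number of resamples are all assumptions chosen for illustration.

```python
import random
import statistics


def bootstrap_threshold(normal_scores, quantile=0.99, n_boot=1000, seed=0):
    """Estimate an upper-quantile threshold from anomaly-free scores.

    Each bootstrap replicate resamples the training scores with
    replacement and records the empirical quantile; the threshold is
    the mean of those replicate quantiles. (Illustrative assumption,
    not the model described in the article.)
    """
    rng = random.Random(seed)
    n = len(normal_scores)
    idx = min(n - 1, int(quantile * n))  # position of the quantile in a sorted resample
    estimates = []
    for _ in range(n_boot):
        resample = sorted(rng.choices(normal_scores, k=n))
        estimates.append(resample[idx])
    return statistics.fmean(estimates)


def classify(scores, threshold):
    """Flag a score as anomalous (True) when it exceeds the threshold."""
    return [s > threshold for s in scores]


def evaluate(predicted, actual):
    """Return sensitivity, specificity, and agreement rate.

    `predicted` and `actual` are equal-length lists of booleans,
    where True means "anomalous". Both classes must be present in
    `actual`, otherwise a division by zero occurs.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "agreement": (tp + tn) / len(pairs),
    }
```

In use, one would call `bootstrap_threshold` on scores from anomaly-free traffic only, pass unseen scores to `classify`, and compare the result with known labels through `evaluate`. No abnormal examples are needed at training time, which is the situation the abstract addresses.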
Acknowledgements
We thank anonymous reviewers for helpful suggestions and comments, and Deron Galusha, biostatistician at the School of Medicine, Yale University, for his valuable comments on the early version of the manuscript. The content of this publication does not necessarily reflect the views or policies of Yale University, Yale New Haven Health, or Qualidigm; nor does mention of trade names, commercial products, or organizations imply endorsement by Yale University, Yale New Haven Health, or Qualidigm. We assume full responsibility for the accuracy and completeness of the ideas represented.
Cite this article
Wang, Y., Kim, I. A Bootstrap-based Simple Probability Model for Classifying Network Traffic and Detecting Network Intrusion. Secur J 21, 278–290 (2008). https://doi.org/10.1057/palgrave.sj.8350073
DOI: https://doi.org/10.1057/palgrave.sj.8350073