Abstract
Detecting, analyzing, and defending against cyber threats is an important topic in cyber security, and applying machine learning techniques to detect such threats has received considerable attention in the research literature. Border Gateway Protocol (BGP) anomalies affect network operations, and their detection is of interest to researchers and practitioners. In this chapter, we describe the main properties of the protocol and the datasets that contain BGP records collected from public and private repositories such as Route Views, Réseaux IP Européens (RIPE), and BCNET. We employ various feature selection algorithms to extract the most relevant features, which are later used to classify BGP anomalies.
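As a rough illustration of what filter-style feature selection can look like, the sketch below ranks features by a two-class Fisher score, (m1 - m0)^2 / (v1 + v0), and keeps the top k. The abstract does not say which algorithms are used, so the choice of Fisher score, the function names, the three feature columns, and the toy values are all assumptions made for this example and are not taken from the chapter.

```python
from statistics import mean, pvariance


def fisher_scores(rows, labels):
    """Two-class Fisher score for each feature column.

    score_j = (mean_anomaly_j - mean_regular_j)^2 / (var_anomaly_j + var_regular_j)
    Assumes labels are 0 (regular) or 1 (anomaly) and both classes are non-empty.
    """
    scores = []
    for j in range(len(rows[0])):
        anomaly = [r[j] for r, y in zip(rows, labels) if y == 1]
        regular = [r[j] for r, y in zip(rows, labels) if y == 0]
        spread = pvariance(anomaly) + pvariance(regular)
        gap = (mean(anomaly) - mean(regular)) ** 2
        # A feature with no within-class spread is scored 0 here (a simplification).
        scores.append(gap / spread if spread > 0 else 0.0)
    return scores


def top_k_features(rows, labels, k):
    """Return the indices of the k highest-scoring features."""
    scores = fisher_scores(rows, labels)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Hypothetical per-window features (illustrative values only):
# column 0: number of announcements, column 1: average AS-path length,
# column 2: a constant column that carries no class information.
rows = [
    [10, 3, 5],
    [12, 4, 5],
    [11, 3, 5],
    [90, 4, 5],
    [95, 3, 5],
    [88, 4, 5],
]
labels = [0, 0, 0, 1, 1, 1]

scores = fisher_scores(rows, labels)
selected = top_k_features(rows, labels, 2)
```

On this toy data the first column separates the two classes far better than the others, so it is ranked first. In practice the selected columns would then be passed to a classifier.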
Acknowledgements
We thank Yan Li, Hong-Jie Xing, Qiang Hua, and Xi-Zhao Wang from Hebei University, Marijana Ćosović from University of East Sarajevo, and Prerna Batta from Simon Fraser University for their helpful contributions in earlier publications related to this project.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Ding, Q., Li, Z., Haeri, S., Trajković, L. (2018). Application of Machine Learning Techniques to Detecting Anomalies in Communication Networks: Datasets and Feature Selection Algorithms. In: Dehghantanha, A., Conti, M., Dargahi, T. (eds) Cyber Threat Intelligence. Advances in Information Security, vol 70. Springer, Cham. https://doi.org/10.1007/978-3-319-73951-9_3
DOI: https://doi.org/10.1007/978-3-319-73951-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73950-2
Online ISBN: 978-3-319-73951-9
eBook Packages: Computer Science, Computer Science (R0)