Abstract
Of the many challenges that continue to make cyber-attack detection elusive, the lack of training data remains the biggest. Even though organizations and businesses turn to well-known network monitoring tools such as Wireshark, millions of people remain vulnerable because they lack information about the website behaviors and features that can signal an attack. In fact, most attacks succeed not because threat actors resort to complex coding and evasion techniques but because victims lack the basic tools to detect and avoid them. Despite these challenges, machine learning is revolutionizing the understanding of the nature of cyber-attacks. This study applied machine learning techniques to Phishing Website data with the objective of comparing five algorithms and providing insight that the general public can use to avoid phishing pitfalls. The findings suggest that Neural Network is the best-performing algorithm, and the model identifies the following as the leading features of phishing websites: inclusion of an IP address in the domain name, a longer URL, use of URL shortening services, inclusion of the “@” symbol in the URL, inclusion of the “−” symbol in the URL, use of non-trusted SSL certificates with an expiry duration of less than 6 months, domains registered for less than one year, and a favicon redirecting from other URLs. The Neural Network is based on a multi-layer perceptron and provides a basis for machine intelligence, so that in future phishing detection can be automated and treated as an artificial intelligence task.
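As a rough illustration of how some of the URL-based indicators listed above could be turned into inputs for a classifier, the following Python sketch derives five of them from a URL string. It is not the authors' feature-extraction code. The function names, the length threshold, the list of shortening services, and the choice to look for the hyphen only in the host name are all assumptions made for this example. The certificate, domain-age, and favicon indicators need external lookups and are left out.

```python
import ipaddress
from urllib.parse import urlparse

# Illustrative values only: neither the threshold nor the host list
# comes from the chapter.
LONG_URL_THRESHOLD = 75
SHORTENER_HOSTS = {"bit.ly", "tinyurl.com", "t.co", "goo.gl"}


def host_is_ip(host: str) -> bool:
    """Return True when the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url: str) -> dict:
    """Extract URL-only phishing indicators as booleans (hypothetical helper)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""  # urlparse lower-cases the host name
    return {
        "ip_in_domain": host_is_ip(host),
        "long_url": len(url) > LONG_URL_THRESHOLD,
        "uses_shortener": host in SHORTENER_HOSTS,
        "has_at_symbol": "@" in url,
        # Assumption: only the host is checked, since hyphens are
        # common in ordinary URL paths.
        "hyphen_in_host": "-" in host,
    }


if __name__ == "__main__":
    for example in ("http://192.168.0.1/login",
                    "http://paypal.com@secure-login.example/"):
        print(example, url_features(example))
```

A dictionary of booleans like this could be encoded as one row of a feature matrix and passed to any of the algorithms being compared, including a multi-layer perceptron.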
Acknowledgements
The challenges of accessing reliable cybersecurity datasets are well documented and common among researchers. As such, we are grateful to Rami Mustafa and Lee McCluskey of the University of Huddersfield and Fadi Thabtah of the Canadian University of Dubai for preparing and sharing the data.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Alloghani, M., Al-Jumeily, D., Hussain, A., Mustafina, J., Baker, T., Aljaaf, A.J. (2020). Implementation of Machine Learning and Data Mining to Improve Cybersecurity and Limit Vulnerabilities to Cyber Attacks. In: Yang, XS., He, XS. (eds) Nature-Inspired Computation in Data Mining and Machine Learning. Studies in Computational Intelligence, vol 855. Springer, Cham. https://doi.org/10.1007/978-3-030-28553-1_3
DOI: https://doi.org/10.1007/978-3-030-28553-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28552-4
Online ISBN: 978-3-030-28553-1
eBook Packages: Intelligent Technologies and Robotics (R0)