Abstract
The rapid growth of the Internet has shifted consumer habits away from conventional shopping and toward electronic commerce. Rather than robbing banks or shops, today's criminals use a range of sophisticated cyber methods to reach their victims. Attackers have developed new ways of deceiving customers, such as phishing, in which fake websites are used to gather sensitive information such as account IDs, usernames, and passwords. Because these attacks are semantic in nature and mainly exploit the vulnerabilities of computer users, establishing the authenticity of a web page is difficult. Machine learning (ML) is a widely used data analysis technique that has shown promising results in the fight against phishing. This article examines the applicability of machine learning methods for identifying phishing attempts, along with their advantages and disadvantages. Specifically, a variety of machine learning methods are explored to find appropriate anti-phishing solutions. More significantly, we tested a wide range of machine learning methods on real-world phishing datasets and against several criteria. Six machine learning classification methods are employed to detect phishing websites. In this research, the Random Forest classifier achieved the highest accuracy, 97.17%, followed by the Gradient Boost classifier at 94.75% and the Decision Tree classifier at 94.69%. Logistic Regression reached 92.76%, while KNN reached 60.45% and SVM 56.04%. We showed that KNN has trouble detecting phishing sites, as reflected in its low accuracy. The Decision Tree classifier performs almost identically to Gradient Boosting.
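The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes scikit-learn, default hyperparameters, an 80/20 stratified split, and a synthetic dataset from `make_classification` standing in for the real phishing data; the helper name `compare_classifiers` is hypothetical. The accuracies it prints will not match the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y, seed=0):
    """Fit six classifiers on one train/test split and return test accuracy per model."""
    # Assumption: 80/20 stratified split; the abstract does not state the split used.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )
    # Assumption: default hyperparameters, except max_iter raised so that
    # logistic regression converges on unscaled features.
    models = {
        "Random Forest": RandomForestClassifier(random_state=seed),
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "KNN": KNeighborsClassifier(),
        "SVM": SVC(),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


# Synthetic stand-in data (assumption): 1000 samples, 30 numeric features, binary label.
X, y = make_classification(
    n_samples=1000, n_features=30, n_informative=10, random_state=0
)
scores = compare_classifiers(X, y)

# Print the models from most to least accurate on the held-out split.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.4f}")
```

To run the same comparison on a real dataset, replace the `make_classification` call with a feature matrix and label vector loaded from that dataset. A single split gives a noisy estimate, so cross-validation would be a reasonable extension.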
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mahamudul Hasan, S.M., Jakilim, N.M., Forhad Rabbi, M., Rahman Pir, R.M.S. (2022). Determining the Most Effective Machine Learning Techniques for Detecting Phishing Websites. In: Unhelkar, B., Pandey, H.M., Raj, G. (eds) Applications of Artificial Intelligence and Machine Learning. Lecture Notes in Electrical Engineering, vol 925. Springer, Singapore. https://doi.org/10.1007/978-981-19-4831-2_48
DOI: https://doi.org/10.1007/978-981-19-4831-2_48
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-4830-5
Online ISBN: 978-981-19-4831-2
eBook Packages: Computer Science, Computer Science (R0)