Utilising Machine Learning Against Email Phishing to Detect Malicious Emails

Parmar, Yogeshvar Singh; Jahankhani, Hamid

doi:10.1007/978-3-030-88040-8_3

Part of the book series: Advanced Sciences and Technologies for Security Applications (ASTSA)

Abstract

Phishing is an identity-theft strategy in which consumers receive bogus emails from fraudulent accounts that claim to belong to a legitimate company, in an effort to steal the recipient’s sensitive information. This places many users’ privacy at risk, so researchers continue to work on identifying and improving detection tools. Classification is one of the machine learning methods that can be used to detect malicious emails among those received. Different classification algorithms, such as Naïve Bayes and Support Vector Machine (SVM), are discussed and compared in the course of this study. A new method that integrates supervised and unsupervised strategies has been developed to detect phishing emails. The research also contrasts the collection classes for manual and automatic emails. Sequences of terms are used to extract words that differentiate malicious from non-malicious communications. The accuracy of the different classifiers in predicting the class attribute has been compared. The SVM approach yields more reliable classification and misclassification rates for malicious emails than the Naïve Bayes method. To date, 98% precision has been achieved, and this could be increased further with a larger corpus of training data. This research aims to investigate whether email phishing has accelerated during the pandemic, and it highlights that phishing sensitivity is tied to the protocols utilised in this research. The key purpose is to set out a technique or algorithm for analysing mailbox contents in order to identify each email as phishing or genuine. Machine learning is a branch of Artificial Intelligence (AI) that uses data mining methods to recognise new or existing patterns (or features) in a data set, which are then used for classification. This study will discuss the advancement and types of phishing attacks.
It will examine the machine learning techniques and methods that are currently being utilised. The researcher will further analyse a framework for avoiding phishing and recommend email phishing detection methods that can be improved upon. Furthermore, the important role of human behaviour is highlighted, particularly in the context of working from home during the pandemic.
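
As a rough illustration of the kind of word-based classification the abstract describes, the following minimal sketch trains a multinomial Naïve Bayes classifier on a handful of emails and computes precision. It is not the authors' implementation: the toy emails, the labels, and the function names (`tokenize`, `train_naive_bayes`, `predict`, `precision`) are all hypothetical, and the model uses only the Python standard library with add-one smoothing.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()

def train_naive_bayes(emails, labels):
    """Collect per-class word counts for a multinomial Naive Bayes model."""
    classes = sorted(set(labels))
    word_counts = {c: Counter() for c in classes}
    for text, label in zip(emails, labels):
        word_counts[label].update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return {
        "classes": classes,
        "word_counts": word_counts,
        "doc_counts": Counter(labels),
        "n_docs": len(labels),
        "vocab": vocab,
    }

def predict(model, text):
    """Return the class with the highest log posterior for one email."""
    vocab_size = len(model["vocab"])
    best_class, best_score = None, -math.inf
    for c in model["classes"]:
        counts = model["word_counts"][c]
        total = sum(counts.values())
        # Log prior: fraction of training emails in this class.
        score = math.log(model["doc_counts"][c] / model["n_docs"])
        for tok in tokenize(text):
            if tok in model["vocab"]:  # words never seen in training are skipped
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((counts[tok] + 1) / (total + vocab_size))
        if score > best_score:
            best_class, best_score = c, score
    return best_class

def precision(y_true, y_pred, positive="phishing"):
    """Share of emails flagged as `positive` that really are `positive`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0

# Invented toy corpus, for illustration only.
train_emails = [
    "verify your account password now",
    "urgent update your bank details",
    "click here to confirm your password",
    "meeting agenda for monday",
    "lunch on friday with the team",
    "project report attached for review",
]
train_labels = ["phishing"] * 3 + ["legitimate"] * 3
model = train_naive_bayes(train_emails, train_labels)

test_emails = ["confirm your bank password", "team meeting on monday"]
test_labels = ["phishing", "legitimate"]
predictions = [predict(model, e) for e in test_emails]
print(predictions, precision(test_labels, predictions))
```

A corpus this small says nothing about real-world performance. The comparison with an SVM reported in the abstract would need a labelled email dataset and a separate SVM implementation, neither of which is sketched here.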

Author information

Authors and Affiliations

Northumbria University, London, UK
Yogeshvar Singh Parmar & Hamid Jahankhani

Corresponding author

Correspondence to Hamid Jahankhani.

Editor information

Editors and Affiliations

Hillary Rodham Clinton School of Law, Swansea University, Swansea, UK
Reza Montasari
Northumbria University, London, UK
Hamid Jahankhani

About this chapter

Cite this chapter

Parmar, Y.S., Jahankhani, H. (2021). Utilising Machine Learning Against Email Phishing to Detect Malicious Emails. In: Montasari, R., Jahankhani, H. (eds) Artificial Intelligence in Cyber Security: Impact and Implications. Advanced Sciences and Technologies for Security Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-88040-8_3

DOI: https://doi.org/10.1007/978-3-030-88040-8_3
Published: 01 January 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88039-2
Online ISBN: 978-3-030-88040-8
eBook Packages: Computer Science, Computer Science (R0)
