URL based phishing attack detection using BiLSTM-gated highway attention block convolutional neural network

Nanda, Manika; Goel, Shivani

doi:10.1007/s11042-023-17993-0

Published: 31 January 2024

Multimedia Tools and Applications

Abstract

Phishing is an attack that attempts to replicate the official websites of organizations such as government agencies, financial institutions, e-commerce platforms, and banks. These fraudulent websites aim to obtain sensitive information from users, such as credit card numbers, email addresses, passwords, and personal identities. In response to the increasing number of phishing attacks, several anti-phishing strategies have been developed. However, existing techniques often fail to extract the most crucial features, leading to potential misclassification. Additionally, the complex algorithms employed result in high response times. To address these challenges, this paper proposes a novel approach called Bidirectional Long Short-Term Memory based Gated Highway Attention block Convolutional Neural Network (BiLSTM-GHA-CNN) for detecting phishing URLs. The BiLSTM captures contextual features, while the CNN extracts salient features. The integration of the highway network into the BiLSTM-CNN architecture enables the capture of significant features with rapid convergence. Furthermore, a gating mechanism is employed to weigh the output features of the CNN and BiLSTM. Five datasets were assembled from diverse sources such as PhishTank and OpenPhish for experimentation. The results demonstrate that BiLSTM-GHA-CNN achieves superior detection accuracy, precision, recall, and F1-score compared to state-of-the-art techniques. Moreover, the proposed system reduces the response time to 12.46 ms.
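The abstract does not give the equations for the highway network or the gating mechanism, so the following is only a minimal sketch under stated assumptions: it uses the standard highway-layer formulation, y = T(x) * H(x) + (1 - T(x)) * x, and a generic sigmoid gate that mixes two feature vectors of equal size. The function names, the random weights, and the two small input vectors are hypothetical stand-ins for the CNN and BiLSTM outputs, not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, x):
    """Return W @ x + b for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def highway(x, w_h, b_h, w_t, b_t):
    """Standard highway layer (assumed form): y = T(x)*H(x) + (1-T(x))*x."""
    h = [math.tanh(v) for v in affine(w_h, b_h, x)]  # candidate transform H(x)
    t = [sigmoid(v) for v in affine(w_t, b_t, x)]    # transform gate T(x)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


def gated_fusion(cnn_feat, lstm_feat, w_g, b_g):
    """Weigh two same-sized feature vectors with a sigmoid gate (assumed form)."""
    gate = [sigmoid(v) for v in affine(w_g, b_g, cnn_feat + lstm_feat)]
    return [g * c + (1.0 - g) * l for g, c, l in zip(gate, cnn_feat, lstm_feat)]


def random_matrix(rows, cols, rng):
    """Hypothetical untrained weights, used only to make the sketch runnable."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 4
cnn_feat = [0.2, -0.1, 0.7, 0.0]   # stand-in for CNN output features
lstm_feat = [0.5, 0.3, -0.4, 0.9]  # stand-in for BiLSTM output features

# Gate the two feature vectors, then pass the result through a highway layer.
fused = gated_fusion(cnn_feat, lstm_feat,
                     random_matrix(dim, 2 * dim, rng), [0.0] * dim)
out = highway(fused,
              random_matrix(dim, dim, rng), [0.0] * dim,
              random_matrix(dim, dim, rng), [-1.0] * dim)
print(out)
```

In this sketch, each fused value is a convex combination of the corresponding CNN and BiLSTM values, and a strongly negative transform-gate bias makes the highway layer pass its input through almost unchanged. That pass-through path is the property usually credited with easing optimization in highway networks.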

Data Availability

Web page Phishing Detection dataset: https://www.kaggle.com/datasets/shashwatwork/web-page-phishing-detection-dataset

PhishTank: https://www.phishtank.com/

OpenPhish: https://www.openphish.com/

Banking website: https://www.similarweb.com/top-websites/category/finance/banking-credit-and-lending/

Common Crawl: https://www.domcop.com/top-10-million-websites

Abbreviations

ML :: Machine Learning
BiLSTM :: Bidirectional Long Short-Term Memory
GHA :: Gated Highway Attention
CNN :: Convolutional Neural Network
KNN :: K-Nearest Neighbors
SVM :: Support Vector Machine
MFET :: Multiple Feature Extraction Technique
AWT :: Automated Whitelist Technique
MSA :: Multi-head Self-Attention
GAN :: Generative Adversarial Network
NIOSELM :: Non-Inverse Matrix Extreme Learning Machine
ADASYN :: Adaptive Synthetic Sampling
DAE :: Denoising Auto-Encoder
DNN :: Deep Neural Network
DTOF-ANN :: Decision Tree and Optimal Features based Artificial Neural Network
TF-IDF :: Term Frequency-Inverse Document Frequency
HABCNN :: Highway Attention Block CNN
ABM :: Attention Block Module
RNN :: Recurrent Neural Network
TPR :: True Positive Rate
FPR :: False Positive Rate
TNR :: True Negative Rate
FNR :: False Negative Rate
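
The four rate abbreviations above, together with the accuracy, precision, recall, and F1-score named in the abstract, follow the usual confusion-matrix definitions. A minimal sketch is given below, assuming phishing is the positive class. The function name and the counts are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute standard confusion-matrix metrics from raw counts."""
    tpr = tp / (tp + fn)        # True Positive Rate (recall)
    fpr = fp / (fp + tn)        # False Positive Rate
    tnr = tn / (tn + fp)        # True Negative Rate
    fnr = fn / (fn + tp)        # False Negative Rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * tpr / (precision + tpr)
    return {
        "TPR": tpr, "FPR": fpr, "TNR": tnr, "FNR": fnr,
        "precision": precision, "accuracy": accuracy, "F1": f1,
    }


# Hypothetical counts for illustration only.
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

By construction, TPR + FNR = 1 and TNR + FPR = 1, so reporting one rate of each pair determines the other.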

Funding

There is no funding for this study.

Author information

Authors and Affiliations

Bennett University, Greater Noida, UP, India
Manika Nanda
SR University, Warangal, India
Shivani Goel

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Manika Nanda and Shivani Goel. The first draft of the manuscript was written by Manika Nanda, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Conceptualization: Manika Nanda, Pankaj Sharma; Methodology: Manika Nanda; Formal analysis and investigation: Manika Nanda, Shivani Goel; Writing—original draft preparation: Manika Nanda, Pankaj Sharma; Writing—review and editing: Shivani Goel, Pankaj Sharma; Supervision: Shivani Goel.

Corresponding author

Correspondence to Manika Nanda.

Ethics declarations

Ethical approval

This article does not contain any studies with human participants and/or animals performed by any of the authors.

Informed consent

There is no informed consent for this study.

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Nanda, M., Goel, S. URL based phishing attack detection using BiLSTM-gated highway attention block convolutional neural network. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-023-17993-0

Received: 16 November 2022
Revised: 21 December 2023
Accepted: 25 December 2023
Published: 31 January 2024
DOI: https://doi.org/10.1007/s11042-023-17993-0
