Phishing Website Detection: Forest by Penalizing Attributes Algorithm and Its Enhanced Variations

Alsariera, Yazan Ahmad; Elijah, Adeyemo Victor; Balogun, Abdullateef O.

doi:10.1007/s13369-020-04802-1

Research Article-Computer Engineering and Computer Science
Published: 21 July 2020

Volume 45, pages 10459–10470, (2020)
Arabian Journal for Science and Engineering

Yazan Ahmad Alsariera¹ (ORCID: orcid.org/0000-0003-1359-6336),
Adeyemo Victor Elijah² &
Abdullateef O. Balogun³,⁴

Abstract

Phishing is damaging because attackers subtly steal sensitive information from users for inappropriate or unauthorized use. Blacklisting of websites has proved ineffective at curbing phishing, as phishing websites are deployed at a rapidly increasing rate and are often short-lived. Hence, machine learning (ML) methods are seen as viable measures and are used to develop deployable models that can detect a phishing website. ML methods are fast gaining attention and acceptance in detecting phishing websites, as they can cope with the dynamism of phishing websites and attackers. However, ML methods still suffer from shortcomings: low detection accuracy, high false alarm rate (FAR), and induced bias in the developed ML solutions. In addition, given the evolving nature of phishing attacks, there is a continuing need for novel and effective ML-based methods for detecting phishing websites. This study proposes three meta-learner models based on the Forest by Penalizing Attributes (ForestPA) algorithm. ForestPA uses a weight assignment and weight increment strategy to build highly efficient decision trees by exploiting the strength of all non-class attributes in a given dataset. In the experimental results, the proposed meta-learners (ForestPA-PWDM, Bagged-ForestPA-PWDM, and Adab-ForestPA-PWDM) are highly efficient, with a minimum accuracy of 96.26%, a FAR of 0.004, and an ROC value of 0.994. Further, given the superiority of the proposed models over other existing methods, we recommend the development and adoption of meta-learners based on ForestPA for phishing website detection and for detecting other cybersecurity attacks.
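
The abstract reports results as accuracy and false alarm rate (FAR) without defining them. As a minimal sketch, the Python below computes both from a list of true and predicted labels. It assumes the common definitions, accuracy = (TP + TN) / total and FAR = FP / (FP + TN), with phishing encoded as the positive class (label 1). The function names, the label encoding, and the toy data are illustrative assumptions and are not taken from the article.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN), treating `positive` as the phishing label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # phishing site correctly flagged
            else:
                fp += 1  # legitimate site flagged as phishing (false alarm)
        else:
            if truth == positive:
                fn += 1  # phishing site missed
            else:
                tn += 1  # legitimate site correctly passed
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all websites classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def false_alarm_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of legitimate websites wrongly flagged: FP / (FP + TN)."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    if fp + tn == 0:
        raise ValueError("no legitimate (negative) samples in y_true")
    return fp / (fp + tn)


# Toy data (hypothetical): 3 phishing sites (1) and 5 legitimate sites (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"FAR      = {false_alarm_rate(y_true, y_pred):.3f}")
```

Under these definitions, a FAR of 0.004 would mean that roughly 4 in every 1,000 legitimate websites are wrongly flagged as phishing.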

Author information

Authors and Affiliations

Department of Computer Science, Faculty of Science, Northern Border University, 73222, Arar, Kingdom of Saudi Arabia
Yazan Ahmad Alsariera
School of Built Environment, Engineering and Computing, Leeds Beckett University, Headingley Campus, Leeds, LS6 3QS, United Kingdom
Adeyemo Victor Elijah
Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia
Abdullateef O. Balogun
Department of Computer Science, University of Ilorin, PMB 1515, Ilorin, Nigeria
Abdullateef O. Balogun

Corresponding author

Correspondence to Yazan Ahmad Alsariera.

About this article

Cite this article

Alsariera, Y.A., Elijah, A.V. & Balogun, A.O. Phishing Website Detection: Forest by Penalizing Attributes Algorithm and Its Enhanced Variations. Arab J Sci Eng 45, 10459–10470 (2020). https://doi.org/10.1007/s13369-020-04802-1

Received: 27 March 2020
Accepted: 14 July 2020
Published: 21 July 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s13369-020-04802-1
