Abstract
Phishing is highly damaging: attackers subtly steal sensitive information from users for unauthorized use. Blacklisting of websites has proved ineffective at curbing phishing, because phishing websites are deployed in rapidly growing numbers and are often short-lived. Machine learning (ML) methods are therefore seen as a viable alternative and are used to develop deployable models that can detect phishing websites. ML methods are fast gaining attention and acceptance for this task because they can cope with the dynamism of phishing websites and attackers. However, existing ML solutions still suffer from low detection accuracy, a high false alarm rate (FAR), and induced bias. In addition, given the evolving nature of phishing attacks, there is a continuing need for novel and effective ML-based methods for detecting phishing websites. This study proposes three meta-learner models based on the Forest Penalizing Attributes (ForestPA) algorithm. ForestPA uses a weight assignment and weight increment strategy to build highly efficient decision trees by exploiting the strength of all non-class attributes in a given dataset. In the experiments, the proposed meta-learners (ForestPA-PWDM, Bagged-ForestPA-PWDM, and Adab-ForestPA-PWDM) are highly efficient, with a minimum accuracy of 96.26%, a FAR of 0.004, and an ROC value of 0.994. Given the superiority of the proposed models over existing methods, we recommend the development and adoption of ForestPA-based meta-learners for detecting phishing websites and other cybersecurity attacks.
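To make the reported metrics concrete, the following minimal Python sketch shows how accuracy and FAR can be derived from confusion-matrix counts. It assumes FAR is the false positive rate, FP / (FP + TN), which is the usual definition but is not spelled out in the abstract. The function names and the counts in the usage lines are hypothetical illustrations, not results from the study.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all websites that were classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def false_alarm_rate(fp: int, tn: int) -> float:
    """Fraction of legitimate websites wrongly flagged as phishing.

    Assumption: FAR is taken here as the false positive rate.
    """
    negatives = fp + tn
    if negatives == 0:
        raise ValueError("no legitimate (negative) samples")
    return fp / negatives


# Hypothetical counts for illustration only (not data from the paper):
# 500 phishing sites (480 detected, 20 missed) and
# 500 legitimate sites (495 passed, 5 wrongly flagged).
tp, fn, fp, tn = 480, 20, 5, 495
print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"FAR      = {false_alarm_rate(fp, tn):.3f}")
```

A low FAR matters in this setting because every false alarm blocks or flags a legitimate website, so a detector is usually judged on accuracy and FAR together rather than on accuracy alone.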
Cite this article
Alsariera, Y.A., Elijah, A.V. & Balogun, A.O. Phishing Website Detection: Forest by Penalizing Attributes Algorithm and Its Enhanced Variations. Arab J Sci Eng 45, 10459–10470 (2020). https://doi.org/10.1007/s13369-020-04802-1
DOI: https://doi.org/10.1007/s13369-020-04802-1