
Phishing Website Detection: Forest by Penalizing Attributes Algorithm and Its Enhanced Variations

  • Research Article - Computer Engineering and Computer Science
  • Arabian Journal for Science and Engineering

Abstract

Phishing is highly damaging: attackers covertly steal sensitive information from users for unauthorized use. Blacklisting websites has proved ineffective at curbing phishing because phishing websites are deployed in rapidly growing numbers and are often short-lived. Hence, machine learning (ML) methods are seen as viable measures and are used to develop deployable models that can detect phishing websites. ML methods are fast gaining attention and acceptance because they can cope with the dynamism of phishing websites and attackers. However, ML methods still suffer from shortcomings: low detection accuracy, a high false alarm rate (FAR), and induced bias in the developed solutions. In addition, given the evolving nature of phishing attacks, there is a continuing need for novel and effective ML-based methods for detecting phishing websites. This study proposes three meta-learner models based on the Forest by Penalizing Attributes (ForestPA) algorithm. ForestPA uses a weight assignment and weight increment strategy to build highly efficient decision trees by exploiting the strength of all non-class attributes in a given dataset. In the experiments, the proposed meta-learners (ForestPA-PWDM, Bagged-ForestPA-PWDM, and Adab-ForestPA-PWDM) were highly efficient, with a minimum accuracy of 96.26%, a FAR of 0.004, and an ROC value of 0.994. Given the superiority of the proposed models over existing methods, we recommend the development and adoption of ForestPA-based meta-learners for detecting phishing websites and other cybersecurity attacks.
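The weight assignment and weight increment idea can be illustrated with a small sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the exponential weight ranges per tree level, the increment rule, the function names, and the attribute names are hypothetical and should not be read as the authors' implementation. The sketch only shows the general pattern: attributes tested in the latest tree receive a penalizing weight below 1 (lower when tested nearer the root), while attributes that were not tested drift back toward 1 so that later trees are encouraged to use them.

```python
import math
import random


def level_weight_range(level):
    """Illustrative (low, high) weight range for a tree level, root = 1.

    Assumption: ranges are non-overlapping and rise with depth, so attributes
    tested near the root are penalized most.
    """
    high = math.exp(-1.0 / level)
    low = 0.0 if level == 1 else math.exp(-1.0 / (level - 1))
    return low, high


def update_weights(weights, tested_levels, tree_height, rng):
    """Return new attribute weights after one tree has been built.

    weights: dict mapping attribute -> current weight in [0, 1]
    tested_levels: dict mapping attribute -> shallowest level at which the
        latest tree tested it (attributes absent here were not tested)
    tree_height: height of the latest tree (controls how fast weights recover)
    rng: random.Random instance used to draw the penalizing weights
    """
    updated = {}
    for attr, weight in weights.items():
        if attr in tested_levels:
            # Weight assignment: draw a penalty from the range for that level.
            low, high = level_weight_range(tested_levels[attr])
            updated[attr] = rng.uniform(low, high)
        else:
            # Weight increment: move an unused attribute gradually back to 1.
            updated[attr] = weight + (1.0 - weight) / (tree_height + 1)
    return updated


# Hypothetical phishing attributes; weights start at 1 before any tree is built.
rng = random.Random(0)
weights = {"url_length": 1.0, "has_ip_address": 0.2, "uses_https": 1.0}
new_weights = update_weights(
    weights,
    tested_levels={"url_length": 1, "uses_https": 2},
    tree_height=3,
    rng=rng,
)
```

In a forest built this way, each attribute's split merit would be scaled by its current weight before the next tree is grown, which pushes successive trees toward different attributes and so increases diversity among the trees.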



Author information

Corresponding author

Correspondence to Yazan Ahmad Alsariera.

Cite this article

Alsariera, Y.A., Elijah, A.V. & Balogun, A.O. Phishing Website Detection: Forest by Penalizing Attributes Algorithm and Its Enhanced Variations. Arab J Sci Eng 45, 10459–10470 (2020). https://doi.org/10.1007/s13369-020-04802-1

  • DOI: https://doi.org/10.1007/s13369-020-04802-1
