
Improving the Feature Section Method Based on Genetic Algorithm to Increase the Efficiency of Detecting Phishing Websites

Automatic Control and Computer Sciences

Abstract

Phishing is the fraudulent acquisition of users' confidential information, such as usernames, passwords, and credit card details. The internet has become an essential service, and many everyday needs are now met online. Phishers exploit this by deceiving users in various ways, such as social engineering, to steal their confidential information. Designing a method to counter phishing is therefore essential. However, a phishing detection system that achieves high accuracy but ignores the time required for detection will cause lengthy delays. In this research, phishing, anti-phishing, and feature selection methods are introduced. Next, a data set containing 30 features of phishing and legitimate websites was prepared. We then present an effective feature selection method based on a Genetic Algorithm, which reduces phishing detection time while maintaining accuracy. The results show that the proposed method achieves a detection time of about 0.46 s and an accuracy of 96.04%, which compares favorably with existing research.
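The general idea of Genetic Algorithm feature selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The synthetic data, the simple vote classifier, the size penalty, and all parameter values below are assumptions chosen only to keep the example self-contained. Each individual is a 30-bit mask over the features, and the fitness rewards accuracy while penalizing the number of selected features, since fewer features generally means less work at detection time.

```python
import random

N_FEATURES = 30  # feature count stated in the abstract


def make_synthetic_data(n_rows, informative, rng):
    """Hypothetical stand-in data: features and labels take values in {-1, 1}."""
    rows, labels = [], []
    for _ in range(n_rows):
        label = rng.choice([-1, 1])
        row = []
        for j in range(N_FEATURES):
            if j in informative:
                # An informative feature agrees with the label 85% of the time.
                row.append(label if rng.random() < 0.85 else -label)
            else:
                row.append(rng.choice([-1, 1]))  # pure noise
        rows.append(row)
        labels.append(label)
    return rows, labels


def accuracy(mask, rows, labels):
    """Accuracy of a vote classifier (an assumed stand-in) on the selected features."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0
    correct = 0
    for row, label in zip(rows, labels):
        pred = 1 if sum(row[j] for j in selected) >= 0 else -1
        correct += pred == label
    return correct / len(rows)


def fitness(mask, rows, labels, penalty=0.002):
    """Accuracy minus an assumed small cost per selected feature."""
    return accuracy(mask, rows, labels) - penalty * sum(mask)


def select_features(rows, labels, rng, pop_size=20, generations=25,
                    mutation_rate=0.03):
    # Seed with the full feature set so the result is never worse than it.
    population = [[1] * N_FEATURES]
    population += [[rng.randint(0, 1) for _ in range(N_FEATURES)]
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        scored = sorted(((fitness(m, rows, labels), m) for m in population),
                        key=lambda t: t[0], reverse=True)
        next_pop = [scored[0][1][:]]  # elitism: carry the best mask forward
        while len(next_pop) < pop_size:
            # Tournament selection of two parents, three candidates each.
            p1 = max(rng.sample(scored, 3), key=lambda t: t[0])[1]
            p2 = max(rng.sample(scored, 3), key=lambda t: t[0])[1]
            cut = rng.randrange(1, N_FEATURES)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]
            next_pop.append(child)
        population = next_pop
    return max(population, key=lambda m: fitness(m, rows, labels))


rng = random.Random(0)
rows, labels = make_synthetic_data(200, {0, 3, 7, 12, 20}, rng)
best = select_features(rows, labels, rng)
print(sum(best), "features kept, accuracy", accuracy(best, rows, labels))
```

For brevity the sketch scores masks on the same rows it searches over. A real evaluation would measure accuracy and detection time on held-out data and would use a trained classifier in place of the vote rule.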



Author information

Corresponding author

Correspondence to Ali Reza Yari.

Ethics declarations

The authors declare that they have no conflicts of interest.

About this article

Cite this article

Mohamad Reza Davoudi, Ali Reza Yari. Improving the Feature Section Method Based on Genetic Algorithm to Increase the Efficiency of Detecting Phishing Websites. Aut. Control Comp. Sci. 57, 213–221 (2023). https://doi.org/10.3103/S0146411623030045

  • DOI: https://doi.org/10.3103/S0146411623030045
