Abstract
Phishing refers to obtaining unauthorized access to users' confidential information, such as usernames, passwords, and credit card details. The internet has become essential, and most everyday services are now delivered through it. Phishers exploit this by deceiving users in various ways, such as social engineering, to steal their confidential information, so an effective countermeasure is essential. However, a phishing detection system that optimizes only for accuracy and ignores the time required for detection can introduce lengthy delays. In this research, phishing, anti-phishing, and feature selection methods are introduced. Next, a data set containing 30 features of phishing and legitimate websites is prepared. We then present a feature selection method based on a Genetic Algorithm that reduces phishing detection time while maintaining accuracy. The results show that the proposed method achieves a detection time of about 0.46 s and an accuracy of 96.04%, which improves on existing research.
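To make the idea of Genetic Algorithm based feature selection concrete, the following is a minimal sketch, not the authors' implementation. Each individual is a 30-bit mask, one bit per feature, matching the size of the data set described above. The fitness function, the set of "informative" feature indices, and all parameter values (population size, mutation rate, number of generations) are hypothetical stand-ins; a real pipeline would instead score each mask by training a classifier on the selected columns and measuring its accuracy and detection time.

```python
import random

N_FEATURES = 30  # the data set in the abstract has 30 features
# Hypothetical: indices assumed to be informative, used only by the toy fitness.
INFORMATIVE = frozenset({0, 3, 7, 12, 21})


def fitness(mask):
    """Toy fitness: reward informative features, penalise mask size.

    Assumption: stands in for classifier accuracy combined with a cost
    for using more features (fewer features means faster detection).
    """
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


def tournament(population, rng, k=3):
    """Return the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)


def crossover(a, b, rng):
    """Single-point crossover of two bit masks."""
    point = rng.randrange(1, N_FEATURES)
    return a[:point] + b[point:]


def mutate(mask, rng, rate=0.02):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]


def select_features(pop_size=40, generations=60, seed=0):
    """Run the GA and return the sorted indices of the selected features."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(population, key=fitness)  # elitism: keep the best mask
        children = [
            mutate(crossover(tournament(population, rng),
                             tournament(population, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = [elite] + children
    best = max(population, key=fitness)
    return [i for i, bit in enumerate(best) if bit]


if __name__ == "__main__":
    print(select_features())
```

The size penalty in the toy fitness mirrors the trade-off the abstract describes: a smaller feature subset is preferred as long as it does not cost accuracy. Elitism is one common design choice that keeps the best mask from being lost between generations; the paper may use a different selection or replacement scheme.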
Ethics declarations
The authors declare that they have no conflicts of interest.
About this article
Cite this article
Mohamad Reza Davoudi, Ali Reza Yari. Improving the Feature Section Method Based on Genetic Algorithm to Increase the Efficiency of Detecting Phishing Websites. Aut. Control Comp. Sci. 57, 213–221 (2023). https://doi.org/10.3103/S0146411623030045
DOI: https://doi.org/10.3103/S0146411623030045