Abstract
Attackers carry out malicious activities by sending URLs to victims via e-mail, SMS, social network messages, and other channels. Recently, attackers have begun generating malicious URLs algorithmically, and they also use shortening or obfuscation services to bypass firewalls and other security barriers. Several machine learning methods have been proposed to distinguish malicious URLs from benign ones, but all of them are subject to classification errors. Moreover, maintaining a complete and up-to-date blacklist is impractical because of the large number of malicious URLs generated daily. Therefore, computing the security risk of a URL is more helpful than classifying it: a user who knows the security risk associated with an unfamiliar URL can make an informed decision about whether to use it. In this study, the problem of URL security risk computation is introduced, and two novel criteria for it are proposed. Based on these criteria, a security risk score can be estimated for each incoming URL. In the first criterion, the features extracted from a URL are divided, based on previously seen malicious and non-malicious URL instances, into two categories: those that increase the security risk and those that reduce it. In the second criterion, the security risk score of an unknown URL is estimated from its distances to the nearest known malicious and safe URLs. For both criteria, the corresponding formulations and algorithms are designed and described. Extensive empirical evaluations on various real datasets show the effectiveness of the proposed criteria in terms of malicious URL detection rate. Moreover, our experiments show that the proposed metrics significantly outperform previously proposed risk score criteria.
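The abstract does not state the formulations themselves, so the following is only a minimal sketch of the idea behind the second criterion, not the authors' algorithm. It assumes that each URL has already been mapped to a numeric feature vector, that Euclidean distance is used, and that the score is the ratio d_safe / (d_mal + d_safe); the names `distance` and `risk_score`, the ratio formula, and the toy vectors are all illustrative assumptions.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def risk_score(features, malicious, safe):
    """Return a score in [0, 1]; higher means closer to known malicious URLs.

    Assumed formula (not taken from the paper): d_safe / (d_mal + d_safe),
    where d_mal and d_safe are the distances from `features` to the nearest
    known malicious and nearest known safe instance, respectively.
    """
    d_mal = min(distance(features, m) for m in malicious)
    d_safe = min(distance(features, s) for s in safe)
    if d_mal + d_safe == 0:
        # The vector coincides with both a malicious and a safe instance.
        return 0.5
    return d_safe / (d_mal + d_safe)


# Toy, hypothetical feature vectors (for example, scaled length and digit ratio).
known_malicious = [[1.0, 1.0], [0.9, 0.8]]
known_safe = [[0.0, 0.0], [0.1, 0.2]]

score = risk_score([0.8, 0.9], known_malicious, known_safe)
print(f"risk score: {score:.3f}")
```

Under these assumptions the score approaches 1 as the unknown URL moves toward a known malicious instance and approaches 0 as it moves toward a known safe one, which gives the user a graded value instead of a hard class label.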
Availability of Data and Materials
This study uses publicly available datasets, which are referenced in the manuscript.
References
Aljabri M, Altamimi HS, Albelali SA, Maimunah AH, Alhuraib HT, Alotaibi NK, Salah K (2022) Detecting malicious URLs using machine learning techniques: review and research directions. IEEE Access
Bo W, Fang ZB, Wei LX, Cheng ZF, Hua ZX (2021) Malicious URLs detection based on a novel optimization algorithm. IEICE Trans Inf Syst 104(4):513–516
Chen Z, Liu Y, Chen C, Lu M, Zhang X (2021) Malicious url detection based on improved multilayer recurrent convolutional neural network model. Security and communication networks, 2021
Cherdantseva Y, Burnap P, Blyth A, Eden P, Jones K, Soulsby H, Stoddart K (2016) A review of cyber security risk assessment methods for SCADA systems. Comput Secur 56:1–27
Deypir M, Horri A (2018) Instance based security risk value estimation for Android applications. J Inf Security Appl 40:20–30
Ding C (2020) Automatic detection of malicious urls using fine-tuned classification model. In: 2020 5th International conference on information science, computer technology and transportation (ISCTT) (pp 302–320). IEEE
Deypir M (2019) Entropy-based security risk measurement for Android mobile applications. Soft Comput 23(16):7303–7319
Gates CS, Li N, Peng H, Sarma B, Qi Y, Potharaju R, Molloy I (2014) Generating summary risk scores for mobile applications. IEEE Trans Depend Secure Comput 11(3):238–251
Ghaleb FA, Alsaedi M, Saeed F, Ahmad J, Alasli M (2022) Cyber threat intelligence-based malicious URL detection model using ensemble learning. Sensors 22(9):3373
Google Web Risk, https://github.com/google/webrisk. Access date: 21 August 2023
Hajaj C, Hason N, Dvir A (2022) Less is more: Robust and novel features for malicious domain detection. Electronics 11(6):969
He S, Li B, Peng H, Xin J, Zhang E (2021) An effective cost-sensitive XGBoost method for malicious URLs detection in imbalanced dataset. IEEE Access 9:93089–93096
Hoffmann R, Kiedrowicz M, Stanik J (2016) Risk management system as the basic paradigm of the information security management system in an organization. In: MATEC web of conferences (vol 76, p 04010). EDP Sciences
Indrasiri PL, Halgamuge MN, Mohammad A (2021) Robust ensemble machine learning model for filtering phishing URLs: expandable random gradient stacked voting classifier (ERG-SVC). IEEE Access 9:150142–150161
Kim S, Kim J, Kang BB (2018) Malicious URL protection based on attackers’ habitual behavioral analysis. Comput Secur 77:790–806
Kumi S, Lim C, Lee SG (2021) Malicious url detection based on associative classification. Entropy 23(2):182
Kuyama M, Kakizaki Y, Sasaki R (2016) Method for detecting a malicious domain by using whois and dns features. In: Proceedings of the third international conference on digital security and forensics (DigitalSec2016), Kuala Lumpur, Malaysia, 6–8 September 2016
Landoll D (2021) The security risk assessment handbook: A complete guide for performing security risk assessments. CRC Press
Li T, Kou G, Peng Y (2020) Improving malicious URLs detection via feature engineering: linear and nonlinear space transformation methods. Inf Syst 91:101494
Liang Y, Wang Q, Xiong K, Zheng X, Yu Z, Zeng D (2021) Robust detection of malicious urls with self-paced wide and deep learning. IEEE Trans Depend Secure Comput 19(2):717–730
Lyu X, Ding Y, Yang SH (2019) Safety and security risk assessment in cyber-physical systems. IET Cyber Phys Syst Theory Appl 4(3):221–232
Ma J, Saul LK, Savage S, Voelker GM (2009) Identifying suspicious URLs: an application of large-scale online learning. In: Proceedings of the 26th annual international conference on machine learning (pp 681–688)
Madhubala R, Rajesh N, Shaheetha L, Arulkumar N (2022) Survey on malicious URL detection techniques. In: 2022 6th International conference on trends in electronics and informatics (ICOEI) (pp 778–781). IEEE
Malicious URL Detection using MLP, https://www.kaggle.com/code/ashisharya01/malicious-url-detection-using-mlp-99-6-accuracy/data?select=urldata.csv. Access date: 23 July 2022
Mamun MSI, Rathore MA, Lashkari AH, Stakhanova N, Ghorbani AA (2016) Detecting malicious urls using lexical analysis. In: International conference on network and system security (pp 467–482). Springer, Cham
Messabi KA, Aldwairi M, Yousif AA, Thoban A, Belqasmi F (2018) Malware detection using dns records and domain name features. In: Proceedings of the 2nd international conference on future networks and distributed systems (pp 1–7)
Mondal DK, Singh BC, Hu H, Biswas S, Alom Z, Azim MA (2021) SeizeMaliciousURL: a novel learning approach to detect malicious URLs. J Inf Security Appl 62:102967
Nurse JR, Creese S, De Roure D (2017) Security risk assessment in internet of things systems. IT Professional 19(5):20–26
Palaniappan G, Sangeetha S, Rajendran B, Goyal S, Bindhumadhava BS (2020) Malicious domain detection using machine learning on domain name features, host-based features and web-based features. Proc Comput Sci 171:654–661
Patgiri R, Katari H, Kumar R, Sharma D (2019) Empirical study on malicious URL detection using machine learning. In: International conference on distributed computing and internet technology (pp 380–388). Springer, Cham
Patgiri R, Biswas A, Nayak S (2021) deepbf: Malicious url detection using learned bloom filter and evolutionary deep learning. arXiv preprint arXiv:2103.12544
Patil DR, Patil JB (2018) Malicious URLs detection using decision tree classifiers and majority voting technique. Cybernet Inf Technol 18(1):11–29
Peltier TR (2016) Information security policies, procedures, and standards: guidelines for effective information security management. CRC Press
Prakash P, Kumar M, Kompella RR, Gupta M (2010) Phishnet: predictive blacklisting to detect phishing attacks. In: 2010 Proceedings IEEE INFOCOM (pp 1–5). IEEE
Raja AS, Vinodini R, Kavitha A (2021) Lexical features based malicious URL detection using machine learning techniques. Mater Today Proc 47:163–166
Raja AS, Pradeepa G, Arulkumar N (2022) Mudhr: Malicious URL detection using heuristic rules based approach. In: AIP conference proceedings (vol 2393, No 1, p 020176). AIP Publishing LLC
Rakesh R, Muthuraijkumar S, Sairamesh L, Vijayalakmi M, Kannan A (2015) Detection of URL based attacks using reduced feature set and modified C4.5 algorithm. Adv Nat Appl Sci 9:304–311
van Rijswijk-Deij R, Jonker M, Sperotto A, Pras A (2016) A high-performance, scalable infrastructure for large-scale active DNS measurements. IEEE J Sel Areas Commun 34(6):1877–1888
Sahoo D, Liu C, Hoi SC (2017) Malicious URL detection using machine learning: A survey. arXiv preprint arXiv:1701.07179
Shameli-Sendi A, Aghababaei-Barzegar R, Cheriet M (2016) Taxonomy of information security risk assessment (ISRA). Comput Secur 57:14–30
URL Risk Levels, https://knowledge.broadcom.com/external/article/175589/url-risk-levels.html. Access date: 26 March 2023
Vinayakumar R, Soman KP, Poornachandran P (2018) Evaluating deep learning approaches to characterize and classify malicious URL’s. J Intell Fuzzy Syst 34(3):1333–1343
Vinayakumar R, Alazab M, Soman KP, Poornachandran P, Venkatraman S (2019) Robust intelligent malware detection using deep learning. IEEE Access 7:46717–46738
Yuan J, Liu Y, Yu L (2021a) A novel approach for malicious url detection based on the joint model. Security and communication networks, 2021
Yuan J, Chen G, Tian S, Pei X (2021b) Malicious URL detection based on a parallel neural joint model. IEEE Access 9:9464–9472
Acknowledgements
Not applicable.
Funding
This research has no funding.
Author information
Contributions
The authors confirm their contributions to the paper as follows: main idea and algorithm design: MD and TZ; implementation of the algorithm: MD; analysis and interpretation of results: TZ and MD; draft manuscript preparation: TZ. Both authors reviewed the results and approved the final version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Deypir, M., Zoughi, T. Novel Security Metrics for Identifying Risky Unified Resource Locators (URLs). Iran J Sci Technol Trans Electr Eng (2024). https://doi.org/10.1007/s40998-023-00690-x
DOI: https://doi.org/10.1007/s40998-023-00690-x