A set of features to detect web security threats

Canfora, Gerardo; Visaggio, Corrado Aaron

doi:10.1007/s11416-016-0266-2

Original Paper
Published: 29 January 2016

Volume 12, pages 243–261 (2016)

Journal of Computer Virology and Hacking Techniques

Gerardo Canfora¹ &
Corrado Aaron Visaggio¹

Abstract

The growing number of malicious websites, and of systems that distribute malware through websites, makes it urgent to adopt effective techniques for the timely detection of web security threats. Current mechanisms may exhibit some limitations, mainly the amount of resources they require and a low true positive rate for zero-day attacks. In this paper, we propose and validate a set of features, extracted from the content and the structure of webpages, that can be used as indicators of web security threats. The features are used to build a predictor, based on five machine learning algorithms, which is applied to classify unknown web applications. The experiments showed that the proposed set of features correctly classifies malicious websites with high precision (0.84 in the best case) and high recall (0.89 in the best case). The classifiers also prove successful against zero-day attacks.
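As a rough illustration of the pipeline the abstract describes, the sketch below extracts a few structural counts from a page and computes precision and recall for the malicious class. The abstract does not list the paper's actual features, so the tag counts and page length used here, along with the names `StructureCounter`, `extract_features`, and `precision_recall`, are hypothetical stand-ins and not the authors' feature set or method.

```python
from html.parser import HTMLParser


class StructureCounter(HTMLParser):
    """Counts a few structural elements of a page.

    The chosen tags are hypothetical examples, not the paper's features.
    """

    def __init__(self):
        super().__init__()
        self.counts = {"script": 0, "iframe": 0, "a": 0}

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased from HTMLParser.
        if tag in self.counts:
            self.counts[tag] += 1


def extract_features(html):
    """Return a feature dict for one page: tag counts plus page length."""
    parser = StructureCounter()
    parser.feed(html)
    parser.close()
    features = dict(parser.counts)
    features["length"] = len(html)
    return features


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (1 = malicious)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    page = '<html><script>x</script><iframe src="a"></iframe></html>'
    print(extract_features(page))
    # Labels and predictions below are made-up toy values.
    print(precision_recall([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]))
```

Precision is the fraction of pages flagged as malicious that really are malicious, and recall is the fraction of malicious pages that get flagged. The feature dictionaries produced this way could be fed to any standard classifier.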

Author information

Authors and Affiliations

Department of Engineering, University of Sannio, Viale Traiano, 1, 82100, Benevento, Italy
Gerardo Canfora & Corrado Aaron Visaggio

Corresponding author

Correspondence to Corrado Aaron Visaggio.

About this article

Cite this article

Canfora, G., Visaggio, C.A. A set of features to detect web security threats. J Comput Virol Hack Tech 12, 243–261 (2016). https://doi.org/10.1007/s11416-016-0266-2

Received: 18 March 2015
Accepted: 14 October 2015
Published: 29 January 2016
Issue Date: November 2016
DOI: https://doi.org/10.1007/s11416-016-0266-2
