
BINSPECT: Holistic Analysis and Detection of Malicious Web Pages

  • Conference paper
Security and Privacy in Communication Networks (SecureComm 2012)

Abstract

Malicious web pages are among the major security threats on the Web. Most existing techniques for detecting them focus on specific attacks. Attacks, however, are becoming more complex, as attackers blend techniques to evade existing countermeasures. In this paper, we present BINSPECT, a holistic yet lightweight approach that combines static analysis with minimalistic emulation and applies supervised learning to detect malicious web pages involved in drive-by download, phishing, injection, and malware distribution. It introduces new features that effectively discriminate between malicious and benign web pages. In a large-scale experimental evaluation, BINSPECT achieved above 97% accuracy with low false signals. Moreover, BINSPECT takes 3 to 5 seconds to analyze a single web page, suggesting that the approach is practical for real-life deployment.
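As a rough illustration of what static analysis of a page can mean in practice, the sketch below derives a handful of numeric features from a URL and its HTML source, which could then be passed to a supervised classifier. The feature names, the `TagCounter` helper, and the `extract_features` function are hypothetical and chosen for illustration only; the abstract does not list the features BINSPECT actually uses.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse
import re


class TagCounter(HTMLParser):
    """Counts start tags of a few element types in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = {"script": 0, "iframe": 0, "a": 0}

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names before calling this method.
        if tag in self.counts:
            self.counts[tag] += 1


def extract_features(url, html):
    """Return illustrative (hypothetical) static features for one page."""
    host = urlparse(url).hostname or ""
    counter = TagCounter()
    counter.feed(html)
    return {
        "url_length": len(url),
        "host_dot_count": host.count("."),
        # True when the host looks like a dotted-quad IPv4 address.
        "host_is_ip": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "script_count": counter.counts["script"],
        "iframe_count": counter.counts["iframe"],
        "link_count": counter.counts["a"],
    }


if __name__ == "__main__":
    page = "<html><script></script><iframe src='x'></iframe><a href='#'>x</a></html>"
    print(extract_features("http://192.168.0.1/login.php", page))
```

A feature dictionary of this kind would be produced for every labeled page, and the resulting vectors used to train and evaluate a classifier. The minimalistic emulation mentioned in the abstract is not covered by this sketch.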




Copyright information

© 2013 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Eshete, B., Villafiorita, A., Weldemariam, K. (2013). BINSPECT: Holistic Analysis and Detection of Malicious Web Pages. In: Keromytis, A.D., Di Pietro, R. (eds) Security and Privacy in Communication Networks. SecureComm 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 106. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36883-7_10


  • DOI: https://doi.org/10.1007/978-3-642-36883-7_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36882-0

  • Online ISBN: 978-3-642-36883-7

  • eBook Packages: Computer Science, Computer Science (R0)
