WYSISNWIV: What You Scan Is Not What I Visit

Yang, Qilang; Damopoulos, Dimitrios; Portokalidis, Georgios

doi:10.1007/978-3-319-26362-5_15

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 9404)

Included in the following conference series:

International Symposium on Recent Advances in Intrusion Detection

Abstract

A variety of attacks, including remote-code execution exploits, malware, and phishing, are delivered to users over the web. Users are lured to malicious websites in various ways, including through spam delivered over email and instant messages, and by links injected in search engines and popular benign websites. In response to such attacks, many initiatives, such as Google’s Safe Browsing, are trying to make the web a safer place by scanning URLs to automatically detect and blacklist malicious pages. Such blacklists are then used to block dangerous content, take down domains hosting malware, and warn users who have clicked on suspicious links. However, they are only useful when scanners and browsers address the web the same way. This paper presents a study that exposes differences in how browsers and scanners parse URLs. These differences leave users vulnerable to malicious web content, because the same URL leads the browser to one page, while the scanner follows the URL to scan another page. We experimentally test all major browsers and URL scanners, as well as various applications that parse URLs, and discover multiple discrepancies. In particular, we discover that pairing Firefox with the blacklist produced by Google’s Safe Browsing leaves Firefox users exposed to malicious content hosted under URLs that include the backslash character. The problem is a general one and affects various applications and URL scanners. Even though the solution is technically straightforward, it requires that multiple parties follow the same standard when parsing URLs. Currently, the standard followed by an application seems to be unconsciously dictated by the URL parser implementation it uses, while most browsers have strayed from the URL RFC.
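The kind of discrepancy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' tooling and does not reproduce the behaviour of any specific browser or scanner: the two normalisation functions, the example URL, and the one-entry blacklist are all hypothetical. One function treats a backslash in the path as literal data, the other treats it as a path separator, so the same input string resolves to two different pages.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def literal_backslash(url: str) -> str:
    """Hypothetical parser A: a backslash is ordinary data and gets percent-encoded."""
    parts = urlsplit(url)
    # '/' and '%' stay untouched; a backslash in the path becomes %5C.
    return urlunsplit(parts._replace(path=quote(parts.path, safe="/%")))


def backslash_as_slash(url: str) -> str:
    """Hypothetical parser B: a backslash is treated as a path separator.

    Simplification: every backslash in the string is rewritten,
    including any that might appear in the query or fragment.
    """
    return literal_backslash(url.replace("\\", "/"))


# Illustrative URL only; example.com is a reserved documentation domain.
url = "http://example.com/files\\payload.html"

scanned = literal_backslash(url)   # what a parser-A style scanner would fetch
visited = backslash_as_slash(url)  # what a parser-B style client would fetch

# A blacklist keyed on the scanner's view of the URL.
blacklist = {scanned}

print("scanned:", scanned)
print("visited:", visited)
print("visited URL is blacklisted:", visited in blacklist)
```

Under these assumptions the lookup fails: the blacklist holds the page the scanner fetched, while the client requests a different path, which is why agreeing on a single parsing standard matters more than the choice of standard itself.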

Acknowledgements

We thank the anonymous reviewers for their valuable comments. We would also like to acknowledge the contribution of Paul Spicer, who initially investigated the problem.

Author information

Authors and Affiliations

Stevens Institute of Technology, Hoboken, NJ, USA
Qilang Yang, Dimitrios Damopoulos & Georgios Portokalidis

Corresponding author

Correspondence to Qilang Yang.

Editor information

Editors and Affiliations

Vrije Universiteit Amsterdam, Amsterdam, Noord-Holland, The Netherlands
Herbert Bos
University of North Carolina at Chapel H, Chapel-Hill, USA
Fabian Monrose
Université Paris-Saclay, Evry, France
Gregory Blanc

About this paper

Cite this paper

Yang, Q., Damopoulos, D., Portokalidis, G. (2015). WYSISNWIV: What You Scan Is Not What I Visit. In: Bos, H., Monrose, F., Blanc, G. (eds) Research in Attacks, Intrusions, and Defenses. RAID 2015. Lecture Notes in Computer Science, vol 9404. Springer, Cham. https://doi.org/10.1007/978-3-319-26362-5_15

DOI: https://doi.org/10.1007/978-3-319-26362-5_15
Published: 12 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26361-8
Online ISBN: 978-3-319-26362-5
eBook Packages: Computer Science, Computer Science (R0)
