Eyes of a Human, Eyes of a Program: Leveraging Different Views of the Web for Analysis and Detection

Corbetta, Jacopo; Invernizzi, Luca; Kruegel, Christopher; Vigna, Giovanni

doi:10.1007/978-3-319-11379-1_7

Jacopo Corbetta¹⁸,
Luca Invernizzi¹⁸,
Christopher Kruegel¹⁸ &
…
Giovanni Vigna¹⁸

Part of the book series: Lecture Notes in Computer Science ((LNSC,volume 8688))

Included in the following conference series:

International Workshop on Recent Advances in Intrusion Detection

3035 Accesses
4 Citations

Abstract

With JavaScript and images at their disposal, web authors can create content that is immediately understandable to a person, but is beyond the direct analysis capability of computer programs, including security tools. Conversely, information can be deceiving for humans even if unable to fool a program.

In this paper, we explore the discrepancies between user perception and program perception, using content obfuscation and counterfeit “seal” images as two simple but representative case studies. In a dataset of 149,700 pages we found that benign pages rarely engage in these practices, while uncovering hundreds of malicious pages that would be missed by traditional malware detectors.

We envision that this type of heuristics could be a valuable addition to existing detection systems. To show this, we have implemented a proof-of-concept detector that, based solely on a similarity score computed on our metrics, can already achieve a high precision (95%) and a good recall (73%).

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Anderson, D.S., Fleizach, C., Savage, S., Voelker, G.M.: Spamscatter: Characterizing Internet Scam Hosting Infrastructure. In: Proceedings of the USENIX Security Symposium (2007)
Google Scholar
Barth, A., Caballero, J., Song, D.: Secure Content Sniffing for Web Browsers, or How to Stop Papers from Reviewing Themselves. In: Proceedings of the 30th IEEE Symposium on Security and Privacy. IEEE (2009)
Google Scholar
Bate, R., Jin, G., Mathur, A.: In Whom We Trust: The Role of Certification Agencies in Online Drug Markets. NBER working paper 17955 (2012)
Google Scholar
Bergholz, A., Paass, G., Reichartz, F., Strobel, S., Moens, M.F., Witten, B.: Detecting Known and New Salting Tricks in Unwanted Emails. In: Proceedings of the Conference on Email and Anti-Spam, CEAS (2008)
Google Scholar
Chou, N., Ledesma, R., Teraguchi, Y., Mitchell, J.C.: Client-side Defense Against Web-Based Identity Theft. In: Proceedings of the Network and Distributed System Security Symposium, NDSS (2004)
Google Scholar
Cova, M., Kruegel, C., Vigna, G.: Detection and Analysis of Drive-by-download Attacks and Malicious JavaScript code. In: Proceedings of the World Wide Web Conference, WWW (2010)
Google Scholar
Cova, M., Leita, C., Thonnard, O., Keromytis, A.D., Dacier, M.: An Analysis of Rogue AV Campaigns. In: Jha, S., Sommer, R., Kreibich, C. (eds.) RAID 2010. LNCS, vol. 6307, pp. 442–463. Springer, Heidelberg (2010)
Chapter Google Scholar
Cunningham, P., Nowlan, N., Delany, S.J., Haahr, M.: A Case-Based Approach to Spam Filtering that Can Track Concept Drift. Knowledge-Based Systems (2005)
Google Scholar
Cutts, M.: Pagerank sculpting (2009), http://www.mattcutts.com/blog/pagerank-sculpting/
Fumera, G., Pillai, I., Roli, F.: Spam Filtering Based on the Analysis of Text Information Embedded into Images. The Journal of Machine Learning Research 7, 2699–2720 (2006)
Google Scholar
Garera, S., Provos, N., Chew, M., Rubin, A.D.: A Framework for Detection and Measurement of Phishing Attacks. In: Proceedings of the ACM Workshop on Recurring Malcode, WORM (2007)
Google Scholar
Google Inc.: Image publishing guidelines (2012), http://support.google.com/webmasters/bin/answer.py?hl=en&answer=114016
Google Inc.: Making AJAX Applications Crawable (2014), https://developers.google.com/webmasters/ajax-crawling/
Google Inc.: Webmaster Guidelines (2014), http://support.google.com/webmasters/bin/answer.py?hl=en&answer=35769#2
Hara, M., Yamada, A., Miyake, Y.: Visual Similarity-Based Phishing Detection without Victim Site Information. In: Proceedings of the IEEE Symposium on Computational Intelligence in Cyber Security (CICS), pp. 30–36. IEEE (March 2009)
Google Scholar
Invernizzi, L., Benvenuti, S., Comparetti, P.M., Cova, M., Kruegel, C., Vigna, G.: EVILSEED: A Guided Approach to Finding Malicious Web Pages. In: Proceedings of the IEEE Symposium on Security and Privacy, S&P (2012)
Google Scholar
Invernizzi, L., Miskovic, S., Torres, R., Saha, S., Lee, S.J., Mellia, M., Kruegel, C., Vigna, G.: Nazca: Detecting Malware Distribution in Large-Scale Networks. In: Proceedings of the Network and Distributed System Security Symposium, NDSS (2014)
Google Scholar
Kapravelos, A., Shoshitaishvili, Y., Cova, M., Kruegel, C., Vigna, G.: Revolver: An Automated Approach to the Detection of Evasive Web-based Malware. In: Proceedings of the USENIX Security Symposium (2013)
Google Scholar
Kirda, E., Kruegel, C.: Protecting Users Against Phishing Attacks with AntiPhish. In: Proceedings of the International Conference on Computer Software and Applications (COMPSAC), vol. 1, pp. 517–524. IEEE (2005)
Google Scholar
Konte, M., Feamster, N., Jung, J.: Fast Flux Service Networks: Dynamics and Roles in Hosting Online Scams. Tech. rep., Georgia Institute of Technology and Intel Research (2008)
Google Scholar
Lee, S., Kim, J.: WarningBird: Detecting Suspicious URLs in Twitter Stream. In: Proceedings of the Network and Distributed System Security Symposium, NDSS (2010)
Google Scholar
Li, Z., Alrwais, S., Xie, Y., Yu, F., Wang, X.: Finding the Linchpins of the Dark Web: A Study on Topologically Dedicated Hosts on Malicious Web Infrastructures. In: Proceedings of the IEEE Symposium on Security and Privacy (S&P), pp. 112–126 (May 2013)
Google Scholar
Lin, E., Greenberg, S., Trotter, E., Ma, D., Aycock, J.: Does Domain Highlighting Help People Identify Phishing Sites? In: Proceedings of the Conference on Human Factors in Computing Systems (CHI), p. 2075. ACM Press, New York (2011)
Google Scholar
Lu, L., Perdisci, R., Lee, W.: SURF: Detecting and Measuring Search Poisoning Categories. In: Proceedings of the ACM Conference on Computer and Communications Security, CCS (2011)
Google Scholar
Jung, J., Milito, R.A., Paxson, V.: On the Effectiveness of Techniques to Detect Phishing Sites. In: Hämmerli, B.M., Sommer, R. (eds.) DIMVA 2007. LNCS, vol. 4579, pp. 20–39. Springer, Heidelberg (2007)
Chapter Google Scholar
Mcgrath, D.K., Gupta, M.: Behind Phishing: An Examination of Phisher Modi Operandi. In: Proceedings of the USENIX Workshop on Large-Scale Exploits and Emergent Threats, LEET (2008)
Google Scholar
Medvet, E., Kirda, E., Kruegel, C.: Visual-Similarity-Based Phishing Detection. In: Proceedings of the International Conference on Security and Privacy in Communication Networks (SecureComm), p. 1. ACM Press, New York (2008)
Chapter Google Scholar
Microsoft Corp.: Bing Webmaster Guidelines (2014), http://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a
Neupane, A., Saxena, N., Kuruvilla, K., Georgescu, M., Kana, R.: Neural Signatures of User-Centered Security: An fMRI Study of Phishing, and Malware Warnings. In: Proceedings of the Network and Distributed System Security Symposium (NDSS). pp. 1–16 (2014)
Google Scholar
Ntoulas, A., Hall, B., Najork, M., Manasse, M., Fetterly, D.: Detecting Spam Web Pages through Content Analysis. In: Proceedings of the International World Wide Web Conference (WWW), pp. 83–92 (2006)
Google Scholar
Prakash, P., Kumar, M., Kompella, R.R., Gupta, M.: PhishNet: Predictive Blacklisting to Detect Phishing Attacks. In: Proceedings of the IEEE International Conference on Computer Communications (INFOCOM), pp. 1–5. IEEE (March 2010)
Google Scholar
Rajab, M.A., Ballard, L., Marvrommatis, P., Provos, N., Zhao, X.: The Nocebo Effect on the Web: An Analysis of Fake Anti-Virus Distribution. In: Large-Scale Exploits and Emergent Threats, LEET (2010)
Google Scholar
Rosiello, A.P.E., Kirda, E., Kruegel, C., Ferrandi, F.: A Layout-Similarity-Based Approach for Detecting Phishing Pages. In: Proceedings of the International Conference on Security and Privacy in Communication Networks, SecureComm (2007)
Google Scholar
Ruzzo, W., Tompa, M.: A Linear Time Algorithm for Finding All Maximal Scoring Subsequences. In: Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology. AAAI (1999)
Google Scholar
Seifert, C., Welch, I., Komisarczuk, P.: Identification of Malicious Web Pages with Static Heuristics. In: Proceedings of the Australasian Telecommunication Networks and Applications Conference. IEEE (2008)
Google Scholar
Sheng, S., Holbrook, M., Kumaraguru, P., Cranor, L., Downs, J.: Who Falls for Phish? A Demographic Analysis of Phishing Susceptibility and Effectiveness of Interventions. In: Proceedings of the Conference on Human Factors in Computing Systems (CHI), pp. 373–382 (2010)
Google Scholar
Sheng, S., Magnien, B., Kumaraguru, P., Acquisti, A., Cranor, L.F., Hong, J., Nunge, E.: Anti-Phishing Phil: The Design and Evaluation of a Game That Teaches People Not to Fall for Phish. In: Proceedings of the Symposium on Usable Privacy and Security (SOUPS), pp. 88–99 (2007)
Google Scholar
Sheng, S., Wardman, B., Warner, G., Cranor, L.F., Hong, J.: An Empirical Analysis of Phishing Blacklists. In: Proceedings of the Conference on Email and Anti-Spam, CEAS (2009)
Google Scholar
Stone-Gross, B., Abman, R., Kemmerer, R., Kruegel, C., Steigerwald, D., Vigna, G.: The Underground Economy of Fake Antivirus Software. In: Proceedings of the Workshop on Economics of Information Security, WEIS (2011)
Google Scholar
Stringhini, G., Kruegel, C., Vigna, G.: Shady Paths: Leveraging Surfing Crowds to Detect Malicious Web Pages. In: Proceedings of the ACM Conference on Computer and Communications Security, CCS (2013)
Google Scholar
Symantec: Seal License Agreement (2014), https://www.symantec.com/content/en/us/about/media/repository/norton-secured-seal-license-agreement.pdf
Wang, D.Y., Savage, S., Voelker, G.M.: Cloak and Dagger: Dynamics of Web Search Cloaking. In: Proceedings of the ACM Conference on Computer and Communications Security (CCS), pp. 477–489 (2011)
Google Scholar
Wepawet, http://wepawet.cs.ucsb.edu
Whittaker, C., Ryner, B., Nazif, M.: Large-Scale Automatic Classification of Phishing Pages. In: Proceedings of the Network and Distributed System Security Symposium, NDSS (2010)
Google Scholar
Wu, M., Miller, R.C., Garfinkel, S.L.: Do Security Toolbars Actually Prevent Phishing Attacks? In: Proceedings of the Conference on Human Factors in Computing Systems (CHI), pp. 601–610 (2006)
Google Scholar
Xiang, G., Hong, J., Rose, C.P., Cranor, L.: CANTINA+: A Feature-Rich Machine Learning Framework for Detecting Phishing Web Sites. In: ACM Transactions on Information and System Security, pp. 1–28 (2011)
Google Scholar
Yandex, N.V.: Recommendations for webmasters - Common errors (2014), http://help.yandex.com/webmaster/recommendations/frequent-mistakes.xml
Yandex, N.V.: Recommendations for webmasters - Using graphic elements (2014), http://help.yandex.com/webmaster/recommendations/using-graphics.xml
Zauner, C.: Implementation and Benchmarking of Perceptual Image Hash Functions. Master’s thesis, Upper Austria University of Applied Sciences, Hagenberg Campus (2010)
Google Scholar
Zhang, Y., Hong, J., Cranor, L.: CANTINA: A Content-Based Approach to Detecting Phishing Web Sites. In: Proceedings of the ACM Conference on Computer and Communications Security, CCS (2007)
Google Scholar

Download references

Author information

Authors and Affiliations

University of California, Santa Barbara, USA
Jacopo Corbetta, Luca Invernizzi, Christopher Kruegel & Giovanni Vigna

Authors

Jacopo Corbetta
View author publications
You can also search for this author in PubMed Google Scholar
Luca Invernizzi
View author publications
You can also search for this author in PubMed Google Scholar
Christopher Kruegel
View author publications
You can also search for this author in PubMed Google Scholar
Giovanni Vigna
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

George Mason University, Fairfax, VA, USA
Angelos Stavrou
VU University Amsterdam, The Netherlands
Herbert Bos
Stevens Institute of Technology, NJ, USA
Georgios Portokalidis

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Corbetta, J., Invernizzi, L., Kruegel, C., Vigna, G. (2014). Eyes of a Human, Eyes of a Program: Leveraging Different Views of the Web for Analysis and Detection. In: Stavrou, A., Bos, H., Portokalidis, G. (eds) Research in Attacks, Intrusions and Defenses. RAID 2014. Lecture Notes in Computer Science, vol 8688. Springer, Cham. https://doi.org/10.1007/978-3-319-11379-1_7

Download citation

DOI: https://doi.org/10.1007/978-3-319-11379-1_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11378-4
Online ISBN: 978-3-319-11379-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics