Eyes of a Human, Eyes of a Program: Leveraging Different Views of the Web for Analysis and Detection

  • Jacopo Corbetta
  • Luca Invernizzi
  • Christopher Kruegel
  • Giovanni Vigna
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8688)

Abstract

With JavaScript and images at their disposal, web authors can create content that is immediately understandable to a person but is beyond the direct analysis capability of computer programs, including security tools. Conversely, content can deceive a human even when it cannot fool a program.

In this paper, we explore the discrepancies between user perception and program perception, using content obfuscation and counterfeit “seal” images as two simple but representative case studies. In a dataset of 149,700 pages, we found that benign pages rarely engage in these practices, and we uncovered hundreds of malicious pages that would be missed by traditional malware detectors.

We envision that heuristics of this type could be a valuable addition to existing detection systems. To show this, we have implemented a proof-of-concept detector that, based solely on a similarity score computed on our metrics, already achieves high precision (95%) and good recall (73%).
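For readers less familiar with these two metrics, the following minimal Python sketch shows how precision and recall are conventionally derived from detection counts. The function name and the counts are hypothetical; the counts were chosen only so that the ratios round to the reported 95% and 73%, and they are not figures from the paper.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Return (precision, recall) computed from detection counts."""
    flagged = true_pos + false_pos          # pages the detector reported as malicious
    actually_bad = true_pos + false_neg     # pages that really are malicious
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / actually_bad if actually_bad else 0.0
    return precision, recall


# Hypothetical counts (NOT taken from the paper): 73 malicious pages caught,
# 4 benign pages wrongly flagged, 27 malicious pages missed.
precision, recall = precision_recall(true_pos=73, false_pos=4, false_neg=27)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

High precision means that few flagged pages are benign; a recall of 73% means that roughly one malicious page in four still goes undetected.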

Keywords

Website analysis, content obfuscation, fraud detection


Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Jacopo Corbetta
    • 1
  • Luca Invernizzi
    • 1
  • Christopher Kruegel
    • 1
  • Giovanni Vigna
    • 1
  1. University of California, Santa Barbara, USA
