ADAM: Automated Detection and Attribution of Malicious Webpages

  • Ahmed E. Kosba
  • Aziz Mohaisen
  • Andrew West
  • Trevor Tonn
  • Huy Kang Kim
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8909)

Abstract

Malicious webpages are a prevalent and severe threat in the Internet security landscape. This fact has motivated numerous static and dynamic techniques to mitigate the threat. Building on this existing literature, this work introduces the design and evaluation of ADAM, a system that applies machine learning to network metadata derived from the sandboxed execution of webpage content. ADAM aims both to detect malicious webpages and to identify the type of vulnerability, using a simple set of features. Machine-trained models are not novel in this problem space. Instead, the dynamic network artifacts collected during rendering (and their subsequent feature representations) are the greatest contribution of this work. Using a real-world operational dataset that includes different types of malicious behavior, our results show that cheap dynamic network artifacts can be used effectively to detect most types of vulnerabilities, achieving an accuracy reaching 96%. The system was also able to identify the type of a detected vulnerability with high accuracy, achieving an exact match in 91% of the cases. We identify the main vulnerabilities that require improvement, and suggest directions to extend this work to practical contexts.
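
The abstract reports two figures: a detection accuracy and an exact-match rate for the identified vulnerability type. The sketch below shows one common way such metrics are computed. It is not taken from the paper. It assumes that detection is a binary verdict per page and that each malicious page may carry a set of type labels, so that an "exact match" means the predicted set equals the true set. The function names, the label names (`type_a`, `type_b`, `type_c`), and the toy data are all hypothetical.

```python
from typing import List, Set


def detection_accuracy(y_true: List[bool], y_pred: List[bool]) -> float:
    """Fraction of pages whose malicious/benign verdict is correct."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def exact_match_ratio(true_labels: List[Set[str]],
                      pred_labels: List[Set[str]]) -> float:
    """Fraction of pages whose predicted label set equals the true label set.

    Assumption: a page may have several type labels; a prediction counts
    only if the whole set matches, with no partial credit.
    """
    if len(true_labels) != len(pred_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == p for t, p in zip(true_labels, pred_labels))
    return matches / len(true_labels)


# Hypothetical toy data, for illustration only (not from the paper's dataset).
is_malicious_true = [True, True, False, True]
is_malicious_pred = [True, False, False, True]

types_true = [{"type_a"}, {"type_a", "type_b"}, {"type_c"}]
types_pred = [{"type_a"}, {"type_a"}, {"type_c"}]

acc = detection_accuracy(is_malicious_true, is_malicious_pred)
emr = exact_match_ratio(types_true, types_pred)
print(f"detection accuracy: {acc:.2f}")
print(f"exact-match ratio:  {emr:.2f}")
```

Under this definition, the second toy page counts as a miss even though one of its two labels was predicted correctly, which is why an exact-match rate is a stricter measure than per-label accuracy.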

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Ahmed E. Kosba (1)
  • Aziz Mohaisen (2)
  • Andrew West (2)
  • Trevor Tonn (3)
  • Huy Kang Kim (4)

  1. University of Maryland at College Park, College Park, USA
  2. Verisign Labs, Reston, USA
  3. Amazon.com, Washington DC, USA
  4. Korea University, Seoul, South Korea
