Proactive Discovery of Phishing Related Domain Names

  • Samuel Marchal
  • Jérôme François
  • Radu State
  • Thomas Engel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7462)


Phishing is a major Internet security issue with a significant economic impact. The main countermeasure is currently reactive blacklisting; however, as phishing attacks are mainly performed over short periods of time, reactive methods are too slow. As a result, new approaches are needed to identify malicious websites early. In this paper, a new method for the proactive discovery of phishing-related domain names is introduced. We focus mainly on the automated detection of possible domain registrations for malicious activities. We leverage techniques from natural language modelling to build proactive blacklists. The entries in these lists are built using language models and vocabularies encountered in phishing-related activities: “secure”, “banking”, brand names, etc. Once a proactive blacklist is created, ongoing daily monitoring of only these domains can lead to the efficient detection of phishing websites.
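
The idea of generating blacklist candidates from a language model can be illustrated with a minimal sketch. It is not the method of the paper: it assumes a simple word-level bigram model, a tiny hypothetical training set of already segmented phishing domain labels, and illustrative names (`TRAINING`, `train_bigrams`, `generate`, `to_domains`). It enumerates the word sequences the model can produce and ranks them by probability.

```python
from collections import Counter, defaultdict

# Hypothetical training data (an assumption, not from the paper):
# labels of known phishing domains, already split into words.
TRAINING = [
    ["secure", "paypal", "login"],
    ["paypal", "secure", "update"],
    ["secure", "banking", "login"],
    ["banking", "update"],
]

START, END = "<s>", "</s>"


def train_bigrams(sequences):
    """Estimate P(next word | previous word) from word sequences."""
    counts = defaultdict(Counter)
    for words in sequences:
        padded = [START, *words, END]
        for prev, cur in zip(padded, padded[1:]):
            counts[prev][cur] += 1
    return {
        prev: {w: c / sum(nxt.values()) for w, c in nxt.items()}
        for prev, nxt in counts.items()
    }


def generate(model, max_words=3):
    """Enumerate every sequence of at most max_words words that the
    model can produce, mapped to its probability under the model."""
    results = {}
    stack = [((), START, 1.0)]
    while stack:
        words, prev, prob = stack.pop()
        for nxt, p in model.get(prev, {}).items():
            if nxt == END:
                if words:
                    results[words] = prob * p
            elif len(words) < max_words:
                stack.append((words + (nxt,), nxt, prob * p))
    return results


def to_domains(candidates, tld="com"):
    """Turn word sequences into domain names, most probable first."""
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    return [("".join(words) + "." + tld, prob) for words, prob in ranked]


model = train_bigrams(TRAINING)
candidates = generate(model, max_words=3)
blacklist = to_domains(candidates)

for name, prob in blacklist[:5]:
    print(f"{name}\t{prob:.3f}")
```

The sequence ("secure", "update") never occurs on its own in the training data, yet the model assigns it a non-zero probability, which is the kind of unseen candidate a proactive blacklist would then monitor. In practice the candidates would be checked for actual registration (for example by DNS probing) before being monitored.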


phishing, blacklisting, DNS probing, natural language





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Samuel Marchal (1)
  • Jérôme François (1)
  • Radu State (1)
  • Thomas Engel (1)
  1. SnT - University of Luxembourg, Luxembourg
