Sophisticated Phishers Make More Spelling Mistakes: Using URL Similarity against Phishing

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7672)


Phishing attacks are rising in both quantity and quality. Because these attacks have short online lifetimes, classical blacklist-based approaches are not sufficient to protect online users. While attackers manage to achieve high similarity between original and fraudulent websites, this similarity can also be used for attack detection. In many cases, attackers try to make the Internet address (URL) of a website look similar to the original. In this work, we present a way of using the URL itself for automated detection of phishing websites by extracting different terms of a URL and verifying them using search engine spelling recommendations.

We evaluate our concept against a large test set of 8730 real phishing URLs. In addition, we collected scores for the visual quality of a subset of those attacks so that we can compare the performance of our tests across attack qualities. Results suggest that our heuristics are able to mark 54.3% of the malicious URLs as suspicious. As the visual quality of the phishing websites increases, the number of URL characteristics that allow detection increases as well.
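
The term extraction and verification described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the paper relies on search engine spelling recommendations, whereas this sketch substitutes a local fuzzy match from Python's `difflib` against a hypothetical brand list (`KNOWN_BRANDS`). The function names, the minimum term length of 4, and the 0.8 similarity cutoff are all assumptions chosen for the example.

```python
import re
from difflib import get_close_matches
from urllib.parse import urlsplit

# Hypothetical list of brand names to protect; not taken from the paper.
KNOWN_BRANDS = ["paypal", "google", "amazon", "facebook"]


def extract_terms(url):
    """Split the host and path of a URL into lowercase alphanumeric terms."""
    parts = urlsplit(url)
    text = f"{parts.hostname or ''} {parts.path}".lower()
    # Assumption: terms shorter than 4 characters (www, com, ...) are ignored.
    return [t for t in re.split(r"[^a-z0-9]+", text) if len(t) >= 4]


def suggest(term, cutoff=0.8):
    """Stand-in for a spelling recommendation: closest known brand, or None."""
    matches = get_close_matches(term, KNOWN_BRANDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def suspicious_terms(url):
    """Return (term, suggestion) pairs where a term is a near miss of a brand."""
    hits = []
    for term in extract_terms(url):
        suggestion = suggest(term)
        # An exact match is the genuine spelling; only near misses are flagged.
        if suggestion is not None and suggestion != term:
            hits.append((term, suggestion))
    return hits


print(suspicious_terms("http://paypa1.example.com/login"))
print(suspicious_terms("https://www.paypal.com/signin"))
```

In this sketch a URL is treated as suspicious when any of its terms is close to, but not identical to, a known name. A real spelling recommendation service would cover far more names than a fixed list, which is presumably part of the appeal of the approach described above.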





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Media Informatics Group, University of Munich, Munich, Germany
