Funny Accents: Exploring Genuine Interest in Internationalized Domain Names

  • Victor Le Pochat
  • Tom Van Goethem
  • Wouter Joosen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11419)


Internationalized Domain Names (IDNs) were introduced to support non-ASCII characters in domain names. In this paper, we explore IDNs that hold genuine interest, i.e., that owners of brands with diacritical marks may want to register and use. We generate 15 276 candidate IDNs from the page titles of popular domains, and find that 43% are readily available for registration, allowing for spoofing or phishing attacks. Meanwhile, 9% cannot be registered because the respective registry does not allow them, preventing brand owners from owning the IDN. Based on WHOIS records, DNS records and a web crawl, we estimate that at least 50% of the 3 189 registered IDNs have the same owner as the original domain, but that 35% are owned by a different entity, mainly domain squatters; malicious activity was not observed. Finally, we see that application behavior toward these IDNs remains inconsistent, hindering user experience and therefore widespread uptake of IDNs, and we even uncover a phishing vulnerability in iOS Mail.
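To make the idea of a candidate IDN concrete, the following is a minimal sketch of how a page title containing diacritics might be matched against an ASCII domain and converted to its ASCII-compatible (Punycode) form. It is an illustration under stated assumptions, not the method used in the paper: the helper names `strip_diacritics` and `candidate_idn` are hypothetical, the matching is a simple word-by-word comparison, and Python's built-in `idna` codec implements the older IDNA 2003 rules (RFC 3490).

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks after NFKD decomposition (a simplifying assumption)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def candidate_idn(domain: str, page_title: str):
    """Return (unicode_domain, ascii_compatible_domain), or None if no word matches.

    Hypothetical helper: looks for a title word that contains diacritics and
    equals the first label of the domain once those diacritics are removed.
    """
    label, _, suffix = domain.partition(".")
    for word in page_title.split():
        word = unicodedata.normalize("NFC", word.lower())
        if word != label and strip_diacritics(word) == label:
            unicode_domain = f"{word}.{suffix}"
            # The built-in "idna" codec encodes each label to its "xn--" form.
            ace = unicode_domain.encode("idna").decode("ascii")
            return unicode_domain, ace
    return None
```

For instance, `candidate_idn("bucher.de", "Bücher online")` yields the pair `("bücher.de", "xn--bcher-kva.de")`, while a title without a matching accented word yields `None`. Whether such a candidate can actually be registered depends on the registry's IDN policy, which this sketch does not check.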


Internationalized Domain Names, Phishing, Domain squatting, Homograph attack



We would like to thank our shepherd Ignacio Castro for his valuable feedback, and Gertjan Franken and Katrien Janssens for their help in the user agent survey. This research is partially funded by the Research Fund KU Leuven. Victor Le Pochat holds a PhD Fellowship of the Research Foundation - Flanders (FWO).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. imec-DistriNet, KU Leuven, Leuven, Belgium
