Funny Accents: Exploring Genuine Interest in Internationalized Domain Names

  • Victor Le Pochat
  • Tom Van Goethem
  • Wouter Joosen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11419)

Abstract

Internationalized Domain Names (IDNs) were introduced to support non-ASCII characters in domain names. In this paper, we explore IDNs that hold genuine interest, i.e. that owners of brands with diacritical marks may want to register and use. We generate 15 276 candidate IDNs from the page titles of popular domains, and see that 43% are readily available for registration, allowing for spoofing or phishing attacks. Meanwhile, 9% are not allowed by the respective registry to be registered, preventing brand owners from owning the IDN. Based on WHOIS records, DNS records and a web crawl, we estimate that at least 50% of the 3 189 registered IDNs have the same owner as the original domain, but that 35% are owned by a different entity, mainly domain squatters; malicious activity was not observed. Finally, we see that application behavior toward these IDNs remains inconsistent, hindering user experience and therefore widespread uptake of IDNs, and even uncover a phishing vulnerability in iOS Mail.
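
As a rough illustration of the relationship between a brand name with diacritical marks, its plain ASCII domain, and the encoded form of the IDN, the following minimal sketch may help. It is not the authors' candidate-generation pipeline: the helper names `strip_diacritics` and `to_a_label` and the domain `café.example` are hypothetical, and the conversion relies on Python's built-in `idna` codec, which implements the older IDNA 2003 rules.

```python
import unicodedata


def strip_diacritics(name: str) -> str:
    """Return the name with combining marks removed (e.g. 'é' becomes 'e')."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_a_label(domain: str) -> str:
    """Encode a Unicode domain into its ASCII (Punycode, 'xn--') form."""
    # Python's built-in 'idna' codec follows IDNA 2003 (RFC 3490 / RFC 3492).
    return domain.encode("idna").decode("ascii")


if __name__ == "__main__":
    candidate = "café.example"          # hypothetical IDN with a diacritic
    print(strip_diacritics(candidate))  # ASCII domain the brand may already own
    print(to_a_label(candidate))        # form actually stored in the DNS
```

Under these assumptions, the accented candidate and the ASCII domain are distinct names in the DNS, which is why the IDN can be registered by a party other than the owner of the original domain.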

Keywords

Internationalized Domain Names, Phishing, Domain squatting, Homograph attack

Notes

Acknowledgments

We would like to thank our shepherd Ignacio Castro for his valuable feedback, and Gertjan Franken and Katrien Janssens for their help in the user agent survey. This research is partially funded by the Research Fund KU Leuven. Victor Le Pochat holds a PhD Fellowship of the Research Foundation - Flanders (FWO).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. imec-DistriNet, KU Leuven, Leuven, Belgium
