Abstract
Internationalized Domain Names (IDNs) were introduced to support non-ASCII characters in domain names. In this paper, we explore IDNs that hold genuine interest, i.e., IDNs that owners of brands with diacritical marks may want to register and use. We generate 15 276 candidate IDNs from the page titles of popular domains, and find that 43% are readily available for registration, which leaves them open to spoofing or phishing attacks. Meanwhile, 9% cannot be registered because the respective registry does not allow them, preventing brand owners from owning the IDN. Based on WHOIS records, DNS records and a web crawl, we estimate that at least 50% of the 3 189 registered IDNs have the same owner as the original domain, but that 35% are owned by a different entity, mainly domain squatters; we did not observe malicious activity. Finally, we see that application behavior toward these IDNs remains inconsistent, hindering user experience and therefore widespread uptake of IDNs, and we even uncover a phishing vulnerability in iOS Mail.
Notes
- 1.
- 2. The other deviations are the Greek final sigma (ς), converted to σ in IDNA2003, and the zero width non-joiner and joiner, both deleted by the IDNA2003 Punycode algorithm.
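The deviations described in this note can be observed with a short sketch. It assumes Python's built-in `idna` codec, which implements IDNA2003 (RFC 3490); the helper name `idna2003_ascii` and the sample labels are hypothetical and chosen only for illustration. The label with ß is included because it is the best-known deviation character, although the note itself does not name it.

```python
ZWNJ = "\u200c"  # zero width non-joiner
ZWJ = "\u200d"   # zero width joiner


def idna2003_ascii(label: str) -> str:
    """Return the ASCII form of a label under IDNA2003 (hypothetical helper).

    Python's built-in "idna" codec applies the IDNA2003 mapping step
    before Punycode encoding, so deviation characters are rewritten here.
    """
    return label.encode("idna").decode("ascii")


if __name__ == "__main__":
    # Illustrative labels only; they are not taken from the paper's data set.
    samples = ["faß", "οδός", "οδόσ", "a" + ZWNJ + "b", "a" + ZWJ + "b"]
    for label in samples:
        print(repr(label), "->", idna2003_ascii(label))
```

Under IDNA2003, the two Greek labels should produce the same ASCII form and the joiner characters should disappear, so each pair of names resolves to a single registered domain. An IDNA2008-based library may treat these characters differently, which is why they are listed as deviations.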
Acknowledgments
We would like to thank our shepherd Ignacio Castro for his valuable feedback, and Gertjan Franken and Katrien Janssens for their help in the user agent survey. This research is partially funded by the Research Fund KU Leuven. Victor Le Pochat holds a PhD Fellowship of the Research Foundation - Flanders (FWO).
Appendices
A Common Character Substitutions
Original | ä | ö | ü | ß | æ | ø | å | œ | þ |
---|---|---|---|---|---|---|---|---|---|
Substitution | ae | oe | ue | ss | ae | oe | aa | oe | th |
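The substitutions in this table can be applied programmatically to derive the ASCII spelling that corresponds to a name with diacritics. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, the fallback of stripping combining marks after Unicode decomposition is an assumption, and the A-label conversion assumes Python's built-in `idna` codec (IDNA2003).

```python
import unicodedata

# Substitution pairs copied from the table in Appendix A.
SUBSTITUTIONS = {
    "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss", "æ": "ae",
    "ø": "oe", "å": "aa", "œ": "oe", "þ": "th",
}


def to_ascii_label(label: str, use_substitutions: bool = True) -> str:
    """Reduce a label with diacritics to an ASCII-only label (hypothetical helper)."""
    label = label.lower()
    if use_substitutions:
        label = "".join(SUBSTITUTIONS.get(ch, ch) for ch in label)
    # Assumption: remaining accented letters are decomposed and their
    # combining marks dropped, e.g. "é" becomes "e".
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Characters that still have no ASCII form are discarded.
    return stripped.encode("ascii", "ignore").decode("ascii")


def to_a_label(label: str) -> str:
    """Return the Punycode (A-label) form of an IDN label via the built-in codec."""
    return label.encode("idna").decode("ascii")


if __name__ == "__main__":
    name = "müller"  # illustrative label, not taken from the paper's data set
    print(to_ascii_label(name))                           # with the table
    print(to_ascii_label(name, use_substitutions=False))  # accents stripped only
    print(to_a_label(name))                               # A-label of the IDN
```

Running both variants shows why one brand name can map to more than one ASCII domain: a name may be spelled with the substitution or with the bare base letter, so either spelling could be the original domain of a candidate IDN.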
B Tested User Agent Versions
Category | Client | Version | Operating system |
---|---|---|---|
Browser desktop | Google Chrome | 69.0.3497.100 | Ubuntu Linux 18.04.1 |
Browser desktop | Firefox | 62.0 | Ubuntu Linux 18.04.1 |
Browser desktop | Safari | 12.0.1 (13606.2.100) | macOS 10.13.6 (17G65) |
Browser desktop | Opera | 55.0.2994.61 | Ubuntu Linux 18.04.1 |
Browser desktop | Internet Explorer | 11.0.9600.18894 | Windows 8.1 |
Browser desktop | Microsoft Edge | 42.17134.1.0 | Windows 10 17.17134 |
Browser mobile | Google Chrome | 69.0.3497.100 | Android 7.0.0 |
Browser mobile | Safari | – | iOS 12.0 (16A366) |
Browser mobile | Firefox | 62.0.2 | Android 7.0.0 |
Browser mobile | UC Browser | 12.9.3.1144 | Android 7.0.0 |
Browser mobile | Samsung Internet | 7.4.00.70 | Android 7.0.0 |
Browser mobile | Opera | 47.3.2249.130976 | Android 7.0.0 |
Browser mobile | Microsoft Edge | 42.0.0.2529 | Android 7.0.0 |
Email desktop | Outlook 2016 | 16.0.4738.1000 | Windows 10 17.17134 |
Email desktop | macOS Mail | 11.5 (3445.9.1) | macOS 10.13.6 (17G65) |
Email desktop | Thunderbird | 52.9.1 | Ubuntu Linux 18.04.1 |
Email mobile | Gmail | 8.9.9.213351932 | Android 7.0.0 |
Email mobile | Outlook | 2.2.219 | Android 7.0.0 |
Email mobile | iOS Mail | – | iOS 12.0 (16A366) and iOS 12.1.2 (16C104) |
Webmail | Gmail | – | – |
Webmail | Yahoo | – | – |
Webmail | Yandex | – | – |
Webmail | Outlook | – | – |
Webmail | RoundCube | 1.2.9 | – |
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Le Pochat, V., Van Goethem, T., Joosen, W. (2019). Funny Accents: Exploring Genuine Interest in Internationalized Domain Names. In: Choffnes, D., Barcellos, M. (eds) Passive and Active Measurement. PAM 2019. Lecture Notes in Computer Science, vol 11419. Springer, Cham. https://doi.org/10.1007/978-3-030-15986-3_12
DOI: https://doi.org/10.1007/978-3-030-15986-3_12
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15985-6
Online ISBN: 978-3-030-15986-3
eBook Packages: Computer Science, Computer Science (R0)