1 Introduction

As promising countermeasure technologies against phishing emails, sender authentication techniques such as Sender Policy Framework (SPF) [38], Domain-based Message Authentication, Reporting & Conformance (DMARC) [26], and DNS-Based Authentication of Named Entities (DANE) [23] have been standardized and have become widespread. In addition to these technologies, the standardization of Brand Indicators for Message Identification (BIMI) [16] is underway. The idea behind BIMI is to display the trademarked logo of a company or organization, along with information regarding its certification, in an email message. The recipient of the email can visually verify the legitimacy of the email sender by checking for the existence of a brand logo image with which they are familiar. BIMI technology has gained popularity since receiving official support from Google in July 2021.

For SPF, DMARC, and DANE, which are already widely used, many measurement studies have been conducted on the adoption, misuse, and misconfiguration of technologies. However, to the best of our knowledge, there have been no comprehensive measurement studies conducted on BIMI. Given this background, we set the following research questions to identify best practices and open research questions regarding the BIMI operation:

  • How widespread is BIMI currently?

  • How do DNS administrators configure the BIMI records for their domain names?

  • Is BIMI configured with other DNS-based email security mechanisms?

  • What are the typical misconfigurations of BIMI?

  • Are there any cyberattacks exploiting BIMI?

To address these research questions, we conducted the first large-scale measurement study of BIMI in the wild. We examined the presence and configuration of BIMI records for a list of one million popular domain names. We collected the logo images and Verified Mark Certificates (VMCs) referenced by the BIMI records and verified the validity of each setting. In addition, we examined 114,915 domain names extracted from phishing emails collected by our spam trap and from an open database of phishing websites, and investigated whether any attacks exploit BIMI.

The contributions and findings of this study are as follows:

  • This is the first large-scale measurement study of the adoption and operation of BIMI in the wild.

  • Of the one million popular domain names, 3,538 have BIMI records.

  • Of the 3,538 domain names with a BIMI configuration, only 11% had a valid logo image and VMC.

  • In domain names that had set up a VMC for BIMI, DMARC was set up in 99.5% of the domain names.

  • We found 16 BIMI misconfigurations/violations in BIMI records, 1,224 in logos, 58 in VMCs, and 14 in the DMARC configuration.

  • We found 45 domain names having differences between the images contained in the VMC and the images provided on the server.

  • In this study, we found no cases of attacks exploiting BIMI.

2 Background

In this section, we first review DNS-based email security mechanisms. We then describe the specification of BIMI. For reference, we present the survey results of BIMI implementations for major mail user agents in the Appendix.

2.1 DNS-Based Email Security Mechanisms

In the following, we give an overview of the major DNS-based email security mechanisms other than BIMI, which is described in the next subsection.

The Sender Policy Framework (SPF) [38] is a mechanism used to verify the legitimacy of the sender of an email based on IP addresses. By registering SPF information in a DNS TXT record, mail server administrators can explicitly specify the IP addresses that are allowed to send email on behalf of the domain name in question.

DomainKeys Identified Mail (DKIM) [25] is a mechanism used to achieve authentication by adding a digital signature when sending email. To use DKIM, the domain name administrator must set up a public key for digital signatures on the DNS server. In addition, by setting a label called a selector, multiple public keys can be operated with a single domain name.

Domain-based Message Authentication, Reporting & Conformance (DMARC) [26] is a mechanism to verify the legitimacy of an email sender by referring to SPF and DKIM records. Like SPF, DMARC can be used by setting a TXT record on the authoritative DNS server of the domain name of the mail sender.

MTA-STS is a mechanism used to enforce STARTTLS on the sender of email, where STARTTLS [22] is a mechanism for encrypting the sending and receiving of email.

DNS-based Authentication of Named Entities (DANE) [15] is a mechanism used to guarantee the authenticity of mail destinations and the confidentiality of mail. DNSSEC [33,34,35] is used to determine the legitimacy, and STARTTLS is used to achieve confidentiality. To use DANE, a TLS public key must be set up on the email server.

TLS Reporting (TLSRPT) [29] is a function for reporting delivery failures. With MTA-STS and DANE, mail may not be delivered because the authentication process fails, and TLSRPT reports such failures to the domain name administrator.

2.2 BIMI Specifications

BIMI presents an email to a user with an authenticated brand logo. This allows email recipients to visually distinguish the legitimacy of the email sender without having to look at the subject line or body of the email. The widespread use of BIMI is expected to reduce the success rate of phishing emails. However, for BIMI to be effective, the brand logo displayed by BIMI must be recognized by users [4, 5, 7]. As with DKIM, multiple logos can be set for a single domain name by setting the selector.

Table 5 in the Appendix summarizes the DNS records that must be set for each of the security mechanisms described above. “Configure” indicates who needs to configure the record.

BIMI Record: To enable BIMI for a domain name, the following data must be added to the TXT record of the domain name of the MX server:

v=BIMI1;l=<logo link>;a=<vmc link>,

where logo link is the URL of the brand logo and vmc link is the URL of the VMC. Only https is allowed as the URL scheme for these links.

Logo Image: The brand logo images used by BIMI must be provided in the SVG file format defined in RFC 6170 [36]. SVG Tiny P/S, currently proposed as an Internet Draft [17], sets the following restrictions:

  • A title tag must be included (64 characters or less is recommended).

  • The following attributes must be set in an svg tag:

    xmlns="http://www.w3.org/2000/svg",

    version="1.2",

    baseProfile="tiny-ps".

  • The inclusion of a desc tag is also recommended.

  • The size of the logo is recommended to be less than 32 KB.

VMC: A VMC is a digital certificate used to certify the ownership of a logo. Currently, DigiCert and Entrust are the two CAs that can issue a VMC [14].

DMARC: In DMARC, the domain name owner can set a policy regarding what action should be taken by the email recipient when the source authentication by SPF or DKIM fails. The three policies are as follows:

  • “none” indicates that no specific action will be taken.

  • “quarantine” indicates that the email recipient treats an email that fails the DMARC mechanism check as suspicious. The email recipient must take action, such as placing the email in the spam folder or conducting further investigations.

  • “reject” indicates that an email that fails the DMARC mechanism check is rejected.

DMARC allows a domain name and its subdomain names to be configured independently. The pct field allows the domain name administrator to roll out the DMARC mechanism gradually: the configured policy is applied to the specified percentage of failing emails, and the next-weaker policy is applied to the remainder. To use BIMI, domain name administrators must fully implement the DMARC mechanism. When using BIMI, “none” should not be applied.
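The DMARC conditions above can be sketched as a small check on a DMARC TXT record. This is a minimal illustration rather than the procedure used in this study: the function names are hypothetical, and reading "fully implement" as pct=100 with a non-"none" subdomain policy is an assumption.

```python
def parse_tags(record: str) -> dict:
    """Split a 'k=v; k=v' TXT record into a dict with lowercased keys."""
    tags = {}
    for part in record.split(";"):
        key, sep, value = part.partition("=")
        if sep:
            tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_allows_bimi(record: str) -> bool:
    """Return True if the DMARC record looks strict enough for BIMI.

    Assumption: 'fully implemented' means p and sp are quarantine or
    reject and pct is 100 (the default when the tag is absent).
    """
    tags = parse_tags(record)
    if tags.get("v", "").upper() != "DMARC1":
        return False
    policy = tags.get("p", "").lower()
    # The subdomain policy falls back to the domain policy when sp is absent.
    sub_policy = tags.get("sp", policy).lower()
    try:
        pct = int(tags.get("pct", "100"))
    except ValueError:
        return False
    strong = {"quarantine", "reject"}
    return policy in strong and sub_policy in strong and pct == 100
```

For example, `dmarc_allows_bimi("v=DMARC1; p=quarantine; pct=50")` is rejected by this sketch because the policy is applied to only half of the failing emails.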

Vetting Process. To use BIMI, it is necessary to obtain a valid VMC for the target logo because an email client checks both the BIMI record and the VMC. A user wishing to obtain a VMC for their logo submits the trademarked logo and information verifying the identity of the user to the VMC-issuing CA. The CA reviews the submitted information and also conducts a video conference with the user. If no problems are found, a VMC associated with the logo is issued. The two VMC-issuing CAs clearly state in their Certification Practice Statements (CPS) that they meet the official security requirements for issuing VMCs [6, 8, 9]. They are subject to an external audit in order to conduct the business of issuing VMCs. This audit is similar to the external audit that CAs issuing server certificates in the Web PKI undergo.

3 Measurement Method

In this section, we present the list of domain names we target for our analysis and the data collection methodology.

3.1 Target Domain Names

In this study, we adopt the domain names used for popular websites, those of phishing email senders, and those of phishing websites as our research target domain names.

Tranco: We adopted the one million domain names published by Tranco [13] on February 20, 2022. To ensure that these Tranco domain names contained enough legitimate targets for phishing, we conducted a preliminary study. Specifically, we determined how many of the 382 brands targeted by phishing sites listed on OpenPhish [10] between January 22, 2022 and February 20, 2022 were included in these Tranco domain names. As a result, 96% (= 365/382) of the brands were included in them. This indicates that Tranco domain names are a reasonable target for our BIMI study.

Phishing Email Sender: We analyzed the phishing emails received by our spam trap, and extracted domain names from the email address of the email sender. We collected the domain names of email addresses in the From and Received headers of emails received on April 1 – April 28, 2022. Random sampling resulted in 84,730 unique domain names.

Phishing Website: We employed domain names published by OpenPhish [10] as the domain names of the phishing sites. We obtained this list of domain names on May 2, 2022. A total of 30,221 domain names were examined.

3.2 Data Collection Methodology

This section describes how we determined whether a domain name employs BIMI and the other DNS-based email security mechanisms described in Sect. 2.1.

We first sent a query for each domain name to look up the BIMI, SPF, DKIM, DMARC, MTA-STS, TLSRPT, and DANE records. Queries were sent using dnspython [32]. We recorded the response to each query and determined that a mechanism was operational if the response matched the corresponding signature listed in Table 5. In the following, we describe specific notes on collecting data for each security mechanism.
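As an illustration of this lookup step, the following sketch queries TXT records with dnspython and matches them against version-tag signatures. It is not the measurement code used in this study: the query names and signatures in `QUERY_PLAN` are assumptions based on the public specifications (the signatures in Table 5 may differ), DKIM and DANE are omitted for brevity, and the function names are hypothetical.

```python
# Query name template and expected record prefix per mechanism (assumed).
QUERY_PLAN = {
    "BIMI": ("default._bimi.{d}", "v=bimi1"),
    "SPF": ("{d}", "v=spf1"),
    "DMARC": ("_dmarc.{d}", "v=dmarc1"),
    "MTA-STS": ("_mta-sts.{d}", "v=stsv1"),
    "TLSRPT": ("_smtp._tls.{d}", "v=tlsrptv1"),
}


def matches(mechanism: str, txt_values: list) -> bool:
    """True if any TXT value starts with the mechanism's signature."""
    _, signature = QUERY_PLAN[mechanism]
    return any(v.strip().lower().startswith(signature) for v in txt_values)


def lookup_txt(name: str, timeout: float = 5.0) -> list:
    """Return the TXT strings for a name, or [] on any DNS error."""
    import dns.exception  # dnspython, imported lazily
    import dns.resolver

    try:
        answer = dns.resolver.resolve(name, "TXT", lifetime=timeout)
    except dns.exception.DNSException:
        return []
    # A TXT record may be split into several character strings; join them.
    return [b"".join(r.strings).decode("utf-8", "replace") for r in answer]


def survey(domain: str) -> dict:
    """Check every mechanism in QUERY_PLAN for one domain name."""
    return {
        mech: matches(mech, lookup_txt(template.format(d=domain)))
        for mech, (template, _) in QUERY_PLAN.items()
    }
```

Calling `survey("example.com")` returns a dictionary such as `{"BIMI": False, "SPF": True, ...}`, depending on the records published at query time.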

Table 1. The levels of BIMI configuration.

BIMI: In our BIMI study, we adopted default as the selector. We downloaded data from the URLs of the logo image and the VMC listed in the BIMI record. In this study, we defined three levels of operation in BIMI, as listed in Table 1.

SPF: In our SPF study, we covered both TXT and SPF records.

DKIM: In the DKIM survey, we used default and key1 as the selectors.

DANE: In the DANE study, the domain names listed in the MX records were targeted. If at least one of the domain names listed in the MX record supports DANE, the domain name is determined to have adopted DANE.

Fig. 1. Fractions (%) of the domain names with valid BIMI records. \(10^n\) represents the logarithmic rank interval ranging from the \(10^{n-1}+1\)th domain to the \(10^n\)th domain.

4 Understanding BIMI in the Wild

In this section, we report on our measurement study of the adoption of BIMI in the wild and its correlation with other DNS-based email security mechanisms described in Sect. 2. We also investigate cases where BIMI has been used in attacks.

4.1 Adoption of BIMI

Among the one million Tranco domain names examined, 3,538 had BIMI records (Level 1). We obtained logos from 3,034 domain names (Level 2). Surprisingly, however, only 396 of these domain names had a valid VMC available for download (Level 3). We believe that so few domain names have a correctly configured VMC because obtaining one is costly: the brand logo must be registered as a trademark, and the certificate must be issued by a third-party organization after an examination. We expect that this non-trivial cost of operating BIMI will serve as a barrier to attacks that exploit BIMI using fake logos.

Figure 1 presents the fraction of BIMI-compatible domain names (Level 1 and above) in each logarithmic rank interval, where the rank indicates the popularity of the website corresponding to the domain name on the Tranco list. As expected, the higher the ranking of a domain name, the higher the rate of BIMI adoption; among the top-100 domain names, more than 10% have configured a valid BIMI record. On the other hand, a certain number of domain names with low rankings have also adopted BIMI, suggesting that the use of BIMI is spreading. For reference, we analyzed the breakdown of the domain names that have configured BIMI. The results are shown in the Appendix.
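The per-interval adoption rate can be computed as in the following sketch. The function names and the toy input are hypothetical, and placing rank 1 in the first interval is an assumption about how the boundary case is handled.

```python
from collections import Counter


def rank_bucket(rank: int) -> int:
    """Return n such that 10^(n-1) < rank <= 10^n (rank 1 falls in n=1)."""
    n, upper = 1, 10
    while rank > upper:
        n += 1
        upper *= 10
    return n


def adoption_by_bucket(ranked_domains: list, bimi_domains: set) -> dict:
    """Percentage of domain names with a BIMI record per rank interval."""
    totals, hits = Counter(), Counter()
    for rank, domain in enumerate(ranked_domains, start=1):
        n = rank_bucket(rank)
        totals[n] += 1
        hits[n] += domain in bimi_domains
    return {n: 100.0 * hits[n] / totals[n] for n in sorted(totals)}
```

With the full Tranco list as `ranked_domains` and the set of Level 1 domain names as `bimi_domains`, the result maps n = 1, ..., 6 to the fraction plotted for the interval labeled \(10^n\).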

4.2 Correlations Between BIMI and Other DNS-Based Email Security Mechanisms

We analyzed the correlation between BIMI and other DNS-based email security mechanisms, i.e., whether they are simultaneously employed. Table 2 presents the results. “MX-enabled” indicates that the results are restricted to domain names for which MX records existed. As described in Sect. 2.2, before an email recipient retrieves BIMI data for a domain name, the domain name must pass DMARC authentication, and the configured policy must be “quarantine” or “reject.” Therefore, a high percentage of BIMI-enabled domain names have adopted SPF and DMARC.

We found that more domain names configure BIMI than configure MTA-STS or TLSRPT. This result suggests that BIMI is attracting the attention of more domain name administrators despite being a relatively new security mechanism. If a domain name operates BIMI at Level 3 together with DANE, the domain name has an extremely high security level. We found that only two domain names meet these criteria. DANE requires DNSSEC [18, 28, 31, 33,34,35] settings, which are difficult to configure.

Table 2. Correlations of the email security mechanisms: BIMI vs. other mechanisms. The rows indicate other email security mechanisms and the columns indicate the BIMI setting level. The numerical values in the table indicate the number of domain names.

4.3 Attacks Exploiting BIMI

We applied BIMI record lookups on the domain names of phishing emails and websites, which we describe in Sect. 3.1. We found no BIMI records for 114,915 domain names in the two datasets combined; that is, as of today, we have not observed any phishing attempts that exploit BIMI records. We expect that this observation is due to the fact that the trademark registration process contributes to raising barriers to BIMI record operations. However, there is no assurance that BIMI-abusing domain names will not appear in the future, and it is therefore necessary to keep a close watch on this aspect.

5 Incorrect BIMI Configurations

In this section, we present a measurement study focused on the typical incorrect configurations of BIMI records, logo images, and VMC.

5.1 BIMI Record

We first study the inherent configuration errors we found with respect to the format of the BIMI records collected. It is meaningful to summarize such information and share explicit knowledge of the mistakes that administrators are prone to make.

Logo Setting: Two of the domain names did not have a field to set the logo. In one of these two cases, only a link to the certificate existed. In addition, although 11 domain names had a field for setting a logo, the content was empty, where the empty content in the logo setting field indicates that the domain name in question explicitly refuses to participate in BIMI.

Use of HTTP: Five domain names used http instead of https in their logo URLs. None of the five domain names had a URL for the certificate. Similarly, one domain name used http in the URL pointing to the certificate. That URL pointed to the Let’s Encrypt server rather than to a certificate.

Typos: Six domain names incorrectly used I= (uppercase i) instead of l= (lowercase L) as the field for setting the logo. The certificate link did not exist for any of the six domain names.

Unnecessary Brackets: One domain name enclosed the logo URL in square brackets, as in l=[<logo link>]. The domain name in question did not have a certificate link set.

Invalid String: Two domain names existed, in which invalid character strings were set in records that should describe the URLs.

These misconfigurations were found in domain names that had set only a logo or had not set a logo at all.
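The error patterns above lend themselves to a simple automated check. The following is a minimal sketch of such a linter, not the tool used in this study; the function names and problem labels are hypothetical, and the record is assumed to contain no semicolons inside URLs.

```python
def parse_bimi(record: str) -> dict:
    """Split a BIMI record into tags, keeping key case so 'I=' is visible."""
    tags = {}
    for part in record.split(";"):
        key, sep, value = part.partition("=")
        if sep:
            tags[key.strip()] = value.strip()
    return tags


def lint_bimi(record: str) -> list:
    """Return labels for the misconfiguration patterns found in a record."""
    tags = parse_bimi(record)
    problems = []
    if "l" not in tags:
        # Distinguish the uppercase-I typo from a missing logo field.
        problems.append("uppercase-I-typo" if "I" in tags else "missing-logo-field")
    for key in ("l", "a"):
        url = tags.get(key, "")
        if not url:
            continue  # an empty l= declines BIMI; a= is optional
        if url.startswith("[") or url.endswith("]"):
            problems.append(f"{key}: bracketed-url")
        elif url.lower().startswith("http://"):
            problems.append(f"{key}: http-scheme")
        elif not url.lower().startswith("https://"):
            problems.append(f"{key}: invalid-string")
    return problems
```

For example, `lint_bimi("v=BIMI1;I=https://example.com/logo.svg")` returns `["uppercase-I-typo"]`, while a record with https links in both l= and a= returns an empty list.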

5.2 Brand Logo Image

We analyzed logo images in SVG format retrieved from the URLs listed in the BIMI records. A total of 3,034 logo images were analyzed. In the following, we show the cases that violated the mandatory and recommended conditions described in the Internet Draft [17] of SVG shown in Sect. 2.2. Of the domain names with VMC configured, only five domain names failed to configure SVG in the correct format.

Title Tag (mandatory): There were 1,008 (33%) logo images without title tags. In addition, two images with empty title tags were found.

SVG Tag (mandatory): There were 1,224 (40.3%) logo images that did not conform to the svg tag format.

Desc Tag (recommended): A total of 2,905 (95.7%) logo images did not contain a desc tag.

Image Size (recommended): In total, 241 (7.9%) logo images exceeded the recommended 32 KB.

Aspect Ratio (recommended): Logos displayed on email clients are often circles or squares. It is therefore recommended that the aspect ratio of the logo be 1:1 [1]; 496 (16.3%) of the logo images did not have this aspect ratio.
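The mandatory and recommended conditions from Sect. 2.2 can be checked with a standard XML parser, as in the sketch below. This is an illustration under assumptions, not the checker used in this study: the function name and finding labels are hypothetical, title and desc are looked up only as direct children of the svg element, and the aspect-ratio condition is not covered.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
MAX_BYTES = 32 * 1024  # recommended upper bound on logo size


def check_logo(svg_bytes: bytes) -> dict:
    """Report violated mandatory and recommended SVG Tiny P/S conditions."""
    findings = {"mandatory": [], "recommended": []}
    try:
        root = ET.fromstring(svg_bytes)
    except ET.ParseError:
        findings["mandatory"].append("not-well-formed-xml")
        return findings
    # ElementTree encodes the namespace in the tag as '{namespace}name'.
    if root.tag != f"{{{SVG_NS}}}svg":
        findings["mandatory"].append("root-not-svg-or-wrong-xmlns")
    if root.get("version") != "1.2":
        findings["mandatory"].append("version-not-1.2")
    if root.get("baseProfile") != "tiny-ps":
        findings["mandatory"].append("baseProfile-not-tiny-ps")
    title = root.find(f"{{{SVG_NS}}}title")
    if title is None or not (title.text or "").strip():
        findings["mandatory"].append("title-missing-or-empty")
    elif len(title.text.strip()) > 64:
        findings["recommended"].append("title-longer-than-64")
    if root.find(f"{{{SVG_NS}}}desc") is None:
        findings["recommended"].append("desc-missing")
    if len(svg_bytes) > MAX_BYTES:
        findings["recommended"].append("larger-than-32KB")
    return findings
```

A logo passes when both lists are empty; a logo that only lacks a desc tag yields an empty "mandatory" list and `["desc-missing"]` under "recommended".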

5.3 VMC

We analyzed VMCs obtained from the URLs listed in the BIMI records. The analysis covered 396 certificates collected from domain names with Level 3 BIMI settings, as shown in Table 1.

Certificate Issuer: Table 3 shows a breakdown of the issuers of the collected certificates. Currently, certificates issued by parties other than Entrust and DigiCert are invalid for BIMI; we found five such certificates. These certificates did not contain logo images, whereas all certificates issued by Entrust and DigiCert contained image data.

Table 3. Frequencies of issuers.
Table 4. BIMI configuration policies for the target domain (rows) vs. subdomains (columns).

Certificate Validity Period: We analyzed the validity period of the collected certificates and found that 13 certificates had expired. One of these belongs to the domain name entrustdatacard.com, which was used by Entrust. The domain name now redirects to https://www.entrust.com/. However, its BIMI record, logo, and certificate link are still accessible.

Legitimacy of Images Extracted from the VMC: We verified whether the 391 logo images extracted from the collected VMCs matched the logos collected from the URLs listed in the BIMI records. We found 45 domain names for which there was a difference between the two logo images. The differences included the use of completely different images, the presence of line breaks in the files, differences in the image size, and differences in the SVG titles.
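A comparison of this kind can be sketched as follows. The sketch assumes that the logo has already been extracted from the certificate as a data URI carrying a base64-encoded, optionally gzip-compressed SVG; the extraction itself is omitted, the function names are hypothetical, and the whitespace normalization is a crude stand-in for the manual classification of differences described above.

```python
import base64
import gzip


def svg_from_data_uri(uri: str) -> bytes:
    """Decode a 'data:image/svg+xml;base64,...' URI into SVG bytes."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:image/svg+xml") or "base64" not in header:
        raise ValueError("unexpected data URI header")
    raw = base64.b64decode(payload)
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        raw = gzip.decompress(raw)
    return raw


def compare_logos(vmc_svg: bytes, served_svg: bytes) -> str:
    """Classify the difference between the VMC logo and the served logo."""
    if vmc_svg == served_svg:
        return "identical"
    # Dropping all ASCII whitespace hides line-break-only differences.
    if b"".join(vmc_svg.split()) == b"".join(served_svg.split()):
        return "whitespace-only"
    return "different"
```

Pairs classified as "different" would still need inspection to tell a changed title or image size apart from a completely different image.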

5.4 Violation of DMARC Policy

We analyzed the DMARC policies of the 396 domain names with Level 3 BIMI settings. Of these, four domain names did not have a DMARC configuration. Table 4 presents the results of the analysis of the DMARC policy settings. The rows represent the configuration policies for the target domain names, and the columns represent the configuration policies for the subdomain names. In the table, bold numbers indicate policy violations; there were 12 such violations.

6 Discussion

6.1 Current Status of BIMI

Here, we discuss three perspectives on the current status of BIMI as revealed by our results.

Prevalence of BIMI: Compared with other security mechanisms, BIMI has a relatively high adoption rate despite its novelty (see Sect. 4.2). This is because BIMI is relatively easy to set up, and includes setting up the BIMI records and registering the SVGs. However, our results show that only a small fraction of domain names are correctly configured up to VMC. This is because setting up a VMC increases the difficulty of setting up BIMI and incurs certain financial costs.

Misconfiguration of BIMI: Currently, many documents on the Web explain how to configure BIMI, and we assume that domain name administrators refer to these documents when setting up BIMI. However, given the misconfigurations we observed, it is highly likely that such documents do not adequately introduce the SVG conversion tool [2] and the BIMI configuration check tools [4, 5, 7]. In the future, further dissemination of these tools is essential to reduce BIMI misconfiguration by domain name administrators and to enable them to self-check whether the correct settings have been made.

Abuse of BIMI: The results of our study show that there is still no evidence of BIMI abuse in phishing emails or in the domain names of phishing sites. BIMI is not yet fully deployed, even for well-known services, and end users are not yet familiar with BIMI. Thus, there is no advantage for attackers in configuring BIMI. However, there is no guarantee that attackers will not continue to implement BIMI abuse in the future. It is therefore necessary to continuously monitor the existence of BIMI abuses.

Challenges for BIMI to Scale: Our measurement study revealed that the adoption of BIMI is not high at the present time. In the following, we discuss approaches that may be effective in increasing the adoption of BIMI. We examined information about MUAs that have implemented BIMI and the categories of domain names that have registered BIMI (see the Appendix for details). First, we found that some of the major MUAs we surveyed do not currently support BIMI. We hope that MUA vendors will understand the effectiveness of BIMI for protecting their users and implement it in the near future. It is also important for MUAs to provide a usable interface for displaying BIMI so that end-users can recognize and utilize BIMI correctly. In addition, to increase the number of BIMI-compliant domain names, it would be effective to reduce the cost of setting up BIMI [42]. We expect that the availability of open tools and knowledge of the procedures required to register BIMI will increase its popularity.

6.2 Limitations

Our study has the following three limitations. First, we sent only a minimum number of queries (up to three) to avoid overloading the target. This means that if the target server was offline during our study, the data might not have been correctly retrieved. Second, our study investigated only specific selectors for BIMI and DKIM. Therefore, a surveyed domain name that uses other selectors, for example an individual selector for each sending destination, may have been judged as unsupported in our study. Finally, our study did not clarify the current status of BIMI from the viewpoint of administrators and email recipients. To investigate the current issues in setting up BIMI and the effectiveness of BIMI from the viewpoint of the recipients, it is necessary to conduct an interview study.

6.3 Possibility of Registering Fake Logos

To register a brand logo with BIMI and obtain a legitimate certificate, the logo must be registered as a trademark. This is expected to make the registration of fake logos more difficult. By contrast, approximately 90% of the domain names that currently have BIMI records operate BIMI without valid certificates. It has also been pointed out that some email clients display BIMI brand logo images without certificate validation [3]. Based on this background, we investigated whether there were any cases of fake logos registered with BIMI. We employed a perceptual hash (pHash) [11], which computes the similarity between two images and is widely used to detect copyright infringement. The analysis revealed several cases in which the same logo was used for multiple domain names. Most of these cases involve the use of several different TLDs for the same service, such as amazon.com and amazon.co.uk. By contrast, there was one domain name using the DigiCert logo for a completely different service, which we concluded was a misconfiguration. At this point, no obviously fake logos have been found, although we plan to monitor this situation closely.
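To make the similarity computation concrete, the following sketch implements a simplified DCT-based perceptual hash in pure Python. It is not the pHash library cited above: it assumes the logo has already been rasterized and scaled to a 32x32 grayscale matrix, uses an unnormalized DCT-II, and the function names are hypothetical.

```python
import math


def dct_lowfreq(pixels: list, keep: int = 8) -> list:
    """Return the keep x keep lowest-frequency 2D DCT-II coefficients."""
    n = len(pixels)
    cos = [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
           for u in range(keep)]
    # Transform along rows first, then along columns.
    rows = [[sum(row[x] * cos[v][x] for x in range(n)) for v in range(keep)]
            for row in pixels]
    return [[sum(rows[y][v] * cos[u][y] for y in range(n)) for v in range(keep)]
            for u in range(keep)]


def phash(pixels: list) -> int:
    """64-bit hash: one bit per low-frequency coefficient above the median."""
    coeffs = [c for row in dct_lowfreq(pixels) for c in row]
    ac = sorted(coeffs[1:])  # exclude the DC term from the median
    median = ac[len(ac) // 2]
    bits = 0
    for c in coeffs:
        bits = (bits << 1) | int(c > median)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")
```

Two logos are treated as similar when the Hamming distance between their hashes is small; the threshold is a tuning parameter that this sketch does not fix.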

6.4 Ethical Considerations

Our measurement study discovered several domain names with incorrect BIMI settings. As an ethical consideration, we decided to notify the administrators of those domain names to prevent their misuse. In particular, we are in the process of making a responsible disclosure to the administrators of domain names with VMC configured but with some misconfiguration. We also plan to notify the administrators of domain names that have only SVG configured.

7 Related Work

Several measurement studies have been conducted on DNS-based email security mechanisms. This section divides such studies into two broad categories: those that focus on SPF, DKIM, and DMARC, and those focusing on other areas.

SPF, DKIM, and DMARC: In 2011, Mori et al. conducted an early study on SPF implementation by investigating the existence of SPF and the errors found in SPF policies [30]. In 2015, Durumeric et al. measured email servers supporting SPF, DKIM, and DMARC by analyzing SMTP connections on Google’s email servers [20]. In 2015, Foster et al. investigated the prevalence of SPF and DMARC from the perspective of email providers [21]. Hu et al. studied the state of support for SPF, DKIM, and DMARC in 35 email providers in 2018, and conducted a phishing email measurement with end-users [24]. Deccio et al. measured the latest status of SPF, DKIM, and DMARC on several email servers in 2021 [19]. In 2021, Tatang et al. reported a continuous investigation, spanning 1.5 years, of the status of SPF, DKIM, and DMARC support for domain names listed in multiple top lists [40]. Wang et al. conducted measurements of DKIM deployments using a 5-year Chinese passive DNS dataset from 2015 to 2020 and server logs of a Chinese email provider in 2020 [41].

Others: In addition, measurement studies have been conducted on other individual protocols (see Sect. 2.1). In 2018, Scheitle et al. were the first to examine the deployment of CAA records [37]. In 2020, Lee et al. conducted an extensive study to determine how widely DANE is deployed and how it is managed on both the server and client sides [27]. Tatang et al. conducted the first large-scale measurement study of MTA-STS adoption in 2021 [39]. In 2021, Yajima et al. measured the adoption rates of the security mechanisms DNSSEC, DNS Cookies, CAA, SPF, DMARC, MTA-STS, DANE, and TLSRPT [42].

None of the studies above mentioned any quantitative results for BIMI, which is just beginning to spread, and our study is the first BIMI measurement approach as of November 2022.

8 Conclusion

In this study, we conducted the first large-scale measurement of BIMI in the wild. We investigated the prevalence of BIMI in one million domain names and found that 3,538 already had BIMI records, despite the BIMI mechanism having only recently begun to be used. We also found that there are intrinsic misconfiguration patterns and specification violations in BIMI records, logos, VMCs, and DMARC configurations. In addition, no evidence of BIMI abuse was found during our investigation. In preparation for the widespread use of BIMI, future work includes developing a tool that enables domain name administrators to configure BIMI easily and properly, conducting interviews with both domain name administrators and email users on the incentives for adopting and leveraging BIMI, and continuously measuring the adoption status of BIMI. We hope that the findings derived from our measurement study of BIMI will contribute to its further spread and help reduce the damage caused by phishing attacks.