Introduction

Although sustainability assurance is often voluntary (Footnote 1), the number of companies obtaining third-party assurance for their sustainability reports has increased (KPMG 2017). However, considerable differences exist in the way assurance services are conducted, for example, depending on the type of provider (accountant vs. consultant; O’Dwyer and Owen 2005) and the specific topics being assured. This means that sustainability assurance offers considerable flexibility for firms, which raises questions of unethical interference by management regarding the choice of a limited set of sustainability topics to be assured and how clearly this choice is communicated.

The most widely used sustainability assurance standards (International Standard on Assurance Engagements [ISAE] 3000 and the AccountAbility 1000 Assurance Standard [AA1000AS]; Footnote 2) recommend that sustainability assurance follow the principle of materiality (Footnote 3). A sustainability topic is material if (misstated) information about this topic has the potential to influence the decisions of intended users, such as investors and other stakeholders (Canning et al. 2019, p. 6). However, firms are not obliged to request assurance for the most material topics. Furthermore, the principle of materiality does not offer strict implementation guidance (Reimsbach et al. 2020).

In practice, firm and assurer jointly determine the intensity and scope of the performed assurance process, which influences the likelihood of discovering problematic issues in a sustainability report (Hummel et al. 2019, p. 736). Differences in assurance depth can be manifest in the selected topics of a sustainability report, for example, whether a firm chooses more or fewer material topics to be assured. In our experiment, we focus on this particular aspect of assurance depth, while leaving other aspects of assurance (e.g., assurance methods, level, recommendations, and coverage) constant (Hummel et al. 2019, pp. 743–744). We thus define assurance depth by the choice of assuring more or fewer material sustainability topics.

In this context, Cooper and Owen (2014, p. 78) criticized the fact that management can strategically influence the assurance process through its choice of assurance depth. This raises questions of unethical interference by management regarding how clearly it communicates which limited set of sustainability topics was assured. We investigate this behavior as an example of managerial capture (Footnote 4) in sustainability assurance engagements (Hummel et al. 2019; Owen et al. 2000; Smith et al. 2011).

Selecting assured topics is not per se an unethical interference. However, if its application is misused, for example, by an intentionally unclear communication or obfuscated scope of assurance, it may contradict ethical assurance practices and no longer foster the credibility of sustainability reporting. We use the term reference explicitness to capture whether firms choose to indicate the assured topics in a more or less explicit form, meaning whether the choice of the assured topics is indicated less clearly via verbal information cues in the assurance report or more clearly via visual information cues throughout the sustainability report. We thus define reference explicitness by the communication choice of assurance, using visual or verbal information cues.

Note that no uniform standard exists for clearly marking and referencing which topics have actually been subject to assurance. Therefore, management’s choices of reference explicitness can make it harder or easier for the reader to correctly interpret the quality signal of sustainability assurance. Specifically, we address the following research question: “How do two strategic choices by management (reference explicitness and assurance depth) influence sustainability report readers’ credibility perceptions?”

Adopting signaling theory (e.g., Jensen and Meckling 1976; Spence 1973), which is receiving increasing attention in the scholarly discourse (see Hahn and Reimsbach 2020), we examine the receiver’s perspective of sustainability assurance. We critically assess sustainability assurance and investigate whether it truly signals credible information or, on the contrary, provides room for the unethical behavior of false signaling (Connelly et al. 2011, p. 45). This is important because readers rely on sustainability assurance when evaluating the information in sustainability reports (Hodge et al. 2009). The main purpose of sustainability assurance is to contribute to completeness and transparency of sustainability information and thereby increase its credibility (O’Dwyer 2011). If sustainability assurance is only disseminating the information that enhances the corporate image, rather than a true and complete picture (Owen et al. 2000, p. 85), it might only serve symbolic purposes (Ball et al. 2000; Gray 2000). If this is the case, its primary goal is to include elements that falsely signal credibility to the reader of the sustainability report (Shen et al. 2017, p. 6), while keeping efforts and costs for the report low (Hummel et al. 2019, p. 733).

We investigate credibility perceptions for specific strategic choices of voluntary sustainability assurance, since credibility is a central aspect in the sustainability assurance literature. Gürtürk and Hahn (2016, p. 39) observed substantial differences in the choices of sustainability assurance concerning the assured content and the communication of assurance processes in the assurance report. However, it remains unclear how readers perceive differences in the choice of assured topics and the communication of this selection.

We connect to existing literature that focuses on the quality of the assurance process in terms of assurance process depth (Hummel et al. 2019) and analyze for which communication choices we observe a distortion of sustainability report readers’ perceptions (Neu 1991; Neu et al. 1998). To this end, we use an experimental 2 × 2 + 1 between-subjects design and focus on reference explicitness and assurance depth as two strategic choices by management when assigning sustainability assurance.

Our results show that for sustainability assurance with low reference explicitness, increasing the assurance depth leads to higher perceived credibility values. Interestingly, high reference explicitness leads to a reverse effect in which an increase in assurance depth causes a drop in perceived credibility. This indicates an underlying interaction effect that we interpret through the theoretical lens of signal interpretation. High reference explicitness is misinterpreted by the readers of sustainability reports as a false signal. This helps to explain the reason why only a low portion of companies choose to explicitly indicate the assured sustainability topics via visual cues (Gürtürk and Hahn 2016).

We contribute to the literature on sustainability assurance and ethical assurance practices in several ways. First, we extend prior research on sustainability assurance (e.g., Fuhrmann et al. 2017; Hodge et al. 2009; Manetti and Becatti 2009; Maroun 2020; Perego and Kolk 2012) by specifically investigating the practice of assuring only selected topics of a sustainability report. Second, extant literature comprises few studies that explicitly consider the communication of assurance (Gürtürk and Hahn 2016; Mock, Rao and Srivastava 2013; Mock, Stohm and Swartz 2007). To the best of our knowledge, this is the first study to examine different degrees of reference explicitness in the context of sustainability assurance. Sustainability assurance needs to be presented in a transparent and unambiguous form to reduce the risk of a misinterpretation of assurance by potential investors, audit providers, and companies. Third, our experimental design enables us to demonstrate that reference explicitness and assurance depth interact with each other. We reflect the results for these strategic management choices against the unethical practice of false signaling. We add to the literature on managerial capture (Hummel et al. 2019; Owen et al. 2000; Smith et al. 2011) by experimentally examining potential distortions of readers’ credibility perceptions for variations of reference explicitness and assurance depth. Fourth, we contribute to studies investigating signaling theory in the context of sustainability assurance (e.g., Cheng et al. 2015; Clarkson et al. 2019; Hummel et al. 2019; Zerbini 2017).

Related Literature, Theory, and Hypothesis Development

Related Literature

Prior research typically differentiates between firms with and without sustainability assurance (Cheng et al. 2015; Coram et al. 2009; Pflugrath et al. 2011; Reimsbach et al. 2018). Some studies have focused on general assurance characteristics (e.g., type of assurance provider, assurance standard applied) or firm characteristics (e.g., industry, country level, sustainability performance; Casey and Grenier 2015; Cho et al. 2014; Kolk and Perego 2010; Simnett et al. 2009). Other research focuses on capital market consequences of sustainability assurance (Fuhrmann et al. 2017). Previous qualitative research has investigated the content of assurance reports (Ball et al. 2000; Gürtürk and Hahn 2016; Manetti and Becatti 2009; Mock et al. 2007; Perego 2009; Perego and Kolk 2012) and the process of assurance engagements (Canning et al. 2019; O’Dwyer 2011).

However, only a few studies have focused on strategic aspects in assigning sustainability assurance (Callery and Perkins 2020, pp. 25–26). As sustainability assurance is often voluntary, the firm and the assurer jointly negotiate the terms of assurance. Previous studies have criticized the degree of independence of assurers and, therefore, have also questioned audit quality (Fonseca 2010, p. 359) and managerial and professional capture (Smith et al. 2011). For example, Cooper and Owen (2014, p. 78) highlighted that management can strategically influence the assurance process. Management can select the assurance provider and restrict the topics of the assurance process. Based on a deductive content analysis of 61 assurance statements, Gürtürk and Hahn (2016, p. 35) noted that the majority of the assurance statements (87%) omitted parts of the sustainability report and only assured selected topics or sections. In other words, management can restrict assurance processes to selected topics of a sustainability report that are most beneficial for the corporate image. Such an example of managerial capture reduces transparency and accountability at the expense of ethical assurance conduct (Hummel et al. 2019, p. 734).

Hasan et al. (2003) investigated the wording of assurance statements in the context of environmental reporting. They analyzed whether different limited assurance reporting forms (opinion on procedures, negative assurance, positive assurance, and positive assurance with a limitations paragraph) lead to significantly different perceptions by shareholders. Their findings show that assurance statement readers are able to recognize differences in assurance levels and risks. In a later study, Hasan et al. (2005) investigated the determinants and communication of different levels of assurance. Their research shows that companies use inconsistent approaches to communicate limited assurance, and therefore readers have difficulties in differentiating between a high and limited level of assurance (Hasan et al. 2005, p. 100).

Hodge et al. (2009) built on a similar idea. They conducted an experimental study with 126 master’s students in economics to compare different aspects of assurance (audited vs. unaudited, limited vs. reasonable assurance, accounting firm vs. specialist consultant) and analyzed whether the level of assurance is related to the perception of the credibility of a sustainability report. Contrary to Hasan et al.’s (2003) findings, report readers were not able to distinguish the difference between different levels of assurance. The authors were not able to find a significant effect on credibility for the level of assurance. However, the authors found a significant interaction effect for level of assurance and assurance provider.

To summarize, previous research on determinants and consequences of sustainability assurance has typically differentiated between firms with and without sustainability assurance. Some studies have also focused on additional assurance characteristics. In light of the fact that the vast majority of sustainability assurances are limited, it remains unclear how readers perceive differences in the choice of assured topics and the communication of this selection. By reflecting on signaling theory in an experimental approach, we aim to close this research gap. Given that prior research on the quality of the assurance process in terms of assurance process depth is inconclusive (Hummel et al. 2019), we extend this research field by investigating the communication of the conducted assurance processes (Gürtürk and Hahn 2016).

Signaling Theory

Previous studies have already highlighted the value of assurance as a signal in the context of financial reporting (e.g., Datar et al. 1991; Healy and Palepu 2001; Lys et al. 2015; Verrecchia 1983). We contribute to the limited number of studies discussing signaling theory with regard to the assurance of sustainability reports (e.g., Cheng et al. 2015; Clarkson et al. 2019; Hummel et al. 2019; Zerbini 2017) and thereby also follow a call to further build and expand signaling theory (Hahn and Reimsbach 2020).

It is a basic premise of signaling theory that the signaler wants to signal “quality” to the receiver. Signals of quality describe the “underlying, unobservable ability of the signaler to fulfill the needs or demands of an outsider observing the signal” (Connelly et al. 2011, p. 43). Transferred to the context of sustainability reporting, the management of the firm attempts to signal the credibility of their sustainability report (see Fig. 1).

Fig. 1 Signaling model of sustainability assurance, adapted from Connelly et al. (2011, p. 44)

Assurance can be regarded as a quality signal of the information provided by the firm, indicating information credibility and thereby counteracting an accusation of greenwashing (Cheng et al. 2015, pp. 136–137). Assurance can also reflect higher sustainability-related capital expenditures, therefore higher environmental performance (Zhou et al. 2016, p. 152) and higher future corporate value (Mock et al. 2007, p. 70).

Not only is information content signaled to the receiver but also information about the signaler’s intent which can influence the perception of the receiver (Stiglitz 2000). For example, the mere decision of a firm to obtain assurance already signals that management is willing to reduce information asymmetries (Kausar et al. 2016).

The receiver as the second actor in the signaling model (see Fig. 1) is usually outside the firm and therefore lacks insider information (Connelly et al. 2011, p. 40; Spence 2002). While the signaler decides what and how to communicate to the receiver through the signal, the latter needs the information provided through the signal to be able to make a better informed decision (Connelly et al. 2011, p. 39). In the context of sustainability reporting, varied readers (reflected in different stakeholder groups; Morris 1997) receive assurance signals. They have to interpret the assurance signal to decide whether it increases the credibility of the information in the sustainability report. Mercer (2004, p. 186) defined disclosure credibility as “[investors'] perceptions of the believability of a particular disclosure.” The perceived credibility of a sustainability report can influence subsequent stakeholder concerns and decisions, like risk and performance evaluations.

Signal costs are another important feature of signaling theory and are associated with firm costs to create the signal (Certo 2003, p. 434; Connelly et al. 2011, p. 52). This element of signaling theory is also important in the assurance context. Assurance statements are costly signals due to the payment of audit fees and the time and effort spent by management on preparing for and engaging in the assurance (Kausar et al. 2016). Organizations acquiring high-quality (i.e., costly) assurance services send a stronger signal to investors and other stakeholders (Cheng et al. 2015, pp. 136–137; Simnett et al. 2009). Following the notion of signaling theory, different levels of assurance reflect different signal costs. Therefore, comparisons of assurance costs and expected benefits also influence the requested depth of sustainability assurance (Casey and Grenier 2015, p. 98; Cohen and Simnett 2015, p. 62; Hummel et al. 2019, p. 737).

Hypothesis Development

Credibility Hypothesis

Although sustainability reports have received much scholarly attention, few studies have analyzed credibility perceptions of the reader beyond a conceptual level (Lock and Seele 2017, p. 585). However, the information in a sustainability report will only be relevant and useful for stakeholders if it is perceived as credible. Dando and Swift (2003) identified a “credibility gap,” in which low levels of public trust and confidence undermine the value of corporate reporting. Michelon et al. (2015, p. 59) explained the reasons for the credibility gap as a lack of completeness and quality of the disclosure. Boiral et al. (2019) referred to biased and overly positive reporting in sustainability reports as an additional reason for the low credibility of sustainability reports. Indeed, sustainability reports are often perceived as marketing tools used to present a favorable picture of the organization and to strengthen the firms’ legitimacy and reputation (Hahn and Lülfs 2014, p. 402). In this context, sustainability assurance can be used to lower the credibility gap through its function as a quality signal. Indeed, empirical studies confirm a positive relation between sustainability assurance and credibility (Kolk and Perego 2010; Simnett et al. 2009). To the contrary, Kuruppu and Milne (2010) reported mixed evidence of assurance on beliefs about the credibility of the presented information.

As sustainability reports are mainly communication instruments of the firm (Hooghiemstra 2000), we refer to the credibility construct following the Habermasian theory of communicative action (Habermas 1984). Lock and Seele (2017) provided a new measurement of the “multidimensional perception construct” of credibility (p. 586), referring to Habermas’s ideal speech situation and operationalizing it in four sub-dimensions (p. 585). According to Habermas (1984), the validity claims of the ideal speech situation are fulfilled when the information is true (truth), the speaker is sincere (sincerity), the communication is appropriate in its normative context (appropriateness), and understandable to the reader (understandability).

Sustainability assurance by an independent third party improves disclosure quality and increases the probability of finding misstatements and omissions (Hodge et al. 2009, p. 181). When readers, as the receivers of the signal, perceive sustainability assurance as a positive quality signal, it should result in a higher credibility evaluation compared to cases in which no sustainability assurance is provided. This leads to the following hypothesis:

Credibility Hypothesis (H1)

Report readers perceive an assured sustainability report as more credible than a sustainability report without any assurance.

Reference Explicitness Hypothesis

Following signaling theory, one of the most important characteristics of efficacious signals is signal observability (Connelly et al. 2011, p. 45; Footnote 5). Signal observability “refers to the extent to which outsiders are able to notice the signal” (Connelly et al. 2011, p. 45). Accordingly, signals can be characterized by how easily they can be detected by the receiver (Connelly et al. 2011, p. 53). In the context of sustainability assurance, signal observability can be captured by reference explicitness. Reference explicitness refers to the communication choice of assurance and differentiates whether a firm indicates an assured topic via visual or verbal information cues. We argue that a visual assurance signal is more observable than a verbal one. Additional visual information cues are a form of visual emphasis (Merkl-Davies and Brennan 2007); thus, the information is more salient for readers of a sustainability report (Djamasbi et al. 2011; Jarvenpaa 1990). Furthermore, visual formats are more likely to direct the reader’s attention to the information presented (Hellmann et al. 2017; Lurie and Mason 2007); consequently, readers will be able to notice an explicit assurance signal more easily.

In assessing the relationship between reference explicitness and the perception of credibility, we underline how important it is that the reader observes and consciously receives the assurance signal.

An unclear reference can create the impression of an overall assured sustainability report, even though only selected topics have been assured. This concern is also shared by the Institute of Public Auditors in Germany (IDW 2018, p. 4): “Audited and unaudited report elements smoothly merge and are hardly identifiable from each other to the external addressee during reading. […] Report recipients may easily lose track and consider unaudited information to be audited.” (Footnote 6) If readers are unable to distinguish between assured and non-assured topics, they consequently cannot evaluate the assurance signal correctly. This can lead to an incorrect assessment of the credibility of the sustainability report as such.

On investigating this relationship, we refer to an experimental study by Hodge (2001). In this study, participants received a financial statement (assured information) and a letter to shareholders (non-assured information). It became clear that investors wrongly classified the non-assured information as assured information. This led to an overestimation of the credibility of the non-assured information and increased the credibility of the entire information set. Hodge (2001, p. 679) refers to this phenomenon as the credibility inflation effect. We assume that when readers of a sustainability report cannot easily distinguish between assured and non-assured information, this can create a comparable credibility inflation effect.

An accurate, observable, and unambiguous assurance signal should primarily reduce the risk of confusion between assured and non-assured information. When readers in principle are able to differentiate between assured and non-assured information, an explicit reference provides higher transparency of the assurance process and creates higher perceived credibility of the sustainability report. This leads to the following hypothesis:

Reference Explicitness Hypothesis (H2)

Readers of sustainability reports are more likely to perceive the credibility of sustainability reports as higher if the reference to the assurance is made explicitly.

Assurance Depth Hypothesis

Previous studies have found that in practice there is substantial variation in the quality of sustainability assurance. Fonseca (2010, p. 359) attributed this to “ambiguity and diversity in criteria and scope,” whereas Bagnoli and Watts (2017, pp. 205–206) referred, inter alia, to differences in assurance depth.

Especially for limited assurance, management has some leeway to select the topics to be assured. Specifically, a firm could decide to obtain sustainability assurance for topics in which it performs exceptionally well (O’Dwyer 2011, pp. 1249–1250). If companies only assure parts of their sustainability reports, this limits transparency and accuracy (Gürtürk and Hahn 2016, p. 38) and does not contribute to closing the aforementioned credibility gap (Dando and Swift 2003).

In the words of signaling theory, a firm can send readers the signal of assurance, even though only selected topics have been part of the scope of assurance. Consequently, the quality of sustainability assurance as a signal can vary considerably. In terms of signaling theory, we address this issue within the boundaries of signal fit. Signal fit reflects the “extent to which the signal is correlated with unobservable quality” the signaler wants to convey (Connelly et al. 2011, p. 52). In other words, signal fit expresses how well the signaling firm fulfills the quality claim it communicates with the signal. For sustainability assurance, signal fit is highly dependent on the performed assurance engagement. We contribute to the debate on assurance depth because firms can particularly influence this element of sustainability assurance (Hummel et al. 2019, p. 734).

A reduction in assurance depth, without adapting the signal, reduces the underlying quality of the assurance signal. As the signal does not fulfill an equal quality claim, such a change in signal fit should influence credibility perceptions of readers of a sustainability report. We argue that a higher signal fit leads to higher perceived credibility of the sustainability report as reflected in the following hypothesis:

Assurance Depth Hypothesis (H3)

Readers of sustainability reports are more likely to perceive the credibility of sustainability reports as higher if the assurance depth is high.

Interaction Hypothesis

For a sustainability report to be of value, the provided information has to be perceived as plausible and trustworthy (Hahn and Kühnen 2013, p. 14). If a firm uses sustainability assurance as a quality signal for the sustainability report, the assurance itself must be credible (Watson et al. 2002, p. 201).

While true quality is reflected in a credible signal, signaling theory refers to the behavior of “false signaling” if the signaler does not possess the signaled quality but intentionally sends a misleading signal to the receiver (Connelly et al. 2011, p. 45). In such cases, the signaler takes advantage of existing information asymmetries and tries to influence the receiver’s decision to his or her own benefit. In the context of selected sustainability assurance, a firm would send the signal of an assured sustainability report, although not all topics have been subject to assurance. Organizations can take credit for a positive signaling effect, despite having only low assurance depth. The underlying assumption is that, regardless of the applied assurance depth, the mere provision of assurance is sufficient as a signal for receivers (Hummel et al. 2019, p. 738). Readers will perceive the sustainability report as unjustifiably more credible if they do not detect the false signal.

However, receivers can also detect false signaling and impose penalty costs, which are negative reactions of receivers (e.g., negative feedback, consumer boycott, lawsuits; Connelly et al. 2011, p. 61; Gammoh et al. 2006, p. 467). In the context of sustainability assurance, penalty costs are mainly expressed through lower perceived credibility. Once false signals are revealed, they are no longer effective (Alon and Vidovic 2015, p. 340) and subsequent signals will also not be seen as credible (Watson et al. 2002, p. 291).

The relevance of false signaling in the context of sustainability assurance is documented in several academic studies (e.g., Cho et al. 2015; Maroun 2020; Michelon et al. 2015). Findings by Michelon et al. (2015, p. 75) showed that firms do not use assurance as a substantive practice but rather as an instrument to demonstrate sustainability commitment. Maroun (2020) criticized sustainability assurance due to its lack of detail and precision, and Cho et al. (2015) noted that many firms approach sustainability issues on a symbolic level. While independent assurance should alleviate stakeholder concerns regarding corporate greenwashing (Lyon and Maxwell 2011), assurance itself can be misused as a symbolic practice (Michelon et al. 2015). Therefore, Maroun (2020) reasoned that sustainability assurance can also be interpreted as an impression management process.

We argue that the likelihood of the receiver to detect a false signal is influenced by the combination of reference explicitness and assurance depth of the sustainability assurance. To illustrate our point, we refer to assurance signals with low reference explicitness as an example. Potential penalty costs in such a case are lower, as false signaling is not as easily detectable as it is for highly observable signals. On the contrary, false signaling is successful if the firm intentionally combines low reference explicitness and low assurance depth and readers falsely perceive an increased credibility of the sustainability report. Such a cherry-picking behavior of firms, when setting the communication and topic choices of sustainability assurance, creates ethical tensions. It contradicts ethical assurance practices in terms of accountability and transparency. Nevertheless, in the short run, it could positively influence credibility perceptions of sustainability report readers. However, in the long run, applied practices of false signaling might damage the credibility of sustainability assurance as such. To find out how readers react to variations in communication and topic choices, we investigate all different combinations of assurance depth and reference explicitness. This leads to the following hypothesis:

Interaction Hypothesis (H4)

Credibility perceptions will differ based on the interaction of assurance depth and reference explicitness.
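
The notion of an interaction in H4 can be made concrete as a "difference of differences" between the four treatment cells. The following is a minimal sketch under stated assumptions: the function name, the dictionary layout, and the cell means are all hypothetical illustrations and are not taken from the study's data or analysis.

```python
def interaction_contrast(means):
    """Difference of differences for a 2 x 2 design.

    `means` maps (reference_explicitness, assurance_depth) to a mean
    credibility rating. A value of zero means the effect of assurance
    depth is the same at both levels of reference explicitness.
    """
    depth_effect_low_re = means[("low", "high")] - means[("low", "low")]
    depth_effect_high_re = means[("high", "high")] - means[("high", "low")]
    return depth_effect_high_re - depth_effect_low_re


# Made-up placeholder means for illustration only, NOT results from the study.
hypothetical_means = {
    ("low", "low"): 4.0,
    ("low", "high"): 5.0,
    ("high", "low"): 5.0,
    ("high", "high"): 4.0,
}

print(interaction_contrast(hypothetical_means))  # -2.0
```

A nonzero contrast indicates that the effect of assurance depth depends on reference explicitness. Whether such a contrast is statistically meaningful would still require an appropriate test, which this sketch does not perform.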

Research Method

Overview of Experimental Design

This study is based on a 2 × 2 + 1 between-subjects design in which participants were asked to evaluate the credibility of the sustainability report of a fictitious textile company. Between treatment groups (see Fig. 2), we randomly manipulated (1) reference explicitness (low vs. high) and (2) assurance depth (low vs. high) of an attached assurance report. In all conditions, the content of the sustainability report and the sustainability performance of the fictitious company remained constant. This experimental design resulted in four different treatment groups plus an additional control group that evaluated the sustainability report without any assurance. Except for the control group, all assurance reports were subject to a limited level of assurance.
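
The structure of the 2 × 2 + 1 design described above can be sketched in a few lines of Python. This is an illustrative sketch only: the names `FACTORS`, `build_conditions`, and `assign` are hypothetical, and the simple random draw per participant is an assumption that does not necessarily mirror the randomization procedure used in the survey software.

```python
import itertools
import random

# The two manipulated factors and their levels, as described in the text.
FACTORS = {
    "reference_explicitness": ("low", "high"),
    "assurance_depth": ("low", "high"),
}


def build_conditions():
    """Return the four treatment cells plus one no-assurance control group."""
    cells = [
        dict(zip(FACTORS, levels))
        for levels in itertools.product(*FACTORS.values())
    ]
    # Control group: sustainability report without any assurance report.
    control = {name: None for name in FACTORS}
    return cells + [control]


def assign(participant_ids, seed=0):
    """Assign each participant to one condition (assumed: simple random draw)."""
    rng = random.Random(seed)
    conditions = build_conditions()
    return {pid: rng.choice(conditions) for pid in participant_ids}


conditions = build_conditions()
print(len(conditions))  # 5
```

Because every participant sees exactly one condition, the design is between-subjects; the control cell carries no factor levels because no assurance report is shown.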

Fig. 2 Experimental between-subjects design

Participants and Experimental Materials

Participants in our experimental design are postgraduate students in business who proxy for reasonably informed non-professional investors, as one group of sustainability report readers. Non-professional investors have become a significant element in financial markets, and they are an important target stakeholder group for sustainability reporting (Cohen et al. 2011; Hellmann et al. 2017; Reimsbach and Hahn 2015). Using student participants is justified because our experimental task does not entail high integrative complexity (Elliott et al. 2007). For example, assessing the credibility of a sustainability report does not require participants to draw complex connections. Additionally, business graduate students often proxy for non-professional investors in experimental studies (e.g., Cheng et al. 2015, p. 141; Hellmann et al. 2017; Libby et al. 2002).

We asked participants to complete an online questionnaire via Qualtrics and distributed the survey link to participants enrolled at universities in Germany and Australia. We recruited participants through advertisements communicated in classes at the universities. Following the advertisement, 220 participants commenced the questionnaire in class. Of those, 157 completed it, which represents a dropout rate of 0.29 (63 of 220).
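
The reported dropout rate follows directly from the two participant counts given above. As a quick arithmetic check (variable names are illustrative):

```python
commenced = 220  # participants who started the questionnaire
completed = 157  # participants who finished it

dropouts = commenced - completed
dropout_rate = dropouts / commenced

print(dropouts)                # 63
print(round(dropout_rate, 2))  # 0.29
```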

The experiment started with some general instructions and background information for participants. The textile company ABC was presented, along with a list of its material topics (see Appendix 1). The remaining part of the questionnaire consisted of three sections. The first section contained a short excerpt of a sustainability report, comprising two environmental and two social sustainability topics and (except for the control group) an assurance report (see Appendix 2). Depending on their randomly assigned treatment group, participants were shown a report with different communication and topic choices of sustainability assurance. Across treatment groups, the sustainability report and the assurance report presented identical information; the sole difference between conditions was the depth of the assurance performed and the way assurance was referenced.

While disclosed negative information is per se more credible (Cho et al. 2013), prior studies have demonstrated the dominance of positive information in sustainability disclosure (e.g., Holder-Webb et al. 2009; Lougee and Wallace 2008). We presented a sustainability report with overly positive performance for all treatment conditions. In this case, credibility of the sustainability-related information is not very high and assurance is necessary to increase the level of credibility. The sustainability report was anonymized to control for potential bias for or against specific brands. However, the descriptions of the sustainability topics were based on GRI Standards (2018) and sustainability reports of real companies in the textile sector. As the value chain in the apparel and textile industry is oftentimes outsourced to manufacturers and suppliers in emerging markets, this sector is heavily intertwined with environmental, social, and governance issues (SASB 2015). This industry sector has also received wide scholarly attention, especially due to past sustainability scandals (Köksal et al. 2017; Oelze 2017).

The second section consisted of the credibility evaluation of the participants. To measure the perceived credibility, we used an adaptation of the perceived credibility scale developed by Lock and Seele (2017).

The third section (the post-experimental questionnaire) contained a number of questions to check attention and manipulation.Footnote 7 We also measured a set of control variables (familiarity with sustainability reporting, English proficiency) and asked demographic questions (gender, nationality, topic area of degree). We used 7-point Likert-type scales to measure the control variables, anchored at 1 = very unfamiliar/low and 7 = very familiar/high.

Experimental Variables

Reference Explicitness

In our experiment, we operationalized the independent variable reference explicitness by using visual or verbal information cues when communicating sustainability assurance. The assured topics of the sustainability report were either marked with a more explicit visual graphical reference (high reference explicitness) or indicated less explicitly in a verbal text form within the assurance report (low reference explicitness).Footnote 8 A graphical reference in proximity to the topics assured entails a higher visual salience than a verbal format.Footnote 9

Such visual formats have been used in sustainability assurance practice, for example, by the German insurance company Allianz (2016). We aim to analyze how different references to indicate the assured parts of a sustainability report are perceived by the readers of these reports. This contributes to two empirical studies investigating the use of symbols in communicating sustainability assurance. Mock et al. (2007) and Mock et al. (2013) found that 13 out of 130 companies (10%) in the period from 2002 to 2004 and 9 out of 148 reports (6%) in the period 2006–2007 used symbols instead of words to indicate assured statements. We follow Mock et al. (2007, p. 71) who first describe this communication method and different designs of such symbols. Typically, the symbol is placed next to the statement or section in the sustainability report that had been assured. In a more recent study, Gürtürk and Hahn (2016, p. 35) noted that 13% of the 61 analyzed assurance statements used symbols to indicate which information in the report was assured. Such a reference can assist the reader in distinguishing between assured and non-assured topics and “is critical to ensure the user is not misled” (Mock et al. 2007, p. 71). Symbols are used to draw readers’ attention to key data and facts in a sustainability report (Mock et al. 2013, p. 287). Symbols can be used in both limited and reasonable assurance levels. However, to mark the assured topics, they are more useful in cases in which only selected topics are assured. Such symbols should serve, above all, as a clear and transparent communication of the sustainability assurance conducted (Mock et al. 2013, p. 290). It is thus surprising that only a small number of firms use symbols in their sustainability reports.

In the operationalization of reference explicitness, we have chosen a visual graphical reference in the form of a symbol (green tick) to extend the literature on sustainability assurance communication. Directly placed next to the assured topics, the symbol might serve as an additional information cue to direct the reader’s attention to the assured sections.

Assurance Depth

We manipulated the second independent variable, assurance depth, by varying the assured topics of a sustainability report. In their empirical analysis, Hummel et al. (2019, pp. 743–744) derived the variable assurance process depth from qualitative research, which refers to the content of assurance statements. They thereby measured assurance depth based on the elements of scope, methods, level, materiality, recommendations, and coverage. In our experiment, we left the above-mentioned elements constant and modeled assurance depth by the choice of material vs. non-material topics in the conducted sustainability assurance. We applied the definition of material topics provided by the GRI (2018, 101: 1.3.1), that “reflect the reporting organization’s significant economic, environmental, and social impacts [or] substantively influence the assessments and decisions of stakeholders.” In the high assurance depth treatments, the assured topics were equivalent to the aforementioned material topics, while in the low assurance depth treatments, only the non-material topics were assured. The selection of material topics for the fictitious textile company was based on a sector evaluation by the Governance & Accountability Institute (G&A 2014) and the GRI publication “Sustainability topics for sectors” (2013).

Perceived Credibility

Our dependent variable captures readers’ perceived credibility of the sustainability report. We adapted the aforementioned perceived credibility scale (PERCRED) developed by Lock and Seele (2017) to our setting (see Table 1).Footnote 10

Table 1 Adapted PERCRED scale by Lock and Seele (2017)

The adapted PERCRED scale comprises seven 7-point Likert scale statements. The Likert scales were anchored at 1 = strongly disagree to 7 = strongly agree. Trustworthiness was captured by a reverse-coded statement (S7). PERCRED was calculated as the average of the seven statement scores and has a Cronbach’s alpha value of 0.82.
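
The scoring described above can be sketched in a few lines. The sketch assumes that S7 is the seventh item, that reverse-coding on a 1 to 7 scale maps a score x to 8 − x, and that Cronbach’s alpha is computed with the standard formula based on sample variances. The response data are hypothetical and are not the study’s data.

```python
from statistics import mean, variance

SCALE_MIN, SCALE_MAX = 1, 7
REVERSED_ITEMS = {6}  # zero-based position of S7 (assumed to be the 7th item)


def reverse_code(score):
    """Mirror a score on the 1-7 scale (1 <-> 7, 2 <-> 6, ...)."""
    return SCALE_MIN + SCALE_MAX - score


def recode(responses):
    """Return one participant's item scores with S7 reverse-coded."""
    return [
        reverse_code(s) if i in REVERSED_ITEMS else s
        for i, s in enumerate(responses)
    ]


def percred(responses):
    """PERCRED as the average of the seven recoded item scores."""
    return mean(recode(responses))


def cronbach_alpha(sample):
    """Standard Cronbach's alpha over recoded items (sample variances)."""
    rows = [recode(r) for r in sample]
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of four participants to S1-S7.
sample = [
    [5, 6, 5, 6, 5, 6, 2],
    [3, 3, 4, 3, 2, 3, 5],
    [6, 7, 6, 6, 7, 6, 1],
    [4, 4, 5, 4, 4, 5, 4],
]

scores = [percred(r) for r in sample]
alpha = cronbach_alpha(sample)
print([round(s, 2) for s in scores], round(alpha, 2))
```

Recoding S7 before averaging matters: without it, a participant who agrees with the six positively worded statements and disagrees with the reversed one would receive a lower composite score than intended.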

Results

Descriptive Statistics

In total, we collected 157 responses. Two observations were excluded from the sample due to their low proficiency score in English.Footnote 11 After eliminating another five outliers from our data, we proceeded with a final sample size of 150.Footnote 12

Participants were randomly assigned to one of the five treatments. Therefore, participant characteristics such as gender (women = 51%), nationality (28% German, 27% Chinese, 22% Australian, and 23% other), and general topic area of their postgraduate degree (42% Business Administration, 15% Accounting, 15% Economics, 13% Commerce, 8% Finance, 4% International Business and Sustainability, and 3% other) were also randomly distributed across the treatments. Post-experiment analysis showed no significant differences in responses across the treatment groups, indicating that the randomization process was successful. We also included control variables (i.e., familiarity with sustainability reporting, English proficiency, gender, nationality, topic area of degree) in our analyses and found no considerable changes regarding our main results.

Table 2 depicts the cell means and standard deviations for the dependent variable, perceived credibility, and the number of participants in each treatment.

Table 2 Summary statistics

Hypothesis Testing

The credibility hypothesis (H1) was tested using a one-way ANOVA. We compared the responses for the dependent variable of perceived credibility between the participants who received an assured sustainability report (Treatments 1 through 4; n = 118) and those who did not receive any assurance (Treatment 5; n = 32). Table 3 presents the descriptive statistics, while Table 4 Panel A summarizes the results of the one-way ANOVA. Although in the predicted direction, the difference in means for perceived credibility is not statistically significant (F = 0.10, p = 0.08). Independent sample t tests showed a difference in the mean perceived credibility score between the high assurance depth and low reference explicitness condition (Treatment 2) and the control group that is statistically significant at the 10% level (t(58) = 1.711, p = 0.09). All other t tests remained insignificant (see Table 4 Panel B). Therefore, the credibility hypothesis is not supported, with the exception of Treatment 2.

Table 3 Credibility hypothesis—descriptive statistics
Table 4 Credibility hypothesis—results
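
For readers less familiar with the test, the F statistic of a one-way ANOVA compares the variation between group means with the variation within groups. The following is a minimal sketch of that computation, not the authors’ analysis script; the function name and the scores are hypothetical.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for group in groups for x in group]
    grand_mean = mean(all_scores)
    group_means = [mean(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2
        for group, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2
        for group, m in zip(groups, group_means)
        for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical PERCRED scores, not the study's data.
assured = [4.6, 4.9, 4.3, 5.1, 4.4, 4.7]
control = [4.2, 4.5, 4.0, 4.8]

f_stat, df_b, df_w = one_way_anova([assured, control])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With the group sizes reported above (n = 118 and n = 32), the same formula yields 1 and 148 degrees of freedom. Obtaining a p-value additionally requires the F distribution, which a statistics library such as SciPy provides.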

We tested H2, H3, and H4 using a factorial ANOVA. We thereby analyzed responses for all five groups because a 2 × 2 + 1 between-subjects design was used with assurance depth and reference explicitness as the two factors. Table 5 summarizes the results of the ANOVA for the main and the interaction effects. Neither assurance depth (F = 0.01, p = 0.92) nor reference explicitness (F = 1.93, p = 0.17) significantly affects perceived credibility. Hence, there is no support for the reference explicitness hypothesis (H2) or the assurance depth hypothesis (H3). However, Table 5 shows a significant interaction between assurance depth and reference explicitness (F = 5.51, p = 0.02), which provides support for the interaction hypothesis (H4). To further interpret the interaction effect, we performed a simple effects analysis (Table 6).

Table 5 Factorial ANOVA—results
Table 6 Simple effects analysis—results
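
To illustrate how a 2 × 2 factorial ANOVA separates main effects from the interaction, the following sketch computes the three F statistics for a design with equal cell sizes. This is a simplifying assumption: unequal cell sizes would require a different decomposition of the sums of squares. The function name, factor labels, and scores are hypothetical and do not reproduce the study’s data.

```python
from statistics import mean


def two_by_two_anova(cells):
    """F statistics for a balanced two-factor design.

    cells maps (level_a, level_b) to a list of scores.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if any(len(v) != n for v in cells.values()):
        raise ValueError("this sketch only handles equal cell sizes")

    cell_mean = {k: mean(v) for k, v in cells.items()}
    # With equal cell sizes, the grand mean is the mean of the cell means.
    grand = mean(list(cell_mean.values()))
    mean_a = {a: mean([cell_mean[(a, b)] for b in levels_b]) for a in levels_a}
    mean_b = {b: mean([cell_mean[(a, b)] for a in levels_a]) for b in levels_b}

    ss_a = n * len(levels_b) * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = n * len(levels_a) * sum((m - grand) ** 2 for m in mean_b.values())
    ss_ab = n * sum(
        (cell_mean[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
        for a in levels_a
        for b in levels_b
    )
    ss_error = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)
    df_error = len(cells) * (n - 1)
    ms_error = ss_error / df_error

    # Degrees of freedom per effect: (levels - 1), i.e. 1 each in a 2 x 2 design.
    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    return {
        "factor_a": (ss_a / df_a) / ms_error,
        "factor_b": (ss_b / df_b) / ms_error,
        "interaction": (ss_ab / (df_a * df_b)) / ms_error,
        "df_error": df_error,
    }


# Hypothetical scores: factor A = assurance depth, factor B = reference format.
cells = {
    ("high", "verbal"): [5, 6, 7],
    ("high", "visual"): [3, 4, 5],
    ("low", "verbal"): [4, 5, 6],
    ("low", "visual"): [4, 5, 6],
}
result = two_by_two_anova(cells)
print(result)
```

In this constructed example the row means for the two depth levels are identical, so the main effect of factor A is zero even though the cells differ. This is the kind of pattern in which an interaction can exist without a main effect.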

The simple effect of reference explicitness on perceived credibility was significant for high assurance depth (F = 6.97, p < 0.01) but not for low assurance depth (F = 0.46, p = 0.50). The simple effect of assurance depth was significant at the 10% level for high reference explicitness (F = 3.10, p = 0.08) but insignificant for low reference explicitness (F = 2.44, p = 0.12). Figure 3 graphically depicts the effects of this interaction. The results suggest that in situations in which firms obtain assurance for material topics (high assurance depth), report readers perceive the credibility of the sustainability report as higher when such assurance is marked verbally (M = 4.86) rather than via a visual graphical reference (M = 4.24). No such difference between the levels of reference explicitness was found for low assurance depth (M = 4.65 and M = 4.49 for the visual and verbal treatments, respectively). As we did not measure higher credibility perceptions for the combination of low assurance depth and high reference explicitness, we could not verify that false signaling creates the intended effect of increased credibility perceptions by readers.
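
The direction of the interaction can be read off the four cell means reported above. A minimal sketch of this difference-in-differences arithmetic, with illustrative key and function names:

```python
# Cell means of perceived credibility as reported in the text.
cell_means = {
    ("high depth", "verbal"): 4.86,
    ("high depth", "visual"): 4.24,
    ("low depth", "verbal"): 4.49,
    ("low depth", "visual"): 4.65,
}


def explicitness_gap(depth):
    """Visual minus verbal mean within one level of assurance depth."""
    return cell_means[(depth, "visual")] - cell_means[(depth, "verbal")]


gap_high = explicitness_gap("high depth")
gap_low = explicitness_gap("low depth")
# The interaction contrast is the difference between the two gaps.
interaction_contrast = gap_high - gap_low

print(f"gap at high depth: {gap_high:+.2f}")
print(f"gap at low depth:  {gap_low:+.2f}")
print(f"interaction contrast: {interaction_contrast:+.2f}")
```

The gap is negative for high assurance depth and slightly positive for low assurance depth, which is the crossover pattern the interaction test picks up. The arithmetic alone says nothing about significance, which depends on the within-cell variance.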

Fig. 3 Graphical representation of the results

To test the robustness of the interaction effect, we compared the result for the composite dependent variable PERCRED with the results at the item level. The interaction effect is directionally equivalent for four items and statistically significant for three items.

Discussion

We examined the role of two strategic choices in sustainability assurance, reference explicitness and assurance depth, in influencing how readers perceive the credibility of a sustainability report. The results indicate that neither reference explicitness nor assurance depth alone determines the perceived credibility of sustainability reports. However, as the significant interaction effect of both factors shows, perceived credibility differs depending on the combination of the independent variables assurance depth and reference explicitness.

To further discuss the interaction effect (H4), we consider all factor combinations (see Fig. 3). Consistent with the non-confirmation of H1, sustainability report readers in the low assurance depth treatments perceived the information as similarly credible to readers in the no assurance condition (control group). Therefore, this strategic topic choice (low assurance depth) does not produce the desired increase in perceived credibility. We observe that assurance can only add credibility to the sustainability report if it is based on high assurance depth (i.e., including the material topics). This observation leads to the recommendation that firms should refrain from obtaining sustainability assurance if they do not intend to assure their material topics.

When we add the factor dimension of the communication choice, we observe that in situations in which assurance is marked less explicitly in a verbal form, readers perceive the sustainability report as more credible for high assurance depth than for low assurance depth. Perceived credibility in the low reference explicitness condition was measured in the predicted direction of the assurance depth hypothesis (H3). However, if the firm indicates assured topics via a visual graphical reference, perceived credibility evaluations react inversely. Our results suggest that for high assurance depth, the presence of high reference explicitness is harmful to the perceived credibility of a sustainability report. Compared to our predictions in the reference explicitness hypothesis (H2), these results appear counter-intuitive. They therefore require further theoretical analysis against the background of signaling theory.

In the hypothesis development, we argued that credibility perceptions vary depending on receivers’ ability to detect false signals. While a good signal fit indicated by a less observable signal (high assurance depth and low reference explicitness) is perceived as a true and credible signal, a highly observable signal (high reference explicitness) might be (wrongly) perceived as a false and, therefore, not a credible signal. Although a visual graphical indication of assured material topics is not designed to be a false signal in our setting, readers seemed to interpret it as such.

A relevant concept in this context is “receiver interpretation” (Connelly et al. 2011, p. 52). Receiver interpretation describes the process of receivers translating signals and putting meaning to them (Connelly et al. 2011, p. 54). This process is not error-free, as distortions or individual weights can influence its outcome. Receivers might not interpret signals uniformly (Perkins and Hendry 2005; Srivastava 2001), depending on their individual background, changing instrumental and symbolic inferences (Highhouse et al. 2007), or personal values, external pressures, and necessities (Branzei et al. 2004, p. 1091). Previous research has demonstrated the importance of the receiver’s perspective in order to understand signaling processes (Suazo et al. 2009; Turban and Greening 1997). Based on the subjective information available to individuals (Ehrhart and Ziegert 2005, p. 903), receivers may interpret signals by departing from the original intent of the signaler.

In our setting, an explicit visual indication of assured sustainability topics (high reference explicitness) seems to be interpreted as a false signal. A visual graphical reference apparently does not increase the level of comprehensiveness and transparency of sustainability assurance. Our results indicate that participants could perceive a visual graphical reference as a misleading or too influential signal and therefore react with penalty costs (Connelly et al. 2011, p. 61) in the form of lower credibility perceptions. Our findings might explain the low prevalence of visuals (e.g., symbols) to indicate assured content within sustainability reports, as observed by Gürtürk and Hahn (2016, p. 35).

When interpreting the findings of our paper, we have to carefully consider factors that affect the reliability and generalizability of our experiment. With regard to experimental control, the amount of information provided to the participants was limited to avoid confounding factors and to increase the attention paid to the independent variables reference explicitness and assurance depth. We took care to create a setting for sustainability report readers that is structurally equivalent to a real-world setting, while isolating two strategic decisions around sustainability assurance. Although we believe that our research design adequately captures the signaling process of sustainability assurance, in practice readers consult various information channels when assessing the credibility of disclosure (Mercer 2004). Potential investors might consult additional sources of information, such as further reports, ratings, or social media, which might influence their credibility perceptions (Hahn et al. 2019). Our focus lies on non-professional investors; this reader group is likely more open when assessing sustainability assurance but also more prone to react to false signaling. In contrast, professional investors and auditors with experience in evaluating assurance are more likely to detect false signaling because they may have developed strategies against cognitive biases. For example, Maines and McDaniel (2000) argue that analysts are more focused on the content than on the presentation format of financial statements.

In summary, assigning assurance to selected topics has to be done with caution. Companies have to be aware that restricting assurance depth might harm the transparency of sustainability assurance. Furthermore, a well-intended high reference explicitness in the form of a graphical indication of assured topics might create unexpected and unwanted reader reactions to a sustainability report. If readers mistakenly consider it a false signal, they will react with penalty costs in the form of lower credibility perceptions.

Conclusion

Our experimental study provides empirical evidence supporting the relevance of different assurance choices for the perceived credibility of sustainability reports. It also provides support for the argument that the interaction between reference explicitness and assurance depth should be considered in sustainability assurance practice as it leads to different and sometimes unintended reader reactions.

We contribute to the literature in several ways. First, we extend prior studies on sustainability assurance by examining the practice of assuring only selected topics of a sustainability report. Our research differs from previous studies in this domain (e.g., Fuhrmann et al. 2017; Hodge et al. 2009; Manetti and Becatti 2009; Maroun 2020; Perego and Kolk 2012), because we use an experimental approach with an explicit focus on the reduction of assurance depth. Our findings indicate that sustainability assurance only increases credibility perceptions if it includes the material topics of a report. Firms should therefore consider our results in their cost–benefit considerations when assigning sustainability assurance. Second, we add to the few studies that explicitly consider reference explicitness of assurance (Gürtürk and Hahn 2016; Mock et al. 2007, 2013). Our experimental design allows us to analyze different degrees of reference explicitness in the communication of sustainability assurance. Our findings demonstrate that auditors have to be careful about how to mark the assured topics. A graphical reference can be wrongly interpreted by readers and, therefore, lead to lower credibility perceptions. To protect organizations from negative effects of such misinterpretation, sustainability assurance of selected topics should be communicated in an unambiguous form to readers of assurance and sustainability reports. Third, we contribute to the literature on managerial capture (Hummel et al. 2019; Owen et al. 2000; Smith et al. 2011) by analyzing the interaction effect of reference explicitness and assurance depth. Selecting assured topics and not communicating that choice in a transparent form is a managerial decision that requires critical ethical analysis. Such false signaling can potentially distort readers’ credibility perceptions of the sustainability report. As readers of sustainability reports do not positively react to a lower signal fit, our results show that false signaling is not only an unethical but also an unsuccessful practice. Fourth, our paper contributes to the growing research on the signaling role of assurance in sustainability reporting (e.g., Cheng et al. 2015; Clarkson et al. 2019; Hummel et al. 2019; Zerbini 2017). In line with the call to build on and extend signaling theory (Hahn and Reimsbach 2020), our findings extend the literature on signaling theory by showing that signal observability and signal fit (Connelly et al. 2011) are relevant aspects in the context of sustainability assurance. Although they do not affect perceived credibility when applied separately, they interact when appearing jointly. Different levels of signal observability influence the perceived credibility of signals that show high signal fit. Consequently, signal observability can influence signal interpretation by readers. Our results provide evidence for cases of misinterpreted signaling in which high signal observability and high signal fit lead to a decrease in perceived credibility. Therefore, our findings provide new insights against the background of signal interpretation. For high reference explicitness, we were able to demonstrate an incorrect signal interpretation by readers.

Beyond limitations typically associated with experimental research, the results of our study should be interpreted with caution in light of the following limitations. First, the study was based on master’s students in business as a proxy for non-professional investors. Typically, master’s students in business possess basic accounting knowledge and are familiar with investing (Libby et al. 2002, p. 803), which is sufficient for the goal of our experiment. Although this is a common practice in experimental accounting and auditing research (e.g., Cheng et al. 2015, p. 141; Libby et al. 2002), the sample used did not fully represent all characteristics of the general population of non-professional investors (see Cheng et al. 2015, p. 156). We should also keep in mind that sustainability reports are often designed to reach a broader set of stakeholders (e.g., employees, customers). We did not explicitly test for other groups of readers. Therefore, our results are generalizable to other reader groups only to a limited extent. It would be interesting to repeat our experiment with professional subjects with a higher level of exposure to sustainability disclosure and experience in sustainability assurance. Second, we presented an extract of the sustainability report to participants. The stimulus contained two environmental and two social sustainability topics and a shortened assurance report (see Appendix 2). In practice, sustainability reports are considerably longer and cover a variety of different topics. While our stimulus is based on a fictitious company in the textile industry, literature shows industry differences in the exposure to sustainability risks and in the adoption of sustainability assurance (Casey and Grenier 2015; Kolk and Perego 2010; Pflugrath et al. 2011; Simnett et al. 2009). Typically, sustainability reports entail a higher degree of complexity than the extract presented in our experiment. Due to the mix of quantitative and qualitative information or the integration of text, tables, and other visual elements (e.g., Cho et al. 2012; Hellmann et al. 2017; Merkl-Davies and Brennan 2007), it is more difficult for readers to filter and interpret the relevant information. When generalizing our results, readers should keep in mind that participants may react differently to more extensive and more complex reports. Third, we concentrated on sustainability reports in the format of a separately published sustainability report. In practice, we also observe a growing number of integrated reports. Typically, financial information is externally assured, while oftentimes only parts of the non-financial information have been subject to assurance. This creates an even more complex scenario for sustainability report readers. Therefore, it is likely that the confusion between assured and non-assured information increases further (Hodge 2001; IDW 2018), which could influence sustainability report readers’ credibility perceptions. Although we expect that assurance depth and reference explicitness have a similar effect on credibility perceptions, our results are only conditionally transferable to the scenario of integrated reporting. Fourth, our study examines only two choices of assurance reports, reference explicitness and assurance depth. In our experimental design, we did not manipulate other managerial choices concerning the assurance statement.

The limitations presented offer opportunities for further research. Such research could investigate additional sustainability assurance choices, such as different levels of assurance, changes in the applied methods, or the wording of the assurance statement (Hummel et al. 2019, p. 736). Additionally, the assessment of materiality could be varied. An experimental setting could analyze different degrees of coverage or a mix of material and non-material topics. Furthermore, there are promising avenues for further research in investigating the use of different graphical references to mark sustainability assurance. For example, future research can focus on different symbols, their size and color, or combinations of visual and verbal information. An experimental approach using eye-tracking equipment could be adopted for such purposes. This technology enables a richer analysis of the reader’s judgment-making process and offers opportunities to further investigate signal interpretation processes. An empirical study examining recent assurance reports could analyze the use of graphical references for different levels of assurance. Such an approach would further complement the study by Gürtürk and Hahn (2016) that was based on a sample of published sustainability reports for 2012.

Further research could also investigate the motivations of firms when assigning assurance. Such research could investigate how companies strategically decide on the different elements of assurance, for example, by comparing management’s motives for choosing different assurance options with the credibility perceptions of readers. Studying the motives and goals of organizations could provide deeper insights into false signaling and potential impression management. In particular, this stage of the assurance engagement raises further questions of managerial capture and ethical conduct.