Oversized Area Indications on Bonus Packs Fail to Affect Consumers’ Transactional Decisions—More Experimental Evidence on the Mars Case

Findings from behavioural research are gaining increased interest in EU legislation, specifically in the area of unfair commercial practices. Prior research on the Mars case (Purnhagen and van Herpen 2017) has left open whether empirical evidence can provide an indication that this practice of using oversized indications of additional volume alters the transactional decision of consumers. This, however, is required to determine the “misleadingness” of such a practice in the legal sense as stipulated by the Unfair Commercial Practices Directive 2005/29/EC. The current paper closes this gap by illustrating how behavioural research can inform legal interpretation. In particular, it extends the previous research in two important ways: first, by examining the actual choice that people make; and second, by investigating whether the effects remain present in a context where a comparison product is available. Yet, while supporting and extending the findings of the study from Purnhagen and van Herpen (2017) on deceptiveness, the current study could not produce empirical evidence of a clear influence on the transactional decision of consumers, in the way “UCPD” requires.

Findings from behavioural research are gaining increased interest in unfair commercial practices (Duivenvoorde, 2015; Purnhagen, 2017; Purnhagen & Van Herpen, 2017; Schebesta & Purnhagen, 2017; Sibony, 2015; Trzaskowki, 2011; Weatherill, 2007). The European Commission, for example, funded a study in which behavioural scientists tested whether behavioural science methods could be utilised to determine "vulnerability" when interpreting the legal "average consumer benchmark" within the specific group of vulnerable consumers (European Commission, 2016a). The European Commission took up these findings in its guidance document on the interpretation of the average consumer benchmark in EU Unfair Commercial Practices law (European Commission, 2016b; critically, Purnhagen, 2018). Likewise, the Court of Justice of the European Union 1 presents an interpretation of legal norms that parallels the outcome from behavioural studies (Schebesta & Purnhagen, 2016). Seldom, however, has behavioural science research directly targeted the interpretation of legal norms in European Union law. To illustrate this potential of behavioural research for the interpretation of unfair commercial practices, the normative assumptions of the influential Mars case (1995) 2 concerning misleading advertisement on a food package have been examined in an experiment simulating deceptiveness in terms of product perceptions (Purnhagen & Van Herpen, 2017). The Mars case (1995) concerned a business practice where the Mars company attached a "10% more" label to an ice-cream bar sold in Europe, which in fact contained 10% more content for the same price. The yellow area of the label, however, covered 30% of the pack. The Mars case (1995) hence concerns a situation in which a product package contains extra volume ("bonus pack"), and where a coloured area on the pack communicates that the product is larger than usual. 
This coloured area is larger than the actual additional volume contained in the package, and the question is whether this is misleading in the legal sense as stipulated by the Unfair Commercial Practices Directive 2005/29/EC (hereafter: the UCPD). 3 In their study, Purnhagen and van Herpen (2017) provided first evidence, in an experimental setting, that such bonus packs containing an oversized indication of the extra volume on the package systematically lead consumers to overestimate the actual extra volume given and, as a result, could potentially be misleading. If one, as some scholars (Purnhagen, 2017; Sibony, 2015; Trzaskowki, 2011) and the European Commission (2016b) propose, were to use evidence from such studies as a basis to determine if consumers are deceived in the sense of the law, this study has illustrated that consumers are indeed likely to be deceived. These findings contradicted the Court's findings in the original Mars case (1995). It was left open, however, whether such deceptiveness would also influence the consumer's buying behaviour. To use the terminology of the UCPD, the open question is whether this deceptiveness causes or is likely to cause the average consumer "to take a transactional decision that he would not have taken otherwise" (Art. 6 UCPD).
The current study will fill this void in knowledge by providing empirical evidence on whether bonus packs with oversized volume indications will not only affect consumer perceptions but also the actual choices that they make. Our study also makes an important additional contribution beyond the prior empirical examination of Purnhagen and van Herpen (2017). Not only will it examine consumer choices, but it will also do so in a setting in which a competing alternative product is available that should make misleading on-pack information more obvious. After all, consumers regularly make transactional decisions in a setting where multiple products are offered simultaneously. They may notice that the extra volume indication on a bonus pack is misleading when a product of the same size but without the extra volume is presented next to it. Thus, one might expect that a side-by-side comparison of bonus and regular products will mitigate the effects of misleading information and prevent consumers from being deceived.
1 We refer to the Court of Justice of the European Union as "CJEU" and to the CJEU and the European Court of Justice as "the Court."
Our results will show that oversized area indications are deceptive, even when products are presented side-by-side and observation of the actual extra volume is made relatively easy. Consumers in this situation still believe that more extra volume is provided than is actually the case. Interestingly, however, the purchase intention for the bonus product did not depend on whether an oversized area was present or not. Our experimental findings thus suggest that, for stating their purchase intentions, it was irrelevant to consumers whether the pack contained an oversized or correct indication area. Instead, when both bonus and comparison product were evaluated together, consumers indicated a higher purchase intention for the bonus product than for the comparison product. They apparently preferred the bonus product irrespective of how much bonus volume they perceived it to contain. It seems that in a competitive choice setting, the availability of a comparison product made the extra volume salient. Moreover, we find little evidence to suggest that actual choices are affected.
To sum up, while supporting and extending the findings of the study from Purnhagen and van Herpen (2017) on deceptiveness, the current study could not produce empirical evidence of a clear influence on the transactional decision of consumers, in the way the UCPD requires.
The current study demonstrates how experimental evidence can inform legal cases involving potentially misleading graphical indications on product packaging. Its main aim is to showcase how such an experiment can be set up, which insights can be gained, and what their relevance is for the legal issue. Our choice for a lab experiment is based on its unique ability to provide causal evidence under controlled circumstances. To inform the legal case, it is pertinent that effects can be exclusively attributed to the graphical information on the pack and not to other circumstances or marketing tactics. Although a lab experiment is tailored to this, the results from such a lab experiment can be meaningfully supplemented with insights gained from qualitative work (to assess consumer responses more in-depth) and real-world transactional (sales) data (to obtain an indication of the size of the effect in real life). This will be discussed further in the general discussion.
Obviously, one experiment cannot provide a conclusive answer that would hold for all cases involving oversized area indications on bonus packs. Although we expect results to be generalizable, more research will be needed to determine if similar effects are found across demographic groups, different countries, and cultures, and for various product categories. Our objective is thus to provide first evidence for the effect that oversized area indications on bonus packs may have on consumers' transactional decisions.
Taking seriously the demands for regulatory validity (Purnhagen, 2018) when applying experimental research to legal questions, we will first present the legal framework of EU Unfair Commercial Practices law on which our hypotheses and the design of the study are based. Subsequently, drawing upon studies of cognitive psychology and behavioural economics on biases in consumer decision-making, we will develop hypotheses regarding the expected effects of oversized volume areas on consumer perceptions and choice. We will then present the experiment and its results. Finally, we will put the results into the wider context and discuss whether we can draw normative conclusions for legal interpretation from this experiment and, if so, which ones.

The Legal Framework of Unfair Commercial Practices in Foods
The protection of consumers against misleading practices is one of the quintessential functions of the European information model and constitutes an important part of the UCPD (Straetmans, 2016). In the area of food, Regulation (EU) No 1169/2011 covers such unfair commercial practices on the provision of food information to consumers (hereafter FIC). 4 The basic principles are that food information may not be misleading and that information must be accurate, clear, and easy to understand for the consumer (Art. 7 (1), (2) FIC). Although the FIC provisions echo the general spirit of the UCPD, they lack its more elaborated tests (Schebesta, 2019). Unlike Art. 6(1) UCPD, Article 7 of the FIC does not explicitly require a causal link between a deceiving practice and an alteration in consumer's purchase behaviour as a result of that deceptiveness to qualify a practice as misleading (Purnhagen & Van Herpen, 2017;Van der Zee & Fisher, 2018). Whether the differing test in Article 7 FIC can be considered lex specialis to the UCPD and hence preclude the applicability of the UCPD's additional requirement of a causal link, or whether Article 7 FIC can be seen as a declaratory repetition of the UCPD's requirements, is not clear from the text. The relationship between both provisions hence needs clarification.
Article 7 FIC qualifies as a rule "regulating specific aspects of unfair commercial practices" and shall hence "prevail and apply to those specific aspects" (Art. 3(4) UCPD). As a result, the law mandates that the rules of the FIC generally prevail over the ones of the UCPD. The FIC stipulates positive objective unilateral requirements such as standard information that needs to be included and references the average consumer in this respect (see for this difference and the different functions of the average consumer resulting from this Schebesta & Purnhagen, 2020, p. 299). The UCPD uses the average consumer concept in many ways, including as a benchmark to describe specific behaviour in concrete bilateral situations. Article 7 FIC can qualify as an additional, mandatory indication to determine "deceptiveness" in the sense of the UCPD, emphasizing that the speciality of foods when compared to other commodities (Purnhagen & Wesseler, 2020) needs to be considered. This adds an additional, food-related layer to the deceptiveness test of the UCPD, which justifies the application of both provisions in parallel, the UCPD as a basis and the FIC in addition. Hence, adopting the terminology used in the UCPD, the term "misleading" in Art. 7 FIC should rather read as "deceptive." This result is in line with findings in the literature, which stipulates that the provisions of the UCPD remain applicable also to the area of foods if they do not contradict the provisions of the FIC (Van der Zee & Fisher, 2018). This applies to the requirement to determine whether the deceptiveness was likely to cause the consumer "to take a transactional decision that he would not have taken otherwise," as this requirement is not mentioned in Art. 7 FIC (Van der Zee & Fisher, 2018). For the purposes of our study, we will hence rely solely on the provisions of the UCPD.
In order to contribute to the proper functioning of the internal market and to achieve a high level of consumer protection, the UCPD covers all business-to-consumer commercial practices in all sectors of economic activity (Van Boom, 2016) before, during, and after a commercial transaction in relation to a product (Art. 3 (1) UCPD). More specifically, the core of the UCPD lies in the prohibition of practices that are contrary to the requirements of professional diligence and that materially distort or are likely to materially distort the economic behaviour of the average consumer whom it reaches or to whom it is addressed (Art. 5 (2) UCPD). Then, the concept of "unfair practices" is further subcategorized into misleading and aggressive practices (Arts. 6-9 UCPD).
As to the first category, a practice shall be regarded as misleading if "it contains false information and is therefore untruthful or in any way, including overall presentation, deceives or is likely to deceive the average consumer, even if the information is factually correct, in relation to one or more of the listed elements, and, in either case, causes or is likely to cause him to take a transactional decision that he would not have taken otherwise" (Art. 6 (1) UCPD). Therefore, in order for the abovementioned oversized volume indication areas to be found a misleading commercial practice, they must impact on the transactional decision of consumers. This facet has so far been left untested (Purnhagen & Van Herpen, 2017). It is the aim of this study to investigate, within the confines of the methodology applied, whether the deceptiveness of the practice alters consumer's purchase decisions, which leads to the "misleading character" of such a practice.
While for some time it had been contended that the requirement to prove a causal link between deceptiveness and the transactional decision of the consumer did in practice not provide an additional threshold, recent cases have increasingly emphasized this requirement (Schebesta & Purnhagen, 2020, p. 303). The UCPD explicitly stipulates the notion of the "average consumer" as a benchmark for assessing the deceptiveness of a commercial practice. Consequently, it is settled case law that the average consumer concept provides a yardstick by which to characterise a practice as deceptive (Schebesta & Purnhagen, 2019). Specifically, in the words of the Court: "the constituent features of a misleading commercial practice, as set out in that provision, are in essence expressed with reference to the consumer as the person to whom unfair commercial practices are applied." 5 However, it was unclear if the same "average consumer" yardstick would also need to be applied to determine whether the deceptiveness of the practice would causally influence the transactional decision of the consumer. A transactional decision describes "any decision taken by a consumer concerning whether, how, and on what terms to purchase, make payment in whole or in part for, retain or dispose of a product or to exercise a contractual right in relation to the product, whether the consumer decides to act or to refrain from acting." 6 The notion of transactional decisions covers concrete, bilateral situations, and these depend on the context of the individual case. This contrasts with the often general, unilateral situations, which determine deceptiveness, and which often concern generalizable situations, independent from the concrete case of the transaction. 7 It had been unclear if the same "average consumer" yardstick which applies to the determination of "deceptiveness" is applied also to determine the requirement of a causal link to the transactional decision. 
The Canal Digital case concerned Canal Digital's price advertising campaign for TV subscriptions. The CJEU held that "where the price of a product is split into several components, one of which is particularly emphasized in the marketing, while the other, which nevertheless constitutes an inevitable and foreseeable element of the price, is completely omitted or is presented less conspicuously, an assessment should be made whether that presentation is likely to lead to a mistaken perception of the overall offer" (CJEU, Canal Digital, para 43) and, furthermore, whether it affects the transactional decision of the consumer (CJEU, Canal Digital, para 45). By means of this decision, the CJEU used the average consumer benchmark as a basis for the interpretation of the notion of a "transactional decision that he would not have taken otherwise," to our knowledge, for the first time (Verbiest, 2017). In particular, the Court held that price "is, in principle, a determining factor in the mind of the average consumer, when he has to make a transactional decision" (Canal Digital, para 46). Therefore, the price enjoys an a priori presumption of influencing the average consumer's behaviour with respect to his transactional decision (Schebesta & Purnhagen, 2019).
To this end, according to both legal requirements and "good scientific practice" of consumer decision theory, a study focusing on the transactional decision is necessary. Prior research (Purnhagen & Van Herpen, 2017) has left open whether empirical evidence can provide an indication that the deceptive practice alters the transactional decision of consumers to determine the "misleading character" of such a practice. The current paper closes this gap by illustrating how behavioural research can be used as a tool to inform legal interpretation. In particular, it extends the previous research in two important ways: first, by examining the actual choice that people make; second, by investigating whether the effects remain present in a context where a comparison product is available.

Oversized Volume Indications
In this section, taking inspiration from the findings of cognitive psychology (Tversky & Kahneman, 1974) and law and behavioural science (Jolls et al., 1998;Sunstein, 2000), we provide a brief overview of existing literature on consumer information processing that is relevant for the Mars case (1995). Finally, we discuss how different evaluation contexts can affect the consumer choice and may mitigate the effects that we expect.

Consumer Information Processing
The leitbild of the "sovereign consumer" (Reisch & Zhao, 2017), based on the traditional assumptions of rationality and information (Incardona & Poncibò, 2007), has been increasingly questioned. It has been shown that consumers are unable to process all the available information because of their innate "bounded rationality" (Simon, 1955). That is why they often rely on a range of decision-making shortcuts, known as heuristics, and are strongly dependent on the context in which the decision is made (Tversky & Kahneman, 1974; for an overview, Korobkin & Ulen, 2000). The term "heuristic" was coined by George . Heuristics, or rules of thumb, are mental shortcuts in decision-making, which allow individuals to come to decisions quickly, but they can lead to biases.
Different models of decision-making have been developed, which altered, although without completely departing from, the traditional assumptions of the "homo economicus." In persuasive communication, the elaboration likelihood model (ELM) has been popular (Perloff, 1993). The ELM states that there are two processing "routes to persuasion," each characterized by a different likelihood of elaboration, namely the central and peripheral routes (Petty & Cacioppo, 1986). While the first requires an extensive cognitive elaboration, the latter occurs when the motivation or ability to elaborate is relatively low and individuals rely upon a series of peripheral cues in their response to the message (Gabbott & Clulow, 1999). Subsequently, dual-system theories were introduced, according to which there are two systems by which individuals process information: an intuitive mode, called "system 1," and a reasoning one, "system 2." In the former mode, individuals make fast, automatic, and effortless decisions based on limited information and using a variety of heuristics, while the latter mode seems very close to traditional assumptions of rational decision-making (Kahneman, 2003). Importantly, regardless of which model consumers are said to employ, they cannot avoid being subject to a host of cognitive biases in their decision-making process, as reported in behavioural literature (Tversky & Kahneman, 1974).

Numerical Versus Visual Information
Prior research has viewed the Mars case (1995) from an anchoring perspective (Purnhagen & Van Herpen, 2017). The anchoring bias is one of the first biases uncovered in psychology (Thaler & Sunstein, 2008), which has gained widespread recognition in judgement and decision-making literature by the influential work of Tversky and Kahneman (1974). According to them, the anchoring bias entails consumers making estimates which are biased towards an initially presented value ("the anchor") from which they adjusted. Therefore, different starting points can provide different estimates (Tversky & Kahneman, 1974). The final judgement is, then, a result of adjustments relative to this anchor, or reference point (Van Exel et al., 2006). Following Tversky and Kahneman's research, many studies illustrated the influence of anchoring in human decision-making processes across various domains, including the legal one (Van Exel 2006;Furnham & Boo, 2011).
Anchors may also occur across modalities and dimensions (Oppenheimer et al., 2008). They may, for instance, activate a sense of size, that is not attached to any rating scale, and may thus bias subsequent judgements (Le Boeuf & Shafir, 2006;Oppenheimer et al., 2008). In the Mars case (1995), for instance, when people are confronted with the bonus pack, they are biased in their estimates of values because a sense of size has been activated (Purnhagen & Van Herpen, 2017).
In the Mars case (1995), both a coloured section on the package and a numerical percentage of extra volume are provided on-pack. These could both be seen as potential anchors (Purnhagen & Van Herpen, 2017), or could be seen as pieces of information that consumers can rely upon when forming their product evaluations (cf. information integration theory; Lynch Jr, 1985). Regardless of which perspective is taken, there is overall consensus that numerical information is likely to have less influence on product evaluations than visual information. While the percentage of extra volume is numerical and is likely to require more cognitive processing to be made meaningful for consumers, the coloured area can be processed more easily as it directly indicates a part of the packaging. In this context, consumers have also been shown to have difficulties in processing numeric information such as percentages (Chen & Rao, 2007), which leads them to neglect base values, and prefer bonus packs over economically equivalent price promotions (Chen et al., 2012). Moreover, the coloured area is more visually salient than the numerical information. Based on these arguments, we expect that consumers will be more susceptible to influences of a coloured area on the pack and, thus, will base their transactional decision on this:
H1: The presence (vs. absence) of an oversized volume indication on a bonus pack will lead consumers to perceive that this product has a larger additional volume.
H2: The presence (vs. absence) of an oversized volume indication on a bonus pack will increase consumers' purchase intention and choice of this product.

Simultaneous Versus Separate Evaluation
People may exhibit different preferences for the same option depending on whether other options are presented simultaneously (Bazerman et al., 1992; Hsee, 1996). In joint evaluation, two (or more) options are presented and evaluated simultaneously, whereas in separate evaluation, each option is presented and evaluated separately. This is especially relevant in the current case of oversized volume indications, as consumers may use the size of other options to infer the amount of extra volume that is provided. If this is the case, the effects of oversized volume indications may be mitigated when products are evaluated jointly. Such joint evaluation is common in real life, where consumers often choose products in the context of other options (e.g., a supermarket shelf containing multiple products from the same category).
Different explanations for inconsistencies across joint versus separate evaluation have been proposed in literature, and these include norm theory, the want/should proposition and the evaluability hypothesis. Norm theory (Kahneman & Miller, 1986) entails that in different contexts, individuals refer to different comparison sets. Specifically, when they are presented with a single item to evaluate, they find it hard to make sense of it and thus evoke a set of available, internal referents for comparison and evaluation. Instead, when they are presented with more items to evaluate, the alternatives themselves provide the comparison set for evaluation (Bazerman et al., 1999). The want/should proposition offered by Bazerman et al. (1998) relies upon a tension between what consumers want to do versus what they think that they should do. In a separate evaluation, lacking a counterbalancing alternative, consumers lean towards what they want to do, while in joint evaluation, consumers tend to select the most justifiable option (the one that they think they should choose). The evaluability hypothesis (Bazerman et al., 1992; Hsee, 1996; Hsee, 1998) suggests that preference reversals due to joint versus separate evaluation are driven by differences in the evaluability of attributes. A hard-to-evaluate attribute will have less impact in separate evaluation than in joint evaluation, since in separate evaluation people have difficulty assessing the desirability of an option on the hard-to-evaluate attribute (Bazerman et al., 1999).
Indications of bonus volume reflect a hard-to-evaluate attribute, as consumers have difficulties in correctly estimating the additional volume that they will obtain. Joint evaluation of products could be expected to help consumers in assessing the desirability of the option with additional volume, because it allows them to make a better estimate of the amount of additional volume that they will receive. In the case of an oversized volume indication, this would imply that consumers who are presented with multiple similar products at the same time are better able to assess that the additional volume is relatively small (i.e., smaller than would be expected based on the oversized volume indication) than consumers who are presented with only one option at a time. We expect that this will mitigate the effects of the oversized volume indication on both volume perceptions and choice:
H3: The effect of the presence (vs. absence) of an oversized volume indication on volume perceptions will be smaller for joint than for separate evaluation.
H4: The effect of the presence (vs. absence) of an oversized volume indication on purchase intention and product choice will be smaller for joint than for separate evaluation.

Methodology
Participants and Design of the Study
Participants were 312 students of Wageningen University, the Netherlands, who were randomly assigned to a condition in a 3 (packaging design of the product with bonus volume) × 2 (evaluation context: joint vs. separate evaluation) between-subjects factorial design. Three participants were excluded from the dataset due to multiple missing values, leaving a sample of 309 participants (77.7% female, mean age 21 years). The number of participants required for the experiment was based on a power calculation, using the statistical programme GPower (2018). Based on the effect size (0.18) obtained in prior research (Purnhagen & Van Herpen, 2017), an error probability (α) of 0.05, and a power (1 − β) of 0.80, 301 participants would be needed.
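The arithmetic behind such a sample-size figure can be illustrated with the noncentrality parameter that is conventionally used in power analyses for F tests. The sketch below rests on assumptions that the text does not state: that the effect size of 0.18 is Cohen's f, and that the noncentrality parameter follows the common convention λ = f²N. It shows only how the reported numbers relate to one another and does not reproduce the software settings actually used.

```latex
\lambda = f^{2} N = 0.18^{2} \times 301 = 0.0324 \times 301 \approx 9.75,
\qquad
\frac{N}{3 \times 2} = \frac{301}{6} \approx 50.2 \ \text{participants per cell}
```

Under the same assumptions, the final sample of 309 participants corresponds to 309 / 6 = 51.5 participants per cell on average.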
Participants received two products to evaluate (one product with bonus volume and one without), either simultaneously (in the joint evaluation condition) or first the bonus product and then the similarly shaped non-bonus product (in the separate evaluation condition). We opted for this fixed order of presentation in the separate evaluation condition to prevent participants from being affected by the non-bonus product in their evaluation of the bonus product.

Stimuli
Gingerbread was used as stimulus material. The bonus product, which was available on the market, contained an oversized extra volume indication area. The actual additional volume provided was 20%, whereas the coloured area on the package that appeared to indicate the extra volume took up nearly 30% of the package. In line with the prior study on oversized volume indications, we manipulated the packaging design of the product with bonus volume. This packaging contained (a) the correct area and percentage (control condition), (b) the oversized area only or (c) the oversized area plus correct percentage. As a comparison product, a gingerbread of the same brand without additional volume was taken. This product was identical in all conditions. Figure 1 provides the different packages that were used.
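The gap between the stated bonus and the coloured area can be expressed as simple arithmetic. The worked example below is a sketch under two assumptions that go beyond the text: that the 20% bonus is measured relative to the regular pack, and that a consumer reads the coloured area (taken here as exactly 30%, although the text says "nearly 30%") as the share of the whole bonus pack that is extra.

```latex
\text{actual bonus as share of the bonus pack: } \frac{0.20}{1 + 0.20} \approx 0.167
\qquad
\text{bonus implied by a 30\% area, relative to the regular pack: } \frac{0.30}{1 - 0.30} \approx 0.429
```

On the more direct reading, in which the area percentage is compared with the stated percentage, the coloured area is 0.30 / 0.20 = 1.5 times as large as the bonus it announces.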

Procedure
Participants were recruited around campus by means of posters, Facebook posts, and word of mouth communication. After giving informed consent, in the separate evaluation condition, participants randomly received one out of the three package conditions and later the comparison product. In the joint evaluation condition, participants received both gingerbreads simultaneously. Products were taken away, and participants subsequently answered questions about the products.
In the separate evaluation condition, participants first answered questions on the bonus product (general questions to mask the purpose of the study, followed by questions on volume perception, inferences of manipulative intent, and purchase intention for the bonus product). Next, they saw the comparison product and subsequently answered questions on this comparison product (general questions and purchase intention). In the joint evaluation condition, participants saw both products together. They answered general questions and purchase intention for bonus product and for comparison product (in that order). Next, they answered questions on volume perception and inferences of manipulative intent. The questions on inferences of manipulative intent were included to assess the extent to which participants were aware of being influenced by the packaging and potential effects on attitudes.
At the end of the survey, all participants reported background questions on demographics (gender, age, study), consumption and purchase of gingerbread, liking of gingerbread, and attitudes towards the brand of gingerbread used in this study (6 items, α = .73; based on Lee & Mason, 1999). Finally, participants could choose to take one of the gingerbreads as reward for the study, or, alternatively, they could opt for a monetary reward of 1.50 Euro.

Measures
General Questions About the Packaging
For both packages, participants rated general questions on brand familiarity, ease of identifying the product type, comprehension of opening instructions, ease of comprehending ingredient list, readability of text, liking of packaging design, and attractiveness of packaging, on 5-point scales ranging from "completely disagree" to "completely agree." These questions were included to mask the purpose of the study. Paired-sample t-tests revealed that the two chosen packages were very comparable. No significant differences in the general evaluations were found, except for two instances: opening instructions were considered less easy to comprehend for the bonus pack (M = 4.09) than for the comparison product (M = 4.30; t(308) = −5.39, p < .001), and overall attractiveness of the packaging was higher for the bonus pack (M = 3.98) than for the comparison product (M = 3.86; t(308) = 2.34, p = .020). The latter is in line with product attitude, as reported in the "Results" section.
Fig. 1 Bonus products (3 packaging design conditions) and comparison product
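The paired-sample comparison of the two packages can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis script used in the study: the function name `paired_t` and the ratings are hypothetical, and only the test statistic and its degrees of freedom are computed (no p-value).

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, df) for a paired-sample t-test on two equal-length lists.

    Each position holds the two ratings given by the same participant.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Work on the within-participant differences.
    diffs = [a - b for a, b in zip(first, second)]
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    t = statistics.mean(diffs) / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical 5-point attractiveness ratings from six participants.
bonus_pack = [4, 5, 4, 3, 5, 4]
comparison = [4, 4, 3, 3, 5, 3]

t_value, df = paired_t(bonus_pack, comparison)
print(f"t({df}) = {t_value:.2f}")
```

With 309 participants rating both packages, the same computation yields 308 degrees of freedom, which matches the t(308) values reported above.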
Volume Perception. Participants answered three types of measurements designed to tap into their perception of the amount of extra volume contained in the bonus packaging (cf. Purnhagen & Van Herpen, 2017). The first measure concerned a visual estimate of the amount of extra volume. Participants were asked to indicate the amount of extra volume on a piece of paper showing the exact contours of the bonus package gingerbread. They could indicate the amount of extra volume by drawing a horizontal line on the paper. Second, participants rated their volume perception for the bonus pack on three items ("this bonus pack of gingerbread offered a lot of bonus volume," "this bonus pack of gingerbread is a large pack" and "this bonus pack of gingerbread was a lot larger than most packs of gingerbread;" α = .64), on a 7-point scale ranging from "completely disagree" to "completely agree." Third, participants were asked to give a numerical indication (percentage) of the amount of extra volume they thought they were given.
Purchase Intention. Participants indicated their intention to purchase the products on a three-item scale (cf. Baker & Churchill Jr, 1977; Bruner II, 2009; α = .78 for bonus pack and α = .80 for comparison product), ranging from "definitely not" to "certainly" (7 points). Items were "Would you like to try this product," "Would you buy this product if you happened to see it in a store" and "Would you actively seek out this product in a store."
Inferences of Manipulative Intent. Inferences of manipulative intent were measured using a six-item scale (cf. Campbell, 1995), on a 7-point rating scale ranging from "completely disagree" to "completely agree." Items were "The way in which the packaging tries to persuade people to buy seems acceptable to me," "This company tries to manipulate consumers in ways that I don't like" (reverse coded), "This packaging was fair in what was said and shown," "I think that this packaging is fair," "I was annoyed by this packaging because the company seemed to be trying to inappropriately manage or control the purchase of consumers" (reverse coded) and "I didn't mind the packaging, the company tried to persuade consumers without being excessively manipulative" (α = .84). In addition to examining the full scale, we also separated perceived honesty of the packaging (third and fourth item from the inferences of manipulative intent scale) from perceived manipulative intent of the company (remaining items).
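The reliability coefficients (α) reported for the multi-item scales are Cronbach's alpha values. As an illustration only, the following minimal sketch shows how such a coefficient can be computed after reverse coding. The function names and the response data are hypothetical and are not taken from the study.

```python
import statistics


def reverse_code(score, scale_min=1, scale_max=7):
    """Reverse a rating on a bounded scale (e.g. 2 becomes 6 on a 1-7 scale)."""
    return scale_max + scale_min - score


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of participants, each a list of k item scores."""
    k = len(responses[0])
    items = list(zip(*responses))  # one tuple of scores per item
    sum_item_variances = sum(statistics.variance(item) for item in items)
    total_variance = statistics.variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical answers of five participants to a three-item scale (1-7 ratings).
hypothetical_responses = [
    [5, 6, 5],
    [3, 4, 4],
    [6, 6, 7],
    [2, 3, 2],
    [4, 5, 5],
]
alpha = cronbach_alpha(hypothetical_responses)
print(round(alpha, 2))
```

Reverse-coded items would be passed through `reverse_code` before the scores are handed to `cronbach_alpha`, so that all items point in the same direction.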
Choice. At the end of the experiment, participants were given a choice between the gingerbread with bonus volume, the alternative gingerbread (i.e., the comparison product) or a small monetary reward. They received their choice as a token of appreciation for participating. By letting participants pay for the gingerbread by foregoing the reward for their participation, we ensured an incentive-compatible choice. This measures the transactional decision that the participant makes.

Analyses
One-sample t-tests were used to examine if participants in each of the conditions significantly overestimated the bonus volume, compared to the actual bonus volume (5 cm). Next, a factorial ANOVA (analysis of variance) was used to further analyse data on volume perception, with the different packaging conditions and evaluation context as independent variables. When a significant effect for packaging condition was found, post hoc tests (LSD) were used to compare means. The histogram of volume estimates in percentages showed various peaks, and we decided to split this variable into three categories (underestimation, correct estimation, and overestimation) and test the association with packaging condition using a chi-square test.
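The one-sample test described above compares the mean drawn estimate in a condition with the actual bonus volume of 5 cm. The sketch below computes only the t statistic and its degrees of freedom. The estimates are hypothetical, and the helper name `one_sample_t` is an assumption made for illustration.

```python
import math
import statistics


def one_sample_t(values, reference):
    """Return (t statistic, degrees of freedom) for H0: population mean == reference."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    t = (mean - reference) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical drawn estimates (in cm) for one condition; not the study's data.
estimates_cm = [6.0, 7.0, 8.0, 9.0, 10.0]
t_stat, df = one_sample_t(estimates_cm, reference=5.0)
print(round(t_stat, 2), df)
```

A positive t statistic indicates that the mean estimate lies above the reference value. The same function can be called with other reference values to check how sensitive the conclusion is to the exact reference size.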
Purchase intention was asked for both bonus product and comparison product and analysed using a repeated measures ANOVA with type of product (bonus vs. comparison) as within subjects factor, and packaging condition and evaluation context as between-subjects factors. Inferences of manipulative intent were analysed using factorial ANOVA, using both the full scale and the two subscales as dependent variables. Furthermore, multinomial regressions were run with choice as dependent variable, to predict choice from package condition and evaluation context.

Background Information
Most participants (80%) liked gingerbread. Consumption frequency was low, though, with 27% of participants eating gingerbread "reasonably often" to "very often." Most participants (69%) bought gingerbread less than once per month. Attitude towards the brand used in this study was positive (M = 4.60 on a 7-point scale).

Volume Perception
To assess whether participants overestimated the bonus volume, we tested the volume perceptions in centimetres against the correct area size (5 cm). As the packaging is sealed at the end, and the coloured area extends beyond the actual gingerbread to accommodate this, we tested the robustness of our results by also checking effects against 4 and 6 cm. In all cases and in all conditions, participants significantly overestimated the amount of bonus volume (all ts(52) > 8.46, ps < .001). Table 1 provides means and standard deviations. Next, we examined whether the extent of overestimation depended on packaging condition or evaluation context. An ANOVA revealed that packaging condition significantly influenced the volume perceptions for the bonus volume that participants indicated in the drawing (in centimetres) (F(2, 303) = 21.89, p < .001, partial η² = .13). Participants who saw a bonus pack with the correct area and percentage gave a significantly lower estimate of the bonus volume (M = 7.30) than participants who saw a bonus pack with oversized area only (M = 8.88, p < .001) and than participants who saw a bonus pack with oversized area and correct percentage (M = 8.70, p < .001). Volume perceptions did not differ significantly between the latter two conditions (p = .488). This implies that the size of the coloured area affected the volume estimates, in support of our first hypothesis, and that adding a correct percentage did not change these perceptions.
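The effect size reported alongside the F-test is partial eta squared. It can be recovered from an F value and its degrees of freedom, because it equals SS_effect / (SS_effect + SS_error). A minimal sketch, with a hypothetical function name, that reproduces the value for the packaging effect reported above:

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared derived from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Packaging effect on drawn volume estimates: F(2, 303) = 21.89
eta_packaging = partial_eta_squared(21.89, 2, 303)
print(round(eta_packaging, 2))  # 0.13
```

This kind of check is useful for verifying that a reported F value, its degrees of freedom, and the effect size are mutually consistent.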
The ANOVA furthermore showed that evaluation context (F(1, 303) = 0.38, p = .537) and the interaction between packaging condition and evaluation context (F(2, 303) = 0.78, p = .459) did not significantly affect volume perceptions. This implies that we do not find support for hypothesis 3. In a joint evaluation, with a comparison product present, participants provided volume estimates similar to those in separate evaluation.
Participants also indicated the perceived additional volume in percentage. Estimates ranged between 0 and 50%, and almost half of the participants gave the correct percentage of 20%. Because the distribution of the estimated volume percentages showed a high peak at the correct percentage, we split the estimates into "underestimation" (< 20%), "correct estimation" (20%) and "overestimation" (> 20%). These estimations differed across packaging conditions (χ²(4) = 15.75, p = .003). In the condition where the correct area and percentage were indicated, 52% of participants gave a correct estimation, 16% overestimated and 32% underestimated. Yet, when the oversized area was used with the correct percentage, more participants (25%) gave an overestimation. In this condition, 59% gave a correct estimate and 16% gave an underestimation. Both overestimation and underestimation were high among participants in the condition with only the oversized area. In that condition, 38% gave the correct estimate, whereas 29% overestimated and 33% underestimated. These results show that even in conditions where the correct percentage was provided on the packaging, many participants were unable to report it. Further analyses showed that evaluation context did not have a significant effect on the extent to which participants correctly estimated the percentage.
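The categorisation and the test of association described above can be sketched as follows. The cell counts are an assumption: the reported percentages are treated as counts per 100 participants in each condition, because the actual cell counts are not given here. The resulting statistic is therefore only close to, not identical with, the reported one. All function names are hypothetical.

```python
import math


def categorize(estimate_pct, correct_pct=20):
    """Classify a percentage estimate relative to the correct bonus percentage."""
    if estimate_pct < correct_pct:
        return "underestimation"
    if estimate_pct > correct_pct:
        return "overestimation"
    return "correct estimation"


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def chi_square_p_even_df(stat, df):
    """Upper-tail p-value of a chi-square statistic; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = stat / 2
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * series


# ASSUMED counts: reported percentages read as counts per 100 participants.
# Columns: underestimation, correct estimation, overestimation.
table = [
    [32, 52, 16],  # correct area and percentage
    [16, 59, 25],  # oversized area and correct percentage
    [33, 38, 29],  # oversized area only
]
stat, df = chi_square_independence(table)
p_value = chi_square_p_even_df(stat, df)

# Consistency check on the reported test, chi-square(4) = 15.75:
reported_p = chi_square_p_even_df(15.75, 4)
print(round(stat, 2), df, round(p_value, 3), round(reported_p, 3))
```

The closed-form p-value is used only to keep the sketch free of external dependencies. A statistics library would normally supply the chi-square distribution for arbitrary degrees of freedom.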
Additionally, participants reported volume perceptions on a rating scale. Table 1 provides means and standard deviations. An ANOVA shows a significant main effect of packaging (F(2, 303) = 3.50, p = .031, partial η² = .02). Post hoc tests revealed that volume estimates were higher for participants who saw the oversized area plus percentage (M = 5.64) than for participants who saw the correct area plus percentage (M = 5.35, p = .009), in partial support of hypothesis 1. Other differences between packaging conditions were not significant. There was also a marginal effect of evaluation context (F(1, 303) = 3.12, p = .078, partial η² = .01), with people tending to give higher volume estimates in separate evaluation (M = 5.57) than in joint evaluation (M = 5.41). The interaction effect between packaging and evaluation context was not significant (F(2, 303) = 0.12, p = .884). We thus found no support for hypothesis 3.

Purchase Intention
The repeated measures ANOVA for purchase intention showed a significant main effect for evaluation context (F(1, 303) = 7.38, p = .007, partial η² = .02), which is qualified by a significant interaction between type of product and evaluation context (F(1, 303) = 4.97, p = .027, partial η² = .02). None of the other main effects and interactions was significant. Means and standard deviations are provided in Table 1. Examining these results further in separate ANOVAs for bonus pack and comparison product, we found that the evaluation context had a significant effect on the evaluation of the bonus pack (F(1, 303) = 12.06, p = .001, partial η² = .04), but not on the evaluation of the comparison product (F(1, 303) = 2.34, p = .127). It thus appears that, in line with our reasoning, the presence of a comparable product affects participants' evaluation of a bonus pack, while the evaluation of the comparison product remains stable across conditions. Surprisingly, purchase intentions for the bonus pack were higher in joint evaluation (M = 4.24) than in separate evaluation (M = 3.78). Apparently, the evaluation context made the presence of bonus volume more salient.
To check whether volume perceptions mediated (i.e., were the underlying process for) effects of packaging condition on purchase intentions, we estimated a moderated mediation model using the PROCESS macro in SPSS (model 7; Hayes, 2017). In this analysis, packaging condition was entered as a categorical independent variable using effect coding, volume perception (in centimetres) was the mediator, evaluation context was the moderator, and the difference score between purchase intentions for bonus product and comparison product was the dependent variable. Results showed that none of the indirect effects was significant (all confidence intervals included 0), indicating that volume perceptions were not the mediating process.
The overall higher purchase intention for bonus packs in joint than in separate evaluation (regardless of volume indications) can occur because participants realized that the indicated bonus area was (in some conditions) oversized but preferred the product nonetheless as it contained extra volume, or because participants did not realize that the bonus area was oversized. To investigate this further, we analysed effects on inferences of manipulative intent.

Inferences of Manipulative Intent
An ANOVA with inferences of manipulative intent (full scale) as dependent variable showed that none of the effects was significant (packaging condition: F(2, 303) = 0.35, p = .703; evaluation context: F(1, 303) = 0.003, p = .955; interaction between these: F(2, 303) = 0.12, p = .889). Yet, examining the subscale on perceived honesty of the packaging, a marginally significant main effect of packaging was found (F(2, 303) = 3.01, p = .051, partial η² = .02). Participants who saw the packaging with oversized area found this to be less honest (M = 4.58) than participants who saw the packaging with correct area and percentage (M = 4.96). The effects of evaluation context and the interaction were not significant. For the subscale on perceived manipulative intent of the company, none of the effects was significant. Thus, participants appeared aware that the packaging with oversized volume indication was less honest, but they did not seem to consider this manipulative.

Choice
Participants were given a choice between the bonus pack, the comparison product and cash (€1.50) at the end of the study. Figure 2 shows the percentage of participants choosing each of these options in the various conditions. Overall, none of the options is preferred significantly more than the others (χ²(2) = 0.718, p = .698). To test whether choice for the bonus pack is higher depending on packaging condition or evaluation context, we applied a multinomial regression.
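The overall test of equal preference among the three options is a chi-square goodness-of-fit test with two degrees of freedom. The sketch below uses hypothetical counts: the text reports only the test statistic, so the counts were picked merely to sum to 309, the sample size suggested by the reported degrees of freedom, and to give a statistic of similar size. They are not the study's data and are not assigned to specific options.

```python
import math


def goodness_of_fit_equal(counts):
    """Chi-square statistic and df for H0: all options are chosen equally often."""
    total = sum(counts)
    expected = total / len(counts)
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat, len(counts) - 1


# HYPOTHETICAL counts for the three options (not taken from the study).
hypothetical_counts = [110, 100, 99]
gof_stat, gof_df = goodness_of_fit_equal(hypothetical_counts)

# For df = 2 the upper-tail p-value has the exact closed form exp(-stat / 2).
gof_p = math.exp(-gof_stat / 2)
print(round(gof_stat, 3), gof_df, round(gof_p, 3))
```

With three options chosen about equally often, the statistic stays small and the p-value is far from conventional significance thresholds, which is the pattern the overall test above describes.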
According to the multinomial regression model, the choice for bonus pack (with "choice for comparison product" as reference category) was not significantly affected by evaluation context, packaging condition or their interaction (ps > .05), although the interaction effect between an oversized area (with "correct area and correct percentage" as reference category) and evaluation context was close to significance (β = −1.30, χ²(1) = 3.62, p = .057). The McFadden pseudo R-square was low (.026), indicating a small effect size, and effects of packaging condition were not significant when we examined choices separately for the two evaluation contexts. This means that we find very little statistical evidence to support that choices are affected by packaging condition and evaluation context, and that the effect size of such a possible effect appears to be very small. The pattern of results is opposite to what we found for purchase intention, with participants being more inclined to choose a bonus pack with oversized area under separate evaluation than under joint evaluation. Mediation analyses (again using PROCESS, model 7) furthermore showed no significant indirect effects of volume estimates in the relation between packaging condition and choice of bonus pack (entered as a dichotomous dependent variable).

Discussion
In this study, we have tested the effects of oversized volume indications on the transactional decision of consumers in two types of evaluation contexts, namely separate and simultaneous. Our results contribute to previous findings about consumer deception. Specifically, we have shown that the oversized bonus area leads consumers to overestimate the bonus volume, under all conditions of the study. Hence, such bonus packs generally deceive consumers. These results parallel findings of previous research on bonus packs (Purnhagen & Van Herpen, 2017). People thus tend to overestimate the size of the bonus volume in general, regardless of context. Even adding the correct percentage clearly on the front-of-pack does not mitigate this general effect, according to the current results. Many participants could not recollect the percentage that was indicated, underlining the high relative salience of the visual area in comparison to the low salience of the percentage itself.
When it comes to purchase intentions, people rated the bonus pack higher in joint evaluation than in separate evaluation. Thus, participants had a higher intention to buy the bonus product in a competitive setting with other products than when evaluating it alone. For this effect to occur, it was irrelevant whether the pack contained an oversized area or the correct area. The availability of a comparison product appears to have made the extra volume salient. Yet, when it came to the actual transactional decision (choice), participants tended to be relatively less likely to choose the bonus pack than the comparison product in joint evaluation, and relatively more likely to choose the bonus pack than the comparison product in separate evaluation (although this effect did not reach statistical significance). This result is opposite to what we observed when measuring purchase intention. It illustrates that, to assess actual transactional decisions, looking only at purchase intention may not suffice, as results for purchase intention and choice may not align. Future studies might strengthen the empirical evidence by using a different measure of purchase behaviour. For example, studies could ask participants to pay out of pocket (a realistic price) for the gingerbread, rather than foregoing a monetary reward.
Testing whether consumers know that they are persuaded by the bonus pack, the results indicate that they indeed realize that the packaging with oversized area is less honest, but do not find this serious enough to affect their evaluation of the company as manipulative. This suggests that consumers may not find bonus packs with oversized areas especially troublesome. Yet, our data do not allow definite conclusions as to this effect, and this could be something to explore in more depth in future research.
Article 6 (1) UCPD requires testing whether the deceptiveness "causes or is likely to cause [the consumer] to take a transactional decision that he would not have taken otherwise." In a competitive choice setting, a significant number of consumers intended to purchase the product indicating a bonus, regardless of whether the bonus area was correctly sized or not. Hence, it seems that it is not the deceptiveness of the bonus packs as such that leads consumers to intend to purchase the product. Rather, the fact that there is a bonus volume seems to have an effect on purchase intention. This is in line with the data on inferences of manipulative intent, which indicate that consumers simply may not care whether the package is deceiving or not. From these results alone, one could not draw the conclusion that the bonus pack's deceptiveness has an influence on consumers' transactional decision in the way Art. 6 (1) UCPD requires. If one adds the results on choice, consumers tended to be less likely to choose the bonus pack in a competitive choice setting. Hence, in competitive choice settings (which should represent the majority of consumers' shopping experiences), we could not produce empirical evidence that the deceptiveness "causes or is likely to cause [the consumer] to take a transactional decision that he would not have taken otherwise." A possible explanation could be that in competitive choice settings, the bonus volume may be perceived as a distinct benefit compared to the other product, and a lost opportunity if not chosen. Previous research has shown that consumers are especially attracted to sales promotions that are limited in time, and that they will accelerate purchases for these types of promotions (Aggarwal & Vaidyanathan, 2003). In our case, consumers may feel that if they do not take the bonus pack, they may not be able to reap the benefits the bonus pack volume may provide to them over the other product. For such an intention, it matters less how large the benefit is exactly; it suffices that there is a bonus option over the other product. This potential process is not able to explain, however, why choice of the bonus pack tended to be lower in joint evaluation compared to separate evaluation.
In separate evaluation, consumers seem to have less intention to purchase a product with an oversized bonus area compared to a competitive choice setting. However, their actual choice of the product is relatively high, albeit not significantly different from the other packaging conditions in our study. In such settings, the deceptiveness does not seem to carry through to choice. In sum, according to our study, the deceiving effect of bonus packs does not have an effect on the transactional decision of consumers.
To sum up, these results support and extend the findings of the study from Purnhagen and van Herpen (2017) by showing that an oversized extra-volume indication area influences consumers' volume perceptions. Yet, although consumers seemed to be deceived, this did not seem to have a clear influence on their transactional decision.

Limitations and Future Research
A limitation of our study is that it took place in a lab environment rather than a real-life environment, such as a supermarket or other retail environment. In a real-life environment, a large assortment of alternative products will be offered (rather than one competing product), and more distraction is present. Moreover, these competing products could be differently packaged, making direct comparisons of bonus volume more cumbersome for consumers. What the current study indicates is that such contextual factors could influence the extent to which the transactional decision of consumers is affected.
Our experimental setup also implies that the choices of participants are made in a somewhat artificial context. We took precautions to ensure that choices are incentive-compatible and resemble actual purchases as closely as possible. Participants had to give up a monetary reward for participation to receive the product. Yet, this situation is not identical to purchase decisions made in a supermarket, and future research could examine more natural contexts.
Future research could also focus on using alternative measures for choice, such as letting participants pay out of pocket or using actual sales data, to test if our pattern of results for choice is robust. Our current findings suggest that future research may want to focus on the effects of an oversized volume indication for products or situations in which no comparison product is readily available. This could for instance be products with a package shape that is different from competing products, so that the size of additional volume is difficult to ascertain, or online shopping situations in which the pictorial representation of the products complicates direct comparison of product size. Our results indicate that a potential increase in choice due to deception is most likely to occur under these conditions, if at all.
Finally, the use of alternative methods to examine the case of oversized volume indications could supplement the current findings and provide a deeper understanding. For instance, qualitative in-depth interviews or focus groups could be used to gain more insight into consumer responses. This could answer questions such as to what extent consumers are aware of the oversized area indications, what their main concerns are, and what type of reactions occur once consumers become aware of the deception. Moreover, quantitative analyses of sales data from products that temporarily contain oversized versus correct-sized volume indications on pack could provide an indication of the size of potential effects on sales volumes. Such analyses would need to control for many factors that could simultaneously affect sales volumes, such as advertisements for the bonus product, competitive reactions, and seasonal patterns.

Implications
Our study has several important theoretical and practical implications. The differences that we found between stated purchase intention and actual product choice imply that studies based on purchase intention may not provide the needed insights to determine whether transactional decisions of consumers will be affected. Therefore, research that sets out to examine such transactional decisions should include measures of actual choice.
Another implication from our research relates to the measurement of extra volume perceptions. Our results indicate that some measures are more sensitive than others, in picking up differences in perceptions. We recommend the use of the volume perception measure in centimetres and the item scale since these appear to be the most sensitive in picking up differences in consumer perceptions among the different package conditions.
As previously mentioned, our research emphasizes the necessity and potential of incorporating behavioural studies in EU legislation and adjudication, specifically in the area of unfair commercial practices. It is thus of relevance to consumers, current European legislation, judges and marketers for several reasons. It is relevant to consumers because of the potentially misleading practice to which they can fall victim. When they overestimate the actual extra volume of a product due to the oversized volume indication section on the package and thereby end up buying this product, they get less value than expected. This, according to the Court, can make a commercial practice misleading (Art. 7 UCPD), from which they should be protected. In this respect, we found very little statistical evidence to support that choices are affected by packaging condition and evaluation context, and the effect size of such a possible effect appears to be very small. These insights have implications for the Court of Justice of the European Union since it could alter its future interpretation concerning volume indications on packages.

Conclusions
Research in the field of behavioural sciences has provided valuable insights into how consumers' behaviour can deviate from the expected rational standard, revealing their systemically limited processing capacities. These findings can be used to improve the interpretation of the average consumer benchmark as a basis for unfair commercial practice law. In this regard, we find that consumer perceptions are affected by visually salient volume indications. The question, however, is whether these altered perceptions lead to changes in consumers' transactional decision. This could then, consequently, be used as input to determine an unfair commercial practice (Art. 7 UCPD).
Our results indicate that transactional decisions appear relatively unaffected, especially when evaluating such a bonus pack jointly with a comparison product.
Although consumer protection levels are still to be determined at Member State level, following the claim established by Schebesta and Purnhagen (2017), systematic deviations could be addressed at Union level, taking into account the insights of behavioural sciences to determine a more accurate and realistic average consumer benchmark. Whether such insight will be used to inform regulation, law-making and/or legal interpretation depends to a large extent on the weight that actors will give to them in order to protect consumers against other factors, such as the provision of the principle of free movement of goods, when exercising their discretion or interpreting the law. In this respect, the Mars case concerned free movement of goods, and at the time it was decided, neither the FIC nor the UCPD had entered into force. General remarks regarding whether this case should have been decided differently are difficult to make. Normative legal recommendations based on empirical evidence should generally be made with care, as empirical evidence should be considered only one of the aspects in judicial decision-making, and only as far as findings are generalizable and applicable to the concrete case and the concrete law at hand (Purnhagen, 2018; Zeiler, 2010). In conclusion, behavioural studies can provide relevant insights for Courts and regulators about the effects of commercial practices on consumer perceptions. While it is far from clear whether and at what level (Member State or EU) the current consumer benchmark requires the incorporation of behavioural science under EU law, it does not automatically preclude it.