1 Introduction

Product packaging represents an essential component of communicating brand meaning to consumers. Especially at the point of purchase, packaging has been identified as the most important vehicle of communication (Underwood et al. 2001; van Rompay et al. 2013). However, although several aspects of package design such as shape (Folkes and Matta 2004; Raghubir and Greenleaf 2006; Schoormans and Robben 1997) and texture (Veryzer and Hutchinson 1998) have been investigated, only limited research attention has been placed on the role that package color plays in shaping consumer preferences (Labrecque and Milne 2013). Package color attracts consumers’ attention, creates aesthetic experiences, and renders symbolic value to the brand (Garber et al. 2000; Kauppinen-Räisänen and Luomala 2010; Labrecque and Milne 2012). In general, marketing literature identifies color as a crucial element for brand positioning and product identification (van Rompay et al. 2013). Against this background, it is surprising that only a limited number of studies have explored the influence of different package color options on consumers’ responses (see Labrecque and Milne 2012, 2013 for a few exceptions).

Meeting consumers’ color expectations has been found to benefit brand equity by increasing processing fluency and facilitating both category and brand identification (Labrecque and Milne 2013). These specific expectations about the color options that brand packages within the given category typically employ are represented in category color norms (Bottomley and Doyle 2006; Labrecque and Milne 2012). While not all product categories possess category color norms, those product categories where category color norms exist offer two different strategic alternatives to meet consumers’ color expectations. First, colors of new product packages can conform to the color of the market leader in a particular category. Color norms can develop over time in a “benchmarking” mode by following the example of the existing dominant player in a certain product category (e.g., red for colas).

Second, package colors can conform to the intuitive meaning a color has for a certain product category. For instance, brands of milk typically come in white packages, whereas orange juice comes in orange. These products evoke color associations that correspond to the natural color of a product category. In addition, the intuitive meaning of a color for a category develops from the learned associations consumers have with a specific color. Gold is associated with luxury, so many luxury products are packaged in gold. Similarly, the color red is associated with love, and hence red is often used for the packaging of romantic products. This latter kind of category color norm is culture specific, since different cultures have different color associations. From a practical perspective, marketers need to be aware of these differences in color meanings in order to adapt package colors to varying cultural contexts if necessary.

Although brands generally tend to conform to this “market standard” (i.e., package colors correspond to the dominant player or to the intuitive meaning a product has), a closer look at the marketplace reveals that brands sometimes choose unexpected or atypical package colors in relation to their corresponding product category in an attempt to attract attention and differentiate themselves from the competition (Garber et al. 2000; Kauppinen-Räisänen and Luomala 2010; Schoormans and Robben 1997; Stoll et al. 2008). For example, Pepsi’s prominent use of blue is considered less conventional with regard to cola-type soft drinks, which are typically associated with variations of red (e.g., Coca-Cola, Dr. Pepper).

Schema incongruity theory suggests that deviations from consumers’ expectations can attract attention, lead to more positive responses (Mandler 1982), and confer the advantage of differentiation manifested in superior market performance (Talke et al. 2017). At the same time, however, incongruity research indicates that when deviations from established perceptions cannot be successfully accommodated within existing cognitive structures, they generate confusion and lead to unfavorable effects (Garaus, Wagner and Kummer 2015; Halkias and Kokkinaki 2014, 2017). Overall, despite the importance of product package design in marketing theory and practice, extant research has not yet systematically investigated the consequences of adopting an atypical package color (Schoormans and Robben 1997). Consequently, there is a lack of evidence with regard to whether and to what extent relevant package color decisions might harm or benefit a brand.

The present research tries to empirically address this issue by exploring the relative influence of typical and atypical package colors on consumer emotional responses and on preferences in product categories where category color norms exist (as developed by the intuitive meaning a color has for a certain product category). The findings contribute to research on product package design by emphasizing the importance of color as a package element that may contribute to a brand’s potential success or failure. In addition, by suggesting that not only shape (Folkes and Matta 2004; Raghubir and Greenleaf 2006) but also color is an essential component of category expectations, our findings provide new insights for future research on product (a)typicality. From a practical perspective our research alerts managers to the potential detrimental consequences of choosing atypical package colors and implies that brands benefit more from generally conforming to rather than violating product category color norms.

2 Conceptual framework

2.1 Product category color norms

According to categorization theory, individuals tend to organize the external environment on the basis of previous experiences (Fiske and Taylor 1991). That is, based on the knowledge accumulated within a given conceptual domain, people form cognitive categories over time, or sets of expectations, which subsequently are used to determine how future experiences will be apprehended (Loken and Ward 1990; Sujan 1985; Sujan and Bettman 1989). Along these lines, research on categorization focuses on how individuals organize and represent knowledge while it seeks to explain how the integration of newly encountered stimulus objects into existing cognitive structures can be used to make inferences about the nature of these objects (Lajos et al. 2009; Uekermann et al. 2010). In general, psychological literature identifies the process of categorization as a fundamental cognitive activity and suggests that people tend to naturally rely on cognitive categories as bases for inductive inference and prediction (Fiske and Taylor 1991).

Marketing research emphasizes the importance of category structures in the identification and differentiation of products and brands (Lajos et al. 2009). Individuals create and maintain categories based on the taxonomic representation of the stimulus objects they encounter (Estes et al. 2012). From a consumer’s perspective, the objective is to develop categorical structures that maximize homogeneity within each category while, at the same time, maximizing heterogeneity across different categories (Rosch et al. 1976). These categorical structures guide the integration of new and old information, enable inference-making about the nature of new stimulus objects, and often determine evaluative judgments of the latter (Lajos et al. 2009; Sujan and Dekleva 1987). The organization of the market in the form of product categories can be seen as a representation of such categorical structure (Estes et al. 2012; Halkias 2015). Through their interactions with the market environment, consumers learn to organize, store, and eventually utilize market information on the basis of existing product categories (Lajos et al. 2009). This leads to normative inferences and, subsequently, expectations about the features, the functions, and the performance of new products that eventually influence consumers’ thoughts, feelings, and overall attitudes toward these products (Sujan 1985; Sujan and Bettman 1989).

The product package design, defined as “the various elements chosen and blended into a holistic design to achieve a particular sensory effect” (Orth and Malkewitz 2008, p. 64), is one of the most important cues of product category activation. Different packages contribute to the identification of different design prototypes and norms, providing important perceptual input for successful product categorization (Orth and Malkewitz 2008, 2012; Schoormans and Robben 1997). In the context of product package design, research has primarily focused on how (un)expected, novel, or hybrid shapes influence product categorization and how they affect consumers’ evaluation and adoption of new product introductions (Orth and Malkewitz 2012). Nonetheless, only limited attention has been placed on the role that different color options might play in shaping consumer preferences (cf. Labrecque and Milne 2013).

Veryzer and Hutchinson (1998) argue that color is an essential feature of package design and constitutes a prominent component of the product’s visual identity (see also Garber et al. 2000; Labrecque and Milne 2012). In a similar sense, literature indicates that consumers’ category perceptions incorporate specific expectations about the color options that brand packages within the given category typically employ (Bottomley and Doyle 2006; Labrecque and Milne 2012). On the one hand, such product category color norms are intuitively developed by mere association with the nature of the product category (e.g., white for dairy products, green and brown for gardening tools) or learned color associations that correspond to a given product category (e.g., gold for luxury products). On the other hand, as recent research suggests (Labrecque and Milne 2013), color norms may be developed over time as brands within a given product category tend to adopt a similar color to that of the dominant player in the market (e.g., red for colas). A visit to the supermarket illustrates that, across several categories of household products, competing brands tend to have the same or similar package color. However, even though market practice testifies to the existence of product category color norms (Gorn et al. 1997), research to date has not provided conclusive evidence with regard to the consequences of adopting a package color that is typical or atypical in relation to its corresponding category color norm (Labrecque and Milne 2013).

2.2 Consumer responses to (a)typical package colors

Categorization research suggests that the more a stimulus object possesses features that are central to its corresponding category, the more prototypical it is considered to be (Loken and Ward 1990). Thus, product typicality refers to the extent to which the product’s features overlap with those commonly encountered in the category or, alternatively put, the extent to which the product is perceived to be representative of its product category (Loken and Ward 1990; Ozanne et al. 1992). Schema incongruity theory (Mandler 1982) suggests that information that does not conform to predefined cognitive categories or schemata may captivate the receiver’s attention and elicit positive affective responses (Meyers-Levy and Tybout 1989). The attention-getting effect of atypical information receives considerable support from various studies (e.g., Halkias and Kokkinaki 2017; Schützwohl 1998; Törn and Dahlén 2007). Nevertheless, with reference to the Attention-Interest-Desire-Action framework, research that goes one step further and explores the influence of deviations from category color norms on emotional responses, such as interest, is required in order to fully understand the effects of package color (a)typicality on consumers’ preferences.

Advertising research reveals that essential creative dimensions such as novelty and originality entail divergence from preexisting schemata and expectations (Ang et al. 2007; Sheinin et al. 2011), which has been found to provide an effective tool for engaging consumers in the process of communication (Halkias and Kokkinaki 2014). Similarly, research acknowledges the interest- and curiosity-evoking effect of new package designs (Alexander 2008; Celhay and Trinquecoste 2015; Snelders and Hekkert 1999). In contrast to familiar packages, new package designs are atypical (Hekkert et al. 2003; Snelders and Hekkert 1999), and changes in visual design elements, such as colors, are particularly appropriate to manipulate perceived product typicality (Celhay and Trinquecoste 2015). In line with that, inconsistencies, ambiguities, and deviations from conventional norms may produce a pleasurable sense of arousal in product design, while consumers become bored by prototypical products (Veryzer and Hutchinson 1998).

Atypical package design increases cognitive effort required for processing packages. In support of this notion, Ulrich (1983) emphasizes that the affective response interest positively correlates with information processing effort and that disordered natural scenes likely evoke interest. Consistent with this, research on the psychology of aesthetics identifies the emotional response interest as likely to be evoked from new stimuli (Berlyne 1960) and associates interest with curiosity, exploration and information seeking (Silvia 2005). Based on this research stream, studies emphasize the relevance of interest as emotional response to art (Silvia 2005) and to the consumption experience (Westbrook and Oliver 1991). Drawing on this evidence, we propose that:


H1: Atypical package colors increase consumers’ interest.

Stimuli including information that raises curiosity and interest are, in principle, more intellectually challenging and often associated with higher aesthetic value as well as increased perceptions of creativity (Sheinin et al. 2011). For instance, research findings demonstrate that product and brand messages which challenge established perceptions increase consumers’ motivation to further investigate the content of the message (Ang et al. 2007), eventually leading to higher ad and brand liking (Halkias and Kokkinaki 2014). In line with this reasoning, the interest generated by a product package that bears an atypical color in relation to its category is expected to positively influence attitudinal responses toward the product itself. Hence, we predict that:


H2: Interest positively influences consumers’ attitude toward the brand.

Other literature indicates that individuals generally tend to prefer typical or category-congruent objects, as such choices provide a sense of structure that simplifies the understanding of the complex social environment (Fiske and Taylor 1991; Ozanne et al. 1992). In line with this, marketing research suggests that product preference is a positive function of consumers’ subjective perceptions of product typicality or category representativeness (Loken and Ward 1990). The underlying idea is that when the product information can be successfully categorized into existing cognitive structures, comprehension and efficient decision-making are enabled (Halkias and Kokkinaki 2014, 2017; Schoormans et al. 1997). More specifically, empirical studies reveal that high attribute typicality increases product identification (Orth and Malkewitz 2012) and is substantially correlated with meaningfulness and familiarity (Loken and Ward 1990). This, in turn, increases processing fluency and stimulates positive feelings that lead to more favorable judgments (Mayer and Tormala 2010).

Perceptually novel and atypical attributes do not conform to existing knowledge structures and are more difficult to process, often resulting in divergent interpretations and ambiguity (Bottomley and Doyle 2006; Orth and Malkewitz 2012). Nevertheless, atypical information attracts attention (e.g., Halkias and Kokkinaki 2017), which might reflect a beneficial strategy for new products in a highly cluttered marketplace. As regards emotional responses to such a strategy, advertising research indicates that attention-getting tactics create perceptions of manipulative intent, which in turn increase counterarguing (Campbell 1995). In addition, atypical information, as compared to typical and expected information, is associated with lower credibility and persuasiveness (Dahlén et al. 2005; Goodstein 1993), both of which are negatively correlated with skepticism (Obermiller et al. 2005). Supporting this line of argumentation, Obermiller and Spangenberg (1998) conclude that attention-getting tactics likely increase feelings of skepticism. In the context of package design, skepticism reflects “a consumer’s tendency to question any aspect of a new product offering, in any form it may appear (e.g., facts, inferences, or claims)” (Morel and Pruyn 2003, p. 352). Alexander (2008) argues that the psychological newness of new products increases skepticism. This negative emotional response to new and unfamiliar product attributes can be explained by the associated difficulty of making sense of the particular package attribute (Babin et al. 1995; Halkias and Kokkinaki 2017).

Given that color is an important feature of the product’s overall appearance, a package color that is perceived to be atypical with regard to the corresponding product category color norm is expected to be perceived as new, to generate a sense of ambiguity and increase feelings of skepticism toward the product. Thus, it is predicted:


H3: Atypical package colors increase consumers’ skepticism.

The significant role that emotions play in shaping behavior is a long-standing proposition in psychological research (Bagozzi et al. 1999). Individuals’ affective reactions to a stimulus are found to strongly influence subsequent attitudinal responses toward the stimulus in a valence-consistent manner (Holbrook and Batra 1987). In this sense, appraisal theory suggests that people’s cognitive assessment of a given situation results in the formation of emotions that are then manifested in behavioral intentions. As Bagozzi et al. (1999, p. 184) argue, such affective reactions represent “a mental state of readiness that arises from cognitive appraisals of events or thoughts” and result in specific actions, depending on whether this psychological state benefits or harms the person having it. Therefore, the feeling of skepticism triggered by the atypical package color should in turn lead to unfavorable brand responses. To this end, previous research shows that attention-getting tactics decrease attitudes toward the advertiser (Campbell 1995). If consumers feel skeptical about new products, the information value about the benefits of new offerings is diminished (Mohr et al. 1998). Consequently, the skepticism induced by the choice of an atypical package color is expected to result in a less favorable attitude toward the brand. Drawing on this evidence, we suggest that:


H4: Skepticism negatively influences consumers’ attitude toward the brand.

As described above, perceptions of (a)typicality and subsequent (un)successful categorization elicit affective responses that correspondingly contribute to the overall assessment of the stimulus object. According to Chaudhuri and Holbrook (2001), positive affect contributes to more favorable attitudinal outcomes, which in turn encourage favorable behavioral tendencies, whereas the opposite chain of effects holds true for negative affect. Therefore, consumers’ overall attitude toward the brand will be positively related to subsequent brand purchase intention. It should be noted that the present research does not a priori hypothesize the relative magnitude of the negative effect of skepticism and the positive effect of interest. The experienced intensity of affect is a highly idiosyncratic issue (Rocklage and Fazio 2015), and not enough evidence exists to allow confidence in drawing specific predictions. As such, to the extent that one dominates the other in consumers’ attitude formation, an effect of similar directionality is expected to be manifested in their purchase intentions. Thus, it is predicted that:


H5: Consumers’ attitude toward the brand is positively related to brand purchase intentions.

The hypothesized relationships described above lead to the conceptual model that was developed and tested in the present research (Fig. 1). Overall, the choice of an atypical package color appears to have antagonistic effects with regard to the emotional responses it stimulates. In particular, despite the fact that package color atypicality should stimulate positively valenced arousal in the form of increased curiosity and interest (Babin et al. 1995; Berlyne 1960), it is also expected to simultaneously inhibit processing fluency and increase consumers’ skepticism.

Fig. 1 Conceptual model and results of experiment 1

3 Experiment 1

A sample of 177 consumers was recruited (73% female, Mage = 31 years, SD = 9.69, 31% university degree, 45% high school as highest completed level of education) for an online experiment employing a 2 (category color norm fit: atypical vs. typical package color) × 2 (product category: orange juice vs. sparkling wine) mixed factorial design. A link to the online survey was published on social media platforms targeted at respondents in a central European country. The fit of the product package color with the corresponding category norm was manipulated as a between-subjects factor by varying the package color (atypical vs. typical) of unknown stimulus brands. Two different product categories (orange juice and sparkling wine), operationalized as a within-subjects factor, served as stimuli to avoid category specificity and to increase the external validity of the findings.

The fast-moving consumer goods (FMCG) sector is particularly suitable for the current investigation for three reasons. First, many FMCGs possess typical package colors, which is a prerequisite for the current research. Second, FMCGs are typically presented in a highly cluttered environment and directly next to alternative products of the same category. Hence, for such purchases, package colors should be highly relevant for product identification and categorization (Underwood 2003; Veryzer and Hutchinson 1998). Third, grocery shopping is characterized by a utilitarian shopping motivation (Yim et al. 2014) and, hence, by efficient shopping task completion (Babin et al. 1994). Driven by the goal of efficient task completion, consumers often rely on mental shortcuts to draw conclusions about a product so that they do not need to think very deeply about their choice (Hoyer and MacInnis 2010; Mai et al. 2016). Package color typicality likely represents such a mental shortcut in utilitarian shopping situations. Structural equation modeling was applied to estimate the proposed model and test the hypothesized effects. A preliminary study was initially conducted to identify product category color norms and to enable subsequent development of the stimulus materials.

3.1 Stimulus material development

Since color associations vary among cultures (e.g., Grossman and Wisenblit 1999), an exploratory pre-study identified common color associations for the country under investigation. The employed procedure aimed to identify product categories that are intuitively associated with certain colors without any pre-conditioning (e.g., no market leader products determine color associations). Data were collected by an open-ended online survey employing a convenience sample (N = 41, 67.5% female, Mage = 26 years, SD = 10.681). Respondents were instructed to indicate their feelings, thoughts and general associations for a list of 14 different colors.

This procedure generated a total of 1006 responses (including multiple associations) that could be classified into 482 color associations (see Table 1). Three researchers content-analyzed participants’ responses to establish product–color association patterns and to select products that correspond to specific color schemes, thus serving the purpose of the present investigation. More than half of the participants associated the color orange with the fruit orange; therefore, the orange color was judged as typical for an orange juice and likely to evoke natural color associations. Likewise, more than half of the respondents associated the color gold with luxury. In the FMCG context, a sparkling wine was considered to reflect this color association. Colors whose associations had no single match with those of the colors orange and gold qualified as potential atypical colors. In order to create product packages that are atypical, yet not unrealistic, the selected products and potential typical and atypical package colors were validated by a Google picture search. A visual inspection of the resulting pictures of orange juices and sparkling wines indicated that blue and purple reflect atypical but realistic package colors, while orange and gold represent typical colors for orange juice and sparkling wine, respectively.
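The "more than half of the participants" criterion described above can be illustrated with a small tally. The following is only a sketch under stated assumptions: the response data, the participant identifiers, and the function names `association_shares` and `majority_associations` are invented for illustration and do not reproduce the study's coding scheme or data.

```python
from collections import Counter

# Hypothetical open-ended responses: participant id -> color -> associations.
# These values are invented for illustration and are not the study's data.
responses = {
    "p1": {"orange": ["orange fruit", "warmth"], "gold": ["luxury"]},
    "p2": {"orange": ["orange fruit"], "gold": ["luxury", "jewelry"]},
    "p3": {"orange": ["sunset"], "gold": ["luxury"]},
    "p4": {"orange": ["orange fruit", "autumn"], "gold": ["medal"]},
}


def association_shares(responses, color):
    """Return the share of participants naming each association for a color."""
    counts = Counter()
    for answers in responses.values():
        # A set ensures each participant counts at most once per association.
        counts.update(set(answers.get(color, [])))
    n = len(responses)
    return {assoc: count / n for assoc, count in counts.items()}


def majority_associations(responses, color, threshold=0.5):
    """Associations named by more than `threshold` of all participants."""
    shares = association_shares(responses, color)
    return sorted(a for a, s in shares.items() if s > threshold)


print(majority_associations(responses, "orange"))
print(majority_associations(responses, "gold"))
```

In practice the associations would first have to be coded into common categories by the researchers; the tally only covers the counting step that follows the content analysis.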

Table 1 Results of pre-study: color associations

An additional pre-test (N = 30, 52.2% female, Mage = 30 years, SD = 11.95) was conducted in order to establish that the package colors selected above do not vary in the emotional responses they evoke. In particular, the scale of Allen and Janiszewski (1989) and Labroo and Rucker (2010) was adopted to assess general mood states when being exposed to the colors orange versus blue (orange juice) and gold versus purple (sparkling wine). A MANOVA (Pillai’s trace V = .31, F(4, 18) = 2.00, p > .10) with color (orange vs. blue) as the independent variable and four mood states as dependent variables revealed no significant difference between orange and blue in good/bad (2.33 vs. 2.50), pleasant/unpleasant (2.80 vs. 2.13), happy/sad (2.47 vs. 2.88), and positive/negative (2.73 vs. 3.00) evaluations. Similarly, another MANOVA (Pillai’s trace V = .16, F(4, 25) = 1.16, p > .10) confirmed that respondents’ general mood state was not significantly different between gold and purple (good/bad: 2.27 vs. 2.80, pleasant/unpleasant: 2.73 vs. 2.73, happy/sad: 3.07 vs. 2.87, and positive/negative: 2.47 vs. 2.47).

Pictures of the two products orange juice and sparkling wine served as the stimulus material. In order to create variation in package color congruity, the original pictures of product packaging of the selected products (i.e., orange juice and sparkling wine) as yielded by the Google picture search were manipulated in Adobe Photoshop. In a first step, the brand name of the original product picture was removed in order to prevent any established brand associations. This prevented prior brand knowledge from potentially influencing typicality perceptions (Peracchio and Tybout 1996). Second, two different color versions of the product packaging were created. The first version depicted a product in a typical color (i.e., orange juice with orange packaging and sparkling wine with gold packaging), while the second version illustrated the product in atypical colors (i.e., orange juice with blue packaging and sparkling wine with purple packaging) (see Fig. 2).

Fig. 2 Stimulus material (experiments 1 and 2)

3.2 Procedure

Participants were randomly allocated to one of the two experimental groups. In the typical condition, respondents were exposed to the orange-colored orange juice and the gold sparkling wine. The blue orange juice and the purple sparkling wine were used in the atypical package color condition. Each product was displayed on a separate web page, and the pages were available in two different orders for each condition to avoid any sequence effect. A short introduction informed respondents of the purpose of the study, namely the evaluation of grocery products. Respondents were then exposed to the stimulus product and responded to the items assessing typicality, skepticism, interest, product attitude and purchase intention. Respondents were not allowed to return to previous sections.
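The allocation logic of such a procedure can be sketched as follows. This is a minimal illustration, assuming that both the color condition and the page order are drawn at random for each participant; the text above only states that allocation to groups was random and that two orders existed. The names `CONDITIONS`, `ORDERS`, and `assign` are hypothetical, and the fixed seed serves only to make the sketch reproducible.

```python
import random

CONDITIONS = ("typical", "atypical")
# Two presentation orders per condition, to counterbalance sequence effects.
ORDERS = (
    ("orange juice", "sparkling wine"),
    ("sparkling wine", "orange juice"),
)


def assign(participant_ids, seed=0):
    """Randomly assign each participant a color condition and a page order.

    Assumption: both draws are independent and uniform. The fixed seed is
    used only so that the sketch gives the same plan on every run.
    """
    rng = random.Random(seed)
    plan = {}
    for pid in participant_ids:
        plan[pid] = {
            "condition": rng.choice(CONDITIONS),
            "order": rng.choice(ORDERS),
        }
    return plan


plan = assign(range(1, 9))
for pid, cell in plan.items():
    print(pid, cell["condition"], " -> ".join(cell["order"]))
```

Simple random draws of this kind do not guarantee equal cell sizes; a blocked randomization would be needed if balanced groups were required.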

3.3 Dependent measures

All items were based on established scales drawn from extant research. More specifically, respondents were asked to evaluate the stimulus product packages’ typicality using three items based on Babin et al. (1995) and Sujan and Dekleva (1987). The emotions skepticism and interest were each assessed with items adopted from Babin et al. (1995). Holbrook and Batra’s (1987) four-item brand attitude scale assessed product attitude. Finally, two items were used to measure purchase intention. Scale items were operationalized with five-point response alternatives. All details about the measurement scales and their psychometric properties are summarized in Table 2.

Table 2 Factor loadings and reliability statistics for the conceptual research model

3.4 Analysis and results

3.4.1 Measurement validation

Following the procedure of Anderson and Gerbing (1988), data were first subjected to a CFA to assess the reliability and validity of the scales used. The final measurement model yielded a good model fit. The χ² statistic was 194.94 with 55 degrees of freedom. The root mean square error of approximation (RMSEA) was 0.08, the comparative fit index (CFI) was 0.97, the non-normed fit index (NNFI) was 0.96, the goodness of fit index (GFI) was 0.92, and the standardized root mean square residual (SRMR) was 0.03. All factor loadings were substantial (≥ 0.68) and significant, indicating evidence for convergent validity. Reliabilities were excellent across all scales, with average variance extracted (AVE) values not smaller than 0.67, and construct reliabilities larger than 0.80. The Fornell and Larcker (1981) criterion was used to test discriminant validity. As Table 3 shows, the AVE of each construct was higher than its squared multiple correlations with the other latent variables in the model. Table 3 exhibits descriptive and reliability statistics for the latent constructs.
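For readers less familiar with these statistics, the following sketch shows how AVE, construct (composite) reliability, and a Fornell and Larcker style check are commonly computed from standardized factor loadings and latent correlations. The loadings and the correlation below are hypothetical values chosen for illustration, not the estimates reported in Tables 2 and 3, and the function names are our own.

```python
def ave(loadings):
    """Average variance extracted: mean of squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def construct_reliability(loadings):
    """Composite reliability: (sum of loadings)^2 over itself plus error variance."""
    total = sum(loadings) ** 2
    error = sum(1 - l ** 2 for l in loadings)  # error variance per indicator
    return total / (total + error)


def fornell_larcker_ok(aves, correlations):
    """True if every AVE exceeds the squared correlation with each other construct."""
    for (a, b), r in correlations.items():
        if r ** 2 >= aves[a] or r ** 2 >= aves[b]:
            return False
    return True


# Hypothetical standardized loadings and latent correlation (not study data).
loadings = {
    "skepticism": [0.80, 0.85, 0.90],
    "interest": [0.75, 0.85, 0.90],
}
correlations = {("skepticism", "interest"): -0.50}

aves = {name: ave(ls) for name, ls in loadings.items()}
for name, ls in loadings.items():
    print(name, round(aves[name], 3), round(construct_reliability(ls), 3))
print(fornell_larcker_ok(aves, correlations))
```

The check compares each construct's AVE with the squared correlation it shares with every other construct, so discriminant validity fails as soon as a single pair violates the inequality.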

Table 3 Correlation matrix of latent constructs, variance extracted, construct reliability, and descriptive statistics (measurement model)

3.4.2 Manipulation check and preliminary analysis

For the analysis, data were arranged according to product categories, producing a dependent sample of 342 observations (10 observations had to be eliminated due to missing values) with each respondent providing two data points for each construct. This procedure enabled the calculation of a congruity × product interaction. An ANOVA tested the general premise that product package colors that meet common color associations are perceived as more typical than colors that are not associated with the product. The analysis confirmed this premise (F(1, 309) = 108.14, p < 0.01). Across the two product categories, the product packaging that did not meet product category color norms was perceived as less typical (MeanAtypicalOrangeJuice = 1.73, MeanAtypicalSparklingWine = 1.76) than the packaging that met the product category color norm (MeanTypicalOrangeJuice = 3.11, MeanTypicalSparklingWine = 3.02).
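The logic of such a manipulation check can be illustrated with a simplified one-way ANOVA that compares typicality ratings between the typical and the atypical condition. This sketch deliberately omits the product factor and the congruity × product interaction of the analysis above, and the ratings are hypothetical five-point values rather than the study's data.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way between-groups ANOVA."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical typicality ratings on a five-point scale (not study data).
typical = [3, 4, 3, 2, 4, 3]
atypical = [1, 2, 2, 1, 3, 2]

f_stat, df1, df2 = one_way_anova([typical, atypical])
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

A large F relative to its degrees of freedom indicates that the difference between the condition means is large compared with the variation within conditions, which is what a successful manipulation should produce.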

An analysis of the time stamps of the pages presenting the stimulus (either the orange juice or the sparkling wine) ruled out the alternative explanation that respondents devoted less attention to the incongruent package. Consistent with extant studies reporting the attention-getting effect of atypical stimuli, respondents remained longer on the pages showing the atypical orange juice package than on those showing the typical one (MeanAtypicalOrangeJuice = 176 s vs. MeanTypicalOrangeJuice = 144 s; F(1, 1569) = 5.41, p < 0.05). Similarly, the atypical sparkling wine attracted more attention than the typical one (MeanAtypicalSparklingWine = 148 s vs. MeanTypicalSparklingWine = 115 s; F(1, 140) = 7.74, p < 0.01). Moreover, there was no significant difference in congruity perceptions between the two product categories (F(1, 309) = 0.21, p = 0.65); hence the two product categories were collapsed for further analysis.

3.4.3 Structural model and hypotheses testing

To test the conceptual model, a structural equation model was estimated in LISREL. Constraints (i.e., paths) were placed on the overall covariance matrix as depicted in Fig. 1. The theoretical model yielded an excellent fit: χ2 (df = 57) = 198.23, RMSEA = 0.08, CFI = 0.97, NNFI = 0.96, GFI = 0.92, SRMR = 0.03. Surprisingly, and contrary to the proposed direction of H1, high levels of atypicality decrease interest (γ = −0.54, p < 0.01). This result contradicts extant literature on schema theory and thus requires further investigation in experiment 2 and elaboration in the discussion section. H2 suggests that the emotional response interest has a positive impact on product attitude. The data confirmed this hypothesis (β = 0.44, p < 0.01). As predicted, high levels of atypicality increase skepticism (γ = 0.60, p < 0.01), in support of H3. Moreover, skepticism negatively impacts product attitude, supporting H4 (β = −0.24, p < 0.01). Consistent with the theoretical reasoning and H5, product attitude is linked to purchase intention (β = 0.54, p < 0.01) (see Table 4; Footnote 1). Having tested the direct effects and the fit of the overall structural model, we examined the indirect effects specified in our model. As expected, there was a significant indirect effect on product attitude through skepticism (−0.12, p < 0.01) as well as through interest (−0.21, p < 0.01), making a total indirect effect from atypicality to product attitude of −0.75 (p < 0.01). Overall, the results confirm the proposed theoretical framework, with the exception of the negative influence of package color atypicality on interest. This result is particularly surprising, since it contradicts previous literature suggesting a positive effect of atypical package designs. Experiment 2 further explores the relationship between atypical package colors, emotional responses, and product choice, while controlling for awareness effects.
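The direction of the mediated effects can be illustrated with the product-of-coefficients rule, under which an indirect effect is approximated by multiplying the standardized coefficients along a path. The following is only a rough sketch: the dictionary `PATHS` and the helper `indirect_effect` are hypothetical names introduced here, the inputs are the rounded coefficients quoted in the text, and the hand-computed products need not coincide with the indirect effects estimated by the fitted model.

```python
# Standardized path coefficients as quoted in the text (rounded values).
# The construct labels are illustrative names, not the authors' variable names.
PATHS = {
    ("atypicality", "interest"): -0.54,
    ("atypicality", "skepticism"): 0.60,
    ("interest", "attitude"): 0.44,
    ("skepticism", "attitude"): -0.24,
    ("attitude", "purchase_intention"): 0.54,
}


def indirect_effect(*chain):
    """Multiply the coefficients along a chain of constructs."""
    effect = 1.0
    for source, target in zip(chain, chain[1:]):
        effect *= PATHS[(source, target)]
    return effect


via_interest = indirect_effect("atypicality", "interest", "attitude")
via_skepticism = indirect_effect("atypicality", "skepticism", "attitude")

# Both products are negative: atypicality lowers attitude through either mediator.
print(f"via interest:   {via_interest:.3f}")
print(f"via skepticism: {via_skepticism:.3f}")
```

The point of the sketch is the sign pattern: a negative path followed by a positive one (interest) and a positive path followed by a negative one (skepticism) both produce a negative mediated effect on product attitude.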

Table 4 Standardized structural path coefficients for the structural model

4 Experiment 2

While the analysis of the first experiment is in line with the attention-getting effect of atypical package colors, our findings do not support the positive influence of atypicality on interest. If atypical package colors do not really increase interest but evoke feelings of skepticism that lower product attitude and purchase intention (as found in experiment 1), then generalizing the superiority of atypical package colors should be questioned. That being said, experiment 1 has to be interpreted with caution, as its single-exposure design limits the ecological validity of the findings. In real-world shopping environments, shelves are cluttered with a variety of products and, as such, it is likely that a product does not raise awareness even if its atypical package raises interest. Research confirms that attention and awareness are related but distinct phenomena that need not occur together (Hsieh et al. 2011). Attention refers to “a state in which cognitive resources are focused on certain aspects of the environment rather than on others” while awareness is the “perception or knowledge of something” (APA dictionary 2018). Given that awareness is strongly tied to consumers’ interest, whether or not a product raises awareness should materially influence the outcome of the first experiment. To this end, the second experiment aims to test the robustness of the findings obtained in the first experiment by investigating whether atypical package colors do indeed negatively influence product choice, while controlling for consumers’ awareness in a cluttered exposure setting.

An online panel sample of 112 consumers was recruited according to a pre-defined quota in terms of age, gender, and education to represent the population of interest in the country of investigation (52% female, Mage = 41 years, SD = 13.41, 19% university degree, 19% high school, 20% middle-level school, 18% apprenticeship training, and 5% compulsory school as highest completed level of education). The experiment employed a one-factor between-subjects design, with category color norm fit (atypical vs. typical package color) as the manipulated factor.

4.1 Stimulus material development

Having already extensively pre-tested two products in experiment 1, we chose orange juice as the product category. The same packages as in experiment 1 constituted the atypical orange juice (blue package) and the typical one (orange package), both hereafter referred to as the target orange juice. In contrast to experiment 1, the target orange juice was presented next to nine other orange juices, all of which had typical orange juice colors (orange, including white or yellow parts). These filler juices were selected based on a Google picture search. Effort was devoted to choosing products that are not familiar to the target group. This was achieved by selecting products that are not available in the country of investigation and by removing brand logos as well as any quality signals (e.g., fair trade, bio) in order to avoid any habituation effects. Orange juices were presented in two rows, each row consisting of five orange juices. In each condition (atypical vs. typical), two different variations of the target's location (first row, outer right-hand side vs. second row, second product from the left) existed in order to rule out any location effects.

4.2 Procedure

Participants were randomly allocated to one of the two experimental groups. The instructions asked participants to put themselves into a usual grocery shopping situation. Respondents were asked to imagine doing their grocery shopping as usual and ultimately arriving at the orange juice shelf. On the subsequent page, participants were informed that the following page would display orange juices and that they should choose three of these orange juices for purchase consideration based on their first impression.

The following page asked respondents to indicate whether they noticed the target orange juice (i.e., the blue one in the atypical condition or the orange one in the typical condition) and whether they considered this target orange juice for purchase. Immediately after, respondents were exposed to questions assessing typicality, skepticism, interest and demographic variables.

4.3 Dependent measures

The same items as in experiment 1 measured perceived typicality, skepticism, and interest. All details about the measurement scales and their psychometric properties are summarized in Table 2. The question assessing awareness of the product read “Did you notice this orange juice?”, with the target orange juice being displayed above the question. Moreover, another item assessed choice: “Did you consider this orange juice for purchase?” Both questions offered yes or no as answer categories.

4.4 Analysis and results

An ANOVA confirmed that the manipulation worked as intended (F(1, 110) = 35.38, p < 0.01). The blue orange juice was perceived as less typical than the orange one (2.8 vs. 3.95). The influence of package color (a)typicality on emotions was tested by a MANOVA (Pillai’s trace V = 0.05, F(2, 109) = 2.66, p < 0.05, one-sided). Consistent with experiment 1, the atypical (blue) orange juice evoked higher feelings of skepticism than the orange one (2.71 vs. 2.28, p < 0.05). Package (a)typicality had only a marginally significant influence on feelings of interest, with the atypical orange juice evoking lower interest than the typical one (2.83 vs. 3.11, p = 0.07, one-sided). Overall, these results confirm the findings of the first experiment.

The analysis proceeded with testing the influence of the target orange juice’s location (first row, right vs. second row, left) on choice and awareness. No significant association was revealed between target location and choice, either in the atypical (blue) condition (χ2 = 0.6, p = 0.44) or in the typical (orange) condition (χ2 = 0.48, p = 0.49). Similarly, target location did not affect awareness in either condition (χ2atypical = 0.1, p = 0.92; χ2typical = 1.43, p = 0.23). Consequently, the two location groups were collapsed to form two experimental groups (atypical vs. typical package color).
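For readers who wish to relate a χ2 statistic to its p-value, the following minimal sketch may help. It assumes that each test is based on a 2 × 2 cross-tabulation (e.g., location × choice) and therefore has one degree of freedom; the text does not report the degrees of freedom, so this is an assumption. The function name `chi2_p_value_df1` is hypothetical and only Python's standard library is used.

```python
import math


def chi2_p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    With df = 1 the statistic is the square of a standard normal deviate,
    so P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    if chi2 < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chi2 / 2.0))


# Statistics for the location-by-choice tests quoted in the text (df = 1 assumed).
for statistic in (0.6, 0.48):
    print(f"chi2 = {statistic}: p = {chi2_p_value_df1(statistic):.2f}")
```

Under the df = 1 assumption, the two location-by-choice statistics map to p-values of about 0.44 and 0.49, in line with the values reported above.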

Moreover, no significant difference was observed in terms of awareness between the two experimental groups (χ2 = 0.94, p = 0.33), showing that a similar share of respondents noticed the target orange juice in both groups. As expected, there was a significant association between color (a)typicality and choice (χ2 = 5.73, p < 0.05). In the blue orange juice condition, only 27% of respondents chose the blue orange juice, while 73% did not choose it. In contrast, in the orange juice condition, almost half of the respondents chose the target orange juice (49%). The odds of not choosing the target were 2.59 times higher if respondents were exposed to the atypically colored orange juice (blue) than if exposed to the typical one (orange). These findings demonstrate that category color norm mismatch (i.e., atypical package colors) negatively impacts product choice. At the same time, experiment 2 offers evidence that rules out the possibility that respondents are likely to fail to notice an atypical package when it is presented next to products of the same category with typical packages.
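The odds ratio can be approximately retraced from the choice shares alone. The sketch below is illustrative only: it works from the rounded percentages quoted in the text (27% and 49% choosing the target) rather than from the raw cell counts, which are not reported here, so the result can differ slightly from the published value. The helper `odds` is a hypothetical name.

```python
def odds(probability):
    """Convert a probability into odds (p / (1 - p))."""
    if not 0 < probability < 1:
        raise ValueError("probability must lie strictly between 0 and 1")
    return probability / (1 - probability)


# Share of respondents who chose the target, as quoted in the text (rounded).
chose_atypical = 0.27
chose_typical = 0.49

# Odds of NOT choosing the target in each condition.
odds_not_atypical = odds(1 - chose_atypical)
odds_not_typical = odds(1 - chose_typical)

odds_ratio = odds_not_atypical / odds_not_typical
print(f"odds ratio (not choosing, atypical vs. typical): {odds_ratio:.2f}")
```

Working from the rounded shares gives a ratio of roughly 2.6, close to the reported 2.59; the small gap is presumably due to rounding of the percentages.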

5 Discussion

5.1 Theoretical and managerial implications

Although the importance of product packages and product package design has been widely documented in the marketing literature, color-related package decisions represent an under-researched phenomenon. Understanding how different package colors influence product preference is of relevance not only from a theoretical, but also from a practical point of view, as package color may add to or detract from a brand’s equity. In this context, marketers often make package color decisions consistent with the category norm, usually set by the market leader. However, such an approach might not always be beneficial, as it inhibits product differentiation and enables direct comparison of the new product with existing (and perhaps more established) products in the market. Alternatively, the use of category-novel and unexpected colors can help products stand out from the clutter and differentiate themselves from competitors but, at the same time, might also generate ambiguity and confusion about what the product is and/or stands for. Against this background, the present research contributes to a better understanding of how package color decisions may impact consumer responses and therefore influence brand equity. More specifically, drawing from categorization theory, the current research investigates how conforming versus not conforming to commonly held product category color associations influences consumer emotional responses and, in turn, product attitude and purchase intention.

The findings of two experiments offer empirical evidence that package colors which match product category color norms generate more positive consumer emotional responses than atypical package colors. The results show that an increase in package color typicality decreases consumer skepticism and enhances interest, leading to more positive attitudinal and behavioral reactions. In contrast, package color atypicality was found to negatively influence consumer preferences. As predicted, deviations from the norm make consumers more skeptical toward the product and, contrary to expectations, decrease interest. These findings can be interpreted through the theoretical lens of consumer goal activation and pursuit (Bagozzi and Dholakia 1999). Consistent with extant literature, our research implies that in the context of grocery shopping, consumers’ goal orientation seems to depend heavily on maximizing the efficiency of the shopping trip (Babin et al. 1994). Product package color represents a powerful visual element in communicating the nature of the product (Underwood 2003) that can function as an effective heuristic toward attaining this goal (Reber et al. 2004). Typical package colors enable product identification and maintain order in consumers’ mental representation of how the market is structured (Veryzer and Hutchinson 1998). This facilitates decision-making across multiple product categories and renders the overall shopping experience more efficient. Along these lines, consumers’ interest appears to be stimulated by the instrumental nature of color as a heuristic to facilitate cognitive navigation across multiple categories.

In contrast, deviations from product category color norms might negatively impact efficient goal attainment. Atypical package colors trigger feelings of skepticism, hinder product identification, and cause disruptions to category-based product knowledge (Desai and Ratneshwar 2003). At the same time, severe discrepancies with category perceptions seem to negatively influence consumers’ interest. Consistent with recent findings in schema incongruity research (Garaus et al. 2015; Halkias and Kokkinaki 2014, 2017), when stimulus deviations from schematic knowledge are extreme, consumers recognize only minimal diagnostic value in the stimulus’ content and are more likely to disregard it. In the current setting, for example, an orange juice with a blue package might generate confusion as to what the product really stands for and require a significant investment of cognitive resources to be accommodated in pre-existing product category schemata. This damages overall shopping efficiency and negatively influences consumer evaluations. The superiority of typical package colors in evoking favorable consumer responses might be even more pronounced in a cluttered marketplace, where consumers are exposed to numerous other cues that compete for the cognitive resources required for making purchase decisions.

From a practical point of view, the findings of this research offer valuable insights into the choice of package colors for new products. Package colors that conform to the category norm or some sort of a “market standard” seem to be more important, especially for new FMCG products where category color norms exist, as they allow product identification and classification under the right product category. Therefore, benchmarking with regard to package color decisions is strongly recommended. Moreover, our findings offer interesting insights for brand extension strategies. In particular, when extending to new product categories, brands that are strongly associated with a certain color should not by default use the same color palette to enhance brand recognition. Instead, marketers should first examine whether package color options are considered (a)typical for the respective extension category and make the necessary adjustments. Taking a more strategic perspective, package design in general, and package color in particular, can be an effective tool to implement brand positioning and differentiation tactics. However, findings from the present research suggest that package color should be used within the bandwidth of meaningful color associations in product categories that possess color norms in the FMCG sector. Research on brand logo colors has revealed that variation in the value and saturation of the brand’s logo color differentially impacts perceptions about the personality of the brand (Labrecque and Milne 2012). Therefore, it is reasonable for marketers to differentiate by changing the intensity and/or brightness of colors in order to stand out from the competition, but they should be cautious not to switch to colors that are altogether atypical for the corresponding product category.

5.2 Limitations and suggestions for future research

The current research is qualified by a number of limitations that offer interesting avenues for further research. First, the present research used unknown products in order to eliminate any potential confound due to prior experience with the brand. Although this decision allowed us to ensure internal validity, it has compromised real-life correspondence and reduced the external validity of the findings. In addition, our research employed only two products within the FMCG category, rendering the results mainly applicable to categories of relatively similar characteristics. Besides, the research settings of the present investigation approximate decision-making processes typically found in grocery shopping. Overall, to establish the generalizability of the present effects, future researchers should employ known brands and explore whether our results are replicated across diverse product categories.

Moreover, our findings with regard to the negative impact of package colors that deviate from category perceptions are based on a dichotomous operationalization of perceived (a)typicality. However, as Halkias and Kokkinaki (2017) argue, this approach does not allow discrimination between different levels of discrepancy and makes comparisons between empirical studies problematic, as the extent of perceived incongruity of a given “atypical” package might vary considerably across studies. Besides, explicitly distinguishing between different levels of atypicality (i.e., moderate vs. extreme) would enable the direct application of Mandler’s (1982) schema incongruity theory, which suggests that moderate incongruity with established schemata can result in more favorable evaluations. Along these lines, future studies are strongly encouraged to operationalize different levels of (a)typicality and examine whether there are atypical color options that can generate a pleasurable level of arousal that will engage consumers beyond the mere accomplishment of an efficient shopping trip.

Finally, our research focused on the under-researched issue of package color typicality. However, the literature shows that other elements of a product’s package (e.g., shape) may exert a strong influence on product and category perceptions as well as on subsequent consumer preferences. That said, the present research looked into package color in isolation, without considering any potential interplay with other elements of product package design. It might be that if some degree of atypicality exists in both the color and the shape of the package, the product will be so conspicuous that it will be readily classified as idiosyncratic or exclusive, eventually benefiting brand equity. Research addressing the issues discussed above will contribute significantly to the relevant literature.