Consumer effects of front-of-package nutrition labeling: an interdisciplinary meta-analysis

As consumers continue to struggle with issues related to unhealthy consumption, the goal of front-of-package (FOP) nutrition labels is to provide nutrition information in more understandable formats. The marketplace is filled with different FOP labels, but their true effects remain unclear, as does which label works best to change perceptions and behaviors. We address these issues through an interdisciplinary meta-analysis, generalizing the findings of 114 articles on the impact of FOP labels on outcomes such as consumers’ ability to identify healthier options, product perceptions, purchase behavior, and consumption. The results show that, although FOP labels help consumers to identify healthier products, their ability to nudge consumers toward healthier choices is more limited. Importantly, FOP labels may lead to halo effects, positively influencing not only virtue but also vice products, e.g., interpretive nutrient-specific labels improve health perceptions of both vice and virtue products, yet they influence only the purchase intention of virtues.

With many consumers struggling with health problems related to food consumption, including obesity, diabetes, and heart and coronary problems (World Health Organization 2018), tackling nutrition- and diet-related health issues has become a major concern for both food marketers and policymakers around the world. One commonly suggested approach to nudging consumers toward healthier food consumption is providing clearer information about the nutritional content of food products. For example, the Nutrition Labeling and Education Act of 1990 provided the U.S. Food and Drug Administration (FDA) with the authority to require U.S. food manufacturers to convey nutrition information on food packaging via a so-called Nutrition Facts Panel (NFP) (Institute of Medicine 2010). The aim was to improve consumers' ability to access and process all the nutrition information they needed to make health-conscious food choices (Balasubramanian and Cole 2002). However, at the point of sale, consumers face time pressure and struggle to understand the information presented on the NFP (Block and Peracchio 2006; Graham et al. 2012), thus fueling the search for simpler, more attention-grabbing forms of communicating the nutritional content and relative healthfulness of food products (World Health Organization 2018). As a result, a variety of front-of-package (FOP) nutrition labels have emerged and been implemented globally. Many large food manufacturers, including Nestlé and PepsiCo, are looking to help consumers make “balanced and mindful choices” with FOP labels, but they are struggling to find the best label type (Michail 2017, 2018). In fact, these companies are calling for one internationally agreed-upon format (Askew 2018). This exemplifies the lack of consensus on the format and effectiveness of different FOP labels, alongside strong industry motivation to find a good solution, and it calls for a systematic review of what works and what doesn't.
FOP nutrition labels include symbols and rating systems that summarize “key nutritional aspects and characteristics of food products” (Institute of Medicine 2010, p. 1) in simplified formats. In this way, FOP labels complement the traditional NFP (Kees et al. 2014). According to the FDA (2010), FOP labels aim to “increase the proportion of consumers who readily notice, understand, and use the available information to make more nutritious choices for themselves and their families, and thereby prevent or reduce obesity and other diet-related chronic disease.” The formats range from reductive labels, which simply summarize the complex information from the NFP and present it on the package's front (e.g., “Facts Up Front”), to interpretative labels, which can be further divided into two categories: (1) nutrient-specific indicators evaluating the level of individual nutrients (e.g., Multiple Traffic Light systems or nutrient content claims such as “low in salt”) (Ducrot et al. 2015; Hodgkins et al. 2012) and (2) summary indicators of the overall healthfulness of the product (e.g., Health Star Ratings) (Savoie et al. 2013; Emrich et al. 2014).
Mark Houston and John Hulland served as Special Issue Editors for this article.
The increased uptake of FOP labeling is concomitant with the increasing amount of research being conducted across different fields with regard to a range of different consumer-relevant outcomes, including the attention consumers pay to the Nutrition Facts Panel, healthfulness and tastiness perceptions, product attitude, identification of healthier options, making healthy choices, purchase intentions of FOP labeled products, and, finally, actual consumption (in the upcoming sections, we use the term “effectiveness” to describe this set of outcome variables). Yet many knowledge gaps remain. First, debate is ongoing about the overall effectiveness of this labeling in helping consumers make healthier choices. For example, FOP labels can sometimes mislead consumers and induce an inaccurate assessment of the product's healthfulness, which could result in higher consumption of unhealthy food (Orquin and Scholderer 2015; Roberto et al. 2012). While important meta-analyses summarize the impact of various nudges (Cadario and Chandon 2018) and menu labeling (Sinclair et al. 2014) on consumers' food choices, a systematic research review able to reconcile contrasting findings in the context of FOP labels is lacking.
Second, given the considerable variation in FOP labeling systems, the question arises whether there are any generalizable differences in the effectiveness of these systems. While some findings suggest that interpretive nutrient-specific labels, such as the Multiple Traffic Light, work best (Hawley et al. 2013), other findings show that the effectiveness of different label types is context dependent (Newman et al. 2016). Thus, it is important to know not only how FOP labels influence purchase decisions and whether they have possible negative consequences or health halo effects, but also how these effects differ between different types of labels. Health halos refer to consumers' use of limited, specific information about, for example, individual nutrients to infer a general perception about the product's overall healthfulness. Such inferences occur when incomplete information is presented or available to consumers (Andrews et al. 1998). As most research has examined only a limited number of FOP labels or a type of effect at a time, consensus on the “best” FOP label is yet to be reached (Gearhardt et al. 2012; Hieke and Taylor 2012; Hodgkins et al. 2012). Therefore, a generalizable research effort is required to understand which FOP labeling type is most beneficial to consumers in terms of providing information, influencing perceptions, and driving healthier choices (Hawley et al. 2013; Kanter et al. 2018). Such an effort is even more important because previous research has found that FOP labeling decreases the attention consumers pay to the NFP, suggesting that consumers strongly rely on the information appearing on the front of the package. To find more effective ways of using FOP labeling to promote healthier consumption, it is critical to answer the questions of how consumers are influenced by FOP nutrition labels, and by which FOP labels consumers are most influenced, without being misled.
Finally, it is important to emphasize that while various FOP labeling systems have been developed through government policies, the implementation of most remains at the discretion of manufacturers and marketers (Kanter et al. 2018). This includes, for example, the Facts Up Front program in the United States and logos to identify healthier options in Scandinavia, Poland, Singapore, the Netherlands, and Thailand. Although some countries have recently implemented types of mandatory labels (e.g., Chile, Mexico), most manufacturers still have the ability to decide whether to implement FOP labels and, if so, which ones. Not only policymakers but also food marketers would thus benefit from a more thorough understanding of the various effects of different FOP labels.
The current study aims to address these gaps by means of a meta-analytic review of all articles examining the effects of health- or nutrition-related FOP labels at the consumer level. To assess the impact of these labels, we harvested 1594 effects from 114 articles published between 1996 and 2018, across different fields (marketing and consumer studies, nutrition science, public health, sensory science, medicine, and others). Drawing from extant literature (Cadario and Chandon 2018; Ducrot et al. 2015; Hodgkins et al. 2012; Kanter et al. 2018; Newman et al. 2018), we classify FOP labels into reductive and interpretative and further distinguish between interpretative nutrient-specific labels and summary indicator labels. This typology allows us to identify the key aspects influencing the effectiveness of FOP labeling and the type of information most beneficial for consumers.
To the best of our knowledge, this study is the first to take into account the various types of labels and assess different FOP effects, including leading consumers' attention away from the NFP; affecting the labeled products' healthfulness perception, tastiness evaluation, and product attitude; and improving consumers' ability to identify healthier options and increasing purchase intention, choice, and, ultimately, consumption of healthier products. All of these effects are relevant to both public policymakers and marketers. We found only two studies on the effects of FOP labels that take a quantitative, meta-analytical approach. However, these meta-analyses are more limited in terms of the FOP labels examined (only health claims in Kaur et al. 2017) or outcomes (healthy choice and consumption in Cecchini and Warin 2016). Our work differs from these in four important ways (see Web Appendix 1). First, we investigate a wider range of FOP labeling systems, categorizing a variety of labels into a typology of reductive nutrient-specific, interpretive nutrient-specific, and interpretive summary indicator labels. Second, we consider more outcomes, without restricting the analysis to healthy choice or consumption. For example, the potential influence of FOP labels on tastiness perceptions can have important consequences for product choice, as taste remains a key driver of food choices (e.g., Lassen et al. 2016). Third, we distinguish between different types of foods (healthy or “virtue” foods vs. unhealthy or “vice” foods). This is important because some labels (e.g., health logos such as the “Smarter Choice” label) have been criticized for possibly promoting the consumption of unhealthy products. Fourth, we control for multiple other factors that may explain the discrepancy of the results in extant literature by considering other product characteristics (e.g., known brand vs. fictitious brand), consumer characteristics (e.g., percentage of female respondents, North American sample), and study characteristics (e.g., within-subject design, year of publication, field).
The results offer insights to public policymakers trying to understand the impact of the implementation of FOP labeling or struggling to decide which specific label type to promote. Marketers can also use the results to forecast consumers' reactions to the addition of FOP labeling, whether it be voluntary or mandatory labels, and to make the best possible decisions for both the consumers and the company. Theoretically, we offer a basis for further research on FOP nutrition and health labels by bringing together a broad range of research from multiple disciplines to help understand what is known about the impact of these labels on consumers' perceptions, choices, and behavior.
We begin by briefly discussing the various FOP label types and reflecting on their effects on relevant metrics. Then, we describe the data collection and analytical procedures and present the results from our meta-analyses. We conclude by discussing the implications for both public policymakers and marketers and identifying paths for future research.

A typology of FOP nutrition labels
FOP labels provide consumers with truncated nutrition information and serve to complement the more complex NFP typically found on the back or side of the packaging (Newman et al. 2018). A wide variety of such labels are available in the marketplace and examined in academic research. A report by the Institute of Medicine (2010) reviewed a selection of 20 different systems, covering only the most represented alternatives. These labels are common: a study of labeling practices in Europe found that nearly 50% of all food products carried FOP labels, ranging from 24% in Turkey to more than 80% in the United Kingdom (Storcksdieck genannt Bonsmann et al. 2010).
Prior research shows that FOP labels vary in content and structure (Kanter et al. 2018; Newman et al. 2018; Talati et al. 2017), which is likely to lead to differences in their effectiveness in helping consumers determine a product's healthfulness. More specifically, FOP labels can be broadly classified as either (1) reductive labels (e.g., Facts Up Front, Guideline Daily Amounts), which reduce the amount of nutrition information provided in the NFP without offering any interpretation of this information, or (2) interpretive labels (e.g., traffic-light symbols, warning labels, star-based systems, health logos), which provide greater evaluation of information contained in the NFP (Newman et al. 2018; Talati et al. 2017). Interpretive labels can be further categorized into two types depending on the degree of information aggregation (Talati et al. 2017). The first type, interpretive nutrient-specific labels, adds an evaluative component: an interpretation of the healthfulness of one or more individual nutrients. For example, the Multiple Traffic Light system uses colors to emphasize whether the level of a particular nutrient is low, medium, or high. The second type, interpretative summary indicator labels, is more aggregated, in that it provides a summary of the overall nutritional profile of a product, such as in the Choices Program health logo and Health Star Rating system. Figure 1 depicts our typology of FOP labels; this typology builds on previous categorizations (Kanter et al. 2018; Newman et al. 2018; Talati et al. 2017) and typologies (e.g., consumer-based typology; Hodgkins et al. 2012) in the literature. To include the full range of FOP nutrition labels in the typology, we expanded on these previous typologies by including labels missing from the categorizations, such as warning labels, nutrient content claims, and health claims.
To further explain our typology, we briefly discuss the three categories of reductive nutrient-specific labels, interpretive nutrient-specific labels, and interpretive summary indicator labels.
Reductive nutrient-specific labels provide nutrient-level information with little interpretation (Newman et al. 2018). Examples include calorie labels, Facts Up Front, and Guideline Daily Amounts. These labels offer objective information about the nutritional content of a product, in a manner that is less complex and more condensed than the NFP, and present this in a more accessible location (front rather than back of pack). Despite these advantages, reductive labels are still regarded as time-consuming and difficult to interpret for consumers (Hawley et al. 2013; Hersey et al. 2013; Talati et al. 2017).
Interpretive nutrient-specific labels not only present information about specific nutrients but also add a layer of interpretation, evaluating whether a product scores “good” or “bad” on this aspect (e.g., Andrews et al. 2011). Examples include traffic-light labels, warning labels, nutrient content claims, and health claims. This interpretative representation (e.g., giving red, yellow, and green colors to different levels of nutrients) facilitates consumer understanding of the message but still requires consumers to integrate multiple points of information to determine the product's overall healthfulness (Talati et al. 2016). Nutrient content claims highlight the positive level of a specific nutrient on a product (Dixon et al. 2014), while health claims link the nutrient to a specific health or risk reduction benefit (Aschemann-Witzel and Hamm 2010; Kozup et al. 2003). These claims may help consumers process and interpret information about individual nutrients. However, consumers may overgeneralize these claims to be indicative of the overall healthfulness of the product and may expect, for example, a low-fat product to contain more beneficial levels of other nutrients as well (Andrews et al. 2000).
Interpretative summary indicator labels provide an interpretive aggregation of nutrition information summarizing the overall nutritional value of a food. These labels include graded summary systems, such as the Health Star Rating, NutriScore, and health logos (e.g., international Choices Program). These labels offer nutrition information at a glance, by summarizing and interpreting the overall healthfulness of a product into one indicator (Hamlin and McNeill 2016; Hersey et al. 2013; Talati et al. 2016). This is especially helpful for consumers who want to compare different alternatives at the point of purchase to choose the healthiest product (Liem et al. 2012; Newman et al. 2018).
Despite offering consumers a quick identifier for healthier food options, some of these labels are subject to strong criticism. In particular, these labels are criticized for oversimplifying nutritional information by trying to put information about multiple dimensions of nutritional quality into one unidimensional indicator, thus possibly misleading consumers. On the other hand, some research suggests that interpretive labels are more effective than reductive labels in moving consumers toward healthier choices (Volkova and Ni Mhurchu 2015). Grunert and Wills's (2007) review suggests that, in general, consumers prefer the ease of summary labels but support formats that still provide them with enough details on the nutritional content of the product (e.g., nutrition-specific labels). Some research suggests that consumers identify healthier options more easily using nutrient-specific rather than summary indicator labels (Hersey et al. 2013), while other studies have found that consumers are better at understanding and using summary indicator labels (Bialkova and van Trijp 2010; Ducrot et al. 2015; Talati et al. 2017). In summary, although FOP labels have the overarching goal of helping consumers evaluate and choose healthier products (FDA 2010; Newman et al. 2018; Talati et al. 2017), labeling systems approach this goal in different ways, possibly leading to different outcomes.

Effectiveness of FOP labels
As discussed, existing research has studied a broad range of outcomes related to FOP labeling, including the attention consumers pay to the Nutrition Facts Panel, healthfulness and tastiness perceptions, product attitude, identification of healthier options, making healthy choices, purchase intentions of FOP labeled products, and, finally, actual consumption. Below, we briefly discuss this research and the unanswered questions we aim to address with this meta-analysis.

Attention to NFP and product perceptions
FOP labels are meant to complement the more complex, complete nutritional information presented on the NFP, though not at the same level of detail. As such, researchers have examined the possible impact of the presence of FOP labels on how much attention consumers pay to the NFP (e.g., Becker et al. 2015; Bix et al. 2015). As purchase decisions in the supermarket are typically made quickly, consumers will not have the time to study the two sources of nutrition information and, as a result, often ignore the lengthier NFP when a FOP label is present (Watson et al. 2014). Thus, we study whether consumers rely more on the information available on the front of the package, paying less attention to the NFP, when making purchase decisions and whether this effect differs between label types.
One aspect of FOP labels especially important to marketers is their impact on product perceptions. Do these labels indeed lead consumers to perceive a product as healthier, and if so, how do they affect the perceived tastiness? Though significant efforts have been made to understand these effects, the answers remain unclear, especially in terms of the label types that have the largest impact. FOP labels are explicit cues about the product's healthfulness. Summary indicator labels make it easier for a consumer to evaluate the product's healthfulness and should therefore positively influence healthfulness perceptions. Conversely, reductive labels provide specific information about the nutritional content of the product without any interpretation of its healthfulness. However, consumers often assume that information about specific nutrients is related to other attributes; this “health-halo” effect may result in a more favorable overall attitude toward the product (Burke et al. 1997; Kozup et al. 2003; Burton et al. 2014).
Moreover, consumers often rely on an implicit “unhealthy = tasty” intuition (Mai and Hoffmann 2015; Raghunathan et al. 2006) when evaluating food products, suggesting that the perceptions of a product's healthfulness and tastiness are negatively correlated in consumers' minds. Thus, does the addition of FOP labeling also influence consumers' expectations of a product's tastiness (cf. Bialkova et al. 2016)? Research on the impact of FOP labeling on the perceptions of these different attributes has found contradictory results. We therefore aim to answer how FOP labeling influences the perceptions of a product's healthfulness and tastiness and general product attitude, as well as to identify moderators that can help explain the differing results of previous work.

Identifying, choosing, and consuming healthier options
By providing simpler information in a more salient format, FOP labels aim to help consumers identify the options that are better for their health and to avoid the alternatives that may cause problems when consumed in excess (Institute of Medicine 2010). Existing research shows that, in general, FOP labels are indeed able to help consumers with this identification, but whether certain labels are more helpful than others remains unclear as individual studies have only compared a limited number of labels at once. Overall, reductive labels still require consumers to understand the meaning of the different nutrients to identify healthier options. Therefore, we expect that interpretive labels, which offer an evaluation with regard to the level of healthfulness of the product, are more effective in helping consumers differentiate between more and less healthy alternatives (Cecchini and Warin 2016; Newman et al. 2018).
It is important to recognize, however, that being informed and aware of healthfulness does not always lead to better choices by consumers. The “unhealthy = tasty” intuition may lead consumers to avoid the healthy option and choose the unhealthy tasty option instead (Raghunathan et al. 2006). So far, there is no specific understanding of whether, and to what extent, the ability to identify healthier options translates to behaviors. Moreover, even if labels successfully nudge consumers toward healthier choices, a key driver of the increasing obesity rates around the world is the overconsumption of calories (World Health Organization 2015). Thus, a crucial factor when evaluating FOP labels is their impact not only on choice but also on actual consumption. Choosing healthier products is counterproductive if this gives consumers a way to justify larger portions and increased consumption (Suher et al. 2016). A recent meta-analysis on health claims (one specific type of FOP label) indicates that such claims lead consumers to make more healthy choices without reducing the number of calories consumed (Cecchini and Warin 2016). We further examine whether and how this finding generalizes across the full range of FOP labels.

Vice versus virtue categories
As mentioned previously, research has criticized some FOP labels for potentially misleading consumers, due to the unclear criteria behind the labeling (van Herpen and van Trijp 2011). A key example is health logos, which are used to mark options that either are good for health in general or are healthier than the average within a category. The latter case may result in a situation in which FOP labels are promoting products that are, in absolute terms, still unhealthy (e.g., candy with 20% less sugar than the average candy). Therefore, differentiating between the effects of the labels in more (virtue) or less (vice) healthy categories is crucial. Following Huyghe et al. (2017, p. 66), we define a virtue as “something that is not very tempting now but may be more beneficial in the long-run … something that you feel less guilty choosing,” while a vice is “something tempting that has few long-term benefits. Something that you want but at the same time feel guilty choosing.” For FOP labels to meet their goal of helping consumers distinguish between products differing in healthfulness, they should have different effects for virtue and vice products, ideally leading to more positive healthfulness evaluations and purchase intentions for virtue products and the opposite for less healthy alternatives (Newman et al. 2018; Talati et al. 2016). By contrast, positive effects in vice categories may be indicative of potential misleading and halo effects (Orquin and Scholderer 2015).

Brand familiarity
Evidence suggests that brand familiarity has an impact on the effectiveness of FOP labeling. Although research in this setting has mostly focused on health claims, findings indicate a limited impact on consumers who are familiar with the product (Aschemann-Witzel et al. 2013b; Moon et al. 2011). This suggests that FOP labels do not change perceptions of consumers who are already familiar with the product but are helpful for consumers wanting to understand and evaluate a new product. Therefore, we include a variable in our analysis, comparing the effects between studies using known brands as stimuli and those relying on fictitious or unbranded stimuli.

Gender
In general, women are more health conscious than men (Lassen et al. 2016), a trait that research has also linked to the greater use of nutritional information (Hieke and Taylor 2012; Williams and Mummery 2013). Women also tend to make healthier decisions than men when provided with nutritional labels (Heiman and Lowengart 2014; Hieke and Newman 2015). Indeed, more health-conscious consumers tend to rely more on detailed information, such as the NFP, than simplified information, such as health claims (Cavaliere et al. 2016; Naylor et al. 2009). As research on FOP labeling largely does not differentiate between people's health consciousness or uses different ways to measure concepts such as nutrition consciousness, health consciousness, or nutrition knowledge, it is difficult to include the moderating effect of these factors in a meta-analytic setting. However, as gender has often been linked to health consciousness and the proportion of female participants in research is a measure typically reported, we conduct a moderation analysis with gender.

Identification of the primary studies
To identify empirical studies on the effects of FOP labels, we undertook a thorough literature search using several criteria. First, we considered all studies on the effects of FOP labeling on consumers for the meta-analysis. Given our specific focus, we excluded studies on allergy labeling (e.g., “dairy free”) and production claims (e.g., “organic”). Second, we included only quantitative studies comparing the label with a clear control condition. Third, the dependent variables investigated had to be related to attention to NFP, healthfulness perception, tastiness, attitude toward the product, identification of the healthy option, healthy choice, purchase intention, and consumption.
To identify and include unpublished studies, we invited researchers to share their unpublished work on the topic in a post on the ELMAR platform. Fourth, we implemented a snowballing procedure to identify additional articles from the reference lists of the studies already selected. Furthermore, we carried out additional manual searches for articles within all marketing and nutrition science journals as well as several other journals relevant to the topic (e.g., Appetite and Public Health Nutrition). Finally, we included only studies in which we could retrieve correlation coefficients directly or through calculations (e.g., Hunter and Schmidt 2004).
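The text does not list which conversions were used to derive correlation coefficients from other reported statistics. As a minimal sketch only, the following Python functions show three widely used textbook conversions (from a t statistic, from Cohen's d with equal group sizes, and from a 1-df chi-square). The function names are hypothetical, and the choice of formulas is an assumption rather than a description of the authors' procedure.

```python
import math


def r_from_t(t: float, df: int) -> float:
    """Correlation implied by a t statistic with df degrees of freedom."""
    return t / math.sqrt(t * t + df)


def r_from_d(d: float) -> float:
    """Correlation implied by Cohen's d, assuming two equally sized groups."""
    return d / math.sqrt(d * d + 4.0)


def r_from_chi2(chi2: float, n: int) -> float:
    """Correlation (phi) implied by a 1-df chi-square on n observations.

    The sign is lost in this conversion and has to be set from the
    direction of the reported effect.
    """
    return math.sqrt(chi2 / n)
```

For instance, under these formulas a reported t of 2.0 with 96 degrees of freedom corresponds to r = 0.2.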
The search, completed in October 2018, produced 114 articles (see Appendix 1), including four unpublished papers, for 130 independent studies in total, dating between 1996 and 2018. The details of the article identification process are available in Web Appendix 2.
The final set included articles from the fields of food and sensory science (34.2%), marketing and consumer studies (22.8%), nutrition science (18.4%), public health (10.5%), medicine (7.0%), and individual publications in other fields, such as economics. The article set reflects the increasing attention paid to FOP nutrition labeling. The first article identified comes from 1996, though more than 75% of all articles in this meta-analysis were published between 2012 and 2018 (Fig. 2 shows the number of articles by year).

Identification of effect sizes
Nearly all articles reported multiple effects (e.g., due to multiple effects/dependent variables included and multiple labels and/or products tested), for 1594 effect sizes in total. On average, we gathered 14 effect sizes per article, with a minimum of one and a maximum of 96 effects. When we focus on the different effects/dependent variables of FOP labels, this average varies between 4.5 (for tastiness perceptions) and 15.3 (for the identification of healthier options).

FOP labels
The extant body of research has not focused equally on the different types of FOP labels: 80.7% of the articles (N = 92) included in our meta-analysis examined interpretive nutrient-specific labels, whereas interpretive summary indicator (36.0%, N = 41) and reductive nutrient-specific (29.8%, N = 34) labels received less attention. A more detailed examination of the different labels reveals that research has paid attention mostly to nutrient content claims (included in 30.7% of the articles), the Multiple Traffic Light label (29.8%), and monochrome labels, such as the Guideline Daily Amount label and Facts Up Front (29.8%). So far, research has examined warning labels the least (13.2% of articles); however, given the mandatory implementation of warning labels in Chile and plans for them in other countries (e.g., Peru), understanding their impact is becoming increasingly important. Nearly all research on warning labels was published in 2018 or is still forthcoming (12 of the 15 articles), indicating that academics have realized the importance of these labels.

FOP effectiveness
According to our dataset, research on FOP labeling has focused mostly on the effects on healthfulness perceptions (N = 42, number of effect sizes [k] = 547), followed by studies on healthier choice (N = 25, k = 247), the ability to identify healthier options within a set of products (N = 15, k = 230), and purchase intentions (N = 45, k = 209). Other dependent variables included in our analyses and considered relevant for public policymakers and marketers have received less attention, especially product and brand attitude (N = 15, k = 112), actual consumption (N = 11, k = 69), and attention to the NFP (N = 7, k = 55). Web Appendix 3 provides more descriptive statistics on these dependent variables.

Primary data collection
To determine the moderating role of product category, we coded all individual products used in the collected articles. We categorized the products into broader categories (e.g., bread, yogurt, cookies) and asked 205 respondents (42% female, M_age = 34.86) on Amazon Mechanical Turk (using the TurkPrime interface) to rate the categories in terms of perceived vice or virtue. We provided the respondents with the definition of vice and virtue products and asked them to rate the category on a nine-point Likert-type scale, with the endpoints “vice” and “virtue” (adapted from Huyghe et al. 2017). Each respondent evaluated product categories in a randomized order, leading to a minimum of 52 ratings for each. In total, 78 different product categories were rated. We averaged the responses and used them in two ways: as a dummy variable split at the center of the scale in the analysis of the mean effect sizes, to separate the effects for products considered more vice or more virtue, and as a continuous predictor variable in the hierarchical regression model. The full list of product categories, together with their average vice score, is available in Web Appendix 4.
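The coding of the survey ratings described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name `code_categories`, the example ratings, the scale direction (1 = vice, 9 = virtue), and the treatment of a mean rating exactly at the scale center as vice are all hypothetical choices made for the sketch.

```python
from statistics import mean

# Assumption: nine-point scale with 1 = vice and 9 = virtue, so the center is 5.
SCALE_CENTER = 5.0


def code_categories(ratings):
    """Turn raw ratings per category into the two moderator codings.

    ratings: dict mapping a category name to a list of 1-9 ratings.
    Returns, per category, the mean rating, a virtue dummy split at the
    scale center, and the mean rating centered on the grand mean.
    """
    # Average the respondents' ratings within each category.
    averages = {cat: mean(vals) for cat, vals in ratings.items()}
    # Grand mean across categories, used for mean-centering.
    grand_mean = mean(list(averages.values()))
    coded = {}
    for cat, avg in averages.items():
        coded[cat] = {
            "mean_rating": avg,
            # Dummy: 1 if rated above the scale center (more virtue), else 0.
            "is_virtue": int(avg > SCALE_CENTER),
            # Continuous, mean-centered predictor for a regression model.
            "virtue_centered": avg - grand_mean,
        }
    return coded


# Hypothetical ratings for two categories.
example = code_categories({"cookies": [2, 3, 1], "yogurt": [7, 8, 9]})
print(example)
```

The dummy serves the split of mean effect sizes into vice and virtue products, while the centered score corresponds to the continuous predictor used in the moderator model.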

Analytical approach
As a measure of effect size, we collected or computed correlation coefficients r from each article (Gardner and Altman 1986; Geyskens et al. 2009; Hunter and Schmidt 2004). To normalize the distribution of the effect sizes, we applied Fisher's Z_r transformation (Hedges and Olkin 1985). We coded the articles for information on the type of FOP label investigated, its effect on the dependent variable(s), and potential moderators (e.g., product type, research design). Following Geyskens et al. (2009), we assessed the reliability of the coding process by having a second independent coder categorize a subset of 111 randomly selected effect sizes, representing 12% of the articles. Intercoder reliability for the two coders was 97.6%, with the few disagreements resolved through discussion. We analyzed the mean effect sizes using the Hedges-Olkin meta-analysis (HOMA) approach, in which each Fisher's Z_r effect size is weighted by the inverse of its variance (i.e., by N − 3), to account for differences in the reliability of individual studies and to give more weight to more precise effects (Lipsey and Wilson 2001). We opted for the random-effects HOMA (Raudenbush and Bryk 2002), which produces more conservative estimates than the fixed-effects model when effect size distributions are heterogeneous (while estimates are similar to those of the fixed-effects model if the distributions are homogeneous) (Lipsey and Wilson 2001). The random-effects HOMA assumes that studies estimate different effect sizes, so that the variability among them reflects sampling error plus other sources of variability that are assumed to be randomly distributed. We obtained the results through Wilson's STATA macro and back-transformed them into correlations (Lipsey and Wilson 2001). We quantified the moderating effects through a weighted hierarchical linear model (HiLMA), accounting for statistical dependencies in effect sizes from the same publication (Lipsey and Wilson 2001).
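The HOMA steps described above (Fisher transformation, inverse-variance weighting, random-effects pooling, back-transformation) can be sketched in a few lines of Python. This is a minimal illustration, not the Wilson STATA macro used for the analysis: the function name `random_effects_mean`, the example inputs, and the choice of the DerSimonian-Laird method-of-moments estimator for the between-study variance are assumptions made for the sketch.

```python
import math


def random_effects_mean(rs, ns):
    """Pool correlations with a random-effects, inverse-variance model.

    rs: list of correlation coefficients, one per effect size.
    ns: list of sample sizes (each must exceed 3).
    Assumption: between-study variance (tau^2) is estimated with the
    DerSimonian-Laird method of moments.
    """
    # Fisher's z transformation: z = atanh(r), with variance 1 / (N - 3).
    zs = [math.atanh(r) for r in rs]
    variances = [1.0 / (n - 3) for n in ns]

    # Fixed-effect weights are the inverse variances, i.e., N - 3.
    w_fixed = [1.0 / v for v in variances]
    sum_w = sum(w_fixed)
    z_fixed = sum(w * z for w, z in zip(w_fixed, zs)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (z - z_fixed) ** 2 for w, z in zip(w_fixed, zs))
    df = len(zs) - 1

    # Method-of-moments estimate of tau^2, truncated at zero.
    c = sum_w - sum(w * w for w in w_fixed) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each sampling variance.
    w_random = [1.0 / (v + tau2) for v in variances]
    z_random = sum(w * z for w, z in zip(w_random, zs)) / sum(w_random)
    se_z = math.sqrt(1.0 / sum(w_random))

    # Back-transform the pooled z into a correlation.
    return {"mean_r": math.tanh(z_random), "q": q, "tau2": tau2, "se_z": se_z}


# Hypothetical effect sizes and sample sizes.
print(random_effects_mean([0.10, 0.25, 0.40], [80, 150, 60]))
```

When the effect sizes are homogeneous, the estimated tau^2 is zero and the result coincides with the fixed-effects estimate; with heterogeneous effects the weights become more equal and the standard error grows, which is why the random-effects estimate is the more conservative one.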
The hierarchical structure allows us to account for within-study error correlation, as we have multiple effect sizes originating from the same article. The HiLMA also weights each observation by the inverse of its variance (Hedges and Olkin 1985), which allows us to explain the variation in the effectiveness of FOP labels as a function of the label type, product characteristics, consumer characteristics, and other study artefacts. Label type is captured by the two dummy variables interpretive summary indicator and interpretive nutrient-specific labels, with reductive nutrient-specific labels functioning as the reference category. Product characteristics are identified by the perceived virtue, as a mean-centered continuous variable based on our survey, and known brand, a dummy variable that takes the value of 1 when a known brand was used and 0 otherwise. To capture consumer characteristics, we use the proportion of female participants, as a mean-centered continuous variable, and a dummy variable for North American participants (1 = North American, 0 = other). We account for other research design factors with three variables: within-subject design (dummy variable that takes the value of 1 for a within-subject design and 0 otherwise), field of research (dummy variable based on the field of the journal, with 1 for marketing or consumer studies and 0 for other fields), and year of publication (mean-centered continuous variable).

Fig. 2 Number of articles included in the meta-analysis per year of publication (x-axis: year of publication, 1997 to 2018; y-axis: number of articles)

Table 1 reports results of the random-effects HOMA, with the back-transformed Fisher's z weighted mean effect sizes (mean ES) across the three types of FOP labels and dependent variables. The table also displays Cochran's Q test to highlight the large heterogeneity among the effect sizes (Lipsey and Wilson 2001).
Figure 3 shows the mean effect sizes per label type and further splits these into the individual labels within each broader type. The exact mean effect sizes of the individual labels are available in Appendix 2. Overall, we find that FOP labels indeed influence consumers' perceptions and intentions, though a great deal of variation exists among the different label types and the dependent variables.

HOMA: Overall effectiveness of FOP labels
Attention to the NFP Both types of interpretive labels significantly decrease consumers' attention to the NFP (r_nutrient_specific = −.104, p < .05; r_summary_indicator = −.259, p < .001), though only one study tests this effect for summary indicators (for rating labels; Watson et al. 2014). This implies that consumers may rely strongly on the information presented on these FOP labels, though additional research is warranted. At the level of the individual labels, the negative and significant effects appear especially for formats that present limited information, such as nutrient content claims (r = −.068, p < .001), health claims (r = −.044, p < .001), and rating labels (r = −.259, p < .001), but not for labels that list information for multiple nutrients simultaneously and thus repeat the NFP (r_monochrome = .025, ns; r_Multiple_Traffic_Light = −.111, ns).
Product perceptions Interpretive nutrient-specific labels have the strongest influence on consumers' overall healthfulness perceptions (r = .051, p < .001). This effect holds across category types, meaning that these labels also bring with them a (worrisome) positive impact on the perceived healthfulness of vice products (r = .030, p < .01), though virtue products seem to benefit more (r = .078, p < .001). Among these labels, both health claims (r_vice = .087, p < .001; r_virtue = .233, p < .01) and nutrient content claims (r_vice = .043, ns; r_virtue = .053, ns) show a positive impact on perceived healthfulness regardless of the category, though the latter effects do not reach significance. Importantly, however, both warning labels (r = −.120, p < .001) and Multiple Traffic Light labels (r = −.102, p < .01) negatively influence healthfulness perceptions of virtue products. Similarly, interpretive summary indicator labels influence healthfulness perceptions of vices positively (r = .069, p < .01) but have no impact on virtue products (r = .001, ns).
The tastiness evaluations of labeled products are influenced negatively by interpretive labels (r_nutrient_specific = −.160, p < .001; r_summary_indicator = −.166, p < .01), especially for virtues (r_nutrient_specific = −.246, p < .001; r_summary_indicator = −.282, p < .01). For nutrient-specific labels, the effect is completely driven by the impact of nutrition (r = −.286, p < .001) and health claims (r = −.101, p < .01) in virtue categories. Among interpretive summary indicators, health logos show a similar effect (r = −.317, p < .05). The effects on overall attitude measures are varied but have received little examination for all but nutrient-specific label types. These results indicate that the use of nutrient-specific interpretive labels may hurt consumer attitudes toward both vice and virtue products (r_vice = −.117, p < .001; r_virtue = −.198, p < .001). This effect seems to be especially true for warning labels, which hurt consumer attitudes across category types (r_vice = −.133, p < .001; r_virtue = −.118, p < .001).
Identifying, choosing, and consuming healthier options Considering the goal of FOP labels to help consumers identify healthier options, the results of all three label types show strong positive effects (p < .01), with the interpretive summary indicator labels leading the way (r_red_specific = .266, r_eval_specific = .177, r_eval_summary = .374). These effects do not directly translate into consumers making healthier decisions. Although all labels have a significant (p < .001) and positive impact on consumers choosing healthier products (r_red_specific = .093, r_eval_specific = .079, r_eval_summary = .023), these correlations are much smaller than those found for identification of healthier products. Among the individual labels, warning labels show the strongest impact on consumers' choice of healthier food products (r = .140, p < .001).
Regarding purchase intention (toward labeled products), we observe that, overall, only interpretive nutrient-specific labels are able to increase it (r = .026, p < .05). Importantly, both interpretive label types increase purchase intentions toward virtues (r_nutrient_specific = .087, p < .01; r_summary_indicator = .072, p < .01) but do not influence vices. Crucially, an examination of the individual labels reveals some unwanted effects as well: health claims significantly increase consumers' intentions toward purchasing vice products carrying those claims (r = .139, p < .001), while warning labels hurt purchase intentions regardless of the healthfulness of the category (r_vice = −.082, p < .001; r_virtue = −.104, p < .001). For actual consumption, we find that FOP labels have little impact. The overall effects for the three label types are all nonsignificant, and we observe only one significant effect at the level of the individual labels, in which health [...]. This finding, however, comes from just one study (Belei et al. 2012). Thus, further research on actual consumption effects is required.
Overall, the HOMA results show that though most FOP labels help consumers identify healthier options within product sets, this does not directly translate into other measures of effectiveness, which show smaller effects and greater variability between different label types and product categories. Moreover, while some FOP labels generated the desired effects, others, such as the interpretive nutrient-specific labels, may be misleading. In addition, we verified how many studies are necessary to nullify our significant results using the failsafe N (Rosenthal 1979). The number of studies obtained from these tests is 7446 on average, varying between 220 (where the current number of effect sizes is 4) and 24,618. These results suggest that even if we did not identify all unpublished studies for inclusion in our data set, the significant results in our analyses are unlikely to suffer from strong publication bias. However, caution is still warranted in the interpretation of results based on fewer than five effect sizes, such as the aforementioned effect on attitude and the negative effect of interpretive summary indicator labels on attention to the NFP, yielding a failsafe N of 258. All other significant results lead to a failsafe N of a minimum of 1628, suggesting that at least 1628 studies with null results would be required to nullify the results of the HOMA. Finally, the many significant Cochran's Q tests for homogeneity indicate strong heterogeneity within the effect sizes, calling for additional moderator analyses to control for the possible product, consumer, and research design characteristics at play.
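The failsafe N reported above can be illustrated with Rosenthal's (1979) formula in its common textbook form, N_fs = (sum of Z)^2 / z_alpha^2 − k, where k is the number of observed effects. The sketch below is an illustration under stated assumptions rather than the computation behind the reported numbers: the function name `failsafe_n`, the example Z scores, and the one-tailed critical value of 1.645 are assumptions, and the text does not state which variant of the test was applied.

```python
def failsafe_n(z_scores, z_crit=1.645):
    """Rosenthal's failsafe N for a set of standard normal Z scores.

    Returns the number of additional null-result (Z = 0) effects needed
    to bring the combined Stouffer Z down to the critical value.
    Assumption: z_crit = 1.645 (one-tailed alpha of .05).
    """
    k = len(z_scores)
    total_z = sum(z_scores)
    # Combined Z is total_z / sqrt(k + N); solving for N at z_crit gives:
    n_fs = (total_z ** 2) / (z_crit ** 2) - k
    # A negative value means the combined result is already nonsignificant.
    return max(0.0, n_fs)


# Hypothetical Z scores for four effect sizes.
print(failsafe_n([2.0, 2.0, 2.0, 2.0]))
```

Because the numerator grows with the square of the summed Z scores, a result based on only a few effect sizes yields a comparatively small failsafe N, which is consistent with the caution expressed above for results based on fewer than five effect sizes.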

HiLMA: The impact of study characteristics and methodological choices on FOP
We assess the impact of various moderating factors that may affect the effectiveness of FOP labels through a weighted HiLMA (Bijmolt and Pieters 2001). The models are not affected by severe multicollinearity, with the highest variance inflation factor being lower than 5. Table 2 presents the results of the analysis. Additional robustness checks are available in Web Appendix 5. Overall, the results show that multiple contextual factors have a significant impact on the effects of FOP labels on the different dependent variables.
We find that interpretive nutrient-specific labels on the front of packages influence consumers' attention to the NFP more negatively than reductive labels (β = −.190, p < .001). Furthermore, nutrient-specific labels with an interpretive element show a weaker impact on consumers' healthfulness perceptions than reductive ones (β = −.053, p < .01), when controlling for other factors. However, as consumers are evaluating the products, interpretive summary indicators lead to marginally more positive tastiness evaluations (β = .153, p < .10) and overall attitudes toward the product (β = .101, p < .01). By contrast, interpretive nutrient-specific labels have a more negative effect on attitudes (β = −.035, p < .001).
Overall, all FOP label types are beneficial for consumers trying to identify healthier options from product sets, with interpretive nutrient-specific labels having the strongest impact (β = .067, p < .05). This does not translate, however, into more healthy choices being made, as the different label types show similar effects for healthy choice (β_eval_specific = .003, ns; β_eval_summary = .016, ns). Although the impact of FOP labels on purchase intentions toward the labeled product is small, reductive labels show a more positive effect than interpretive nutrient-specific labels, but they do not differ from interpretive summary indicator labels (β_eval_specific = −.075, p < .05; β_eval_summary = −.025, ns). Finally, the label types do not differ in their ability to influence actual consumption behavior (β_eval_specific = .064, ns; β_eval_summary = .026, ns).

Product characteristics
The perceived virtue of a product category can influence the effectiveness of FOP labels. More specifically, FOP labels on products scoring low on “virtue” can lead consumers to perceive these (vices) as healthier (β = −.043, p < .01). Furthermore, using FOP labels may result in more negative product attitudes (β = −.029, p < .001), tastiness perceptions (β = −.022, p < .01), and purchase intentions (β = −.090, p < .10) for virtue products. By contrast, the effectiveness of FOP labels in terms of consumption (β = .003, ns) does not significantly differ depending on the product category. Use of a known brand as a stimulus also does not influence most of the dependent measures, with the exception of healthfulness perceptions, in that consumers are less influenced by FOP labels when evaluating the healthfulness of known rather than fictitious or unknown brands (β = −.048, p < .001).
Consumer characteristics Regarding the consumer-related moderating factors, a higher proportion of female participants leads to more positive effects on attitude (β = .739, p < .05) but lower tastiness perceptions (β = −4.179, p < .001). Women also reduce consumption more than men as a result of FOP labels (β = −.072, p < .001).
North American participants tend to benefit less from FOP labels than consumers in other parts of the world when identifying healthier options (β = −.010, p < .001) and are also less influenced by the labels with regard to healthfulness perceptions (β = −.072, p < .001). By contrast, FOP labels have a more positive impact on actual food consumption among North Americans (β = .162, p < .05).
Research design Not surprisingly, research designs influence the strength of the results found. When consumers are evaluating products both with and without FOP labels in a within-subject design, the impact of the label is stronger for purchase intention (β = .182, p < .001) but weaker for healthfulness perceptions (β = −.345, p < .001). The field in which the article is published also seems to affect the strength of the effects, with marketing and consumer studies showing more positive effects for product attitude (β = .495, p < .01) and actual consumption (β = .121, p < .05). We find that the effects for FOP labels have changed slightly over time. More recent studies on healthier choice behavior (β = −.016, p < .10), attitudes (β = .024, p < .01), and consumption (β = .010, p < .05) show weaker effects than earlier articles on FOP labeling. Furthermore, we find some differences in the effect sizes between published and unpublished studies, with unpublished studies showing more negative effects on healthy choices (β = −.236, p < .001) and consumption (β = −.238, p < .001), when controlling for other factors.
Table note: Virtue score not included in the model, as it is included in the DV.

Discussion
A wealth of research has tried to understand the antecedents of health label usage among consumers and consumers' attitudes toward food labeling. This stream of literature often focuses on a subjective understanding of FOP labels (Borgmeier and Westenhoefer 2009; Grunert and Wills 2007), that is, whether consumers believe they understand the information provided on food labels, which does not necessarily reflect their actual attitudes or behaviors (Levy et al. 1992). The studies included in our meta-analysis, however, examine this issue in more objective terms, showing that FOP labels indeed help consumers identify and compare healthier products. Our work provides the first quantitative interdisciplinary generalization of the effects of the different FOP labels. Below and in Table 3 we summarize the key findings based on our results.

Key findings
FOP labels steal attention from the NFP The results show that the presence of FOP labels may reduce the attention consumers pay to the NFP, suggesting a strong reliance on the information presented on the front. This especially seems to be the case with labels that do not offer the same information as the NFP: either claims focusing on individual nutrients or rating labels offering a summary of the overall healthfulness of the product. This underscores the importance of carefully reviewing the information presented on the front and implementing clear regulations against misleading or simple claims, which may lead consumers to overgeneralize the information to the overall healthfulness of the product (Roe et al. 1999;Burton et al. 2014). Given the limited number of studies on this issue, further research is necessary to confirm these findings and to better understand the underlying process.
Identifying healthier options easier with FOP labels Even though previous work has found strong support for interpretive nutrient-specific labels helping consumers identify healthy products (Hersey et al. 2013; Volkova and Ni Mhurchu 2015), in line with Ducrot et al. (2015) we find that interpretive summary indicators are most easily understood by consumers, allowing them to identify healthy products more accurately. However, the effect of all FOP labels on consumers' ability to identify healthier options is positive, suggesting that FOP labels are able to cover the first part of the goal set by the FDA (2010): “increase the proportion of consumers who readily notice, understand ... the available information.” Though all labels help consumers, based on our results, an overall summary of the healthfulness is more beneficial than a focus on individual nutrients. However, whether this knowledge of a product's relative healthfulness translates into healthier choice and purchase behavior, the second part of the FDA goal for FOP labels, is less clear from existing research.
Knowing what is healthier does not directly translate into healthier behaviors Although we find that all label types have positive effects on consumers' choice of healthier options, these effects are much smaller than for the identification of the healthier options. Purchase intentions are only influenced by interpretive label types, which increase consumers' intentions to buy virtue products. However, the literature provides limited evidence for effects of FOP labels on consumers' intentions to purchase unhealthy vice products. Taken together with the results indicating that consumers are not adapting their consumption behaviors (i.e., the actual intake of food) based on FOP labels, our results question whether FOP labels are able to reach the second part of the FDA goal, to “use the available information to make more nutritious choices for themselves and their families, and thereby prevent or reduce obesity and other diet-related chronic disease.”
One explanation for the improved ability to identify healthier options not leading to similarly strong effects on purchase intentions and consumption behavior stems from the many other factors driving food purchases. Despite the increase in consumer health consciousness (e.g., Michaelidou and Hassan 2008), tastiness remains a key driver of food choice (Raghunathan et al. 2006). We find that highlighting a product's healthfulness through interpretive FOP labels hurts the tastiness evaluations of food products, offering support to the notion that the “unhealthy = tasty” intuition (Mai and Hoffmann 2015; Raghunathan et al. 2006) may be a limiting factor in nutritional information leading to healthier choices. This effect is especially strong for nutrient content claims, health claims, and health logos. However, we find no positive effect for warning labels, as expected from the negative effect on healthfulness perceptions. This suggests a potential way to battle the “unhealthy = tasty” intuition, though further research is needed.

Table 3 Key findings
Outcome | Best front-of-package label | Key take-aways
Attention to the NFP | Reductive Nutrient-Specific | While reductive nutrient-specific labels do not alter the attention to nutritional information, interpretive labels drive consumers' attention away from the NFP.
Healthfulness perception | No clear winner | All three types of front-of-package labels increase the perceived healthfulness of vice products. Interpretive nutrient-specific labels are the only ones (positively) influencing virtues as well, but there is strong variation between the specific labels within the category, and the positive effect is mainly driven by health claims.
Tastiness perception | No clear winner | Both types of interpretive labels have a strong, negative impact on the tastiness perception of virtue products, showing support for the unhealthy = tasty intuition.
Attitude | Interpretive Summary | Both reductive and interpretive nutrient-specific labels hurt consumer attitudes toward virtue products, the latter also for vices. Interpretive summary indicators do not hurt, but show no positive impact, either.
Identifying healthier options | Interpretive Summary | All three types of labels clearly help consumers identify healthier options, reaching one of the main goals of front-of-package labeling by the FDA. Interpretive summary indicators show the strongest effect.
Purchase intention | Interpretive Nutrient-Specific | The effect sizes in relation to purchase intentions are overall limited. Both interpretive label types show a positive impact on the purchase intentions of virtues, but no label type lowers consumer interest toward vices.
Healthy choice | No clear winner | All label types show some positive effect on consumers' making healthier choices.
Consumption | No clear winner | Front-of-package labeling, regardless of the type, is not influencing consumers' consumption, calling into question its ability to influence eating behavior.

Another explanation may be that simply providing more nutrition information may not lead to direct changes in behavior (Balasubramanian and Cole 2002), and FOP labels may simplify the search process mainly for those already interested in buying healthier products (Aschemann-Witzel et al. 2013a;Lobstein and Davies 2009). Indeed, we show that the impact of FOP labeling on perceived healthfulness is more pronounced in studies with a higher proportion of female participants, who tend to be more health conscious (Lassen et al. 2016). We thus find that the labels allowing consumers to more easily identify healthier options are not necessarily the ones that lead to healthier choices. This is in line with the notion that the provision of information alone does not directly lead to changes in behavior.
Misleading interpretation of healthfulness information To avoid misleading consumers or promoting unhealthy products, FOP labels should ideally have a negative impact on the perceived healthfulness of vice products and a positive impact on that of healthy products (Talati et al. 2016). However, we find that all three label types have some positive influences on the perceived healthfulness of vice products (Table 1). Thus, FOP labels may lead consumers to believe that the product is healthier in all respects, suggesting a potential halo effect of these labels. The effects for health claims and health logos, despite their dissimilar content and focus, seem highly similar across multiple different outcome variables. This implies that consumers may perceive an interpretive nutrient-specific health claim as an indicator of overall healthfulness, in support of prior work on health halos, which has found that consumers use information about specific attributes as a basis of inference in the absence of information about a product's overall healthfulness (Roe et al. 1999). Thus, a positive claim about a specific nutrient can lead consumers to assign similar positive values to other attributes and to overall healthfulness. Interestingly, consumers seem less influenced by nutrient content claims, such as “30% less sugar,” than by health claims. Possibly, consumers are aware of the stricter regulation surrounding health claims, and therefore more suspicious of the more relative nutrient content claims. This could indicate an even stronger reliance on the health halos created by the health claims consumers consider trustworthy.
Though in line with previous research (Talati et al. 2016), it is less clear why labels offering (negative) information regarding multiple nutrients, such as the reductive nutrient-specific labels and the Multiple Traffic Light labels, have a similar positive effect on the perceived healthfulness of vice products. It is possible that consumers' perception of the healthfulness of vice products is already very low based on category knowledge and expectations, and the provision of nutrition information leads to an adaptation to a more moderate perception. This possible explanation, however, requires further research, as it has not been studied so far.
Research has also found that Multiple Traffic Light labels especially increase attention to negative nutrients (Jones and Richardson 2007). This helps explain the negative effect of Multiple Traffic Light labels in virtue categories: if a product in a virtue category scores red on one or more nutrients, it might have a harmful impact on the overall perception of the product's nutritional value. Overall, none of the individual labels seems capable of influencing the healthfulness perceptions of virtues positively and vices negatively, warranting additional research on the topic.
Less impactful front-of-packaging label for familiar brands Finally, our moderator analysis suggests that FOP labels have less influence on the perceived healthfulness of known brands than on that of fictitious or unknown brands. This finding is in line with existing research suggesting that labeling is less influential when consumers have already formed opinions about products and therefore pay less attention to labels. Limited research has thus far been conducted in real shopping situations, in which consumers are mainly faced with known and familiar brands, but this finding indicates that simply adding FOP labels may not be enough to change consumer perceptions.

Implications
Our results allow a better understanding of the aspects affecting the effectiveness of FOP labels, which is of use to both public policymakers and marketers.
For public policymakers We find that to meet a key goal of FOP labels to help consumers identify healthier options from product sets, interpretive summary indicator labels are most helpful. Importantly, these labels may be most beneficial for consumers lacking the ability to interpret more detailed nutrition information, who may be more at risk for health issues related to unhealthy consumption (Ducrot et al. 2015). By contrast, given previous findings on consumer preferences (see Méjean et al. 2013), public policymakers and manufacturers should consider label types combining detailed nutrition information with an interpretive aspect, such as the Multiple Traffic Light. However, they should also note that these may have a negative influence on the perceptions of healthy food products. A potential solution may be to combine the two label types, adding a traffic light for the overall healthfulness of the product (Temple and Fraser 2014).
The recently popularized warning labels offer promising results in driving healthier choices among consumers. These labels negatively influence consumers' perceptions of both the healthfulness and tastiness of unhealthy food products. However, this effect remains negative in virtue categories as well, in which, for example, an otherwise healthy product high in sodium may also be hurt by the sodium-warning label. Therefore, it is critical to understand consumers' reactions to warning labels in more detail: if consumers switch from labeled products, what do they choose as alternatives?
However, the results also call for caution in relying on the provision of nutrition information as a way to improve public health. Although we find that, in general, consumers are better able to identify healthier options from a set of products when FOP labels are present, the effects on healthier choice are weaker, though still positive. The impact on consumption has received insufficient attention in the literature, but it seems that FOP labels are not influencing the amount consumers eat, indicating their limited effectiveness in influencing behaviors. Although providing consumers information about the foods they consume in a simpler, faster-to-process format is certainly beneficial, it remains crucial to find more ways to increase their motivation to consume healthy foods.
For marketing practice Overall, our results suggest that the impact of FOP labels on consumers' purchasing behavior is rather limited. However, consumers do react positively to health and nutrient content claims. A responsible marketer should be cautious though: these labels may increase sales of unhealthy items as well. Little is known about consumers' reactions to retailers or manufacturers that promote unhealthy items through the creation of health halos. By contrast, warning labels show a negative influence on purchase intentions of the products carrying the warnings. As such labeling may become increasingly common across various countries, manufacturers may want to offer a healthier alternative so that consumers switching from the warning-labeled products can do so within the same brand offering. Alternatively, it seems beneficial to adjust recipes and avoid the implementation of the warning labels on products altogether.
Our results also indicate that FOP labels may influence consumers' overall product attitudes. That is, consumers seem to react negatively to health and nutrient content claims but positively to interpretive summary indicators. This suggests that brands can use FOP labeling to build health associations (see Bollinger et al. 2018). Conversely, health and nutrient content claims, which consumers do not always trust (Kozup et al. 2003), may also hurt the brand. Marketers should be careful in deciding which labels to implement not only because of a direct impact on sales but also because of the broader impact on the brand.

Limitations and future research directions
While meta-analyses provide generalizable results, they come with their own limitations. First, although we carefully scanned various databases and journals for all relevant literature, it is possible that we failed to identify some publications. We also excluded many studies identified for this meta-analysis because of unavailable data. Higher reporting standards should be implemented across all disciplines and journal tiers in order to allow future empirical generalizations. Second, in our analyses we filter out the most common methodological choices of the primary studies included in our sample, but may miss other study characteristics that could bias results. As our results highlight how these methodological choices and study characteristics affect results (e.g., the use of familiar brands, female respondents, within-subject designs), future studies should more carefully justify their research design and report additional robustness checks. Third, all the studies included assume consumer awareness of the label. However, many factors demand consumers' attention at the point of purchase and also influence their motivation and ability to read FOP labels. These factors will certainly influence our findings, which are only suggestive of effects from consumers actually paying attention to the labels (van Kleef and Dagevos 2015). Table 4 highlights some of the key questions for future research to investigate with regard to front-of-package labels.
Policymaker and consumer perspective
One of the crucial factors requiring better understanding is the interaction between different label types and consumer characteristics. Research indicates that factors pertaining to, for example, consumers' health motivation and knowledge have an impact on the effects related to consumer behavior and nutrition labeling (Hieke and Taylor 2012). Future work should aim to understand the moderators of FOP labeling effects in more detail, especially by considering personality factors and ways to draw consumers' attention to the label. Given the criticism that FOP labels may only help consumers who already desire healthier food options, future research should investigate which types of labels help encourage healthier consumption among the less health conscious (those more at risk of the various negative health consequences of unhealthy consumption).
Furthermore, much of the research on FOP labels assesses the effects across a range of consumers, assuming that their perceptions of products and their healthfulness are fairly homogeneous. However, healthfulness can mean different things to different consumers: while some consumers might be focusing on avoiding the intake of excess calories, others may be limiting their sugar consumption or seeking gluten-free products because of intolerance or allergies. It is important to understand how individual differences in perceived healthfulness influence the impact of different types of FOP labels. For example, our results suggest that consumers react differently to labels on healthy and unhealthy products depending on their subjective perceptions.
Relatedly, our results show that while most FOP labels can help consumers identify healthier products, this does not directly translate into the choice of these products. In other words, consumers' lack of knowledge about the healthfulness of products is not the only issue driving their food choices, and knowing whether a product is healthy does not help if the consumer is not interested in finding healthy options. As such, future work should aim to understand the factors that make consumers prefer less healthy products and to find ways to counteract these. Do any FOP labels increase consumers' health motivation at the point of purchase? If so, can this explain some of the differences between the effectiveness of the different labels? Indeed, little is known about the psychological mechanisms underlying the (in)effectiveness of FOP labeling.
Manufacturer and marketer perspective
Most of the research on FOP labeling is based on marketplace examples of different label types and designs, leading to a focus on comparing the various labels and their impact on a range of outcomes but not on the design factors contributing to these differences. Although some work has examined the impact of different shapes and colors on the effectiveness of warning labels (Cabrera et al. 2017), a more detailed understanding of the impact of these design factors on the effectiveness of FOP labeling is necessary. Knowledge of consumer reactions to design details, framing, or even the abstractness of the information presented may offer new insights into the "best" FOP label.
Moreover, only a small number of studies in our analysis examine the simultaneous impact of various FOP labels. Future work should aim to understand how nutrition and health claims interact with reductive label types or interpretive summary indicators, such as rating labels. Similarly, assessing the interplay of FOP labels and other information provided on the package can help explain how consumers react to the complexity of the information provided at the point of sale. Would consumers react to negative information in a nutrition label differently if the package also carried claims of naturalness or superior tastiness?
Finally, we focused only on the consumer perspective. It seems relevant to study FOP labels from a more systemic point of view. For example, the implementation of different FOP labels can motivate manufacturers to refine their recipes, leading to healthier product assortments (Lobstein and Davies 2009). Such a change would be a fruitful, albeit long-term, avenue toward healthier consumption patterns. Similar points can be made with regard to retailers, who might be inclined to include products with FOP labels in their assortments in order to improve their overall image as a responsible retailer and to attract the growing segment of health-conscious consumers.

Table 4 Key questions for future research on front-of-package labels

2. Effectiveness
Do front-of-package nutrition labels help consumers who are most at risk for health issues related to unhealthy consumption or simplify choices of consumers already interested in healthy consumption?
How do consumers' subjective perceptions of product healthfulness influence the effectiveness and outcomes related to front-of-package labeling?
Can front-of-package nutrition labels help increase consumer motivation to consume healthier foods?
Do the possible effects of front-of-package labels hold over time, or is their impact mainly driven by the novelty of recently implemented labeling?

Manufacturer perspective
1. Product development
Does the implementation of front-of-package nutrition labeling motivate manufacturers to develop healthier products?
2. Label design
How do the design factors of nutrition and health labels, such as their shape, coloring, and placement on the package, influence their effectiveness in attracting attention and influencing consumers' choices?
How are consumers influenced by packaging carrying multiple (different forms of) front-of-package nutrition labels simultaneously? What about in combination with other labels not directly related to health (e.g., organic-production claims or allergen information)?
3. Brand image
How does the voluntary implementation of front-of-package nutrition labels influence the brand's image or consumer perceptions of the manufacturer?
Can brands gain positive value by simplifying the search of consumers for healthier products and signaling concern for public health?

Can front-of-package labels attract attention and change behaviors in the cluttered retail environment?
Which label formats are best in these situations?
Do consumers' past experiences with brands influence front-of-package labels' ability to influence consumer perceptions of products from known brands?