Participatory plant breeding reveals that geosmin concentration is not the central determinant of hedonic liking in table beet

Participatory plant breeding and rapid sensory evaluation are effective techniques for organic cultivar development. Table beet is an important crop for organic growers, and geosmin, a volatile compound which confers earthy aroma, has been suggested as the attribute around which hedonic liking of beet is organized. Open pollinated table beet populations with diverse pigmentation and low (LGC) or high (HGC) geosmin concentration served as starting materials for the first PPB effort in table beet. This project sought to develop consumer-accepted specialty beet cultivars for organic systems and to gauge consumer perception of and preference for geosmin concentration in non-laboratory conditions. LGC and HGC initial populations were significantly different in mean geosmin concentration but not mean TDS. LGC populations diverged significantly in geosmin concentration over two cycles of selection for hedonic liking, due to drift rather than selection. PPB yielded cultivars ‘Evansville Ember’, ‘Snowglobe’, ‘Blushing Not Bashful’, ‘Evansville Orbit’, and ‘Moving Target’. Cultivar novelty and market development were strengthened by chef input and association with a publicly funded seed system development group. Geosmin concentration was not the central determinant of hedonic liking or perceived earthy flavor in table beet. Earthiness was inconsistently associated with geosmin concentration and hedonic liking. Sweetness and bitterness were positively and negatively correlated with liking, respectively, although sweetness was not associated with variation in TDS. Cultivars with a broad range of geosmin concentration were well accepted by consumers, and manipulating expectation—via appearance—may be as powerful as manipulating flavor compounds in influencing liking of table beet.


Introduction
Table beet (Beta vulgaris ssp. vulgaris) is distinctive for its betalain pigmentation and earthy flavor (Goldman and Navazio 2003) and is widely produced by Wisconsin farmers (USDA NASS 2020), including organic farmers marketing directly to consumers (Lyon et al. 2015). Flavor is of utmost importance for consumers in local and organic marketplaces (Dawson and Healy 2018); such consumers also accept, or even seek, specialty varieties with novel appearance or culinary quality (Goldman and Navazio 2003). Indeed, specialty vegetable varieties can be important in market niche establishment (Lyon et al. 2015), and multiple preferred flavor profiles can coexist within a market space (Dawson and Healy 2018). Research into table beet flavor, in particular the earthy aroma conferred by the aromatic molecule geosmin (trans-1,10-dimethyl-trans-9-decalol), yielded open pollinated (OP) table beet populations with diverse pigmentation and uniquely low geosmin concentration (LGC) or high geosmin concentration (HGC) character (Maher and Goldman 2017). Due to their distinctive flavor characteristics and varied colors, these populations held potential for development into specialty cultivars aimed towards organic farms and direct-to-consumer markets. Plant breeding priorities of organic farmers diverge from those of conventional farmers (Hubbard and Zystro 2016) and reflect their customers' emphasis on flavor (Dawson et al. 2017; Brouwer and Colley 2016), and participatory plant breeding (PPB) has been used to facilitate effective and culturally-appropriate cultivar selection for organic systems (Shelton and Tracy 2016). Rapid sensory evaluation, which involves untrained tasters or professional experts in lieu of trained sensory panels, can yield accurate, relevant information regarding consumer flavor perception and preference when used in plant breeding programs (Dawson and Healy 2018).
The present work represents, to our knowledge, the first PPB effort in table beet; it sought to develop consumer-accepted specialty beet cultivars with strong performance on Wisconsin organic farms, and concurrently, it sought to gauge consumer perception of and preference for geosmin concentration in non-laboratory conditions. Participatory plant breeding is a crop development strategy that facilitates farmer-involved selection in environments that diverge from those targeted by conventional plant breeding (Atlin et al. 2001). PPB was first developed to serve farmers cultivating marginal land in developing countries (Ceccarelli 1994), but more recently, PPB has been adapted to serve organic and low-input farms in developed countries (Dawson et al. 2008). Because organic agroecosystem characteristics, cultural practices, and consumer preferences differ from those of conventional agriculture, organic growers' plant breeding priorities often diverge from those of conventional growers (Hubbard and Zystro 2016). While breeding priorities vary by crop, organic farmers tend to prioritize yield, plant disease resistance, weed competitiveness, crop resilience during abiotic stress, and flavor (Brouwer and Colley 2016; Hultengren et al. 2016; Hoagland et al. 2015). By facilitating direct selection in the target environment, rather than indirect selection in a conventional environment, PPB produces greater response to selection for quantitative traits like flavor, yield, and climatic adaptation (Ceccarelli 1996; Murphy et al. 2007; Renaud et al. 2014). Importantly, farmer involvement in PPB can lead to increased varietal adoption, which increases the program's effectiveness and cost efficiency (Ceccarelli 2015).
Farmer-engaged research is common within the organic seed community, due to both its fruitful results and its resonance with the organic community's values of self-reliance and community knowledge exchange (Shelton and Tracy 2016). PPB models can be breeder- or farmer-initiated, carry out selection on farms or research stations, engage farmers in part or all of the breeding process (Shelton and Tracy 2016), and involve breeder collaboration with a single farm (e.g. Shelton and Tracy 2016; Mazourek et al. 2009) or multiple farms (e.g. Hoagland et al. 2015; Myers et al. 2011). Because flavor has emerged as such a critical breeding priority for the organic and consumer-direct marketplace, some projects include culinary professionals in development and evaluation of cultivars (Beans 2017). Indeed, in the case of developing vegetable cultivars for direct markets, farmer varietal adoption hinges directly on consumer acceptance, so expanding PPB models to include chefs and household consumers is both advantageous for gauging marketability (Dawson and Healy 2018) and consistent with historic organic community values (Shelton and Tracy 2016). Finally, while cultivar development for the conventional vegetable market is primarily accomplished by large private seed companies, these companies have little economic incentive to develop cultivars for organic agriculture (Atlin et al. 2001). Shelton and Tracy (2016) argue that given the environmental and social good associated with organic agriculture, organic cultivar development is well suited to public breeding programs, especially when farmer involvement can be achieved.
In sensory science, flavor is defined as the "biological response to chemical [stimuli] by the senses [that is] interpreted by the brain in the context of human experience" (Heymann et al. 1993). Chemical stimuli comprise molecules that bind to taste receptors on the tongue to produce sweet, sour, salty, bitter, and umami flavors; to oral trigeminal nerve endings that detect astringency or pungency (Roper and Chaudhari 2017); or to aroma receptors in the back of the throat and nasal cavity (Soudry et al. 2011). Aroma, or odor, is the sensation caused by the action of volatile compounds on the olfactory system (Soudry et al. 2011); importantly, these compounds interact rather than act additively to produce perceived aroma (Wang and Seymour 2017). Flavor perception varies by individual due to variation in number, assortment, and sensitivity of sensory receptors (Klee and Tieman 2018), which can depend on both genetics and habitual diet (Tesileanu 2019). Signals from all activated flavor receptors are processed by the brain; aroma signals are first associated with memories and emotions (Soudry et al. 2011), and then sensory signals are integrated with environmental, historical, and socio-cultural information to create a unified experience of sensation and preference (Lahne 2016).
Flavor perception and preference, then, are inseparable from an individual's neurocognitive landscape, from which a quality expectation is generated for each food item encountered (Deliza and MacFie 1995). Initial expectations regarding sensory quality (as reviewed by Piqueras-Fiszman and Spence 2015) and hedonic liking (as reviewed by Fernqvist and Ekelund 2014) are generated based on previous experience with the food, intrinsic quality cues, and extrinsic quality cues. Intrinsic quality cues are inherent to a food product; these include aroma, taste, texture, appearance, and freshness (Piqueras-Fiszman and Spence 2015). Of all product-intrinsic cues perceptible before food consumption, color is of foremost importance in establishing expectations about foods (Spence and Piqueras-Fiszman 2016). Extrinsic quality cues are associated with a food item but not inherently part of it; these include price, brand, nutrition information, and credence cues, which are pieces of information that establish the credibility of a seller to a buyer (Fernqvist and Ekelund 2014). Upon tasting, these sensory and hedonic expectations are either confirmed or disconfirmed. According to Deliza and MacFie (1995), confirmed expectations generally result in consumer satisfaction and repeated product use. A similar result occurs when positive disconfirmation takes place, that is, when the food is perceived as better than expected. Negative disconfirmation, when the food is perceived as worse than expected, is postulated to result in lowered expectations and/or future rejection of the food.
While flavor represents a complex, variable, and reflexive interaction between food and human, plant breeders seek to determine and select for the physicochemical compounds most strongly associated with preferred flavor (Klee and Tieman 2018). Historically, characterization of table beet flavor has revolved around its most salient flavor attributes, sweet flavor and earthy aroma, which are derived from sucrose (Bach et al. 2014) and the volatile terpenoid molecule geosmin (Murray et al. 1975), respectively. In addition, compounds with known bitter flavor, namely saponins (Mikołajczyk-Bator et al. 2016), flavonoids (Kujala et al. 2002), and phenols (Bavec et al. 2010), are present in beet, as is oxalate (Freidig and Goldman 2011), which produces abrasive sensory qualities in other crops. A trained sensory panel detected 23 and 17 sensory attributes in raw and boiled beet, respectively; in boiled beet, sweet, earthy, and bitter flavors were salient, while prominent aromas included earthy, beetroot, boiled potato, and sweet (Bach et al. 2014). The dominance of earthy flavor and aroma to the sensory character of table beet has been well documented anecdotally (e.g. Goldman and Navazio 2003) and in lay literature (e.g. Hartke 2020), as has the range of consumer attitude toward earthy flavor in beet: from adoration to aversion. Because earthy flavor deters some consumers from beets, and because geosmin is known to produce earthy aroma, research has been undertaken to determine the degree to which geosmin concentration in table beet is genetically controlled (Freidig and Goldman 2014; Maher and Goldman 2017, 2018; Hanson and Goldman 2019; Hanson et al. 2021). In addition, OP cultivar 'Badger Flame' was developed by Goldman and Breitbach at UW-Madison for low levels of earthy flavor, albeit without laboratory measurement of geosmin concentration, and consumer response to the cultivar has been positive (Rao 2018) but not measured empirically.
The present work seeks to extend technical research on geosmin heritability into the realm of practical applicability and to complement anecdotal preference evaluation of extreme-geosmin table beets with empirical evidence.
Geosmin imbues moist soil with its distinctive aroma but is considered a contaminant in water, wine, beer, fish, and other foods (reviewed in Liato and Aïder 2017). Geosmin is present in all five B. vulgaris ssp. vulgaris crop types, including table beet, sugar beet, and chard (Freidig and Goldman 2014), and closely related dehydrogeosmin is present in other members of the Caryophyllales order (Schlumpberger et al. 2004). This 12-carbon molecule with earthy and musty odor characteristics (Gerber 1967) is known to be synthesized by a diverse group of organisms including actinobacteria, cyanobacteria, and proteobacteria (Churro et al. 2020), fungi (Liato and Aïder 2017), and liverworts (Spiteller et al. 2002). The biosynthesis of geosmin in Streptomyces coelicolor bacteria hinges on a single bifunctional geosmin synthase enzyme (Harris et al. 2015), which carries out a cyclization-fragmentation reaction in the presence of Mg²⁺ ions and the 15-carbon precursor molecule farnesyl diphosphate (FPP) (Jiang et al. 2007). A geosmin synthase gene has been discovered in both Streptomyces coelicolor (Cane and Watt 2003) and cyanobacteria (Giglio et al. 2008; Churro et al. 2020). Investigation of the B. vulgaris ssp. vulgaris genome supports the plausibility of a geosmin synthase gene in table beet. A large region of B. vulgaris ssp. vulgaris Chromosome 8 showed association with geosmin concentration in a genetic mapping study, representing the first connection of a physical genomic location in B. vulgaris ssp. vulgaris with geosmin concentration (Hanson et al. 2021). In addition, a preliminary search of the RefBeet sugar beet reference genome (Dohm et al. 2014) for the S. coelicolor geosmin synthase sequence returned two hypothetical proteins with predicted terpenoid synthase function (Huang et al. 2017) also located on B. vulgaris Chromosome 8 (Maher 2017).
While the genomic and biosynthetic origin of geosmin in table beet remains under investigation, a substantial body of evidence shows that geosmin in table beet is both endogenously produced and under primarily genetic control. Because geosmin is known to be synthesized by diverse microbes, it was historically assumed that geosmin concentration in B. vulgaris ssp. vulgaris owed to association with either soil-dwelling or endophytic microbes. However, beet accessions grown in autoclaved and non-autoclaved soil showed no significant difference in geosmin concentration, and cultivar-specific geosmin concentrations could be discerned (Freidig and Goldman 2014). Geosmin was also present in beet seedlings grown in sterile tissue culture, sometimes in significantly higher concentration than in greenhouse-grown comparisons (Maher and Goldman 2018), further refuting the hypothesis that geosmin in table beet is of microbial origin. Evidence of the heritability of geosmin in table beet is consistent with that supporting its endogenous origin. Geosmin concentration responded to bidirectional recurrent selection in OP table beet populations, showing realized heritability of 0.70 and 0.23 for downward and upward selection, respectively (Maher and Goldman 2017). Genotype was also found to be responsible for 90% of variance in geosmin concentration in an experiment comparing four table beet genotypes across environments and fertilizer treatments (Hanson and Goldman 2019). Given the consistent evidence supporting the heritability of geosmin concentration in table beet, significant potential exists to develop cultivars with novel geosmin concentration. However, to define appropriate future breeding goals for geosmin in table beet, and more broadly for table beet flavor, it is necessary to probe the connections between geosmin concentration, geosmin perception, perception of earthy flavor, perception of signature beet flavor, and hedonic liking.
A perception threshold for geosmin in edible table beet has not been determined, but it is likely to depend on both individual olfactory sensitivity and table beet preparation. Geosmin odor detection thresholds vary by individual but are exceedingly low at 5 to 50 ng/L, depending on the chemical matrix within which geosmin is situated (Liato and Aïder 2017). A study of geosmin odor detection in cooked beet juice found a geosmin detection threshold of 5.8 µg/L, substantially higher than the aforementioned thresholds, above which several participants deemed the juice "too earthy to be beet like." One panelist also found beet juice with very low geosmin concentration (0.36 µg/L) to be "bland and without character," suggesting that a fairly narrow range of geosmin concentration was perceived by tasters to be characteristic of table beet (Tyler et al. 1979). Discrimination tests using raw beet samples of varied geosmin concentration suggested the existence of a threshold for geosmin perception but were not designed to quantify such (Maher and Goldman 2017). Thus, differences in geosmin concentration may be perceptible in table beet, depending on concentration, preparation, and individual detection threshold, and perceived differences in geosmin concentration may be associated with characteristic beet flavor, but this neither implies a preference for a particular geosmin concentration nor provides useful descriptors for the perceived differences in beet flavor. In contrast, Bach et al. (2014) reported perceived flavor and aroma attributes of table beet cultivars, along with appropriateness (which implies positive evaluation) for boiled, raw, and pan-fried preparations. However, geosmin concentration was not measured, so its association with perceived earthy flavor and culinary appropriateness could not be assessed. Finally, these sensory studies of beet (Bach et al. 2014; Tyler et al. 1979) report neither demographic data for their sensory panelists nor panelists' prior preference for beet flavor, but because flavor perception is genetically and culturally mediated, it cannot be assumed that these studies' results apply to all consumers.
While earthy aroma is the sensory attribute that most distinguishes table beet from other root vegetables, sweet flavor is also salient in table beet (Bach et al. 2014). Sweet flavor is associated with sucrose, the disaccharide produced via photosynthesis and translocated to beet roots. Sucrose, commonly measured as total dissolved solids (TDS) in table beet, is directly related to total dry root mass (McGrath and Panella 2018) and is slightly more concentrated in the outer root zone of table beets than the center (Gaertner and Goldman 2019). Percent sugar content in sugar beet shows moderate narrow sense heritability (h² = 0.60) (Würschum et al. 2011), and repeatability for TDS was found to be 0.43 in a small set of table beet cultivars (Hanson and Goldman 2019). Indeed, soluble solids content was found to vary with planting date (Feller and Fink 2004), reflecting the complex genotype × environment interactions with respect to TDS observed by Hanson and Goldman. Increased soluble solids content has been historically associated with table beet quality (Feller and Fink 2004), although perceived sweet flavor and sucrose content have not always shown positive correlation (Bach et al. 2014). Sucrose is known to interact with bitter compounds in other vegetables, at times with the effect of masking perception of either sweet or bitter flavors (Kreuzmann et al. 2008; Beck et al. 2014).
This PPB project pursued both farmer and consumer engagement to develop novel, consumer-accepted beet cultivars that grow well on Wisconsin organic farms. The novel color and flavor characteristics of the LGC and HGC table beet populations presented opportunities to manipulate expected and perceived flavors, respectively. Two PPB models were employed in development of novel LGC and HGC table beet cultivars, each with three key selection groups: farmers, core hedonic selection participants, and consumer hedonic selection participants. A Farm model used the established PPB method of collaboration between a single farm and single plant breeder; in this model, one farmer, the farm's staff, and the farm's CSA members served as farmer, core, and consumer participants, respectively. In contrast, an Outreach model sourced input from many Wisconsin direct market vegetable farmers, a group of farm-to-table chefs, and consumer event attendees as farmer, core, and consumer participants, respectively. The Seed to Kitchen Collaborative (SKC), a program of the Urban and Regional Food Systems Group in the University of Wisconsin-Madison Dept. of Horticulture, facilitated farmer contacts and convened chef and consumer tasting events. Notably, core participants were professional experts with respect to flavor perception in the Outreach model but not the Farm model. Three cycles of recurrent selection were carried out in both Farm and Outreach models for both LGC and HGC populations. 'Cycle-Model population' refers to the LGC or HGC families planted in a single year (Cycle) and site (Model).
Because the goal of this PPB project was to develop regionally-adapted and consumer-accepted cultivars from germplasm with unique and extreme geosmin concentration, direct selection was for farmer and consumer acceptance, as measured by horticultural performance and hedonic liking, respectively. Geosmin concentration and TDS, then, were the objects of indirect selection. Tracing the behavior of these flavor compounds over two cycles of selection illuminates both the degree to which they were associated with hedonic liking and the degree to which they were altered via either indirect selection or drift. Finally, chef and consumer flavor evaluation of collaboratively bred LGC and HGC cultivars offered insight into the relationships among physicochemical flavor compounds, perceived flavor attributes, and hedonic liking.

Germplasm
Cycle 0 LGC and HGC table beet populations were each composed of 30 LGC and HGC half sib families, respectively, derived by bidirectional breeding for geosmin concentration (Maher and Goldman 2017). Seed parent roots of Cycle 0 LGC and HGC families ranged in geosmin concentration from 1.26 to 2.03 µg·kg⁻¹ and from 27.49 to 44.92 µg·kg⁻¹, respectively. Cycle 0 LGC and HGC populations were composed of identical families in both Farm and Outreach models, but because LGC and HGC Cycle 1 and 2 populations resulted from independent selection events, they were distinct between models. OP cultivars 'Touchstone Gold' and 'Bull's Blood' were selected to serve as check cultivars because of their characteristic low and high geosmin concentrations, respectively (Freidig and Goldman 2014).

Field techniques
Each plot consisted of a single 3.7 m row planted using a continuous drill seeder (Planet Junior, Cole Planter Co, Albany, GA) modified with a cone attachment. Rows were 46 cm apart, and buffer rows of cv. 'Early Wonder Tall Top' or 'Red Ace' were established to extend 3.7 m and 0.9 m, respectively, beyond ends and sides of each main planting. Each LGC and HGC Cycle-Model planting comprised two randomized repetitions of 30 half sib family plots plus 6 plots of LGC check 'Touchstone Gold' or HGC check 'Bull's Blood', respectively. All LGC and HGC Cycle-Model plantings began with 30 half sib families except HGC Farm Cycle 2, as only four roots selected from the previous cycle yielded seed.
Planting took place during the first week of June in 2016, 2017, and 2018 for Cycles 0, 1, and 2, respectively (Online Resource 1a). Farm and Outreach fields were sited on Dresden silt loam at Tipi Produce in Evansville, WI and on Plano silt loam at the West Madison Agricultural Research Station (WMARS) in Verona, WI, respectively (Soilweb 2020). Both soils were well drained with high yield potential (Laboski and Peters 2012), and a soil test taken at planting per Peters and Laboski (2013) showed no notable deficiencies (Online Resource 1b). Field preparation was carried out in accordance with USDA Organic standards by Tipi Produce and WMARS staff; cultivation was entirely mechanical and mostly by hand. Plots were thinned 4 to 6 weeks after planting.

Phenotyping and laboratory analysis
Stand counts were taken 4 to 6 weeks after planting to measure germination, and relative leaf height was evaluated 6 to 8 weeks after planting to gauge ability to out-compete weeds. Approximately 11 weeks after planting, five representative plants were harvested from every plot for phenotyping (Online Resource 1a). Roots were brushed to remove soil, cut in cross-section, and photographed. Plots were evaluated on a 1 (low) to 5 (high) scale using a rubric (Online Resource 2) for root and leaf disease, crown and taproot size, smoothness, uniformity, and visual impact of both leaves and roots; interior root color and petiole color were also noted. Finally, qualitative notes and a Holistic Quality (HQ) score were added to reflect overall breeder evaluation of the plot.
After breeder evaluation, two cylindrical cores were extracted from each lower root half using a core borer with 1-cm internal diameter. Each plot was thus represented by a bulked sample of ten cores, each with uniform epidermal area but varying length. Tissue from 6 cores was ground into slurry for GC-MS analysis of geosmin concentration, and remaining tissue was analyzed by refractometry for TDS per Hanson and Goldman (2019). Relative recovery rates for GC-MS analysis were 37.16%, 40.37%, and 44.76% in 2016, 2017, and 2018, respectively.

Harvest and seed production
Horticultural and hedonic selection proceeded as detailed below, and selected roots were harvested for vernalization 12 to 13 weeks after planting. Tops were trimmed to 2.5-cm conical stubs and stored at 5°C for approximately 12 weeks. For greenhouse seed propagation, approximately 12 roots from each selected family were planted into pots 15.2 cm tall × 13.8 cm in diameter, filled with a blend of one-third Pro-Mix High Porosity (Premier Tech, Quakertown, PA) and two-thirds field soil. Before flowering, plants comprising each Farm and Outreach LGC and HGC population were moved into separate isolation rooms; plants were shaken periodically to promote pollen dispersal and intermating.

Participatory recurrent selection
Three cycles of recurrent selection were carried out in both Farm and Outreach models for both LGC and HGC populations. Selection was based on horticultural characteristics and hedonic liking in Cycles 0 and 1 (Fig. 1), while selection in Cycle 2 was based solely on horticultural characteristics. In all cycles, approximately 5 of 30 families were selected for intermating by mass pollination. In Cycles 1 and 2, half-sib family starting populations were created by harvesting seed separately from each plant, while the Cycle 3 starting population was composed of bulked seed from all plants.

Horticultural performance
Farmer breeding priorities and breeder observations guided selection for horticultural performance. Wisconsin direct market vegetable farmers rated the importance of table beet horticultural characteristics on a 5-point scale, and a Horticultural Selection Index (HSI) was created by weighting breeder ratings by mean farmer importance scores. Farmer responses were gathered via interview (n = 1 in 2016) and online survey (n = 5 in 2016 plus 24 in 2017, Online Resource 3), for Farm and Outreach models, respectively.
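The weighting step behind the HSI can be illustrated with a minimal sketch. The text states only that breeder ratings were weighted by mean farmer importance scores; the normalization by total weight, the trait names, and all numbers below are assumptions made for illustration and are not taken from the study.

```python
from statistics import mean


def horticultural_selection_index(breeder_ratings, farmer_importance):
    """Weighted mean of breeder ratings for one plot.

    breeder_ratings: trait -> breeder rating on the 1-5 rubric scale.
    farmer_importance: trait -> list of farmer importance scores (5-point scale).
    Dividing by the total weight is an assumption; the source does not say
    whether the index was normalized.
    """
    weights = {trait: mean(scores) for trait, scores in farmer_importance.items()}
    weighted_sum = sum(weights[trait] * rating for trait, rating in breeder_ratings.items())
    total_weight = sum(weights[trait] for trait in breeder_ratings)
    return weighted_sum / total_weight


# Hypothetical farmer importance scores and breeder ratings for a single plot.
importance = {
    "disease": [5, 4, 5],
    "uniformity": [3, 3, 3],
    "smoothness": [2, 4, 3],
}
ratings = {"disease": 4, "uniformity": 2, "smoothness": 5}

hsi = horticultural_selection_index(ratings, importance)
print(round(hsi, 3))
```

Normalizing keeps the index on the same 1 to 5 scale as the rubric, which makes it directly comparable with the HQ score; an unnormalized weighted sum would rank families identically within a population.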
Horticultural selection was performed within each LGC and HGC Cycle-Model population, considering both replications of each family. Families were eliminated by sequential culling, first for insufficient root availability and then for low HSI and/or HQ for one or both replications. Breeder qualitative notes aided in selecting among families with similar quantitative scores. In Cycles 0 and 1, approximately 10 of 30 families were eliminated from each LGC or HGC Cycle-Model population based on horticultural performance, and remaining families were subject to core and consumer hedonic selection (Fig. 1). All selection in Cycle 2 was based on horticultural performance, and the role of core and consumer participants shifted from hedonic selection to flavor characterization.

Fig. 1 A cycle of participatory table beet selection featuring three stages of selection followed by mass pollination. (a) Selection based on horticultural traits reduced the population from 30 to 20 half sib families. Families were grouped by appearance and evaluated for hedonic liking by (b) core and (c) consumer groups, reducing the population to 10 and 5 families, respectively. (d) The 5 selected families were intermated via mass pollination to create seed for the subsequent cycle.

Hedonic liking
In Cycles 0 and 1, families remaining after horticultural selection in each LGC and HGC Cycle-Model population were sorted into groups of approximately five families each, based on root color, petiole color, and degree of zoning. Core and consumer participants ranked beet groups based on overall hedonic liking; the two highest-ranked groups from core evaluation advanced to consumer evaluation, and the highest-ranked group in consumer evaluation advanced to seed propagation (Fig. 1). Sixteen total instances of hedonic selection occurred, comprising Cycle 0 and Cycle 1 selection from both LGC and HGC populations, in both Farm and Outreach models, by both core and consumer participants.
To prepare beet groups for sampling, an equal number of roots were selected from each family comprising the group. Whole roots were washed, steamed in pots fitted with perforated baskets for 20-22 min, until a paring knife could be inserted easily to the center of the root, and cooled to room temperature. Quarter-disk samples of 0.7-cm thickness were created by slicing roots perpendicular to the root-shoot axis and then quartering the slices. Epidermis was not removed, and samples were refrigerated until serving. Beet group samples were presented on white plates and labeled with alphanumeric codes.
Hedonic selection events for Cycle 0 and Cycle 1 took place between August and October of 2016 and 2017, respectively (Online Resource 1a). All Farm model hedonic selection was performed at Tipi Produce, while Outreach model core and consumer selection took place in Madison, WI at SKC chef variety evaluations and consumer-facing events, respectively. Core participants in hedonic selection comprised 14 and 5 farm staff in Farm Cycles 0 and 1, respectively, and 3 and 6 chefs in Outreach Cycles 0 and 1, respectively. For consumer hedonic selection in Cycles 0 and 1 combined, 120 CSA members and 119 Farm to Flavor attendees served as Farm and Outreach consumer participants, respectively (Online Resource 4). Both core and consumer participants rated beet groups on appearance, willingness to eat again, and presence of signature beet flavor. Willingness to eat again was evaluated both with a binary (Yes / No) response and by ranking groups from low (most willing) to high (least willing). Cumulative rank was used to select groups for advancement, while the binary response suggested strength of preference for the selected group. Participants were also surveyed regarding demographic information, beet consumption frequency, and attitude toward beet consumption (Online Resource 4).
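The cumulative-rank rule described above can be sketched as follows. This is an illustrative reading of the text, not the study's actual procedure: ballots are assumed to be complete rankings in which 1 is the group a participant is most willing to eat again, the group labels and ballots are invented, and alphabetical tie-breaking is an assumption, since the source does not describe how ties were resolved.

```python
def cumulative_ranks(ballots):
    """Sum each group's rank over all ballots (lower total = better liked)."""
    totals = {}
    for ballot in ballots:
        for group, rank in ballot.items():
            totals[group] = totals.get(group, 0) + rank
    return totals


def select_groups(ballots, n_advance):
    """Return the n_advance groups with the lowest cumulative rank.

    Ties are broken alphabetically by group label (an assumption).
    """
    totals = cumulative_ranks(ballots)
    ordered = sorted(totals, key=lambda group: (totals[group], group))
    return ordered[:n_advance]


# Hypothetical ballots from three core participants ranking four beet groups.
core_ballots = [
    {"A": 1, "B": 2, "C": 3, "D": 4},
    {"A": 2, "B": 1, "C": 4, "D": 3},
    {"A": 1, "B": 3, "C": 2, "D": 4},
]

totals = cumulative_ranks(core_ballots)
advanced = select_groups(core_ballots, n_advance=2)
print(totals, advanced)
```

With `n_advance=2` the function mirrors the core stage, in which two groups moved on to consumer evaluation; calling it with `n_advance=1` on consumer ballots mirrors the final stage.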

Flavor and hedonic liking evaluation
In August through October of both 2018 and 2019, Outreach LGC cultivar 'Snowglobe' and Outreach HGC cultivars 'Blushing Not Bashful' and 'Moving Target' were evaluated for perceived flavor attributes and hedonic liking against a different set of OP checks. A ballot was constructed to include hedonic liking of texture and flavor, plus five salient flavor attributes generated by a beet flavor lexicon development process: sweet, earthy, bitter, intense, and complex. Briefly, a lexicon of terms describing beet flavor was generated by SKC chefs during summer 2018 per guidelines of Lawless and Civille (2013). Projective mapping was carried out per Dawson and Healy (2018), and data were subjected to PCA. Flavor attributes in sweet, earthy, and bitter categories were most frequently mentioned; first and second PCA dimensions seemed to represent intensity and complexity (Online Resource 5). All flavor attributes and hedonic liking parameters were measured on 1 (low) to 5 (high) Likert scales. Ballots were offered both in paper form and via an online survey platform; cultivar order was randomized among participants (Online Resource 5). All samples were prepared as previously described and identified with random three-letter codes. Chef flavor evaluation was performed by SKC chefs at Madison, WI restaurants, while consumer evaluation took place in 2018 at the Culinary Breeding Network Showcase NYC and in 2019 at the UW-Madison Dept. of Horticulture campus building and SKC Farm to Flavor event. Each of five tasting instances was termed an event.
LGC and HGC populations
LGC and HGC check plots (n = 71 and n = 72, respectively, Online Resource 6) were analyzed separately for variation in geosmin concentration and TDS to inform our choice of response variables. Check plot served as the experimental unit for two-way ANOVA with respect to Cycle, Model, and Cycle x Model interaction; Tukey-corrected pairwise comparisons were performed. Both untransformed TDS and natural-log transformed geosmin concentration data sets met ANOVA assumptions. Geosmin concentration varied significantly with Cycle and Cycle x Model interaction for both LGC and HGC checks, and TDS varied significantly with Model for LGC but not HGC checks (Online Resource 1c). Due to significant environmental effects and interactions, 'proportion of check' response variables were used per Wolyn and Gabelman (1990) to reduce the influence of environmental variation when comparing LGC and HGC populations across Cycles and Models. 'Proportion of check' response variables were calculated by dividing each family mean by the LGC or HGC check mean from the same Cycle-Model combination. Measured log geosmin concentration and TDS, in turn, offer information about the flavor compounds subject to perception by participants.
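As an illustration of the 'proportion of check' calculation described above, the following sketch divides each family mean by the mean of the check plots from the same category and Cycle-Model combination. The dictionary layout, the function name, and all numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def proportion_of_check(family_means, check_plots):
    """Express each family mean as a proportion of its matching check mean.

    family_means: {(category, cycle, model): {family_id: family mean}}
    check_plots:  {(category, cycle, model): [check plot values]}
    Both layouts are assumptions made for this sketch.
    """
    result = {}
    for key, families in family_means.items():
        # The check mean comes from the same category and Cycle-Model combination.
        check_mean = mean(check_plots[key])
        result[key] = {fam: value / check_mean for fam, value in families.items()}
    return result


# Hypothetical log geosmin family means and check plot values.
example_family_means = {
    ("LGC", 0, "Farm"): {"fam01": 2.0, "fam02": 3.0},
    ("LGC", 0, "Outreach"): {"fam01": 1.5, "fam02": 4.5},
}
example_check_plots = {
    ("LGC", 0, "Farm"): [2.0, 2.0],
    ("LGC", 0, "Outreach"): [2.5, 3.5],
}
example_proportions = proportion_of_check(example_family_means, example_check_plots)
```

Because each family is scaled by a check grown in the same year and location, a value of 1.0 means the family matched its check regardless of how favorable that environment was.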
For LGC and HGC population analysis (Online Resource 6), half sib family was treated as the experimental unit and as a random source of variation. Cycle 0 LGC and HGC populations were compared with one another and with Cycle 0 check cultivar populations using Welch T-tests. Because identical LGC and HGC half sib families were planted in Farm and Outreach models in Cycle 0, family means include four replications, two from each model (N = 60, n = 30). Six LGC and six HGC Cycle-Model populations, derived from Cycles 0, 1, and 2 in both Farm and Outreach models, were compared within LGC or HGC category. Half sib family data points represented the mean of two replications within the same LGC or HGC Cycle-Model population. Analysis was performed for four response variables: log geosmin concentration (ln µg·kg⁻¹), proportion of check log geosmin concentration, TDS (%), and proportion of check TDS.
For comparison of geosmin concentration and TDS among LGC and HGC Cycle-Model populations, two-way ANOVA was performed with Cycle, Model, and Cycle x Model as fixed effects. Due to the nature of recurrent selection, the Cycle effect represented both annual growing conditions and the effect of selection, and the Model effect represented both field location and participant horticultural and hedonic preferences.
To detect family-to-family variation in geosmin concentration and TDS within individual LGC or HGC Cycle-Model populations, mixed model ANOVA was performed on each of 12 Cycle-Model populations. Half sib family, here represented by one replication, remained the experimental unit. Replications were treated as fixed blocks, while family was treated as random and tested with a log likelihood test of random effects with significance threshold α = 0.10.
To detect significant differences in flavor compounds among beet groups that might affect hedonic selection, one-way ANOVA was performed within each LGC or HGC Cycle-Model population for measured log geosmin and TDS. Family, as the mean of two replications, remained the experimental unit, and beet group was the sole fixed effect. Finally, Welch T-tests were used to test the differences in mean log geosmin and TDS between selected and non-selected families within each LGC or HGC Cycle-Model population.
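For readers less familiar with the Welch T-test used to compare selected and non-selected families, the sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom with the Python standard library. The input values are hypothetical, and the p-value is omitted because the standard library has no Student t distribution; in practice a statistics package would supply it.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Unlike the pooled-variance t-test, each sample keeps its own variance,
    which suits groups of unequal size and unequal spread.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    var_a, var_b = variance(sample_a), variance(sample_b)  # sample variances
    se_a, se_b = var_a / n_a, var_b / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, df


# Hypothetical log geosmin family means for selected and non-selected families.
selected_families = [1.0, 2.0, 3.0, 4.0, 5.0]
non_selected_families = [2.0, 4.0, 6.0, 8.0, 10.0]
example_t, example_df = welch_t(selected_families, non_selected_families)
```

The fractional degrees of freedom reflect the penalty for unequal variances: the more the two group variances differ, the further the value falls below the pooled-test figure of n_a + n_b - 2.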

Hedonic selection
Statistical tests were applied to cumulative ranks within each hedonic selection event to test the null hypothesis that all beet groups were equally well liked. The Z-test of equal proportions was used to compare cumulative ranks of the two selected and two non-selected groups in core selection events. The binomial test was used to compare cumulative ranks of first-ranked groups in both core and consumer selection to those of non-selected groups. To compare frequencies of binary (positive and negative) responses to questions regarding appearance, willingness to eat again, and signature flavor, Fisher exact tests and Z tests of equal proportions were used for core and consumer hedonic selection data, respectively. The null hypothesis for both statistical tests was that positive evaluation (e.g. of appearance) was equally frequent for all beet groups. Finally, participant demographic, attitude, and history data were compared between consumer hedonic selection participants in Farm and Outreach models using Z tests of equal proportions.
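A Z test of equal proportions for binary responses can be sketched as follows. This is a generic two-sided pooled test without continuity correction, which may differ from the exact variant the authors ran; the function name and the counts (12 of 50 versus 41 of 50 positive responses) are hypothetical and serve only to illustrate a large disparity in appearance liking.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_1, n_1, successes_2, n_2):
    """Two-sided pooled Z test of H0: both groups share one positive-response rate.

    No continuity correction is applied (an assumption of this sketch).
    Returns (z statistic, two-sided p-value).
    """
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts of positive appearance responses for two beet groups.
example_z, example_p = two_proportion_z_test(12, 50, 41, 50)
```

For the small core panels (as few as three chefs), the normal approximation behind this test is unreliable, which is consistent with the use of Fisher exact tests for core selection data.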

Flavor and hedonic liking
Mixed model ANOVA was performed on flavor and hedonic liking data with cultivar, event, and cultivar x event interaction as fixed effects and participant as a random effect. Analysis was performed within year, and Tukey-corrected pairwise comparisons (α = 0.05) were performed within event. For each cultivar, geosmin concentration (via GC-MS analysis) and TDS were measured per Hanson and Goldman (2019). Six and four samples of each cultivar were analyzed in 2018 and 2019, respectively (Online Resources 7 and 8); each ten-core bulked sample represented five roots. Relative recovery rates for GC-MS analysis were 44.76% and 46.76% in 2018 and 2019, respectively. One-way ANOVA was performed for log geosmin concentration and untransformed TDS, with cultivar as the sole fixed effect.

Initial populations
Plant breeding requires selection from a variable population, so LGC and HGC initial populations were characterized with respect to geosmin concentration and TDS.
LGC and HGC Cycle 0 families, averaged across Farm and Outreach models, were significantly different in mean geosmin concentration (P ≤ 0.001) but not in mean TDS. Similarly, LGC and HGC Cycle 0 populations were significantly different in geosmin concentration (P ≤ 0.001) but not in TDS from their respective Cycle 0 checks (Fig. 2). Standard deviations of geosmin concentration for Cycle 0 LGC and HGC populations were 2.53 and 9.30 µg·kg⁻¹ geosmin, respectively, reflecting the documented phenomenon that variation in geosmin concentration increases with concentration itself (Hanson and Goldman 2019; Maher and Goldman 2017). In contrast to their markedly different variances in geosmin concentration, Cycle 0 LGC and HGC populations showed almost exactly equal standard deviations for TDS, at 1.04% and 1.05%, respectively.
Individual families in the LGC Cycle 0 population showed much less variation in geosmin concentration, on average, than HGC Cycle 0 families (Fig. 3). However, variation in geosmin concentration among families was moderately significant for Cycle 0 Farm and Outreach LGC populations (P ≤ 0.03 and P ≤ 0.07, respectively) but not significant for their HGC counterparts (Table 1). Thus, large within-family variance in Cycle 0 HGC families likely rendered among-family variation statistically non-significant, but lower within-family variation in Cycle 0 LGC families allowed for statistically significant variation in geosmin concentration among families. While within-family variance in TDS was similar for Cycle 0 Farm and Outreach populations (Fig. 3), three of four Farm or Outreach Cycle 0 populations showed significant variation among families in TDS (Table 1).
Geosmin concentration varied significantly among families in LGC but not HGC Cycle 0 populations, a phenomenon consistent with higher heritability in LGC than HGC populations (Maher and Goldman 2017), suggesting that larger potential exists in LGC than HGC populations for shifting mean geosmin concentration. Variation among families notwithstanding, geosmin concentration of LGC and HGC populations is distinct from that of 'Touchstone Gold' and 'Bull's Blood' checks, so selection could result in cultivars with geosmin concentration distinct from these check cultivars, even in the absence of a shift in mean geosmin concentration.
TDS varied significantly among families in three of four Cycle 0 populations, and moderate heritability has been shown for TDS in B. vulgaris (Würschum et al. 2011; Hanson and Goldman 2019). However, Cycle 0 LGC and HGC populations present a relatively small range of TDS values from which to select, and TDS values are similar to those of 'Touchstone Gold' and 'Bull's Blood' checks. Thus, while it is plausible that mean TDS could shift via recurrent selection, it is unlikely that mean TDS in resulting cultivars will be substantially different from that in check cultivars.
Finally, both LGC and HGC Cycle 0 populations showed segregation at qualitative root color loci R and Y (Goldman and Austin 2000), yielding color classes of white, yellow, and red. The red class included pink, purple, and coral hues, with varying degrees of zoning. Yellow roots were much more frequent in the LGC than HGC Cycle 0 population. Thus, LGC and HGC populations offered the opportunity to select for both flavor compounds and root pigmentation.

Horticultural selection
Farmer selection priorities were similar but not identical between Farm and Outreach models (Table 2). Farmers in both models indicated that high germination, ability to out-compete weeds, and disease resistance were top priorities for selection; less important were aesthetic qualities like uniformity, taproot and crown size, and root shape. This result is consistent with other surveys of organic farmer breeding priorities, which show that while priorities vary by crop and region, top priorities often include disease resistance, weed competitiveness, and resilience against abiotic stress (Brouwer and Colley 2016;Hultengren et al. 2016;Dawson et al. 2017;Hubbard and Zystro 2016).  HSI values ranged from 42 to 103 over all Cycle-Model populations, with grand mean 76.4, and HQ ratings ranged from 1 to 5 over all Cycle-Model populations with grand mean 3.7 (Online Resource 1d). In Cycles 0 and 1, 10 of 30 families in each LGC and HGC Cycle-Model population were eliminated based on these horticultural performance measures, and remaining families were sorted into hedonic selection groups based on root color, petiole color, and degree of zoning. In Cycle 2, however, variation in visual phenotype had been sufficiently reduced in each LGC and HGC population that families could not be grouped meaningfully into hedonic selection groups by appearance. Thus, all selection in Cycle 2 was based on horticultural performance, and hedonic selection was replaced by flavor evaluation. Five of 30 families were selected from each Cycle 2 LGC population; 6 of 30 families were selected from the Cycle 2 Outreach HGC population; and the only four Cycle 2 Farm HGC families that survived greenhouse seed production were retained to compose that population.

Hedonic selection
In most instances of core and consumer hedonic selection, no strong preference for the selected group was expressed. Of 16 instances of selection based on willingness to eat again, only 5 showed significant differences in cumulative rank (Table 3). Similarly, only 3 of 16 selection instances showed significant differences in binary willingness to eat again (p < 0.10, Table 3). Two of those cases (core selection in LGC Farm Cycle 0 and consumer selection in LGC Outreach Cycle 0) were associated with significant rank differences, indicating consistent preference for the selected group. However, the four other instances of either significant rank difference or significant difference in binary willingness to eat again were not associated with significance at the other indicator of hedonic liking, showing either moderate or inconsistent preference for the selected group. Ten of 16 selection instances showed no significant differences in either cumulative rank or binary willingness to eat again, indicating either inconsistent preference or a consensus of equal preference among participants.
While few selection instances detected differences in preference among beet groups, even fewer detected differences in perceived signature beet flavor, which has been associated with the aroma of geosmin (Tyler et al. 1979). Only two of 16 selection instances (core selection in LGC Farm Cycle 0, and consumer selection in HGC Outreach Cycle 1) showed such differences (Table 3). Both groups with significantly more perceived signature beet flavor were selected, the former by a significant difference in cumulative rank, and the latter by only a numerical difference. However, the remaining 14 selection instances, including three instances of selection by significant difference in cumulative rank, were not associated with significant difference in perceived signature beet flavor. Thus, perceived signature beet flavor was not consistently associated with preference.
Appearance liking also was not consistently associated with preference. Of 16 selection instances, 5 showed significant differences in appearance liking (Table 3). In two cases of core selection (in LGC Farm Cycle 0 and HGC Farm Cycle 1), the significant difference in appearance liking was associated with greater willingness to eat again, both by ranked and binary indicators, by significant margins in the former case and numerical but non-significant margins in the latter (Online Resource 1e).
However, in the three instances of consumer selection with significantly different appearance liking between groups, one or both metrics of preference were in favor of the group with less-liked appearance. Specifically, in consumer tasting for both HGC Farm Cycle 0 and LGC Outreach Cycle 1, the group with less-liked appearance was selected for advancement, albeit by non-significant differences in cumulative rank (Online Resource 1f); in the former case, the group with significantly less-liked appearance also showed significantly higher participant willingness to eat again. This trend occurred again during consumer selection for HGC Outreach Cycle 1; the group with more-liked appearance was selected, but by a non-significant difference in cumulative rank, and willingness to eat again was roughly equal between the less- and more-liked groups. The beet groups with significantly more-liked appearance were of diverse color motifs and zoning levels, but all contained at least some pigmented roots.
The two largest disparities in appearance liking were found when groups composed entirely of white roots were compared with pigmented groups. In consumer tasting for LGC Outreach Cycle 1, white beet group C1_WL2 received only 24% appearance liking, while its red, yellow, and white comparison group received 82% appearance liking (p < 0.001). For HGC Outreach Cycle 1, white beet group C1_WH4 received 37% appearance liking, compared with 80% liking of its pink-purple-coral alternative (p < 0.001). In both cases, though, willingness to eat again was roughly equal between groups, and in the former case, the white beet group was selected for advancement. Thus, in these cases, perceived flavor was positive enough to offset a significantly less-liked appearance. In terms of expectation theory, it appears that a negative expectation, based on the intrinsic cue of non-traditional appearance, was followed by positive disconfirmation of expectation based on flavor. Indeed, food with low color has been associated with expectation of low flavor (Spence 2015). According to expectation theory, when expectations are disconfirmed, either positively or negatively, consumer evaluation either contrasts with or assimilates to expectations. Contrast from expectation occurs when perceived quality is anchored by the current sensory experience, causing the consumer to adjust expectations for future experiences with the food, while assimilation occurs when perceived quality is anchored by expectations.
The size of the discrepancy between expected and perceived quality, along with the strength of the expectation, determines whether contrast or assimilation is experienced (Piqueras-Fiszman and Spence 2015). Because this experiment was not structured to control for the intrinsic cue of color, we cannot know how consumers would have evaluated pigmented versions of these particular white beet groups. Thus, we cannot know if white root color amplified hedonic liking scores or only failed to decrease them substantially. However, because hedonic liking scores are so much higher than appearance liking scores, it appears that the phenomenon of contrast took place, plausibly driven by white root color. Notably, both white beet groups were selected for consumer tasting by chef participants in Outreach Cycle 1. That is, chefs, as culinary professionals who create both cuisine and expectations thereof, played a central role in selecting beet groups that evoked positive disconfirmation of expectations.
Two cycles of hedonic selection were characterized by relatively few instances of preference for a specific beet group, and similarly few instances in which signature beet flavor was perceived to be different among groups. Notably, the strongest differences in any measurement of hedonic liking were with respect to appearance, not flavor. To discern whether this paucity of flavor preference owed to a lack of variation in flavor compounds, we analyzed geosmin concentration and TDS between and within LGC and HGC Cycle-Model populations.

Indirect selection for geosmin concentration
LGC Farm and Outreach Cycle-Model populations did not differ significantly in measured geosmin concentration or proportion of check geosmin concentration in Cycle 0, but they diverged in both geosmin metrics after two cycles of selection (Table 1, Fig. 4). The Outreach LGC population increased in mean geosmin concentration from 7.9 µg·kg⁻¹ in Cycle 0 to 9.9 µg·kg⁻¹ in Cycle 2, but this change was not statistically significant. However, this shift was reflected in a statistically significant increase in proportion of check geosmin between Cycles 0 and 2. In contrast, the Farm LGC population decreased significantly in mean geosmin concentration, from 7.7 µg·kg⁻¹ to 4.9 µg·kg⁻¹, between Cycles 0 and 2, but because LGC check geosmin concentration shifted similarly over all cycles (Online Resource 1g), this change in measured geosmin concentration was not reflected in a significant difference in proportion of check geosmin. While significant among-family variation in geosmin concentration was found in all four LGC Cycle 0 and Cycle 1 populations at p ≤ 0.10, beet groups for core hedonic selection were not significantly different in geosmin concentration in these Cycle-Model populations. However, in two cases (LGC Farm Cycle 0 and LGC Outreach Cycle 1), geosmin concentration did differ significantly between the two beet groups offered for consumer selection (Table 3). In each of those instances, mean geosmin concentration of the selected group was also significantly different from that of all non-selected families (Table 1). In the Cycle 0 Farm population, mean geosmin concentration of selected families was lower than that of the non-selected families, at 5.53 µg·kg⁻¹ and 8.09 µg·kg⁻¹, respectively. In contrast, mean geosmin concentration in Cycle 1 Outreach selected families was higher than that of non-selected families, at 13.52 µg·kg⁻¹ and 9.24 µg·kg⁻¹, respectively.
The valence of these significant differences is in accord with the ultimate divergence in both geosmin concentration and proportion of check geosmin concentration between the Farm and Outreach LGC populations. Thus, two cycles of selection within LGC populations for horticultural performance and hedonic liking resulted in significant shifts in mean geosmin concentration, but a causal link cannot be assumed between these selection parameters and geosmin concentration.
While Farm and Outreach LGC populations diverged in geosmin concentration over two cycles of selection, HGC Farm and Outreach populations fluctuated in geosmin concentration but did not diverge (Table 1, Fig. 4). Even though the same HGC families composed HGC Cycle 0 in both models, mean geosmin concentration was significantly lower in the Outreach HGC population than the Farm HGC population in Cycle 0. Highlighting the increased variability of geosmin concentration in HGC populations, crossover Cycle x Model interaction occurred with respect to geosmin concentration between Cycles 0 and 1 in the HGC populations. HGC check geosmin concentration in Farm and Outreach models varied similarly to that of the relevant HGC populations, so proportion of check geosmin concentration showed no fluctuation within model and only diverged between models in Cycle 1 (Online Resource 1g). The HGC populations ended two cycles of selection by converging at a geosmin concentration significantly lower than that in the HGC Farm Cycle 0 population but not significantly different than that in the HGC Outreach Cycle 0 population. While significant between-cycle variation occurred in the HGC populations with respect to measured geosmin concentration, lack of change in proportion of check geosmin concentration shows that this variation represented fluctuation rather than directional change.
While every LGC Cycle 0 and 1 population showed significant variation in geosmin concentration among families, this was true of only one HGC Cycle 0 or 1 population: HGC Outreach Cycle 1 (Table 1). However, no difference in geosmin concentration was observed between beet groups offered to core or consumer participants in any HGC Cycle 0 or 1 population (Table 3), and geosmin concentration did not differ significantly between selected and non-selected families for any HGC population (Table 1). The lack of significant variability in geosmin concentration in HGC Cycle-Model populations, both among all families and between selected and non-selected families, is consistent with the lack of significant change in geosmin concentration for HGC populations over two cycles of selection.
To sum up, LGC populations diverged significantly in geosmin concentration over two cycles of selection for hedonic liking, while HGC populations did not.
LGC populations showed significant family-to-family variation in geosmin concentration for all LGC Cycle 0 and 1 populations, while this was true for only one of four HGC Cycle 0 or 1 populations. For LGC populations, two selection events occurred in which selected families differed significantly in geosmin concentration from non-selected families, but no such differences were present for HGC populations. Thus, as portended by analysis of Cycle 0 LGC and HGC populations (Figs. 2, 3), the increased variance of geosmin concentration in HGC families was associated with lack of significant change in HGC population mean geosmin concentration over the course of selection. In contrast, relatively lower variance of geosmin concentration in LGC families was associated with significant shifts in LGC population mean geosmin concentration.
The fact that the variance of geosmin concentration covaries with concentration itself has practical breeding implications. Heritability for this trait is higher in LGC than HGC populations, so the population mean can be shifted more readily in LGC populations, whether via direct selection for laboratory-measured geosmin concentration (Maher and Goldman 2017) or indirect selection for associated sensory traits.
Population mean geosmin concentration is also vulnerable to shifting via drift if population size is suddenly reduced. Thus, maintenance of geosmin concentration, and its contribution to holistic flavor, within OP table beet cultivar populations necessitates use of large population sizes for seed production. Critically, though, selection for geosmin concentration is an apt breeding goal only if it correlates with hedonic liking and is feasible within the resource limitations of a breeding program. Laboratory measurement of geosmin concentration is expensive, so indirect selection via another sensory attribute, like perceived signature beet flavor or earthy aroma, is an appealing strategy. Thus, it is necessary to examine the connections between geosmin concentration, hedonic liking, and perceived flavor.
While divergence in LGC population mean geosmin resulted from consumer selection for hedonic liking, the observed shifts in geosmin concentration cannot be directly linked to hedonic liking of geosmin. That is, two critical instances of consumer selection were identified (Farm Cycle 0 and Outreach Cycle 1) at which the LGC Farm and Outreach populations diverged with respect to mean geosmin concentration. In these instances, the selected groups in the LGC Farm and Outreach populations were significantly lower and higher in geosmin concentration, respectively, than the comparison groups presented for consumer hedonic selection. However, no significant difference in consumer willingness to eat again, either as cumulative rank or binary choice, was observed in either instance (Table 3).
Aggregated hedonic liking data support the notion that geosmin concentration is not associated with hedonic liking. That is, no significant difference in willingness to eat again was found between LGC and HGC groups, aggregated over Cycles 0 and 1, for either core or consumer participants (Table 4), despite the fivefold difference in geosmin concentration between LGC and HGC grand means (Table 1). Hedonic liking of table beet groups, then, did not center around geosmin concentration, and the shifts observed in geosmin concentration within the LGC populations appear to be due to drift rather than selection. Perceived signature beet flavor was inconsistently linked to measured geosmin concentration. In the two instances in which consumers selected between beet groups significantly different in geosmin concentration (LGC Farm Cycle 0 and LGC Outreach Cycle 1), neither was associated with a significant difference in perceived signature beet flavor. During core selection for Farm Cycle 0, the first-ranked group was perceived to be significantly higher in signature beet flavor than the comparison groups, despite the fact that at 5.53 µg·kg⁻¹, its geosmin concentration was among the lowest of all LGC groups. In contrast, consumers in HGC Outreach Cycle 1 perceived the beet group with geosmin concentration 51.00 µg·kg⁻¹ to have significantly more beet flavor than one with 51.99 µg·kg⁻¹ (Online Resource 1f), despite the fact that these concentrations were almost identical. Thus, perceived signature beet flavor was associated with neither higher nor divergent geosmin concentration. Tyler et al. (1979) found that participants perceived signature beet aroma in cooked beet juice when geosmin concentrations were between 0.36 µg·kg⁻¹ and 5.8 µg·kg⁻¹.
While the aforementioned LGC group had geosmin concentration consistent with this range, the HGC group had tenfold greater geosmin concentration, well into the range deemed "too earthy to be beetlike" by Tyler's participants. However, due to the difference in physicochemical matrix between beet juice and solid beet samples, and the difference between olfactory and oral sensory perception, it should not be expected that detection thresholds would align.
Core hedonic selection participants noted signature beet flavor significantly more often for HGC groups than LGC groups, aggregated across Cycles 0 and 1 (p < 0.001, Table 4). However, participants were informed that this project was undertaken to gauge preference for beets with varied geosmin concentration, and because they were charged with choosing a group from within both LGC and HGC categories, the categories were so identified. Chefs, as culinary professionals, may have associated signature beet flavor with increased geosmin concentration, and this may have increased their tendency to note signature beet flavor in HGC groups. Because prior knowledge might have influenced chefs' perception of signature beet flavor, and because this attribute was not significantly associated with geosmin concentration among consumers, this phenotype should not be relied upon for indirect selection for geosmin concentration.

Indirect selection for TDS
LGC populations diverged in measured TDS but not proportion of check TDS, in contrast to their divergence in both measures of geosmin concentration (Table 1, Fig. 5). Initial Farm and Outreach LGC populations were not significantly different in TDS or proportion of check TDS; TDS fluctuated in the LGC Farm population, but proportion of check TDS did not change. Significant among-family variation for TDS was found in both Farm and Outreach Cycle 0 LGC populations, but hedonic selection groups were not significantly different in TDS for any LGC Cycle 0 or 1 population (Table 3). Likewise, selected families were not significantly different in TDS from non-selected families for these populations (Table 1). Thus, while LGC Farm and Outreach populations diverged in measured TDS, this shift is not associated with divergence in proportion of check TDS or differences in TDS among hedonic selection groups or between selected and non-selected families in any Cycle-Model population. Both HGC populations followed the trend of the Outreach LGC population with respect to TDS. Among HGC Cycle-Model populations, TDS varied significantly among families in both Farm Cycle 1 and Outreach Cycle 0, but hedonic selection groups did not vary significantly in TDS for any HGC Cycle 0 or 1 population (Table 3). In two cases (Cycle 0 in both HGC models), selected families were significantly higher in TDS than non-selected families (p < 0.10). While the valence of these differences is consistent with the upward shifts in mean proportion of check TDS between HGC Farm Cycles 0 and 1, and between HGC Outreach Cycle 0 and 2, these shifts are of small magnitude and were not reflected in significantly different measured TDS. Thus, only weak evidence connects hedonic selection events to meaningful shifts in mean TDS in HGC populations.
The lack of significant shift in measured TDS for any LGC or HGC population over the course of selection is unsurprising, given the known moderate heritability of TDS in table beet (Hanson and Goldman 2019) and the relatively small range of TDS values in initial LGC and HGC populations. Because no selection instances occurred in which beet groups diverged significantly in TDS, it was not possible to examine the relationship between TDS and hedonic liking. To develop table beet cultivars with novel TDS levels, and to attempt association of TDS levels with hedonic liking, it is likely that initial breeding populations would need to show greater variation in TDS.

Novel flavor identified cultivars
Farm and Outreach LGC populations have been disclosed for commercialization as OP cultivars 'Evansville Ember' and 'Snowglobe', respectively (Fig. 6). Both novel LGC cultivars showed consistently low geosmin concentration in 2018 and 2019; 'Evansville Ember' was numerically but not significantly lower in geosmin concentration than 'Snowglobe' in both years (Table 5). However, 'Evansville Ember' showed significantly lower geosmin concentration than 'Touchstone Gold' in both years, while 'Snowglobe' did not. 'Evansville Ember' was lower in geosmin concentration than all commercial cultivars evaluated here, but several commercial cultivars, like 'Blankoma' and 'Red Ace' per Freidig and Goldman (2014) and Maher and Goldman (2017), respectively, are documented to have quite low geosmin concentration. In addition, while small numerical differences were found in TDS among novel LGC cultivars, HGC cultivars, and check varieties, none were statistically significant (Table 5). Thus, while neither new LGC cultivar necessarily represents novelty among commercial cultivars with respect to geosmin concentration or TDS, 'Evansville Ember' and 'Snowglobe' are unique for their characterized flavor components and their development via participatory recurrent selection in organic conditions. In addition, 'Snowglobe' is one of few table beet cultivars with white roots and green petioles, potentially providing farmers with an alternative white-rooted specialty cultivar.
Farm and Outreach HGC populations have been disclosed for commercialization as 'Evansville Orbit' and 'Moving Target', respectively (Fig. 6). An additional cultivar, 'Blushing Not Bashful', was selected from the Outreach HGC population due to chef enthusiasm for the cultivar's flavor and novel appearance (Online Resource 1e), even though it narrowly missed consumer selection (Online Resource 1f). These three HGC cultivars did not have significantly different geosmin concentration in either 2018 or 2019 (Table 5), and their change in numerical geosmin concentration between years reflects the high variability of geosmin concentration in high geosmin genotypes. All three HGC cultivars were numerically higher in geosmin concentration than 'Bull's Blood' in both years, and in each year, one HGC cultivar was significantly higher than this HGC check. In addition, in 2019 both 'Blushing Not Bashful' and 'Evansville Orbit' were significantly higher in geosmin concentration than 'Chioggia' (Table 5), another OP cultivar known for relatively high geosmin concentration (Freidig and Goldman 2014). Thus, these HGC cultivars represent novelty within the commercial table beet market with respect to geosmin concentration, and they are also notable for having been selected in and for organic farms and marketplaces. Moreover, 'Evansville Orbit' and 'Moving Target' represent novel additions to the zoned beet market class, and 'Blushing Not Bashful' is the only known commercial cultivar with both white roots and pink petioles. This multidimensional novelty-in flavor, appearance, and breeding approach-offers potential for both greater on-farm genetic diversity (Hubbard and Zystro 2016) and market niche creation (Lyon et al. 2015).
Selection in organic conditions allowed for market-relevant phenotypic expression but does not ensure broad adaptation to organic growing conditions. Cultivars were selected under organic growing conditions in Wisconsin-constituting selection in the target environment for this specialty cultivar-and according to breeding priorities of Wisconsin direct market farmers. Parallel selection was not carried out in conventional conditions, so it cannot be known if PPB allowed for increased response to selection for the suite of horticultural and sensory traits this project prioritized. However, performing selection in the target environment allowed for market-relevant expression of phenotypes and phenotypic interactions. Despite the farmer-engaged nature of this project, each cultivar was selected on only one farm. In some cases, selection in organic growing conditions has yielded cultivars that show increased performance on organic farms, but in other cases, selection on a single organic farm has resulted in cultivars with limited adaptation (Lyon et al. 2018). In addition, correlation between variety performance on individual farms and that on research stations was lower in organic than conventional conditions (Singh et al. 2011). Because organic farm environments are heterogeneous, decentralized trialing will be critical to gauge the appropriateness of these PPB-developed varieties in regions or microclimates outside those in which they were developed.

Perceived sweetness
Perceived sweetness varied strongly with both cultivar and cultivar x event interaction in both 2018 and 2019 (Table 6). Cultivar x event interactions indicate that participant groups perceived different levels of sweetness in the same cultivar, and these occurred with respect to different cultivars each year (Table 7). However, some cultivars consistently were perceived as more or less sweet than comparison cultivars: 'Blushing Not Bashful' was in the group with highest perceived sweetness at all five events, and 'Snowglobe' was in the group perceived most sweet in both 2018 events and in two 2019 events (Fig. 7). Consistencies appeared for low perceived sweetness as well: 'Badger Flame' was in the group perceived least sweet at both 2018 tasting events, and 'Chioggia' was in the group perceived as least sweet in all 2019 tasting events.
Notably, no significant differences in TDS were found among representative samples of five 2018 cultivars (excluding 'Badger Flame', due to lack of data) or any 2019 varieties (Table 5). Although significant differences in perceived sweetness occurred at all tasting events, and although certain cultivars were perceived as consistently more or less sweet across several events, these differences were not associated with differing TDS levels. This finding is consistent with Bach et al. (2014), who found no correlation between perceived sweetness and individual saccharide levels despite finding significant differences in both perceived sweetness and saccharide levels among table beet cultivars. In both cases, this lack of correlation could owe to the presence of bitter compounds, volatile compounds, or both. Bitter compounds have been shown to mask perceived sweetness in Brassica vegetables (Beck et al. 2014), and the volatiles correlated with liking have been shown to enhance perceived sweetness in tomato (Wang and Seymour 2017) and blueberry (Klee and Tieman 2018). Thus, it is possible that variable concentrations of geosmin-the volatile molecule around which this project revolves-or bitter compounds like phenols, flavonoids, and saponins could have driven observed differences in perceived sweetness, even in the absence of significant difference in TDS among cultivars.
It is also notable that 'Blushing Not Bashful' and 'Snowglobe', the two cultivars most often perceived as significantly sweeter than comparisons, have white color. It is possible that, as posited previously, the white color of these cultivars created an expectation of low flavor, such that positive disconfirmation of expectations occurred upon tasting roots with unexceptional TDS levels. While this remains possible, white check cultivar 'Avalanche' was perceived as significantly more sweet than other cultivars in only one of three 2019 events. Thus, it cannot be concluded that white color alone was responsible for the perception of high sweetness in 'Blushing Not Bashful' and 'Snowglobe.' To associate perceived sweetness with TDS in table beet more precisely, sensory evaluation and TDS measurement would need to be performed on individual roots, and cultivars would ideally represent a wider range of TDS than was present here.

Perceived earthiness
Like perceived sweetness, perceived earthiness varied significantly with cultivar in both 2018 and 2019, and with event in 2019 (Table 6). In contrast with perceived sweetness, no cultivars were perceived as consistently more or less earthy than others across all flavor evaluation events. Instead, differences in perceived earthiness varied by year and taster group. In both 2018 tasting events, 'Snowglobe' was perceived to be significantly less earthy than at least one variety: 'Bull's Blood' among consumers, and both 'Badger Flame' and 'Touchstone Gold' among chefs (Table 7). In 2019, chef and horticulture department participants detected no differences in perceived earthiness among cultivars, but consumers perceived 'Avalanche' to be significantly less earthy than 'Moving Target', 'Snowglobe', and 'Chioggia'.
While no significant differences in TDS were present among varieties within each year's tasting set, strongly significant differences in geosmin concentration were present. In both 2018 and 2019, samples of Outreach HGC cultivars 'Moving Target' and 'Blushing Not Bashful' were significantly higher in geosmin concentration than Outreach LGC cultivar 'Snowglobe'; in addition, both HGC cultivars were significantly higher in geosmin concentration than checks 'Touchstone Gold' in 2018 and 'Avalanche' in 2019 (Table 5). If geosmin concentration were associated with perceived earthiness in steamed beet samples, both HGC cultivars would be perceived as significantly more earthy than 'Snowglobe' among at least some participants. Instead, no significant differences in perceived earthiness were detected between HGC cultivars and LGC cultivar 'Snowglobe' in any tasting event. Moreover, chefs in 2018 and consumers in 2019 perceived significant differences in earthy flavor between 'Snowglobe' and check cultivars 'Touchstone Gold' and 'Avalanche', respectively, despite the lack of significant differences in geosmin concentration. Thus, the large and consistent differences in geosmin concentration between HGC and LGC cultivars were not detected as differences in perceived earthiness by either consumer or chef participants, but differences in earthiness were perceived between cultivars with no measurable difference in geosmin concentration.
Significant differences in perceived earthiness were not associated with measured geosmin concentration, but they were consistently associated with differences in perceived sweetness and perceived bitterness. Perceived earthiness was negatively correlated with perceived sweetness in both 2018 and 2019; in 2018 this correlation was statistically significant (Table 8).
While it is clear that lower perceived sweetness is associated with higher perceived earthiness, this phenomenon cannot be tied to variation in TDS or geosmin concentration. That is, TDS did not vary significantly in either set of cultivars (Table 5), so variation in sucrose cannot be related to variation in perceived earthy flavor, and significant variation in geosmin concentration does not associate systematically with perceived earthiness or perceived sweetness. Thus, although the presence of volatile molecules can affect perceived sweetness (Klee and Tieman 2018), geosmin concentration does not appear to have varied systematically with perceived sweetness in this study. Perceived earthiness and perceived bitterness, in contrast, show significant positive correlation in both 2018 and 2019. Bitter compounds were not measured in this study, so while it is possible that saponin, flavonoid, or phenol concentration covaried with geosmin concentration, it is also possible that bitter and earthy flavor attributes were conflated by flavor evaluators. Thus, neither measured geosmin concentration nor TDS explains variation in perceived earthiness, and it is not clear whether perceived earthiness represents a flavor quality distinct from either high perceived bitterness or low perceived sweetness.
Differences in geosmin concentration, as measured using slurry from raw beet, were not reflected in differences in perceived earthy flavor among either chef or consumer participants. Admittedly, because geosmin is a volatile compound, it was likely substantially reduced in all roots through the process of steaming (Tyler et al. 1979). Geosmin concentration was not measured in cooked roots for this study, nor was it measured in the individual roots sampled by participants, so it is not possible to relate a sample's actual geosmin concentration with the earthy flavor perceived in it. Instead, this breeding project was undertaken to maximize external validity: table beets are most frequently eaten cooked, so consumer acceptance of novel LGC and HGC varieties hinges on their perceived flavor in cooked form. Moreover, since beets have traditionally been consumed in cooked form, historical consumer complaints about earthy flavor in table beet have been leveled mostly against canned, boiled, steamed, or pickled beets. Thus, it is relevant to measure consumer hedonic liking of beets with uniquely high or low geosmin in cooked form. Since beets are widely characterized as 'earthy' in both popular media (Hartke 2020) and academic literature (e.g. Liato and Aïder 2017), it is appropriate to suppose that the term 'earthiness' would be reflected in a commonly understood flavor experience. However, the present research shows that to untrained tasters not provided with a sensory standard for earthy aroma, 'earthy' may be an imprecise descriptor for table beet flavor. Indeed, untrained tasters tend to use more diverse vocabulary to describe flavor than do sensory experts (Dawson and Healy 2018), although flavor attributes can interact even for trained sensory evaluators (Beck et al. 2014).

Hedonic liking
Like perceived sweetness and perceived earthiness, flavor liking varied significantly in both 2018 and 2019. Cultivar was a significant source of variation in flavor liking in both years; event and cultivar x event interaction were also significant sources of variation in 2019 (Table 6). In 2018, flavor liking of 'Blushing Not Bashful' was significantly higher than that of 'Badger Flame' in both flavor evaluation events; among chefs, 'Snowglobe' was also significantly more liked than 'Badger Flame' (Table 7). In 2019, both consumer and horticulture department participants liked 'Blushing Not Bashful' flavor significantly more than that of 'Chioggia', and consumers also liked 'Blushing Not Bashful' significantly more than 'Snowglobe'. As with perceived sweetness, a consistent theme emerged in which 'Blushing Not Bashful' was in the most liked pairwise-comparison group in all four events in which significant differences in liking were detected.
Flavor liking showed significant positive correlation with perceived sweetness in both 2018 and 2019, and liking showed negative correlation-albeit significant only in 2018-with perceived bitterness in both years. The correlation of perceived sweetness and liking is unsurprising given the essential role of sugar in metabolism and nutrition (Roper and Chaudhari 2017), and indeed it has been observed in tomato, strawberry, and blueberry (Klee and Tieman 2018), among other foods. Likewise, bitterness can signal the presence of antinutritive compounds and is negatively correlated with liking in many foods, including Brassicas (Beck et al. 2014). The valence of the correlation between perceived earthiness and flavor liking, however, shifted from significantly negative in 2018 to weakly positive in 2019. Thus, perceived earthiness is consistently associated neither with measured geosmin concentration nor with an experience of hedonic liking.
Consistent relationships between perceived sweetness, earthiness, and bitterness were demonstrated over both years (Table 8), but perceived intensity and complexity showed conflicting association with the three main flavor attributes between years. For instance, while perceived earthiness showed significant positive correlation with both intensity and complexity in 2019, non-significant correlations of negative valence were shown in 2018. While perceived complexity showed significant positive correlation with flavor liking in both 2018 and 2019, the lack of consistent association between complexity and perceived sweetness, earthiness, or bitterness makes the term of limited use. Perceived intensity, likewise, showed positive correlation, albeit non-significant, with flavor liking but inconsistent association with other perceived flavor attributes. Moreover, perceived intensity and complexity varied among cultivars in only one of two years (Table 6), adding to the case that these terms are not sufficiently precise to differentiate among table beet cultivars or describe their flavor.
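The pairwise comparisons among perceived attributes discussed above can be illustrated with a minimal sketch. The ratings below are hypothetical placeholder values, not data from this study, and the use of the Pearson coefficient is an assumption made for illustration; the function names `pearson` and `correlation_table` are likewise invented here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_table(data):
    """Return {(attribute_a, attribute_b): r} for every unordered attribute pair."""
    names = list(data)
    return {
        (a, b): pearson(data[a], data[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical ratings (NOT data from this study), one value per tasted sample.
ratings = {
    "sweetness": [7, 5, 6, 3, 8, 4],
    "earthiness": [3, 5, 4, 7, 2, 6],
    "bitterness": [2, 4, 3, 6, 1, 5],
    "liking": [8, 5, 6, 3, 9, 4],
}

for (a, b), r in correlation_table(ratings).items():
    print(f"{a:>10} vs {b:<10} r = {r:+.2f}")
```

Running such a table separately for each year's ratings would show whether the sign of a given pair, for example earthiness and liking, holds or reverses between years, which is the kind of inconsistency described for intensity and complexity.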

Conclusion
Participatory plant breeding and rapid sensory evaluation revealed that geosmin concentration is not the central determinant of hedonic liking, perceived signature beet flavor, or perceived earthy flavor in steamed table beet. Over two cycles of selection, participants expressed no significant preference for LGC versus HGC beet groups, despite highly significant differences in geosmin concentration (Table 4). In the two instances of hedonic selection between beet groups significantly different in geosmin concentration, no preference was shown for either group (Table 3). Moreover, flavor evaluation showed that perceived earthiness was neither associated with geosmin concentration (Fig. 6; Table 5) nor associated consistently with hedonic liking (Table 8).
While neither geosmin concentration nor perceived earthiness was consistently associated with hedonic liking of table beet, several other characteristics were. High perceived sweetness and low perceived bitterness were consistently correlated with hedonic liking, although perceived sweetness was not associated with variation in TDS. In addition, evidence accrued for the role of root pigmentation in hedonic liking. Pigmented roots were more liked than white roots, although no specific color or zoning motif was expressly preferred among pigmented roots. However, in an apparent demonstration of positive disconfirmation of expectations, white roots were well liked despite-or perhaps because of-their unusual appearance. Finally, one novel cultivar with white roots, 'Blushing Not Bashful', was in both the most liked comparison group and the group perceived as most sweet in all flavor evaluation events. Because two other white beet cultivars were present, this positive evaluation cannot be attributed solely to the expectations generated by its white color. Instead, 'Blushing Not Bashful' seems to possess a unique flavor profile, the nuance of which was not captured by the simple flavor descriptors offered.
Chef and consumer participation in hedonic selection and flavor evaluation lent high external validity to our investigation of consumer acceptance of novel, flavor identified table beet germplasm. That is, beets were sampled in cooked rather than raw form; no efforts were made to obscure the intrinsic cue of color; and tasting took place in restaurant settings for chefs and at farmers-market-style sampling tables for consumers. Participants were not supplied flavor standards, in an effort to discern the intrinsic meaning of attributes like sweetness, earthiness, and 'signature beet flavor.' Thus, we learned that when beets are presented in steamed form, cultivars with a broad range of geosmin concentration are well accepted by consumers and that manipulating expectation-via novel appearance-may be as powerful as manipulating flavor compounds when it comes to influencing hedonic liking. However, external validity often comes at the cost of internal validity (Campbell and Stanley 1963), as was the case for this breeding project. Sensory studies with increased precision are needed to associate flavor compounds with perceived flavor attributes. Such studies could further explore cultivar x preparation interaction after Bach et al. (2014), evaluate aroma and taste separately, and/or use flavor standards to anchor participant understanding of flavor descriptors.
This PPB project compared two models: a Farm model based on a single farm, and an Outreach model drawing input from a broader group of farmers, chefs, and consumers. While both yielded OP cultivars that appear to be effective and may be marketable, the Outreach model showed distinct advantages due to its inclusion of chefs and association with the SKC. First, chef involvement was key in selecting 'Snowglobe' and 'Blushing Not Bashful', two white rooted cultivars that apparently evoked positive disconfirmation of expectations among consumer participants. Creating expectation around food is central to chefs' work, and this project adds to the body of evidence showing the benefits of involving culinary experts in crop variety selection (Dawson and Healy 2018). Second, the SKC facilitated interaction not only with chefs, but also with groups of farmers and consumers interested in decentralized plant breeding and food system development. While consumer tasting events in both Farm and Outreach models were enjoyable, productive, and hopefully engaging, the multi-stakeholder connections facilitated by the SKC allowed enthusiasm to build, and economic demand to develop, for Outreach table beet varieties during their development. One of the noted advantages of PPB is that marketing and cultivar adoption can start during cultivar development (Ceccarelli 2015), and the Outreach model showed particular strength in facilitating this phenomenon. Indeed, the SKC provided infrastructure that substantially amplified the capacity of public plant breeders in the UW-Madison Goldman Lab to carry out cultivar development for organic agriculture, per the vision of Shelton and Tracy (2016).
It is clear that geosmin concentration can be manipulated in table beet, which opens intriguing potential for development of novel flavor profiles and concomitant market niches. This project, the first to apply PPB methods to table beet, succeeded in developing five novel open pollinated cultivars with input from Wisconsin farmers, chefs, and consumers. However, hedonic selection and rapid sensory evaluation showed that cultivars with a wide range of geosmin concentration could be accepted by consumers and, indeed, that geosmin concentration is neither the trait around which hedonic liking of beet is organized nor an indicator of perceived earthy flavor. Thus, table beet breeders' attention can turn to a more holistic paradigm of flavor and hedonic liking that encompasses multiple interacting flavor compounds, appearance, consumption environment, and extrinsic credence cues. Such a paradigm is undoubtedly more complex than one organized around a single flavor compound, but it is ultimately more reflective of the multifaceted and dynamic nature of sensory perception and satisfaction.