1 Introduction

Eating and drinking habits are powerful markers of identity, status, and solidarity, as well as triggers of contention about health risks and responsibility (DeSoucey & Waggoner, 2022). Most studies focus on what people like and buy, do and praise. A thriving literature in cultural stratification is dedicated to how tastes, preferences, and practices are clustered together and differentially distributed across the population—both within and beyond food and drinks (e.g. Alderson et al., 2007; Bennett et al., 2008; Chan & Goldthorpe, 2007; Fishman & Lizardo, 2013; Jæger & Møllegaard, 2022; Oncini & Triventi, 2021). Sociological literature has also occasionally stressed the importance of dislikes to understand how cultural hostility is patterned—“determination is negation”, as Bourdieu (1984, 56) asserted— with a few studies concentrating on categorical intolerance (Lizardo & Skiles, 2016), some with reference to matters of food and drink (Wilk, 1997; Lindblom & Mustonen, 2019; Warde, 2011).

Little systematic consideration is given to what people avoid consuming, despite eating and drinking practices often being defined by the exclusion of items: some religions have stringent rules on prohibited items (e.g. pork, beef); vegetarian practices differ depending on what types of meat and animal derivatives are ruled out; teetotalers refrain from all alcoholic drinks; and gender- and class-based boundaries are marked by the exclusion of particular foodstuffs, drinks, brands, or dishes (Rosansky & Rosenberg, 2020; Oncini, 2019, 2020).

Making use of rich data from the Multipurpose Survey of Daily Life by ISTAT between 2003 and 2016 on twenty-three items commonly consumed by Italian adults, this paper investigates how avoidances—i.e. what people claim to never eat or drink—are clustered and socially patterned and have evolved over time. Beyond the advantages of this data source, Italy represents a strategic case study because of the centrality of food in Italian cultural life, the recent reinvention of national and regional gastronomic traditions (Ceccarelli et al., 2010; DeSoucey, 2010; Leitch, 2003), the rise of new dietary trends such as vegetarian and vegan diets, and the diminishing appeal of the Mediterranean diet (Eurispes, 2019; Dernini & Berry, 2015; Oncini & Triventi, 2021).

Methodologically, motivated by the huge size and complexity of the data set, we propose the novel use and integration of two machine learning techniques—Self-Organizing Maps (SOM) and Boosted Regression Trees (BRT)—to better describe empirical patterns of food and drink avoidance. SOM is an unsupervised algorithm to reduce the complexity of large, multidimensional datasets. It allows us to identify and depict the clustering of individual avoidances. BRT is a flexible, supervised, machine learning technique that requires fewer assumptions than standard regression models (e.g. linearity and additivity) and has unusually high out-of-sample predictive power. In particular, BRT can incorporate complex functional forms and interactions between predictors while still providing intelligible findings. We employ BRT to identify the power of several variables in predicting the probability of individuals’ belonging to specific clusters. Overall, the article illustrates the sociological value of considering consumption avoidances and their cultural variation and offers a methodological framework that could be employed with consumption surveys in other contexts.

2 Avoidance, in Practice

A vast range of possible sources of nourishment means that humans temper their capacity for omnivorousness by different sorts of selectivity (Rozin, 1976). Biologically, bitter and sour taste receptors warn us that potentially poisonous or pathogenic compounds are being ingested (Lindemann, 2001). However, purely physiological reactions cannot account for the wide variation in everyday avoidances. Principles of selection are informed by concerns ranging from pathogen disgust to allergies and intolerances, to following social norms and conventions which affect reputation and respectability, to scrupulous compliance with religious doctrines. Health, hedonic, reputational, and spiritual considerations are relevant in different ways, but can all lead to systems of classifications that separate “purity” from “danger”, and hence appropriate items that we can consume and matter out of place that needs to be avoided (Douglas, 2002).

Debates about taste in cultural sociology have focused on categorical intolerance, but most often attending to aesthetic judgements rather than avoidances, especially in matters of food and drink (Lizardo & Skiles, 2016; Lindblom & Mustonen, 2019; Warde, 2011). In fact, distastes and aversions serve as potent indicators of distinction, particularly when they stand out as “anomalies” from otherwise open-minded evaluations (Wright et al., 2013; Lindblom, 2022).

The few studies conducting quantitative research on clusters of dislikes highlight that higher-status persons display patterned tolerance, a result that could be taken as a signal of openness to diversity, intimating a cosmopolitan self or expressing distinction through ostentatious open-mindedness, eclecticism, and “anything but” attitudes (Bryson, 1996, 1997; Järvinen et al., 2014; Flemmen et al., 2018; Oncini & Triventi, 2021). Recently, Childress et al. (2021) proposed a solution to the puzzle by showing that inclusivity and exclusivity simultaneously operate at different levels of higher-status culture, the former towards genres (e.g. Contemporary Pop), the latter towards objects (e.g. Britney Spears). Less often, scholars have focused on age, race, or gender, although symbolic boundaries – conceptual distinctions made to categorize objects, people, practices and to demarcate distinctions, affiliations, or identities (Lamont & Molnár, 2002) – are recurrently constructed along those lines as well (see e.g. Bry et al., 2016; Lizardo & Skiles, 2016). For instance, alcoholic drinks are widely used to construct masculinities and femininities (Courtenay, 2000). Besides the fact that women are more likely to abstain than men (Oncini & Guetto, 2018), research highlights that types of drink are represented (and consumed) as masculine or feminine, both “between drinks”—e.g. beer vs. alcopops—and “within” drinks—e.g. dark beer vs. fruity beer (Järvinen et al., 2014; Darwin, 2018; Chapman et al., 2018).

While judgements and representations are fundamental to understanding the social significance and symbolic boundaries of food and drinks, avoidance is a much more practical phenomenon that only partly overlaps with distaste. Unlike aversion (i.e. the physiological or emotional expression of strong dislike for an item), avoidance refers to the act of keeping away from or never doing something. Therefore, while both recur predictably and persist over time, avoidance places a much stronger emphasis on the carrying out of practical activities. In other words, although aversions and avoidances are often closely related, it is not difficult to envision individuals abstaining from food or beverages they might otherwise enjoy due to health considerations or religious beliefs. For instance, some vegetarians may avoid meat for sustainability reasons, without necessarily disliking its taste. Moreover, many instances of avoidance do not stem from strong distaste at all, but reflect a lack of awareness, people’s routines, or the actual inaccessibility of a product.

The partial relaxation of traditional norms around food habits and cuisines, which Fischler (1980) called gastro-anomie, coupled with the multiplication of authoritative sources proposing alternative and partially competing models of how best to eat and drink, makes avoidance a salient strategy for navigating excessive options. This is evident in the case of allergies and intolerances, with more people believing that they suffer from these conditions than their proven prevalence suggests (Haeusermann, 2015; Nettleton et al., 2009). In addition, over the past decades a plethora of new dietary schemes and secular doctrines based on a rigid codification of permitted and forbidden items have emerged, adding to or intersecting with more ancient taboos about eating and drinking (e.g. Oleschuck et al., 2019).

Eating is a compound practice involving food procurement, cooking and gastronomic judgement as well as ingestion. It is characterized by the weak coordination and regulation of its component elements (Warde, 2016). It is marked by a high level of personal discretion and general public tolerance of variation in preferences. Avoidance should then be seen as one tacit but significant anchorage that works transversally across different frames. Knowing what to avoid provides grounds for action and allays religious, health and gastro-anomic anxieties. Elimination or rejection of certain foods or drinks often reveals the contours of who we are and what we do, though sometimes they may just be dismissed from explicit consideration because of idiosyncratic preferences. For instance, many people dislike cucumbers, but this does not create a symbolic boundary separating cucumber haters and lovers or cause them to pass judgment on each other.

In any case, avoidances are part of people’s embodied dispositions: they can be innate, such as visceral responses to pathogens and poisons; or encultured, as in the case of religious taboos and normative principles of social groups; or learned, as in the cases of people turning vegetarian or discovering an intolerance or an allergy. In all instances however, they become sedimented in actors’ lines of action thanks to prior experience and recur predictably. They are engrained in everyday expertise, habituation, and routines, and mostly occur automatically, reducing the set of possibilities without requiring reflexivity and purposiveness all the time: coeliacs, for instance, rarely pause to think when following more or less implicit rules to avoid products or dishes with gluten.

In this study, we focus on broad categories of food and drink – like bread, wine and legumes – omnipresent in the Italian foodscape and potential components of everyday practice. To never consume any products from a given category is very unlikely to be due to a lack of awareness of their existence and therefore is evidence of an actual disposition – either encultured or learned. Although the categories are broad, the level of detail is sufficiently fine-grained to investigate how avoidances bind together and create empirical regularities in the population, and how they are socially patterned and have evolved over time.

3 Data and Variables

Data comes from the Multipurpose Survey of Daily Life conducted by ISTAT (the Italian National Statistical Institute) from 2003 to 2016 (ISTAT, 2019).Footnote 1 Cross-sectional surveys with a randomly selected, nationally representative sample of Italian families were carried out every year except 2004. The analytical sample consists of adults aged between 25 and 64. Our sample size amounts to 271,090 cases, which corresponds to 88.7% of the analytical pooled sample, with percentages missing ranging from 10.1 to 12.4% depending on the wave.

Among other things, the survey collects information on the eating habits of respondents. We selected twenty-three food and drink items available in all waves that offer a thorough representation of Italians’ core diets. These food and drink categories are both broad and common enough to allow us to assume that people know all the items. They are: bread, pasta, and rice (carbohydrates); pork; beef; cured meat; white meat; fish; milk; dairy products; vegetables in leaf; vegetables in fruit; fruit; eggs; legumes; potatoes; salty snacks; sweets; soft drinks; wine; beer; alcoholic cocktails; bitters (e.g. Fernet Branca, Montenegro); hard liquors; and non-alcoholic cocktails. The questionnaire asks respondents to note the frequency of their consumption of each item. Possible response categories for the sixteen foods are: more than once per day; once per day; several times per week; less than once per week; and never. Drinks have six response categories: more than one liter per day; from half to one liter per day; one or two glasses per day; more rarely; only seasonally; and never.

Unlike the other consumption frequencies (which could be subject to memory bias) the answer category ‘never’ is precise and potentially indicates a diverse range of significant relations to a food group or drink such as identity and self-perception, social status, religious affiliation, intolerance, or allergy. The option to answer ‘never’ is available for both foods and drinks. To analyze avoidance, we recoded all the variables as dummies distinguishing between items never consumed (1) and those consumed at least to some extent (0).
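The recoding described above can be illustrated with a minimal sketch. The response labels and the example respondent are hypothetical, since the survey's actual codes are not reported in this section; only the rule (never = 1, anything else = 0) comes from the text.

```python
# Minimal sketch of the avoidance recoding, assuming string-valued answers.
# The label "never" and the example items are illustrative assumptions.
NEVER = "never"


def to_avoidance_dummy(answer):
    """Return 1 if the item is never consumed, 0 for any other frequency."""
    return 1 if answer.strip().lower() == NEVER else 0


def avoidance_profile(answers):
    """Recode a respondent's frequency answers into 0/1 avoidance dummies."""
    return {item: to_avoidance_dummy(a) for item, a in answers.items()}


respondent = {"pork": "never", "wine": "more rarely", "bread": "once per day"}
profile = avoidance_profile(respondent)
total_avoided = sum(profile.values())  # total volume of avoidances
print(profile, total_avoided)
```

Summing the dummies gives the total number of avoidances per individual, the quantity analyzed in the first part of the results.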

In the light of previous literature, we selected a wide range of variables that are known to be important individual and contextual factors for understanding patterns of consumption—and possibly avoidance—to use as predictors in the second stage of the analysis. Given the role of ascriptive attributes, and cultural and economic resources, in shaping eating and drinking practices (Darmon & Drewnowski, 2008; Daniel, 2016; Oncini & Guetto, 2017, 2018; Oncini, 2019), we include a range of variables measuring sociodemographic characteristics (gender, age, civil status, family type), and socioeconomic (economic resources, social class) and cultural (education level, reading books) endowments. Second, in line with works underlining the increasing importance of contextual and political forces shaping food access and consumption (Kolb, 2021; Rose et al., 2022), and localized food cultures (DeSoucey, 2010), we also take into account year, region, quality of the area of residence, and food accessibility indicators (access to food shops and supermarkets, regular lunch at home during the week). Third, we include a range of health-related indicators (perceived health, smoking behavior, engaging in regular sport activities) in light of the symbiotic relationship between food and health discourses and practices (Haeusermann, 2015). Finally, we also employ three lifestyle indicators that could partly capture religious, civic, or political drivers of food and drink choices (attendance at religious ceremonies, volunteering, associational involvement). More information on how these variables were constructed, how they are coded, and descriptive statistics are reported in Table A1 and Table A2 (in the Appendix).

3.1 Analytic Strategy

3.1.1 Self-organizing maps

Due to the multidimensionality, complexity, and size of the dataset, we do not directly rely on traditional clustering approaches. Instead, we use a machine learning approach to reduce the scale of large, multidimensional datasets called ‘self-organizing maps’ (SOM; Kohonen, 1982, 2001) which permits effective exploration of the data and its emergent clusters. Usually employed in natural sciences and engineering for classification and prediction tasks, in the social sciences the algorithm has been widely overlooked, except for a few studies on multiple deprivation (Lucchini & Assi, 2013; Pisati et al., 2010; Whelan et al., 2010).

SOM allow dominant patterns to be identified without entirely eliminating complexity. As they map a multidimensional dataset onto a much smaller, usually two-dimensional, map, they also preserve topology (Pisati et al., 2010). This is a valuable intermediary step to assess and retain the complexity of the data, before grouping into a much smaller output using hierarchical clustering (e.g. Lucchini & Assi, 2013; Pisati et al., 2010). Since the algorithm is described and discussed at length in dedicated works (e.g. Pisati et al., 2010), we summarize its main functioning briefly here. Creating a SOM involves the following steps:

  1. A map with cells (sometimes also referred to as ‘nodes’) is set up. Each cell has as many properties as the dataset variables.

  2. Each case is assigned to a cell on the map that it matches most closely. Doing this alters the value of each cell and that of its neighbors using established neighborhood and distance functions.

  3. Once all cases have been positioned on the map, a new iteration of step two starts; while all cases are re-assigned, cell values persist and form the starting conditions for the next iteration.

  4. Cell values alter increasingly less with every iteration, as the neighborhood radius shrinks and the map converges.
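The four steps above can be sketched in a few lines of plain Python. This is a simplified illustration and not the ‘kohonen’ implementation: it assumes a small rectangular grid rather than a hexagonal one, and the grid size, learning rates, and toy data are illustrative choices. What it shares with the description is a sum of squares distance, a bubble neighborhood, wrap-around borders, and a shrinking radius.

```python
import random


def sq_dist(a, b):
    """Sum of squares distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def grid_dist(c1, c2, rows, cols):
    """Distance between two cells on a rectangular grid that wraps at the borders."""
    dr = abs(c1[0] - c2[0])
    dc = abs(c1[1] - c2[1])
    return max(min(dr, rows - dr), min(dc, cols - dc))


def best_cell(case, weights):
    """Cell whose values match the case most closely."""
    return min(weights, key=lambda cell: sq_dist(case, weights[cell]))


def train_som(cases, rows=4, cols=4, iterations=20, seed=1):
    rng = random.Random(seed)
    dim = len(cases[0])
    # Step 1: every cell holds one value per variable (random start).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for it in range(iterations):
        frac = it / iterations
        radius = round((1 - frac) * max(rows, cols) / 2)  # step 4: shrinking radius
        rate = 0.05 - 0.04 * frac                         # declining learning rate
        order = cases[:]
        rng.shuffle(order)
        for case in order:
            # Step 2: find the best-matching cell, then pull it and its
            # neighbors toward the case (bubble: same step size within the radius).
            bmu = best_cell(case, weights)
            for cell, w in weights.items():
                if grid_dist(cell, bmu, rows, cols) <= radius:
                    for k in range(dim):
                        w[k] += rate * (case[k] - w[k])
        # Step 3: the next iteration starts from the cell values reached so far.
    assignment = [best_cell(case, weights) for case in cases]
    return weights, assignment


# Toy data: two opposite avoidance profiles over four items.
group_a = [[1, 1, 0, 0] for _ in range(20)]
group_b = [[0, 0, 1, 1] for _ in range(20)]
weights, assignment = train_som(group_a + group_b)
print(assignment[0], assignment[-1])
```

In this toy run the two profiles end up in different cells, which is the property the subsequent clustering step relies on.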

Fig. 1. Graphic representation of an SOM. Notes: The scale bar indicates the number of cases (people) in each cell. The lattice wraps across borders; gray cells have no case assigned to them.

After a defined number of iterations, the algorithm ends and returns the values of each cell of the SOM as well as the cases allocated to this cell. Cell values thus represent the variables of all cases assigned to that cell and are furthermore influenced by the cell’s neighbors.

For our analysis, we used the R statistical programming language with the ‘kohonen’ package for SOMs (Wehrens & Kruisselbrink, 2018). Following previous studies, and after some testing with our dataset, we gave our map 400 cells in a hexagonal 20 × 20 lattice with toroidal edges (Fig. 1). While Pisati et al. (2010) used a lattice having around one cell for five distinct cases, in our dataset most cases are very similar. Since we aim to find groups representing a major part of the data, we defined one cell for approximately five distinct cases with more than ten cases associated (n_dist>10 = 1,955, with n_dist = 25,768). In mapping the data, we also used a sum of squares distance function and a bubble neighborhood function over 100 iterations. We chose the SOM configuration, the algorithms, and the random seed on the basis of two widely used quality indicators: quantization error, the average distance between each case and its nearest cell; and topographic error, the percentage of input vectors for which the best-matching and second-best-matching cells are not adjacent (e.g. de Bodt et al., 2002; Uriarte & Martín, 2005). Further descriptions of the algorithm and our specific application appear in the Appendix.
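As an illustration of the two quality indicators, the sketch below computes both for a toy map. It assumes a rectangular grid with wrap-around borders (not the hexagonal lattice used in the analysis) and a sum of squares distance; the cell values and cases are invented for the example.

```python
def sq_dist(a, b):
    """Sum of squares distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_best_cells(case, weights):
    """Best-matching and second-best-matching cells for one case."""
    ranked = sorted(weights, key=lambda cell: sq_dist(case, weights[cell]))
    return ranked[0], ranked[1]


def adjacent(c1, c2, rows, cols):
    """True if two cells are direct neighbors on a grid that wraps at the borders."""
    dr = abs(c1[0] - c2[0])
    dc = abs(c1[1] - c2[1])
    return max(min(dr, rows - dr), min(dc, cols - dc)) == 1


def map_quality(cases, weights, rows, cols):
    """Return (quantization error, topographic error in percent)."""
    q_total, t_errors = 0.0, 0
    for case in cases:
        first, second = two_best_cells(case, weights)
        q_total += sq_dist(case, weights[first])
        if not adjacent(first, second, rows, cols):
            t_errors += 1
    return q_total / len(cases), 100.0 * t_errors / len(cases)


# Toy 1 x 5 map with one-dimensional cell values; the last cell is "misplaced".
toy_weights = {(0, 0): [0.0], (0, 1): [1.0], (0, 2): [2.0],
               (0, 3): [3.0], (0, 4): [0.9]}
toy_cases = [[1.0], [2.2]]
q_error, t_error = map_quality(toy_cases, toy_weights, rows=1, cols=5)
print(q_error, t_error)
```

The first case has its two closest cells far apart on the map and so counts as a topographic error; the second does not. Lower values on both indicators point to a map that represents the cases closely while preserving their topology.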

3.1.2 Clustering

In line with previous applications, we then clustered the weight vectors of the cells hierarchically (Lucchini & Assi, 2013; Pisati et al., 2010); thus, in the output space clusters represent the weight vectors of cells, not individual cases.Footnote 2 Specifically, we chose the generalized average method (flexible UPGMA as implemented by Maechler et al., 2019) because it generated the highest connectivity values among the hierarchical clustering algorithms that we tried. Since flexible UPGMA is deterministic, no best-match clustering tree needed to be identified.
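To make the clustering step more concrete, the following sketch implements agglomerative clustering with the flexible UPGMA (generalized average) dissimilarity update in Lance-Williams form. It is not the implementation by Maechler et al. (2019); the value of beta and the toy dissimilarities are assumptions chosen for illustration, and in the analysis the inputs would be dissimilarities between the weight vectors of SOM cells.

```python
def flexible_upgma(dist, n_clusters, beta=-0.1):
    """Merge the closest clusters until n_clusters remain.

    dist: symmetric matrix (list of lists) of pairwise dissimilarities.
    beta: flexibility parameter (illustrative value, an assumption).
    """
    n = len(dist)
    clusters = {i: [i] for i in range(n)}
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}

    def get(a, b):
        return d[(a, b)] if a < b else d[(b, a)]

    while len(clusters) > n_clusters:
        a, b = min(d, key=d.get)  # closest pair of clusters, with a < b
        na, nb = len(clusters[a]), len(clusters[b])
        # Weights proportional to cluster sizes, scaled by (1 - beta).
        alpha_a = (1 - beta) * na / (na + nb)
        alpha_b = (1 - beta) * nb / (na + nb)
        d_ab = d[(a, b)]
        updated = {k: alpha_a * get(a, k) + alpha_b * get(b, k) + beta * d_ab
                   for k in clusters if k not in (a, b)}
        # Drop dissimilarities involving the merged clusters, then add the new ones.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        clusters[a] = clusters[a] + clusters.pop(b)
        for k, v in updated.items():
            d[(min(a, k), max(a, k))] = v
    return sorted(sorted(members) for members in clusters.values())


# Toy example: four "cells" located at 0, 1, 10 and 11 on a line.
points = [0, 1, 10, 11]
toy_dist = [[abs(p - q) for q in points] for p in points]
groups = flexible_upgma(toy_dist, n_clusters=2)
print(groups)
```

Because every merge is fully determined by the current dissimilarities, repeated runs on the same input return the same tree, which is why no best-match tree has to be selected.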

We split the data into nine clusters as this appeared to reveal the highest internal validity and interpretability. A set with ten or more clusters results in an additional, largely omnivore, group with below-average food aversions for which we found no meaningful interpretation or social significance. Conversely, a set with eight or fewer clusters renders invisible clusters which are empirically relevant and allow meaningful interpretation (the Harām cluster, described below, is found if we use nine, but not if we use eight, clusters).

3.1.3 Boosted Regression Trees

In the last step of the analysis, we employed boosted regression trees (BRT) to explore how a number of individual-level characteristics predict the probability of individuals belonging to each of the nine clusters. We sought to understand whether and to what extent exhibiting specific avoidance profiles can be predicted by individual and contextual characteristics identified as important drivers of eating practices, and by others related to individuals’ lifestyle. All the variables are included together since the aim is to maximize the predictive power of the model, not to build a model analyzing causes (Shmueli, 2010).

Boosted regression (or boosting) is a recent machine learning technique developed by computer scientists and extended by statisticians. BRTs combine the strengths of two algorithms: regression trees (models that relate an outcome to its predictors by recursive binary splits) and boosting (an adaptive method for combining many simple models to improve predictive performance) (Elith et al., 2008). In BRT, each individual model is a simple regression tree, i.e. a rule-based classifier that partitions observations into groups having similar values for the outcome variable, based on a series of binary rules (splits) constructed from the predictor variables (Hastie et al., 2001). The boosting algorithm uses an iterative method to develop a final model in forward-moving stages, progressively adding trees to the model, while re-weighting the data to emphasize cases poorly predicted by the previous trees (Schonlau, 2005). The final BRT model can be understood as an additive regression model in which individual terms are simple trees, fitted in a forward, stage-wise fashion (Elith et al., 2008). Several empirical studies have shown that boosted regressions can, under particular conditions, greatly outperform traditional regression methods in predictive accuracy, especially when applied to large datasets (Friedman et al., 2000; Schonlau, 2005). BRT might be preferred to more standard regression models because of its greater flexibility, since it permits predictor variables to be included without specifying the functional relationship to the outcome and allows complex interactions with other predictors.Footnote 3 We use BRT in our application to illustrate the potential of a predictive machine learning approach and to maintain coherence with the clustering approach adopted in the first step. Note, however, that other comparable supervised machine learning techniques could be applied in the second step of the analysis as well, such as regression trees, lasso regression, random forests or more complicated ensemble methods (Hastie et al., 2001).
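The forward, stage-wise logic of boosting can be sketched with the simplest possible trees, single-split "stumps" on one predictor. This is a deliberately reduced illustration under stated assumptions: it uses a least-squares loss instead of the logistic loss relevant for cluster membership, and the number of trees, the shrinkage, and the toy data are invented. Fitting each new tree to the current residuals plays the role of emphasizing cases that earlier trees predicted poorly.

```python
def fit_stump(xs, residuals):
    """Best single binary split of one predictor under least squares.

    Assumes the predictor takes at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees=50, shrinkage=0.1):
    """Additive model: a constant plus many shrunken stumps, added one at a time."""
    base = sum(ys) / len(ys)
    trees = []
    preds = [base] * len(ys)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]  # what is still unexplained
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + shrinkage * tree(x) for p, x in zip(preds, xs)]
    return lambda x: base + shrinkage * sum(t(x) for t in trees)


# Toy data: membership of a hypothetical cluster (1/0) by age.
ages = [25, 30, 35, 40, 45, 50, 55, 60]
member = [0, 0, 0, 0, 1, 1, 1, 1]
model = boost(ages, member)
print(round(model(30), 3), round(model(55), 3))
```

No functional form linking age to membership is specified in advance; the step-shaped relationship is recovered from the splits themselves.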

As with many other machine learning techniques, in BRT the model is first fitted to a training dataset (usually a subsample of the complete dataset), then the fitted model is used to make predictions on a test dataset. In our application, we used 50% of the sample as training data and the remaining 50% as test data. Evaluating predictions on data not used for fitting allowed us to check that the model was not overfitted and generalizes beyond the training sample (Friedman et al., 2000; Schonlau, 2005). We present the results of the BRT by reporting the parameters, called ‘influences’, which correspond, in the case of models based on a logistic function, to the percentage of log likelihood explained by each predictor variable (Friedman, 2001). The influences are standardized to add up to 100% and in our application can be intuitively understood as the importance of each variable in predicting the probability of belonging to each profile of food and drink avoidance. To interpret the sign of the relationship we rely on predicted probabilities from the BRT models.Footnote 4
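The two mechanics described in this paragraph, the 50/50 split and the standardization of influences to 100%, can be sketched as follows. The seed, the case identifiers, and the raw improvement values are made up for illustration and do not come from the analysis.

```python
import random


def split_half(cases, seed=42):
    """Randomly split cases into two halves: training and test."""
    rng = random.Random(seed)
    shuffled = cases[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def relative_influence(raw):
    """Rescale each predictor's raw contribution so that all sum to 100%."""
    total = sum(raw.values())
    return {name: 100.0 * value / total for name, value in raw.items()}


case_ids = list(range(10))
train_set, test_set = split_half(case_ids)

# Hypothetical raw improvements in fit attributed to each predictor.
raw_gain = {"gender": 6.0, "age": 3.0, "education": 1.0}
influence = relative_influence(raw_gain)
print(train_set, test_set, influence)
```

Because the influences are relative, a value such as 60% says that a predictor accounts for most of what the model explains, not that the model explains 60% of the outcome.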

4 Results

4.1 Descriptive Statistics and Total Volume of Avoidances

The first set of findings concentrates on how frequently each food or drink is avoided and on the results of the BRT model applied to the total volume of avoidances. Panel A in Fig. 2 illustrates in ascending order the percentage of individuals who never consume each of the twenty-three items. Two main reflections are in order. First, alcoholic drinks are the items most avoided. The strongest drinks (liquors, bitters, and alcoholic aperitifs, labelled ‘alcaper’) are avoided by more than 60% of the population, followed by non-alcoholic aperitifs, labelled ‘analc’ (43.3%), beer (42.5%), and wine (39.3%).Footnote 5 Interestingly, after alcoholic drinks, the items most avoided are salty snacks (38.1%) and soft drinks (34.8%), both powerful markers of food boundaries with negative connotations (self reference). Taken together, these results may suggest a link between avoidances and health considerations, as a gradient seems to reflect the (un)wholesomeness of the foods and drinks avoided.

Second, very few of our Italian subjects avoid food items central to the Mediterranean diet. Carbohydrates (bread, pasta, and rice) are avoided by hardly anyone (0.3%), and fruit, potatoes, and vegetables each by 2% or fewer. Apart from milk (21.7%), among animal-derived products pork (11.3%) and cured meat (7.3%) are avoided most, followed by fish (5.3%), eggs (4.9%), beef (4.4%), cheese (3.9%), and chicken (2.5%). The high incidence of milk avoidance is unsurprising, as 16% of the Italian adult population self-report lactose intolerance (Statista, 2021). Finally, legumes and sweets are avoided by 12% and 11.5% of the sample respectively.

Panel B in Fig. 2 shows the influence of the variables we examined in predicting the total number of avoidances per individual, expressed as percentages. Gender and age are by far the most important predictors, explaining almost 70% of the variation in the total sum of avoidances—respectively 43.4% and 24.8%. Net of other variables, men tend to have fewer aversions than women and older people tend to avoid more items than younger people. Otherwise, the overall contribution of socioeconomic and cultural resources is negligible, although we confirm the influence of an educational gradient on the number of aversions, mirroring research demonstrating that cultural tolerance has become a principle of good taste (Warde, 2011).Footnote 6

Fig. 2. Proportion of individuals who avoid each specific food/drink (A) and influence of each predictor on the total number of avoidances (B).

Fig. 3. Radarplots of the 9 food avoidance clusters, giving relative proportions. Notes: The size of each cluster is reported (percentage of cases) in the heading for each radarplot. The thick line represents the avoidance rate of the data subjects assigned to each cluster, while the thin line represents the reference rate of avoiding each item in the overall dataset. Please see Table A3 in the Appendix for data on the relative proportions.

4.2 Avoidance Clusters

Looking at avoidance by item, and in terms of the absolute number of items avoided, gives some indication of the extent of aversions but cannot account for the many possible patterns of abstinence. From the possible outputs of the SOM and cluster analysis we chose a set of nine clusters, representing ranges of cells in the SOM with large numbers of connections, which means they help to preserve its topology (see the Appendix for details). Based on the clustering, we investigated the composition of avoidances in the data: Fig. 3 uses radarplots to illustrate the probability of avoiding the twenty-three food and drink items conditional on belonging to each of the nine profiles. The thin black line within each radarplot connects the avoidance rate for each item in the total sample.

We dubbed the first cluster, the largest of the nine (29.7%), “Tolerant” as its members show a lower-than-average probability of avoiding all items. This cluster, like cultural omnivores in the literature on cultural stratification, displays tolerance for many different cultural items, of both low (e.g. soft-drinks) and high status (see e.g. Alderson et al., 2007; Chan & Goldthorpe, 2007; Fishman & Lizardo, 2013). The second cluster, “Non-drinker”, groups individuals that avoid all alcoholic drinks and includes 21.4% of respondents. Interestingly, non-alcoholic cocktails tend to be consumed less by this group. The third cluster, “Spirits avoider”, contains individuals (18.3%) likely to avoid all alcoholic drinks except beer and wine. Wine avoidance defines the fourth cluster, including 10.1% of respondents who show a very high probability of avoiding wine compared to all the other items. The “Health-conscious tolerant” (9.5%) refuse only salty snacks, soft drinks, and, to a much lesser extent, sweets. This cluster echoes findings about cultural omnivores who are mostly open-minded toward “anything but” a few specific, symbolically marked items (Bryson, 1996; Lindblom & Mustonen, 2019). The sixth (5.4%) and the ninth (0.3%) clusters are both “Vegetarian”, but the former (“Non-drinker vegetarian”) also avoids alcohol, snacks, soft drinks, and sweets.

We label the seventh cluster (4.9%) “Harām” as, in line with Islamic dietary prescription, it is characterized by avoidance of pork and cured meat—which in Italy is mostly derived from pork—and by avoidance of all alcoholic drinks. Finally, “Radical resister” is the second smallest cluster (0.4%) and contains individuals with a higher probability of not consuming several types of foods—vegetables, fruit, fish, legumes, and also beer and wine. This group rejects Mediterranean dietary principles and Italian mainstream culinary culture more generally.

The cluster analysis reveals a recognizable, organized portrayal of Italian consumption patterns. The size of the tolerant cluster, almost a hundred times more prevalent than the tiniest cluster (Vegetarian, 0.3%), suggests that most of the Italian population do not avoid any of the most common foodstuffs, and more generally dominant cultural practices involve consumption of all of the foods, although not all types of drink.

Fig. 4. Prevalence of the food avoidance clusters over time: absolute (panel A) and relative change (panel B). Note: The trend line for the vegetarian cluster is omitted because it is more than four-fold higher than the others.

The nine clusters evolved over the observation period. Fig. 4A plots the absolute trends while Fig. 4B reports the relative trends in the incidence of each of the nine clusters. Between 2003 and 2016, the very small vegetarian cluster expanded from 0.14 to 0.74%, a five-fold increase in relative terms (excluded from panel B for scaling reasons). All the other profiles have experienced notable but more modest variations. The radical resisters, the Harām, and the non-drinking vegetarians expanded. The share of spirits avoiders and health-conscious tolerants decreased by more than 30% between 2003 and 2016. The most likely interpretation of the latter finding is that increasing concern with health reduces tolerance.

4.3 The Social Patterning of Avoidance

In the last step of our analysis, we look at the results of the BRT to assess the extent to which individual and contextual characteristics can predict the probability of belonging to each of the nine clusters. As reported in Table A4, the predictive power of the models is overall very good, but with some heterogeneity across the clusters. The percentage of correctly classified cases ranges from 72% (non-drinker cluster) to more than 99% for the vegetarians and the non-drinker vegetarians, with the tolerant group in between (around 90%).Footnote 7

We rely on a variety of model outputs to better interpret the results from the BRT models. Figure 6 illustrates, in a graphic matrix, the relative influence (in percentages) of each variable in predicting membership of the nine food/drink avoidance clusters. Additionally, to get a sense of which categories of individuals are more likely to belong to each cluster, we report in Fig. 5 the predicted probability distribution of the response categories related to the most important predictor for each cluster. A more complete account of predicted probabilities from the BRT models is reported in Tables A5, A6 and A7 in the Appendix, which show the average predicted probabilities according to all categorical and continuous predictors and a summary of the results.

Fig. 5. Predicted probability distribution of the most important predictor for each cluster.

Overall, Fig. 6 suggests heterogeneity in the relative predictive power of the various individual and contextual characteristics, even though some patterns are recognizable. More specifically, the results for the tolerant cluster substantially resemble those for the total volume of avoidances, with gender and age as the most important predictors, together accounting for 57% of the explained variation. The other predictors suggest that belonging to this cluster is related to being young, male, and possessing higher cultural and economic resources. Food tolerants in Italy are also more satisfied with their health, read more books, engage more often in regular sport, civic activities, and volunteering, but are also more likely to smoke.

The probability of being in the non-drinker cluster is strongly predicted by gender, which alone accounts for 57% of the explained variation in the outcome log likelihood. In line with existing evidence on gender and nondrinking, women are more likely to belong to this cluster than men. The second most important predictor relates to tobacco (13%): not smoking is also a relatively important predictor of avoidance of alcohol and spirits. In addition, older individuals from southern regions, with a low socioeconomic and cultural background, but possibly living in less deprived areas, are more likely to belong to this cluster. Its members tend to be more religious, healthier, and non-smokers, but not to engage in sport, volunteering, or civic participation.

The likelihood of being part of either the spirits or the wine avoider group is predicted mostly by age, with a relative influence of 47.5% and 41.6% respectively. However, the associations run in opposite directions: older people are more likely to avoid spirits, younger people wine. Besides age, the predictors for both clusters are rather heterogeneous in both predictive power and direction, suggesting avoidances attributable to hedonic preferences. Region of residence is the second most important predictor, but there are no clear patterns, apart from those in central regions being more likely to avoid spirits and less likely to avoid wine, possibly because of local culinary traditions.

Belonging to the health-conscious tolerant cluster is related to sociodemographic variables such as age (24.2%) and gender (9.6%) and contextual characteristics such as region (12.2%) and year sampled (8.3%): the probability of belonging to this group is higher among the elderly and men, and was higher during the early 2000s than more recently. While some regions appear to predict this outcome better (e.g. Umbria, Valle d’Aosta), no clear territorial pattern emerges from inspection of the predicted probabilities. Among the variables with less predictive power, there is a gradient in relation to socioeconomic resources, with upper-status people (bourgeois, having good economic resources) more likely to belong. People in this cluster are also more likely to be satisfied with their health and to engage in sport more often.

Fig. 6. Relative importance (%) of individual characteristics for the probability of belonging to each of the nine clusters. See Table A8 in the Appendix for details

The most important predictor of membership in the non-drinking vegetarian cluster is satisfaction with health (30.5%): people scoring lower are more likely to belong. Although this is initially puzzling, given that the cluster is characterized by avoidance of alcoholic drinks, soft drinks, and salty snacks, and only to a lesser extent of meat, many of these people may avoid certain items precisely because of poor health. Other predictors, however, suggest poverty and necessity: members are more often older people, women, those living alone, and the widowed, with scarce economic and cultural resources, living in low-quality areas with difficult access to food shops.

This profile can be contrasted with the vegetarians who also drink, for whom the most important predictor is reading (20.3%), with the probability of membership increasing with the number of books read. People in this cluster tend to be younger, from upper socio-economic and cultural milieux, with high health satisfaction, practicing regular sport, and more likely to engage in civic participation and volunteering, but not in religious ceremonies. Such associations, along with the cluster’s highest relative growth over time, suggest that this particular group, despite being a small fraction of the meat-avoiders identified through the SOM, may be more closely aligned with the values and instances of vegetarianism as a lifestyle movement, rather than simply as a lifestyle (Haenfler et al., 2012). Moreover, the region of residence has relatively high predictive power, with a clear north-to-south gradient in the probability of belonging to the cluster. This corresponds with a recent study showing that northern regions have a higher proportion of vegetarians and vegans (Eurispes, 2019).

For the harām cluster, gender has the largest influence (18%), with women being more likely to belong. This is not surprising, as we have already seen that women are more likely to abstain from alcohol than men; thus, a subset of the non-drinker cluster additionally characterized by avoidance of pork and cured meat could have been allocated to this one by the SOM algorithm. Region of residence and social class are also important predictors, explaining 17% and 8% of the variation respectively. Although we have no direct measures, it is likely that these two variables roughly capture religious affiliation and migration histories. As we show in the Appendix (see Figure A2), there is a positive correlation (0.51) between the proportion of Muslim residents in each region and the probability of belonging to the harām cluster which, in addition to a higher fraction of immigrants being in lower social strata (Fellini & Fullin, 2018), may explain the socioeconomic and cultural gradients. The relatively high predictive power of the variable ‘Attendance to religious ceremonies’ points in the same direction. In addition, the probability of membership decreases with age, in line with Italy's recent migration history, and the cluster size has increased over time, moving from 4.5% in 2003 to 5.5% in 2016.
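A region-level correlation of this kind can be computed as a Pearson coefficient between two series with one value per region. The sketch below assumes that this is the measure used, which the text does not state explicitly, and relies on invented numbers for five hypothetical regions rather than on the data behind Figure A2.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented values for five hypothetical regions (not the survey data).
muslim_share = [1.0, 2.0, 3.0, 4.0, 5.0]        # % of residents
p_cluster = [0.03, 0.05, 0.04, 0.07, 0.06]      # mean predicted probability

r = pearson(muslim_share, p_cluster)
print(f"region-level correlation: {r:.2f}")
```

Because each region contributes a single data point, a coefficient of this kind describes an association between regional aggregates and says nothing directly about individual respondents.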

Finally, the radical resister cluster is not clearly defined. Region of residence (14.6%), age (13.2%), and area quality (9.3%) are the variables with the highest relative influence—though differences between regions are very small and area quality does not exhibit any clear pattern. Younger people are more likely to belong to this cluster, as are men, people having low health satisfaction, the non-religious, and possibly those with more disadvantaged backgrounds.

5 Discussion and Concluding Remarks

This paper has made use of a unique, repeated, cross-sectional dataset containing fine-grained information on how widely or often broad categories of food and drink are consumed, to explore how avoidances are clustered and socially patterned. We illustrate the value of the application of SOM and BRT, two powerful machine learning techniques that are rarely employed within sociology but that make it possible to reduce complex information efficiently to intelligible patterns. These techniques are used in combination toward two distinct but interrelated ends: the identification of the main food and drink avoidance profiles and their evolution over time; and prediction of cluster membership on the basis of individual and contextual attributes. Relying on machine learning techniques allowed us to deal with large multidimensional data in a more flexible way than traditional clustering and regression approaches, and to gather a more nuanced view of patterns of food and drink avoidance in contemporary Italy, a society in which food has a strong cultural relevance.

Each of the twenty-three variables employed offers the response “never”, which perfectly fits the purpose of the empirical investigation. Like many sociological studies that examine patterns of cultural consumption, we isolate groupings of individuals who share a similar portfolio of activities. We identify nine highly homogenous clusters, most of which resonate with commonly recognized forms of avoidance (e.g. harām, abstention from alcohol, vegetarian). This is reassuring, as it implies that the analytic procedures produce meaningful results. Because the questionnaire focused on items which are recognizable elements of the country’s diet, the clusters capture central elements of the Italian food and drink consumption landscape. Both the larger size of the tolerant cluster and the evolution of the clusters over time correspond with other available evidence, such as the relative growth of tolerance, the appearance of meat avoidance projects, the increase in Muslim migration, and the growing role of health considerations in dietary choices (Lizardo & Skiles, 2016; Eurispes, 2019; Oncini & Triventi, 2021).

It is somewhat puzzling that the social sciences almost always treat eating and drinking as different spheres of consumption, since culinary and gastronomic discourses very often address their interaction. In addition, drinks contain nutrients, some doctrines have taboos about both, the ‘matching’ of drinks with food is a part of distinctive culinary traditions and national heritages, and the manner of combination may indicate possession of significant cultural capital. The clusters illustrate the value of considering food and drink together, helping to identify the harām cluster, to differentiate among vegetarians, and to separate two types of tolerant profiles with specific aversions toward spirits and toward wine. Examining food and drink simultaneously offers a promising research agenda for analyzing consumption patterns (Warde et al., 2023; see Footnote 8).

As regards the substantive findings, the cluster profiles show that a large proportion of the Italian population is omnivorous in its selection, with roughly 50% of the population being characterized by tolerance for all food items (see also Figure A4 in the Appendix). Clusters identified primarily by their preferences for alcoholic drinks are also omnivorous eaters. Exceptions to the tendency to tolerance accord with recognizable principles governing vegetarian and harām diets, which constitute growing minorities of contra-hegemonic taste. Abstention from alcohol and preferences among alcoholic drinks are major bases of cluster differentiation, the rationales for which deserve further investigation in the Italian context.

Examination of the socio-demographic characteristics of clusters reveals some significant differentiation which reflects social and cultural boundaries affecting eating and drinking in Italy. Age and gender are generally the most powerful predictors, dividing segments of the population on socio-demographic characteristics (e.g. the tolerant, the non-drinkers, the avoiders of wine and spirits). Contextual indicators, particularly region of residence and year sampled, are occasionally important predictors of specific clusters (e.g. harām, vegetarian). Socioeconomic and cultural resources have little predictive power across several clusters but, in specific cases, reveal a gradient marking boundaries based on aesthetic or ethical tastes (e.g. tolerant, health-conscious tolerant, vegetarian). Similarly, lifestyle and health variables offer little predictive relevance for many clusters, although in the two vegetarian clusters they may help distinguish between vegetarians “by necessity” and vegetarians stricto sensu. Finally, variables measuring ease of access to supermarkets and food shops are rarely informative, implying that avoidance of common items is not a matter of contextual opportunities.

The social profiling of the clusters obtained with BRT models suggests that several different rationales underpin avoidances, providing evidence of health concerns, hedonic preference, status display and doctrinal purity. The non-drinker cluster and the spirits avoider cluster are conditioned by gender and age respectively, suggesting the existence of social group norms, compounded by risk of illness. Health variables have a relatively high predictive power in both cases.

Sensory disappointment – disliking the taste or sensation of a category of products – explains some aversions, most likely the avoidance of spirits, or wine, or the rather eccentric pattern exhibited by the heterogeneous radical resister cluster. For these profiles, age and region of residence are the most powerful predictors with some identifiable pattern. Nevertheless, the specificity of the aversions suggests that their roots lie in personal preferences rather than health, doctrine, or cultural capital. Some distastes, to paraphrase Bourdieu, are just distastes. They do not ‘classify the classifier’, for they are signs neither of identity nor of social position. They may still, however, convey cultural meaning and act as a medium of cultural classification.

Rationales based on cultural boundaries appear to frame the tolerant, the health-conscious tolerant, and the vegetarian clusters. The gender composition of the tolerant, also apparent in the total volume of aversions, perhaps partly captures masculine expression of invulnerability – if openness were to be interpreted as a greater propensity to take risks (e.g. Courtenay, 2000). Moreover, in these three clusters we observe a gradient in the socioeconomic and/or cultural resources of the members, redolent of displays of distinction in many cultural fields. Nevertheless, it should be emphasized that, overall, cultural and economic resources have low predictive power when compared to gender and age. This may imply that while social standing matters when we look at distaste and aesthetic judgement, actual food avoidances are less sensitive to class-based boundaries. The relatively weak socio-demographic determination of profiles might be anticipated both because some avoidances must be attributable to vagaries of hedonic preferences and accidents of biography and because eating is a weakly regulated and weakly coordinated practice.

Cultural boundaries are also drawn in relation to ethical or religious principles. The harām cluster exemplifies effects of religious doctrine where avoidance is prescribed for reasons of spiritual integrity. Although observance is not governed only by matters of ethical principle, the vegetarian clusters also follow rules-based principles of exclusion. That vegetarians should avoid meat is definite and definitive. The relative merits of the health properties of different dietary regimes are, by contrast, controversial and widely contested. Because of heterogeneous advice, it is contentious to attribute specific avoidances to medical or nutritional concerns. Some avoidances are categorical, others less imperative. Explanations of avoidances lie on a continuum from visceral disgust and bodily rejection to observance of ethical principle. However, at the mid-points there is much variation in personal reasoning and therefore degrees of freedom. Hence, in this paper, avoidance is not equated with aversion or principled rejection.

Future research may build on this account to further refine understandings of cultural consumption. The techniques are widely applicable to other fields of cultural practice where participation or abstention are suspected to be socially significant. It would be extremely interesting to repeat the analyses using food and drink surveys carried out in other countries, or to examine avoidance taking into account genres (foodstuff) and objects (dishes) (Childress et al., 2021). A more thorough and comprehensive analysis can be imagined using complementary qualitative methods to further understand how aversions and avoidance are related, how they become engrained in eating practices, when they start to become part of self-definition, and eventually which rationales they follow. But for now, we are content to have demonstrated that the clustering of avoidances makes meaningful sense. Because the consumption categories are broad and their constituent items readily available and integral to Italian culinary culture, the clusters suggest plausible interpretations of distastes and help paint the background to a picture of the structure and evolution of taste and distaste in Italy in the 21st century.