1 Introduction

The scope of this paper is to analyze sustainable tourism among the citizens of the countries of the European Union; specifically, we concentrate on opinions on sustainable travel and on the willingness of European citizens to change their touristic habits in the near future in order to be more sustainable. Sustainable tourism is a topic attracting increasing attention from academics, international organizations, local and national authorities, and, obviously, managers and investors in the sector (Swarbrooke, 1998). Specifically, we want to investigate heterogeneity both between and within European countries, the association between different types of behavior and Member States, and the factors possibly influencing this relationship.

The Flash Eurobarometer 499 survey collected, among other topics, information on the actions that European citizens are most willing to take when on holiday in order to preserve natural resources and the environment. The survey interviewed a representative sample of European citizens, aged 15 and over, in each of the 27 Member States (MSs) of the European Union (EU) in October 2021, after the spread of the Covid-19 pandemic. Being interested in sustainable touristic behavior, we selected the ten binary variables asking whether EU citizens are willing to perform specific circular economy actions related to traveling and taking holidays: nine questions refer to specific actions, the tenth to a general disposition to behave more sustainably when on holiday. A large majority of European citizens (82%) are prepared to change at least some of their habits; however, considerable heterogeneity exists both within and between European countries.

In order to identify associations between the sustainable touristic behavior of citizens and the European countries where they live, we applied the latent class distance association (LCDA) model, an approach that allows the association between categorical variables to be estimated even in the case of large and sparse observed contingency tables (Vera et al. 2014). The data to be analyzed, in this context, consist of profiles, i.e., combinations of categories of independent variables plus a response variable, organized in a contingency table (see also Vera 2022). The LCDA model identifies clusters of profiles and their association with an explanatory or response categorical variable. In our application, profiles refer to European citizens’ answers to the series of questions regarding touristic behavior, while the 27 European countries constitute the categories of the explanatory variable. Attitude towards sustainability when traveling is a construct that is not directly observable; therefore, in order to estimate it, we considered the answers given to the 10 binary questions posed in the survey.

Social and behavioral sciences often deal with the analysis of associations for cross-classified data (see de Rooij and Heiser 2005). While there is a vast literature on the analysis of cross-classified data, the available solutions for sparse matrices are limited. We removed many of these limitations by using a latent class distance association model. The procedure can deal with cross-classified data with a categorical response variable, both when sparse tables are present using profiles and when the explanatory or response variable has many categories and is clustered. We showed, furthermore, that an easy interpretation of associations between clusters’ centers and categories of a response variable can be incorporated in this framework in an intuitive way, using unfolding.

As very well explained by Bergman and Magnusson (1997), the object of interest is information at the individual level. The more traditional variable-oriented approach estimates relationships among variables, i.e., theoretical constructs, with the limitation of not considering heterogeneity among individuals. In the person-oriented approach, the analysis takes into account the patterns of individual characteristics relevant for the study, generating profiles. Several person-oriented statistical methods have been proposed in the reference literature to analyze contingency tables, even in the case of hierarchical data, for example latent class analysis (Bassi and Guidolin 2021); however, these methods are very sensitive to the sparseness of the contingency table. The LCDA approach overcomes this problem.

The estimation of the latent class distance association model allowed us to identify clusters of European citizens with profiles that are similar in their willingness to adopt circular economy practices related to tourism, and the association of these latent clusters with the 27 EU MSs. Using socio-demographic variables collected by the Flash Eurobarometer 499 survey, it was also possible to describe these clusters.

2 Sustainable tourism (ST), review of the reference literature

2.1 Definition

Sustainable tourism has been a topic of discussion in tourism circles since the early 1990s. Interest in sustainable tourism was driven by two key factors: the influence of the Brundtland Commission report (1987) and the subsequent Summit of Rio (1992) on sustainable development. Although ideas about sustainable development had been discussed for some time before, the Brundtland report and the commitment made were decisive for the study of sustainable tourism (ST). ST is a multifaceted concept and, depending on the perspective, different aspects and areas of focus will be relevant. In the years following the Brundtland Commission reports, multiple alternative definitions and modifications of ST were developed (Roberts et al. 2022) and several hundred definitions now exist (Johnston et al. 2007).

According to the World Tourism Organization of the United Nations (UNWTO), the guidelines for sustainable tourism development and management practices are applicable to all forms of tourism in all types of destinations, including mass tourism and the various segments of niche tourism. Sustainability principles refer to the environmental, economic and socio-cultural aspects of tourism development, and an appropriate balance must be found with these three dimensions to ensure its long-term sustainability. Thus, sustainable tourism should:

  1. Make optimal use of environmental resources, which are a key element in tourism development, maintaining essential ecological processes and helping to conserve natural resources and biodiversity.

  2. Respect the sociocultural authenticity of host communities, conserve their living and cultural heritage and traditional values, and contribute to intercultural understanding and tolerance.

  3. Ensure viable long-term economic operations, providing socio-economic benefits that are fairly distributed to all stakeholders, including stable employment, income-generating opportunities and social services for host communities, and contribute to the alleviation of poverty.

Continued research in the field has led to the identification of a fourth pillar, institutional sustainability, and to a fine-tuning of the three original pillars (Burford et al. 2013).

2.2 A survey of the recent literature on ST

In the existing literature, we start by highlighting the work of Lu and Nepal (2009). The authors affirmed that tourism is recognized as a resource-intensive industry and must therefore be accountable in terms of sustainability both locally and globally. In addition, these authors identified five thematic areas: impact of tourism; sustainability assessment; development; visitors’ behavior and attitude; and planning, concluding that sustainable tourism research has largely mirrored trends in tourism research in general. However, Buckley (2012) concluded that the four most popular thematic areas in sustainable tourism are ecotourism, responsible tourism, community-based tourism, and conservation tourism.

ST is an important issue in the debate on environmentally integrated tourism development, but existing research has shown that sustainability is a complex concept that requires more critical and comprehensive analysis (Butler 1999; Mowforth and Munt 2003). The links between sustainability and tourism have been developed in many relevant publications on various dimensions such as quality of life, equity and the environment (Butler 1999; Collins 1999; Farrell and Twining-Ward 2005; Hunter 1995; Wall 1997). ST needs to be conceptualized in a more comprehensive way to meaningfully and critically assess its interconnectedness with natural, social, and economic elements across multiple scales and time periods (Farrell and Twining-Ward 2005; McKercher 1999). ST can also be interpreted as an “adaptive paradigm” (Hunter 1997) or “adaptive management” (Farrell and Twining-Ward, 2005), which addresses issues of unpredictability of events, uncertainties about the outcome of events, and complexities of scale and times. Some authors defined ST in broader terms, transferring the principles of sustainable development to the context of tourism needs (Hardy and Beeton 2001).

The content of the debate on ST has expanded to include not only environmental issues but also economic, social and cultural issues, political power and social equality. Some critical voices, however, argued that the viability of sustainability remains a key issue for tourism, as it is unrealistic to balance competing interests and, therefore, compensation decisions will no doubt prioritize certain interests (Hunter 1997). A longitudinal analysis of research on the topic made it possible to identify trends and patterns in studies on sustainable tourism (Ruhanen et al. 2015). These authors indicated that there is an evolution in the theoretical and methodological approaches, as well as in the themes and topics used in the research subfields.

In recent years, there has been a significant increase in the number of articles published on aspects of the multiple relationships between climate change, global warming and unsustainable tourism (Bramwell et al. 2017). In 2006, two special issues related to climate change were published, with 17 articles on this topic, while between 2007 and 2016 a further 68 articles appeared in leading journals. Climate change is a much more controversial topic than others in the sustainable tourism debate and it requires special technical knowledge (Bramwell et al. 2017). However, it is closely related to debates on the environmental impacts of tourism, social and behavioral change and the governance of the tourism industry, and it is considered a key issue for the future of ST (Scott 2011; Scott et al. 2016a; Scott et al. 2016b; Weaver 2011).

Roberts et al. (2022) conducted a meta-analysis of the literature on tourism and sustainability published in English between 2019 and 2021, as well as an analysis identifying the most commonly used research approaches. Four main categories were used to classify the method of analysis in each article: (i) qualitative data (case studies, interviews and observations), (ii) quantitative data, (iii) mixed methods (a combination of qualitative and quantitative approaches), and (iv) concept/review articles (documents without empirical components). Of the 881 articles considered, there were 323 qualitative studies (36.7%), 368 quantitative (41.8%), 141 mixed methods (16.0%), and 49 conceptual/review (5.6%). Overall, only 3.8% of the articles published during this period in the selected journals were associated with sustainability, showing a possible need for more emphasis on this topic. The authors emphasized the space for further studies on ST, which are especially necessary after the Covid-19 pandemic, whose effects on tourism have yet to be completely understood.

To our knowledge, the attitudes of tourists towards future sustainable behavior, which are the subject of this paper, have not yet been addressed in previous studies. Some works have been published on the attitudes of tourists after the Covid-19 pandemic, but they referred to specific countries such as China (Huang et al. 2021), Malaysia (Abhari et al. 2022), Colombia (Mestanza-Ramón and Jiménez-Caballero 2021), or even smaller geographical areas. Finally, we report on some more recent works that studied the relationship between tourism and the environment and that might complete our literature review. Ritchie et al. (2022) analyzed the attitudes of tourists towards more radical interventions in the Great Barrier Reef due to the significant threats it is suffering (anthropogenic and natural, along with climate change); they collected data on a sample of 468 national and international tourists in Cairns, Australia, and tried to identify and understand the effect of institutional support for these types of interventions. Casado-Díaz et al. (2022) used a sample of 680 hosts to examine gender differences in water conservation behavior. The originality of our paper also lies in the fact that it analyzes the attitudes of European tourists after Covid-19 with a sample of citizens from all 27 EU countries.

3 The latent class distance association model

For categorical variables, the latent class distance association model allows us to cluster the categories of a predictor variable while simultaneously representing associations (after main effects -row and column- have been removed) between the estimated latent classes and the categories of the response variable, using unfolding. Besides an optimal classification, a configuration of points for the classes and another one for the categories of the response variables in a low-dimensional Euclidean space are estimated, such that the Euclidean distances inversely represent these associations. The model can also be applied to any two-mode cross-classified data without requiring distinguishing between predictors and response categories (Vera et al. 2014).

This approach is particularly suitable when the number of predictor categories is large, and hence model selection procedures are cumbersome, when a very large number of parameters must be estimated, or when there are difficulties with sparse datasets containing too many zero entries (see also Vera and de Rooij 2020).

Given an I x J contingency table F = (fij), let us assume a row-blocked partition P(F) of the rectangular matrix F into T blocks Ft of rt elements fi = (fi1,…, fiJ)’, with fi \(\in\) Ft. Hence, each row vector of F belongs to one and only one of the T subsets Ft, but we do not know in advance which specific latent block a particular vector belongs to. The unconditional probability that any row element fi belongs to latent class Ft is denoted by γt, with 0 ≤ γt ≤ 1 and \({\sum }_{t=1}^{T}{\gamma }_{t}=1\).

The cluster centers are represented by points xt collected in the rows of a T x M configuration matrix X, and the categories of the response variables are represented by points yj collected in the rows of the J x M configuration matrix Y. Thus, under the general multiplicative form, the expected frequency of row i and column j, with fi \(\in\) Ft, is given by the expected frequency μtj of cluster t and column j, which can be written as,

$${\mu }_{tj}=\mu {\alpha }_{t}{\beta }_{j}\mathrm{exp}\left(-{d}_{tj}^{2}\right),$$
(1)

where μ is the overall scale parameter, αt is the latent class effect parameter, βj is the column effect parameter and \({d}_{tj}^{2}= {d}^{2}\left({x}_{t},{y}_{j}\right)\) is the squared Euclidean distance given by:

$$d^{2} \left( {x_{t} ,y_{j} } \right) = \mathop \sum \limits_{m = 1}^{M} \left( {x_{tm} - y_{jm} } \right)^{2} .$$
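As a minimal numerical sketch of Eq. (1), the following Python fragment computes an expected frequency from the main effects and the squared distance between a class point and a category point. The function names and all numerical values are assumptions chosen for illustration; they are not estimates from the model fitted in this paper.

```python
import math

def squared_distance(x_t, y_j):
    """Squared Euclidean distance between a class point x_t and a category point y_j."""
    return sum((a - b) ** 2 for a, b in zip(x_t, y_j))

def expected_frequency(mu, alpha_t, beta_j, x_t, y_j):
    """Expected frequency mu_tj = mu * alpha_t * beta_j * exp(-d_tj^2), as in Eq. (1)."""
    return mu * alpha_t * beta_j * math.exp(-squared_distance(x_t, y_j))

# Hypothetical coordinates in M = 2 dimensions (illustrative only).
x_t = [0.5, -0.2]     # point of one latent class
y_near = [0.4, -0.1]  # response category located close to the class
y_far = [2.0, 1.5]    # response category located far from the class

mu_near = expected_frequency(10.0, 1.2, 0.8, x_t, y_near)
mu_far = expected_frequency(10.0, 1.2, 0.8, x_t, y_far)
```

With equal main effects, the category closer to the class point receives the larger expected frequency, which is the sense in which distances inversely represent associations.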

Taking into account the well-known equivalence of the multinomial and Poisson distributions (Agresti 2013; Birch 1963), for parameter estimation the probability ht(.) for the data of a row element fi \(\in\) Ft can be expressed in terms of a usual Poisson sampling model, given by:

$$h_{t} \left( {\varvec{f}_{i} {\text{|}}\varvec{x}_{t} ,\varvec{Y},\mu ,\alpha _{t} ,\varvec{\beta }} \right) = \mathop \prod \limits_{{j = 1}}^{J} \frac{{\mu _{{tj}}^{{f_{{ij}} }} }}{{f_{{ij}} !}}\exp\left( { - \mu _{{tj}} } \right),$$
(2)

where μtj is given by (1) and β = (β1,…, βJ)’. The probability density function of the random variable fi is a finite mixture of the Poisson densities given by (2), adopting the expression,

$$g\left( {\varvec{f}_{i} |\varvec{X},\varvec{Y},\mu ,\varvec{\alpha },\varvec{\beta },\varvec{\gamma }} \right) = \mathop \sum \limits_{{t = 1}}^{T} \gamma _{t} h_{t} \left( {\varvec{f}_{i} |\varvec{x}_{t} ,\varvec{Y},\mu ,\alpha _{t} ,\varvec{\beta }} \right),$$
(3)

where α = (α1,…, αT)’, and γ = (γ1,…, γT)’.

Parameter estimation is performed in an EM (Dempster et al. 1977) framework; the details can be found in Vera et al. (2014). Given the maximum likelihood estimates \(\widehat{{\varvec{X}}}\), \(\widehat{{\varvec{Y}}}\), \(\widehat{\mu }\), \(\widehat{\boldsymbol{\alpha }}\), \(\widehat{{\varvec{\beta}}}\), and \(\widehat{{\varvec{\gamma}}}\), the posterior probability that an element fi belongs to latent class Ft is calculated by means of the Bayes theorem as follows,

$${\pi }_{it}\left(\widehat{{\varvec{X}}}, \widehat{{\varvec{Y}}}, \widehat{\mu }, \widehat{\boldsymbol{\alpha }}, \widehat{{\varvec{\beta}}}, \widehat{{\varvec{\gamma}}}\right)=\frac{{\widehat{\gamma }}_{t}{h}_{t}\left({\varvec{f}}_{i}|{\widehat{{\varvec{x}}}}_{t}, \widehat{{\varvec{Y}}}, \widehat{\mu }, {\widehat{\alpha }}_{t}, \widehat{{\varvec{\beta}}}\right)}{g\left({\varvec{f}}_{i}|\widehat{{\varvec{X}}}, \widehat{{\varvec{Y}}}, \widehat{\mu }, \widehat{\boldsymbol{\alpha }}, \widehat{{\varvec{\beta}}}, \widehat{{\varvec{\gamma}}}\right)}.$$
(4)

Hence, an element fi will be assigned to the class to which it is most likely to belong, given these posterior probabilities.
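The classification step described by Eqs. (2) to (4) can be sketched as follows. This is not the EM algorithm of Vera et al. (2014); it only illustrates, for one row of counts and for expected frequencies assumed to be already estimated, how the Poisson mixture yields posterior class probabilities. The function names and the numbers are hypothetical.

```python
import math

def log_poisson_row(f_i, mu_t):
    """Logarithm of h_t in Eq. (2): independent Poisson terms over the J columns."""
    return sum(f * math.log(m) - m - math.lgamma(f + 1) for f, m in zip(f_i, mu_t))

def posterior_probabilities(f_i, mu, gamma):
    """Posterior probabilities pi_it of Eq. (4) for one row f_i.

    mu[t][j] are the expected frequencies of class t (all positive);
    gamma[t] are the prior class probabilities (all positive).
    """
    log_terms = [math.log(g) + log_poisson_row(f_i, mu_t) for g, mu_t in zip(gamma, mu)]
    shift = max(log_terms)  # log-sum-exp shift to avoid numerical underflow
    weights = [math.exp(v - shift) for v in log_terms]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical two-class, three-column example (illustrative only).
mu = [[5.0, 1.0, 0.5], [0.5, 1.0, 5.0]]
gamma = [0.6, 0.4]
post = posterior_probabilities([4, 1, 0], mu, gamma)

# Modal assignment: the row goes to the class with the highest posterior probability.
assigned = max(range(len(post)), key=post.__getitem__)
```

Working on the log scale is a design choice of this sketch: products of many Poisson terms can underflow when the table has many columns.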

In practice, the researcher’s experience and ease of interpretation often prevail in the selection of some parameters of the model. Nevertheless, the LCDA model makes it possible to determine the number of classes as well as the dimensionality of the representation. To this end, the adjusted BIC statistic given by (see Vera et al. 2014):

$${\text{BIC*}} = - 2\log L + l\log h,$$

can be employed, where \(h = (I + 2)/24\), and \(l = TJ + (T - 1)\). Therefore, the selected number of latent classes corresponds to the lowest value of the BIC* statistic when the proposed procedure is applied for a range of values of T without imposing the geometrical constraints. In addition, given a number of latent classes for the row category elements, the BIC* criterion can be employed to select the dimension of the distance-association representation.
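The selection rule can be illustrated with a short Python sketch that evaluates BIC* for several candidate numbers of classes and keeps the smallest value. The log-likelihood values below are invented for illustration, and the function names are assumptions; only the formula for BIC*, h and l comes from the text above.

```python
import math

def adjusted_bic(log_likelihood, n_rows, n_cols, n_classes):
    """BIC* = -2 log L + l log h, with h = (I + 2) / 24 and l = T*J + (T - 1)."""
    h = (n_rows + 2) / 24
    n_params = n_classes * n_cols + (n_classes - 1)
    return -2.0 * log_likelihood + n_params * math.log(h)

def select_number_of_classes(log_likelihoods, n_rows, n_cols):
    """Return the T with the lowest BIC*, together with all scores.

    log_likelihoods maps each candidate T to its maximized log-likelihood.
    """
    scores = {t: adjusted_bic(ll, n_rows, n_cols, t) for t, ll in log_likelihoods.items()}
    best_t = min(scores, key=scores.get)
    return best_t, scores

# Hypothetical maximized log-likelihoods for T = 2, 3, 4 (illustrative only).
fits = {2: -16000.0, 3: -15700.0, 4: -15690.0}
best_t, scores = select_number_of_classes(fits, n_rows=501, n_cols=27)
```

In this invented example the gain in log-likelihood from three to four classes does not compensate for the additional parameters, so the penalty term drives the choice.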

4 Data analyses

4.1 The database

The data analyzed in this paper were collected by the survey Flash Eurobarometer 499 on attitudes of European citizens towards tourism. Eurobarometer is the polling instrument of the European Union institutions to monitor public opinion and other relevant issues from European citizens and enterprises. Collected data are open and can be downloaded from the portal together with questionnaires and other metadata.

Flash Eurobarometer 499 was conducted in October 2021 and aimed to investigate travel behavior and the impact of the Covid-19 pandemic on it, booking channels and sources of information for travel arrangements, reasons for selecting destinations, and options and information on sustainable tourism. The total sample consists of 25,714 European citizens aged 15 and over, distributed across the 27 European Union (EU) Member States (MSs). As in all other Eurobarometer surveys, information on the socio-demographic characteristics of the respondents was collected (European Union 2021).

The focus of our paper is on sustainable tourism in the European Union; therefore, we considered the ten binary variables related to the willingness of interviewed citizens to change their travel and tourism habits to be more sustainable. Specifically, respondents were presented with nine questions, reported in Table 1, each asking whether or not they were prepared to change with reference to a specific circular economy practice. The tenth question was “I am not prepared to change my habits” and, again, it required a yes or no answer. In Table 1, the percentage of yes responses is reported for each option. Data are weighted according to the 15+ population of each EU MS.

Table 1 Percentages of YES responses to the ten binary questions in the sample

As a general finding, we see that only a relatively small percentage of European citizens are not prepared to change their habits. The European countries where citizens are most ready to change are Malta (99%), Ireland and Luxemburg (95%), Austria, Portugal and Romania (94%); however, this good disposition is not everywhere followed by the willingness to adhere to concrete actions. On the other hand, the countries where citizens are least prepared to adopt sustainable touristic behavior are Cyprus (72%), Denmark (75%), Lithuania (76%) and Bulgaria (77%).

The action that respondents are most willing to perform is consuming locally sourced products while on holiday; however, again there are important differences across countries. For example, for this action, the percentage of yes responses ranges from 15% in Malta to 86% in Romania. Malta is, as already noted, the country with only 1% of citizens not prepared to change travelling habits; this result shows that willingness to perform the various proposed actions might differ considerably across the 27 EU MSs. Thus, there is non-negligible variability between countries, but we also expect differences within countries associated with the socio-demographic characteristics of the respondents. We consider gender, age, number of family members aged 15+, occupation, type of community of residence, and travelling habits before the pandemic. All these variables proved to be statistically associated with the ten binary items considered, as reported in Table 2.

Table 2 Percentages of YES responses to the 10 binary questions by socio-demographic groups

In general, female respondents, the youngest respondents and people living in large towns are more willing to change their habits; the oldest European citizens are less prepared to perform actions to be more sustainable; there is also a statistically significant relationship with the other variables considered in Table 2, which deserves further investigation.

4.2 Analyses using the LCDA model

In this study we are interested in the analysis of the data collected with the Eurobarometer survey 499 on European citizens’ attitude towards sustainable tourism, specifically on the ten variables indicating if respondents are willing to change their travelling and holiday habits with reference to specific green actions, reported in Table 2. For the estimation of the LCDA model, we considered our dataset taking a person-oriented approach using profiles. Any combination of the categories of the explanatory variables is called a profile, and the data here consist of a profile by a response contingency table.

Sampled citizens were required to answer nine binary questions (1 = yes, 2 = no) referring to nine specific behaviors that could favor sustainable touristic practices; the tenth question asked whether they were prepared to change their habits with reference to travelling and taking holidays (1 = yes, for not prepared). With ten binary variables, 1,024 different profiles are generated; however, we considered only the 501 with at least one nonzero observed frequency. Interviewed citizens belong to the 27 countries of the European Union; therefore, we analyzed a 501 × 27 contingency table. The 27-category variable indicating citizenship is our dependent variable. Even though we eliminated all profiles with all-zero observed frequencies, the table is sparse and unbalanced, which makes traditional methods for the analysis of this type of data, for example multinomial regression, inadequate. For instance, we started by estimating a hierarchical multinomial regression model using up to five interactions among the predictor variables (the ten binary responses) and considering country as the dependent variable, but the estimation failed since the parameters could not be computed (see, for more details on this problem, Vera et al. 2014).
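The construction of the profile-by-country table can be sketched in a few lines of Python. The sketch assumes unweighted respondent records given as a country code plus a tuple of ten 1/2 answers; the record layout, the function name and the three example respondents are hypothetical, and survey weights are ignored for simplicity.

```python
from collections import Counter
from itertools import product

N_ITEMS = 10  # ten binary items, coded 1 = yes, 2 = no

def build_profile_table(respondents, countries):
    """Cross-classify answer profiles by country, keeping only observed profiles.

    respondents: iterable of (country, answers) pairs, answers being a tuple of ten codes.
    Returns the sorted list of observed profiles and the profile-by-country count table.
    """
    counts = Counter((tuple(answers), country) for country, answers in respondents)
    profiles = sorted({profile for profile, _ in counts})
    table = [[counts[(p, c)] for c in countries] for p in profiles]
    return profiles, table

# Number of possible profiles with ten binary items.
n_possible = len(list(product((1, 2), repeat=N_ITEMS)))

# Three hypothetical respondents (illustrative only).
sample = [
    ("IT", (1,) * 10),
    ("IT", (1,) * 10),
    ("MT", (2,) * 9 + (1,)),
]
profiles, table = build_profile_table(sample, ["IT", "MT"])
```

Profiles that nobody reported never enter the table, which mirrors the reduction from the 1,024 possible profiles to the observed ones; the resulting table can still contain many zero cells.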

We started our subsequent analyses by selecting the number of clusters for the profiles (denoted K here and T in Sect. 3). Figure 1 shows the BIC* index values obtained when the LCDA model was run without imposing geometrical constraints up to K = 50 clusters. The lowest BIC* value was achieved for K = 26 groups (BIC* = 29,286), followed by K = 19 (BIC* = 29,344). According to Fig. 1, and since the difference between the two values is very small, the value of K = 19 was chosen for the sake of easier interpretation. Thus, the LCDA model was run for K = 19 in two, three and four dimensions. The model with three dimensions was selected, as it showed the lowest BIC* value. To minimize the problem of local optima, the model was run for these parameters and the best solution in 20 replications was considered.

Fig. 1 BIC* values for testing the number of clusters up to K = 50

The relevant results of the estimation of the best fitting LCDA model to our data involve many parameters. Table 3 reports some of them related to the 19 clusters: αt, the estimated latent class effects, the number of profiles per cluster (Np), the percentage of respondents per cluster (Nr), and γt, the estimated prior probabilities; the overall estimated effect μ is equal to 18.311. Table 6 in the Appendix lists the other relevant parameters: the estimated expected frequencies µij and βj, the estimated column effects (see Eq. 1).

Table 3 Results of LCDA model estimation. αt: estimated latent class effects, Np: number of profiles per cluster, Nr: percentage of respondents per cluster, γt: estimated prior probabilities

Table 7 in the Appendix reports the squared Euclidean distances between each pair of cluster and European country; the smaller the distance, the stronger the association for the pair; the smallest distance for each cluster is shown in bold, the smallest distance for each country is shown in italics. Figure 2 contains the graphical representation of the distances in a three-dimensional setting. Figure 2 clearly shows, for example, that Malta occupies an outlier position in the graph, as does cluster 6. We will comment on this in the following detailed description of the 19 clusters.

Fig. 2 Distances among clusters of profiles and countries in a three-dimensional setting

Distances are very useful to calculate odds; for example, by Eq. (1), the log-odds of a response j against a response j’ for a specific class t are given by

$$log\left(\frac{{\mu }_{tj}}{{\mu }_{tj{\prime}}}\right)=log\left({\beta }_{j}\right)-log\left({\beta }_{{j}{\prime}}\right)-{d}_{tj}^{2}+{d}_{tj{\prime}}^{2}.$$
(5)

The odds are a function of the main effect parameters and of the distances; when calculated, they can give important insights into the relationships among the variables under study. According to Vera et al. (2014), the odds ratios can be defined in terms of the squared distances appearing in Table 7:

$$\frac{{\mu }_{tj}{\mu }_{t{\prime}j{\prime}}}{{\mu }_{tj{\prime}}{\mu }_{t{\prime}j}}=exp\left(-{d}_{tj}^{2}-{d}_{{t}{\prime}{j}{\prime}}^{2}+{d}_{t{j}{\prime}}^{2}+{d}_{t{\prime}j}^{2}\right).$$
(6)
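Both quantities follow directly from Eq. (1): taking logarithms gives log μtj = log μ + log αt + log βj − d²tj, so the overall scale and the class effect cancel in the odds, and the column effects also cancel in the odds ratio. The following Python sketch, with hypothetical function names and invented values, computes them and checks them against frequencies built from Eq. (1).

```python
import math

def log_odds(beta_j, beta_k, d2_tj, d2_tk):
    """log(mu_tj / mu_tk) for class t: mu and alpha_t cancel, leaving betas and distances."""
    return math.log(beta_j) - math.log(beta_k) - d2_tj + d2_tk

def odds_ratio(d2_tj, d2_sk, d2_tk, d2_sj):
    """Odds ratio of Eq. (6) for classes t, s and response categories j, k."""
    return math.exp(-d2_tj - d2_sk + d2_tk + d2_sj)

def expected_frequency(mu, alpha, beta, d2):
    """Eq. (1) written in terms of a squared distance d2."""
    return mu * alpha * beta * math.exp(-d2)

# Hypothetical parameters for two classes (t, s) and two categories (j, k).
mu, alpha_t, alpha_s = 18.0, 1.5, 0.7
beta_j, beta_k = 0.9, 1.1
d2_tj, d2_tk, d2_sj, d2_sk = 0.2, 1.4, 2.0, 0.3

m_tj = expected_frequency(mu, alpha_t, beta_j, d2_tj)
m_tk = expected_frequency(mu, alpha_t, beta_k, d2_tk)
m_sj = expected_frequency(mu, alpha_s, beta_j, d2_sj)
m_sk = expected_frequency(mu, alpha_s, beta_k, d2_sk)

direct_log_odds = math.log(m_tj / m_tk)
direct_odds_ratio = (m_tj * m_sk) / (m_tk * m_sj)
```

An odds ratio above one means that category j is relatively more likely than k in class t than in class s, which happens when t lies closer to j and s closer to k.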

Table 4 describes the 19 clusters in terms of the probability of a yes answer to each of the ten binary variables; these probabilities are useful for interpreting the clusters of respondents obtained with the LCDA model. Considering the cluster to which each respondent is assigned given the specific answers to the ten items, it was also possible to interpret these groups in terms of the distribution of the considered socio-demographic characteristics, as reported in the subsequent paragraphs. In the Supplementary material of the paper, we provide the graphical representation of the probabilities listed in Table 4. Each graph refers to one of the 19 clusters: on the x-axis, the two possible answers (1 = yes, 2 = no) to the 10 binary variables are reported; on the y-axis, we have the response probabilities. The graphs make the response profile of each cluster immediately visible.

Table 4 Clusters’ profiles. Conditional probabilities of YES answer by cluster

In the following, the 19 clusters of profiles are described, both looking at the answers given to the ten variables related to sustainable actions for travelling and taking holidays and to socio-demographic characteristics of the typical citizen belonging to each cluster. Associations with European countries are also reported looking at the values of the Euclidean distances.

Cluster 1 collects more than half of the possible response profiles, which were reported by 4.05% of the interviewed sample; for all investigated sustainable actions, the probabilities of a yes answer are very similar to the probabilities of a no answer and close to 50%. This cluster represents European citizens who declared that they were prepared to change their habits but were then uncertain about performing the specific sustainability actions referred to in the survey. The typical respondent in this group is female, a manual worker, aged between 55 and 64, single, lives in a large town and, before the pandemic, used to travel once or twice a year. This cluster is associated with Austria and Luxemburg. This result means that in these two countries, citizens are developing a positive attitude towards sustainability in tourism but have not yet completely translated this attitude into practice.

13.68% of European citizens are assigned to cluster 2; they all declared that they are not yet prepared to change their travel and holiday habits, and they do not intend to perform any of the circular economy (CE) actions proposed in the questionnaire. The typical respondent in this cluster is male, a manual worker, and never travelled before the pandemic. This cluster is associated with many EU MSs: Belgium, Cyprus, France, Lithuania, and the Netherlands.

The 6.39% of citizens belonging to cluster 3 declared that they are prepared to change all their habits with reference to traveling in order to be more sustainable. The typical respondent is female, between 25 and 34, living in a large town, in a household with three members over 14, working as an employee, and travelling several times a year before Covid-19. This cluster is associated with Sweden.

Cluster 4 has a proportion of 3.53%. Half of the citizens in this cluster are prepared to travel to less visited destinations and to reduce waste while on holidays. They are not prepared to perform the other seven sustainable actions. The typical member of this cluster is a woman, 45–54 years, manual worker, living in a small or medium-sized town, with a family with three members over 14, never travelling before the pandemic. This profile is associated with Bulgaria and Italy.

In cluster 5, we find 5.56% of citizens; half of them are prepared to take holidays outside the high tourist season and to consume locally sourced products while on holiday; they are not prepared to take any other action. Typical socio-demographic characteristics are being female, between 55 and 64, living in a small or medium-sized town, in a family with three members, self-employed, single, and traveling once or twice a year before Covid-19. This cluster of profiles is associated with Croatia, Estonia and Latvia.

Cluster 6 covers 1.35% of the sample. This cluster has the same profile as cluster 3, except for variable 9, meaning that these citizens are prepared to change their behavior to a more sustainable form of tourism with reference to all actions except reducing water usage while on holiday. The typical respondent is a woman, between 25 and 34, living in a small or medium-sized town, in a family with three members, never traveling before the pandemic, and self-employed. This cluster is not specifically associated with any country, even if the smallest distance is with Romania; this cluster occupies an outlier position in the graph reported in Fig. 2, indicating that this type of citizen is present in a small percentage in many countries.

Cluster 7 refers to 1.82% of the respondents. All citizens in this cluster are prepared to choose transport options based on ecological impact, but they are not prepared to perform the other eight sustainable actions. The typical citizen is male, between 45 and 54 years, in a family with three members over 14, living in a small or medium-sized town, without a professional activity, never travelling before the pandemic. This cluster is associated with Denmark.

Cluster 8 has a proportion of 3.65% of respondents. In this cluster, all citizens are prepared to travel to less visited destinations, to consume locally sourced products and to reduce waste while on holiday; the majority of them are also prepared to take all the other actions. The typical citizen is male, between 25 and 44 years, living in a rural area, in a family with four or more members, working as an employee, travelling several times a year before the pandemic. This cluster is associated with Slovakia.

Cluster 9 covers 3.87% of the sample. This cluster has the same profile as cluster 2, except for variable 10, meaning that these citizens are prepared to change their behavior to a more sustainable form of tourism but not with reference to the specific actions mentioned in the survey. The typical respondent is a woman, older than 64, without a professional activity, never traveling before the pandemic, living in a large town, in a family with four or more members. This cluster is associated with Belgium.

Citizens in cluster 10 (8.14%) mention all actions; the profile is very similar to that of cluster 1, but with almost all citizens intending to consume locally sourced products and reduce waste while on holiday. The typical citizen is female, between 55 and 64 years, living in a household with three members over 14 and in a rural area or small village, working as an employee, travelling once every few years before the pandemic. This cluster is associated with Spain, Slovenia and Slovakia.

Cluster 11 has a proportion of 2.89% in terms of respondents. These citizens are not prepared to perform five actions; however, all of them intend to consume locally sourced products while on holiday, and the majority of them intend to take holidays outside the high season, travel to less visited destinations, and reduce waste while on holiday. The typical citizen is female, between 35 and 44 years, living in a household with two persons over 14 and in a small or medium-sized town, not employed, travelling several times a year before the pandemic. This cluster is associated with Hungary.

In cluster 12 (3.73% of respondents), citizens are prepared to perform all nine actions, but the percentages are all lower than 50%. The typical citizen is female, between 25 and 34 years, living in a household with three members over 14 and in a rural area, working as an employee, travelling many times a year before the pandemic. This cluster is associated with Ireland.

In cluster 13 (5.84%), the majority of the citizens are prepared to consume locally sourced products and to reduce waste while on holiday; the other actions show very low probabilities. The typical citizen is male, between 15 and 24 years, living in a household with four or more members over 14 and in a small or medium-sized town, travelling several times a year before the pandemic, manual worker. This cluster is associated with Italy.

Citizens in cluster 14 (3.72%) are not prepared to perform five out of nine actions; however, the majority of them are in favor of taking holidays outside the high tourist season and consuming locally sourced products while on holiday; a small percentage also intends to travel to less visited destinations and to reduce waste while on holiday. This cluster is very similar to cluster 11: the actions chosen by citizens are the same, but the probabilities are much lower. The typical citizen is male, between 55 and 64 years, living in a couple and in a small or medium-sized town, manual worker, travelling several times a year before the pandemic. This cluster is associated with Italy.

The majority of citizens in cluster 15 (5.06%) are prepared to pay more for the benefit of the environment and of the local community, to travel to less visited destinations, to consume locally sourced products and to reduce waste while on holiday; a lower percentage of them are also in favor of the three other actions. The typical respondent is a man, younger than 25, living in a rural area, traveling several times a year before the pandemic, working as an employee, single. This cluster is associated with Germany.

In cluster 16 (7.87%), all sustainable actions are chosen with low probabilities (lower than 40%). The typical citizen is female, between 15 and 24 years, living in a small or medium-sized town and in a family with three members, working as an employee, travelling once every few years before the pandemic. This cluster is associated with Malta and Portugal.

Cluster 17 (5.90%): citizens are not prepared to travel to less visited destinations, and only very small percentages of them (lower than 25%) are prepared to adopt the other eight actions. The typical citizen is male, between 15 and 24 years, living in a household with four or more members over 14 and in a small or medium-sized town, without a professional activity, travelling once every few years before the pandemic. This cluster is associated with Italy.

In cluster 18 (3.41%), all citizens are prepared to pay more to protect the natural environment, take holidays outside the high tourist season, contribute to carbon offsetting activities, consume locally sourced products on holiday and reduce waste and water usage while on holiday; the majority of them are also prepared to perform the other three sustainable actions. The typical citizen is female, between 55 and 64 years, living in a household with two members over 14 and in a large town, working as an employee, travelling once every few years before the pandemic. This cluster is associated with Romania.

In cluster 19 (8.46%), all citizens are prepared to consume locally sourced products on holiday and reduce waste and water usage; the majority of them are also prepared to perform all the other seven sustainable actions. The typical citizen is female, between 15 and 24 years, living in a household with two members over 14 and in a village or in a rural area, without a professional activity, travelling once or twice a year before the pandemic. This cluster is associated with Greece, the Czech Republic and Poland.

Summarizing, cluster 3 refers to those European citizens who show the best attitude towards sustainability practices in travelling and taking holidays. Citizens in clusters 6 and 18 also have a very good disposition towards all actions proposed in the survey. These clusters are associated with Sweden and Romania. Cluster 2, on the other hand, refers to those citizens who appear more distant from this behavior; it is the largest cluster in terms of citizens, but it refers to only one specific profile. This cluster is associated with a large group of EU countries: Belgium, Cyprus, France, and the Netherlands. Cluster 1 groups the largest number of different profiles, all related to citizens with a good disposition towards all the circular economy practices investigated in the survey. The other 16 clusters refer to groups of respondents with various attitudes to this topic; differences depend both on the number of actions that citizens in those clusters are prepared to perform, as outlined in Table 5, and on the specific actions that are preferred.

Table 5 Clusters by average number of actions that tourists are prepared to take

Table 5 gives further information about the 19 clusters: they are ordered by the average number of actions that tourists assigned to each cluster declared they were prepared to perform in order to be more sustainable. In this table, we also report the number of profiles associated with each cluster and its relative size in terms of sampling units.

Clusters 2 and 9, associated with the willingness to perform no actions, account for over 17% of citizens, but the “best” cluster (3, willingness to perform all actions) also shows a non-negligible percentage of respondents. The other clusters can be divided into groups that are similar in the number of actions indicated by the respondents; what differs are the specific actions chosen, and this detail might be very important for strategic management.

Figure 3 reports the profiles of 10 out of the 19 clusters, those that are most representative of the different intentions of EU citizens towards ST: the x-axis shows the nine actions, and the y-axis shows the probability of a yes answer to the question about the intention to adopt each action. A graph with all profiles would not be readable; therefore, in the Supplementary material of the paper, we report the 19 graphs with all single profiles, so that they can be clearly appreciated and compared.

Fig. 3 Profiles of 10 “representative” clusters

As regards the association between clusters and countries, some specific results deserve attention. For example, the fact that Romania is associated with cluster 18 (tourists very keen on sustainability practices) is rather novel, since there is much evidence in the recent literature that CE practices are not adopted in Eastern European countries (Bassi and Dias 2020); this result deserves further analysis.

Some EU countries are associated with only one cluster of citizens, whereas other countries are associated with several clusters, indicating a higher level of heterogeneity among their citizens. Italy is associated with four clusters; Belgium, Romania and Slovakia with two each.

With the parameters estimated by the best-fitting LCDA model, listed in Tables 3 and 6, and the squared Euclidean distances reported in Table 7, it is possible to calculate odds and odds ratios directly from the distances, as reported in Eqs. (5) and (6). As an example, we calculate the log odds, for a European citizen, of living in Austria rather than in Belgium, given membership of cluster 1.

$$\log\left(\frac{\mu_{1AT}}{\mu_{1BE}}\right) = \log\left(\beta_{AT}\right) - \log\left(\beta_{BE}\right) - d_{1AT}^{2} + d_{1BE}^{2} = \log\left(1.213\right) - \log\left(0.954\right) - 0.14 + 1.43 = 1.39.$$

Therefore, the odds of being a citizen of Austria rather than Belgium, given assignment to latent cluster 1, are equal to exp(1.39) ≈ 4.01, which means that, within cluster 1, a respondent is four times more likely to come from Austria than from Belgium. In general, odds ratios measure the relationship between the clusters and the European countries. An odds ratio compares the odds that a citizen in a certain cluster i comes from a specific European country j rather than another country j’ with the odds that a citizen in another cluster i’ comes from country j rather than country j’.
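In symbols, the verbal definition above can be sketched as follows. This is a restatement inferred from the worked examples in this section, not a quotation of Eq. (6); it assumes the same symbols as those examples, with the first subscript indexing the cluster and the second the country:

$$\frac{\mu_{ij}\,\mu_{i'j'}}{\mu_{ij'}\,\mu_{i'j}} = \exp\left(-d_{ij}^{2} - d_{i'j'}^{2} + d_{ij'}^{2} + d_{i'j}^{2}\right).$$

The country effects (the β terms in the log odds) cancel in this ratio, which is why only the squared distances appear.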

As an example, we calculate an odds ratio, again contrasting Austria and Belgium but now also comparing cluster membership: we consider the best cluster (3) and the worst one (2) in terms of citizens’ attitude towards the environment. The odds that a citizen classified in cluster 3 lives in Austria instead of Belgium are

$$\frac{\mu_{3AT}\,\mu_{2BE}}{\mu_{3BE}\,\mu_{2AT}} = \exp\left(-d_{3AT}^{2} - d_{2BE}^{2} + d_{3BE}^{2} + d_{2AT}^{2}\right) = \exp\left(-1.45 - 0.12 + 1.74 + 1.68\right) = 5.03$$

times the odds that a citizen classified in cluster 2 lives in Austria instead of Belgium.

Referring to Tables 3, 6, and 7, all combinations of odds and odds ratios can be calculated, allowing specific comparisons between latent clusters and European countries.
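The calculations described above can be sketched in a few lines of Python. The snippet is only an illustration: the function names `log_odds` and `odds_ratio` and the country labels are hypothetical, and the numeric inputs are made-up placeholders, not the estimates in Tables 3, 6 and 7. Those estimates would have to be substituted to reproduce the reported values.

```python
import math
from itertools import combinations

# Placeholder inputs for illustration only; they are NOT the estimates
# reported in Tables 3, 6 and 7.
beta = {"A": 1.5, "B": 0.8, "C": 1.0}  # country main effects (hypothetical)
d2 = {                                  # squared cluster-country distances (hypothetical)
    (1, "A"): 0.3, (1, "B"): 1.1, (1, "C"): 2.0,
    (2, "A"): 1.6, (2, "B"): 0.2, (2, "C"): 0.9,
}

def log_odds(i, j, j_ref):
    """Log odds that a citizen in cluster i lives in country j rather than j_ref."""
    return math.log(beta[j]) - math.log(beta[j_ref]) - d2[(i, j)] + d2[(i, j_ref)]

def odds_ratio(i, i_ref, j, j_ref):
    """Odds of j versus j_ref in cluster i, relative to the same odds in cluster i_ref."""
    return math.exp(-d2[(i, j)] - d2[(i_ref, j_ref)] + d2[(i, j_ref)] + d2[(i_ref, j)])

# Enumerate every pair of countries, comparing cluster 1 with cluster 2.
for j, j_ref in combinations(sorted(beta), 2):
    odds = math.exp(log_odds(1, j, j_ref))
    print(j, j_ref, round(odds, 3), round(odds_ratio(1, 2, j, j_ref), 3))
```

Keeping the distances in a dictionary keyed by (cluster, country) mirrors the layout of a distance table, so the same two functions cover any pair of clusters and any pair of countries.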

Odds ratios can also shed some light on clusters and EU countries that appear as outliers in Fig. 2. For example, the odds that a citizen classified in cluster 6 (the cluster with the largest distances from all EU countries) lives in Belgium instead of Austria are

$$\frac{\mu_{6BE}\,\mu_{3AT}}{\mu_{6AT}\,\mu_{3BE}} = \exp\left(-d_{6BE}^{2} - d_{3AT}^{2} + d_{6AT}^{2} + d_{3BE}^{2}\right) = \exp\left(-3.01 - 1.74 + 5.17 + 0.74\right) = 3.15$$

times the odds that a citizen classified in cluster 3 (that differs from cluster 6 only for one response variable) lives in Belgium instead of Austria.

The European country that shows the largest distance from all latent clusters is Malta; below, we calculate the odds ratio for Austria and Malta and clusters 2 and 3:

$$\frac{\mu_{3AT}\,\mu_{2MT}}{\mu_{3MT}\,\mu_{2AT}} = \exp\left(-d_{3AT}^{2} - d_{2MT}^{2} + d_{3MT}^{2} + d_{2AT}^{2}\right) = \exp\left(-1.45 - 4.11 + 6.86 + 1.68\right) = 8.02.$$

Odds ratios make it possible to identify the relative proportion of citizens belonging to a specific cluster, relative to another cluster chosen as a benchmark, when comparing two EU countries.

5 Conclusions

In this paper we analyze tourism in the countries of the European Union in terms of citizens' opinions about sustainable travel, as well as the willingness of European citizens to change their tourist habits in the near future to make them more sustainable. To this end, ten binary questions about the willingness of interviewed citizens to change travel and tourism habits to be more sustainable are taken from the Flash Eurobarometer 499 survey on attitudes of European citizens towards tourism and analyzed in a person-oriented approach. We thus focus on the patterns of individual characteristics relevant for the study, which generate the profiles. Since the data set is sparse, the clustering of profiles and the analysis of associations between the clusters of profiles and the countries are carried out simultaneously using the latent class distance association model. The procedure allows the analysis of cross-classified sparse data with a categorical response variable by grouping profiles, while facilitating the interpretation of associations between cluster centers and categories of a response variable by unfolding. Since the odds ratios depend on the estimated distances of the representation, the model also facilitates the choice of the reference modalities by means of the graphical representation of the associations.

The latent class distance association model identified 19 clusters of profiles corresponding to 19 groups of European citizens. These clusters describe people with different levels of commitment towards the environment and, specifically, with different levels of preparedness to perform actions related to travel and tourism that could preserve natural resources. These groups range from citizens who are prepared to change their habits with reference to all sustainable actions proposed in the survey to citizens who do not wish to change at all. In between, the model identified 17 other clusters that gather European citizens committed to different specific sustainable actions related to traveling and taking holidays.

The latent class distance association model also estimated the associations between each cluster and each of the 27 EU countries. Citizens more committed to environmentally friendly behavior live in Sweden and Romania; citizens less willing to change their habits towards more sustainable behavior live in Belgium, Cyprus, the Czech Republic, France, Lithuania, and the Netherlands.

Citizens’ preparedness to change habits, however, also depends on their socio-demographic characteristics, such as gender, age, occupation, type of community where they live, household size, and the frequency of travelling before the Covid-19 pandemic. Respondents who did not use to travel are less interested in the topic of sustainable tourism. Female and younger respondents, as well as those who live in large towns, are keener to adopt a more sustainable behavior. The youngest citizens, however, do not like practices that increase prices.

Implementing sustainability in the tourism sector can not only preserve the natural environment but also attract tourists and increase their satisfaction. It is strategically important for tourist destinations and firms to know customers’ attitudes towards the circular economy and towards specific sustainability actions.