The hierarchical SMAA-PROMETHEE method applied to assess the sustainability of European cities

Measuring the level of sustainability while taking into account its many contributing aspects is a challenge. In this paper, we apply a multiple criteria decision aiding framework, namely, the hierarchical-SMAA-PROMETHEE method, to assess the environmental, social, and economic sustainability of 20 European cities in the period from 2012 to 2015. The application of the method is innovative for the following reasons: (i) it allows the sustainability of the mentioned cities to be studied not only comprehensively but also with respect to each particular macro-criterion, thus providing more specific information on their weak and strong points; (ii) the use of PROMETHEE and, in particular, of PROMETHEE II, avoids the compensation between different and heterogeneous criteria that is arbitrarily assumed in value function aggregation models; finally, (iii) thanks to the application of the Stochastic Multicriteria Acceptability Analysis, the method provides more robust recommendations than a method based on a single instance of the considered preference model compatible with the few preference information items provided by the Decision Maker.


Introduction
Cities are nowadays considered the most important source of environmental pollution and, as acknowledged by [54], although urban spaces cover only 2% of the earth's surface, they consume 60-80% of all goods. Consequently, over the years many different objectives have been defined to reduce the pollution produced by cities and to make them develop with respect for the environment. This is also the reason why, on September 25, 2015, the 2030 Agenda specified, among others, the sustainable development goal (SDG) 11 regarding cities [1]. Therefore, the evaluation of the level of sustainability of cities has become a crucial issue.
Even if there is no univocal definition of the term sustainability, three main types of sustainability are commonly considered: environmental, economic, and social [39,45]. Because the level of sustainability is measured by means of several indicators, studying the sustainability of cities at both comprehensive and partial levels can be considered a typical Multiple Criteria Decision Aiding (MCDA; [29]) problem [37].
In this paper, we promote the use of a method recently proposed in the literature, the hierarchical-SMAA-PROMETHEE method [3], to evaluate the sustainability of cities; the method can analogously be applied to countries or regions. To this aim, we applied the method to study the sustainability of 20 European cities in the 2012-2015 period, taking into account 9 indicators grouped into three macro-criteria, that is, environmental, economic, and social. The method combines three MCDA methodologies, namely, the Multiple Criteria Hierarchy Process (MCHP) [16], the PROMETHEE II method [7], and the Stochastic Multicriteria Acceptability Analysis (SMAA) [34], taking advantage of their main strengths. This methodological combination is innovative and yields an added value that this study aims to show. The benefits resulting from the application of the hierarchical-SMAA-PROMETHEE method in the considered context are the following:
• The use of the PROMETHEE methods and, in particular, of PROMETHEE II, makes it possible to aggregate the evaluations on multiple criteria while avoiding compensation between them. Indeed, following [38], in measuring sustainability "compensability should be avoided". Moreover, PROMETHEE II provides a complete ranking of the alternatives, so that the position of each city can be determined.
• The use of the MCHP makes it possible to obtain ranking information not only at the comprehensive level but also at the environmental, economic, and social levels. By integrating the MCHP with the PROMETHEE II method, one can rank-order the cities not only comprehensively but also with respect to each individual macro-criterion, thus learning their weak and strong points.
• The use of SMAA makes it possible to provide robust recommendations concerning the considered sustainability evaluation. Indeed, instead of choosing a single vector of weights for the nine indicators, SMAA ranks the considered cities at both comprehensive and partial levels using a large set of possible weight vectors. The output information provided by SMAA is presented in statistical terms, specifying the probability with which a city gets a certain rank position or the frequency with which a city is preferred to another one. This statistical information can then be used to obtain a robust ranking of the considered cities at both comprehensive and partial levels.
The paper is organized as follows: in the next Section, we briefly review the literature on sustainability evaluation of cities, countries, and regions; in Section 3, we recall the methodological background, including, in particular, the three methods composing the hierarchical-SMAA-PROMETHEE method; in Section 4, we apply the proposed method to evaluate the sustainability of 20 European cities at both comprehensive and partial levels, showing the potential of the method; finally, in Section 5, we draw conclusions and indicate some future research directions.

Literature review
The number of papers presenting applications of MCDA methods to assess sustainability in different fields is quite large (see, for example, [15,22,24,55,62]). In the following, without any ambition of being exhaustive, we review a few of these papers regarding the sustainability evaluation of cities, countries, and regions. As will become evident later, they differ in the type of sustainability studied, the chosen indicators, and the method used to aggregate the alternatives' performances [57].
Munda and Saisana [39] compared 25 regions in the Mediterranean area (17 Spanish, 4 Italian, and 4 Greek) based on 29 indicators representing economic, social, and environmental aspects. Compensatory and noncompensatory aggregations have been used to underline that the obtained results depend on the chosen method; Data Envelopment Analysis (DEA; [12]) has also been used to study the efficiency of the same regions.
Phillis et al. [44] evaluated the sustainability of 106 cities and megacities all over the world by a fuzzy model called SAFE (Sustainable Assessment by Fuzzy Evaluation). The sustainability of these cities has been studied based on 46 indicators belonging to two macro-inputs, that is, ecology and well-being. A sensitivity analysis has been performed to highlight which indicators most influence the degree of sustainability of the considered cities. The same model has been used to compare the sustainability of 128 countries considering 75 indicators in [43].
Using an intuitionistic fuzzy approach, [23] assessed the environmental sustainability of 27 U.S. and Canadian metropoles. 16 sustainability indicators were taken into account in the paper while the analysis was performed based on experts' judgments used to assign weights to these indicators.
Chen and Zhang [13] studied the sustainability of 14 Chinese cities in Liaoning province considering 21 indicators grouped into economic, social, and environmental macro-criteria in the 2013-2017 period. Interaction between indicators, distinguished into static interaction and dynamic trend similarity, has been taken into account. After normalizing the evaluations of the cities by min-max normalization, the IOWA (Induced Ordered Weighted Averaging; [63]) operator has been used to aggregate them; the IOWA operator has also been applied in [65] to analyze the sustainable development of 13 Chinese cities in the 2011-2016 period. To this aim, 18 criteria concerning economic, social, and environmental aspects have been taken into account. Weights assigned to the different indicators were used to express their interdependence.
Deng et al. [21] assessed the sustainability of four large-sized Chinese cities (Beijing, Chongqing, Shanghai, and Tianjin) in three different years (2005, 2010, and 2015). The evaluations of the cities on 18 indicators divided into four macro-areas have been aggregated by the arithmetic mean after min-max normalization. The same preference model has been used by [5] to evaluate the sustainability of 92 municipalities located in the Umbria region (Italy), using 18 indicators (9 environmental and 9 socio-economic). After putting all evaluations in the [0,1] interval by a standardization method, the trade-off weights needed to apply the weighted sum have been obtained using the SWING method [61].
Yi et al. [64] evaluated the sustainability of 14 cities in the Liaoning province of China in the 2011-2016 period. A weighted sum has also been used in this case to aggregate the evaluations of the considered cities on 21 indicators using equal weighting. By stochastic simulations, the authors also forecasted the sustainability of the same cities in the following years.
The sustainability of 4 metropolitan areas (Bari, Bitonto, Mola, and Molfetta) in the south of Italy has been studied by [10]. The AHP method [51] has been applied to the data set composed of 35 indicators belonging to seven different dimensions; AHP has been used in [20] and [46] as well. On the one hand, [20] evaluated the sustainability performance of the 28 European countries from environmental and energetic perspectives considering 9 indicators; on the other hand, [46] proposed an approach aiming to assess the sustainability and livability of cities based on cognitive mapping. The analysis took into consideration 6 macro-criteria.
Zhang et al. [66] evaluated the sustainability of 13 Chinese cities in the Jiangsu province with respect to 30 indicators by using the Choquet integral [14]. The evaluations of the considered cities have been normalized by min-max normalization, and then aggregated considering weights assigned to the indicators and to all possible coalitions of criteria; these weights were derived objectively from the data at hand and, therefore, without any judgment of the Decision Maker (DM). The Choquet integral has also been used in conjunction with cognitive mapping by [8] and [11]. In the first paper, the "greenness" of 20 Portuguese cities has been evaluated, while, in the second, the "smartness" of the same 20 Portuguese cities with respect to 6 macro-criteria has been analyzed.
Evaluation of the urban sustainable development of 16 Chinese cities in Anhui province has been performed based on 39 indicators divided into economic, social, and environmental in [56]. The evaluations of the cities on the considered indicators have been computed by an integrated framework composed of the TOPSIS method [31] and the grey relational analysis, while weights of criteria have been obtained by the entropy method [67].
Paolotti et al. [41] assessed the sustainability level of the 20 Italian regions and the 17 Spanish autonomous communities on the basis of 18 indicators belonging to economic, social, and environmental macro-criteria. The analysis has been performed based on the preferences of 8 experts (4 from Italy and 4 from Spain), while the results have been obtained by the GeoUmbriaSUIT [6] which integrates GIS and MCDA [35]. In particular, TOPSIS was used again as a preference model, while the weights necessary to apply the method have been obtained by SWING.
Antanasijevic et al. [2] measured the sustainability of 30 European countries in the 2004-2014 period. The analysis involved 38 indicators grouped into 8 subgroups. The authors applied PROMETHEE II as a preference model and provided a ranking of the considered countries for the period 2004-2014 and two shorter periods, being 2004-2009 and 2010-2014, respectively. Analogously, PROMETHEE II has been used to assess the sustainable energy transition readiness level of 14 countries belonging to all continents by [40]. 8 indicators have been considered in the study and AHP has been used to get the weights of indicators necessary to apply the PROMETHEE II method, which is admittedly not a correct combination [48].
Finally, [52] applied ELECTRE III [25] to evaluate the sustainability of some megacities, such as New York and Los Angeles, considering their evaluations on 12 indicators belonging to environmental, economic, social, and smart macro-criteria.
The above literature review shows that none of the MCDA methods used in these studies makes it possible, at the same time, to consider the sustainability of cities at the global level as well as at the level of macro-criteria, to use a noncompensatory aggregation, to take into account preference information expressed by the DM in an indirect way, and to provide robust recommendations following from not one but a plurality of instances of the assumed preference model compatible with the available preference information. Therefore, in this paper, we aim to fill this gap in the literature by applying the hierarchical-SMAA-PROMETHEE method, which addresses all the mentioned issues simultaneously.

Methodological background
In this Section, we shall recall briefly the methods composing the hierarchical-SMAA-PROMETHEE method that will be applied in the considered case study, namely the PROMETHEE II (Section 3.1), the Multiple Criteria Hierarchy Process (Section 3.2), and the Stochastic Multicriteria Acceptability Analysis (Section 3.3).

Multiple criteria decision aiding and the PROMETHEE II method
Multiple Criteria Decision Aiding (MCDA; [29]) methods are designed to deal with ranking, choice, or sorting problems. In this paper, we are interested in ranking, since we aim to order several European cities from the best to the worst with respect to their sustainability at both comprehensive and partial levels, that is, economic, social, and environmental. In the following, we denote by $A = \{a, b, c, \ldots\}$ the set of alternatives and by $G = \{g_1, \ldots, g_m\}$ the set of $m$ criteria on which the alternatives are evaluated. For the sake of simplicity and without loss of generality, we assume that each criterion $g_j$ is a real-valued function, that is, $g_j : A \to \mathbb{R}$, and that it has an increasing direction of preference (the greater $g_j(a)$, the better $a$ is on $g_j$).
Considering the performance matrix composed of the evaluations of the cities on the criteria at hand, the only objective information that can be obtained is the dominance relation $D$, such that, for all $a, b \in A$, $a D b$ if and only if $g_j(a) \geq g_j(b)$ for all $g_j \in G$ and there is at least one $g_j \in G$ such that $g_j(a) > g_j(b)$. The objectivity of this relation is counterbalanced by its poverty, since in most cases neither $a$ dominates $b$ nor vice versa, which means that many alternatives are non-comparable by the dominance relation. To make the alternatives more comparable, an aggregation method has to be used, summarizing the information included in the performance matrix on the one hand, and the preference information provided by the DM on the other hand. Historically, three different aggregation methodologies have been considered: (i) value function methods belonging to Multiple Attribute Value Theory (MAVT; [33]), (ii) methods based on outranking relations, such as ELECTRE [25,26] and PROMETHEE [4,7], and (iii) methods based on induction of decision rules [27]. In this paper, we aggregate the evaluations of the alternatives on the considered criteria by using the second approach and, in particular, the PROMETHEE II method. We recall this method below.
PROMETHEE II builds a complete order of the alternatives at hand based on their comparison through the net flow. The net flow of each alternative $a$ is computed in the following steps:
1. For each criterion $g_j$ and each pair of alternatives $(a, b) \in A \times A$, the preference function $P_j(a, b)$ is computed first. It is a non-decreasing function of the difference $g_j(a) - g_j(b)$ expressing the degree of preference of $a$ over $b$ on $g_j$. Six different types of function $P_j(a, b)$ have been defined by [7]. The most frequently used is the V-shape function with indifference zone, defined as follows:
\[
P_j(a, b) =
\begin{cases}
0 & \text{if } g_j(a) - g_j(b) \leq q_j, \\
\dfrac{g_j(a) - g_j(b) - q_j}{p_j - q_j} & \text{if } q_j < g_j(a) - g_j(b) < p_j, \\
1 & \text{if } g_j(a) - g_j(b) \geq p_j.
\end{cases}
\]
Hence, $P_j(a, b) \in [0, 1]$ and, the greater $P_j(a, b)$, the more $g_j$ is in favor of the preference of $a$ over $b$. In the definition of $P_j(a, b)$, $q_j$ and $p_j$ are, respectively, the indifference and the preference thresholds related to $g_j$ (see [50]). They are such that $0 \leq q_j < p_j$; $q_j$ is the maximum difference between evaluations $g_j(a)$ and $g_j(b)$ compatible with their indifference on $g_j$, while $p_j$ is the minimum difference between $g_j(a)$ and $g_j(b)$ compatible with the preference of $a$ over $b$ on this criterion.
2. For each ordered pair of alternatives $(a, b) \in A \times A$, the aggregated preference $\pi(a, b)$ of $a$ over $b$ is computed:
\[
\pi(a, b) = \sum_{j=1}^{m} w_j P_j(a, b),
\]
where $w_j > 0$ is a relative importance weight assigned to criterion $g_j$, such that $\sum_{j=1}^{m} w_j = 1$. Also $\pi(a, b) \in [0, 1]$ and, the greater $\pi(a, b)$, the more $a$ is preferred to $b$.
3. For each $a \in A$, the positive flow $\phi^+(a)$, the negative flow $\phi^-(a)$, and the net flow $\phi(a)$ are computed:
\[
\phi^+(a) = \frac{1}{|A| - 1} \sum_{b \in A \setminus \{a\}} \pi(a, b), \qquad
\phi^-(a) = \frac{1}{|A| - 1} \sum_{b \in A \setminus \{a\}} \pi(b, a), \qquad
\phi(a) = \phi^+(a) - \phi^-(a).
\]
On the one hand, $\phi^+(a)$ measures how much, on average, $a$ is preferred to all other alternatives; consequently, the greater $\phi^+(a)$, the better $a$. On the other hand, $\phi^-(a)$ measures how much, on average, the other alternatives are preferred to $a$; consequently, the greater $\phi^-(a)$, the worse $a$. Finally, $\phi(a)$ provides a balance between the positive and the negative flows and represents the relative quality of $a$ in the set of alternatives $A$.
Based on the net flow of each alternative, PROMETHEE II builds a preference relation $P^{II}$ and an indifference relation $I^{II}$, such that:
\[
a \, P^{II} \, b \iff \phi(a) > \phi(b), \qquad a \, I^{II} \, b \iff \phi(a) = \phi(b).
\]
The above two relations constitute a weak order ($P^{II}$ and $I^{II}$ are transitive and $P^{II} \cup I^{II}$ is reflexive and complete; see, e.g., [47]) on the set of alternatives, so one gets a ranking recommendation from the best to the worst alternative with possible ex-aequo.
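To make the three steps concrete, the following is a minimal Python sketch of PROMETHEE II with the V-shape preference function with indifference zone. It assumes the standard definitions of the aggregated preference (weighted sum of the $P_j$) and of the flows (averages over the other $|A| - 1$ alternatives); the function names and the toy data are hypothetical and are not taken from the case study.

```python
from typing import Dict, List


def v_shape(d: float, q: float, p: float) -> float:
    """V-shape preference function with indifference zone (assumes 0 <= q < p)."""
    if d <= q:
        return 0.0
    if d >= p:
        return 1.0
    return (d - q) / (p - q)


def net_flows(perf: Dict[str, List[float]], w: List[float],
              q: List[float], p: List[float]) -> Dict[str, float]:
    """Net flow phi(a) of every alternative; all criteria are to be maximised."""
    names = list(perf)
    n = len(names)

    def pi(a: str, b: str) -> float:
        # Step 2: weighted sum of the single-criterion preference degrees.
        return sum(w[j] * v_shape(perf[a][j] - perf[b][j], q[j], p[j])
                   for j in range(len(w)))

    phi = {}
    for a in names:
        # Step 3: positive, negative, and net flows.
        plus = sum(pi(a, b) for b in names if b != a) / (n - 1)
        minus = sum(pi(b, a) for b in names if b != a) / (n - 1)
        phi[a] = plus - minus
    return phi


# Toy data (hypothetical): 3 alternatives evaluated on 2 criteria.
perf = {"a": [8.0, 3.0], "b": [5.0, 6.0], "c": [2.0, 4.0]}
phi = net_flows(perf, w=[0.5, 0.5], q=[1.0, 1.0], p=[4.0, 3.0])
ranking = sorted(phi, key=phi.get, reverse=True)
print(ranking)
```

With these definitions the net flows always sum to zero over the set of alternatives, which is a convenient sanity check for any implementation.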

The multiple criteria hierarchy process
In real-world applications of MCDA, the evaluation criteria are not always considered at the same level but they are structured hierarchically. This means that it is possible to distinguish a root criterion being the comprehensive objective of the problem, some macro-criteria descending from the root criterion hierarchically, until the elementary criteria being placed at the bottom of the hierarchy tree. The basic evaluations of the alternatives are made on the elementary criteria only, and they are aggregated to macro-criteria up the hierarchy tree, until the comprehensive criterion.
To deal with such a hierarchical structure of the family of criteria in MCDA, the Multiple Criteria Hierarchy Process (MCHP) has been proposed in [16]. In particular, its application to the PROMETHEE methods has been presented in [17]. The integration of the MCHP and the PROMETHEE II method makes it possible to define a preference and an indifference relation not only at the comprehensive level (that is, at the root criterion level) but also at partial levels that correspond to macro-criteria (that is, at particular nodes of the hierarchy tree). To describe the integration of PROMETHEE II with MCHP, we shall use the following notation: $g_t$ represents an elementary criterion, while the set of indices of elementary criteria is denoted by $EL$; $g_r$ represents a generic macro-criterion in the hierarchy, while $E(g_r)$ is the subset of $EL$ composed of the indices of the elementary criteria descending from $g_r$. In particular, $g_0$ represents the root criterion. For the sake of simplicity and without loss of generality, we assume that macro-criteria, as well as elementary criteria, can descend from only one macro-criterion at the level above (see [17] for more details on this point).
To adapt the PROMETHEE II method to the MCHP, it is enough to perform the following replacements in steps 1-3 presented in the previous Subsection:
1→1'. The preference function $P_j(a, b)$ that was defined for each criterion $g_j$ and for each ordered pair of alternatives $(a, b) \in A \times A$ has now to be defined for each elementary criterion $g_t$ and for each ordered pair of alternatives $(a, b)$; of course, if the V-shape preference function is used, indifference $q_t$ and preference $p_t$ thresholds have to be defined for each $g_t$, $t \in EL$.
2→2'. After a weight $w_t$ is assigned to each elementary criterion $g_t$, $t \in EL$, so that $w_t > 0$ for all $t \in EL$ and $\sum_{t \in EL} w_t = 1$, for each macro-criterion $g_r$ the partial preference function $\pi_r(a, b)$ is defined for each ordered pair of alternatives $(a, b) \in A \times A$ as follows:
\[
\pi_r(a, b) = \sum_{t \in E(g_r)} w_t P_t(a, b)
\]
and, the greater $\pi_r(a, b)$, the more $a$ is preferred to $b$ on $g_r$.
3→3'. For each alternative $a \in A$ and for each macro-criterion $g_r$, the partial positive, negative, and net flows are defined as follows:
\[
\phi_r^+(a) = \frac{1}{|A| - 1} \sum_{b \in A \setminus \{a\}} \pi_r(a, b), \qquad
\phi_r^-(a) = \frac{1}{|A| - 1} \sum_{b \in A \setminus \{a\}} \pi_r(b, a), \qquad
\phi_r(a) = \phi_r^+(a) - \phi_r^-(a).
\]
Based on the net flows, a marginal preference relation $P_r^{II}$ and a marginal indifference relation $I_r^{II}$ are defined for each macro-criterion $g_r$ as follows:
\[
a \, P_r^{II} \, b \iff \phi_r(a) > \phi_r(b), \qquad a \, I_r^{II} \, b \iff \phi_r(a) = \phi_r(b).
\]
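The replacements 1'-3' can be illustrated with a short Python sketch. It assumes that the partial preference on a macro-criterion $g_r$ is the weighted sum of the elementary preference degrees restricted to $E(g_r)$ (without renormalizing the weights) and that the partial flows are averages over the other alternatives; under this assumption the comprehensive net flow equals the sum of the partial net flows of the first-level macro-criteria. The function names, the two-level hierarchy, and the data are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Crit = Tuple[int, int]  # hypothetical label: (macro-criterion index, position)


def v_shape(d: float, q: float, p: float) -> float:
    """V-shape preference function with indifference zone (assumes 0 <= q < p)."""
    if d <= q:
        return 0.0
    if d >= p:
        return 1.0
    return (d - q) / (p - q)


def partial_net_flows(perf: Dict[str, Dict[Crit, float]], w: Dict[Crit, float],
                      q: Dict[Crit, float], p: Dict[Crit, float],
                      subset: Iterable[Crit]) -> Dict[str, float]:
    """Net flows restricted to the elementary criteria in `subset` (playing E(g_r))."""
    subset = list(subset)
    names = list(perf)
    n = len(names)

    def pi_r(a: str, b: str) -> float:
        return sum(w[t] * v_shape(perf[a][t] - perf[b][t], q[t], p[t])
                   for t in subset)

    return {a: sum(pi_r(a, b) - pi_r(b, a) for b in names if b != a) / (n - 1)
            for a in names}


# Toy hierarchy (hypothetical): two macro-criteria with two elementary criteria each.
EL = [(1, 1), (1, 2), (2, 1), (2, 2)]
perf = {
    "x": dict(zip(EL, [9.0, 8.0, 1.0, 2.0])),
    "y": dict(zip(EL, [1.0, 2.0, 9.0, 8.0])),
    "z": dict(zip(EL, [5.0, 5.0, 5.0, 5.0])),
}
w = {t: 0.25 for t in EL}
q = {t: 0.0 for t in EL}
p = {t: 10.0 for t in EL}

phi_1 = partial_net_flows(perf, w, q, p, [t for t in EL if t[0] == 1])
phi_2 = partial_net_flows(perf, w, q, p, [t for t in EL if t[0] == 2])
phi_0 = partial_net_flows(perf, w, q, p, EL)  # root criterion: all elementary criteria
print(phi_1, phi_2, phi_0)
```

In this toy example "x" is the best alternative on the first macro-criterion and "y" on the second, which is exactly the kind of information that is hidden when only the comprehensive ranking is inspected.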

Stochastic multicriteria acceptability analysis
As described in the previous Subsection, the application of PROMETHEE II requires that the weights $w_t$ of the elementary criteria, as well as the indifference $q_t$ and preference $p_t$ thresholds, be specified by the DM. Assuming that the discriminating thresholds are technical and, as such, fixed, the final ranking recommendation at both comprehensive and partial levels depends on the choice of the weights $w_t$. Indeed, different values of $w_t$ imply, in general, different values of the positive, negative, and net flows and, consequently, different relations between the considered alternatives. To avoid a single, and therefore to some extent arbitrary, choice of weights, [19] proposed the SMAA-PROMETHEE method, later extended to the hierarchical-SMAA-PROMETHEE method by [3]. The hierarchical-SMAA-PROMETHEE method integrates the Stochastic Multicriteria Acceptability Analysis (SMAA) into the hierarchical PROMETHEE II method (see [34] for the paper introducing SMAA and [42] for a recent survey of the SMAA applications). As a result, one gets a recommendation in statistical terms, specifying the probability with which an alternative gets a particular position in the ranking or the probability with which an alternative is preferred to another one at the comprehensive and partial levels. Such a ranking recommendation is more robust than the ranking obtained for a single vector of weights.
Denoting by $W = \{ (w_1, \ldots, w_{|EL|}) \in \mathbb{R}_+^{|EL|} : \sum_{t \in EL} w_t = 1 \}$ the whole space of weight vectors and by $W^{DM}$ the subset of $W$ composed of the weight vectors compatible with some preferences provided by the DM, the application of SMAA starts by sampling several weight vectors from $W^{DM}$. The hierarchical-PROMETHEE II method is then applied for each sampled $(w_1, \ldots, w_{|EL|}) \in W^{DM}$, getting a preference and an indifference relation at both comprehensive and partial levels. Based on these computations, the following indices are then obtained:
• The rank acceptability index $b_r^k(a)$: it is the probability with which an alternative $a$ reaches the $k$th position in the ranking corresponding to criterion $g_r$,
• The pairwise winning index $p_r(a, b)$: it is the probability with which alternative $a$ is preferred to alternative $b$ on $g_r$. Of course, based on $p_r(a, b)$ and $p_r(b, a)$, the frequency of indifference between $a$ and $b$ on $g_r$ can also be computed as $1 - p_r(a, b) - p_r(b, a)$.
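As an illustration of how the two indices can be estimated, the sketch below computes them as relative frequencies over a list of net-flow vectors, one per sampled weight vector. It is only a sketch under stated assumptions: the function name `smaa_indices` is hypothetical, the flow samples are hand-written toy values rather than outputs of the method, and alternatives with equal net flows are given the same (best) rank position, which is one possible convention for handling ties.

```python
from typing import Dict, List, Tuple


def smaa_indices(flow_samples: List[Dict[str, float]]
                 ) -> Tuple[Dict[str, List[float]], Dict[str, Dict[str, float]]]:
    """Estimate rank acceptability and pairwise winning indices as frequencies."""
    names = list(flow_samples[0])
    n_samples = len(flow_samples)
    rai = {a: [0.0] * len(names) for a in names}  # rai[a][k-1] estimates b^k(a)
    pwi = {a: {b: 0.0 for b in names if b != a} for a in names}
    for phi in flow_samples:
        for a in names:
            # Rank of a = 1 + number of alternatives with a strictly greater net flow.
            k = 1 + sum(phi[b] > phi[a] for b in names if b != a)
            rai[a][k - 1] += 1 / n_samples
            for b in names:
                if b != a and phi[a] > phi[b]:
                    pwi[a][b] += 1 / n_samples
    return rai, pwi


# Toy net flows (hypothetical) for three alternatives and four sampled weight vectors.
flow_samples = [
    {"a": 0.30, "b": 0.10, "c": -0.40},
    {"a": 0.20, "b": 0.25, "c": -0.45},
    {"a": 0.50, "b": -0.10, "c": -0.40},
    {"a": 0.10, "b": 0.10, "c": -0.20},  # a and b are indifferent in this sample
]
rai, pwi = smaa_indices(flow_samples)
indifference_ab = 1 - pwi["a"]["b"] - pwi["b"]["a"]
print(rai["a"], pwi["a"]["b"], indifference_ab)
```

The same function can be applied to comprehensive net flows or to the partial net flows of any macro-criterion, which gives the indices at the corresponding level of the hierarchy.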

The SMAA-PROMETHEE method in detail
In this Section, we shall present in detail the different steps of the SMAA-PROMETHEE method summarized in the flow chart in Fig. 1.
Step 0: The set of criteria to be taken into account in the problem is structured hierarchically: a root criterion representative of the problem itself is defined, as well as a few macro-criteria descending from it, continuing down until the elementary criteria on which the alternatives will be evaluated and that are located at the bottom of the hierarchy tree.
Step 1: In this step, the DM is asked to provide some preference information:
Step 1.1: With the help of the analyst, all technical parameters are fixed. These regard the selection of the shape of the preference function $P_t(\cdot, \cdot)$ for each elementary criterion $g_t$ that, as mentioned in Section 3.1, can be of six different types, as well as the indifference $q_t$ and preference $p_t$ thresholds for each $g_t$ if the V-shape with indifference zone is assumed as preference function.
Step 1.2: The DM provides some preference information by comparing criteria in terms of their importance (for example, "$g_{r_1}$ is more important than $g_{r_2}$" or "$g_{r_1}$ and $g_{r_2}$ have the same importance", etc.) or by comparing some alternatives pairwise in terms of preference (for example, "$a$ is preferred to $b$" or "$a$ and $b$ are indifferent", etc.). These pieces of preference information are translated into linear constraints involving the weights of the elementary criteria, which reduce the space $W$ of weight vectors to the space $W^{DM}$ of compatible ones.
Step 2: The analyst checks if the preference information provided by the DM is consistent, that is, if there exists at least one instance of the assumed preference model (PROMETHEE II in our case) compatible with the preferences provided by the DM. We shall call such an instance a compatible model in the following. If there exists at least one compatible model and, therefore, $W^{DM} \neq \emptyset$, one can pass to Step 3; otherwise the DM, with the help of the analyst, has to check the cause of the incompatibility and to resolve it by revising the preference information [36].
Step 3: A certain number of compatible models is sampled by means of, e.g., the Hit-And-Run method [53,59] (to have an idea of the number of compatible models that need to be sampled to get a certain precision in the obtained results, see [58]).
Step 4: Apply the hierarchical-PROMETHEE II method for each sampled vector of weights obtaining, therefore, a complete ranking of the alternatives at hand both at the comprehensive level as well as for each considered macro-criterion.
Step 5: Based on the rankings obtained in the previous step, the SMAA methodology is applied to compute the indices presented in Section 3.3, that is, the rank acceptability index of each alternative on each rank position, and the pairwise winning index between two alternatives. Both indices can be computed at comprehensive and at partial levels. The rank acceptability indices of the alternatives can also be aggregated to obtain a final ranking of the alternatives at hand, as will be shown in the next Section.
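Step 3 relies on the Hit-And-Run method. As a simpler stand-in that is adequate only when $W^{DM}$ is not too small relative to $W$, the Python sketch below uses plain rejection sampling: weight vectors are drawn uniformly from the simplex and kept only if they satisfy the DM's constraints. This is not Hit-And-Run; the function names, the `is_compatible` predicate, and the example constraint are hypothetical.

```python
import random
from typing import Callable, List


def sample_simplex(m: int, rng: random.Random) -> List[float]:
    """Uniform draw from the weight simplex via normalised exponential variates."""
    e = [rng.expovariate(1.0) for _ in range(m)]
    s = sum(e)
    return [x / s for x in e]


def sample_compatible(m: int, is_compatible: Callable[[List[float]], bool],
                      n: int, rng: random.Random,
                      max_tries: int = 1_000_000) -> List[List[float]]:
    """Rejection sampling: keep only the draws that satisfy the DM's constraints."""
    accepted: List[List[float]] = []
    for _ in range(max_tries):
        w = sample_simplex(m, rng)
        if is_compatible(w):
            accepted.append(w)
            if len(accepted) == n:
                return accepted
    raise RuntimeError("too few compatible vectors: W_DM may be empty or very small")


def more_important(w: List[float]) -> bool:
    """Hypothetical preference: the first macro-criterion (w[0], w[1]) is more
    important than the second one (w[2], w[3])."""
    return w[0] + w[1] > w[2] + w[3]


rng = random.Random(0)  # fixed seed for reproducibility
samples = sample_compatible(4, more_important, 1000, rng)
print(len(samples), samples[0])
```

Because accepted draws are uniform on the simplex restricted to the feasible region, they are uniform on $W^{DM}$; the drawback is that the acceptance rate drops quickly as constraints are added, which is where a method such as Hit-And-Run becomes preferable.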

Case study
In this Section, we shall apply the hierarchical-SMAA-PROMETHEE method to assess the sustainability of 20  Fig. 2, while the reference to the data source as well as their preference direction is shown in Table 1.
In Table 2, there are evaluations of the 20 considered cities on the elementary criteria in the year 2015 only. The whole data set and the whole set of results can be downloaded by clicking on the following link: Supplementary Results(http://www.antoniocorrente.it/wwwsn/images/ allegati articoli/SMAA%20Results%20Sustainability %20Cities.xlsx).
The indifference and preference thresholds for each elementary criterion, equal for all considered years, are shown in Table 3.
Using the notation introduced in Section 3.3, we have that $EL = \{(1,1), \ldots, (3,3)\}$ and, therefore, the whole set of weights is $W = \{ (w_{(1,1)}, \ldots, w_{(3,3)}) \in \mathbb{R}_+^9 : \sum_{t \in EL} w_t = 1 \}$. For our analysis, let us assume that the Environmental macro-criterion is more important than the Social one which, in turn, is more important than the Economic macro-criterion. This preference information is translated into the following constraints on the weights: $W_1 > W_3$ and $W_3 > W_2$, where $W_r$ denotes the weight of macro-criterion $g_r$ (1: Environmental, 2: Economic, 3: Social).
Solving the LP problem $\varepsilon^* = \max \varepsilon$, subject to the set of constraints $E^{DM}$, we find that $E^{DM}$ is feasible and $\varepsilon^* = 0.0833$, meaning that there is at least one weight vector compatible with the preference we provided as a hypothetical DM. Applying the HAR method, we sampled 100,000 compatible weight vectors, applying for each of them the hierarchical PROMETHEE II method and obtaining for each weight vector a complete ranking of the considered cities, not only at the comprehensive level but also at partial levels. As explained in Section 3.3, applying the SMAA methodology to the hierarchical PROMETHEE II method, we get the rank acceptability indices of all cities at comprehensive and partial levels. In Tables 4, 5, 6 and 7 we show the rank acceptability indices at the comprehensive level, as well as at the levels of the three considered macro-criteria, according to the provided preference information (see footnote 2).
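The reported value of $\varepsilon^*$ can be checked by hand. The derivation below is a sketch under an assumption about the form of $E^{DM}$, which is not written out in the text: each elementary weight is at least $\varepsilon$, each strict inequality between macro-criteria weights is replaced by a weak one with slack $\varepsilon$, and $W_r$ is the sum of the weights of the three elementary criteria descending from $g_r$.

```latex
% Assumed form of E^{DM} (an assumption, not stated explicitly in the text):
\begin{align*}
& w_t \geq \varepsilon \quad \forall t \in EL, \qquad \sum_{t \in EL} w_t = 1, \\
& W_1 \geq W_3 + \varepsilon, \qquad W_3 \geq W_2 + \varepsilon, \qquad
  W_r = \sum_{i=1}^{3} w_{(r,i)}, \quad r = 1, 2, 3.
\end{align*}
% Upper bound on epsilon: chain the constraints and sum the macro-criteria weights.
\begin{align*}
W_2 \geq 3\varepsilon, \quad
W_3 \geq W_2 + \varepsilon \geq 4\varepsilon, \quad
W_1 \geq W_3 + \varepsilon \geq 5\varepsilon
\;\Longrightarrow\;
1 = W_1 + W_2 + W_3 \geq 12\,\varepsilon .
\end{align*}
% The bound is attained, so it is the optimum.
\[
\varepsilon^* = \tfrac{1}{12} \approx 0.0833,
\quad \text{e.g. with } w_{(2,i)} = \tfrac{1}{12},\; w_{(3,i)} = \tfrac{1}{9},\;
w_{(1,i)} = \tfrac{5}{36}, \quad i = 1, 2, 3 .
\]
```

Under these assumptions the optimum coincides with the value 0.0833 reported above, and a strictly positive $\varepsilon^*$ is what certifies that the strict inequalities can be satisfied.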
Footnote 2: We did not compute the rank acceptability indices of Paris at the comprehensive and Environmental levels because its evaluation on Passenger Cars was not available.
Looking at the data in Tables 4-7 and 8a-b, the following detailed observations can be made:
• Comprehensive level: Luxembourg, Oslo, Bern, Riga, and Prague are the only five cities that can take the first ranking position. In particular, Luxembourg is the one most frequently in the first place ($b_0^1(LU) = 42.66\%$), followed by Oslo ($b_0^1(OS) = 25.746\%$), Bern ($b_0^1(BL) = 14.432\%$), Riga ($b_0^1(RI) = 12.774\%$), and Prague ($b_0^1(PR) = 4.388\%$). Moreover, looking at the sum of the rank acceptability indices of the same cities for the first three positions, we have the confirmation that Luxembourg and Oslo are the two best cities, since this sum is equal to 85.623% for Luxembourg and 83.585% for Oslo, meaning that, in almost all cases, both cities reach one of the first three positions in the ranking. For Prague, instead, this sum is equal to 35.003%, which means that it can reach the first three positions in the ranking but not very frequently. To further compare these five cities, in Table 8 we provide the pairwise winning indices among them. Luxembourg is preferred to the other four considered cities with a probability at least equal to 59.599%, this being the probability with which it is preferred to Oslo. Analogously, Oslo is preferred to the other four cities with a probability at least equal to 40.401%, this being the probability with which it is preferred to Luxembourg. Thus, again, we have the confirmation that these two cities are the best.
Regarding, instead, the tail of the ranking, London and Athens are the two cities most frequently in the last positions, since the other three, that is, Rome, Berlin, and Madrid, can be ranked last with very marginal probabilities. Similarly to what has been done for the first three positions in the ranking, we look at the sum of the rank acceptability indices corresponding to the last three positions in the ranking. This sum is equal to 98.957% for London and 93.052% for Athens, meaning that the two cities are almost always in these positions independently of the weights assigned to the elementary criteria;
• Environmental level: The information extracted from Table 5 is quite easy to interpret, since there is only one city that can be ranked first, that is, Bern, and one city that is always ranked last, that is, London. Moreover, one can observe that the results are quite stable and do not change very much with the weights of the three elementary criteria descending from the environmental macro-criterion, since all cities can take at most 5 positions and at least one of them is filled with a probability close to 50% and, in many cases, even higher;
• Economic level: At the economic level only three cities can be ranked first, that is, Stockholm, Oslo, and Luxembourg, with probabilities of 77.242%, 20.616%, and 2.142%, respectively. Moreover, in the cases in which Stockholm is not ranked first, it is in the second ($b_2^2(ST) = 17.256\%$) or in the third position ($b_2^3(ST) = 5.502\%$), showing, therefore, great stability. Considering the tail of the ranking, Athens is surely the worst among the twenty analyzed cities, since it always fills the last position. The second-worst city w.r.t. economic aspects has to be chosen among Lisbon, Madrid, and Rome, since they are the only three cities that can be ranked at the last but one place of the ranking, with probabilities of 62.056%, 36.873%, and 1.071%, respectively (see Table 6);
• Social level: Warsaw, Luxembourg, and Oslo are the only three cities that can be ranked first considering the social macro-criterion. In particular, from Table 7, one can observe that their probabilities of being ranked first are equal to 67.87%, 20.972%, and 11.158%, respectively. However, Luxembourg and Oslo present their highest probability at a rank position different from the first. Indeed, Oslo's highest probability is 37.273% and it is obtained for the second place, while Luxembourg has its highest probability for the sixth place, meaning that this is the rank position most frequently occupied by this city. To further compare these three cities, we extract their pairwise winning indices, shown in Table 8b, from which one can observe that Warsaw is preferred to the other two cities with a probability at least equal to 79.028%, while Oslo and Luxembourg are preferred to each other with quite similar probabilities, since Oslo is preferred to Luxembourg in 51.547% of the cases, while Luxembourg is preferred to Oslo in the remaining 48.453% of the cases. Considering the tail of the ranking, London and Paris are the only two cities that can be ranked last, and they have the highest probability of being ranked last ($b_3^{20}(LO) = 97.686\%$) and last but one ($b_3^{19}(PA) = 80.841\%$), respectively.
[Table 1 (fragment): $g_{(3,2)}$: Population density (people/km$^2$), source http://www.oecd.org/, preference direction ↓; $g_{(3,3)}$: Crime rate (%), source https://www.numbeo.com/cost-of-living/, preference direction ↓]
To summarize the rank acceptability indices of the cities at the comprehensive and partial levels, following [32], we can calculate the expected rank $ER_r(a)$ of each city $a$ from its rank acceptability indices. Based on $ER_r(a)$, the cities are then ranked from the best (the one having the greatest $ER_r(a)$) to the worst (the one having the smallest $ER_r(a)$).
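The definition of $ER_r(a)$ is not reproduced in this passage. One form consistent with the statement that the best city has the greatest value is the negated probability-weighted average of the rank positions, $ER_r(a) = -\sum_{s} s \cdot b_r^s(a)$. The sign convention is an assumption inferred from that sentence and should be checked against [32]. The sketch below applies it to made-up indices for three hypothetical cities A, B, and C, and to Stockholm's economic-level indices quoted earlier.

```python
def expected_rank(rai):
    """Negated expected position (assumed convention: greater is better).

    rai[s - 1] is the probability, in [0, 1], of occupying position s.
    """
    return -sum(s * p for s, p in enumerate(rai, start=1))


# Hypothetical rank acceptability indices (rows and columns each sum to 1).
rai = {
    "A": [0.7, 0.2, 0.1],
    "B": [0.2, 0.5, 0.3],
    "C": [0.1, 0.3, 0.6],
}

scores = {city: expected_rank(p) for city, p in rai.items()}
ranking = sorted(scores, key=scores.get, reverse=True)  # best first
print(ranking)

# Stockholm, economic level, 2015 data: indices quoted in the text, as fractions.
stockholm = expected_rank([0.77242, 0.17256, 0.05502])
print(round(stockholm, 4))
```

Under this convention the hypothetical cities are ordered A, B, C, and Stockholm's score is about -1.283, that is, an average position between first and second.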
In Table 9, we show the position obtained by each city in the ranking based on the expected rank at the comprehensive and partial levels. Looking at the data, one can appreciate the finer results provided by the MCHP at the macro-criteria levels. Indeed, while some cities, such as Prague or Oslo, take more or less the same position at the comprehensive level and on each macro-criterion, others take completely different rank positions in the four considered rankings. For example, London is 3rd in the economic ranking, while it is 19th in the comprehensive and environmental rankings and 20th in the social ranking. Analogously, while Bern is first on the environmental aspect, it is 4th at the comprehensive level, 5th on the economic aspect, and, finally, 13th on the social aspect. In this way, from a policy-making point of view, it is possible to consider the weak and strong points of each city, so that policymakers can develop strategies that preserve the strong points and, at the same time, improve the weak ones.
Since the analysis has been performed separately for each year from 2012 to 2015, it is useful to look at how the positions obtained by each city in the comprehensive and partial rankings evolve over time.
Tables 10a-c report the evolution of the expected ranks over the 2012-2015 period at the comprehensive level as well as for the economic and social macro-criteria. We did not include the same table for the environmental macro-criterion since the ranking is the same in all years apart from 2015, where Paris is not considered because of missing data on some elementary criteria. In parentheses, we report the number of rank positions a city has moved up (↑) or down (↓) with respect to the previous year. In more detail, one can observe the following:
• Comprehensive level: In 2013, many cities changed their rank position with respect to the previous year. However, this is mainly because four of them were excluded from the 2012 analysis because of missing data on some elementary criteria. Considering, instead,

Table 4 Rank Acceptability Indices of the cities at the comprehensive level: 2015 data
Table 5 Rank Acceptability Indices of the cities at the Environmental level: 2015 data
Table 6 Rank Acceptability Indices of the cities at the Economic level: 2015 data
Table 7 Rank Acceptability Indices of the cities at the Social level: 2015 data

Conclusions
Sustainability and, consequently, sustainable development are broad concepts, universally acknowledged but not univocally defined. Several definitions of sustainable development have been provided over the years, starting from the first one, given by Von Carlowitz in 1713 [60]. The most widely used is the one contained in the 1987 Brundtland report, according to which it is "...a development that meets the needs of the present without compromising the ability of the future generations to meet their own needs" [9]. Since three main types of sustainability are commonly acknowledged [45], namely environmental, social, and economic, measuring the sustainability of a project, a city, or a country is a problem to be addressed with Multiple Criteria Decision Aiding (MCDA) methods.
In this paper, we applied a recently developed MCDA method, namely, the hierarchical-SMAA-PROMETHEE method, to evaluate the sustainability of 20 European cities in the 2012-2015 period. Nine elementary criteria belonging to the environmental, social, and economic macro-criteria have been taken into account to evaluate the sustainability of the cities at the comprehensive level as well as at three partial levels. The preference model used to aggregate the multiple criteria evaluations is that of PROMETHEE II, which makes it possible to avoid compensation between criteria. Compensability of the indices used to measure sustainability is in fact undesirable, as shown in [30] and [38]. PROMETHEE II also makes it possible to rank all considered cities from the best to the worst, not only at the comprehensive level but also at the level of each macro-criterion. To provide a robust ranking recommendation, the Stochastic Multicriteria Acceptability Analysis (SMAA) [34] has been applied. SMAA gives the frequency with which a city reaches a certain rank position at the comprehensive and partial levels, taking into account a large set of instances of the assumed preference model compatible with the preferences provided by the Decision Maker. In our study, these instances are defined by vectors of criteria weights drawn randomly from a feasible set. The results of SMAA were summarized using the expected rank score proposed by [32], providing a final ranking of the cities.
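The sampling procedure summarized above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the authors' implementation: it uses the "usual" PROMETHEE preference function (a criterion contributes fully as soon as one alternative is strictly better), samples weights uniformly from the whole simplex rather than from a set constrained by the Decision Maker's preferences, ignores the hierarchy of criteria, and runs on made-up evaluations of four hypothetical alternatives.

```python
import numpy as np


def net_flows(evals, weights, maximize):
    """PROMETHEE II net flows with the 'usual' preference function (assumption)."""
    signed = np.where(maximize, evals, -evals)       # make every criterion 'larger is better'
    diff = signed[:, None, :] - signed[None, :, :]   # diff[a, b, j] = g_j(a) - g_j(b)
    pref = (diff > 0).astype(float)                  # 1 if a is strictly better than b on j
    pi = pref @ weights                              # pi[a, b]: weighted preference of a over b
    n = evals.shape[0]
    return (pi.sum(axis=1) - pi.sum(axis=0)) / (n - 1)  # positive minus negative flow


def rank_acceptability(evals, maximize, n_samples=2000, seed=0):
    """Share of sampled weight vectors for which each alternative takes each position."""
    rng = np.random.default_rng(seed)
    n, m = evals.shape
    counts = np.zeros((n, n))                        # counts[a, s]: a ends up in position s + 1
    for _ in range(n_samples):
        w = rng.dirichlet(np.ones(m))                # uniform sample from the weight simplex
        order = np.argsort(-net_flows(evals, w, maximize))  # best alternative first
        counts[order, np.arange(n)] += 1
    return counts / n_samples


# Hypothetical evaluations: 4 alternatives x 3 criteria (the third is to be minimized).
evals = np.array([
    [9.0, 8.0, 1.0],   # best on every criterion
    [5.0, 6.0, 3.0],
    [6.0, 4.0, 2.0],
    [1.0, 2.0, 9.0],   # worst on every criterion
])
maximize = np.array([True, True, False])

rai = rank_acceptability(evals, maximize)
print(np.round(rai, 3))
```

In this toy example the first alternative is always ranked first and the last one always last, while the two middle alternatives swap positions depending on the sampled weights. Restricting the columns of `evals` to the criteria of a single macro-criterion would give the analogous indices at a partial level.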
Based on the results of our case study, we believe that the proposed method can be a useful tool for policymakers for at least two reasons. Firstly, it makes it possible to identify the weak and strong points of each city by looking at the rank positions it obtains at the partial levels. Indeed, in addition to the rank position a city reaches at the comprehensive level, the hierarchical-SMAA-PROMETHEE method also gives information about its position in the rankings that consider the environmental, social, and economic aspects separately. This makes it possible to identify weak points (the macro-criteria on which the city gets low rank positions) and strong points (the macro-criteria on which the city gets high rank positions). Secondly, by comparing the rank positions obtained by a city in different years, one can observe the consequences of implemented policies and conclude which ones should be maintained (when a city has improved or, at least, not worsened its rank position over the years) and which ones should be revised and improved (when a city has seen its rank position decline over the years).
In future research, we would like to apply the same framework to a greater number of cities and over a longer period. Moreover, from the methodological point of view, it would be interesting to take into account possible interactions between the criteria [18] and, finally, to explain and justify the obtained results using the Dominance-based Rough Set Approach [27,28].