Public preferences for heritage conservation strategies: a choice modelling approach

Studies aiming at valuing cultural and natural heritage projects are often focussed on one or only a few sites, whereas planning decisions concerning the allocation of public funds to heritage conservation deal with classes of heritage rather than single sites. In addition, such planning decisions are almost always concerned with non-monetary values that need to be incorporated into assessment procedures if the total value of alternative strategies is to be estimated. In this paper, we put forward and estimate models to address both of these issues within a choice-modelling framework. The method is developed in the context of conservation of a particular class of cultural heritage, namely major historic buildings in a city or country. We report results from a discrete choice experiment to assess public preferences in which the choices are alternative conservation programs and the attributes are dimensions of the programs’ cultural and economic value. The model is estimated from survey data using several flexible econometric specifications. We show that the methods developed can be used to obtain robust estimates of the economic value of this category of buildings. We also find a significant contribution of all aspects of cultural value to the formation of conservation preferences and the public’s willingness to pay.


Introduction
In a paper published in the early 1990s, Alan Peacock argued that public preferences should be taken into account in decisions concerning the conservation of cultural heritage assets, especially when public funding was involved (Peacock 1994). Since then, interest has grown in the assessment of public attitudes to heritage conservation through the application of preference evaluation methodologies such as contingent valuation and discrete choice modelling (Rolfe and Windle 2003; Alberini and Longo 2006; Choi et al. 2010; Navrud 2007, 2008; Apostolakis and Jaffry 2005b). In this wide range of studies, two significant issues have emerged which to date have been insufficiently explored. The first relates to the object of assessment. Almost all the studies that have been undertaken relate to a specific building, archaeological site, etc., yet heritage policy decisions are often concerned with the allocation of funds across a range of heritage projects rather than simply to one particular case. Policymakers in heritage administration are routinely faced with decisions requiring an understanding of conservation preferences for a generalised class of buildings or sites rather than for a single specific project (Deodhar 2004; Provins et al. 2008; Lazrak 2009).
The second issue that remains inadequately investigated is the influence of cultural motivations on the formation of people's preferences for different types of heritage, and in particular the relationship between economic and cultural assessments in an individual's personal evaluation. This question can be located in the context of the distinction between economic and cultural value in the assessment of the demand both for cultural goods generally (Throsby 2001; Hutter and Throsby 2008; Throsby and Zednik 2014; Crossick and Kaszynska 2016) and for heritage in particular (Rizzo and Throsby 2006; Throsby 2013).
The objective of this paper is to deal with both of these issues: to provide information on public preferences for a class or category of cultural heritage rather than a single building; and to increase understanding of how cultural values translate into economic preferences. To address the first issue, we construct and implement a choice experiment in which a random sample of respondents drawn from the general public is asked to choose between conservation programs for buildings characterised as being within a particular class of heritage, namely the class of major historic buildings, i.e. those buildings that are included on an official list or register compiled at city, state or national level, and as such are likely to be recognisable to members of the public. In regard to the second issue, the attributes that we include in our investigation are drawn from the presumed dimensions of this category's cultural value. Relevant to both issues, the modelling is framed in terms of the costs of alternative conservation strategies, enabling us to analyse the links between economic and cultural value attributes for the designated heritage class as perceived by respondents.
Researchers interested in exploring preferences for any type of good or service have a choice between revealed preference (RP) and stated preference (SP) methods. RP studies analyse in retrospect actual behaviour rather than hypothetical choices, the advantage being that there is no need for assumptions about how intentions translate into behaviour. However, there are many limitations affecting the use of RP data. For example, the range of variation in explanatory variables in RP data is frequently limited, such that behavioural changes in response to changes in these variables may be difficult to predict. Moreover, explanatory variables tend to be highly collinear in real markets. But most importantly for our study, RP data simply do not exist for goods and services that are not traded on any market, such as public goods, or policy scenarios as yet untried. Thus, in the absence of RP data, the use of SP methods was the only way we could probe the nature of the cultural valuations for heritage under study. Fortunately, the methodologies involved have been widely tested and refined, and we have no reason to doubt their efficacy for our study.
The structure of the paper is as follows. In the next section, we consider the theoretical background to the application of discrete choice methods in the cultural heritage field. The following section includes a review of econometric specifications that have been proposed to analyse choice data sets. We then outline the choice experiment, beginning with a description of cultural heritage categories, the attributes that characterise the heritage class under study, the hypotheses relating to their relative influence, and details of the experimental design. The next section discusses the econometric approach, with an outline of various possible models. We then present results for the preferred model, a mixture-of-normals multinomial logit (MM-MNL) model in preference space, and willingness-to-pay (WTP) estimates for all attributes. After a discussion of the results, the final section puts forward some conclusions, caveats and proposals for future research. An Appendix to the paper sets out econometric specifications of all the models used.

Theoretical background
The theoretical foundations for this paper are derived from the theory of preference formation in economics. Among approaches to the empirical investigation of consumer preferences, discrete choice experiments have found an important place. Choice experiments originally arose from conjoint analysis and have been used in transportation, marketing and psychology (see for example Louviere and Woodworth 1983; Hensher 1994). Conjoint analysis typically asks respondents to rank or rate goods or attributes, whereas in choice experiments, respondents choose from alternative bundles of attributes, thus making their choices consistent with random utility theory (Thurstone 1927; McFadden 1974; Ben-Akiva et al. 1985). Choice experiments are also an application of the characteristics theory of value (Lancaster 1966), since in choice experiments goods are broken into attributes, where one of these attributes is usually price. Choice experiments have become an important and established methodology in environmental economics (Hanley et al. 1998; Adamowicz et al. 2014), since passive-use (non-use) value is of such importance for environmental goods. Adamowicz et al. (1998) were among the first to estimate non-use value for environmental resources; they undertook choice experiments for a particular site of old-growth forests in Alberta, Canada, where caribou, an endangered species, live. Nowadays, choice experiments are employed in a wide range of fields when revealed preference data are not available or as a supplement for revealed preference data (Rolfe et al. 2000; Campbell et al. 2009; Walker and Marquis-Kyle 2004; Strazzera et al. 2010; Mas and Pallais 2017).
In cultural economics, discrete choice experiments have been applied to several areas, for example to forecast demand for a cultural event (Louviere and Hensher 1983), to investigate theatre demand (Willis and Snowball 2009; Willis 2011, 2012), or to study museum attendance (Maddison and Foster 2003; Jaffry and Apostolakis 2011). In the particular field of cultural heritage research, Morey and Rossmann (2003) and Morey et al. (2002) conducted a discrete choice experiment to estimate the benefits of reducing acid deposition injuries to 100 marble monuments in Washington, DC. Choi et al. (2010) conducted a choice experiment on the value of Old Parliament House, a national heritage site in Canberra, Australia, as well as on the various services it provides to the public. Rolfe and Windle (2003) employed a choice experiment to assess the trade-off between the protection of Aboriginal heritage sites and waterways development in the face of increased irrigation demands in the Fitzroy Basin in Australia. Other examples of choice experiments for cultural heritage come from Taiwan (Chen and Chen 2012), New Zealand (Miller et al. 2015), Greece (Apostolakis and Jaffry 2005a, b), Portugal (Lourenço-Gomes et al. 2014) and Italy (Mazzanti 2003). In Ireland, the hedonic pricing valuation method was applied to estimate the value of cultural heritage in the housing market (Moro et al. 2013).
A particular focus of the present study is on the notion of cultural value, a concept that has crystallised in recent years as a form of value distinct from conventional interpretations of economic value (Angelini and Castellani 2019). In the context of cultural heritage, this duality of value (cultural and economic) derives from the interpretation of heritage items as cultural capital assets (Throsby 1999; Rizzo and Throsby 2006; Apostolakis and Jaffry 2007), defined as capital goods that embody or yield cultural value in addition to whatever economic value they possess. It is understood that economic value, whether measured as direct use value or willingness to pay for non-use demand, is expressible in monetary terms, whereas cultural value is characterised by multidimensionality and has no single unit of account. The latter characteristic places cultural value outside the framework of pecuniary value inherent in neoclassical economics. To operationalise cultural value for any cultural good or service, it can be deconstructed into its constituent elements, identified in general terms as relating to the aesthetic and symbolic properties of the good or service in question.
Cultural value is of particular relevance to heritage, where criteria to determine the cultural significance of heritage buildings and sites are used by heritage professionals in evaluating the relative importance of historic buildings, archaeological sites, landscapes and so on (Avrami et al. 2000). For example, a standard procedure for evaluating and managing these sorts of heritage items is put forward in the so-called Burra Charter (Walker and Marquis-Kyle 2004), and UNESCO specifies similar criteria for determination of the significance of items of heritage to be included in the World Heritage List of heritage of "universal human value". A variety of possible dimensions or characteristics can be proposed to define cultural value in particular circumstances; in the present study, we chose aesthetic, historical, social and architectural values as the most appropriate for our purposes, as explained further in Sect. 3.
In economic analyses of heritage, this multidimensional nature of value is likely to be recognised by incorporation of some assessment of cultural significance alongside the standard economic measurements of use and non-use value (Mazzanti 2002; Mason 2005; Clark and Maeer 2008; Mason 2008). In these studies, assessment of significance is a separate issue; the approach we develop in this paper incorporates assessment of economic and cultural value into a single integrated methodology.

Pre-testing of concepts
As noted above, the choice experiment was set up to study public preferences for alternative conservation programs applied to different categories of built heritage. The first stage of the study involved two focus groups to test the extent to which participants recognised different categories of heritage, and to explore their perceptions of the cultural significance of heritage buildings. Five categories of heritage were put to participants:
• Major historic buildings or groups of buildings of national or state significance
• Local historic buildings or groups of buildings, of significance in local communities
• Residential houses or groups of houses from earlier times
• Rural landscapes or townscapes with historic structures
• Sites or landscapes of indigenous significance
The categories were described in detail and photographs of typical buildings in each category were distributed for discussion. The focus group work confirmed that all five categories could be readily distinguished by those participating using the wording and picture aids that we went on to employ in the final survey.
To assess perceptions of cultural value, we specified a number of criteria related to cultural significance and invited participants in the focus groups to consider them and to nominate any further characteristics of heritage buildings that they regarded as important. Analysis of the results of these procedures enabled us to identify the criteria that were most significant in respondents' minds. We used these as the basis for specifying the cultural value attributes included in the choice experiment, as described below.

The survey questionnaire
The questionnaire began by introducing respondents to the concept of cultural heritage; photographs of heritage buildings in the above categories drawn from the Australian Heritage Database were presented and respondents were asked a series of questions to gauge their views on the importance of heritage for the Australian community and their attitudes to public funding to support heritage conservation. They were then told that the rest of the survey would focus on one specific type of heritage, namely major "iconic" cultural heritage buildings, the category most likely to be recognisable to survey respondents. Some such buildings may be of international significance: unique Australian heritage icons such as the Sydney Opera House that are known to many people around the world. At a more local level, citizens may be aware of major historic buildings located in their own city or town, or in other national or regional cities or towns. As a reminder, we showed them again photographs of the major cultural heritage category. Participants were then shown a screen that stated different points expressing why some people say that major historical buildings are valuable to the Australian community, and another series of points expressing the opposite view. The purpose of these procedures was to give respondents a sense of what the category "major historic buildings" comprises and to expose them to both positive and negative opinions about heritage so that they might be able to crystallise their own views.
The next section of the questionnaire described the nature of conservation as it applies to heritage buildings. We outlined the purposes that a one-off levy for heritage preservation could be used for, such as: protecting buildings from demolition, major alterations, re-development or neglect; providing more money for restoration and maintenance; and opening up government-owned sites to the public. By doing so, we implicitly defined the base "no conservation" condition, since these efforts would not take place without such funding.
As noted above, we chose four dimensions of cultural value to include as attributes of the major historical buildings category in the survey. The rationale for presenting these attributes to respondents was as follows:
• Aesthetic value: In specifying beauty as an attribute of heritage buildings in this study, we did not refer to a respondent's individual and possibly idiosyncratic personal evaluation, but rather we framed this attribute in terms of a general or average judgement by using the phrase "noted for their beautiful appearance". In this way, we were able to assess the extent to which aesthetic quality of heritage figures as important in the respondent's opinion.
• Historical value: The age of a building is an objective characteristic that contributes importantly to its cultural value. It can be argued that the historical value of heritage buildings is significant because their history provides a narrative context for interpretation of their importance, and it allows observers to think about connections between the present and the past.
• Social value: Heritage sites can play an important role in community life, for example as a location for community engagement and social interaction. They may also have significance as symbols of national or local identity, contributing to social cohesion and unity. We gather these various dimensions of social value into an attribute concerning the role that the heritage plays (or does not play) in community life.
• Architectural value: Heritage buildings may have significance from an architectural viewpoint on account of their originality, their influence on architectural trends, or their typicality of a particular period. An objective assessment of architectural significance is a matter usually left to experts rather than the public. Hence, in specifying this attribute, we referred to what a respondent might see as a collective judgement about architectural importance, and thus we can assess the extent to which such a characteristic of historic heritage features as an item in the formation of the individual's preferences.
Table 1 shows the levels of the attributes included in the choice experiment. Three of the cultural value attributes (aesthetic, social and architectural) were specified as binary variables. The fourth, historical value, was divided into four attribute levels corresponding to four different eras in history, as can be seen in the table.
The setting for the experiment was presented to participants as one in which the government was considering initiating a program of conservation for a number of major historic buildings, to be financed via a one-off levy that people would pay via their taxes. Thus, a fifth attribute was specified to represent the price that would be levied on an individual's tax as a contribution towards the cost of the program. During the focus groups, open-ended questions concerning willingness to pay for different combinations of attributes were presented to participants to obtain information useful for designing the cost attribute levels. Values mentioned lay in the range between $0 and $100. It was indicated that different possible programs were being considered, of which only one could be implemented. Each of the programs would be focussed on heritage buildings with a different combination of qualities such as their beauty, their age, their architectural significance, their social significance and their tax price. Respondents were asked which one they would choose. The experiment was then repeated with the different programs specified with different sets of attributes.
Conservation efforts necessarily represent a trade-off between many different subgroups of buildings within a heritage category, each of which might be more attractive in some cultural value aspects whilst being less attractive in others. The trade-off between many alternatives for conservation is part of the problem of cultural heritage protection. In addition, we take advantage of the statistical gains of larger choice set sizes. Table 2 shows an example choice set that was shown to participants before the actual choice experiment.

Experimental design
A full factorial for the experimental design for the discrete choice task would have led to 128 profiles. We used a $2^7$ orthogonal design that gave us eight profiles. We then created the fold-over of that design to generate eight more profiles, a total of 16 profiles. That ensured that all estimated effects could be estimated independently of each other, as well as independently of first-order interactions. We used the 16 profiles in a balanced incomplete block design that led to 20 choice sets, each containing four of the profiles. Balanced incomplete block designs have the advantage that each profile is presented an equal number of times (in our case five times) and presented equally often with every other profile (in our case once). A "none" option was added to each choice set so that participants could pick this option if they did not want to support any of the conservation programs in that choice set. Each respondent was asked to complete all 20 choice sets.
In reaching a decision on choice set size and number of choice sets, researchers typically face a trade-off between learning and fatigue effects (Czajkowski et al. 2014). On one hand, a larger number of scenarios gives gains in statistical efficiency (more information per respondent) and learning (subjects' understanding of the choices involved). On the other hand, there might be a loss in the quality of responses when more scenarios are presented, since subjects may feel bored or tired (fatigue effects). Following Johnston et al. (2017), and after testing several choice set sizes and numbers of choices per respondent in focus groups, we observed that a choice set size of four, and 20 choices per respondent, provided high levels of realism and were a good combination of statistical and respondent efficiency. The sensitivity of results to these design aspects was also tested by implementing a post-survey nonparametric test analysing responses to the first (and last) five tasks, finding that parameters were quite stable.
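The balance properties of the block design described above can be checked mechanically. The following is a minimal sketch, assuming one standard construction with the same parameters (16 profiles, 20 choice sets of size four): the lines of the affine plane over the four-element field GF(4). The source does not say how the authors built their design, so the function name `affine_plane_blocks` and the profile numbering 0 to 15 are purely illustrative.

```python
from itertools import combinations

# GF(4) arithmetic on the labels {0, 1, 2, 3}: addition is bitwise XOR,
# multiplication is given by this lookup table (2 and 3 are the roots of x^2 + x + 1).
MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]

def affine_plane_blocks():
    """Return the 20 lines of the affine plane AG(2, 4) as sets of profile ids.

    A profile id encodes the point (x, y) as 4 * x + y. This is an assumed,
    illustrative construction, not necessarily the one used in the study.
    """
    blocks = []
    # Sloped lines y = m * x + b, one for each slope m and intercept b.
    for m in range(4):
        for b in range(4):
            blocks.append(frozenset(4 * x + (MUL[m][x] ^ b) for x in range(4)))
    # Vertical lines x = c.
    for c in range(4):
        blocks.append(frozenset(4 * c + y for y in range(4)))
    return blocks

blocks = affine_plane_blocks()

# How often each profile appears, and how often each pair of profiles co-occurs.
profile_counts = [sum(p in blk for blk in blocks) for p in range(16)]
pair_counts = [
    sum(a in blk and b in blk for blk in blocks)
    for a, b in combinations(range(16), 2)
]

print(len(blocks), set(profile_counts), set(pair_counts))
```

Each of the 16 profiles then appears in five choice sets and every pair of profiles shares exactly one choice set, which are the two balance properties stated in the text.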
In order to avoid any potential bias in presenting options in a certain order (for example a left-to-right survey response bias), we presented the four options within each choice set in a random order. We also randomised the order in which the choice sets were shown to each respondent. In addition, immediately before respondents went through the choice sets, they saw the same five images of examples of major heritage buildings again that they had seen earlier, to refresh their minds about the cultural heritage category concerned. A pilot study was undertaken online one month before the main study to verify that respondents understood the tasks and answered the questions as expected, and to determine how much time respondents took.

Hypotheses
The choice experiment was set to test a series of a priori hypotheses concerning the importance of cultural value elements in determining respondents' preferences, and their willingness to pay for alternative combinations of these attributes. In broad terms, these hypotheses propose that people will prefer conservation programs that focus on: buildings particularly noted for their beauty; older buildings rather than those from more recent times; buildings that play an important role in social life; and buildings of particular architectural significance. We expect that these preferences will be reflected in respondents' willingness to pay as indicated by their responses on the cost of alternative conservation programs. These hypotheses relate to preferences within attributes. In regard to the relative strengths of preferences between the cultural value attributes, there is no particular basis for formulating a testable hypothesis beyond the proposition that beauty, however interpreted, appears to be a consistently positive influence on decision-making in artistic and cultural contexts, suggesting that aesthetic value in our choice experiment might be expected to have a stronger effect on choice of conservation programs than the other cultural value attributes.

Survey implementation
The main experiment was conducted online in August 2012, with a sample of 282 respondents drawn from adult residents in the state of New South Wales, Australia. Respondents were sampled from an online panel hosted by a market research company. Descriptive statistics showing the socio-demographic composition of the sample are shown in Table 3. We used stratified sampling to ensure that the sample would reflect the age and gender distribution of the adult (18+) population of the state. Although some time has elapsed between the date of the survey and the present, we note that the setting of the study and the nature of the information sought and hypotheses tested are not time-dependent but relate to more basic attitudes and preferences in the population. Moreover, it is not the intention of the research to provide direct policy advice but rather to throw light on the nature of public preferences for an important and enduring cultural phenomenon.

The econometric approach
Consider a range of heritage conservation programs $j = 1, 2, \ldots, K$. As in the traditional random utility model (McFadden 1974), let us assume that citizen $i$ chooses a single conservation program $j$ from among $K$ mutually exclusive program alternatives in a specific choice situation $t$. For a well-behaved preference map, upon selecting program $j$, a general indirect conditional utility $U_{ijt}$ takes the following form:
$$U_{ijt} = V_{ijt}(x_{ijt}, \beta_i) + \varepsilon_{ijt} = \beta_i' x_{ijt} + \varepsilon_{ijt} \qquad (1)$$
where $V_{ijt}$ represents the observed systematic portion of utility and is assumed to be a linear, additive function of the levels of the program attributes $x_{ijt}$, and $\beta_i$ is a vector of random variables that allow researchers to account for preference heterogeneity in the population. The random error term, $\varepsilon_{ijt}$, represents the unobserved portion of the utility function, which is assumed to be independently and identically distributed (IID) over citizens, conservation programs and choice situations. Following the behavioural decision process proposed by McFadden (1974), all models considered here start from the straightforward structure that assumes that, if a citizen faces a multi-attribute discrete choice problem, the researcher will observe that citizen $i$ chooses program $j^*$ if, and only if:
$$U_{ij^*t} > U_{ijt} \quad \forall\, j \neq j^* \qquad (2)$$
Under this general specification of the utility function, the probability that a program $j^*$ is chosen by citizen $i$, in a choice situation $t$, is specified as follows:
$$P_{ij^*t} = \Pr\left(V_{ij^*t} + \varepsilon_{ij^*t} > V_{ijt} + \varepsilon_{ijt}, \ \forall\, j \neq j^*\right) \qquad (3)$$
The standard specification presented in Eqs. 1-3 is often called a choice model in preference space. However, as has been noted by Johnston et al. (2017) among others, when the goal of the study is to elicit social preferences, as in the application proposed here, estimating the model directly in what has been called willingness-to-pay (WTP) space, rather than estimating a model in preference space and then calculating parameters of interest by reparametrization, has several advantages, especially when the interest lies in the full distribution of the WTP and not just in some moments of the distribution (Train and Weeks 2005; Balcombe et al. 2008, 2009; Scarpa et al. 2008; Thiene and Scarpa 2009; Scarpa et al. 2009; Greene and Hensher 2010).
Our model can be defined in the WTP space by taking into account that the WTP for any attribute of the heritage policy comes from dividing the marginal utility of the attribute in the indirect utility function represented in Eq. 1 (i.e. $\beta_i$) by the marginal utility of money in the same utility function, $\beta_M$. Therefore, following Train and Weeks (2005), defining $C_i = \beta_i / \beta_M$ allows us to rewrite Eq. 3 as follows:
$$P_{ij^*t} = \Pr\left(\beta_M C_i' x_{ij^*t} + \varepsilon_{ij^*t} > \beta_M C_i' x_{ijt} + \varepsilon_{ijt}, \ \forall\, j \neq j^*\right) \qquad (4)$$
Different specifications and assumptions concerning the deterministic ($V_{ijt}$) and stochastic ($\varepsilon_{ijt}$) portions of the utility function lead to alternative econometric models that have been employed to analyse choice data. In order to explore the sensitivity of our results to the choice of model specification (model uncertainty) in this study, we estimated seven of the most popular econometric models employed in analysing choice experiments. They are the multinomial logit model (MNL), the normal mixed … A detailed specification of all models is included in an Appendix to this paper.
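To make the relationship between preference space and WTP space concrete, the sketch below computes multinomial logit choice probabilities for one choice set under both parametrisations and checks that they coincide. It assumes the simplest case only: fixed (not random) coefficients, an extreme-value error term, a levy that enters utility linearly, and a "none" option with utility normalised to zero. All coefficient values, attribute names and levy amounts are hypothetical and are not estimates from this study.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities from systematic utilities."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical marginal utilities of three binary attributes (illustration only).
beta = {"beautiful": 0.4, "social": 0.6, "architectural": 0.5}
beta_money = 0.02  # hypothetical marginal utility of money, per dollar of levy

# WTP-space parameters: marginal utility of each attribute divided by the
# marginal utility of money, expressed in dollars.
wtp = {name: b / beta_money for name, b in beta.items()}

# One illustrative choice set: four programs described by attribute dummies
# and a levy in dollars. The "none" option is handled separately below.
programs = [
    {"beautiful": 1, "social": 0, "architectural": 1, "levy": 40},
    {"beautiful": 0, "social": 1, "architectural": 1, "levy": 60},
    {"beautiful": 1, "social": 1, "architectural": 0, "levy": 20},
    {"beautiful": 0, "social": 0, "architectural": 0, "levy": 10},
]

def utility_preference_space(program):
    """Systematic utility with coefficients in utility units."""
    attributes = sum(beta[k] * program[k] for k in beta)
    return attributes - beta_money * program["levy"]

def utility_wtp_space(program):
    """Same utility rewritten as beta_money * (dollar value - levy)."""
    dollar_value = sum(wtp[k] * program[k] for k in wtp)
    return beta_money * (dollar_value - program["levy"])

NONE_UTILITY = 0.0  # assumed normalisation for the "none" option

probs_pref = mnl_probabilities(
    [utility_preference_space(p) for p in programs] + [NONE_UTILITY]
)
probs_wtp = mnl_probabilities(
    [utility_wtp_space(p) for p in programs] + [NONE_UTILITY]
)

for a, b in zip(probs_pref, probs_wtp):
    print(round(a, 4), round(b, 4))
```

The two parametrisations describe the same behaviour; the practical difference arises once the coefficients are random, because placing a distribution directly on the WTP parameters is not the same as placing distributions on the numerator and denominator separately.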

Results
As mentioned above, we account for potential model uncertainty by estimating all seven of the above-listed models. All models can be compared in terms of statistical goodness-of-fit and elicited WTP distributions for the proposed attributes in the experiment. Turning first to goodness-of-fit, we show in Table 4 the relevant statistics for the different model specifications. Four criteria are used for comparing the models' goodness-of-fit: marginal likelihood (ML); Akaike information criterion (AIC) (Akaike 1974, 1987); Bayesian information criterion (BIC) (Schwarz 1978); and conditional Akaike information criterion (CAIC) (Bozdogan 1987). Since we observe the MM-MNL model to be the best fit, we report the results for this model in Table 5. We can see that the buildings most favoured for conservation by respondents are those from colonial times, reflecting a view that it is historic rather than more recent heritage that should be the primary focus of public policy. This result is also consistent with a sense that older buildings are those most likely to be in need of conservation. The results also indicate that respondents prefer conservation efforts that target buildings from pre-20th-century times but their enthusiasm is less pronounced for buildings from the post-war period or the late 20th century. Beautiful appearance, social role and architectural significance are all positive and significant. Not surprisingly, the coefficient for levy is negative, i.e. the higher the levy, the less likely respondents are in general to support the conservation program, other things equal.
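For readers who want to reproduce this kind of comparison, the following sketch computes three of the four criteria from a model's maximised log-likelihood using their textbook definitions. The log-likelihoods and parameter counts are invented for illustration and are not the values in Table 4; taking the number of observations as respondents times choice sets (282 × 20) is also an assumption, since the source does not state which count was used. The marginal likelihood is omitted because it requires the full Bayesian model.

```python
import math

def information_criteria(log_likelihood, n_params, n_obs):
    """Return AIC, BIC and CAIC for a fitted model (lower is better)."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "CAIC": deviance + n_params * (math.log(n_obs) + 1),
    }

# Hypothetical fits: (maximised log-likelihood, number of parameters).
candidates = {
    "MNL": (-7000.0, 8),
    "MM-MNL": (-6200.0, 40),
}

# Assumed observation count: 282 respondents times 20 choice sets each.
n_obs = 282 * 20

scores = {
    name: information_criteria(ll, k, n_obs)
    for name, (ll, k) in candidates.items()
}
best_by_bic = min(scores, key=lambda name: scores[name]["BIC"])

for name, values in scores.items():
    print(name, {key: round(val, 1) for key, val in values.items()})
print("Preferred by BIC:", best_by_bic)
```

All three criteria penalise the log-likelihood by a term that grows with the number of parameters, and BIC and CAIC penalise additional parameters more heavily than AIC once the sample is this large, so a flexible model is preferred only if its gain in fit outweighs its extra parameters.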
Preferences are heterogeneous in the population, but since we do not have any particular hypotheses with respect to different preferences for cultural value aspects of major historic buildings, we simply examine the results to identify three classes of respondents, as shown in Table 5. Class 1 might be considered the greatest fans of colonial times, whilst being less influenced by other criteria. Those in class 2 have a strong dislike for post-war period buildings-for them the social role and architectural value of cultural heritage buildings is particularly important. For class 3, beautiful appearance, social importance and architectural significance are significant criteria and this class is also least influenced by the amount of the levy. We now turn to the WTP estimates for different cultural value attributes. Table 6 shows the WTP estimates for each cultural value component for the seven estimated models. These WTP estimates can be interpreted as follows: holding everything else constant, participants would on average be willing to pay the stated amount of money if a conservation program of cultural heritage was focussed on buildings with this particular attribute. As can be seen, the results are fairly robust against different specifications.
In light of the above results, we can draw the following conclusions concerning the hypotheses we proposed about the magnitudes of the cultural value effects. Our hypotheses concerning preferences within attributes are confirmed. However, the data do not support our proposition that the aesthetic qualities of buildings would be a stronger criterion for allocating conservation funds than the other attributes. In fact, we can see that beautiful appearance is ranked only third in all our models estimated, with social role and architectural significance being judged to be more important.

Conclusions
The study reported in this paper leads to several important conclusions. Firstly, we have shown that it is possible to investigate preferences for a particular identifiable class of heritage buildings rather than simply for a single case. This result is significant for the formulation of heritage policy in the public sector at local or national levels, since policymakers are often faced with budgetary decisions framed in these terms. It should now be possible to extend this analysis to examine preferences for other categories of built heritage, such as groups of domestic houses in a heritage district, or local community-based heritage assets. Secondly, from a methodological/analytical perspective, our results demonstrate that both economic value and cultural value can be incorporated into a single integrated procedure in an evaluation of conservation decisions. In this regard, our methodological approach moves beyond choice modelling applications in the heritage field that focus only on the physical characteristics of publicly accessible heritage sites as they affect visitors (facilities, opening hours, etc.). The cultural value attributes in our study address more fundamental intrinsic qualities of heritage properties as drivers of people's preferences, and the results show the relationship between these characteristics and the economic parameters that influence the formulation of heritage policy decisions. Further research is needed to elaborate the components of cultural value in finer detail than we have been able to in this study.
As is always the case in stated-preference studies, our willingness-to-pay estimates cannot be interpreted as precise dollar amounts, but as indicators providing important input into understanding the economic dimensions of the issue under study. There is a long debate as to whether discrete choice experiments or other hypothetical stated-preference studies systematically overestimate WTP and whether they are susceptible to slight changes in experimental design. Whatever stance one takes in this debate, there is no reason to assume that some cultural value attributes are overestimated more than others, so that our study provides valuable information on the relative magnitude of the individual cultural value components as they relate to economic value of major cultural heritage buildings.
Thirdly, our study has some interesting implications in regard to particular aspects of heritage policy. One important and surprising result is that beautiful appearance does not play such an important role in influencing preferences as might be assumed. In this study, we find that architectural and social significance are the most important drivers of WTP. Respondents have a strong preference for some periods (namely colonial buildings) and show less enthusiasm for heritage buildings from more recent times. Another surprising result is that income does not seem to play a significant role in the likelihood of support for heritage conservation efforts. We also could not find any support for the hypothesis that high-income earners are likely to allocate a larger proportion of their income towards heritage conservation effort than those on lower incomes. One plausible explanation is that it is wealth rather than income that matters, and we only measured income.
In translating these sorts of procedures into practical application in specific cases, there is a challenge in determining how particular cultural value criteria can be represented in terms of rankings or ratings for funding allocation purposes. Some criteria can be more objectively represented than others. So, for example, the assessment of historic and architectural value might be relatively straightforward, but there may be more controversy in regard to social importance, and even more so in judging beautiful appearance.
To conclude, we return to the issue of whose preferences matter in the disbursement of public funds for heritage conservation that was originally raised by Alan Peacock and that we discussed in the introduction to this paper. Ideally, any decision relating to the allocation of funds for heritage conservation must be based on the best and fullest information available. Of course, expert opinion is indispensable in such a context: these are the heritage professionals with the detailed knowledge and expertise to be able to form sound judgements on cultural and architectural significance of buildings or classes of buildings under consideration. But Peacock's argument is also relevant: the public have a right to make their preferences known as an input into such decision-making. Our study provides an illustration as to how these preferences can be objectively assessed and potentially taken into account.
There are several implications for future research arising from this study. Firstly, the amount of visual or textual material provided to respondents is a critical issue in any choice experiment and is likely to be especially difficult in a DCE concerned with a class of buildings rather than a single case. In our study, we assumed that respondents were able to assimilate all the information provided, such that the results we obtained were based on a true understanding of the different classes of heritage under consideration. But this study is the first of its kind, and further research is needed to determine whether alternative ways of conveying information in these situations may be more efficient, especially with regard to the "no conservation" scenario. Likewise, it is important to consider whether results in a study like this one are robust to changes in experimental design, for example in the number of alternatives or choice sets shown to respondents.
A relevant issue in choice experiments aimed at guiding public policy is the potential existence of attribute non-attendance (ANA), i.e. respondents considering just a subset of attributes for some or all of the choice situations. It has been shown that ANA can seriously affect welfare estimations. There are at least three ways of accounting for ANA in choice experiments. Firstly, respondents can be asked briefing or debriefing questions (before or after the choice task). This is often called stated ANA (SANA) (Balcombe et al. 2016; Caputo et al. 2017). A second possibility consists in indirectly assessing the ANA by inferring it econometrically (Kragt 2013; Scarpa et al. 2009; Caputo et al. 2017). Finally, some authors have proposed the use of physical tests like eye-tracking data (Balcombe et al. 2015). In our study, a formal test for ANA was not implemented, although we did not observe evidence of ANA during the pre-survey design stages or in verbal protocol techniques employed during focus groups. Nevertheless, future research testing for ANA in our context could be very informative.
Finally, our data are derived from a survey conducted some time ago (2012), raising the question of the temporal stability of elicited preferences using choice experiments. Previous evidence shows that whilst there may be some temporal variations in public preferences elicited in such experiments, differences lie mostly in the marginal utility of income (affecting unconditional WTP), whilst the relative importance of attributes and levels remains stable (Liebe et al. 2012;Schaafsma et al. 2014;Rigby et al. 2016). For our study, there is no obvious reason why the preferences under consideration would have changed in any particular direction in the period since the fieldwork was carried out, and the results do not suggest any such problems. However, the stability of results in future studies of conservation policy using data collected in some previous period remains a matter worthy of further investigation.

M1. Multinomial logit model (MNL)
The simplest specification of the random utility model is the multinomial logit model (MNL) (McFadden 1974). The indirect conditional utility represented in equation (1) rests on two main assumptions in the MNL. First, it assumes homogeneity in the observed part of the utility functions, that is, $\beta_i = \beta, \ \forall i = 1, \ldots, N$. Second, it assumes that the error term $\varepsilon_{ijt}$ is IID with a Type 1 extreme value distribution. Under this specification, the probability of observing a citizen choosing a specific program $j^*$ in any choice situation (equation 3) comes from the following expression:

$$P_{ij^*t} = \frac{\exp(\beta' X_{ij^*t})}{\sum_{j} \exp(\beta' X_{ijt})}$$

In the MNL, the variance of the error term is $\sigma^2 \pi^2 / 6$, where $\sigma$ is the scale parameter. In order to ensure parameter identification, we follow the standard choice of normalising $\sigma$ to 1, which results in a variance of the error term of $\pi^2 / 6$.
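As a small numerical illustration of the MNL choice probability, the sketch below evaluates the logit formula for one choice situation with the scale normalised to 1. The coefficient vector and the attribute levels are hypothetical and serve only to show the mechanics.

```python
import math


def mnl_probabilities(beta, alternatives):
    """Logit choice probabilities for one choice situation (scale normalised to 1)."""
    utilities = [sum(b * x for b, x in zip(beta, attrs)) for attrs in alternatives]
    # Subtracting the largest utility avoids overflow without changing the result.
    top = max(utilities)
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical coefficients: two program attributes and the levy.
beta = [0.8, 0.5, -0.02]
choice_set = [
    [1.0, 1.0, 50.0],  # program A: both attributes present, levy of 50
    [1.0, 0.0, 20.0],  # program B: one attribute present, levy of 20
    [0.0, 0.0, 0.0],   # "no conservation" alternative
]
probs = mnl_probabilities(beta, choice_set)
```

With these illustrative values, program B has the highest systematic utility and therefore the highest choice probability, even though program A offers more attributes, because the higher levy offsets the gain.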

M2. Normal mixed logit models (N-MIXL)
Several flexible econometric specifications have been proposed aimed at relaxing the restrictive assumptions of the conventional MNL. The most extensively used option for analysing choice data is what can be termed the uncorrelated normal mixed logit model (N-MIXL). As in any mixed logit model (MIXL), this specification implicitly accounts for unobserved citizen preference heterogeneity in the sampled population by assuming that $\beta_i$ is a collection of variables that are independent and drawn from a specific statistical distribution. Thus, the indirect conditional utility in this model is

$$U_{ijt} = (\beta + \eta_i)' X_{ijt} + \varepsilon_{ijt}$$

where $\beta$ represents the mean value of citizens' preferences across the population, and $\eta_i$ is the deviation from the mean of the preferences of citizen i. Although MIXL can accommodate any statistical distribution for $\beta_i$, under the N-MIXL model specification it is assumed that $\beta_i$ follows a normal distribution with mean $\beta$ and covariance matrix $\Sigma$. Therefore, the unconditional choice probability of observing choice $j^*$ by citizen i in choice situation t is the expected value of the conditional logit probability over the parameter values. This is the integral over all possible values of $\beta_i$, weighted by the distribution of $\beta_i$:

$$P_{ij^*t} = \int \frac{\exp(\beta_i' X_{ij^*t})}{\sum_{j} \exp(\beta_i' X_{ijt})} \, f(\beta_i \mid \beta, \Sigma) \, d\beta_i$$
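Since this integral has no closed form, it is usually approximated by simulation. The sketch below is one simple way of doing so, assuming independent normal coefficients (a diagonal covariance matrix) and plain pseudo-random draws; the function names and all numerical values are hypothetical, and applied work often uses quasi-random draws such as Halton sequences instead.

```python
import math
import random


def mnl_probabilities(beta, alternatives):
    """Conditional logit probabilities given one coefficient vector."""
    utilities = [sum(b * x for b, x in zip(beta, attrs)) for attrs in alternatives]
    top = max(utilities)
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def simulated_mixl_probability(mean, std, alternatives, chosen, n_draws=2000, seed=0):
    """Average the logit probability of `chosen` over draws of beta_i ~ N(mean, diag(std^2))."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        beta_i = [rng.gauss(m, s) for m, s in zip(mean, std)]
        total += mnl_probabilities(beta_i, alternatives)[chosen]
    return total / n_draws


# Hypothetical means and standard deviations of the random coefficients.
mean = [0.8, 0.5, -0.02]
std = [0.4, 0.3, 0.005]
choice_set = [[1.0, 1.0, 50.0], [1.0, 0.0, 20.0], [0.0, 0.0, 0.0]]
p_sim = simulated_mixl_probability(mean, std, choice_set, chosen=1)
```

Setting every standard deviation to zero collapses the simulated probability to the MNL probability, which is a convenient sanity check on the implementation.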

M3. Theoretically restricted mixed logit model (T-MIXL)
As shown in McFadden and Train (2000), if the mixing distribution is chosen appropriately, any random utility model can be approximated by a MIXL specification. From a theoretical point of view, this result is very appealing and has been stated as the main argument for adopting a MIXL in DCE applications during the last decade or so. However, from an empirical point of view, it is impractical to test all the potential alternative specifications of MIXL models. The rest of the modelling specifications that have been employed in the literature, and in particular those that we are using in this analysis, can be seen as specific cases of the MIXL model with different mixing distributions. One of these models is the T-MIXL. Unlike the N-MIXL, in which all the parameters are assumed to follow a normal distribution, the T-MIXL model incorporates an additional restriction for the price parameter. Since for most DCE applications economic theory suggests that it is very unlikely that rising prices will positively affect preferences, T-MIXL assumes that such parameters follow a constrained triangular distribution. This assumption ensures that only negative values of the parameters are considered (Greene et al. 2006). We also explored the use of alternative specifications for the price parameter (e.g. log-normal, specific utility forms). However, since the main results of this study were not sensitive to this choice of specification, here we report the most common one, which uses a constrained triangular distribution.
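The sign restriction implied by a constrained triangular distribution can be illustrated with a short sketch. It assumes the common convention in which the spread of the triangular distribution equals the absolute value of its mean, so that a coefficient with mean m < 0 is supported on the interval from 2m to 0. The function name and the mean value are hypothetical.

```python
import random


def draw_price_coefficients(mean, n_draws, seed=0):
    """Draw price coefficients from a triangular distribution on [2 * mean, 0] with mode `mean`."""
    if mean >= 0:
        raise ValueError("the mean price coefficient is expected to be negative")
    rng = random.Random(seed)
    # random.triangular takes (low, high, mode); the support never crosses zero.
    return [rng.triangular(2.0 * mean, 0.0, mean) for _ in range(n_draws)]


draws = draw_price_coefficients(mean=-0.02, n_draws=5000)  # hypothetical mean
sample_mean = sum(draws) / len(draws)
```

Because the distribution is symmetric around its mode, the sample mean of the draws should be close to the specified mean, while no draw can take a positive value.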

M4. Scaled multinomial logit models (S-MNL)
In all the above econometric specifications of the RUM, the scale parameter is normalised to one, in order to ensure parameter identification. However, several commentators have claimed that choice data are likely to present heterogeneity in the scale parameter in ways that are not explicitly captured by adopting N-MIXL or T-MIXL models (Swait and Louviere 1993). Fiebig et al. (2010) proposed an alternative model specification that allows researchers to account for heterogeneity across respondents in the random component of utility. This model has been termed elsewhere as the scaled multinomial logit (S-MNL) model (Keane and Wasi 2013;Hensher and Greene 2010;Hensher et al. 2011).
The main characteristic of the S-MNL model is that the error variance is allowed to be heterogeneous in the population. Thus, the conditional indirect utility that citizen i derives from program j in choice situation t ($U_{ijt}$) is given by:

$$U_{ijt} = (\sigma_i \beta)' X_{ijt} + \varepsilon_{ijt}$$

where the parameters $\sigma_i$ collect the specific standard deviation of the error term for each citizen i, capturing potential scale heterogeneity. In order to guarantee that individual scaling factors are strictly positive, we use an exponential transformation as in Fiebig et al. (2010), that is, $\sigma_i = \exp(\bar{\sigma} + \tau w_i)$, where $w_i$ follows a standard normal distribution and $\tau$ is a parameter accounting for the unobserved scale heterogeneity. Since $\tau$ is directly and positively related with the existence of scale heterogeneity in the sample, its interpretation is that the higher the parameter, the higher is the likelihood that there is scale heterogeneity in the data set. Finally, in order to ensure identification, the error term is normalised such that $E[\sigma_i] = 1$; and therefore, $\bar{\sigma} = -\tau^2 / 2$.
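The normalisation of the scale factors can be checked numerically. The sketch below draws individual scale factors through the exponential transformation with a standard normal shock and with the location parameter set to minus one half of the squared heterogeneity parameter, and then verifies that their average is close to one. The value of the heterogeneity parameter and the function name are hypothetical.

```python
import math
import random


def draw_scale_factors(tau, n_draws, seed=0):
    """Draw sigma_i = exp(sigma_bar + tau * w_i) with w_i ~ N(0, 1) and sigma_bar = -tau^2 / 2."""
    rng = random.Random(seed)
    sigma_bar = -(tau ** 2) / 2.0  # normalisation that gives E[sigma_i] = 1
    return [math.exp(sigma_bar + tau * rng.gauss(0.0, 1.0)) for _ in range(n_draws)]


scales = draw_scale_factors(tau=0.5, n_draws=20000)  # hypothetical tau
mean_scale = sum(scales) / len(scales)
```

The exponential transformation keeps every draw strictly positive, and when the heterogeneity parameter is zero all scale factors equal one, so the model reduces to the MNL.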

M5. Generalized mixed logit models (G-MNL)
The generalized mixed logit (G-MNL) is a flexible specification that allows researchers to accommodate individual scale as well as individual preference heterogeneity. It was first proposed by Fiebig et al. (2010). The G-MNL model specification nests N-MIXL and S-MNL by accounting for unobserved heterogeneity in both the systematic and the random components of the conditional indirect utility function. In the G-MNL model, utility $U_{ijt}$ is defined by:

$$U_{ijt} = [\sigma_i \beta + \gamma \eta_i + (1 - \gamma) \sigma_i \eta_i]' X_{ijt} + \varepsilon_{ijt}$$

where $\sigma_i$ is the individual-specific standard deviation of the error term capturing scale heterogeneity; $\eta_i$ is the individual-specific deviation from the mean, capturing individual heterogeneity in preferences; and $\gamma$ is a parameter between zero and one that captures how the variance of the individual preference heterogeneity varies with scale.
Finally, in order to estimate G-MNL, we need both to define a specification of the statistical distribution of $\sigma_i$ and to include some restrictions to ensure parameter identification. Here, we follow the recommendations of Keane and Wasi (2013) and use a log-normal distribution that guarantees positive values for the scale parameter, that is, $\log \sigma_i \sim N(\bar{\sigma}, \tau^2)$. Note that parameters $\beta$ and $\bar{\sigma}$ cannot be jointly identified. The strategy proposed in Keane and Wasi (2013) is to estimate $\beta$ and $\tau$, and then to calibrate $\bar{\sigma}$ accordingly. By doing this, $\beta$ can be interpreted as a vector collecting mean preferences for each attribute in the choice set.
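The way in which G-MNL nests the N-MIXL and S-MNL specifications can be seen from how the individual coefficient vector is assembled. The sketch below assumes the Fiebig et al. (2010) form, in which the individual coefficients combine the scaled mean vector with a preference deviation whose own scaling is governed by a weight between zero and one. All names and numerical values are hypothetical.

```python
def gmnl_coefficients(beta, eta_i, sigma_i, gamma):
    """Individual coefficients: sigma_i * beta + gamma * eta_i + (1 - gamma) * sigma_i * eta_i."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie between zero and one")
    return [
        sigma_i * b + gamma * e + (1.0 - gamma) * sigma_i * e
        for b, e in zip(beta, eta_i)
    ]


beta = [0.8, 0.5, -0.02]     # hypothetical mean preferences
eta_i = [0.1, -0.2, 0.005]   # hypothetical deviation for one respondent

# With sigma_i = 1 there is no scale heterogeneity: the N-MIXL case.
as_mixl = gmnl_coefficients(beta, eta_i, sigma_i=1.0, gamma=0.5)
# With zero deviations only the scale varies: the S-MNL case.
as_smnl = gmnl_coefficients(beta, [0.0, 0.0, 0.0], sigma_i=1.5, gamma=0.5)
```

When the weight equals one the preference deviation is unaffected by the scale factor, and when it equals zero the deviation is scaled in the same proportion as the mean coefficients.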

M6. The latent class model (LC)
Another popular model specification for analysing choice data is the latent class model (LC). In this model, preference heterogeneity is accounted for by a discrete distribution over unobservable endogenous (latent) classes of respondents (Boxall and Adamowicz 2002; Wedel and Kamakura 2000). Preferences are assumed to be homogeneous within each class but are allowed to differ across classes. The population is thus represented as consisting of a finite number of segments or classes (S). Respondents are allocated to segments simultaneously with the analysis of choices. The number of segments is often unknown before the experiment, so it is endogenously determined by the data, whilst membership of a segment depends probabilistically on the respondent's observable socio-economic or attitudinal and behavioural characteristics. The most commonly employed criteria to decide the number of classes (S) are the Bayesian information criterion (BIC) or the Akaike information criterion (AIC).
In the random utility framework for the LC, the utility that a citizen i who belongs to segment s derives from program j at moment t is given by

$$U_{ijt|s} = \beta_s' X_{ijt} + \varepsilon_{ijt|s}$$

where $\beta_s'$ is the segment-specific vector of coefficients, $X_{ijt}$ is the vector of attributes associated with each program and $\varepsilon_{ijt|s}$ is the random component of utility for each segment. Since the vectors of coefficients differ between segments, preference heterogeneity across segments is captured. Under the assumption of independently and identically distributed (iid) error terms that follow a Type 1 extreme value distribution, the probability that program $j^*$ is selected by a citizen i belonging to segment s is given by:

$$P_{ij^*t|s} = \frac{\exp(\beta_s' X_{ij^*t})}{\sum_{j} \exp(\beta_s' X_{ijt})}$$

Membership of a specific segment is determined by a likelihood function M that classifies respondents to one of the segments with probability $P_{is}$. The membership likelihood function is given by $M_{is} = a_s Z_i + \xi_{is}$, where $Z_i$ is a vector of socio-economic and other observed characteristics of the respondent and $\xi_{is}$ represents the error term. Assuming that this error term is also iid and follows a Type 1 extreme value distribution, the probability that a citizen i belongs to segment $s^*$ is:

$$P_{is^*} = \frac{\exp(a_{s^*} Z_i)}{\sum_{s=1}^{S} \exp(a_s Z_i)}$$

The joint probability that citizen i chooses program $j^*$ at the moment t is given by

$$P_{ij^*t} = \sum_{s=1}^{S} P_{is} \, P_{ij^*t|s}$$
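The two logit layers of the LC model, one for segment membership and one for the choice within each segment, can be combined as in the following sketch. It is an illustration under hypothetical values: the three segments, their coefficient vectors, the membership coefficients and the respondent characteristics are all invented for the example.

```python
import math


def softmax(scores):
    """Logit probabilities from a list of systematic utilities or membership scores."""
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def latent_class_probabilities(class_betas, membership_coefs, z_i, alternatives):
    """Return (segment membership probabilities, joint choice probabilities) for one respondent."""
    membership = softmax([dot(a_s, z_i) for a_s in membership_coefs])
    joint = [0.0] * len(alternatives)
    for p_is, beta_s in zip(membership, class_betas):
        within = softmax([dot(beta_s, x) for x in alternatives])
        for j, p in enumerate(within):
            joint[j] += p_is * p  # weight the within-segment probability by membership
    return membership, joint


# Hypothetical three-segment example.
class_betas = [[1.2, 0.1, -0.03], [0.2, 0.9, -0.02], [0.6, 0.6, -0.005]]
membership_coefs = [[0.0, 0.0], [0.5, -0.01], [-0.3, 0.02]]  # first segment as reference
z_i = [1.0, 40.0]  # constant and a hypothetical respondent characteristic (e.g. age)
choice_set = [[1.0, 1.0, 50.0], [1.0, 0.0, 20.0], [0.0, 0.0, 0.0]]
membership, joint = latent_class_probabilities(class_betas, membership_coefs, z_i, choice_set)
```

Fixing the membership coefficients of one segment at zero, as done for the first segment here, is a common identification device, since only differences in membership scores affect the probabilities.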

M7. The mixture-of-normals mixed multinomial logit model (MM-MNL)
The use of mixture-of-multivariate-normals as an alternative flexible distribution is present in the literature. In Geweke and Keane (2001), the authors develop the mixture-of-normals probit model. This model has been applied by Araña and León (2005) in environmental valuation using dichotomous choice contingent valuation data (DCCV). In the context of DCE, a similar model has been applied to data sets from several choice experiments in different contexts (Brey and Walker 2011; Keane and Wasi 2013). The original application of the mixture-of-normals model to DCE data was probably Burda et al. (2008). In this paper, the authors specified a subset of coefficients in a MIXL model to follow mixture-of-normals distributions, whilst some others followed a simple normal distribution. As Keane and Wasi (2013) point out, the MM-MNL (or "mixed-mixed logit") model essentially nests the MIXL with LC models, with the aim of minimising the disadvantages of each. In fact, specifying the mixing distribution of MIXL to be mixture-of-normals is equivalent to extending LC models to incorporate unobserved heterogeneity within class. Thus, the utility of citizen i in period t conditional on choice of program j is specified as:

$$U_{ijt} = \beta_i' X_{ijt} + \varepsilon_{ijt}$$

where $\beta_i$ follows a $MVN(\beta_s, \Sigma_s)$ with probability $P_{is}$; $\sum_{s=1}^{S} P_{is} = 1$; and $P_{is} \geq 0, \ \forall s$; $s = 1, 2, \ldots, S$. Note that when $S = 1$, MM-MNL boils down to the N-MIXL model presented above. On the other hand, when $\Sigma_s = 0$, MM-MNL becomes the LC model. Therefore, the choice probabilities are given by the following expression:

$$P_{ij^*t} = \sum_{s=1}^{S} P_{is} \int \frac{\exp(\beta_i' X_{ij^*t})}{\sum_{j} \exp(\beta_i' X_{ijt})} \, f(\beta_i \mid \beta_s, \Sigma_s) \, d\beta_i$$
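A simulation-based approximation of the MM-MNL choice probability can be sketched as a class-weighted average of simulated mixed logit probabilities. The example below assumes diagonal covariance matrices within each class and plain pseudo-random draws, which is a simplification; the function names, class probabilities, means and standard deviations are hypothetical.

```python
import math
import random


def mnl_probabilities(beta, alternatives):
    """Conditional logit probabilities given one coefficient vector."""
    utilities = [sum(b * x for b, x in zip(beta, attrs)) for attrs in alternatives]
    top = max(utilities)
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def simulated_mm_mnl_probability(class_probs, class_means, class_stds,
                                 alternatives, chosen, n_draws=1000, seed=0):
    """Class-weighted average of simulated logit probabilities with normal draws per class."""
    rng = random.Random(seed)
    prob = 0.0
    for p_s, mean_s, std_s in zip(class_probs, class_means, class_stds):
        within = 0.0
        for _ in range(n_draws):
            # Independent normal draws: a diagonal covariance matrix is assumed here.
            beta_i = [rng.gauss(m, s) for m, s in zip(mean_s, std_s)]
            within += mnl_probabilities(beta_i, alternatives)[chosen]
        prob += p_s * within / n_draws
    return prob


# Hypothetical two-class mixture.
class_probs = [0.6, 0.4]
class_means = [[1.2, 0.1, -0.03], [0.2, 0.9, -0.02]]
class_stds = [[0.3, 0.1, 0.005], [0.1, 0.3, 0.005]]
choice_set = [[1.0, 1.0, 50.0], [1.0, 0.0, 20.0], [0.0, 0.0, 0.0]]
p_mm = simulated_mm_mnl_probability(class_probs, class_means, class_stds, choice_set, chosen=0)
```

Two limiting cases provide useful checks: with all within-class standard deviations set to zero the result coincides with the LC probability, and with a single class it coincides with the simulated N-MIXL probability.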