Single-Choice, Repeated-Choice, and Best-Worst Scaling Elicitation Formats: Do Results Differ and by How Much?


Abstract

This paper presents what we believe to be the most comprehensive suite of comparison criteria for multinomial discrete-choice experiment elicitation formats to date. We administer a choice experiment focused on ecosystem-service valuation to three independent samples, using single-choice, repeated-choice, and best-worst scaling elicitation, respectively. We test whether results differ in terms of parameter estimates, scale factors, preference heterogeneity, status-quo effects, attribute non-attendance, and the magnitude and precision of welfare measures. Overall, we find limited evidence of differences in attribute parameter estimates, scale factors, and attribute increment values across elicitation treatments. However, we find significant differences in status-quo effects across elicitation treatments, with repeated-choice resulting in greater proportions of “action” votes and, consequently, higher program-level welfare estimates. We also find that single-choice yields drastically less precise welfare estimates. Finally, we find some differences in attribute non-attendance behavior across elicitation formats, although there appears to be little consistency in class shares even within a given elicitation treatment.


Notes

  1. We adopt the terminology of Carson and Louviere (2011), in which a discrete choice experiment is a survey in which respondents are asked to make a discrete choice from two or more alternatives within a choice set, and in which the choice sets are carefully constructed by the researcher according to an experimental design.

  2. This method can go by other names. See Carson and Louviere (2011) and Louviere et al. (2010) for discussions on nomenclature.

  3. See Petrolia and Interis (2013) for a detailed discussion. It should be noted that status-quo effects and strategic behavior are not necessarily unique to the repeated multinomial-choice format. Day et al. (2012), e.g., find status-quo effects in a repeated binary-choice format. As for strategic behavior, Samuelson (1954, p. 388) states that “it is in the selfish interest of each person to give false signals, to pretend to have less interest in a given collective consumption activity than he really has.” Following Samuelson’s view, Carson (2012, p. 37) states that “those answering contingent valuation surveys about a public good should follow a free-rider approach of pretending to be less interested, hoping that the costs of providing the public good will fall on others”.

  4. This adjustment to the price coefficient ensures that the sampling distribution of the price parameter lies entirely in the negative domain, so that when calculating, for example, willingness-to-pay values, there is no division by zero or by a positive price parameter. A sketch of the idea appears below.
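
A minimal illustration of this adjustment, assuming a log-normal specification for the (negated) price coefficient; the coefficient values and standard errors here are hypothetical placeholders, not the paper's estimates:

```python
import numpy as np

rng = np.random.default_rng(42)

# Specify the price coefficient as beta_price = -exp(c), c ~ Normal,
# so every simulated draw of beta_price is strictly negative.
beta_attr = rng.normal(0.30, 0.05, size=10_000)   # attribute coefficient draws
c = rng.normal(np.log(0.005), 0.20, size=10_000)  # log of |price coefficient|
beta_price = -np.exp(c)                           # entirely negative domain

# Willingness to pay for a one-unit attribute increment: the divisor
# is always strictly negative, so no draw yields division by zero or
# a wrong-signed WTP.
wtp = -beta_attr / beta_price
print(np.percentile(wtp, [2.5, 50.0, 97.5]))
```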

  5. It is worth noting that Holmes and Boyle (2005), Pattison et al. (2011), Day et al. (2012), and Carlsson et al. (2012) indicate that the repeated-response format may allow for learning, implying that initial choice tasks may be less informative than later ones, or in some cases, should be discarded. Ladenburg and Olsen (2008) cite their results as evidence of this effect. Scheufele and Bennett (2012) point out, however, that it is also possible that respondents to a repeated-choice survey discover the possibility of responding strategically as they progress through the choice tasks, and this “strategic learning” may coincide with learning about the choice task.

  6. Relatedly, Meyerhoff and Liebe (2009) find that choice-task complexity can lead to an increased probability of status-quo votes.

  7. It is important to note that these tests are conducted on a purely empirical basis because there exists no theory that would dictate which elicitation format, in a multinomial-choice setting, is the “standard”. Unlike single binary-choice questions, which have been shown to be incentive-compatible, multinomial-choice questions are not incentive-compatible, at least in a field setting (see Carson and Groves 2007 and Petrolia and Interis 2013), and it is in this setting that our interest lies here, given the widespread use of these elicitation methods in the field for policy-relevant valuation. Multinomial-choice questions can be made incentive-compatible in a lab setting (see Taylor et al. 2010).

  8. A reviewer pointed out that the water-quality attribute could be a prior for the fisheries and wading bird attributes. Unfortunately, we did not consider this possibility when designing the survey, so we acknowledge this potential weakness.

  9. S-efficiency (Bliemer and Rose 2005) was also evaluated for each individual parameter, assuming both fixed- and random-parameters models. S-efficiency provides a lower bound on sample size to obtain significant estimates for each coefficient (Bliemer and Rose 2009). We specified coefficient priors of 0.3 (mean) and 0.15 (standard deviation), normally distributed, on all non-price attributes, and −0.005 on price, with \(t_{0.05} =1.96\), corresponding to a 95% confidence level. Assuming a fixed-parameters model, the largest s-value was \(\sim \)6, implying that we would need to replicate the 24-row design a minimum of 6 times (i.e., \(24 \times 6 = 144\) choice observations) to obtain significance on our coefficients. Assuming a random-parameters model, the largest s-value on the mean coefficients was \(\sim \)12, implying a minimum of 288 choice observations. Each of our individual samples contained \(\sim \)500 choice observations, roughly twice the number required for the larger of the two s-values calculated. Note, however, that these efficiency measures do not speak to the question of whether our sample size is large enough to establish sufficient power for tests across elicitation formats; we did not conduct any such power analysis. A sketch of the s-value calculation appears below.
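
A minimal sketch of the s-value calculation, assuming the Bliemer and Rose (2009) formula \(s_k = (t \cdot se_k / \beta_k)^2\), where \(se_k\) is coefficient k's standard error from a single replication of the design; the standard errors below are hypothetical stand-ins for values that would come from the design's asymptotic variance-covariance matrix:

```python
import math

t_crit = 1.96  # 95% confidence level

# Priors from the note; single-replication standard errors are illustrative.
priors = {"attr1": 0.30, "attr2": 0.30, "attr3": 0.30, "price": -0.005}
se_one_rep = {"attr1": 0.37, "attr2": 0.35, "attr3": 0.36, "price": 0.0061}

# s_k: number of replications of the design needed for coefficient k
# to reach significance at the chosen level.
s = {k: (t_crit * se_one_rep[k] / priors[k]) ** 2 for k in priors}
reps = math.ceil(max(s.values()))
print(s)
print(f"{reps} replications of the 24-row design -> {24 * reps} choice observations")
```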

  10. Because we did not randomize the order in which choice sets were presented within blocks, the order-effect variables used in the regression models may be partially confounded with the particular choice sets shown.

  11. This format may differ somewhat from other studies that have utilized the SBW elicitation format. In those studies, it appears that the choices are sequential, so that the respondent chooses the “best”, then is shown only the remaining alternatives and is asked to indicate the “worst”, etc., until all alternatives have been fully ranked.

  12. Let A and B represent a pair of alternatives in a choice set. The second-best case operates under the assumption that the probability of A being chosen as “worst” is equal to the probability of B being chosen as “best”. This rank-order explosion is also known as the Plackett–Luce model (Marden 1995), the choice-based method of conjoint analysis (Hair et al. 2010), and most frequently, rank-ordered logit (StataCorp 2013). A sketch of the exploded likelihood appears below.
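
A minimal sketch of the rank-order explosion under the Plackett–Luce model, assuming hypothetical systematic utilities: the probability of a full ranking is the product of standard logit choice probabilities over successively shrinking choice sets:

```python
import numpy as np

def plackett_luce_prob(utilities, ranking):
    # Probability of observing `ranking` (best to worst): the ranking
    # is "exploded" into a sequence of best choices from shrinking sets,
    # each governed by a standard multinomial logit.
    prob = 1.0
    remaining = list(range(len(utilities)))
    for alt in ranking:
        denom = np.exp([utilities[j] for j in remaining]).sum()
        prob *= np.exp(utilities[alt]) / denom
        remaining.remove(alt)
    return prob

# Hypothetical systematic utilities for alternatives A, B, C.
v = np.array([0.8, 0.2, -0.5])
# In a three-alternative set, "best = A, worst = C" implies the full
# ranking A > B > C.
print(plackett_luce_prob(v, ranking=[0, 1, 2]))
```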

  13. A reviewer pointed out that our inclusion of language stating that the program would be partially funded with existing tax dollars may have introduced a problem with the resulting welfare estimates. If we do not know how respondents interpreted what would happen to the existing tax dollars in the event of no project, then we do not know how much utility loss they would be willing to trade for the utility gain of the described policy. In that case, our estimates may not be a sufficient money-metric measure of the utility change. However, this potential flaw should not invalidate the comparisons made here.

  14. Under the RMC and SBW cases, models were specified as a panel, such that individual-specific coefficients for random parameters were constrained to be equal across observations for the same respondent.

  15. Note that we fix \({{\varvec{\upsigma }}}=0\) in these models, i.e., we do not allow for preference heterogeneity in the attributes when testing for differences in status-quo effects.

  16. The random-parameters specification did not significantly improve model fit and did not indicate any significant preference heterogeneity beyond that already captured by the latent classes, and all of the tests of class-share differences across elicitation types were identical to those of the fixed-parameters models. These results are reported in table A7 of the Appendix in ESM.

  17. We also estimated the models without the parameter restrictions across classes, and the tests of class differences are identical to the main results. In two cases the test of scale-parameter differences is significant (Louisiana-Salt SMC vs. RMC and GOM-Oyster SMC vs. RMC). These results are reported in table A6 of the Appendix in ESM.

  18. Note that we fix \({{\varvec{\upsigma }}}=0\) in these models as well.

  19. This adjustment is not applied to the latent-class logit models due to the difficulty of imposing log-normal distributions in that setting. Because we are not using these results to construct welfare estimates, this omission should not affect the results; the adjustment generally comes into play during the simulation stage of welfare-estimate construction (see Carson and Czajkowski 2013).

  20. In an alternative specification that follows the two-class model used by Hess et al. (2013) (all attended to and none attended to), test results were identical with one exception: we found no significant class-share differences between SMC and SBW for the Louisiana Oyster sample. It should be noted that the three-class results reported here indicate that the “price NAT” class, which is the class omitted in the two-class model, has a larger share than one or both of the other classes in 5 of the 8 models estimated, implying that a model that omits this class may be misspecified to begin with. These results are reported in table A5 of the Appendix in ESM.

  21. For the error-components logit models, 10,000 draws were used for each parameter. The Krinsky and Robb approach must be adjusted, however, when there are random parameters, to account for the distribution of the random parameters in the population. For the random-coefficients logit models, we therefore used 5000 draws for each parameter, including the means and standard deviations of the distributions of the random variables. This creates 5000 simulated distributions for each random parameter, each defined by a mean and standard deviation. We then made 5000 draws of a given random parameter from each of these 5000 simulated distributions (see Hensher and Greene 2003). This yields \(5000^{2}\) total random draws for each random variable. A sketch of this two-stage simulation appears below.
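
A minimal sketch of the two-stage Krinsky and Robb simulation for one random coefficient, assuming a normally distributed coefficient and a hypothetical (diagonal) asymptotic covariance matrix for its estimated mean and standard deviation; R is set to 1000 here to keep memory modest, whereas the paper uses 5000 draws per stage:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical point estimates [mean, std dev] of a random coefficient
# and the asymptotic covariance of those estimates.
est = np.array([0.30, 0.15])
vcov = np.diag([0.05, 0.03]) ** 2

R = 1_000  # draws per stage (5000 in the paper)

# Stage 1 (Krinsky and Robb): draw R (mean, sd) pairs from the sampling
# distribution of the estimates; each pair defines one simulated
# population distribution of the random coefficient.
stage1 = rng.multivariate_normal(est, vcov, size=R)
means, sds = stage1[:, 0], np.abs(stage1[:, 1])

# Stage 2: draw R realizations of the coefficient from each of the R
# simulated distributions, for R**2 total draws.
draws = rng.normal(means[:, None], sds[:, None], size=(R, R)).ravel()
print(np.percentile(draws, [2.5, 97.5]))
```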

  22. These tests across two simulated distributions of lengths n and m involve the creation of an \(n \times m\) array of differences, which exceeds computational capacity for the random coefficients, each of which has a simulated distribution of length \(5000^{2}\). We therefore re-estimated each of the random-coefficients models using 100 draws in each stage and conducted these tests using simulated distributions of length \(100^{2}\). The confidence intervals presented for the random-coefficients models are, however, from the first simulation, with vectors of length \(5000^{2}\). Although these intervals are more precise than intervals based on vectors of only \(100^{2}\) draws, the conclusions of the tests of equality of means are not affected by the number of draws: the confidence intervals for the random coefficients are very wide and overlap greatly. A sketch of the test appears below.
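
A minimal sketch of such a test, assuming the complete combinatorial approach of Poe et al. (2005): the share of all \(n \times m\) pairwise differences that are non-positive approximates the one-sided significance level for equality of the two simulated distributions. The input vectors here are short hypothetical placeholders:

```python
import numpy as np

def poe_test(x, y):
    # Complete combinatorial test (Poe et al. 2005): form all n*m
    # pairwise differences x_i - y_j and return the share that are
    # <= 0; values near 0 or 1 indicate the distributions differ.
    diffs = x[:, None] - y[None, :]   # n-by-m array of differences
    return (diffs <= 0).mean()

rng = np.random.default_rng(1)
# Hypothetical simulated WTP distributions; with vectors of length
# 5000**2 the difference array would be far too large, which is why
# shorter 100**2-length vectors were used for the tests in the paper.
x = rng.normal(50.0, 10.0, size=500)
y = rng.normal(55.0, 12.0, size=500)
print(poe_test(x, y))
```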

  23. Precision is also a function of sample size. Although all of our samples were in the neighborhood of 500 observations, there were some minor differences, which could partly account for the differences in precision.

References

  • Alemu MH, Mørkbak MR, Olsen SB, Jensen CL (2013) Attending to the reasons for attribute non-attendance in choice experiments. Environ Resour Econ 54:333–359


  • Arrow K, Solow R, Portney PR, Leamer EE, Radner R, Schuman H (1993) Report of the NOAA panel on contingent valuation. Fed Regist 58:4601–4614


  • Bateman IJ, Cole M, Cooper P, Georgiou S, Hadley D, Poe GL (2004) On visible choice sets and scope sensitivity. J Environ Econ Manag 47:71–93


  • Beaumais O, Prunetti D, Casabianca A, Pieri X (2015) Improving solid waste management in the Island of Beauty (Corsica): a latent-class rank-ordered logit approach with observed heterogeneous ranking capabilities. Revue d’économie politique 125(2):209–231


  • Blamey RK, Bennett JW, Louviere JJ, Morrison MD, Rolfe JC (2002) Attribute causality in environmental choice modelling. Environ Resour Econ 23:167–186


  • Bliemer MCJ, Rose JM (2009) Efficiency and sample size requirements for stated choice experiments. Transportation Research Board Annual Meeting, Washington, DC

  • Bliemer MCJ, Rose JM (2005) Efficiency and sample size requirements for stated choice studies. Report ITLS-WP-05-08, Institute of Transport and Logistics Studies, University of Sydney

  • Campbell D, Hensher DA, Scarpa R (2011) Non-attendance to attributes in environmental choice analysis: a latent class specification. J Environ Plan Manag 54(8):1061–1076


  • Campbell D, Hutchinson WG, Scarpa R (2008) Incorporating discontinuous preferences into the analysis of discrete choice experiments. Environ Resour Econ 41:401–417


  • Carlsson F, Mørkbak MR, Olsen SB (2012) The first time is the hardest: a test of ordering effects in choice experiments. J Choice Model 5(2):19–37


  • Carson RT (2012) Contingent valuation: a practical alternative when prices aren’t available. J Econ Perspect 26(4):27–42


  • Carson RT (1985) Three essays on contingent valuation. PhD thesis, University of California, Berkeley

  • Carson RT, Czajkowski M (2013) A new baseline model for estimating willingness to pay from discrete choice models. Presented at the 2013 international choice modelling conference, July. http://www.icmconference.org.uk/index.php/icmc/ICMC2013/paper/view/730. Cited 9 Dec 2014

  • Carson RT, Groves T (2007) Incentive and informational properties of preference questions. Environ Resour Econ 37:181–210


  • Carson RT, Louviere JJ (2011) A common nomenclature for stated preference elicitation approaches. Environ Resour Econ 49:539–559


  • Chapman RG, Staelin R (1982) Exploiting rank ordered choice set data within the stochastic utility model. J Mark Res 19:288–301

  • ChoiceMetrics (2011) Ngene 1.1 user manual and reference guide

  • Collins AT, Rose JM, Hensher DA (2013) Specification issues in a generalized random parameters attribute nonattendance model. Transp Res Part B 56:234–253


  • Day B, Bateman IJ, Carson RT, Dupont D, Louviere JJ, Morimoto S, Scarpa R, Wang P (2012) Ordering effects and choice set awareness in repeat-response stated preference studies. J Environ Econ Manag 63:73–91


  • Day B, Prades JLP (2010) Ordering anomalies in choice experiments. J Environ Econ Manag 59:271–285


  • Flynn TN, Louviere JJ, Peters TJ, Coast J (2007) Best-worst scaling: what it can do for health care research and how to do it. J Health Econ 26:71–89


  • Flynn T, Marley AJ (2014) Best worst scaling: theory and methods. In: Hess S, Daly A (eds) Handbook of choice modelling. Edward Elgar Publishing, Camberley, pp 178–201


  • Greene WH (2012) Reference Guide, NLOGIT Version 5.0, Econometric Software, Inc., Plainview, NY

  • Haab TC, McConnell KE (2002) Valuing environmental and natural resources: the econometrics of non-market valuation. Edward Elgar, Northampton


  • Hair JF Jr, Black WC, Babin BJ, Anderson RE (2010) Multivariate data analysis, 7th edn. Pearson, Upper Saddle River


  • Hanemann W (1985) Some issues in continuous- and discrete-response contingent valuation studies. Northeast J Agric Econ 14:5–13


  • Hensher DA, Collins AT, Greene WH (2013) Accounting for attribute non-attendance and common-metric aggregation in a probabilistic decision process mixed multinomial logit model: a warning on potential confounding. Transportation 40:1003–1020


  • Hensher DA, Greene WH (2003) The mixed logit model: the state of the practice. Transportation 30:133–176


  • Hensher DA, Greene WH (2010) Non-attendance and dual processing of common-metric attribute in choice analysis: a latent class specification. Empir Econ 39:413–426


  • Hensher DA, Rose JM, Greene WH (2012) Inferring attribute non-attendance from stated choice data: implications for willingness to pay estimates and a warning for stated choice experiment design. Transportation 39:235–245


  • Hess S, Stathopoulos A, Campbell D, O’Neill V, Caussade S (2013) It’s not that I don’t care, I just don’t care very much: confounding between attribute non-attendance and taste heterogeneity. Transportation 40:583–607


  • Holmes TP, Boyle KJ (2005) Dynamic learning and context-dependence in sequential, attribute-based, stated-preference valuation questions. Land Econ 81:114–126


  • Interis MG, Petrolia DR (2016) Location, location, habitat: how the value of ecosystem services varies across location and by habitat. Land Econ 92(2):292–307


  • Ladenburg J, Olsen SB (2008) Gender-specific starting point bias in choice experiments: evidence from an empirical study. J Environ Econ Manag 56:275–285


  • List JA, Sinha P, Taylor MH (2006) Using choice experiments to value non-market goods and services: evidence from field experiments. B.E. J Econ Anal Policy 5(2):1–37


  • Louviere JJ, Flynn TN, Carson RT (2010) Discrete choice experiments are not conjoint analysis. J Choice Model 3(3):57–72


  • Louviere JJ, Flynn TN, Marley AAJ (2015) Best-worst scaling: theory, methods and applications. Cambridge University Press, Cambridge


  • Marden JI (1995) Analyzing and modeling rank data. Chapman and Hall, London


  • Marley AAJ, Louviere JJ (2005) Some probabilistic models of best, worst, and best-worst choices. J Math Psychol 49:464–480


  • McNair B, Bennett J, Hensher D (2011) A comparison of responses to single and repeated discrete choice questions. Resour Energy Econ 33:554–571


  • McNair B, Hensher D, Bennett J (2012) Modelling heterogeneity in response behavior towards a sequence of discrete choice questions: a probabilistic decision process model. Environ Resour Econ 51:599–616


  • Meyerhoff J, Liebe U (2009) Status quo effect in choice experiments: empirical evidence on attitudes and choice task complexity. Land Econ 85(3):515–528


  • Newell LW, Swallow SK (2013) Real-payment choice experiments: valuing forested wetlands and spatial attributes within a landscape context. Ecol Econ 92:37–47


  • Pattison J, Boxall PC, Adamowicz WL (2011) The economic benefits of wetland retention and restoration in Manitoba. Can J Agric Econ 59:223–244


  • Petrolia DR, Interis MG (2013) Should we be using repeated-choice surveys to value public goods? Assoc Environ Resour Econ Newsl 33(2):19–25


  • Petrolia DR, Interis MG, Hwang J (2014) America’s wetland? A national survey of willingness to pay for restoration of Louisiana’s coastal wetlands. Mar Resour Econ 29(1):17–37


  • Poe G, Giraud K, Loomis J (2005) Computational methods for measuring the difference of empirical distributions. Am J Agric Econ 87(2):353–365


  • Potoglou D, Burge P, Flynn T, Netten A, Malley J, Forder J, Brazier JE (2011) Best-worst scaling vs. discrete choice experiments: an empirical comparison using social care data. Soc Sci Med 72:1717–1727


  • Rigby D, Burton M, Lusk JL (2015) Journals, preferences, and publishing in Agricultural and Environmental Economics. Am J Agric Econ 97(2):490–509


  • Samuelson PA (1954) The pure theory of public expenditure. Rev Econ Stat 36(4):387–389


  • Scarpa R, Notaro S, Louviere JJ, Raffaelli R (2011) Exploring scale effects of best/worst rank ordered choice data to estimate benefits of tourism in Alpine Grazing Commons. Am J Agric Econ 93(3):813–828


  • Scarpa R, Thiene M, Hensher DA (2010) Monitoring choice task attribute attendance in nonmarket valuation of multiple park management services: does it matter? Land Econ 86(4):817–839


  • Scheufele G, Bennett J (2012) Response strategies and learning in discrete choice experiments. Environ Resour Econ 52:435–453


  • Silz-Carson K, Chilton SM, Hutchinson WG (2010) Bias in choice experiments for public goods. Newcastle discussion papers in Economics, no. 2010/05, Newcastle University Business School

  • StataCorp (2013) Stata release 13.0 statistical software. StataCorp LP, College Station

  • Swait J, Louviere J (1993) The role of the scale parameter in the estimation and comparison of multinomial logit models. J Mark Res 30:305–314

  • Taylor LO, Morrison MD, Boyle KJ (2010) Exchange rules and the incentive compatibility of choice experiments. Environ Resour Econ 47:197–220


  • Train KE (2009) Discrete choice methods with simulation, 2nd edn. Cambridge University Press, Cambridge


  • Vossler CA, Doyon M, Rondeau D (2012) Truth in consequentiality: theory and field evidence on discrete choice experiments. Am Econ J Microecon 4:145–171



Acknowledgements

The authors thank A.A.J. Marley and two anonymous referees for comments that greatly improved the manuscript. This research was conducted under award NA10OAR4170078 to the Mississippi-Alabama Sea Grant Consortium by the NOAA Office of Ocean and Atmospheric Research, U.S. Department of Commerce, and was supported by the USDA Cooperative State Research, Education & Extension Service, Multistate Project W-3133 “Benefits and Costs of Natural Resources Policies Affecting Ecosystem Services on Public and Private Lands” (Hatch # MIS-033130).

Author information

Correspondence to Daniel R. Petrolia.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (docx 67 KB)


About this article


Cite this article

Petrolia, D.R., Interis, M.G. & Hwang, J. Single-Choice, Repeated-Choice, and Best-Worst Scaling Elicitation Formats: Do Results Differ and by How Much? Environ Resource Econ 69, 365–393 (2018). https://doi.org/10.1007/s10640-016-0083-6

