Abstract
The simulated choice probabilities in mixed logit models are usually approximated numerically using Halton or random draws from a multivariate mixing distribution for the random parameters. Theoretically, the order in which the variables enter the model should not matter. In practice, however, simulation “noise” inherent in the numerical procedure leads to differences in the magnitude of the estimated coefficients, depending on the arbitrary order in which the random variables are estimated. The problem is exacerbated when a low number of draws is used or when correlation among coefficients is allowed. In particular, the Cholesky factorization procedure, which is used to incorporate correlation into the model, propagates simulation noise in the estimate of one coefficient to the estimates of all subsequent coefficients in the model. Ignoring potential ordering effects in simulated maximum likelihood estimation may seriously compromise the ability to replicate results and can inadvertently influence policy recommendations. We find that Halton draws achieve better estimation accuracy when they use small prime numbers, as is the case in low integration dimensions, whereas random draws are more accurate than Halton draws based on large prime numbers, as is normally the case in high integration dimensions. With correlation, the estimated standard deviations fluctuate widely depending on the order of the variables, which affects conclusions about preference heterogeneity.
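The propagation mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' estimation procedure. It only shows that, for a fixed and small set of standard normal draws, the Cholesky mapping from draws to correlated coefficients depends on the order of the variables. The means, covariance matrix, number of draws, and function name are all hypothetical values chosen for illustration.

```python
import numpy as np


def simulate_coefficients(mean, cov, draws):
    """Map independent standard-normal draws to correlated coefficients.

    beta_r = mean + L @ eta_r, where L is the lower-triangular Cholesky
    factor of cov. Row k of L mixes draws 1..k, so noise in an early
    column of `draws` reaches every later coefficient.
    """
    chol = np.linalg.cholesky(cov)
    return mean + draws @ chol.T


# Hypothetical means and covariance for three random coefficients.
mean = np.array([1.0, -0.5, 0.25])
cov = np.array([[1.0, 0.6, 0.3],
                [0.6, 1.0, 0.4],
                [0.3, 0.4, 1.0]])

rng = np.random.default_rng(0)
draws = rng.standard_normal((50, 3))  # deliberately few draws

perm = np.array([2, 0, 1])     # an arbitrary reordering of the variables
inverse = np.argsort(perm)     # maps the reordered columns back

beta_original = simulate_coefficients(mean, cov, draws)
beta_reordered = simulate_coefficients(
    mean[perm], cov[np.ix_(perm, perm)], draws
)[:, inverse]

# Both orderings target the same population covariance, but the simulated
# covariances differ because the same finite draws are mixed differently.
cov_original = np.cov(beta_original, rowvar=False)
cov_reordered = np.cov(beta_reordered, rowvar=False)
max_gap = np.abs(cov_original - cov_reordered).max()
print(f"largest gap between simulated covariances: {max_gap:.4f}")
```

Increasing the number of rows in `draws` should shrink the gap, which is consistent with the abstract's remark that the problem is worse with few draws.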
Notes
Because of the way Halton sequences are constructed, their first elements are highly correlated. To reduce this problem, it is recommended to discard (burn) at least the first n draws, where n is the largest prime number used.
The models were also estimated with 200, 500, and 1000 draws. The results were consistent across the number of draws and are available in the Appendix.
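The burn-in recommended in the first note can be sketched as follows. This is a minimal, unscrambled Halton generator written for illustration, not the code used in the paper. The function names and the default of discarding exactly as many draws as the largest prime are assumptions.

```python
def radical_inverse(index, base):
    """Return element `index` (1-based) of the van der Corput sequence in `base`."""
    value = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * fraction
        fraction /= base
    return value


def halton_draws(n_draws, primes, burn=None):
    """Return n_draws Halton points, one coordinate per prime in `primes`.

    By default the first max(primes) points are discarded, following the
    recommendation to burn at least as many draws as the largest prime.
    """
    if burn is None:
        burn = max(primes)
    return [
        [radical_inverse(i, p) for p in primes]
        for i in range(burn + 1, burn + n_draws + 1)
    ]


# Example: four two-dimensional points built from the primes 2 and 3.
points = halton_draws(4, [2, 3])
for point in points:
    print(point)
```

The uniform points would still need to be transformed, for example through the inverse normal distribution function, before they can serve as draws for normally distributed coefficients.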
Cite this article
Palma, M.A., Vedenov, D.V. & Bessler, D. The order of variables, simulation noise, and accuracy of mixed logit estimates. Empir Econ 58, 2049–2083 (2020). https://doi.org/10.1007/s00181-018-1609-2
DOI: https://doi.org/10.1007/s00181-018-1609-2