Abstract
Outliers in discrete choice response data may result from misclassification and misreporting of the response variable and from choice behaviour that is inconsistent with modelling assumptions (e.g. random utility maximisation). In the presence of outliers, standard discrete choice models produce biased estimates and suffer from compromised predictive accuracy. Robust statistical models are less sensitive to outliers than standard non-robust models. This paper analyses two robust alternatives to the multinomial probit (MNP) model. Both are robit models, whose kernel errors follow heavy-tailed t-distributions that moderate the influence of outliers. The first model is the multinomial robit (MNR) model, in which a generic degrees-of-freedom parameter controls the heavy-tailedness of the kernel error distribution. The second model, the generalised multinomial robit (Gen-MNR) model, is more flexible than MNR, as it allows for distinct heavy-tailedness in each dimension of the kernel error distribution. For both models, we derive Gibbs samplers for posterior inference. In a simulation study, we illustrate the finite sample properties of the proposed Bayes estimators and show that MNR and Gen-MNR produce more accurate estimates when the choice data contain observations that are outliers through the lens of the non-robust MNP model. In a case study on transport mode choice behaviour, MNR and Gen-MNR outperform MNP by substantial margins in terms of in-sample fit and out-of-sample predictive accuracy. The case study also highlights differences in elasticity estimates across models.
Notes
The estimation code is available at https://github.com/RicoKrueger/robit.
References
Albert, J.H., Chib, S.: Bayesian analysis of binary and polychotomous response data. J. Am. Stat. Assoc. 88(422), 669–679 (1993)
Alptekinoğlu, A., Semple, J.H.: The exponomial choice model: a new alternative for assortment and price optimization. Oper. Res. 64(1), 79–93 (2016)
Benoit, D.F., Van Aelst, S., Van den Poel, D.: Outlier-robust Bayesian multinomial choice modeling. J. Appl. Econom. 31(7), 1445–1466 (2016)
Bezanson, J., Edelman, A., Karpinski, S., Shah, V.B.: Julia: a fresh approach to numerical computing. SIAM Rev. 59(1), 65–98 (2017)
Bhat, C.R.: A heteroscedastic extreme value model of intercity travel mode choice. Transp. Res. Part B Methodol. 29(6), 471–483 (1995)
Bhat, C.R.: The maximum approximate composite marginal likelihood (MACML) estimation of multinomial probit-based unordered response choice models. Transp. Res. Part B Methodol. 45(7), 923–939 (2011)
Botev, Z.I., L’Ecuyer, P.: Simulation from the normal distribution truncated to an interval in the tail. In: VALUETOOLS (2016)
Brathwaite, T., Walker, J.L.: Asymmetric, closed-form, finite-parameter models of multinomial choice. J. Choice Model. 29, 78–112 (2018)
Burgette, L.F., Nordheim, E.V.: The trace restriction: an alternative identification strategy for the Bayesian multinomial probit model. J. Bus. Econ. Stat. 30(3), 404–410 (2012)
Burgette, L.F., Puelz, D., Hahn, P.R., et al.: A symmetric prior for multinomial probit models. Bayesian Anal. (2020)
Castillo, E., Menéndez, J.M., Jiménez, P., Rivas, A.: Closed form expressions for choice probabilities in the Weibull case. Transp. Res. Part B Methodol. 42(4), 373–380 (2008)
Chikaraishi, M., Nakayama, S.: Discrete choice models with q-product random utilities. Transp. Res. Part B Methodol. 93, 576–595 (2016)
Chipman, H.A., George, E.I., McCulloch, R.E., et al.: BART: Bayesian additive regression trees. Ann. Appl. Stat. 4(1), 266–298 (2010)
Daganzo, C.: Multinomial probit. Academic Press (1979)
Del Castillo, J.: A class of RUM choice models that includes the model in which the utility has logistic distributed errors. Transp. Res. Part B Methodol. 91, 1–20 (2016)
Del Castillo, J.: Choice probabilities of random utility maximization models when the errors distribution is a polynomial copula with Gumbel marginals. Transp. A Transp. Sci. 16(3), 439–472 (2020)
Dill, J., Rose, G.: Electric bikes and transportation policy: insights from early adopters. Transp. Res. Rec. 2314(1), 1–6 (2012)
Ding, P.: Bayesian robust inference of sample selection using selection-t models. J. Multivar. Anal. 124, 451–464 (2014)
Dubey, S., Bansal, P., Daziano, R.A., Guerra, E.: A generalized continuous-multinomial response model with a t-distributed error kernel. Transp. Res. Part B Methodol. 133, 114–141 (2020)
Fosgerau, M., Bierlaire, M.: Discrete choice models with multiplicative error terms. Transp. Res. Part B Methodol. 43(5), 494–505 (2009)
Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., Rubin, D.B.: Bayesian data analysis. CRC Press (2013)
Gelman, A., Hill, J.: Data analysis using regression and multilevel/hierarchical models. Cambridge University Press (2006)
Gelman, A., Rubin, D.B., et al.: Inference from iterative simulation using multiple sequences. Stat. Sci. 7(4), 457–472 (1992)
Geweke, J., Keane, M., Runkle, D.: Alternative computational approaches to inference in the multinomial probit model. Rev. Econ. Stat., 609–632 (1994)
Hajivassiliou, V., McFadden, D., Ruud, P.: Simulation of multivariate normal rectangle probabilities and their derivatives: theoretical and computational results. J. Econom. 72(1–2), 85–134 (1996)
Hausman, J.A., Abrevaya, J., Scott-Morton, F.M.: Misclassification of the dependent variable in a discrete-response setting. J. Econom. 87(2), 239–269 (1998)
Hillel, T., Elshafie, M.Z., Jin, Y.: Recreating passenger mode choice-sets for transport simulation: a case study of London, UK. Proc. Inst. Civil Eng. Smart Infrastruct. Constr. 171(1), 29–42 (2018)
Huang, A., Wand, M.P., et al.: Simple marginally noninformative prior distributions for covariance matrices. Bayesian Anal. 8(2), 439–452 (2013)
Imai, K., Van Dyk, D.A.: A Bayesian analysis of the multinomial probit model using marginal data augmentation. J. Econom. 124(2), 311–334 (2005)
Jiang, Z., Ding, P.: Robust modeling using non-elliptically contoured multivariate t distributions. J. Stat. Plan. Inference 177, 50–63 (2016)
Kim, S., Chen, M.-H., Dey, D.K.: Flexible generalized t-link models for binary response data. Biometrika 95(1), 93–106 (2008)
Kindo, B.P., Wang, H., Peña, E.A.: Multinomial probit Bayesian additive regression trees. Stat 5(1), 119–131 (2016)
Lange, K.L., Little, R.J., Taylor, J.M.: Robust statistical modeling using the t distribution. J. Am. Stat. Assoc. 84(408), 881–896 (1989)
Lee, S., McLachlan, G.J.: Finite mixtures of multivariate skew t-distributions: some recent and new results. Stat. Comput. 24(2), 181–202 (2014)
Lerman, S., Manski, C.: On the use of simulated frequencies to approximate choice probabilities. Struct. Anal. Discret. Data Econom. Appl. 10, 305–319 (1981)
Liu, C.: Robit regression: a simple robust alternative to logistic and probit regression. Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives, 227–238 (2004)
Liu, J.S.: Monte Carlo strategies in scientific computing. Springer (2008)
McCulloch, R., Rossi, P.E.: An exact likelihood analysis of the multinomial probit model. J. Econom. 64(1–2), 207–240 (1994)
McFadden, D.: Modeling the choice of residential location. Transp. Res. Rec. 673 (1978)
McFadden, D.: Econometric models of probabilistic choice. Structural analysis of discrete data with econometric applications, 198–272 (1981)
Paleti, R.: Discrete choice models with alternate kernel error distributions. J. Indian Inst. Sci., 1–10 (2019)
Paleti, R., Balan, L.: Misclassification in travel surveys and implications to choice modeling: application to household auto ownership decisions. Transportation 46(4), 1467–1485 (2019)
Peyhardi, D.J.: Robustness of student link function in multinomial choice models. J. Choice Model. 36, 100228 (2020)
Rayaprolu, H.S., Llorca, C., Moeckel, R.: Impact of bicycle highways on commuter mode choice: a scenario analysis. Environ. Plan. B Urban Anal. City Sci. 47(4), 662–677 (2020)
Robert, C., Casella, G.: Monte Carlo statistical methods. Springer (2013)
Scarinci, R., Markov, I., Bierlaire, M.: Network design of a transport system based on accelerating moving walkways. Transp. Res. Part C Emerg. Technol. 80, 310–328 (2017)
Tanner, M.A., Wong, W.H.: The calculation of posterior distributions by data augmentation. J. Am. Stat. Assoc. 82(398), 528–540 (1987)
Train, K.E.: Discrete choice methods with simulation. Cambridge University Press (2009)
Van Dyk, D.A.: Marginal Markov chain Monte Carlo methods. Stat. Sin., 1423–1454 (2010)
Van Dyk, D.A., Meng, X.-L.: The art of data augmentation. J. Comput. Graph. Stat. 10(1), 1–50 (2001)
Author information
Contributions
RK: conception and design, method development and implementation, data processing and analysis, manuscript writing and editing, supervision. MB: conception and design, manuscript editing, supervision. TG: conception and design, method development and implementation, manuscript writing and editing. PB: conception and design, manuscript writing and editing.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflicts of interest.
Appendices
Appendix A Gibbs sampling details
A.1 Sampling \(\varvec{w}\)
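To update \(\varvec{w}\), we iteratively sample each element \(w_{ij}\) from a univariate truncated normal distribution, conditional on the remaining elements \(w_{i,-j}\). We have
\[ w_{ij} \mid w_{i,-j}, \cdot \sim \mathcal{TN} \left( \mu _{ij}, \tau _{ij}^{2} \right) , \]
where \(\mathcal{TN}\) denotes a normal distribution with mean \(\mu _{ij}\) and variance \(\tau _{ij}^{2}\), truncated to the region implied by the constraint on \(w_{ij}\) given below.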
For MNP, \(\mu _{ij} = \varvec{X}_{ij}^{\top } \varvec{\beta } + \varvec{\Sigma }_{j,-j} \varvec{\Sigma }_{-j,-j}^{-1} (w_{i,-j} - \varvec{X}_{i,-j} \varvec{\beta })\) and \(\tau _{ij}^{2} = \varvec{\Sigma }_{jj} - \varvec{\Sigma }_{j,-j} \varvec{\Sigma }_{-j,-j}^{-1} \varvec{\Sigma }_{-j,j}\). For MNR, \(\mu _{ij} = \varvec{X}_{ij}^{\top } \varvec{\beta } + \varvec{\Sigma }_{j,-j} \varvec{\Sigma }_{-j,-j}^{-1} (w_{i,-j} - \varvec{X}_{i,-j} \varvec{\beta })\) and \(\tau _{ij}^{2} = (\varvec{\Sigma }_{jj} - \varvec{\Sigma }_{j,-j} \varvec{\Sigma }_{-j,-j}^{-1} \varvec{\Sigma }_{-j,j}) / q_{i}\). For Gen-MNR, \(\mu _{ij} = \varvec{X}_{ij}^{\top } \varvec{\beta } + \varvec{Q}_{ijj}^{-1/2} \varvec{\Sigma }_{j,-j} \varvec{\Sigma }_{-j,-j}^{-1} \varvec{Q}_{i,-j,-j}^{1/2} (w_{i,-j} - \varvec{X}_{i,-j} \varvec{\beta })\) and \(\tau _{ij}^{2} = (\varvec{\Sigma }_{jj} - \varvec{\Sigma }_{j,-j} \varvec{\Sigma }_{-j,-j}^{-1} \varvec{\Sigma }_{-j,j}) / q_{ij}\). Here, the index \(-l\) denotes the vector without the lth element. For all models, the constraint on \(w_{ij}\) is \(w_{ij} \ge \max \{ 0, w_{i,-j} \}\), if \(y_{ij} = j\); \(w_{ij} < 0\), if \(y_{ij} = J\); \(w_{ij} \le \max \{ 0, w_{ij'} \}\), if \(y_{ij} = j' \ne j\).
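As a rough illustration of the update described above, the following Python sketch performs one sweep over the elements of \(w_{i}\) for a single observation in the MNP and MNR cases. It is not the authors' implementation: the function names, the coding of the choice `y` (`0, ..., J-2` for the non-base alternatives and `len(w)` for the base alternative \(J\)), and the inverse-CDF truncated normal draw are assumptions made here for brevity. Inverse-CDF sampling loses accuracy far in the tails, where a specialised sampler such as that of Botev and L’Ecuyer (2016) is preferable.

```python
from statistics import NormalDist

import numpy as np


def conditional_moments(Sigma, resid, j, q=1.0):
    """Return (offset, tau2) for element j given the other residuals.

    offset = Sigma_{j,-j} Sigma_{-j,-j}^{-1} resid_{-j}
    tau2   = (Sigma_jj - Sigma_{j,-j} Sigma_{-j,-j}^{-1} Sigma_{-j,j}) / q
    Use q = 1 for MNP and q = q_i for MNR.
    """
    idx = [k for k in range(len(resid)) if k != j]
    # Sigma is symmetric, so solving against Sigma_{-j,j} gives the
    # regression coefficients Sigma_{j,-j} Sigma_{-j,-j}^{-1}.
    coef = np.linalg.solve(Sigma[np.ix_(idx, idx)], Sigma[idx, j])
    offset = float(coef @ resid[idx])
    tau2 = float(Sigma[j, j] - coef @ Sigma[idx, j]) / q
    return offset, tau2


def truncation_bounds(w, j, y):
    """Bounds on w[j] implied by the observed choice y (hypothetical coding)."""
    if y == j:                      # alternative j was chosen
        others = np.delete(w, j)
        return max([0.0, *others]), np.inf
    if y == len(w):                 # base alternative J was chosen
        return -np.inf, 0.0
    return -np.inf, max(0.0, w[y])  # another non-base alternative was chosen


def sample_truncnorm(mu, tau, lower, upper, rng):
    """Inverse-CDF draw from N(mu, tau^2) restricted to [lower, upper]."""
    std = NormalDist()
    a = std.cdf((lower - mu) / tau) if np.isfinite(lower) else 0.0
    b = std.cdf((upper - mu) / tau) if np.isfinite(upper) else 1.0
    u = a + (b - a) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)   # keep inv_cdf away from 0 and 1
    x = mu + tau * std.inv_cdf(u)
    return min(max(x, lower), upper)      # guard against rounding error


def gibbs_sweep_w(w, xb, Sigma, y, rng, q=1.0):
    """One sweep over the elements of w_i; xb holds X_ij^T beta for each j."""
    w = np.array(w, dtype=float)
    for j in range(len(w)):
        offset, tau2 = conditional_moments(Sigma, w - xb, j, q)
        lower, upper = truncation_bounds(w, j, y)
        w[j] = sample_truncnorm(xb[j] + offset, np.sqrt(tau2), lower, upper, rng)
    return w
```

Calling `gibbs_sweep_w(w, xb, Sigma, y, np.random.default_rng(0))` returns an updated copy of `w` that satisfies the truncation constraints for the observed choice. The Gen-MNR case would additionally rescale the residuals by \(\varvec{Q}_{i}^{1/2}\) and is left out of this sketch.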
A.2 Sampling \(\nu \)
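The full conditional distribution of \(\nu \) is nonstandard. With a Gamma prior on \(\nu \) with shape \(\alpha _{0}\) and rate \(\beta _{0}\), Ding (2014) shows that
\[ p(\nu \mid \cdot ) \propto \exp \left\{ \frac{N \nu }{2} \log \frac{\nu }{2} - N \log \Gamma \left( \frac{\nu }{2} \right) + (\alpha _{0} - 1) \log \nu - \xi \nu \right\} , \tag{11} \]
where \(\xi = \beta _{0} + \frac{1}{2} \sum _{i = 1}^{N} q_{i} - \frac{1}{2} \sum _{i = 1}^{N} \log q_{i}\). \(\Gamma (x)\) denotes the Gamma function. Ding (2014) proposes to sample from (11) using a Metropolised Independence sampler (Liu 2008) with an approximate Gamma proposal. The shape parameter \(\alpha ^{*}\) and the rate parameter \(\beta ^{*}\) of the proposal density are obtained as follows. The log conditional density of \(\nu \) up to an additive constant is
\[ l(\nu ) = \frac{N \nu }{2} \log \frac{\nu }{2} - N \log \Gamma \left( \frac{\nu }{2} \right) + (\alpha _{0} - 1) \log \nu - \xi \nu . \]
The log density of the Gamma proposal up to an additive constant is
\[ h(\nu ) = (\alpha ^{*} - 1) \log \nu - \beta ^{*} \nu . \]
The first and second derivatives of \(l(\nu )\) and \(h(\nu )\) are
\[ l'(\nu ) = \frac{N}{2} \left( \log \frac{\nu }{2} + 1 - \psi \left( \frac{\nu }{2} \right) \right) + \frac{\alpha _{0} - 1}{\nu } - \xi , \qquad l''(\nu ) = \frac{N}{2} \left( \frac{1}{\nu } - \frac{1}{2} \psi ' \left( \frac{\nu }{2} \right) \right) - \frac{\alpha _{0} - 1}{\nu ^{2}}, \]
\[ h'(\nu ) = \frac{\alpha ^{*} - 1}{\nu } - \beta ^{*}, \qquad h''(\nu ) = - \frac{\alpha ^{*} - 1}{\nu ^{2}}, \]
where \(\psi (x)\) and \(\psi '(x)\) are the di- and trigamma functions, respectively. The mode of \(h(\nu )\) is \(\frac{\alpha ^{*} - 1}{\beta ^{*}}\) and the corresponding curvature, i.e. the second derivative of \(h(\nu )\) at the mode, is \(- \frac{(\beta ^{*})^{2}}{\alpha ^{*} - 1}\). We numerically find the mode \(\nu ^{*}\) of \(l(\nu )\) and its corresponding curvature \(l^{*} = l''(\nu ^{*})\). Ultimately, we match the modes and the corresponding curvatures of \(l(\nu )\) and \(h(\nu )\) to obtain
\[ \alpha ^{*} = 1 - (\nu ^{*})^{2} l^{*}, \qquad \beta ^{*} = - \nu ^{*} l^{*}. \]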
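The mode-and-curvature matching described in this subsection can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' code: it assumes a Gamma prior on \(\nu \) with shape `alpha0` and rate `beta0` (the name `alpha0` is introduced here), so that \(l(\nu ) = \frac{N \nu }{2} \log \frac{\nu }{2} - N \log \Gamma (\frac{\nu }{2}) + (\alpha _{0} - 1) \log \nu - \xi \nu \). To stay within the standard library, it finds the mode by golden-section search and approximates \(l''(\nu ^{*})\) by a central finite difference instead of the trigamma expression. All function names are hypothetical.

```python
import math
import random


def make_log_target(q, alpha0, beta0):
    """Return l(nu), the log full conditional of nu up to a constant."""
    n = len(q)
    xi = beta0 + 0.5 * sum(q) - 0.5 * sum(math.log(v) for v in q)

    def l(nu):
        return (0.5 * n * nu * math.log(0.5 * nu) - n * math.lgamma(0.5 * nu)
                + (alpha0 - 1.0) * math.log(nu) - xi * nu)

    return l


def find_mode(f, lo=1e-3, hi=1e3, iters=200):
    """Golden-section search on log(nu); assumes f is unimodal on [lo, hi]."""
    a, b = math.log(lo), math.log(hi)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if f(math.exp(c)) > f(math.exp(d)):
            b = d   # the maximum lies in [a, d]
        else:
            a = c   # the maximum lies in [c, b]
    return math.exp(0.5 * (a + b))


def gamma_proposal(l, nu_star):
    """Match mode and curvature: shape = 1 - nu*^2 l*, rate = -nu* l*."""
    h = 1e-4 * nu_star
    curv = (l(nu_star + h) - 2.0 * l(nu_star) + l(nu_star - h)) / h ** 2
    return 1.0 - nu_star ** 2 * curv, -nu_star * curv


def update_nu(nu, q, alpha0, beta0, rng):
    """One Metropolised independence step for nu."""
    l = make_log_target(q, alpha0, beta0)
    shape, rate = gamma_proposal(l, find_mode(l))

    def log_weight(v):
        # log target minus log proposal, both up to additive constants
        return l(v) - ((shape - 1.0) * math.log(v) - rate * v)

    prop = rng.gammavariate(shape, 1.0 / rate)  # gammavariate takes a scale
    accept = math.exp(min(0.0, log_weight(prop) - log_weight(nu)))
    return prop if rng.random() < accept else nu
```

In a full sampler, `update_nu` would be called once per Gibbs iteration with the current values of \(q_{1}, \ldots , q_{N}\), for example `nu = update_nu(nu, q, 2.0, 0.1, random.Random(1))` with illustrative hyperparameter values.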
A.3 Sampling \(q_{ij}\)
The full conditional distribution of \(q_{ij}\) is nonstandard. Jiang and Ding (2016) show that
\[ p(q_{ij} \mid \cdot ) \propto q_{ij}^{\frac{\nu _{j} + 1}{2} - 1} \exp \left( - \frac{u_{ij}}{2} q_{ij} - c_{ij} \sqrt{q_{ij}} \right) , \tag{17} \]
where \(u_{ij} = \nu _{j} + ( \varvec{\Sigma }^{-1} )_{jj} (w_{ij} - \varvec{X}_{ij}^{\top } \varvec{\beta })^{2}\) and \(c_{ij} = (w_{ij} - \varvec{X}_{ij}^{\top } \varvec{\beta }) \sum _{j' \ne j} \sqrt{q_{ij'}} \, ( \varvec{\Sigma }^{-1} )_{jj'} (w_{ij'} - \varvec{X}_{ij'}^{\top } \varvec{\beta })\). Jiang and Ding (2016) propose to sample from (17) using a Metropolised Independence sampler (Liu 2008) with an approximate Gamma proposal. The shape parameter \(\alpha ^{*}\) and the rate parameter \(\beta ^{*}\) of the proposal density are obtained as follows. For \(\nu _{j} \le 1\), we set \(\alpha ^{*} = 1\) and \(\beta ^{*} = \frac{u_{ij}}{2}\). For \(\nu _{j} > 1\), \(\alpha ^{*}\) and \(\beta ^{*}\) are obtained through matching the modes and the corresponding curvatures of the target and the proposal densities. The log conditional density of \(q_{ij}\) up to an additive constant is
\[ f(q_{ij}) = \frac{\nu _{j} - 1}{2} \log q_{ij} - \frac{u_{ij}}{2} q_{ij} - c_{ij} \sqrt{q_{ij}}. \tag{18} \]
The log density of the Gamma proposal up to an additive constant is
\[ h(q_{ij}) = (\alpha ^{*} - 1) \log q_{ij} - \beta ^{*} q_{ij}. \tag{19} \]
The mode of (19) and its corresponding curvature, i.e. the second derivative at the mode, are \(\frac{\alpha ^{*} - 1}{\beta ^{*}} = m_{ij}^{*}\) and \(- \frac{(\beta ^{*})^{2}}{\alpha ^{*} - 1} = l_{ij}^{*}\), respectively. The first and second derivatives of (18) are
\[ f'(q_{ij}) = \frac{\nu _{j} - 1}{2 q_{ij}} - \frac{u_{ij}}{2} - \frac{c_{ij}}{2 \sqrt{q_{ij}}}, \qquad f''(q_{ij}) = - \frac{\nu _{j} - 1}{2 q_{ij}^{2}} + \frac{c_{ij}}{4 q_{ij}^{3/2}}. \]
The mode of (18) is \(m_{ij}^{*} = \left( \frac{ \frac{c_{ij}}{2} + \sqrt{ \left( \frac{c_{ij}}{2} \right) ^{2} + u_{ij} (\nu _{j} - 1)}}{\nu _{j} - 1} \right) ^{-2}\), and the corresponding curvature is \(l_{ij}^{*} = f''(m_{ij}^{*})\). After matching the modes and corresponding curvatures of the log target and the log proposal densities, we obtain
\[ \alpha ^{*} = 1 - (m_{ij}^{*})^{2} l_{ij}^{*}, \qquad \beta ^{*} = - m_{ij}^{*} l_{ij}^{*}. \]
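The update of \(q_{ij}\) can be sketched along the same lines. The Python code below is an illustrative sketch rather than the authors' implementation, and every function name is hypothetical. It assumes that the log target is \(f(q_{ij}) = \frac{\nu _{j} - 1}{2} \log q_{ij} - \frac{u_{ij}}{2} q_{ij} - c_{ij} \sqrt{q_{ij}}\), which is the form implied by the closed-form mode stated above, and it takes \(c_{ij}\) to be the cross term of the quadratic form, \((w_{ij} - \varvec{X}_{ij}^{\top } \varvec{\beta }) \sum _{j' \ne j} \sqrt{q_{ij'}} ( \varvec{\Sigma }^{-1} )_{jj'} (w_{ij'} - \varvec{X}_{ij'}^{\top } \varvec{\beta })\).

```python
import math
import random


def u_and_c(j, q, resid, prec, nu_j):
    """Compute u_ij and c_ij for one observation.

    q, resid: per-dimension scales q_ij and residuals w_ij - X_ij^T beta.
    prec: Sigma^{-1} as a nested list.
    """
    u = nu_j + prec[j][j] * resid[j] ** 2
    c = resid[j] * sum(math.sqrt(q[k]) * prec[j][k] * resid[k]
                       for k in range(len(q)) if k != j)
    return u, c


def log_target(x, nu_j, u, c):
    """Log full conditional of q_ij up to an additive constant."""
    return 0.5 * (nu_j - 1.0) * math.log(x) - 0.5 * u * x - c * math.sqrt(x)


def proposal_params(nu_j, u, c):
    """Shape and rate of the Gamma proposal."""
    if nu_j <= 1.0:
        return 1.0, 0.5 * u
    # Closed-form mode m* of the log target and its second derivative there
    t = (0.5 * c + math.sqrt((0.5 * c) ** 2 + u * (nu_j - 1.0))) / (nu_j - 1.0)
    m = t ** -2
    curv = -(nu_j - 1.0) / (2.0 * m ** 2) + c / (4.0 * m ** 1.5)
    return 1.0 - m ** 2 * curv, -m * curv


def update_q(q_cur, nu_j, u, c, rng):
    """One Metropolised independence step for q_ij."""
    shape, rate = proposal_params(nu_j, u, c)

    def log_weight(x):
        # log target minus log proposal, both up to additive constants
        return log_target(x, nu_j, u, c) - ((shape - 1.0) * math.log(x) - rate * x)

    prop = rng.gammavariate(shape, 1.0 / rate)  # gammavariate takes a scale
    accept = math.exp(min(0.0, log_weight(prop) - log_weight(q_cur)))
    return prop if rng.random() < accept else q_cur
```

When \(c_{ij} = 0\) and \(\nu _{j} > 1\), the target is exactly a Gamma density with shape \(\frac{\nu _{j} + 1}{2}\) and rate \(\frac{u_{ij}}{2}\), and `proposal_params` recovers these values, which gives a convenient sanity check for the matching step.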
Appendix B Additional results for the simulation study
B.1 Example I
B.2 Example II
About this article
Cite this article
Krueger, R., Bierlaire, M., Gasos, T. et al. Robust discrete choice models with t-distributed kernel errors. Stat Comput 33, 2 (2023). https://doi.org/10.1007/s11222-022-10182-3
DOI: https://doi.org/10.1007/s11222-022-10182-3
Keywords
- Robustness
- Probit
- Robit
- Bayesian estimation
- Discrete choice
- Outliers