Robust discrete choice models with t-distributed kernel errors

  • Original Paper
  • Statistics and Computing

Abstract

Outliers in discrete choice response data may result from misclassification and misreporting of the response variable and from choice behaviour that is inconsistent with modelling assumptions (e.g. random utility maximisation). In the presence of outliers, standard discrete choice models produce biased estimates and suffer from compromised predictive accuracy. Robust statistical models are less sensitive to outliers than standard non-robust models. This paper analyses two robust alternatives to the multinomial probit (MNP) model. The two models are robit models whose kernel error distributions are heavy-tailed t-distributions to moderate the influence of outliers. The first model is the multinomial robit (MNR) model, in which a generic degrees of freedom parameter controls the heavy-tailedness of the kernel error distribution. The second model, the generalised multinomial robit (Gen-MNR) model, is more flexible than MNR, as it allows for distinct heavy-tailedness in each dimension of the kernel error distribution. For both models, we derive Gibbs samplers for posterior inference. In a simulation study, we illustrate the finite sample properties of the proposed Bayes estimators and show that MNR and Gen-MNR produce more accurate estimates if the choice data contain outliers through the lens of the non-robust MNP model. In a case study on transport mode choice behaviour, MNR and Gen-MNR outperform MNP by substantial margins in terms of in-sample fit and out-of-sample predictive accuracy. The case study also highlights differences in elasticity estimates across models.

Notes

  1. The estimation code is available at https://github.com/RicoKrueger/robit.

References

  • Albert, J.H., Chib, S.: Bayesian analysis of binary and polychotomous response data. J. Am. Stat. Assoc. 88(422), 669–679 (1993)

  • Alptekinoğlu, A., Semple, J.H.: The exponomial choice model: a new alternative for assortment and price optimization. Oper. Res. 64(1), 79–93 (2016)

  • Benoit, D.F., Van Aelst, S., Van den Poel, D.: Outlier-robust Bayesian multinomial choice modeling. J. Appl. Econom. 31(7), 1445–1466 (2016)

  • Bezanson, J., Edelman, A., Karpinski, S., Shah, V.B.: Julia: a fresh approach to numerical computing. SIAM Rev. 59(1), 65–98 (2017)

  • Bhat, C.R.: A heteroscedastic extreme value model of intercity travel mode choice. Transp. Res. Part B Methodol. 29(6), 471–483 (1995)

  • Bhat, C.R.: The maximum approximate composite marginal likelihood (MACML) estimation of multinomial probit-based unordered response choice models. Transp. Res. Part B Methodol. 45(7), 923–939 (2011)

  • Botev, Z.I., L’Ecuyer, P.: Simulation from the normal distribution truncated to an interval in the tail. In: VALUETOOLS (2016)

  • Brathwaite, T., Walker, J.L.: Asymmetric, closed-form, finite-parameter models of multinomial choice. J. Choice Model. 29, 78–112 (2018)

  • Burgette, L.F., Nordheim, E.V.: The trace restriction: an alternative identification strategy for the Bayesian multinomial probit model. J. Bus. Econ. Stat. 30(3), 404–410 (2012)

  • Burgette, L.F., Puelz, D., Hahn, P.R., et al.: A symmetric prior for multinomial probit models. Bayesian Analysis (2020)

  • Castillo, E., Menéndez, J.M., Jiménez, P., Rivas, A.: Closed form expressions for choice probabilities in the Weibull case. Transp. Res. Part B Methodol. 42(4), 373–380 (2008)

  • Chikaraishi, M., Nakayama, S.: Discrete choice models with q-product random utilities. Transp. Res. Part B Methodol. 93, 576–595 (2016)

  • Chipman, H.A., George, E.I., McCulloch, R.E., et al.: BART: Bayesian additive regression trees. Ann. Appl. Stat. 4(1), 266–298 (2010)

  • Daganzo, C.: Multinomial probit. Academic Press (1979)

  • Del Castillo, J.: A class of RUM choice models that includes the model in which the utility has logistic distributed errors. Transp. Res. Part B Methodol. 91, 1–20 (2016)

  • Del Castillo, J.: Choice probabilities of random utility maximization models when the errors distribution is a polynomial copula with Gumbel marginals. Transp. A Transp. Sci. 16(3), 439–472 (2020)

  • Dill, J., Rose, G.: Electric bikes and transportation policy: insights from early adopters. Transp. Res. Rec. 2314(1), 1–6 (2012)

  • Ding, P.: Bayesian robust inference of sample selection using selection-t models. J. Multivar. Anal. 124, 451–464 (2014)

  • Dubey, S., Bansal, P., Daziano, R.A., Guerra, E.: A generalized continuous-multinomial response model with a t-distributed error kernel. Transp. Res. Part B Methodol. 133, 114–141 (2020)

  • Fosgerau, M., Bierlaire, M.: Discrete choice models with multiplicative error terms. Transp. Res. Part B Methodol. 43(5), 494–505 (2009)

  • Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., Rubin, D.B.: Bayesian data analysis. CRC Press (2013)

  • Gelman, A., Hill, J.: Data analysis using regression and multilevel/hierarchical models. Cambridge University Press (2006)

  • Gelman, A., Rubin, D.B., et al.: Inference from iterative simulation using multiple sequences. Stat. Sci. 7(4), 457–472 (1992)

  • Geweke, J., Keane, M., Runkle, D.: Alternative computational approaches to inference in the multinomial probit model. Rev. Econ. Stat., 609–632 (1994)

  • Hajivassiliou, V., McFadden, D., Ruud, P.: Simulation of multivariate normal rectangle probabilities and their derivatives theoretical and computational results. J. Econom. 72(1–2), 85–134 (1996)

  • Hausman, J.A., Abrevaya, J., Scott-Morton, F.M.: Misclassification of the dependent variable in a discrete-response setting. J. Econom. 87(2), 239–269 (1998)

  • Hillel, T., Elshafie, M.Z., Jin, Y.: Recreating passenger mode choice-sets for transport simulation: a case study of London, UK. Proc. Inst. Civil Eng. Smart Infrastruct. Constr. 171(1), 29–42 (2018)

  • Huang, A., Wand, M.P., et al.: Simple marginally noninformative prior distributions for covariance matrices. Bayesian Anal. 8(2), 439–452 (2013)

  • Imai, K., Van Dyk, D.A.: A Bayesian analysis of the multinomial probit model using marginal data augmentation. J. Econom. 124(2), 311–334 (2005)

  • Jiang, Z., Ding, P.: Robust modeling using non-elliptically contoured multivariate t distributions. J. Stat. Plan. Inference 177, 50–63 (2016)

  • Kim, S., Chen, M.-H., Dey, D.K.: Flexible generalized t-link models for binary response data. Biometrika 95(1), 93–106 (2008)

  • Kindo, B.P., Wang, H., Peña, E.A.: Multinomial probit Bayesian additive regression trees. Stat 5(1), 119–131 (2016)

  • Lange, K.L., Little, R.J., Taylor, J.M.: Robust statistical modeling using the t distribution. J. Am. Stat. Assoc. 84(408), 881–896 (1989)

  • Lee, S., McLachlan, G.J.: Finite mixtures of multivariate skew t-distributions: some recent and new results. Stat. Comput. 24(2), 181–202 (2014)

  • Lerman, S., Manski, C.: On the use of simulated frequencies to approximate choice probabilities. Struct. Anal. Discret. Data Econom. Appl. 10, 305–319 (1981)

  • Liu, C.: Robit regression: a simple robust alternative to logistic and probit regression. Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives, 227–238 (2004)

  • Liu, J.S.: Monte Carlo strategies in scientific computing. Springer (2008)

  • McCulloch, R., Rossi, P.E.: An exact likelihood analysis of the multinomial probit model. J. Econom. 64(1–2), 207–240 (1994)

  • McFadden, D.: Modeling the choice of residential location. Transp. Res. Rec. (673) (1978)

  • McFadden, D.: Econometric models of probabilistic choice. Structural analysis of discrete data with econometric applications, 198–272 (1981)

  • Paleti, R.: Discrete choice models with alternate kernel error distributions. J. Indian Inst. Sci., 1–10 (2019)

  • Paleti, R., Balan, L.: Misclassification in travel surveys and implications to choice modeling: application to household auto ownership decisions. Transportation 46(4), 1467–1485 (2019)

  • Peyhardi, D.J.: Robustness of student link function in multinomial choice models. J. Choice Model. 36, 100228 (2020)

  • Rayaprolu, H.S., Llorca, C., Moeckel, R.: Impact of bicycle highways on commuter mode choice: a scenario analysis. Environ. Plan. B Urban Anal. City Sci. 47(4), 662–677 (2020)

  • Robert, C., Casella, G.: Monte Carlo statistical methods. Springer (2013)

  • Scarinci, R., Markov, I., Bierlaire, M.: Network design of a transport system based on accelerating moving walkways. Transp. Res. Part C Emerg. Technol. 80, 310–328 (2017)

  • Tanner, M.A., Wong, W.H.: The calculation of posterior distributions by data augmentation. J. Am. Stat. Assoc. 82(398), 528–540 (1987)

  • Train, K.E.: Discrete choice methods with simulation. Cambridge University Press (2009)

  • Van Dyk, D.A.: Marginal Markov chain Monte Carlo methods. Stat. Sin., 1423–1454 (2010)

  • Van Dyk, D.A., Meng, X.-L.: The art of data augmentation. J. Comput. Graph. Stat. 10(1), 1–50 (2001)

Author information

Contributions

RK: conception and design, method development and implementation, data processing and analysis, manuscript writing and editing, supervision. MB: conception and design, manuscript editing, supervision. TG: conception and design, method development and implementation, manuscript writing and editing. PB: conception and design, manuscript writing and editing.

Corresponding author

Correspondence to Rico Krueger.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A Gibbs sampling details

A.1 Sampling \(\varvec{w}\)

To update \(\varvec{w}\), we iteratively sample from univariate truncated normal distributions. We have

$$\begin{aligned} w_{ij} \sim TN(\mu _{ij}, \tau _{ij}^{2}), \quad \text {for } i = 1, \ldots , N, \; j = 1, \ldots , J-1. \end{aligned}$$
(10)

For MNP, \(\mu _{ij} = \varvec{X}_{ij}^{\top } \varvec{\beta } + \varvec{\Sigma }_{j,-j} \varvec{\Sigma }_{-j,-j}^{-1} (w_{i,-j} - \varvec{X}_{i,-j} \varvec{\beta })\) and \(\tau _{ij}^{2} = \varvec{\Sigma }_{jj} - \varvec{\Sigma }_{j,-j} \varvec{\Sigma }_{-j,-j}^{-1} \varvec{\Sigma }_{-j,j}\). For MNR, \(\mu _{ij} = \varvec{X}_{ij}^{\top } \varvec{\beta } + \varvec{\Sigma }_{j,-j} \varvec{\Sigma }_{-j,-j}^{-1} (w_{i,-j} - \varvec{X}_{i,-j} \varvec{\beta })\) and \(\tau _{ij}^{2} = (\varvec{\Sigma }_{jj} - \varvec{\Sigma }_{j,-j} \varvec{\Sigma }_{-j,-j}^{-1} \varvec{\Sigma }_{-j,j}) / q_{i}\). For Gen-MNR, \(\mu _{ij} = \varvec{X}_{ij}^{\top } \varvec{\beta } + \varvec{Q}_{ijj}^{-1/2} \varvec{\Sigma }_{j,-j} \varvec{\Sigma }_{-j,-j}^{-1} \varvec{Q}_{i,-j,-j}^{1/2} (w_{i,-j} - \varvec{X}_{i,-j} \varvec{\beta })\) and \(\tau _{ij}^{2} = (\varvec{\Sigma }_{jj} - \varvec{\Sigma }_{j,-j} \varvec{\Sigma }_{-j,-j}^{-1} \varvec{\Sigma }_{-j,j}) / q_{ij}\). Here, a subscript \(-j\) denotes the vector or matrix with the \(j\)th element (row or column) removed. For all models, the constraint on \(w_{ij}\) is \(w_{ij} \ge \max \{ 0, w_{i,-j} \}\), if \(y_{ij} = j\); \(w_{ij} < 0\), if \(y_{ij} = J\); \(w_{ij} \le \max \{ 0, w_{ij'} \}\), if \(y_{ij} = j' \ne j\).
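The update in (10) can be illustrated with a minimal Python sketch for the MNP and MNR cases. It is not the authors' implementation: the function names, the 0-based indexing (with `y_i == J - 1` standing for the base alternative \(J\)), the assumption \(J - 1 \ge 2\), and the use of `scipy.stats.truncnorm` are choices made here for illustration. Setting `q_i = 1` corresponds to MNP; the Gen-MNR mean with the \(\varvec{Q}\) factors is not covered.

```python
import numpy as np
from scipy.stats import truncnorm


def conditional_moments(j, w_i, xb_i, Sigma, q_i=1.0):
    """Conditional mean and variance of w_ij given w_{i,-j} (MNP if q_i == 1, else MNR)."""
    rest = np.delete(np.arange(len(w_i)), j)
    # coef = Sigma_{-j,-j}^{-1} Sigma_{-j,j}
    coef = np.linalg.solve(Sigma[np.ix_(rest, rest)], Sigma[rest, j])
    mu = xb_i[j] + coef @ (w_i[rest] - xb_i[rest])
    tau2 = (Sigma[j, j] - Sigma[j, rest] @ coef) / q_i
    return mu, tau2


def truncation_bounds(j, w_i, y_i):
    """Interval for w_ij implied by the observed choice y_i (0-based)."""
    n = len(w_i)  # n = J - 1 utility differences
    if y_i == j:  # alternative j chosen: w_ij must be the largest and non-negative
        return max(0.0, np.delete(w_i, j).max()), np.inf
    if y_i == n:  # base alternative chosen: all differences are negative
        return -np.inf, 0.0
    return -np.inf, max(0.0, w_i[y_i])  # another alternative j' chosen


def draw_w_ij(j, w_i, xb_i, Sigma, y_i, rng, q_i=1.0):
    """One draw of w_ij from the truncated normal in (10)."""
    mu, tau2 = conditional_moments(j, w_i, xb_i, Sigma, q_i)
    lo, hi = truncation_bounds(j, w_i, y_i)
    tau = np.sqrt(tau2)
    # truncnorm expects bounds standardised by loc and scale
    a, b = (lo - mu) / tau, (hi - mu) / tau
    return truncnorm.rvs(a, b, loc=mu, scale=tau, random_state=rng)
```

Within a Gibbs sweep, `draw_w_ij` would be called for each \(j\) in turn, writing the draw back into `w_i` before moving on to the next element.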

A.2 Sampling \(\nu \)

The full conditional distribution of \(\nu \) is nonstandard. Ding (2014) shows that

$$\begin{aligned} p(\nu \vert \cdot ) \propto \exp \left\{ \frac{N \nu }{2} \log \left( \frac{\nu }{2} \right) - N \log \Gamma \left( \frac{\nu }{2} \right) + (\alpha _{0} - 1) \log \nu - \xi \nu \right\} , \end{aligned}$$
(11)

where \(\xi = \beta _{0} + \frac{1}{2} \sum _{i = 1}^{N} q_{i} - \frac{1}{2} \sum _{i = 1}^{N} \log q_{i}\). \(\Gamma (x)\) denotes the Gamma function. Ding (2014) proposes to sample from (11) using a Metropolised Independence sampler (Liu 2008) with an approximate Gamma proposal. The shape parameter \(\alpha ^{*}\) and the rate parameter \(\beta ^{*}\) of the proposal density are obtained as follows. The log conditional density of \(\nu \) up to an additive constant is

$$\begin{aligned} l(\nu ) = \frac{N \nu }{2} \log \left( \frac{\nu }{2} \right) - N \log \Gamma \left( \frac{\nu }{2} \right) + (\alpha _{0} - 1) \log \nu - \xi \nu . \end{aligned}$$
(12)

The log density of the Gamma proposal is

$$\begin{aligned} h(\nu ) = (\alpha ^{*} - 1) \log \nu - \beta ^{*} \nu . \end{aligned}$$
(13)

The first and second derivatives of \(l(\nu )\) and \(h(\nu )\) are

$$\begin{aligned} l'(\nu ) &= \frac{N}{2} \left[ \log \left( \frac{\nu }{2} \right) + 1 - \psi \left( \frac{\nu }{2} \right) \right] + \frac{\alpha _{0} - 1}{\nu } - \xi , \\ h'(\nu ) &= \frac{\alpha ^{*} - 1}{\nu } - \beta ^{*}, \end{aligned}$$
(14)
$$\begin{aligned} l''(\nu ) &= \frac{N}{2} \left[ \frac{1}{\nu } - \frac{1}{2} \psi ' \left( \frac{\nu }{2} \right) \right] - \frac{\alpha _{0} - 1}{\nu ^{2}}, \\ h''(\nu ) &= - \frac{\alpha ^{*} - 1}{\nu ^{2}}, \end{aligned}$$
(15)

where \(\psi (x)\) and \(\psi '(x)\) are the digamma and trigamma functions, respectively. The mode of \(h(\nu )\) is \(\frac{\alpha ^{*} - 1}{\beta ^{*}}\), and the corresponding curvature (the second derivative at the mode) is \(- \frac{(\beta ^{*})^{2}}{\alpha ^{*} - 1}\). We numerically find the mode \(\nu ^{*}\) of \(l(\nu )\) and its corresponding curvature \(l^{*} = l''(\nu ^{*})\). Finally, we match the modes and the corresponding curvatures of \(l(\nu )\) and \(h(\nu )\) to obtain

$$\begin{aligned} \alpha ^{*} = 1 - (\nu ^{*})^{2} l^{*}, \quad \beta ^{*} = - \nu ^{*} l^{*}. \end{aligned}$$
(16)
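The mode-and-curvature matching in (16), followed by one Metropolised independence step, can be sketched in Python as below. This is an illustrative sketch rather than the authors' code: the function names, the bounded search interval for the mode, and the use of `scipy.optimize.minimize_scalar` are assumptions made here. The second derivative is coded as obtained by differentiating \(l'(\nu )\) in (14) directly, so the prior term enters with a minus sign.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln, polygamma
from scipy.stats import gamma


def log_target(nu, N, alpha0, xi):
    """l(nu) from (12), up to an additive constant."""
    return (N * nu / 2 * np.log(nu / 2) - N * gammaln(nu / 2)
            + (alpha0 - 1) * np.log(nu) - xi * nu)


def second_derivative(nu, N, alpha0):
    """l''(nu), obtained by differentiating l'(nu) in (14)."""
    return N / 2 * (1 / nu - 0.5 * polygamma(1, nu / 2)) - (alpha0 - 1) / nu**2


def gamma_proposal(N, alpha0, xi, upper=1e3):
    """Shape and rate from (16); `upper` is an assumed search bound for the mode."""
    res = minimize_scalar(lambda v: -log_target(v, N, alpha0, xi),
                          bounds=(1e-6, upper), method="bounded")
    nu_star = res.x
    l_star = second_derivative(nu_star, N, alpha0)
    return 1 - nu_star**2 * l_star, -nu_star * l_star


def update_nu(nu, q, alpha0, beta0, rng):
    """One Metropolised independence step for nu given the mixing weights q."""
    N = len(q)
    xi = beta0 + 0.5 * np.sum(q) - 0.5 * np.sum(np.log(q))
    a, b = gamma_proposal(N, alpha0, xi)
    prop = rng.gamma(a, 1 / b)  # NumPy parameterises by shape and scale = 1 / rate
    log_ratio = (log_target(prop, N, alpha0, xi) - log_target(nu, N, alpha0, xi)
                 + gamma.logpdf(nu, a, scale=1 / b)
                 - gamma.logpdf(prop, a, scale=1 / b))
    return prop if np.log(rng.uniform()) < log_ratio else nu
```

Because the proposal does not depend on the current state, the acceptance ratio compares the importance weights (target over proposal) of the proposed and current values.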
Fig. 6 Estimated posterior distribution and true values of the taste parameters \(\{ \beta _{4}, \beta _{5} \}\) for MNR in simulation example I

A.3 Sampling \(q_{ij}\)

The full conditional distribution of \(q_{ij}\) is nonstandard. Jiang and Ding (2016) show that

$$\begin{aligned} p(q_{ij} \vert \cdot ) \propto \exp \left\{ - \frac{q_{ij} u_{ij}}{2} - \sqrt{q_{ij}} c_{ij} + \frac{\nu _{j} - 1}{2} \log q_{ij} \right\} , \end{aligned}$$
(17)

where \(u_{ij} = \nu _{j} + ( \varvec{\Sigma }^{-1} )_{jj} (w_{ij} - \varvec{X}_{ij}^{\top } \varvec{\beta })^{2}\) and \(c_{ij} = (w_{ij} - \varvec{X}_{ij}^{\top } \varvec{\beta }) \sum _{j' \ne j} \sqrt{q_{ij'}} \, ( \varvec{\Sigma }^{-1} )_{jj'} (w_{ij'} - \varvec{X}_{ij'}^{\top } \varvec{\beta })\). Jiang and Ding (2016) propose to sample from (17) using a Metropolised Independence sampler (Liu 2008) with an approximate Gamma proposal. The shape parameter \(\alpha ^{*}\) and the rate parameter \(\beta ^{*}\) of the proposal density are obtained as follows. For \(\nu _{j} \le 1\), we set \(\alpha ^{*} = 1\) and \(\beta ^{*} = \frac{u_{ij}}{2}\). For \(\nu _{j} > 1\), \(\alpha ^{*}\) and \(\beta ^{*}\) are obtained through matching the modes and the corresponding curvatures of the target and the proposal densities. The log conditional density of \(q_{ij}\) up to an additive constant is

$$\begin{aligned} f(q_{ij}) = - \frac{q_{ij} u_{ij}}{2} - \sqrt{q_{ij}} c_{ij} + \frac{\nu _{j} - 1}{2} \log q_{ij}. \end{aligned}$$
(18)

The log density of the Gamma proposal is

$$\begin{aligned} g(q_{ij} ) = (\alpha ^{*} - 1) \log q_{ij} - \beta ^{*} q_{ij}. \end{aligned}$$
(19)

The mode of (19) and its corresponding curvature (the second derivative at the mode) are \(\frac{\alpha ^{*} - 1}{\beta ^{*}} = m_{ij}^{*}\) and \(- \frac{(\beta ^{*})^{2}}{\alpha ^{*} - 1} = l_{ij}^{*}\), respectively. The first and second derivatives of (18) are

$$\begin{aligned} f'(q_{ij}) &= - \frac{u_{ij}}{2} - \frac{c_{ij}}{2 \sqrt{q_{ij}}} + \frac{\nu _{j} - 1}{2 q_{ij}}, \\ f''(q_{ij}) &= \frac{c_{ij}}{4 \sqrt{q_{ij}^{3}}} - \frac{\nu _{j} - 1}{2 q_{ij}^{2}}. \end{aligned}$$
(20)
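
As an intermediate step, the stationary point of (18) can be worked out in closed form for \(\nu _{j} > 1\). Setting \(f'(q_{ij}) = 0\) in (20), multiplying by \(2 q_{ij}\) and substituting \(s = \sqrt{q_{ij}} > 0\) gives a quadratic in \(s\) (the symbol \(s\) is introduced here for illustration only):

$$\begin{aligned} u_{ij} s^{2} + c_{ij} s - (\nu _{j} - 1) &= 0, \\ s &= \frac{- \frac{c_{ij}}{2} + \sqrt{ \left( \frac{c_{ij}}{2} \right) ^{2} + u_{ij} (\nu _{j} - 1)}}{u_{ij}}, \\ \frac{1}{s} &= \frac{ \frac{c_{ij}}{2} + \sqrt{ \left( \frac{c_{ij}}{2} \right) ^{2} + u_{ij} (\nu _{j} - 1)}}{\nu _{j} - 1}. \end{aligned}$$

Only the positive root is admissible because \(u_{ij} > 0\), and the last line follows by rationalising the numerator. The stationary point is \(q_{ij} = s^{2}\). Substituting \(\nu _{j} - 1 = u_{ij} s^{2} + c_{ij} s\) into \(f''\) gives \(f''(s^{2}) = - \frac{2 u_{ij} s + c_{ij}}{4 s^{3}} = - \frac{1}{2 s^{3}} \sqrt{ \left( \frac{c_{ij}}{2} \right) ^{2} + u_{ij} (\nu _{j} - 1)} < 0\), so the stationary point is a maximum.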

The mode of (18) is \(m_{ij}^{*} = \left( \frac{ \frac{c_{ij}}{2} + \sqrt{ \left( \frac{c_{ij}}{2} \right) ^{2} + u_{ij} (\nu _{j} - 1)}}{\nu _{j} - 1} \right) ^{-2}\), and the corresponding curvature is \(l_{ij}^{*} = f''(m_{ij}^{*})\). After matching the modes and corresponding curvatures of the log target and the log proposal densities, we obtain

$$\begin{aligned} \alpha ^{*} = 1 - (m_{ij}^{*})^{2} l_{ij}^{*}, \quad \beta ^{*} = - m_{ij}^{*} l_{ij}^{*}. \end{aligned}$$
(21)
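
The proposal construction in (21) and the subsequent Metropolised independence step for a single \(q_{ij}\) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, and \(u_{ij}\) and \(c_{ij}\) are taken as precomputed scalar inputs `u` and `c`.

```python
import numpy as np
from scipy.stats import gamma


def log_target(q, u, c, nu):
    """f(q) from (18), up to an additive constant."""
    return -q * u / 2 - np.sqrt(q) * c + (nu - 1) / 2 * np.log(q)


def gamma_proposal(u, c, nu):
    """Shape and rate of the Gamma proposal for one q_ij."""
    if nu <= 1:
        return 1.0, u / 2
    root = np.sqrt((c / 2) ** 2 + u * (nu - 1))
    m = ((c / 2 + root) / (nu - 1)) ** -2            # mode of (18)
    curv = c / (4 * m**1.5) - (nu - 1) / (2 * m**2)  # f''(m) from (20)
    return 1 - m**2 * curv, -m * curv                # matching as in (21)


def update_q(q, u, c, nu, rng):
    """One Metropolised independence step for q_ij."""
    a, b = gamma_proposal(u, c, nu)
    prop = rng.gamma(a, 1 / b)  # NumPy parameterises by shape and scale = 1 / rate
    log_ratio = (log_target(prop, u, c, nu) - log_target(q, u, c, nu)
                 + gamma.logpdf(q, a, scale=1 / b)
                 - gamma.logpdf(prop, a, scale=1 / b))
    return prop if np.log(rng.uniform()) < log_ratio else q
```

A quick sanity check on the matching: when \(c_{ij} = 0\), the target (17) is itself a Gamma density with shape \(\frac{\nu _{j} + 1}{2}\) and rate \(\frac{u_{ij}}{2}\), and `gamma_proposal` returns exactly those values, so every proposal is accepted.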

Appendix B Additional results for the simulation study

B.1 Example I

Fig. 7 Estimated posterior distribution and true values of the unique elements of the covariance matrix \(\varvec{\Sigma }\) for MNR in simulation example I

B.2 Example II

Fig. 8 Estimated posterior distribution and true values of the taste parameters \(\{ \beta _{4}, \beta _{5} \}\) for the Gen-MNR model in simulation example II

Fig. 9 Estimated posterior distribution and true values of the unique elements of the covariance matrix \(\varvec{\Sigma }\) for the Gen-MNR model in simulation example II

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Krueger, R., Bierlaire, M., Gasos, T. et al. Robust discrete choice models with t-distributed kernel errors. Stat Comput 33, 2 (2023). https://doi.org/10.1007/s11222-022-10182-3

  • DOI: https://doi.org/10.1007/s11222-022-10182-3
