Bayesian procedures as a numerical tool for the estimation of an intertemporal discrete choice model

Empirical Economics

Abstract

Discrete choice models usually require a general specification of unobserved heterogeneity. In this paper, we apply Bayesian procedures as a numerical tool for the estimation of a female labor supply model based on a sample size that is typical for common household panels. We provide two important results for the practitioner: First, for a specification with a multivariate normal distribution for the unobserved heterogeneity, the Bayesian MCMC estimator yields results almost identical to those of a classical maximum simulated likelihood (MSL) estimator. Second, we show that when imposing distributional assumptions that are consistent with economic theory, e.g., log-normally distributed consumption preferences, the Bayesian method performs well and provides reasonable estimates, while the MSL estimator does not converge. These results indicate that Bayesian procedures can be a beneficial tool for the estimation of intertemporal discrete choice models.

Notes

  1. The same applies to hazard rate models that try to separate duration dependence from dynamic selection due to unobserved characteristics; see, e.g., Lancaster (1990) and van den Berg (2001) for overviews.

  2. Further, discrete choice models with unobserved heterogeneity do not rely on the IIA assumption that is inherent to standard discrete choice models, such as the conditional logit model. This is of course relevant as well for the static case.

  3. Zellner and Rossi (1984) were the first to apply Bayesian procedures to a logit model using importance sampling. Starting with Zeger and Karim (1991), various MCMC methods have been developed for the estimation of discrete choice models; see Albert and Chib (1993) and McCulloch and Rossi (1994) for probit models, and Allenby and Lenk (1994) and Allenby (1997) for logit models. More recently, several advances have been made that can increase the efficiency of the MCMC procedures: Holmes and Held (2006) propose extensions to the auxiliary variables approach of Albert and Chib (1993); for data augmentation methods see Rossi et al. (2005), Frühwirth-Schnatter and Frühwirth (2007, 2010), and Scott (2011); Gramacy and Polson (2012) develop a simulation-based framework for regularized logistic regression.

  4. Alternatively, the random effect can be specified in a latent class framework by assuming discrete points in the heterogeneity distribution, see e.g., Heckman and Singer (1984) or Pacifico (2013) in the context of discrete choice labor supply.

  5. See Train (2009) for more details on this method as well as alternative approaches such as the method of simulated moments and the method of simulated scores.

  6. This follows directly from the symmetry of the normal distribution. As has been pointed out, the posterior distribution of the parameters is asymptotically normal.

  7. Gelman et al. (1995) find that the optimal acceptance rate is about 0.44 if \(x_{ijt}\) contains only one variable and decreases with the number of variables in \(x_{ijt}\) toward 0.23.

  8. This leaves us with 1,000 draws for the actual estimation. Increasing the number of retained draws by taking more draws after burn-in leads to only marginal changes in the results.

  9. See Akay (2012) for the performance of the method suggested by Wooldridge (2005).

  10. The specification of demographic variables is similar to, e.g., Aaberge et al. (1995), van Soest (1995) or Blundell et al. (2000).

  11. Note that for women not employed in the month preceding the interview, gross hourly wages are estimated by applying a two-stage estimation procedure with a Heckman sample selection correction.

  12. We use an analytic gradient in the optimization algorithm. The simulation of the choice probabilities is based on 200 Halton draws. Estimating the model with other choices of R has shown that 200 Halton draws seem to be a lower bound to the number of draws required for the MSLE to have good statistical properties in our finite sample.

  13. We have also performed a robustness check simulating data sets where either specification 4 or specification 5 is assumed to correspond to the true data generating process. The panel data of the covariates are taken as given to maintain the relevant distributions. The estimates from the real data are assumed to be the true values of the parameters. Then, we estimate the model for specifications 4 and 5 using the simulated data sets with both the MSL and MCMC estimators. In line with our findings on the real data, the MSLE only converges for specification 4, while the MCMC estimator accurately reproduces (within the limits of the estimation accuracy) the parameter values of the data generating process. Again, the MSL and MCMC estimators produce almost identical results if the MSLE converges at the global maximum. We have simulated a number of data sets (changing the seed before taking draws) to make sure that this result does not hinge on a special outcome for the simulated data sets.

  14. For a detailed description of the calculation see Haan (2010) and Haan and Uhlendorff (2013).
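Note 12 refers to simulating choice probabilities with 200 Halton draws. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a binary logit with a single normally distributed random coefficient, which is a deliberate simplification of the model in the paper. The function names `halton` and `simulated_logit_prob` and all parameter values are hypothetical.

```python
import numpy as np
from scipy.stats import norm


def halton(n, base=2):
    """First n points of the one-dimensional Halton sequence (index 0 skipped)."""
    points = np.empty(n)
    for i in range(1, n + 1):
        f, x, k = 1.0, 0.0, i
        while k > 0:
            f /= base
            x += f * (k % base)   # add the next digit of i in the given base
            k //= base
        points[i - 1] = x
    return points


def simulated_logit_prob(x, b, sigma, R=200):
    """Simulated probability of choosing 1 when beta ~ N(b, sigma^2).

    The quasi-random uniforms are mapped to normal draws via the inverse CDF,
    and the logit probability is averaged over the R draws.
    """
    beta = b + sigma * norm.ppf(halton(R))
    return np.mean(1.0 / (1.0 + np.exp(-beta * x)))


# Hypothetical covariate value and moments, chosen for illustration only.
p = simulated_logit_prob(x=1.0, b=0.5, sigma=1.0)
```

In a maximum simulated likelihood routine, the log of such simulated probabilities would be summed over individuals and maximized with respect to `b` and `sigma`; with several random coefficients, one Halton sequence per dimension (with distinct prime bases) would typically be used.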

References

  • Aaberge R, Dagsvik J, Stroem S (1995) Labor supply responses and welfare effects of tax reforms. Scand J Econ 97:635–659

  • Akay A (2009) Dynamics of the employment assimilation of first-generation immigrant men in Sweden: comparing dynamic and static assimilation models with longitudinal data. IZA Discussion Paper, 4655

  • Akay A (2012) Finite-sample comparison of alternative methods for estimating dynamic panel data models. J Appl Econ 27:1189–1204

  • Albert JH, Chib S (1993) Bayesian analysis of binary and polychotomous response data. J Am Stat Assoc 88(422):669–679

  • Allenby G (1997) An introduction to hierarchical Bayesian modelling. Tutorial Notes, Advanced Research Techniques Forum, American Marketing Association

  • Allenby GM, Lenk PJ (1994) Modeling household purchase behavior with logistic normal regression. J Am Stat Assoc 89(428):1218–1231

  • Blundell R, Duncan A, McCrae J, Meghir C (2000) The labour market impact of the working families’ tax credit. Fisc Stud 21(1):75–104

  • Chiappori P (1988) Rational household labor supply. Econometrica 56:63–89

  • Dube J-P, Hitsch GJ, Rossi PE (2010) State dependence and alternative explanations for consumer inertia. Rand J Econ 41:417–445

  • Fitzenberger B, Osikominu A, Paul M (2010) The heterogeneous effects of training incidence and duration on labor market transitions. IZA Discussion Paper, 5269

  • Frühwirth-Schnatter S, Frühwirth R (2007) Auxiliary mixture sampling with applications to logistic models. Comput Stat Data Anal 51:3509–3528

  • Frühwirth-Schnatter S, Frühwirth R (2010) Data augmentation and MCMC for binary and multinomial logit models. In: Kneib T, Tutz G (eds) Statistical modelling and regression structures. Physica-Verlag Heidelberg, pp 111–132

  • Gelman A, Carlin JB, Stern HS, Rubin DB (1995) Bayesian data analysis. Chapman and Hall, London

  • Gramacy RB, Polson NG (2012) Simulation-based regularized logistic regression. Bayesian Anal 7(3):567–590

  • Haan P (2010) A multi-state model of state dependence in labor supply: intertemporal labor supply effects of a shift from joint to individual taxation. Labour Econ 17(2):323–335

  • Haan P, Uhlendorff A (2013) Intertemporal labor supply and involuntary unemployment. Empir Econ 44:661–683

  • Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57(1):97–109

  • Heckman J (1981a) Heterogeneity and state dependence. In: Rosen S (ed) Studies in labor markets. Chicago Press, Chicago, IL, pp 91–139

  • Heckman J (1981b) Statistical models for discrete panel data. In: Manski C, McFadden D (eds) Structural analysis of discrete data with econometric applications. MIT Press, Cambridge, MA, pp 114–178

  • Heckman J, Singer B (1984) A method for minimizing the distributional assumptions in econometric models for duration data. Econometrica 52:271–320

  • Holmes CC, Held L (2006) Bayesian auxiliary variable models for binary and multinomial regression. Bayesian Anal 1:145–168

  • Imai S, Jain N, Ching A (2009) Bayesian estimation of dynamic discrete choice models. Econometrica 77:1865–1899

  • Keane M, Wolpin K (2001) Estimating welfare effects consistent with forward-looking behavior: part II: empirical results. J Hum Resour 37:600–622

  • Lancaster T (1990) The econometric analysis of transition data. Cambridge University Press, Cambridge

  • McCulloch R, Rossi PE (1994) An exact likelihood analysis of the multinomial probit model. J Econ 64(1–2):207–240

  • McFadden D, Train K (2000) Mixed MNL models for discrete response. J Appl Econ 15(5):447–470

  • Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21(6):1087–1092

  • Pacifico D (2013) On the role of unobserved heterogeneity in discrete choice models of labour supply. Empir Econ 45:929–963

  • Prowse V (2012) Modeling employment dynamics with state dependence and unobserved heterogeneity. J Bus Econ Stat 30:411–431

  • Regier DA, Ryan M, Phimister E, Marra CA (2009) Bayesian and classical estimation of mixed logit: an application to genetic testing. J Health Econ 28(3):598–610

  • Rossi P, Allenby G, McCulloch R (2005) Bayesian statistics and marketing. Wiley, New York

  • Scott S (2011) Data augmentation, frequentist estimation, and the Bayesian analysis of multinomial logit models. Stat Pap 52(1):87–109

  • Steiner V, Wrohlich K, Geyer J, Haan P (2008) Documentation of the tax-benefit microsimulation model STSM: Version 2008. Data Documentation 31

  • Train K (2001) A comparison of hierarchical Bayes and maximum simulated likelihood for mixed logit. Discussion paper

  • Train K (2009) Discrete choice models using simulation, 2nd edn. Cambridge University Press, Cambridge

  • Troske K, Voicu A (2010) Joint estimation of sequential labor force participation and fertility decisions using Markov Chain Monte Carlo techniques. Labour Econ 17:150–169

  • van den Berg GJ (2001) Duration models: specification, identification, and multiple durations. In: Heckman JJ, Leamer E (eds) Handbook of econometrics, vol 5. North-Holland, Amsterdam, pp 3381–3460

  • van Soest A (1995) Structural models of family labor supply: a discrete choice approach. J Hum Resour 30:63–88

  • Wagner G, Frick J, Schupp J (2007) The German socio-economic panel study (SOEP)—scope, evolution and enhancements. Schmollers Jahrb 127(1):139–169

  • Wooldridge J (2005) Simple solutions to the initial conditions problem for dynamic, nonlinear panel data models with unobserved heterogeneity. J Appl Econ 20:39–54

  • Zeger SL, Karim MR (1991) Generalized linear models with random effects; a Gibbs sampling approach. J Am Stat Assoc 86(413):79–86

  • Zellner A, Rossi PE (1984) Bayesian analysis of dichotomous quantal response models. J Econ 25(3):365–393

Acknowledgments

The authors would like to thank Joerg Breitung, Carsten Trenkler, Victoria Prowse, Arthur van Soest, and Louis Raes for valuable comments. Peter Haan and Daniel Kemptner would like to thank the Thyssen Foundation (Projects 10112085 and 10141098) for financial support.

Author information

Corresponding author

Correspondence to Peter Haan.

Appendix

1.1 Additional details on the numerical implementation

We describe how the draws of \(\beta _i\) have been taken by applying the Metropolis–Hastings algorithm. The draws of the fixed coefficients \(\alpha \) have been taken analogously (see Train 2009 for more details). The algorithm is embedded into the Gibbs sampler. Starting with an arbitrary value for \(\beta _{i}\), the following steps are implemented within each iteration of the Gibbs sampling conditional on the other subsets of the parameters:

  1. Draw F independent values from a standard normal density to get a vector \(\eta \).

  2. Compute a new trial value \(\beta _{i}^\mathrm{new}=\beta _{i}^\mathrm{old}+\rho L\eta \), where L is the Cholesky factor of W and \(\rho \) is the size of each jump, which is adjusted iteratively to target an acceptance rate of 0.3. Hence, the jump \(\beta _{i}^\mathrm{new}-\beta _{i}^\mathrm{old}\) is normally distributed with mean zero and variance \(\rho ^{2}W\).

  3. Draw \(\mu \) from a standard uniform distribution and calculate \(R=\frac{L_{i\mathbf j }(\alpha ,\beta _{i}^\mathrm{new})\phi (\beta _{i}^\mathrm{new}|b,W)}{L_{i\mathbf j }(\alpha ,\beta _{i}^\mathrm{old})\phi (\beta _{i}^\mathrm{old}|b,W)}\).

  4. If \(\mu \le R\), accept \(\beta _{i}^\mathrm{new}\); otherwise, reject \(\beta _{i}^\mathrm{new}\) and keep \(\beta _{i}^\mathrm{old}\).
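The four steps above can be sketched in code as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: `log_lik` is a hypothetical callable that returns \(\log L_{i\mathbf j }(\alpha ,\beta )\) with \(\alpha \) held fixed, the ratio R is evaluated in logs for numerical stability, and the multiplicative tuning rule in `adapt_rho` is an assumption, since the text only states that \(\rho \) is adjusted iteratively toward an acceptance rate of 0.3.

```python
import numpy as np


def log_normal_kernel(beta, b, W_inv):
    """Log of phi(beta | b, W) up to a constant that cancels in the ratio R."""
    d = beta - b
    return -0.5 * d @ W_inv @ d


def mh_step(beta_old, log_lik, b, W, rho, rng):
    """One Metropolis-Hastings update of beta_i; returns (beta, accepted)."""
    L = np.linalg.cholesky(W)                  # lower-triangular factor, W = L L'
    W_inv = np.linalg.inv(W)
    eta = rng.standard_normal(beta_old.size)   # step 1: F standard normal draws
    beta_new = beta_old + rho * L @ eta        # step 2: trial value
    log_R = (log_lik(beta_new) + log_normal_kernel(beta_new, b, W_inv)
             - log_lik(beta_old) - log_normal_kernel(beta_old, b, W_inv))  # step 3
    mu = rng.uniform()
    accepted = np.log(mu) <= log_R             # step 4: accept if mu <= R
    return (beta_new if accepted else beta_old), accepted


def adapt_rho(rho, acceptance_rate, target=0.3):
    """Hypothetical tuning rule: larger jumps if accepting too often, smaller otherwise."""
    return rho * 1.1 if acceptance_rate > target else rho / 1.1


# Toy usage with a flat likelihood, so that the draws target N(b, W) itself.
rng = np.random.default_rng(0)
b = np.array([1.0, -1.0])
W = np.array([[1.0, 0.3], [0.3, 0.5]])
beta, rho, draws, accepted = b.copy(), 1.0, [], []
for it in range(10_000):
    beta, acc = mh_step(beta, lambda x: 0.0, b, W, rho, rng)
    draws.append(beta)
    accepted.append(acc)
    if (it + 1) % 100 == 0:                    # retune every 100 iterations
        rho = adapt_rho(rho, np.mean(accepted[-100:]))
draws = np.array(draws)
```

Within the Gibbs sampler, a step of this kind would be run for every individual i in each iteration, conditional on the current values of \(\alpha \), b and W. In applied work the tuning of \(\rho \) is usually confined to the burn-in phase.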

1.2 Estimated distributions of consumption preferences

The figure shows the marginal distributions of the consumption preferences \(\beta _i^I\) for different distributional assumptions. We consider here the preferred specifications with four random coefficients (specifications 4–6). The density functions have been computed by taking a large number of draws from the respective distributions on the basis of the estimated moments and using a standard software package to obtain kernel density estimates. This shows how the different distributional assumptions translate into fairly different estimates for the unobserved heterogeneity (Fig. 1).

Fig. 1 Estimated distributions of consumption preferences for specifications 4–6
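The procedure behind the figure (taking many draws from each estimated distribution and smoothing them with a kernel density estimator) could look roughly like the sketch below. It compares only a normal and a log-normal assumption; the moments are made-up placeholders rather than the paper's estimates, and the use of `scipy.stats.gaussian_kde` is an assumption, since the text refers only to "a standard software package".

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
n_draws = 20_000

# Hypothetical moments, NOT the estimates reported in the paper.
mean_normal, sd_normal = 2.0, 1.0   # normally distributed preference
mean_log, sd_log = 0.5, 0.6         # moments of log(beta) under log-normality

draws_normal = rng.normal(mean_normal, sd_normal, n_draws)
draws_lognormal = np.exp(rng.normal(mean_log, sd_log, n_draws))

# Evaluate both kernel density estimates on a common grid.
grid = np.linspace(-2.0, 8.0, 200)
density_normal = gaussian_kde(draws_normal)(grid)
density_lognormal = gaussian_kde(draws_lognormal)(grid)
```

Plotting `density_normal` and `density_lognormal` against `grid` would show the qualitative difference described in the text: the normal assumption places some mass on negative consumption preferences, whereas the log-normal assumption restricts them to be positive.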

Cite this article

Haan, P., Kemptner, D. & Uhlendorff, A. Bayesian procedures as a numerical tool for the estimation of an intertemporal discrete choice model. Empir Econ 49, 1123–1141 (2015). https://doi.org/10.1007/s00181-014-0906-7

  • DOI: https://doi.org/10.1007/s00181-014-0906-7
