Efficient MCMC estimation of inflated beta regression models

  • Original Paper
  • Computational Statistics

Abstract

This paper introduces a new and computationally efficient Markov chain Monte Carlo (MCMC) estimation algorithm for the Bayesian analysis of zero-inflated, one-inflated, and zero-and-one-inflated beta regression models. The algorithm is computationally efficient in the sense that it has low MCMC autocorrelations and low computational time. A simulation study shows that the proposed algorithm outperforms the slice sampling and random walk Metropolis–Hastings algorithms in both small and large sample settings. An empirical illustration on a loss given default banking model demonstrates the usefulness of the proposed algorithm.

Notes

  1. Note that the CG algorithm is only applicable to NIB regression models and to the continuous portion of the proposed ZOIB model. Moreover, although the CG algorithm can be nested in the DA algorithm to create another algorithm for the first scenario, this nesting is not considered as the results from the second scenario suggest that the QL-based algorithm marginally outperforms the CG algorithm for \(\beta \) but significantly outperforms CG for \(\phi \).

  2. The priors are set as follows. The hyperparameters \(b_0\) and \(g_0\) are appropriately sized vectors of zeros. The matrices \(B_0\) and \(G_0\) are identity matrices multiplied by 1000. The hyperparameters \(s_0\) and \(S_0\) are respectively set to 1 and 100. For the truncated normal prior, \(p_0\) and \(P_0\) are respectively set to 2 and 2000. For the log-normal prior, denoted as \(\mathcal {LN}(\phi \vert l_0, L_0)\), \(l_0\) and \(L_0\) are respectively set to 0 and 4. These priors are set so that they are relatively uninformative. Note that the priors for \(\beta \) and \(\phi \) from the second scenario are also set this way.

  3. Using starting values other than the maximum likelihood estimates does not affect convergence or the qualitative results.

  4. See http://www.mathworks.com/help/stats/slicesample.html.

  5. This implementation was confirmed with Matlab technical support.

  6. Computational time includes the time to obtain the QL estimates.

  7. The CG algorithm, which also samples \(\beta \) and \(\phi \) conditional on each other, is based on first-order Taylor series approximations around the most up-to-date values of the parameters. One possible reason that the CG algorithm does not perform as well as the QL algorithm is the approximation error in the first-order Taylor series expansions used by CG.

  8. Although a model that accounts for temporal correlations can be used, the variation in this dataset is too limited to meaningfully estimate such model specifications.

References

  • Albert JH, Chib S (1993) Bayesian analysis of binary and polychotomous response data. J Am Stat Assoc 88(422):669–679

  • Basel Committee on Banking Supervision (2001) Overview of the New Basel Capital Accord. Technical report. http://www.bis.org/publ/bcbsca02.pdf

  • Billio M, Casarin R (2011) Beta autoregressive transition Markov-switching models for business cycle analysis. Stud Nonlinear Dyn Econom 15(4):1–32

  • Bonat WH, Ribeiro PJ Jr, Zeviani WM (2015) Likelihood analysis for a class of beta mixed models. J Appl Stat 42(2):252–266

  • Branscum AJ, Johnson WO, Thurmond MC (2007) Bayesian beta regression: applications to household expenditure data and genetic distance between foot-and-mouth disease viruses. Aust N Z J Stat 49(3):287–301

  • Calabrese R (2012) Regression Model for Proportions with Probability Masses at Zero and One. Working Papers 201209, Geary Institute, University College Dublin

  • Casarin R, Dalla Valle L, Leisen F et al (2012) Bayesian model selection for beta autoregressive processes. Bayesian Anal 7(2):385–410

  • Cepeda E, Gamerman D (2005) Bayesian methodology for modeling parameters in the two parameter exponential family. Revista Estadística 57(168–169):93–105

  • Cepeda-Cuervo E, Garrido L (2015) Bayesian beta regression models with joint mean and dispersion modeling. Monte Carlo Methods Appl 21(1):49–58

  • Cook DO, Kieschnick R, McCullough BD (2008) Regression analysis of proportions in finance with self selection. J Empir Financ 15(5):860–867

  • Da-Silva C, Migon H (2012) Hierarchical dynamic beta model. Technical report 253, Departamento de Métodos Estatísticos, Universidade Federal do Rio de Janeiro

  • Da-Silva C, Migon HS, Correia L (2011) Dynamic Bayesian beta models. Comput Stat Data Anal 55(6):2074–2089

  • Ferrari S, Cribari-Neto F (2004) Beta regression for modelling rates and proportions. J Appl Stat 31(7):799–815

  • Figueroa-Zúñiga JI, Arellano-Valle RB, Ferrari SL (2013) Mixed beta regression: a Bayesian perspective. Comput Stat Data Anal 61:137–147

  • Imai K, van Dyk DA (2005) A Bayesian analysis of the multinomial probit model using marginal data augmentation. J Econom 124(2):311–334

  • Jeliazkov I, Graves J, Kutzbach M (2008) Fitting and comparison of models for multivariate ordinal outcomes. Adv Econom 23:115–156

  • Kieschnick R, McCullough BD (2003) Regression analysis of variates observed on (0, 1): percentages, proportions and fractions. Stat Model 3(3):193–213

  • Le Cam L, Yang GL (2012) Asymptotics in statistics: some basic concepts. Springer, Berlin

  • Li P, Qi M, Zhang X, Zhao X (2016) Further investigation of parametric loss given default modeling. J Credit Risk 12(4):17–47

  • Liu F, Eugenio EC (2016) A review and comparison of Bayesian and likelihood-based inferences in beta regression and zero-or-one-inflated beta regression. Stat Methods Med Res, p 0962280216650699

  • Liu F, Kong Y (2015) zoib: an R package for Bayesian inference for beta regression and zero/one inflated beta regression. R J 7(2):34–51

  • Liu JS, Wong WH, Kong A (1994) Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika 81(1):27–40

  • McFadden D (1974) Conditional logit analysis of qualitative choice behavior. In: Zarembka P (ed) Frontiers in econometrics. Academic Press, New York, pp 105–142

  • Neal RM (2003) Slice sampling. Ann Stat 31:705–741

  • Ospina R, Ferrari SL (2010) Inflated beta distributions. Stat Papers 51(1):111–126

  • Ospina R, Ferrari SL (2012) A general class of zero-or-one inflated beta regression models. Comput Stat Data Anal 56(6):1609–1623

  • Paolino P (2001) Maximum likelihood estimation of models with beta-distributed dependent variables. Polit Anal 9(4):325–346

  • Papke LE, Wooldridge JM (1996) Econometric methods for fractional response variables with an application to 401 (k) plan participation rates. J Appl Econom 11(6):619–632

  • Prakasa Rao BLS (1987) Asymptotic theory of statistical inference. Wiley, New York

  • Qi M, Zhao X (2011) Comparison of modeling methods for loss given default. J Bank Financ 35(11):2842–2855

  • Qi M, Zhao X (2013) Debt structure, market value of firm and recovery rate. J Credit Risk 9(1):3–37

  • Rocha AV, Cribari-Neto F (2009) Beta autoregressive moving average models. Test 18(3):529–545

  • Rydlewski JP (2007) Beta-regression model for periodic data with a trend. Univ Iagellonicae Acta Math 45:211–222

  • Simas AB, Barreto-Souza W, Rocha AV (2010) Improved estimators for a general class of beta regression models. Comput Stat Data Anal 54(2):348–366

  • Smithson M, Verkuilen J (2006) A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychol Methods 11(1):54

  • Tanner MA, Wong WH (1987) The calculation of posterior distributions by data augmentation. J Am Stat Assoc 82(398):528–540

  • Verkuilen J, Smithson M (2012) Mixed and mixture regression models for continuous bounded responses using the beta distribution. J Educ Behav Stat 37(1):82–113

  • Wieczorek J, Nugent C, Hawala S (2012) A Bayesian zero-one inflated beta model for small area shrinkage estimation. In: Proceedings, section on survey research methods. American Statistical Association, Alexandria, VA

  • Zimprich D (2010) Modeling change in skewed variables using mixed beta regression models. Res Hum Dev 7(1):9–26

Acknowledgements

I would like to thank Xinlei Zhao for generously providing the data and model specification. Also, Sibel Sirakaya, Alicia Lloro, Angela Vossmeyer, Jonathan Cook, and Andrew Chang provided numerous suggestions that improved this paper. I am also indebted to an anonymous referee for providing helpful suggestions. This paper was started before I joined the Office of Financial Research. The views and opinions expressed here are not necessarily those of the Department of the Treasury or of the Office of Financial Research.

Author information

Corresponding author

Correspondence to Phillip Li.

Appendix

1.1 Parameterization of the inverse-gamma distribution

If X has an inverse-gamma distribution with shape a and scale b, then the density function is parameterized as

$$\begin{aligned} \mathcal {IG}(x \vert a, b) = \frac{x^{-(a+1)}}{{\varGamma }(a) b^a} e^{-\frac{1}{xb}}. \end{aligned}$$

The mean is \(\mathbb {E}(X) = (b(a-1))^{-1}\) for \(a>1\), and the variance is \(\mathbb {V}(X)=(b^2(a-1)^2(a-2))^{-1}\) for \(a>2\).
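
The parameterization above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names (`ig_logpdf`, `ig_mean`, `ig_var`, `ig_sample`) are hypothetical and are not part of the paper. It relies on the fact that if \(G\) is gamma distributed with shape \(a\) and scale \(b\), then \(1/G\) has the density given above.

```python
import math
import numpy as np

def ig_logpdf(x, a, b):
    """Log of IG(x | a, b) in the parameterization above (b multiplies x in the exponent)."""
    x = np.asarray(x, dtype=float)
    return -(a + 1.0) * np.log(x) - 1.0 / (x * b) - math.lgamma(a) - a * math.log(b)

def ig_mean(a, b):
    """E(X) = 1 / (b (a - 1)), defined for a > 1."""
    if a <= 1.0:
        raise ValueError("the mean requires a > 1")
    return 1.0 / (b * (a - 1.0))

def ig_var(a, b):
    """V(X) = 1 / (b^2 (a - 1)^2 (a - 2)), defined for a > 2."""
    if a <= 2.0:
        raise ValueError("the variance requires a > 2")
    return 1.0 / (b ** 2 * (a - 1.0) ** 2 * (a - 2.0))

def ig_sample(a, b, size, rng):
    """Draw from IG(a, b) by inverting Gamma(shape=a, scale=b) draws."""
    return 1.0 / rng.gamma(shape=a, scale=b, size=size)

# Illustrative values only (not taken from the paper).
a, b = 5.0, 0.5
draws = ig_sample(a, b, 100_000, np.random.default_rng(0))
print("analytic mean, variance:", ig_mean(a, b), ig_var(a, b))
print("sample mean, variance:  ", draws.mean(), draws.var())
```

Note that many software libraries parameterize the inverse-gamma distribution with \(1/b\) in place of \(b\), so the hyperparameters should be converted before calling such routines.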

1.2 Quasi-likelihood quantities

The following describes the method to obtain consistent estimates of \(\beta \) and \(\phi \) (or equivalently, \(\psi \)). Following Papke and Wooldridge (1996), the first step is to maximize the following log Bernoulli quasi-likelihood function

$$\begin{aligned} L_{QL}(b) = \sum _{t=1}^{n_{01} } \left[ y_t \log \left( \frac{ \exp (x_t^\top b) }{ 1 + \exp (x_t^\top b)} \right) + (1-y_t) \log \left( 1 - \frac{ \exp (x_t^\top b) }{ 1 + \exp (x_t^\top b)} \right) \right] \end{aligned}$$
(23)

with respect to the \(K_2 \times 1\) vector b. In (23), \(t=1,\ldots ,n_{01}\) indexes the \(N_{01}\) subsample, and \(n_{01}\) is the cardinality of \(N_{01}\). Note that the referenced paper expresses the log QL function in terms of an arbitrary function \(G(\cdot )\); in this paper, \(G( x_t^\top b) = \text {logit}^{-1}(x_t^\top b) = \frac{ \exp (x_t^\top b) }{ 1 + \exp (x_t^\top b)}\), which leads to (23). Papke and Wooldridge (1996) show that the maximizing vector, denoted as \(\widehat{\beta }\), is a consistent estimator of \(\beta \), as long as \(\mathbb {E}(y_t) = \frac{\exp (x_t^\top \beta )}{1+\exp (x_t^\top \beta )}\) for \(t=1,\ldots ,n_{01}\) (which holds due to (3) and (8) holding for \(i \in N_{01}\)). Next, under the assumption that \(\mathbb {V}(y_t) = \delta ^2 \mu _t (1-\mu _t)\) for \(t=1,\ldots ,n_{01}\) with \(\delta ^2 = \frac{1}{1+\phi }\) (which is true given (9) for \(i \in N_{01}\)), the authors also show that the estimator

$$\begin{aligned} \widehat{\delta }^2 = \frac{ \sum _{t=1}^{n_{01}} \tilde{u}_t^2}{n_{01} - K_2}, \end{aligned}$$

is consistent for \(\delta ^2\), where

$$\begin{aligned} \tilde{u}_t = \frac{y_t - \widehat{\mu }_t}{ \sqrt{ \widehat{\mu }_t(1-\widehat{\mu }_t) } } \end{aligned}$$

and

$$\begin{aligned} \widehat{\mu }_t = \frac{\exp (x_t^\top \widehat{\beta } )}{1+\exp (x_t^\top \widehat{\beta })}. \end{aligned}$$

Then, by the continuous mapping theorem, the estimator \(\widehat{\phi }= \frac{1-\widehat{\delta }^2}{\widehat{\delta }^2}\) is consistent for \(\phi \). The vector \(\widehat{\psi }\) in Sect. 3.1 is defined as \(\widehat{\psi } = (\widehat{\beta }^\top , \widehat{\phi })^\top \).
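
The two-step procedure above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used in the paper: it assumes NumPy, maximizes (23) with a plain Newton-Raphson iteration started at zero, and uses simulated beta-distributed data in place of the \(N_{01}\) subsample. All function names and the simulated values of \(\beta \) and \(\phi \) are hypothetical.

```python
import numpy as np

def inv_logit(z):
    """Inverse logit link, G(z) = exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_bernoulli_ql(X, y, tol=1e-10, max_iter=100):
    """Maximize the log Bernoulli quasi-likelihood (23) by Newton-Raphson."""
    b = np.zeros(X.shape[1])
    for _ in range(max_iter):
        mu = inv_logit(X @ b)
        grad = X.T @ (y - mu)                              # score of (23)
        neg_hess = (X * (mu * (1.0 - mu))[:, None]).T @ X  # minus the Hessian
        step = np.linalg.solve(neg_hess, grad)
        b = b + step
        if np.max(np.abs(step)) < tol:
            break
    return b

def ql_estimates(X, y):
    """Return (beta_hat, delta2_hat, phi_hat) as defined in the text."""
    n, k = X.shape
    beta_hat = fit_bernoulli_ql(X, y)
    mu_hat = inv_logit(X @ beta_hat)
    u = (y - mu_hat) / np.sqrt(mu_hat * (1.0 - mu_hat))   # weighted residuals
    delta2_hat = np.sum(u ** 2) / (n - k)
    phi_hat = (1.0 - delta2_hat) / delta2_hat             # inverts delta^2 = 1/(1+phi)
    return beta_hat, delta2_hat, phi_hat

# Simulated stand-in for the continuous subsample (values are illustrative).
rng = np.random.default_rng(0)
n, beta_true, phi_true = 5000, np.array([0.5, -1.0]), 10.0
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
mu = inv_logit(X @ beta_true)
y = rng.beta(mu * phi_true, (1.0 - mu) * phi_true)
beta_hat, delta2_hat, phi_hat = ql_estimates(X, y)
print(beta_hat, delta2_hat, phi_hat)
```

The mapping from \(\widehat{\delta }^2\) to \(\widehat{\phi }\) simply inverts \(\delta ^2 = 1/(1+\phi )\). It yields a positive \(\widehat{\phi }\) only when \(\widehat{\delta }^2 < 1\), which a practical implementation may want to check.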

1.3 Information matrix quantities

Partition \(\widehat{K}\) as

$$\begin{aligned} \widehat{K}&= \left( \begin{array}{cc} \underset{(K_2 \times K_2)}{\widehat{K}_{\beta \beta }} &{} \underset{(K_2 \times 1)}{\widehat{K}_{\beta \phi } }\\ \underset{(1 \times K_2)}{\widehat{K}_{\phi \beta } }&{} \underset{(1 \times 1)}{ \widehat{K}_{\phi \phi } } \end{array} \right) . \end{aligned}$$
(24)

Following Ferrari and Cribari-Neto (2004), the quantities in (24) are defined as

$$\begin{aligned} \widehat{K}_{\beta \beta }&= \widehat{\phi }X_{01}^\top W_{01} X_{01} \nonumber \\ W_{01}&= \text {diag}(w_1, \ldots , w_{n_{01}})\nonumber \\ w_{t}&= \widehat{\phi } \left[ \psi '( \widehat{\mu }_t \widehat{\phi }) + \psi '( (1-\widehat{\mu }_t) \widehat{\phi }) \right] \left[ \widehat{\mu }_t (1 - \widehat{\mu }_t) \right] ^2 \end{aligned}$$
(25)
$$\begin{aligned} \widehat{K}_{\beta \phi }&= \widehat{K}^\top _{ \phi \beta } = X_{01}^\top T_{01}C_{01} \nonumber \\ T_{01}&= \text {diag}(\widehat{\mu }_1(1-\widehat{\mu }_1), \ldots , \widehat{\mu }_{n_{01}}(1-\widehat{\mu }_{n_{01}} ) ) \nonumber \\ C_{01}&= (c_1, \ldots , c_{n_{01}})^\top \nonumber \\ c_t&= \widehat{\phi } \left[ \psi '( \widehat{\mu }_t \widehat{\phi })\widehat{\mu }_t - \psi '( (1-\widehat{\mu }_t) \widehat{\phi }) (1 - \widehat{\mu }_t) \right] \nonumber \\ \widehat{K}_{\phi \phi }&= \text {trace}(D_{01})\nonumber \\ D_{01}&= \text {diag}(d_1, \ldots , d_{n_{01}}) \nonumber \\ d_{t}&= \psi '( \widehat{\mu }_t \widehat{\phi }) \widehat{\mu }_t^2 + \psi '( (1-\widehat{\mu }_t) \widehat{\phi }) (1 - \widehat{\mu }_t)^2 - \psi '(\widehat{\phi }), \end{aligned}$$
(26)

where \(t=1,\ldots , n_{01}\), \(X_{01}=(x_1, \ldots , x_{n_{01}})^\top \) is an \(n_{01} \times K_2\) matrix of covariates for the observations in \(N_{01}\), and \(\psi '(\cdot )\) denotes the trigamma function. Note that the quantities above are defined over \(N_{01}\), while the quantities in Ferrari and Cribari-Neto (2004) are defined over the full sample. Also, the analogous quantities for (25) and (26) in Ferrari and Cribari-Neto (2004) are expressed for an arbitrary link function \(g( \mu _t)\). In this paper, \(g(\mu _t) = \text {logit}(\mu _t) = \log (\frac{\mu _t}{1-\mu _t})\), and \(g'(\mu _t) = \frac{1}{\mu _t(1-\mu _t)}\) (the first derivative), so plugging these quantities into the formulas in the referenced paper results in the expressions for (25) and (26).
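
As an illustration of how (24) to (26) fit together, the sketch below assembles \(\widehat{K}\) from a covariate matrix, fitted means, and a precision estimate. It assumes NumPy and the logit link used in this paper. The `trigamma` helper is a simple recurrence plus asymptotic-series approximation written for this example; a library routine such as `scipy.special.polygamma(1, x)` could be used instead. The function names are hypothetical.

```python
import numpy as np

def trigamma(x):
    """Approximate psi'(x) for x > 0: shift x upward, then use an asymptotic series."""
    x = float(x)
    acc = 0.0
    while x < 6.0:                 # psi'(x) = psi'(x + 1) + 1 / x^2
        acc += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv + 0.5 * inv2 + inv * inv2 * (
        1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 * (1.0 / 42.0 - inv2 / 30.0)))
    return acc + series

def information_matrix(X, mu_hat, phi_hat):
    """Assemble K_hat in (24) from the blocks in (25) and (26), logit link."""
    tg = np.vectorize(trigamma)
    p1 = tg(mu_hat * phi_hat)                    # psi'(mu_t * phi)
    p2 = tg((1.0 - mu_hat) * phi_hat)            # psi'((1 - mu_t) * phi)
    v = mu_hat * (1.0 - mu_hat)                  # diagonal of T_01
    w = phi_hat * (p1 + p2) * v ** 2             # diagonal of W_01
    c = phi_hat * (p1 * mu_hat - p2 * (1.0 - mu_hat))
    d = p1 * mu_hat ** 2 + p2 * (1.0 - mu_hat) ** 2 - trigamma(phi_hat)
    k = X.shape[1]
    K = np.empty((k + 1, k + 1))
    K[:k, :k] = phi_hat * (X.T @ (X * w[:, None]))   # K_beta_beta
    K[:k, k] = X.T @ (v * c)                         # K_beta_phi
    K[k, :k] = K[:k, k]                              # K_phi_beta
    K[k, k] = d.sum()                                # K_phi_phi = trace(D_01)
    return K

# Illustrative inputs only (not taken from the paper).
X = np.column_stack([np.ones(4), np.array([-1.0, 0.0, 0.5, 2.0])])
mu_hat = np.array([0.2, 0.5, 0.6, 0.9])
print(information_matrix(X, mu_hat, phi_hat=8.0))
```

A quick sanity check is the scalar case with a single intercept, \(\widehat{\mu }_1 = 0.5\) and \(\widehat{\phi } = 2\): then \(c_1 = 0\), so the off-diagonal block vanishes, and the diagonal entries reduce to \(\pi ^2/12\) and \(1 - \pi ^2/12\).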

About this article

Cite this article

Li, P. Efficient MCMC estimation of inflated beta regression models. Comput Stat 33, 127–158 (2018). https://doi.org/10.1007/s00180-017-0747-x

  • DOI: https://doi.org/10.1007/s00180-017-0747-x
