Efficient MCMC estimation of inflated beta regression models

Li, Phillip

doi:10.1007/s00180-017-0747-x

Original Paper
Published: 07 July 2017

Volume 33, pages 127–158, (2018)
Computational Statistics

Phillip Li ORCID: orcid.org/0000-0002-1729-4012¹

Abstract

This paper introduces a new and computationally efficient Markov chain Monte Carlo (MCMC) estimation algorithm for the Bayesian analysis of zero-inflated, one-inflated, and zero-and-one-inflated beta regression models. The algorithm is computationally efficient in the sense that it has low MCMC autocorrelations and low computational time. A simulation study shows that the proposed algorithm outperforms the slice sampling and random walk Metropolis–Hastings algorithms in both small and large sample settings. An empirical illustration on a loss given default banking model demonstrates the usefulness of the proposed algorithm.

Notes

Note that the CG algorithm is only applicable to NIB regression models and to the continuous portion of the proposed ZOIB model. Moreover, although the CG algorithm can be nested in the DA algorithm to create another algorithm for the first scenario, this nesting is not considered as the results from the second scenario suggest that the QL-based algorithm marginally outperforms the CG algorithm for $\beta $ but significantly outperforms CG for $\phi $.
The priors are set as follows. The hyperparameters $b_0$ and $g_0$ are appropriately-sized vectors of zeros. The matrices $B_0$ and $G_0$ are identity matrices multiplied by 1000. The hyperparameters $s_0$ and $S_0$ are respectively set to 1 and 100. For the truncated normal prior, $p_0$ and $P_0$ are respectively set to 2 and 2000. For the log-normal prior, denoted as $\mathcal {LN}(\phi \vert l_0, L_0)$, $l_0$ and $L_0$ are respectively set to 0 and 4. These priors are set so that they are relatively uninformative. Note that the priors for $\beta $ and $\phi $ from the second scenario are also set this way.
Using starting values other than the maximum likelihood estimates does not affect convergence or the qualitative results.
See http://www.mathworks.com/help/stats/slicesample.html.
This implementation was confirmed with MATLAB technical support.
Computational time includes the time to obtain the QL estimates.
The CG algorithm, which also samples $\beta $ and $\phi $ conditional on each other, is based on first-order Taylor series approximations around the most up-to-date values of the parameters. One possible reason the CG algorithm does not perform as well as the QL algorithm is approximation error in the first-order Taylor series expansions used by CG.
Although a model that accounts for temporal correlations can be used, the variation in this dataset is too limited to meaningfully estimate such model specifications.

References

Albert JH, Chib S (1993) Bayesian analysis of binary and polychotomous response data. J Am Stat Assoc 88(422):669–679
Basel Committee on Banking Supervision (2001) Overview of the new Basel capital accord. Technical report. http://www.bis.org/publ/bcbsca02.pdf
Billio M, Casarin R (2011) Beta autoregressive transition Markov-switching models for business cycle analysis. Stud Nonlinear Dyn Econom 15(4):1–32
Bonat WH, Ribeiro PJ Jr, Zeviani WM (2015) Likelihood analysis for a class of beta mixed models. J Appl Stat 42(2):252–266
Branscum AJ, Johnson WO, Thurmond MC (2007) Bayesian beta regression: applications to household expenditure data and genetic distance between foot-and-mouth disease viruses. Aust N Z J Stat 49(3):287–301
Calabrese R (2012) Regression model for proportions with probability masses at zero and one. Working Papers 201209, Geary Institute, University College Dublin
Casarin R, Dalla Valle L, Leisen F et al (2012) Bayesian model selection for beta autoregressive processes. Bayesian Anal 7(2):385–410
Cepeda E, Gamerman D (2005) Bayesian methodology for modeling parameters in the two parameter exponential family. Revista Estadística 57(168–169):93–105
Cepeda-Cuervo E, Garrido L (2015) Bayesian beta regression models with joint mean and dispersion modeling. Monte Carlo Methods Appl 21(1):49–58
Cook DO, Kieschnick R, McCullough BD (2008) Regression analysis of proportions in finance with self selection. J Empir Financ 15(5):860–867
Da-Silva C, Migon H (2012) Hierarchical dynamic beta model. Technical report 253, Departamento de Métodos Estatísticos, Universidade Federal do Rio de Janeiro
Da-Silva C, Migon HS, Correia L (2011) Dynamic Bayesian beta models. Comput Stat Data Anal 55(6):2074–2089
Ferrari S, Cribari-Neto F (2004) Beta regression for modelling rates and proportions. J Appl Stat 31(7):799–815
Figueroa-Zúñiga JI, Arellano-Valle RB, Ferrari SL (2013) Mixed beta regression: a Bayesian perspective. Comput Stat Data Anal 61:137–147
Imai K, van Dyk DA (2005) A Bayesian analysis of the multinomial probit model using marginal data augmentation. J Econom 124(2):311–334
Jeliazkov I, Graves J, Kutzbach M (2008) Fitting and comparison of models for multivariate ordinal outcomes. Adv Econom 23:115–156
Kieschnick R, McCullough BD (2003) Regression analysis of variates observed on (0, 1): percentages, proportions and fractions. Stat Model 3(3):193–213
Le Cam L, Yang GL (2012) Asymptotics in statistics: some basic concepts. Springer, Berlin
Li P, Qi M, Zhang X, Zhao X (2016) Further investigation of parametric loss given default modeling. J Credit Risk 12(4):17–47
Liu F, Eugenio EC (2016) A review and comparison of Bayesian and likelihood-based inferences in beta regression and zero-or-one-inflated beta regression. Stat Methods Med Res, p 0962280216650699
Liu F, Kong Y (2015) zoib: an R package for Bayesian inference for beta regression and zero/one inflated beta regression. R J 7(2):34–51
Liu JS, Wong WH, Kong A (1994) Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika 81(1):27–40
McFadden D (1974) Conditional logit analysis of qualitative choice behavior. In: Zarembka P (ed) Frontiers in econometrics. Academic Press, New York, pp 105–142
Neal RM (2003) Slice sampling. Ann Stat 31:705–741
Ospina R, Ferrari SL (2010) Inflated beta distributions. Stat Papers 51(1):111–126
Ospina R, Ferrari SL (2012) A general class of zero-or-one inflated beta regression models. Comput Stat Data Anal 56(6):1609–1623
Paolino P (2001) Maximum likelihood estimation of models with beta-distributed dependent variables. Polit Anal 9(4):325–346
Papke LE, Wooldridge JM (1996) Econometric methods for fractional response variables with an application to 401(k) plan participation rates. J Appl Econom 11(6):619–632
Prakasa Rao BLS (1987) Asymptotic theory of statistical inference. Wiley, New York
Qi M, Zhao X (2011) Comparison of modeling methods for loss given default. J Bank Financ 35(11):2842–2855
Qi M, Zhao X (2013) Debt structure, market value of firm and recovery rate. J Credit Risk 9(1):3–37
Rocha AV, Cribari-Neto F (2009) Beta autoregressive moving average models. Test 18(3):529–545
Rydlewski JP (2007) Beta-regression model for periodic data with a trend. Univ Iagellonicae Acta Math 45:211–222
Simas AB, Barreto-Souza W, Rocha AV (2010) Improved estimators for a general class of beta regression models. Comput Stat Data Anal 54(2):348–366
Smithson M, Verkuilen J (2006) A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychol Methods 11(1):54
Tanner MA, Wong WH (1987) The calculation of posterior distributions by data augmentation. J Am Stat Assoc 82(398):528–540
Verkuilen J, Smithson M (2012) Mixed and mixture regression models for continuous bounded responses using the beta distribution. J Educ Behav Stat 37(1):82–113
Wieczorek J, Nugent C, Hawala S (2012) A Bayesian zero-one inflated beta model for small area shrinkage estimation. In: Proceedings, section on survey research methods. American Statistical Association, Alexandria, VA
Zimprich D (2010) Modeling change in skewed variables using mixed beta regression models. Res Hum Dev 7(1):9–26

Acknowledgements

I would like to thank Xinlei Zhao for generously providing the data and model specification. Also, Sibel Sirakaya, Alicia Lloro, Angela Vossmeyer, Jonathan Cook, and Andrew Chang provided numerous suggestions that improved this paper. I am also indebted to an anonymous referee for providing helpful suggestions. This paper was started before I joined the Office of Financial Research. The views and opinions expressed here are not necessarily those of the Department of the Treasury or of the Office of Financial Research.

Author information

Authors and Affiliations

717 14th Street NW, Washington, DC, 20220, USA
Phillip Li

Corresponding author

Correspondence to Phillip Li.

Appendix

1.1 Parameterization of the inverse-gamma distribution

If X has an inverse-gamma distribution with shape a and scale b, then the density function is parameterized as

$$\begin{aligned} \mathcal {IG}(x \vert a, b) = \frac{x^{-(a+1)}}{{\varGamma }(a) b^a} e^{-\frac{1}{xb}}. \end{aligned}$$

The mean is $\mathbb {E}(X) = (b(a-1))^{-1}$ for $a>1$, and the variance is $\mathbb {V}(X)=(b^2(a-1)^2(a-2))^{-1}$ for $a>2$.
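
As a quick check on this parameterization, the sketch below codes the density and the two moments in Python using only the standard library. It assumes that $b$ acts as the scale of the gamma distribution of $1/X$, which is what the density above implies. The function names (`ig_pdf`, `ig_mean`, `ig_var`, `ig_sample`) are illustrative and do not come from the paper.

```python
import math
import random


def ig_pdf(x, a, b):
    """Density x^{-(a+1)} / (Gamma(a) b^a) * exp(-1/(x b)) for x > 0."""
    log_pdf = (-(a + 1.0) * math.log(x) - math.lgamma(a)
               - a * math.log(b) - 1.0 / (x * b))
    return math.exp(log_pdf)


def ig_mean(a, b):
    """Mean 1 / (b (a - 1)), defined for a > 1."""
    if a <= 1:
        raise ValueError("the mean requires a > 1")
    return 1.0 / (b * (a - 1.0))


def ig_var(a, b):
    """Variance 1 / (b^2 (a - 1)^2 (a - 2)), defined for a > 2."""
    if a <= 2:
        raise ValueError("the variance requires a > 2")
    return 1.0 / (b ** 2 * (a - 1.0) ** 2 * (a - 2.0))


def ig_sample(a, b, rng):
    """Draw X as the reciprocal of a Gamma(shape a, scale b) variate."""
    return 1.0 / rng.gammavariate(a, b)
```

For instance, with $a=5$ and $b=0.5$ the formulas give a mean of $0.5$ and a variance of $1/12$, and the sample moments of repeated `ig_sample(5.0, 0.5, random.Random(0))` draws should be close to these values.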

1.2 Quasi-likelihood quantities

The following describes the method to obtain consistent estimates of $\beta $ and $\phi $ (or equivalently, $\psi $). Following Papke and Wooldridge (1996), the first step is to maximize the following log Bernoulli quasi-likelihood function

$$\begin{aligned} L_{QL}(b) = \sum _{t=1}^{n_{01} } \left[ y_t \log \left( \frac{ \exp (x_t^\top b) }{ 1 + \exp (x_t^\top b)} \right) + (1-y_t) \log \left( 1 - \frac{ \exp (x_t^\top b) }{ 1 + \exp (x_t^\top b)} \right) \right] \end{aligned}$$

(23)

with respect to the $K_2 \times 1$ vector b. In (23), $t=1,\ldots ,n_{01}$ indexes the $N_{01}$ subsample, and $n_{01}$ is the cardinality of $N_{01}$. Note that the referenced paper expresses the log QL function in terms of an arbitrary function $G(\cdot )$; in this paper, $G( x_t^\top b) = \text {logit}^{-1}(x_t^\top b) = \frac{ \exp (x_t^\top b) }{ 1 + \exp (x_t^\top b)}$, which leads to (23). Papke and Wooldridge (1996) show that the maximizing vector, denoted as $\widehat{\beta }$, is a consistent estimator of $\beta $, as long as $\mathbb {E}(y_t) = \frac{\exp (x_t^\top \beta )}{1+\exp (x_t^\top \beta )}$ for $t=1,\ldots ,n_{01}$ (which holds due to (3) and (8) holding for $i \in N_{01}$). Next, under the assumption that $\mathbb {V}(y_t) = \delta ^2 \mu _t (1-\mu _t)$ for $t=1,\ldots ,n_{01}$ with $\delta ^2 = \frac{1}{1+\phi }$ (which is true given (9) for $i \in N_{01}$), the authors also show that the estimator

$$\begin{aligned} \widehat{\delta }^2 = \frac{ \sum _{t=1}^{n_{01}} \tilde{u}_t^2}{n_{01} - K_2}, \end{aligned}$$

is consistent for $\delta ^2$, where

$$\begin{aligned} \tilde{u}_t = \frac{y_t - \widehat{\mu }_t}{ \sqrt{ \widehat{\mu }_t(1-\widehat{\mu }_t) } } \end{aligned}$$

and

$$\begin{aligned} \widehat{\mu }_t = \frac{\exp (x_t^\top \widehat{\beta } )}{1+\exp (x_t^\top \widehat{\beta })}. \end{aligned}$$

Then, by the continuous mapping theorem, the estimator $\widehat{\phi }= \frac{1-\widehat{\delta }^2}{\widehat{\delta }^2}$ is consistent for $\phi $. The vector $\widehat{\psi }$ in Sect. 3.1 is defined as $\widehat{\psi } = (\widehat{\beta }^\top , \widehat{\phi })^\top $.
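
The two-step procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: Newton's method is an assumed choice of optimizer for (23), the helper names (`inv_logit`, `solve`, `ql_estimates`) are hypothetical, `X` is a list of covariate rows for the $N_{01}$ subsample, and `y` holds the matching responses in $(0,1)$.

```python
import math


def inv_logit(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def ql_estimates(X, y, iters=50, tol=1e-10):
    """Return (beta_hat, delta2_hat, phi_hat) from the Bernoulli QL in (23)."""
    n, K = len(y), len(X[0])
    b = [0.0] * K
    for _ in range(iters):
        mu = [inv_logit(sum(xj * bj for xj, bj in zip(x, b))) for x in X]
        # Score and negative Hessian of the log quasi-likelihood.
        grad = [sum((y[t] - mu[t]) * X[t][j] for t in range(n))
                for j in range(K)]
        hess = [[sum(mu[t] * (1.0 - mu[t]) * X[t][j] * X[t][k]
                     for t in range(n)) for k in range(K)] for j in range(K)]
        step = solve(hess, grad)
        b = [bj + sj for bj, sj in zip(b, step)]
        if max(abs(s) for s in step) < tol:
            break
    mu = [inv_logit(sum(xj * bj for xj, bj in zip(x, b))) for x in X]
    # Standardized residuals u_t and the degrees-of-freedom corrected delta^2.
    u = [(y[t] - mu[t]) / math.sqrt(mu[t] * (1.0 - mu[t])) for t in range(n)]
    delta2 = sum(v * v for v in u) / (n - K)
    phi = (1.0 - delta2) / delta2
    return b, delta2, phi


# Toy intercept-only example with made-up data.
beta_hat, delta2_hat, phi_hat = ql_estimates([[1.0]] * 3, [0.2, 0.4, 0.6])
```

In this toy example the fitted mean equals the sample mean $0.4$, so `beta_hat[0]` is $\text{logit}(0.4)$, `delta2_hat` is $1/6$, and `phi_hat` is approximately $5$. Note that $\widehat{\phi }$ is positive only when $\widehat{\delta }^2 < 1$, which the sketch does not enforce.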

1.3 Information matrix quantities

Partition $\widehat{K}$ as

$$\begin{aligned} \widehat{K}&= \left( \begin{array}{cc} \underset{(K_2 \times K_2)}{\widehat{K}_{\beta \beta }} &{} \underset{(K_2 \times 1)}{\widehat{K}_{\beta \phi } }\\ \underset{(1 \times K_2)}{\widehat{K}_{\phi \beta } }&{} \underset{(1 \times 1)}{ \widehat{K}_{\phi \phi } } \end{array} \right) . \end{aligned}$$

(24)

Following Ferrari and Cribari-Neto (2004), the quantities in (24) are defined as

$$\begin{aligned} \widehat{K}_{\beta \beta }&= \widehat{\phi }X_{01}^\top W_{01} X_{01} \nonumber \\ W_{01}&= \text {diag}(w_1, \ldots , w_{n_{01}})\nonumber \\ w_{t}&= \widehat{\phi } \left[ \psi '( \widehat{\mu }_t \widehat{\phi }) + \psi '( (1-\widehat{\mu }_t) \widehat{\phi }) \right] \left[ \widehat{\mu }_t (1 - \widehat{\mu }_t) \right] ^2 \end{aligned}$$

(25)

$$\begin{aligned} \widehat{K}_{\beta \phi }&= \widehat{K}^\top _{ \phi \beta } = X_{01}^\top T_{01}C_{01} \nonumber \\ T_{01}&= \text {diag}(\widehat{\mu }_1(1-\widehat{\mu }_1), \ldots , \widehat{\mu }_{n_{01}}(1-\widehat{\mu }_{n_{01}} ) ) \nonumber \\ C_{01}&= (c_1, \ldots , c_{n_{01}})^\top \nonumber \\ c_t&= \widehat{\phi } \left[ \psi '( \widehat{\mu }_t \widehat{\phi })\widehat{\mu }_t - \psi '( (1-\widehat{\mu }_t) \widehat{\phi }) (1 - \widehat{\mu }_t) \right] \nonumber \\ \widehat{K}_{\phi \phi }&= \text {trace}(D_{01})\nonumber \\ D_{01}&= \text {diag}(d_1, \ldots , d_{n_{01}}) \nonumber \\ d_{t}&= \psi '( \widehat{\mu }_t \widehat{\phi }) \widehat{\mu }_t^2 + \psi '( (1-\widehat{\mu }_t) \widehat{\phi }) (1 - \widehat{\mu }_t)^2 - \psi '(\widehat{\phi }), \end{aligned}$$

(26)

where $t=1,\ldots , n_{01}$, $X_{01}=(x_1, \ldots , x_{n_{01}})^\top $ is an $n_{01} \times K_2$ matrix of covariates for the observations in $N_{01}$, and $\psi '(\cdot )$ denotes the trigamma function. Note that the quantities above are defined over $N_{01}$, while the quantities in Ferrari and Cribari-Neto (2004) are defined over the full sample. Also, the analogous quantities for (25) and (26) in Ferrari and Cribari-Neto (2004) are expressed for an arbitrary link function $g( \mu _t)$. In this paper, $g(\mu _t) = \text {logit}(\mu _t) = \log (\frac{\mu _t}{1-\mu _t})$, and $g'(\mu _t) = \frac{1}{\mu _t(1-\mu _t)}$ (the first derivative), so plugging these quantities into the formulas in the referenced paper results in the expressions for (25) and (26).
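
The blocks of $\widehat{K}$ in (24)–(26) could be assembled along the following lines. This is a sketch under stated assumptions rather than the paper's code: the trigamma function is approximated with a standard recurrence plus asymptotic series because the Python standard library does not provide one, and the names `trigamma` and `information_matrix` are hypothetical.

```python
import math


def inv_logit(z):
    """Numerically stable logistic function (inverse of the logit link)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def trigamma(x):
    """Approximate psi'(x) for x > 0 via recurrence and an asymptotic series."""
    if x <= 0:
        raise ValueError("x must be positive")
    total = 0.0
    while x < 6.0:  # psi'(x) = psi'(x + 1) + 1 / x^2
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7) - 1/(30x^9)
    series = inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0
                           - inv2 * (1.0 / 42.0 - inv2 / 30.0)))
    return total + inv + 0.5 * inv2 + series


def information_matrix(X, beta_hat, phi_hat):
    """Return K_hat as a (K2 + 1) x (K2 + 1) nested list, as in (24)."""
    K2 = len(X[0])
    K_bb = [[0.0] * K2 for _ in range(K2)]
    K_bp = [0.0] * K2
    K_pp = 0.0
    tri_phi = trigamma(phi_hat)
    for x in X:
        mu = inv_logit(sum(xj * bj for xj, bj in zip(x, beta_hat)))
        tri_p = trigamma(mu * phi_hat)
        tri_q = trigamma((1.0 - mu) * phi_hat)
        v = mu * (1.0 - mu)  # diagonal entry of T_01, i.e. 1 / g'(mu)
        w = phi_hat * (tri_p + tri_q) * v ** 2                    # w_t
        c = phi_hat * (tri_p * mu - tri_q * (1.0 - mu))           # c_t
        d = tri_p * mu ** 2 + tri_q * (1.0 - mu) ** 2 - tri_phi   # d_t
        for j in range(K2):
            K_bp[j] += x[j] * v * c
            for k in range(K2):
                K_bb[j][k] += phi_hat * x[j] * w * x[k]
        K_pp += d
    top = [K_bb[j] + [K_bp[j]] for j in range(K2)]
    return top + [K_bp + [K_pp]]
```

As a sanity check, a single observation with $\widehat{\mu }_t = 0.5$ and $\widehat{\phi } = 2$ (for example `information_matrix([[1.0]], [0.0], 2.0)`) gives $\widehat{K}_{\beta \beta } = \pi ^2/12$, $\widehat{K}_{\beta \phi } = 0$, and $\widehat{K}_{\phi \phi } = 1 - \pi ^2/12$, since $\psi '(1) = \pi ^2/6$ and $\psi '(2) = \pi ^2/6 - 1$.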

About this article

Cite this article

Li, P. Efficient MCMC estimation of inflated beta regression models. Comput Stat 33, 127–158 (2018). https://doi.org/10.1007/s00180-017-0747-x

Received: 07 March 2016
Accepted: 24 June 2017
Published: 07 July 2017
Issue Date: March 2018
DOI: https://doi.org/10.1007/s00180-017-0747-x
