Abstract
Beta regression models have become a popular tool for describing and predicting limited-range continuous data such as rates and proportions. However, these models can be severely affected by outlying observations, which the beta distribution does not handle well. A robust alternative is to replace the beta distribution with the rectangular beta (RB) distribution, an extension of the beta distribution that can accommodate heavy tails and is therefore more flexible. Covariates measured with error are a frequent issue in regression modeling across different areas. This paper derives a robust regression model for proportions with errors-in-variables using the RB distribution under a new parametrization recently proposed in the literature. We use a Bayesian approach to estimate the model parameters, specifying prior distributions and implementing the computation via Gibbs sampling. Monte Carlo simulations are used to evaluate the statistical performance of the approach numerically. An illustration with real-world data then shows its potential uses.
Acknowledgements
The authors would like to thank the Editors and Reviewers for their constructive comments on an earlier version of this manuscript, which led to an improved presentation. The authors acknowledge funding from the following grants: VRID 217.014.027-1 from Universidad de Concepción, Chile (J.I. Figueroa-Zúñiga); DGI-2014-0017/0070 and DGI-2014-0077/0065 from the Dirección de Gestión de la Investigación at PUCP, Peru (C.L. Bayes); and FONDECYT 1200525 from the National Agency for Research and Development (ANID) of the Chilean government (V. Leiva).
Appendix
Next, we present BUGS codes used for fitting the RB regression models with errors-in-variables.
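As a rough companion to the model description, the following is a minimal Python sketch (not the authors' BUGS code) of the RB density. It assumes the commonly used mixture form, a uniform component with weight theta plus a beta component with mean mu and precision phi; the paper's new parametrization may differ, and the function names `beta_pdf`, `rb_pdf`, and `rb_mean` are hypothetical.

```python
import math


def beta_pdf(y, a, b):
    """Beta(a, b) density at y in (0, 1), evaluated on the log scale."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    log_kernel = (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y)
    return math.exp(log_norm + log_kernel)


def rb_pdf(y, mu, phi, theta):
    """Assumed RB density: theta * Uniform(0, 1) + (1 - theta) * Beta(mu*phi, (1-mu)*phi).

    mu    : mean of the beta component, in (0, 1)
    phi   : precision of the beta component, > 0
    theta : weight of the uniform component, in [0, 1]
    """
    if not 0.0 < y < 1.0:
        return 0.0
    return theta + (1.0 - theta) * beta_pdf(y, mu * phi, (1.0 - mu) * phi)


def rb_mean(mu, theta):
    """Mean of the assumed mixture: the uniform part contributes 1/2, the beta part mu."""
    return theta / 2.0 + (1.0 - theta) * mu
```

Under this assumed form, the uniform component keeps the density from falling below theta anywhere on (0, 1), which is how the RB distribution places more mass in the tails than a beta distribution with the same mu and phi.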
Cite this article
Figueroa-Zúñiga, J.I., Bayes, C.L., Leiva, V. et al. Robust beta regression modeling with errors-in-variables: a Bayesian approach and numerical applications. Stat Papers 63, 919–942 (2022). https://doi.org/10.1007/s00362-021-01260-1
DOI: https://doi.org/10.1007/s00362-021-01260-1