Robust beta regression modeling with errors-in-variables: a Bayesian approach and numerical applications

  • Regular Article
Statistical Papers

Abstract

Beta regression models have become a popular tool for describing and predicting continuous data with a limited range, such as rates and proportions. However, these models can be severely affected by outlying observations, which the beta distribution does not handle well. A robust alternative is the rectangular beta (RB) distribution, an extension of the beta distribution. The RB distribution can accommodate heavy tails and is therefore more flexible than the beta distribution. Covariates measured with error are a frequent issue in regression modeling across different areas. This paper derives a robust regression model for proportions with errors-in-variables, using the RB distribution under a new parametrization recently proposed in the literature. We use a Bayesian approach to estimate the model parameters, specifying prior distributions and carrying out the computational implementation via Gibbs sampling. Monte Carlo simulations are used to evaluate numerically the statistical performance of the approach. An illustration with real-world data then shows its potential uses.
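
To make the heavy-tail remark concrete, the following is a minimal sketch, not the paper's implementation. It assumes the commonly used mixture form of the rectangular beta density, a uniform component with weight theta plus a beta component with weight 1 - theta, and it uses the usual mean-precision parametrization of the beta component (shape parameters mu * phi and (1 - mu) * phi). The names `mu`, `phi`, `theta`, `beta_pdf`, `rb_pdf`, and `rb_sample` are illustrative assumptions; the new parametrization used in the paper is not reproduced here.

```python
import math
import random


def beta_pdf(y, mu, phi):
    """Beta density with mean mu and precision phi (assumed parametrization)."""
    a, b = mu * phi, (1.0 - mu) * phi
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))


def rb_pdf(y, mu, phi, theta):
    """Rectangular beta density in mixture form: theta * Uniform(0, 1) + (1 - theta) * Beta."""
    return theta + (1.0 - theta) * beta_pdf(y, mu, phi)


def rb_sample(n, mu, phi, theta, seed=0):
    """Draw n values: with probability theta from Uniform(0, 1), otherwise from the beta component."""
    rng = random.Random(seed)
    a, b = mu * phi, (1.0 - mu) * phi
    return [rng.random() if rng.random() < theta else rng.betavariate(a, b) for _ in range(n)]


if __name__ == "__main__":
    # Illustrative values only: compare the two densities far from the beta mean.
    mu, phi, theta = 0.2, 50.0, 0.1
    for y in (0.2, 0.6, 0.9):
        print(y, beta_pdf(y, mu, phi), rb_pdf(y, mu, phi, theta))
```

Under this assumed form, the RB density never drops below theta anywhere on (0, 1), so observations far from the bulk of the data keep non-negligible density. This is the mechanism by which a model of this kind can be less sensitive to outlying proportions than a plain beta model.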

Acknowledgements

The authors thank the Editors and Reviewers for their constructive comments on an earlier version of this manuscript, which led to an improved presentation. The authors acknowledge funding from the following grants: VRID 217.014.027-1 from Universidad de Concepción, Chile (J.I. Figueroa-Zúñiga); DGI-2014-0017/0070 and DGI-2014-0077/0065 from the Dirección de Gestión de la Investigación at PUCP, Peru (C.L. Bayes); and FONDECYT 1200525 from the National Agency for Research and Development (ANID) of the Chilean government (V. Leiva).

Author information

Corresponding author

Correspondence to Víctor Leiva.

Appendix

The BUGS code used for fitting the RB regression models with errors-in-variables is presented next.

[Figure a: BUGS code for the RB regression models with errors-in-variables]

Cite this article

Figueroa-Zúñiga, J.I., Bayes, C.L., Leiva, V. et al. Robust beta regression modeling with errors-in-variables: a Bayesian approach and numerical applications. Stat Papers 63, 919–942 (2022). https://doi.org/10.1007/s00362-021-01260-1
