Bayesian quantile regression models for heavy tailed bounded variables using the No-U-Turn sampler

  • Original paper
Computational Statistics

Abstract

Quantile regression models are useful when the goal is to understand how covariates affect different levels of the response variable, and their practical use has benefited from increasing computational power. Bounded response variables are also common, arising whenever the data are percentages, rates, or proportions. In this work, taking the generalized Gompertz distribution as the baseline, we derive two new two-parameter distributions with bounded support and propose new parametric quantile mixed regression models based on them for bounded response variables with heavy tails. For both models, the parameters are estimated under a Bayesian approach using the No-U-Turn sampler. The inferential methods can be implemented and then easily used for data analysis. Simulation studies were conducted with different quantiles (\(q=0.1\), \(q=0.5\) and \(q=0.9\)) and sample sizes (\(n=100\), \(n=200\), \(n=500\), \(n=2000\), \(n=5000\)), using 100 replicates of simulated data for each combination of settings, with responses on (0, 1) and [0, 1). The results show good parameter recovery for the proposed inferential methods and models, which were compared with the Beta Rectangular and Kumaraswamy regression models. Furthermore, a dataset on extreme poverty is analyzed using the proposed regression models with fixed and mixed effects. The parametric quantile models proposed in this work are an alternative and complementary modeling tool for the analysis of bounded data.
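
To make the idea of a parametric quantile regression for a bounded response concrete, the following is a minimal sketch under stated assumptions. It does not implement the Gompertz-based distributions proposed in the paper, whose densities are not given in this abstract. Instead it uses the Kumaraswamy distribution, one of the comparison models named above, with CDF \(F(y) = 1 - (1 - y^{a})^{b}\) on (0, 1). The reparameterization (fixing the shape \(b\) and solving for \(a\) so that the \(q\)-th quantile equals \(\kappa\)), the logit link, and all function and parameter names are illustrative assumptions, not the authors' specification.

```python
import math
import random


def kumaraswamy_shape_a(kappa, b, q):
    """Shape a such that the q-th quantile of Kumaraswamy(a, b) equals kappa.

    Solving F(kappa) = 1 - (1 - kappa**a)**b = q for a gives the ratio below.
    (Illustrative reparameterization; an assumption of this sketch.)
    """
    return math.log(1.0 - (1.0 - q) ** (1.0 / b)) / math.log(kappa)


def kumaraswamy_quantile(p, a, b):
    """Inverse CDF of Kumaraswamy(a, b) evaluated at probability p."""
    return (1.0 - (1.0 - p) ** (1.0 / b)) ** (1.0 / a)


def logistic(eta):
    """Inverse logit link, mapping a linear predictor into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def simulate(x, beta0, beta1, b, q, rng):
    """Draw one response per covariate value.

    The q-th conditional quantile is logistic(beta0 + beta1 * x_i);
    responses are generated by inverse-CDF sampling.
    """
    ys = []
    for xi in x:
        kappa = logistic(beta0 + beta1 * xi)
        a = kumaraswamy_shape_a(kappa, b, q)
        ys.append(kumaraswamy_quantile(rng.random(), a, b))
    return ys


if __name__ == "__main__":
    rng = random.Random(2022)
    q = 0.9
    x = [0.5] * 20000  # constant covariate, so every draw shares one quantile
    kappa = logistic(-1.0 + 0.8 * 0.5)
    y = simulate(x, beta0=-1.0, beta1=0.8, b=2.0, q=q, rng=rng)
    share_below = sum(yi <= kappa for yi in y) / len(y)
    print(f"target quantile level {q}, empirical share below kappa: {share_below:.3f}")
```

Because the regression coefficients act directly on the chosen quantile, the same covariates can be refitted at \(q=0.1\), \(q=0.5\) and \(q=0.9\) to compare their effect across levels of the response. A Bayesian fit would place priors on the coefficients and the shape parameter and sample the posterior with the No-U-Turn sampler, for example in Stan.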

Acknowledgements

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. The work of the second author is partially funded by CNPq, Brazil. Dr. Luis Bazán’s work was partially supported by Fundação de Amparo à Pesquisa do Estado de São Paulo FAPESP-Brazil 2021/11720-0.

Author information

Corresponding author

Correspondence to Eduardo S. B. de Oliveira.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

de Oliveira, E.S.B., de Castro, M., Bayes, C.L. et al. Bayesian quantile regression models for heavy tailed bounded variables using the No-U-Turn sampler. Comput Stat (2022). https://doi.org/10.1007/s00180-022-01297-2

  • DOI: https://doi.org/10.1007/s00180-022-01297-2
