Flexible regression modeling for censored data based on mixtures of student-t distributions

Lachos, Víctor H.; Cabral, Celso R. B.; Prates, Marcos O.; Dey, Dipak K.

doi:10.1007/s00180-018-0856-1

Original Paper
Published: 03 December 2018

Volume 34, pages 123–152, (2019)

Computational Statistics

Víctor H. Lachos¹,
Celso R. B. Cabral ORCID: orcid.org/0000-0001-6776-6690²,
Marcos O. Prates³ &
Dipak K. Dey¹

Abstract

In some applications of censored regression models, the distribution of the error terms departs significantly from normality, for instance in the presence of heavy tails, skewness and/or atypical observations. In this paper we extend the censored linear regression model with normal errors to the case where the random errors follow a finite mixture of Student-t distributions. This approach allows us to model data with great flexibility, accommodating multimodality, heavy tails and also skewness, depending on the structure of the mixture components. We develop an analytically tractable and efficient EM-type algorithm for iteratively computing maximum likelihood estimates of the parameters, with standard errors as a by-product. The E-step of the algorithm has closed-form expressions that rely on formulas for the mean and variance of truncated Student-t distributions. The efficacy of the method is verified through the analysis of simulated and real datasets. The proposed algorithm and methods are implemented in the new R package CensMixReg.
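
As an illustration of the kind of truncated-moment formula the abstract refers to, the following minimal Python sketch computes the mean of a standard Student-t variable truncated to a finite interval (a, b) and checks it against direct numerical integration. It is not taken from the paper or from CensMixReg: the function names, the restriction to the standardized case with more than one degree of freedom, and the use of Simpson's rule for the normalizing probability (the Python standard library has no Student-t CDF) are all assumptions made for this example. The closed form follows from the identity d/dx [(1 + x²/ν) f(x)] = -((ν - 1)/ν) x f(x), where f is the Student-t density with ν degrees of freedom.

```python
import math


def t_pdf(x, nu):
    """Density of the standard Student-t distribution with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def truncated_t_mean_closed_form(a, b, nu):
    """E[T | a < T < b] for a standard Student-t T (hypothetical helper).

    Uses the antiderivative of x * f(x); this sketch restricts to nu > 1.
    """
    if nu <= 1:
        raise ValueError("this sketch requires nu > 1")
    # P(a < T < b), obtained by quadrature only because no t CDF is
    # available in the standard library (an assumption of this example).
    mass = simpson(lambda x: t_pdf(x, nu), a, b)

    def g(x):
        return (1 + x * x / nu) * t_pdf(x, nu)

    return nu / (nu - 1) * (g(a) - g(b)) / mass


def truncated_t_mean_numeric(a, b, nu):
    """Same quantity by integrating x * f(x) numerically, as a cross-check."""
    mass = simpson(lambda x: t_pdf(x, nu), a, b)
    return simpson(lambda x: x * t_pdf(x, nu), a, b) / mass


if __name__ == "__main__":
    print(truncated_t_mean_closed_form(-1.0, 2.5, 4.0))
    print(truncated_t_mean_numeric(-1.0, 2.5, 4.0))
```

In an EM algorithm for censored responses, conditional expectations of this type would typically be evaluated for each mixture component after shifting and scaling by that component's location and scale; the variance needs an analogous second-moment formula, which is not shown here.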

Acknowledgements

We are grateful to four anonymous referees, the editor and the associate editor for very useful comments and suggestions, which greatly improved this paper. This paper was written while Celso R. B. Cabral was a visiting professor in the Department of Statistics at the University of Campinas, Brazil. Celso R. B. Cabral was supported by CNPq (Grants 167731/2013-0 and 447964/2014-3), and FAPESP-Brazil (Grant 2015/20922-5). V.H. Lachos acknowledges support from FAPESP-Brazil (Grant 2018/05013-7). M.O. Prates was supported by CNPq-Brazil (Grant PQ-305401/2017-7) and FAPEMIG-Brazil (Grant PPM-00532-16). We also thank Luis B. Sanchez from the University of São Paulo for his help on an earlier version of the article.

Author information

Authors and Affiliations

Department of Statistics, University of Connecticut, Storrs, CT, 06269, USA
Víctor H. Lachos & Dipak K. Dey
Departamento de Estatística, Universidade Federal do Amazonas, Av. General Rodrigo Octávio, 6200, Coroado I, CEP 69080-900, Manaus, Amazonas, Brazil
Celso R. B. Cabral
Departamento de Estatística, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
Marcos O. Prates

Corresponding author

Correspondence to Celso R. B. Cabral.

Appendix: Simulation Study 2: bias and RMSE with data generated by the FM-NCR model

See Tables 7 and 8.

Table 7 Simulation Study 2: Bias of estimates with data generated by the FM-NCR model

Table 8 Simulation Study 2: Root mean squared errors (RMSE) of estimates with data generated by the FM-NCR model

About this article

Cite this article

Lachos, V.H., Cabral, C.R.B., Prates, M.O. et al. Flexible regression modeling for censored data based on mixtures of student-t distributions. Comput Stat 34, 123–152 (2019). https://doi.org/10.1007/s00180-018-0856-1

Received: 15 February 2016
Accepted: 28 November 2018
Published: 03 December 2018
Issue Date: 05 March 2019
DOI: https://doi.org/10.1007/s00180-018-0856-1
