Log-symmetric regression models under the presence of non-informative left- or right-censored observations

Vanegas, Luis Hernando; Paula, Gilberto A.

doi:10.1007/s11749-016-0517-z

Original Paper
Published: 08 December 2016

TEST, Volume 26, pages 405–428 (2017)

Luis Hernando Vanegas¹ &
Gilberto A. Paula²

Abstract

In this paper, an extension to allow the presence of non-informative left- or right-censored observations in log-symmetric regression models is addressed. Under such models, the log-lifetime distribution belongs to the symmetric class and its location and scale parameters are described by semi-parametric functions of explanatory variables, whose nonparametric components are approximated using natural cubic splines or P-splines. An iterative process of parameter estimation by the maximum penalized likelihood method is presented. The large sample properties of the maximum penalized likelihood estimators are studied analytically and by simulation experiments. Diagnostic methods such as deviance-type residuals and local influence measures are derived. The package ssym, which includes an implementation in the computational environment R of the methodology addressed in this paper, is also discussed. The proposed methodology is illustrated by the analysis of a real data set.
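To make the role of censoring concrete, the following minimal sketch evaluates the log-likelihood of one member of the log-symmetric class, the log-normal, when some observations are left- or right-censored. It is not the ssym implementation and does not include the spline terms or the penalty: the function names, the `status` labels, and the single covariate with a constant scale `sigma` are all assumptions made for illustration.

```python
import math


def std_normal_logpdf(z):
    """Log-density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def std_normal_logcdf(z):
    """Log of the standard normal CDF (not protected against extreme z)."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def censored_loglik(times, status, x, beta0, beta1, sigma):
    """Log-likelihood of a log-normal regression with non-informative censoring.

    Assumed model (illustrative): log(T_i) = beta0 + beta1 * x_i + sigma * eps_i,
    with eps_i standard normal. Each status is "observed", "right" or "left".
    """
    total = 0.0
    for t, s, xi in zip(times, status, x):
        z = (math.log(t) - (beta0 + beta1 * xi)) / sigma
        if s == "observed":
            # Density of T at t: normal density of log(t) times the Jacobian 1/t.
            total += std_normal_logpdf(z) - math.log(sigma) - math.log(t)
        elif s == "right":
            # Only T > t is known: contribute the log survival function.
            total += std_normal_logcdf(-z)
        elif s == "left":
            # Only T < t is known: contribute the log distribution function.
            total += std_normal_logcdf(z)
        else:
            raise ValueError("unknown status: " + repr(s))
    return total


# Hypothetical data: one exact time, one right-censored, one left-censored.
times = [1.0, 2.5, 0.4]
status = ["observed", "right", "left"]
x = [0.0, 1.0, -1.0]
print(censored_loglik(times, status, x, beta0=0.0, beta1=0.5, sigma=1.0))
```

An exact observation contributes a density, while a censored one contributes only the probability of the region where the lifetime is known to lie. Replacing the normal density and distribution function with those of another symmetric law would give other members of the class.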

References

Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Petrov BN, Csaki F (eds) Second international symposium on information theory. Akademiai Kiado, Budapest, pp 267–281
Bagdonavičius V, Nikulin M (2001) Accelerated life models. Modeling and statistical analysis. Chapman & Hall, Boca Raton
Barros M, Paula GA, Leiva V (2008) A new class of survival regression models with heavy-tailed errors: robustness and diagnostics. Lifetime Data Anal 14:316–332
Billingsley P (1961) Statistical inference for Markov processes. The University of Chicago Press, Chicago
Borgan Ø (1984) Maximum likelihood estimation in parametric counting process models, with applications to censored failure time data. Scand J Stat 11:1–16
Broström G (2014) eha: event history analysis. R package version 2.4-2. http://CRAN.R-project.org/package=eha
Conover WJ (1971) Practical nonparametric statistics. Wiley, New York
Cook RD (1986) Assessment of local influence (with discussion). J R Stat Soc B Methodol 48:133–169
Davison AC, Gigli A (1989) Deviance residual and normal scores plots. Biometrika 76:211–221
Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11:89–121
Fang KT, Kotz S, Ng KW (1990) Symmetric multivariate and related distributions. Chapman & Hall, London
Green PJ, Silverman BW (1994) Nonparametric regression and generalized linear models. Chapman & Hall, London
Harrell FE (2015) rms: regression modeling strategies. R package version 4.3-1. http://CRAN.R-project.org/package=rms
Jackson C (2015) flexsurv: flexible parametric survival and multi-state models. R package version 0.6. http://CRAN.R-project.org/package=flexsurv
Kalbfleisch JD, Prentice RL (2002) The statistical analysis of failure time data. Wiley, New York
Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53:457–481
Klein JP, Moeschberger ML (1997) Survival analysis. Springer, New York
Kneib T (2013) Rejoinder. Stat Model 13:373–385
Lancaster P, Salkauskas K (1986) Curve and surface fitting: an introduction. Academic Press, London
Marshall AW, Olkin I (2007) Life distributions. Springer, New York
Ortega JM, Rheinboldt WC (1970) Iterative solution of nonlinear equations in several variables. Academic Press, New York
Paula GA, Leiva V, Barros M, Liu S (2012) Robust statistical modeling using Birnbaum–Saunders-$t$ distribution applied to insurance. Appl Stoch Model Bus 28:16–34
Pierce DA, Shafer DW (1996) Residuals in generalized linear models. J Am Stat Assoc 81:977–986
Poon WY, Poon YS (1999) Conformal normal curvature and assessment of local influence. J R Stat Soc B Methodol 61:51–61
R Core Team (2016) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/
Rieck JR, Nedelman JR (1991) A log-linear model for the Birnbaum–Saunders distribution. Technometrics 33:51–60
Rigby RA, Stasinopoulos DM (2005) Generalized additive models for location, scale and shape. J Appl Stat 54:507–554
Rigby RA, Stasinopoulos DM (2007) Generalized additive models for location, scale and shape (GAMLSS) in R. J Stat Softw 23:1–46
Rigby RA, Stasinopoulos DM (2016) gamlss: generalized additive models for location scale and shape. R package version 4.2-8. http://CRAN.R-project.org/package=gamlss
Schwarz GE (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Therneau T (2014) survival: a package for survival analysis in S. R package version 2.38-1. http://CRAN.R-project.org/package=survival
Vanegas LH, Cysneiros FJA (2010) Assessment of diagnostic procedures in symmetrical nonlinear regression models. Comput Stat Data Anal 54:1002–1016
Vanegas LH, Paula GA (2016a) An extension of log-symmetric models: R codes and applications. J Stat Comput Simul 86:1709–1735
Vanegas LH, Paula GA (2016b) Log-symmetric distributions: statistical properties and parameter estimation. Braz J Probab Stat 30:196–220
Vanegas LH, Paula GA (2016c) ssym: fitting semiparametric symmetric regression models. R package version 1.5-2. http://CRAN.R-project.org/package=ssym
Waller LA, Turnbull BW (1992) Probability plotting with censored data. Am Stat 46:5–12
Wood SN (2006) Generalized additive models: an introduction with R. Chapman & Hall, Boca Raton
Wu C, Yu Y (2014) Partially linear modeling of conditional quantiles using penalized splines. Comput Stat Data Anal 77:170–187
Yu Y, Ruppert D (2002) Penalized spline estimation for partially linear single-index models. J Am Stat Assoc 97:1042–1054
Zou Y, Zhang J, Qin G (2011) A semiparametric accelerated failure time partial linear model and its application to breast cancer. Comput Stat Data Anal 55:1479–1487

Acknowledgements

The authors are grateful to the reviewers for their helpful comments and suggestions.

Author information

Authors and Affiliations

Departamento de Estadística, Universidad Nacional de Colombia, Bogotá, Colombia
Luis Hernando Vanegas
Instituto de Matemática e Estatística, Universidade de São Paulo, São Paulo, Brazil
Gilberto A. Paula

Corresponding author

Correspondence to Luis Hernando Vanegas.

Appendix

Theorem 2

Under Conditions (1)–(4) the maximum penalized likelihood estimator of ${\varvec{\theta }}$ is consistent and

$$\begin{aligned} \Bigl [-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\Bigr ]^{ -\frac{1}{2}} \Bigl [-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )+\mathbf{M}\Bigr ] \Bigl (\hat{{\varvec{\theta }}} - {\varvec{\theta }}^{^{[0]}}\Bigr ) \xrightarrow [n \rightarrow \infty ]{\mathcal {D}} \mathcal {N}(\mathbf{0},\mathbf{I}). \end{aligned}$$

Proof

By a Taylor series expansion of the score function of ${\varvec{\theta }}$ around ${\varvec{\theta }}^{^{[0]}}$, it is possible to write

$$\begin{aligned} \dfrac{\partial \mathsf{PL}({{\varvec{\theta }}})}{\partial {\varvec{\theta }}}(\hat{{\varvec{\theta }}})&=\mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}+\left[ \mathbf{J}({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}\right] \left( \hat{{\varvec{\theta }}}-{\varvec{\theta }}^{^{[0]}}\right) \\&\quad +\;\frac{1}{2}\sum \limits _{l=1}\sum \limits _{l'=1} \left( \hat{\theta }_l-\theta ^{^{[0]}}_{l}\right) \left( \hat{\theta }_{l'}-\theta ^{^{[0]}}_{l'}\right) \dfrac{\partial {\mathbf{J}_{{ ll'}} ({\varvec{\theta }})}}{{\partial {\varvec{\theta }}^{\top }}}({\varvec{\theta }}^{^*}), \end{aligned}$$

where ${\varvec{\theta }}^{^*}$ is on the line segment joining $\hat{{\varvec{\theta }}}$ and ${\varvec{\theta }}^{^{[0]}}$, and $\mathbf{J}_{{ ll'}} ({\varvec{\theta }})$ is the $(l,l')$th element of $\mathbf{J}({\varvec{\theta }})$. From Conditions 1, 2, 3 (expressions (a) and (b)) and 4, it follows that

(1) $n^{-1}\left[ \mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}\right] \xrightarrow [n \rightarrow \infty ]{\mathcal {P}}\mathbf{0}$ and
(2) $n^{-1}\left[ \mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}\right] \xrightarrow [n \rightarrow \infty ]{\mathcal {P}}-{\varvec{\varOmega }}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr ).$

Therefore, from 1 and 2 and Condition 3 (expression (e)), and using the argument in Billingsley (1961, page 12), it follows that $\hat{{\varvec{\theta }}}$ is a consistent estimator of ${\varvec{\theta }}^{^{[0]}}$. Moreover, by another Taylor series expansion of the score function of $\hat{{\varvec{\theta }}}$ around ${\varvec{\theta }}^{^{[0]}}$, it is possible to write

$$\begin{aligned} \dfrac{\partial \mathsf{PL}({\varvec{\theta }})}{\partial {\varvec{\theta }}}(\hat{{\varvec{\theta }}})=\mathbf{0}=\mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}+ \left[ \mathbf{J}({\varvec{\theta }}^*\bigr )-\mathbf{M}\right] \left( \hat{{\varvec{\theta }}}-{\varvec{\theta }}^{^{[0]}}\right) , \end{aligned}$$

where ${\varvec{\theta }}^*$ is on the line segment joining $\hat{{\varvec{\theta }}}$ and ${\varvec{\theta }}^{^{[0]}}$. By rearranging this expression, we arrive at

$$\begin{aligned} \left[ -\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\right] ^{-\frac{1}{2}} \left[ -\mathbf{J}\bigl ({\varvec{\theta }}^*\bigr )+\mathbf{M}\right] \left( \hat{{\varvec{\theta }}}-{\varvec{\theta }}^{^{[0]}}\right) =\left[ -\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\right] ^{-\frac{1}{2}} \left[ \mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}\right] . \end{aligned}$$

(5)

From Conditions 1, 2, 3 (expressions (c) and (d)) and 4, it follows that

(3) $\left[ -\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\right] ^{-\frac{1}{2}} \left[ \mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}\right] \xrightarrow [n \rightarrow \infty ]{\mathcal {D}}\mathcal {N}(\mathbf{0},\mathbf{I})$ and
(4) $n^{-1}\left[ \mathbf{J}\bigl ({\varvec{\theta }}^*\bigr )-\mathbf{M}\right] \xrightarrow [n \rightarrow \infty ]{\mathcal {P}}-{\varvec{\varOmega }}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr ).$

Therefore, from 3 and 4 and (5), and using the Slutsky theorem, it follows that

$$\begin{aligned} \Bigl [-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\Bigr ]^{ -\frac{1}{2}} \Bigl [-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )+\mathbf{M}\Bigr ] \Bigl (\hat{{\varvec{\theta }}} - {\varvec{\theta }}^{^{[0]}}\Bigr ) \xrightarrow [n \rightarrow \infty ]{\mathcal {D}} \mathcal {N}(\mathbf{0},\mathbf{I}). \end{aligned}$$
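The expansions used in the proof also indicate the shape of the penalized objective. The score appears as $\mathbf{U}({\varvec{\theta }})-\mathbf{M}{\varvec{\theta }}$ and the Hessian as $\mathbf{J}({\varvec{\theta }})-\mathbf{M}$, which is what a quadratic penalty produces. As a sketch only, assuming that $\ell ({\varvec{\theta }})$ denotes the unpenalized log-likelihood (a symbol introduced here, not taken from this excerpt) and that $\mathbf{M}$ is symmetric and free of ${\varvec{\theta }}$:

$$\begin{aligned} \mathsf{PL}({\varvec{\theta }})&=\ell ({\varvec{\theta }})-\tfrac{1}{2}\,{\varvec{\theta }}^{\top }\mathbf{M}{\varvec{\theta }},\\ \dfrac{\partial \mathsf{PL}({\varvec{\theta }})}{\partial {\varvec{\theta }}}&=\mathbf{U}({\varvec{\theta }})-\mathbf{M}{\varvec{\theta }},\\ \dfrac{\partial ^2 \mathsf{PL}({\varvec{\theta }})}{\partial {\varvec{\theta }}\,\partial {\varvec{\theta }}^{\top }}&=\mathbf{J}({\varvec{\theta }})-\mathbf{M}. \end{aligned}$$

Under the same assumptions, the rearrangement leading to (5) takes two steps: move the term in $\hat{{\varvec{\theta }}}-{\varvec{\theta }}^{^{[0]}}$ to the other side of the first-order expansion, then premultiply by $\bigl [-\mathbf{J}({\varvec{\theta }}^{^{[0]}})\bigr ]^{-\frac{1}{2}}$:

$$\begin{aligned} \mathbf{0}&=\mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}+\left[ \mathbf{J}\bigl ({\varvec{\theta }}^*\bigr )-\mathbf{M}\right] \left( \hat{{\varvec{\theta }}}-{\varvec{\theta }}^{^{[0]}}\right) \\ \Longrightarrow \quad \left[ -\mathbf{J}\bigl ({\varvec{\theta }}^*\bigr )+\mathbf{M}\right] \left( \hat{{\varvec{\theta }}}-{\varvec{\theta }}^{^{[0]}}\right)&=\mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}. \end{aligned}$$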

About this article

Cite this article

Vanegas, L.H., Paula, G.A. Log-symmetric regression models under the presence of non-informative left- or right-censored observations. TEST 26, 405–428 (2017). https://doi.org/10.1007/s11749-016-0517-z

Received: 12 January 2016
Accepted: 25 November 2016
Published: 08 December 2016
Issue Date: June 2017
DOI: https://doi.org/10.1007/s11749-016-0517-z
