Log-symmetric regression models under the presence of non-informative left- or right-censored observations

  • Original Paper
  • Journal: TEST

Abstract

In this paper, an extension to allow the presence of non-informative left- or right-censored observations in log-symmetric regression models is addressed. Under such models, the log-lifetime distribution belongs to the symmetric class and its location and scale parameters are described by semi-parametric functions of explanatory variables, whose nonparametric components are approximated using natural cubic splines or P-splines. An iterative process of parameter estimation by the maximum penalized likelihood method is presented. The large sample properties of the maximum penalized likelihood estimators are studied analytically and by simulation experiments. Diagnostic methods such as deviance-type residuals and local influence measures are derived. The package ssym, which includes an implementation in the computational environment R of the methodology addressed in this paper, is also discussed. The proposed methodology is illustrated by the analysis of a real data set.
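
As a rough illustration of how left- and right-censored observations enter the likelihood, the sketch below evaluates the log-likelihood for the log-normal member of the log-symmetric class, where log T = mu + sigma * eps with standard normal eps. The function names, the status labels, and the use of `sigma` for the scale are assumptions made for this example. It is not the ssym implementation, and it leaves out the spline terms and the penalty.

```python
import math

def std_normal_logpdf(z):
    """Log-density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def std_normal_cdf(z):
    """Standard normal CDF, written with erfc for better tail accuracy."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def censored_loglik(times, status, mus, sigmas):
    """Log-likelihood under a log-normal model with non-informative censoring.

    status[i] is one of (hypothetical labels):
      "obs"   : t_i is an exact lifetime
      "right" : the lifetime exceeds t_i
      "left"  : the lifetime is below t_i
    mus[i] and sigmas[i] are the location and scale of log T_i.
    """
    total = 0.0
    for t, s, mu, sigma in zip(times, status, mus, sigmas):
        z = (math.log(t) - mu) / sigma
        if s == "obs":
            # Density of T at t: phi(z) / (sigma * t).
            total += std_normal_logpdf(z) - math.log(sigma) - math.log(t)
        elif s == "right":
            # Survival function: P(T > t) = 1 - Phi(z) = Phi(-z).
            total += math.log(std_normal_cdf(-z))
        elif s == "left":
            # Distribution function: P(T < t) = Phi(z).
            total += math.log(std_normal_cdf(z))
        else:
            raise ValueError(f"unknown status: {s!r}")
    return total
```

In a regression setting, `mus` and `sigmas` would be computed from the explanatory variables for each observation. Other members of the symmetric class would replace the two standard normal helpers with their own density and distribution function.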

References

  • Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Petrov BN, Csaki F (eds) Second international symposium on information theory. Akademiai Kiado, Budapest, pp 267–281

  • Bagdonavičius V, Nikulin M (2001) Accelerated life models. Modeling and statistical analysis. Chapman & Hall, Boca Raton

  • Barros M, Paula GA, Leiva V (2008) A new class of survival regression models with heavy-tailed errors: robustness and diagnostics. Lifetime Data Anal 14:316–332

  • Billingsley P (1961) Statistical inference for Markov processes. The University of Chicago Press, Chicago

  • Borgan Ø (1984) Maximum likelihood estimation in parametric counting process models, with applications to censored failure time data. Scand J Stat 11:1–16

  • Broström G (2014) eha: event history analysis. R package version 2.4-2. http://CRAN.R-project.org/package=eha

  • Conover WJ (1971) Practical nonparametric statistics. Wiley, New York

  • Cook RD (1986) Assessment of local influence (with discussion). J R Stat Soc B Methodol 48:133–169

  • Davison AC, Gigli A (1989) Deviance residual and normal scores plots. Biometrika 76:211–221

  • Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11:89–121

  • Fang KT, Kotz S, Ng KW (1990) Symmetric multivariate and related distributions. Chapman & Hall, London

  • Green PJ, Silverman BW (1994) Nonparametric regression and generalized linear models. Chapman & Hall, London

  • Harrell FE (2015) rms: regression modeling strategies. R package version 4.3-1. http://CRAN.R-project.org/package=rms

  • Jackson C (2015) flexsurv: flexible parametric survival and multi-state models. R package version 0.6. http://CRAN.R-project.org/package=flexsurv

  • Kalbfleisch JD, Prentice RL (2002) The statistical analysis of failure time data. Wiley, New York

  • Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53:457–481

  • Klein JP, Moeschberger ML (1997) Survival analysis. Springer, New York

  • Kneib T (2013) Rejoinder. Stat Model 13:373–385

  • Lancaster P, Salkauskas K (1986) Curve and surface fitting: an introduction. Academic Press, London

  • Marshall AW, Olkin I (2007) Life distributions. Springer, New York

  • Ortega JM, Rheinboldt WC (1970) Iterative solution of nonlinear equations in several variables. Academic Press, New York

  • Paula GA, Leiva V, Barros M, Liu S (2012) Robust statistical modeling using Birnbaum–Saunders-\(t\) distribution applied to insurance. Appl Stoch Model Bus 28:16–34

  • Pierce DA, Schafer DW (1986) Residuals in generalized linear models. J Am Stat Assoc 81:977–986

  • Poon WY, Poon YS (1999) Conformal normal curvature and assessment of local influence. J R Stat Soc B Methodol 61:51–61

  • R Core Team (2016) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/

  • Rieck JR, Nedelman JR (1991) A log-linear model for the Birnbaum–Saunders distribution. Technometrics 33:51–60

  • Rigby RA, Stasinopoulos DM (2005) Generalized additive models for location, scale and shape. J Appl Stat 54:507–554

  • Rigby RA, Stasinopoulos DM (2007) Generalized additive models for location, scale and shape (GAMLSS) in R. J Stat Softw 23:1–46

  • Rigby RA, Stasinopoulos DM (2016) gamlss: generalized additive models for location scale and shape. R package version 4.2-8. http://CRAN.R-project.org/package=gamlss

  • Schwarz GE (1978) Estimating the dimension of a model. Ann Stat 6:461–464

  • Therneau T (2014) survival: a package for survival analysis in S. R package version 2.38-1. http://CRAN.R-project.org/package=survival

  • Vanegas LH, Cysneiros FJA (2010) Assessment of diagnostic procedures in symmetrical nonlinear regression models. Comput Stat Data Anal 54:1002–1016

  • Vanegas LH, Paula GA (2016a) An extension of log-symmetric models: R codes and applications. J Stat Comput Simul 86:1709–1735

  • Vanegas LH, Paula GA (2016b) Log-symmetric distributions: statistical properties and parameter estimation. Braz J Probab Stat 30:196–220

  • Vanegas LH, Paula GA (2016c) ssym: fitting semiparametric symmetric regression models. R package version 1.5-2. http://CRAN.R-project.org/package=ssym

  • Waller LA, Turnbull BW (1992) Probability plotting with censored data. Am Stat 46:5–12

  • Wood SN (2006) Generalized additive models: an introduction with R. Chapman & Hall, Boca Raton

  • Wu C, Yu Y (2014) Partially linear modeling of conditional quantiles using penalized splines. Comput Stat Data Anal 77:170–187

  • Yu Y, Ruppert D (2002) Penalized spline estimation for partially linear single-index models. J Am Stat Assoc 97:1042–1054

  • Zou Y, Zhang J, Qin G (2011) A semiparametric accelerated failure time partial linear model and its application to breast cancer. Comput Stat Data Anal 55:1479–1487

Acknowledgements

The authors are grateful to the reviewers for their helpful comments and suggestions.

Author information

Corresponding author

Correspondence to Luis Hernando Vanegas.

Appendix

Theorem 2

Under Conditions (1)–(4) the maximum penalized likelihood estimator of \({\varvec{\theta }}\) is consistent and

$$\begin{aligned} \Bigl [-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\Bigr ]^{ -\frac{1}{2}} \Bigl [-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )+\mathbf{M}\Bigr ] \Bigl (\hat{{\varvec{\theta }}} - {\varvec{\theta }}^{^{[0]}}\Bigr ) \xrightarrow [n \rightarrow \infty ]{\mathcal {D}} \mathcal {N}(\mathbf{0},\mathbf{I}). \end{aligned}$$
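
Read informally, and assuming that \(\mathbf{M}\) is symmetric and that \(-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )+\mathbf{M}\) is nonsingular (conditions assumed here for illustration only), this limit suggests the following large-sample approximation, in which the penalty enters through a sandwich-type covariance matrix:

$$\begin{aligned} \hat{{\varvec{\theta }}}-{\varvec{\theta }}^{^{[0]}} \;\overset{\text {approx.}}{\sim }\; \mathcal {N}\Bigl (\mathbf{0},\; \Bigl [-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )+\mathbf{M}\Bigr ]^{-1} \Bigl [-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\Bigr ] \Bigl [-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )+\mathbf{M}\Bigr ]^{-1}\Bigr ). \end{aligned}$$

With \(\mathbf{M}=\mathbf{0}\) the covariance reduces to \(\bigl [-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\bigr ]^{-1}\), the usual form for an unpenalized maximum likelihood estimator.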

Proof

By a Taylor series expansion of the score function of \({\varvec{\theta }}\) around \({\varvec{\theta }}^{^{[0]}}\) it is possible to write

$$\begin{aligned} \dfrac{\partial \mathsf{PL}({{\varvec{\theta }}})}{\partial {\varvec{\theta }}}(\hat{{\varvec{\theta }}})&=\mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}+\left[ \mathbf{J}({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}\right] \left( \hat{{\varvec{\theta }}}-{\varvec{\theta }}^{^{[0]}}\right) \\&\quad +\;\frac{1}{2}\sum \limits _{l=1}\sum \limits _{l'=1} \left( \hat{\theta }_l-\theta ^{^{[0]}}_{l}\right) \left( \hat{\theta }_{l'}-\theta ^{^{[0]}}_{l'}\right) \dfrac{\partial {\mathbf{J}_{{ ll'}} ({\varvec{\theta }})}}{{\partial {\varvec{\theta }}^{\top }}}({\varvec{\theta }}^{^*}), \end{aligned}$$

where \({\varvec{\theta }}^{^*}\) is on the line segment joining \(\hat{{\varvec{\theta }}}\) and \({\varvec{\theta }}^{^{[0]}}\), and \(\mathbf{J}_{ll'}({\varvec{\theta }})\) is the \((l,l')\)th element of \(\mathbf{J}({\varvec{\theta }})\). From Conditions 1, 2, 3 (expressions (a) and (b)) and 4, it follows that

  (1) \(n^{-1}\left[ \mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}\right] \xrightarrow [n \rightarrow \infty ]{\mathcal {P}}\mathbf{0}\) and

  (2) \(n^{-1}\left[ \mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}\right] \xrightarrow [n \rightarrow \infty ]{\mathcal {P}}-{\varvec{\varOmega }}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr ).\)

Therefore, from 1 and 2 and Condition 3 (expression (e)), and using the argument in Billingsley (1961, page 12), it follows that \(\hat{{\varvec{\theta }}}\) is a consistent estimator of \({\varvec{\theta }}^{^{[0]}}\). Moreover, by another Taylor series expansion of the score function of \(\hat{{\varvec{\theta }}}\) around \({\varvec{\theta }}^{^{[0]}}\), it is possible to write

$$\begin{aligned} \dfrac{\partial \mathsf{PL}({\varvec{\theta }})}{\partial {\varvec{\theta }}}(\hat{{\varvec{\theta }}})=\mathbf{0}=\mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}+ \left[ \mathbf{J}({\varvec{\theta }}^*\bigr )-\mathbf{M}\right] \left( \hat{{\varvec{\theta }}}-{\varvec{\theta }}^{^{[0]}}\right) , \end{aligned}$$

where \({\varvec{\theta }}^*\) is on the line segment joining \(\hat{{\varvec{\theta }}}\) and \({\varvec{\theta }}^{^{[0]}}\). By rearranging this expression, we arrive at

$$\begin{aligned} \left[ -\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\right] ^{-\frac{1}{2}} \left[ -\mathbf{J}\bigl ({\varvec{\theta }}^*\bigr )+\mathbf{M}\right] \left( \hat{{\varvec{\theta }}}-{\varvec{\theta }}^{^{[0]}}\right) =\left[ -\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\right] ^{-\frac{1}{2}} \left[ \mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}\right] . \end{aligned}$$
(5)
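
For readers who want the rearrangement spelled out, one way to reach (5) is the following two-step sketch. It assumes that \(-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\) is positive definite, so that its inverse square root exists. First move the last term of the expansion to the left-hand side:

$$\begin{aligned} \mathbf{0}&=\mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}+\left[ \mathbf{J}\bigl ({\varvec{\theta }}^*\bigr )-\mathbf{M}\right] \left( \hat{{\varvec{\theta }}}-{\varvec{\theta }}^{^{[0]}}\right) \\ \Longrightarrow \quad \left[ -\mathbf{J}\bigl ({\varvec{\theta }}^*\bigr )+\mathbf{M}\right] \left( \hat{{\varvec{\theta }}}-{\varvec{\theta }}^{^{[0]}}\right)&=\mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}. \end{aligned}$$

Premultiplying both sides by \(\left[ -\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\right] ^{-\frac{1}{2}}\) then gives (5).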

From Conditions 1, 2, 3 (expressions (c) and (d)) and 4, it follows that

  (3) \(\left[ -\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\right] ^{-\frac{1}{2}} \left[ \mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}\right] \xrightarrow [n \rightarrow \infty ]{\mathcal {D}}\mathcal {N}(\mathbf{0},\mathbf{I})\) and

  (4) \(n^{-1}\left[ \mathbf{J}\bigl ({\varvec{\theta }}^*\bigr )-\mathbf{M}\right] \xrightarrow [n \rightarrow \infty ]{\mathcal {P}}-{\varvec{\varOmega }}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr ).\)

Therefore, from 3 and 4 and (5), and using the Slutsky theorem, it follows that

$$\begin{aligned} \Bigl [-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\Bigr ]^{ -\frac{1}{2}} \Bigl [-\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )+\mathbf{M}\Bigr ] \Bigl (\hat{{\varvec{\theta }}} - {\varvec{\theta }}^{^{[0]}}\Bigr ) \xrightarrow [n \rightarrow \infty ]{\mathcal {D}} \mathcal {N}(\mathbf{0},\mathbf{I}). \end{aligned}$$
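
The limit just derived can be checked numerically in a deliberately simple setting. The sketch below uses a toy model that is not taken from the paper: y_i ~ N(theta0, 1) with a ridge penalty (lam/2) * theta^2, so that -J = n, M = lam, and the penalized estimator is sum(y) / (n + lam). All function names and parameter values are hypothetical choices for this example.

```python
import math
import random
import statistics

def penalized_mean(y, lam):
    """Maximizer of -0.5*sum((y_i - theta)^2) - 0.5*lam*theta^2."""
    return sum(y) / (len(y) + lam)

def standardized(theta_hat, theta0, n, lam):
    """[-J]^(-1/2) [-J + M] (theta_hat - theta0), with -J = n and M = lam."""
    return (n + lam) * (theta_hat - theta0) / math.sqrt(n)

def simulate(theta0, n, lam, reps, seed=0):
    """Mean and variance of the standardized statistic over simulated samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        y = [rng.gauss(theta0, 1.0) for _ in range(n)]
        values.append(standardized(penalized_mean(y, lam), theta0, n, lam))
    return statistics.mean(values), statistics.variance(values)

if __name__ == "__main__":
    mean, var = simulate(theta0=1.0, n=50, lam=2.0, reps=2000)
    print(f"mean = {mean:.3f}, variance = {var:.3f}")
```

In this toy model the standardized statistic is exactly normal with variance 1 and mean -lam * theta0 / sqrt(n). The offset comes from the penalty term and vanishes as n grows with lam held fixed, which agrees with the standard normal limit.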

Cite this article

Vanegas, L.H., Paula, G.A. Log-symmetric regression models under the presence of non-informative left- or right-censored observations. TEST 26, 405–428 (2017). https://doi.org/10.1007/s11749-016-0517-z

  • DOI: https://doi.org/10.1007/s11749-016-0517-z
