Abstract
This paper extends log-symmetric regression models to accommodate non-informative left- or right-censored observations. Under such models, the log-lifetime distribution belongs to the symmetric class and its location and scale parameters are described by semi-parametric functions of explanatory variables, whose nonparametric components are approximated using natural cubic splines or P-splines. An iterative process of parameter estimation by the maximum penalized likelihood method is presented. The large sample properties of the maximum penalized likelihood estimators are studied analytically and by simulation experiments. Diagnostic methods such as deviance-type residuals and local influence measures are derived. The package ssym, which includes an implementation in the computational environment R of the methodology addressed in this paper, is also discussed. The proposed methodology is illustrated by the analysis of a real data set.
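To make the role of censoring concrete, the following minimal sketch evaluates an unpenalized log-likelihood for one member of the log-symmetric class, the log-normal, under non-informative censoring. It is an illustration under stated assumptions, not the ssym implementation: the function name `censored_loglik`, the status coding (0 observed, 1 right-censored, -1 left-censored) and the parametrization by a location `mu` and a scale `sigma` of the log-lifetime are all hypothetical choices made here.

```python
import math


def censored_loglik(times, status, mu, sigma):
    """Log-likelihood of a log-normal model with censored observations.

    Assumed coding (hypothetical, chosen for this sketch):
      status 0  -> lifetime observed exactly,
      status 1  -> right-censored (true lifetime exceeds the recorded time),
      status -1 -> left-censored (true lifetime is below the recorded time).
    mu holds one location value of log(T) per observation; sigma > 0 is a
    common scale of log(T).
    """
    total = 0.0
    for t, s, m in zip(times, status, mu):
        z = (math.log(t) - m) / sigma
        if s == 0:
            # Density of T = exp(Y), with Y normal: f_Y(log t) / t.
            total += (-0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
                      - math.log(sigma) - math.log(t))
        elif s == 1:
            # Survival function P(T > t) = 1 - Phi(z), written with erfc.
            total += math.log(0.5 * math.erfc(z / math.sqrt(2.0)))
        else:
            # Distribution function P(T <= t) = Phi(z).
            total += math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))
    return total
```

An observed lifetime contributes its log-density, while a censored one contributes the log of the survival or distribution function at the recorded time. Replacing the normal density and distribution function by those of another symmetric law would give the corresponding log-symmetric model; the penalty terms for the spline components are omitted here.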
References
Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Petrov BN, Csaki F (eds) Second international symposium on information theory. Akademiai Kiado, Budapest, pp 267–281
Bagdonavičius V, Nikulin M (2001) Accelerated life models. Modeling and statistical analysis. Chapman & Hall, Boca Raton
Barros M, Paula GA, Leiva V (2008) A new class of survival regression models with heavy-tailed errors: robustness and diagnostics. Lifetime Data Anal 14:316–332
Billingsley P (1961) Statistical inference for Markov processes. The University of Chicago Press, Chicago
Borgan Ø (1984) Maximum likelihood estimation in parametric counting process models, with applications to censored failure time data. Scand J Stat 11:1–16
Broström G (2014) eha: event history analysis. R package version 2.4-2. http://CRAN.R-project.org/package=eha
Conover WJ (1971) Practical nonparametric statistics. Wiley, New York
Cook RD (1986) Assessment of local influence (with discussion). J R Stat Soc B Methodol 48:133–169
Davison AC, Gigli A (1989) Deviance residual and normal scores plots. Biometrika 76:211–221
Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11:89–121
Fang KT, Kotz S, Ng KW (1990) Symmetric multivariate and related distributions. Chapman & Hall, London
Green PJ, Silverman BW (1994) Nonparametric regression and generalized linear models. Chapman & Hall, London
Harrell FE (2015) rms: regression modeling strategies. R package version 4.3-1. http://CRAN.R-project.org/package=rms
Jackson C (2015) flexsurv: flexible parametric survival and multi-state models. R package version 0.6. http://CRAN.R-project.org/package=flexsurv
Kalbfleisch JD, Prentice RL (2002) The statistical analysis of failure time data. Wiley, New York
Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53:457–481
Klein JP, Moeschberger ML (1997) Survival analysis. Springer, New York
Kneib T (2013) Rejoinder. Stat Model 13:373–385
Lancaster P, Salkauskas K (1986) Curve and surface fitting: an introduction. Academic Press, London
Marshall AW, Olkin I (2007) Life distributions. Springer, New York
Ortega JM, Rheinboldt WC (1970) Iterative solution of nonlinear equations in several variables. Academic Press, New York
Paula GA, Leiva V, Barros M, Liu S (2012) Robust statistical modeling using Birnbaum–Saunders-\(t\) distribution applied to insurance. Appl Stoch Model Bus 28:16–34
Pierce DA, Schafer DW (1986) Residuals in generalized linear models. J Am Stat Assoc 81:977–986
Poon WY, Poon YS (1999) Conformal normal curvature and assessment of local influence. J R Stat Soc B Methodol 61:51–61
R Core Team (2016) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/
Rieck JR, Nedelman JR (1991) A log-linear model for the Birnbaum–Saunders distribution. Technometrics 33:51–60
Rigby RA, Stasinopoulos DM (2005) Generalized additive models for location, scale and shape. J R Stat Soc C Appl Stat 54:507–554
Rigby RA, Stasinopoulos DM (2007) Generalized additive models for location, scale and shape (GAMLSS) in R. J Stat Softw 23:1–46
Rigby RA, Stasinopoulos DM (2016) gamlss: generalized additive models for location scale and shape. R package version 4.2-8. http://CRAN.R-project.org/package=gamlss
Schwarz GE (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Therneau T (2014) survival: a package for survival analysis in S. R package version 2.38-1. http://CRAN.R-project.org/package=survival
Vanegas LH, Cysneiros FJA (2010) Assessment of diagnostic procedures in symmetrical nonlinear regression models. Comput Stat Data Anal 54:1002–1016
Vanegas LH, Paula GA (2016a) An extension of log-symmetric models: R codes and applications. J Stat Comput Simul 86:1709–1735
Vanegas LH, Paula GA (2016b) Log-symmetric distributions: statistical properties and parameter estimation. Braz J Probab Stat 30:196–220
Vanegas LH, Paula GA (2016c) ssym: fitting semiparametric symmetric regression models. R package version 1.5-2. http://CRAN.R-project.org/package=ssym
Waller LA, Turnbull BW (1992) Probability plotting with censored data. Am Stat 46:5–12
Wood SN (2006) Generalized additive models: an introduction with R. Chapman & Hall, Boca Raton
Wu C, Yu Y (2014) Partially linear modeling of conditional quantiles using penalized splines. Comput Stat Data Anal 77:170–187
Yu Y, Ruppert D (2002) Penalized spline estimation for partially linear single-index models. J Am Stat Assoc 97:1042–1054
Zou Y, Zhang J, Qin G (2011) A semiparametric accelerated failure time partial linear model and its application to breast cancer. Comput Stat Data Anal 55:1479–1487
Acknowledgements
The authors are grateful to the reviewers for their helpful comments and suggestions.
Appendix
Theorem 2
Under Conditions (1)–(4) the maximum penalized likelihood estimator of \({\varvec{\theta }}\) is consistent and
Proof
By a Taylor series expansion of the score function of \({\varvec{\theta }}\) around \({\varvec{\theta }}^{^{[0]}}\) it is possible to write
where \({\varvec{\theta }}^{^*}\) is on the line segment joining \(\hat{{\varvec{\theta }}}\) and \({\varvec{\theta }}^{^{[0]}}\), and \(\partial \mathbf{J}_{ll'}({\varvec{\theta }})\) is the \((l,l')\)th element of \(\mathbf{J}({\varvec{\theta }})\). From Conditions 1, 2, 3 (expressions (a) and (b)) and 4, it follows that
(1) \(n^{-1}\left[ \mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}\right] \xrightarrow [n \rightarrow \infty ]{\mathcal {P}}\mathbf{0}\) and
(2) \(n^{-1}\left[ \mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}\right] \xrightarrow [n \rightarrow \infty ]{\mathcal {P}}-{\varvec{\varOmega }}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr ).\)
Therefore, from (1) and (2) and Condition 3 (expression (e)), and using the argument in Billingsley (1961, page 12), it follows that \(\hat{{\varvec{\theta }}}\) is a consistent estimator of \({\varvec{\theta }}^{^{[0]}}\). Moreover, by another Taylor series expansion of the score function of \(\hat{{\varvec{\theta }}}\) around \({\varvec{\theta }}^{^{[0]}}\), it is possible to write
where \({\varvec{\theta }}^*\) is on the line segment joining \(\hat{{\varvec{\theta }}}\) and \({\varvec{\theta }}^{^{[0]}}\). By rearranging this expression, we arrive at
From Conditions 1, 2, 3 (expressions (c) and (d)) and 4, it follows that
(3) \(\left[ -\mathbf{J}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\right] ^{-\frac{1}{2}} \left[ \mathbf{U}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )-\mathbf{M}{\varvec{\theta }}^{^{[0]}}\right] \xrightarrow [n \rightarrow \infty ]{\mathcal {D}}\mathcal {N}(\mathbf{0},\mathbf{I})\) and
(4) \(n^{-1}\left[ \mathbf{J}\bigl ({\varvec{\theta }}^*\bigr )-\mathbf{M}\right] \xrightarrow [n \rightarrow \infty ]{\mathcal {P}}-{\varvec{\varOmega }}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr ).\)
Therefore, from (3), (4) and (5), and using the Slutsky theorem, it follows that
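As a sketch of how the second expansion, its rearrangement and the Slutsky step fit together, assume that the penalized score is \(\mathbf{U}({\varvec{\theta }})-\mathbf{M}{\varvec{\theta }}\), that it vanishes at \(\hat{{\varvec{\theta }}}\), and that its Jacobian is \(\mathbf{J}({\varvec{\theta }})-\mathbf{M}\), as the form of (1)–(4) suggests. The limiting covariance below is inferred from (3) and (4) under the further assumption that \(n^{-1}\mathbf{M}\rightarrow \mathbf{0}\); it is a reconstruction, not a quotation of the theorem's display.

```latex
\begin{align*}
\mathbf{0}
  &= \mathbf{U}\bigl(\boldsymbol{\theta}^{[0]}\bigr)-\mathbf{M}\boldsymbol{\theta}^{[0]}
     +\bigl[\mathbf{J}(\boldsymbol{\theta}^{*})-\mathbf{M}\bigr]
      \bigl(\hat{\boldsymbol{\theta}}-\boldsymbol{\theta}^{[0]}\bigr),\\
\sqrt{n}\,\bigl(\hat{\boldsymbol{\theta}}-\boldsymbol{\theta}^{[0]}\bigr)
  &= \Bigl\{-n^{-1}\bigl[\mathbf{J}(\boldsymbol{\theta}^{*})-\mathbf{M}\bigr]\Bigr\}^{-1}
     n^{-1/2}\Bigl[\mathbf{U}\bigl(\boldsymbol{\theta}^{[0]}\bigr)
       -\mathbf{M}\boldsymbol{\theta}^{[0]}\Bigr],\\
\sqrt{n}\,\bigl(\hat{\boldsymbol{\theta}}-\boldsymbol{\theta}^{[0]}\bigr)
  &\xrightarrow[n\rightarrow\infty]{\mathcal{D}}
     \mathcal{N}\Bigl(\mathbf{0},\,
       \boldsymbol{\Omega}^{-1}\bigl(\boldsymbol{\theta}^{[0]}\bigr)\Bigr).
\end{align*}
```

In this reading, (4) gives the probability limit \({\varvec{\varOmega }}\bigl ({\varvec{\theta }}^{^{[0]}}\bigr )\) of the matrix in braces, (3) gives the asymptotic normality of the scaled penalized score, and the Slutsky theorem combines the two.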
Cite this article
Vanegas, L.H., Paula, G.A. Log-symmetric regression models under the presence of non-informative left- or right-censored observations. TEST 26, 405–428 (2017). https://doi.org/10.1007/s11749-016-0517-z
DOI: https://doi.org/10.1007/s11749-016-0517-z
Keywords
- Survival analysis
- Accelerated lifetime models
- Symmetric distributions
- Robust estimation
- Backfitting algorithm
- Additive model