Abstract
It is often of interest to use the information in a large number of covariates effectively when predicting a response or outcome. Various statistical tools have been developed to overcome the difficulties caused by the high dimensionality of the covariate space in the setting of a linear regression model. This paper focuses on the situation where the outcomes of interest are subject to right censoring. We apply the extended partial least squares method, along with other commonly used approaches for analyzing high-dimensional covariates, to a data set from AIDS clinical trials (ACTG333). Predictions were computed for the covariate effect and for the response of a future subject with a given set of covariates. Simulation studies were conducted to compare our proposed methods with other prediction procedures for different numbers of covariates, different correlations among the covariates and different failure time distributions. Mean squared prediction error and mean absolute distance were used to measure the accuracy of prediction of the covariate effect and the response, respectively. We also compared the prediction performance of the different approaches in numerical studies. The results show that the Buckley-James based partial least squares, stepwise subset model selection and principal components regression have similar predictive power, and that the partial least squares method has several advantages in terms of interpretability and numerical computation.
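The two accuracy measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give their exact definitions, so the forms below are assumptions: the mean squared prediction error is taken as the average squared difference between the true and predicted covariate effects (the linear predictors), and the mean absolute distance as the average absolute difference between observed and predicted responses. The function names are hypothetical, and the sketch does not address how censored responses would enter the comparison.

```python
from typing import Sequence


def linear_predictor(x: Sequence[float], beta: Sequence[float]) -> float:
    """Covariate effect x'beta for one subject (assumed linear model)."""
    if len(x) != len(beta):
        raise ValueError("x and beta must have the same length")
    return sum(xi * bi for xi, bi in zip(x, beta))


def mean_squared_prediction_error(
    true_effect: Sequence[float], predicted_effect: Sequence[float]
) -> float:
    """Average squared gap between true and predicted covariate effects.

    Assumed definition; the abstract only names the measure.
    """
    if len(true_effect) != len(predicted_effect) or not true_effect:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(true_effect, predicted_effect)]
    return sum(squared) / len(squared)


def mean_absolute_distance(
    observed: Sequence[float], predicted: Sequence[float]
) -> float:
    """Average absolute gap between observed and predicted responses.

    Assumed definition; applies to fully observed responses only.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


if __name__ == "__main__":
    # Toy numbers chosen for illustration only; not data from the paper.
    true_beta = [1.0, 0.5]
    fitted_beta = [0.5, 0.5]
    subjects = [[1.0, 2.0], [2.0, 0.0], [0.0, 4.0]]

    true_eff = [linear_predictor(x, true_beta) for x in subjects]
    pred_eff = [linear_predictor(x, fitted_beta) for x in subjects]

    print(mean_squared_prediction_error(true_eff, pred_eff))
    print(mean_absolute_distance([2.0, 4.0], [2.5, 3.0]))
```

In a simulation study the true coefficients are known, so the covariate-effect error can be computed directly; with real data only the response-based measure is available.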
Copyright information
© 2006 Springer Science+Business Media, Inc.
About this paper
Cite this paper
Huang, J., Harrington, D. (2006). Operating Characteristics of Partial Least Squares in Right-Censored Data Analysis and Its Application in Predicting the Change of HIV-I RNA. In: Nikulin, M., Commenges, D., Huber, C. (eds) Probability, Statistics and Modelling in Public Health. Springer, Boston, MA. https://doi.org/10.1007/0-387-26023-4_14
DOI: https://doi.org/10.1007/0-387-26023-4_14
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-26022-8
Online ISBN: 978-0-387-26023-5
eBook Packages: Mathematics and Statistics (R0)