Abstract
Principal component analysis has been widely used across research fields (e.g., bioinformatics and medical statistics), especially for high-dimensional data analysis. Although selecting the crucial components is a vital issue in principal component analysis, relatively little attention has been paid to it: existing approaches rely on ad hoc criteria such as the cumulative percent variance or the average eigenvalue. We propose a novel method for selecting principal components based on L\(_{1}\)-type regularized regression modeling. To perform principal component regression effectively, we consider an adaptive L\(_{1}\)-type penalty whose weights are based on the singular values of the components, and propose an adaptively penalized principal component regression. The proposed method performs feature selection that incorporates the explanatory power of the components with respect to both the high-dimensional predictor variables and the response variable. In sparse regression modeling, choosing the regularization parameter is a crucial issue, since feature selection and estimation depend heavily on the selected regularization parameter. We therefore derive a model selection criterion for choosing the regularization parameter of the proposed adaptive L\(_{1}\)-type regularization method in line with the generalized information criterion. Monte Carlo simulations and real data analysis demonstrate that the proposed modeling strategies outperform existing approaches to principal component regression modeling.
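The idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weight form \(w_j = 1/s_j^{\gamma}\) built from the singular values, the function names, and the BIC-style selection criterion (used here as a stand-in for the paper's generalized information criterion) are all assumptions for the sketch. Because principal component scores have orthogonal columns, the weighted lasso step has a closed-form soft-thresholding solution.

```python
import numpy as np

def adaptive_pc_lasso(X, y, lam, gamma=1.0):
    """Weighted-lasso regression of y on PCA scores (hypothetical sketch).

    The weights 1/s_j**gamma (an assumed form) penalize low-variance
    components more heavily, so they are dropped first as lam grows.
    """
    Xc = X - X.mean(axis=0)                     # center predictors
    yc = y - y.mean()                           # center response
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = U * s                                   # PC scores; columns are orthogonal
    w = 1.0 / s**gamma                          # adaptive weights from singular values
    # With orthogonal score columns the weighted lasso has a closed form:
    # soft-threshold each marginal coefficient z_j'y at level lam * w_j.
    zty = Z.T @ yc
    beta = np.sign(zty) * np.maximum(np.abs(zty) - lam * w, 0.0) / s**2
    return beta, Vt                             # nonzero beta_j = selected components

def ic_select(X, y, lams):
    """Pick lam on a grid by a BIC-style criterion (a simplified stand-in
    for the paper's generalized information criterion)."""
    n = len(y)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    Z = U * s
    best = None
    for lam in lams:
        beta, _ = adaptive_pc_lasso(X, y, lam)
        rss = np.sum((yc - Z @ beta) ** 2)
        df = np.count_nonzero(beta)             # active components as degrees of freedom
        crit = n * np.log(rss / n) + np.log(n) * df
        if best is None or crit < best[0]:
            best = (crit, lam, beta)
    return best[1], best[2]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
y = X @ np.array([2.0, 0, 0, 0, 0, 0, 0, 0]) + rng.normal(scale=0.1, size=100)
lam_best, beta_best = ic_select(X, y, lams=[0.1, 1.0, 10.0, 100.0])
print(lam_best, np.count_nonzero(beta_best))
```

As the abstract notes, the selection is driven jointly by a component's variance (through the weight) and its association with the response (through the marginal coefficient), rather than by variance alone as in cumulative-percent-variance rules.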
Acknowledgments
The authors would like to thank the associate editor and anonymous reviewers for the constructive and valuable comments that improved the quality of the paper.
Cite this article
Park, H., Konishi, S. Principal component selection via adaptive regularization method and generalized information criterion. Stat Papers 58, 147–160 (2017). https://doi.org/10.1007/s00362-015-0691-1