
Principal component selection via adaptive regularization method and generalized information criterion

  • Regular Article
  • Statistical Papers

Abstract

Principal component analysis has been widely used in many fields of research (e.g., bioinformatics and medical statistics), especially in high-dimensional data analysis. Although selecting the crucial components is a vital issue in principal component analysis, relatively little attention has been paid to it: existing approaches rely on ad hoc methods such as the cumulative percent variance or the average eigenvalue. We propose a novel method for selecting principal components based on L\(_{1}\)-type regularized regression modeling. To perform principal component regression effectively, we consider an adaptive L\(_{1}\)-type penalty based on the singular values of the components, and propose adaptive penalized principal component regression. The proposed method performs feature selection that incorporates the explanatory power of the components with respect to not only the high-dimensional predictor variables but also the response variable. In sparse regression modeling, choosing the regularization parameter is a crucial issue, since feature selection and estimation depend heavily on it. We derive a model selection criterion for choosing the regularization parameter of the proposed adaptive L\(_{1}\)-type regularization method in line with a generalized information criterion. Monte Carlo simulations and real data analysis demonstrate that the proposed modeling strategies outperform existing methods for principal component regression modeling.
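The general idea described above — penalizing principal components adaptively, so that components with little explanatory power are shrunk more aggressively — can be sketched as follows. This is a minimal illustration, not the paper's actual method: the penalty weights (here taken as the reciprocals of the singular values), the solver (scikit-learn's plain lasso applied after a standard column-rescaling trick), and the fixed regularization parameter are all assumptions; the paper's weighting scheme and its GIC-based choice of the regularization parameter may differ.

```python
# Sketch of adaptive L1-penalized principal component regression.
# ASSUMPTIONS (not from the paper): weights w_j = 1/d_j from the singular
# values d_j of the centered design matrix; tuning parameter fixed at 0.1
# rather than chosen by a generalized information criterion.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
y = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.standard_normal(n)

# Principal component scores of the centered predictors via SVD.
Xc = X - X.mean(axis=0)
U, d, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt.T                 # component scores, n x p

# Adaptive weights: components with small singular values
# (low explanatory power for X) are penalized more heavily.
w = 1.0 / d

# Weighted-L1 problem via the usual rescaling trick:
# run a plain lasso on Z_j / w_j, then map coefficients back.
Z_tilde = Z / w
fit = Lasso(alpha=0.1, fit_intercept=True).fit(Z_tilde, y)
beta = fit.coef_ / w          # coefficients on the original components

selected = np.flatnonzero(beta != 0)
print("selected components:", selected)
```

Because the lasso zeroes out coefficients exactly, the components surviving in `beta` constitute the selected subset; in the paper's framework the regularization parameter would be chosen by the derived generalized information criterion rather than fixed in advance.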



Acknowledgments

The authors would like to thank the associate editor and anonymous reviewers for the constructive and valuable comments that improved the quality of the paper.

Author information


Corresponding author

Correspondence to Heewon Park.


About this article


Cite this article

Park, H., Konishi, S. Principal component selection via adaptive regularization method and generalized information criterion. Stat Papers 58, 147–160 (2017). https://doi.org/10.1007/s00362-015-0691-1

