Examination of Dimension Reduction Performances of PLSR and PCR Techniques in Data with Multicollinearity

  • Research Paper
Iranian Journal of Science and Technology, Transactions A: Science

Abstract

One common way to cope with multicollinearity in multiple regression analysis is to use dimension reduction techniques. Among these techniques, the present study focuses on Partial Least Squares Regression (PLSR) and Principal Component Regression (PCR). The study seeks to determine in which cases the two techniques give similar results and in which cases, and to what extent, they differ in terms of dimension reduction. For this purpose, the performance of the techniques is examined on two real datasets. In addition, a Monte Carlo simulation is conducted to evaluate the performance of these techniques under different conditions, using the Root Mean Square Error of Cross-Validation (RMSECV) as the criterion.
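The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or simulation design: the data generator, the sample size, the correlation level `rho`, the 5-fold split, and all function names (`simulate_collinear`, `pcr_coef`, `pls1_coef`, `rmsecv`) are assumptions chosen for illustration. The sketch computes RMSECV for PCR and for single-response PLSR (NIPALS) as the number of retained components grows.

```python
import numpy as np

def simulate_collinear(n=60, p=6, rho=0.95, seed=0):
    """Hypothetical design: predictors share a common factor, so each pair has correlation rho."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, p + 1))
    X = np.sqrt(1 - rho) * z[:, :p] + np.sqrt(rho) * z[:, [p]]
    y = X @ np.ones(p) + rng.standard_normal(n)  # assumed true coefficients: all ones
    return X, y

def pcr_coef(X, y, k):
    """PCR: regress centred y on the first k principal component scores of centred X."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:k].T                       # p x k loadings
    T = X @ V                          # n x k scores, with T'T = diag(s^2)
    gamma = (T.T @ y) / (s[:k] ** 2)   # per-component least squares
    return V @ gamma                   # coefficients on the original predictors

def pls1_coef(X, y, k):
    """PLS1 via NIPALS on centred data: components maximise covariance with y."""
    Xk, yk = X.copy(), y.copy()
    p = X.shape[1]
    W, P, q = np.zeros((p, k)), np.zeros((p, k)), np.zeros(k)
    for a in range(k):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)         # weight vector
        t = Xk @ w                     # score vector
        tt = t @ t
        P[:, a] = Xk.T @ t / tt        # X loading
        q[a] = yk @ t / tt             # y loading
        W[:, a] = w
        Xk = Xk - np.outer(t, P[:, a]) # deflate X
        yk = yk - q[a] * t             # deflate y
    return W @ np.linalg.solve(P.T @ W, q)

def rmsecv(X, y, coef_fn, k, folds=5, seed=0):
    """Root mean square error of cross-validation with k components."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    sq_err = 0.0
    for test in np.array_split(idx, folds):
        train = np.setdiff1d(idx, test)
        x_mean, y_mean = X[train].mean(axis=0), y[train].mean()
        b = coef_fn(X[train] - x_mean, y[train] - y_mean, k)
        pred = y_mean + (X[test] - x_mean) @ b
        sq_err += np.sum((y[test] - pred) ** 2)
    return np.sqrt(sq_err / n)

X, y = simulate_collinear()
for k in range(1, X.shape[1] + 1):
    print(k, rmsecv(X, y, pcr_coef, k), rmsecv(X, y, pls1_coef, k))
```

When all components are retained, both methods reduce to ordinary least squares, so their RMSECV values coincide; any difference in dimension reduction shows up at smaller numbers of components. Repeating the loop over many seeds and several values of `rho` would give a Monte Carlo comparison in the spirit of the one the abstract describes.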

Acknowledgements

The authors are grateful to reviewers for their valuable comments and suggestions to improve the quality of this paper.

Funding

This work was supported by Scientific Research Projects of Eskisehir Osmangazi University [grant number 201519A112].

Author information

Corresponding author

Correspondence to Hatice Samkar.

About this article

Cite this article

Guven, G., Samkar, H. Examination of Dimension Reduction Performances of PLSR and PCR Techniques in Data with Multicollinearity. Iran J Sci Technol Trans Sci 43, 969–978 (2019). https://doi.org/10.1007/s40995-018-0565-1

  • DOI: https://doi.org/10.1007/s40995-018-0565-1
