Abstract
In this article, we consider prediction of a univariate response from background data. The data may have a near-collinear structure, and group effects are additionally assumed to exist. A two-step method is proposed. The first step summarizes the information in the predictors via a bilinear model. The bilinear model has a Krylov-structured within-individual design matrix, which is the link to classical partial least squares (PLS) analysis, and a between-individual design matrix, which handles the group effects. The second step is the prediction step, where a conditional expectation approach is used. The two-step method gives new insight into PLS. Explicit maximum likelihood estimators of the dispersion matrix and the mean of the predictors are derived under the assumption that the covariance between the response and the explanatory variables is known. It is shown that for within-sample prediction the mean squared error of the two-step method is always smaller than that of PLS.
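To fix ideas, the sketch below illustrates the classical Krylov-space representation of PLS that the abstract alludes to, together with a conditional-expectation-type prediction rule. It is a minimal illustration only: it omits the bilinear between-individual design matrix (the group effects) and the maximum likelihood estimation of the dispersion matrix on which the two-step method relies, and all function and variable names are ours, not the article's.

```python
# Minimal sketch: Krylov-based (PLS-style) weights plus a
# conditional-expectation-type prediction rule.  This is NOT the
# article's two-step estimator; it has no group effects and no ML step.
import numpy as np

def krylov_weights(S, s, a):
    """Columns s, S s, ..., S^{a-1} s spanning the order-a Krylov space."""
    cols = [s]
    for _ in range(a - 1):
        cols.append(S @ cols[-1])
    return np.column_stack(cols)

def fit_predict(X, y, x_new, a=2):
    """Centre the data, build Krylov weights G_a, and predict via
    y_hat = y_bar + (x_new - x_bar)' G_a (G_a' S G_a)^{-1} G_a' s."""
    x_bar, y_bar = X.mean(axis=0), y.mean()
    Xc, yc = X - x_bar, y - y_bar
    S = Xc.T @ Xc / len(y)      # sample dispersion of the predictors
    s = Xc.T @ yc / len(y)      # sample covariance between x and y
    G = krylov_weights(S, s, a)
    beta = G @ np.linalg.solve(G.T @ S @ G, G.T @ s)
    return y_bar + (x_new - x_bar) @ beta

# Toy usage on simulated data.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
y = X @ np.array([1.0, 0.5, 0, 0, 0, 0]) + 0.1 * rng.normal(size=50)
print(fit_predict(X, y, X[:3], a=2))
```

The prediction rule mimics the conditional expectation for jointly normal variables, \(\hat{y} = \bar{y} + \hat{\boldsymbol{\sigma}}_{yx}'\hat{\boldsymbol{\Sigma}}^{-1}(\boldsymbol{x} - \bar{\boldsymbol{x}})\), with the inverse dispersion matrix restricted to the Krylov space; increasing the number of components \(a\) moves the rule toward ordinary least squares.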
Additional information
An erratum to this article is available at http://dx.doi.org/10.1007/s13171-015-0077-4.
A Appendix: Proof of Theorem 2
Proof
Let us start with the proof of \(\boldsymbol{E} (\boldsymbol{y}-\hat{\boldsymbol{y}}_{a,TS})'(\hat{\boldsymbol{y}}_{a,TS}-\hat{\boldsymbol{y}}_{a,PLS})=0\). Based on the binomial inverse theorem (recalled in generic form below), \(\hat{{\boldsymbol{\Sigma}}}^{-1}\) can be split into two parts,
where
with
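As a reminder, and not the article's own display (the specific matrices entering (A.1) are those defined in the article), the binomial inverse theorem used for this split states that for conformable matrices \(\boldsymbol{A}\), \(\boldsymbol{U}\), \(\boldsymbol{B}\), \(\boldsymbol{V}\), provided the indicated inverses exist,
\[
(\boldsymbol{A}+\boldsymbol{U}\boldsymbol{B}\boldsymbol{V})^{-1}
=\boldsymbol{A}^{-1}-\boldsymbol{A}^{-1}\boldsymbol{U}\left(\boldsymbol{B}^{-1}+\boldsymbol{V}\boldsymbol{A}^{-1}\boldsymbol{U}\right)^{-1}\boldsymbol{V}\boldsymbol{A}^{-1}.
\]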
In order to compare \(\hat{\boldsymbol{y}}_{a,PLS}\) with \(\hat{\boldsymbol{y}}_{a,TS}\), we replace \(\hat{\boldsymbol{G}}_a\) by \(n \boldsymbol{S}^{-1} \hat{\boldsymbol{A}}\) and, using (A.1), obtain
Since \(\boldsymbol{K}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\boldsymbol{X} \boldsymbol{P}_{\mathbf{1}}=\mathbf{0}\), (A.2) reduces to
For \(\boldsymbol{y}-\hat{\boldsymbol{y}}_{a,TS}\),
Then we apply the results of (A.3) and (A.4),
which equals 0 if
For the second inequality, it suffices to prove that \(\boldsymbol{E}(\boldsymbol{y}-\hat{\boldsymbol{y}}_L)'(\hat{\boldsymbol{y}}_L-\hat{\boldsymbol{y}}_{a,TS})=0\) holds, where
where (A.7) follows from (A.6) since \( \hat{\boldsymbol{\mu}}_x'\boldsymbol{K}=\mathbf{0} \). Furthermore,
since \(\boldsymbol{I}=\boldsymbol{S}^{-1} \boldsymbol{X}(\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}})\boldsymbol{X}'\). Thus \((\boldsymbol{y}-\hat{\boldsymbol{y}}_L)'(\hat{\boldsymbol{y}}_L-\hat{\boldsymbol{y}}_{a,TS})=0\), and hence \(\boldsymbol{E}(\boldsymbol{y}-\hat{\boldsymbol{y}}_L)'(\hat{\boldsymbol{y}}_L-\hat{\boldsymbol{y}}_{a,TS})=0\), which completes the proof. □