
A two-step PLS inspired method for linear prediction with group effect


Abstract

In this article, we consider prediction of a univariate response from background data. The data may have a near-collinear structure, and group effects are additionally assumed to exist. A two-step method is proposed. The first step summarizes the information in the predictors via a bilinear model. The bilinear model has a Krylov-structured within-individual design matrix, which is the link to classical partial least squares (PLS) analysis, and a between-individual design matrix, which handles group effects. The second step is the prediction step, where a conditional expectation approach is used. The two-step method gives new insight into PLS. Explicit maximum likelihood estimators of the dispersion matrix and mean for the predictors are derived under the assumption that the covariance between the response and the explanatory variables is known. It is shown that for within-sample prediction the mean squared error of the two-step method is always smaller than that of PLS.
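To make the PLS link concrete, the following is a minimal sketch of the classical Krylov-space representation of PLS regression coefficients that the abstract alludes to. It is not the authors' two-step estimator (which additionally models group effects through a between-individual design matrix); the function and variable names are illustrative only.

```python
import numpy as np

def pls_krylov_coefficients(X, y, a):
    """PLS regression coefficients for `a` components via a Krylov basis.

    X : (n, p) centred predictors; y : (n,) centred response.
    The coefficient vector is G_a (G_a' S G_a)^{-1} G_a' s, where
    G_a = [s, S s, ..., S^{a-1} s], S is the sample dispersion of the
    predictors and s their sample covariance with the response.
    """
    n = X.shape[0]
    S = X.T @ X / n
    s = X.T @ y / n
    # Krylov matrix spanning the relevant-component space
    G = np.column_stack([np.linalg.matrix_power(S, j) @ s for j in range(a)])
    return G @ np.linalg.solve(G.T @ S @ G, G.T @ s)

# Toy usage on near-collinear data
rng = np.random.default_rng(0)
Z = rng.standard_normal((60, 3))
X = np.hstack([Z, Z + 0.01 * rng.standard_normal((60, 3))])  # near-collinear
X = X - X.mean(axis=0)
y = Z @ np.array([1.0, -1.0, 0.5]) + 0.1 * rng.standard_normal(60)
y = y - y.mean()
beta = pls_krylov_coefficients(X, y, a=2)
```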



Author information

Correspondence to Ying Li.

Additional information

An erratum to this article, published on 21 August 2015, is available at http://dx.doi.org/10.1007/s13171-015-0077-4.

Appendix A: Proof of Theorem 2

Proof

We start by proving that \(\boldsymbol{E} (\boldsymbol{y}-\hat{\boldsymbol{y}}_{a,TS})'(\hat{\boldsymbol{y}}_{a,TS}-\hat{\boldsymbol{y}}_{a,PLS})=0\). By the binomial inverse theorem, \(\hat{\boldsymbol{\Sigma}}^{-1}\) can be split into two parts,

$$ \hat{\boldsymbol{\Sigma}}^{-1}=\left(\frac{1}{n}\boldsymbol{S}\right)^{-1}-\boldsymbol{K}, $$
(A.1)

where

$$ \begin{array}{rll} \boldsymbol{K} &=& n\boldsymbol{S}^{-1}(\boldsymbol{I}-\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1})\boldsymbol{X}\boldsymbol{P}_{\mathbf{1}}\boldsymbol{Z}^{-1} \\ &&\times\,\boldsymbol{P}_{\mathbf{1}}\boldsymbol{X}'(\boldsymbol{I}-\boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}')\boldsymbol{S}^{-1}, \end{array} $$

with

$$ \boldsymbol{Z}=\boldsymbol{P}_{\mathbf{1}}\boldsymbol{X}'(\boldsymbol{I}-\boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}')\boldsymbol{S}^{-1}(\boldsymbol{I}-\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1})\boldsymbol{X}\boldsymbol{P}_{\mathbf{1}}+\boldsymbol{I}. $$
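The split (A.1) has the shape of the general binomial inverse (Woodbury) identity: the inverse of a base matrix plus a correction term. A small numerical check of that general identity, with random stand-ins rather than the model quantities \(\boldsymbol{S}\), \(\hat{\boldsymbol{A}}\) and \(\boldsymbol{Z}\):

```python
import numpy as np

# Binomial inverse (Woodbury) theorem:
#   (A + U C V)^{-1} = A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1}
rng = np.random.default_rng(1)
p, k = 6, 2
A = rng.standard_normal((p, p))
A = A @ A.T + p * np.eye(p)          # positive definite base matrix
U = rng.standard_normal((p, k))
C = np.eye(k)
V = U.T

Ainv = np.linalg.inv(A)
K = Ainv @ U @ np.linalg.inv(np.linalg.inv(C) + V @ Ainv @ U) @ V @ Ainv
print(np.allclose(np.linalg.inv(A + U @ C @ V), Ainv - K))  # True
```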

In order to compare \(\hat{\boldsymbol{y}}_{a,PLS}\) with \(\hat{\boldsymbol{y}}_{a,TS}\), we replace \(\hat{\boldsymbol{G}}_a\) by \(n \boldsymbol{S}^{-1} \hat{\boldsymbol{A}}\) and use (A.1):

$$ \begin{array}{rll} (\hat{\boldsymbol{y}}_{a,TS}-\hat{\boldsymbol{y}}_{a,PLS}) &=&(\hat{\boldsymbol{\omega}}'n\boldsymbol{S}^{-1}\boldsymbol{X}-\hat{\boldsymbol{\omega}}'n\boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\boldsymbol{X} \boldsymbol{P}_{\mathbf{1}}\\ &&-\,\hat{\boldsymbol{\omega}}'\boldsymbol{K} \boldsymbol{X}+\hat{\boldsymbol{\omega}}'\boldsymbol{K}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\boldsymbol{X} \boldsymbol{P}_{\mathbf{1}}\\ &&-\,\hat{\boldsymbol{\omega}}'n \boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1} \boldsymbol{X}\\ &&+\,\hat{\boldsymbol{\omega}}'n \boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1} \boldsymbol{X} \boldsymbol{P}_{\mathbf{1}})'. \end{array} $$
(A.2)

Since \(\boldsymbol{K}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\boldsymbol{X} \boldsymbol{P}_{\mathbf{1}}=\mathbf{0}\), the fourth term in (A.2) vanishes; the second and sixth terms cancel, and collecting the first, third and fifth terms via \(n\boldsymbol{S}^{-1}-\boldsymbol{K}=\hat{\boldsymbol{\Sigma}}^{-1}\) from (A.1), (A.2) reduces to

$$ \begin{array}{rll} &&(\hat{\boldsymbol{\omega}}'\hat{\boldsymbol{\Sigma}}^{-1}\boldsymbol{X}-\hat{\boldsymbol{\omega}}'n \boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1} \boldsymbol{X})'\\ &&\quad= \boldsymbol{X}'(\boldsymbol{I}-\boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}') \hat{\boldsymbol{\Sigma}}^{-1} \hat{\boldsymbol{\omega}}. \end{array} $$
(A.3)
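The transpose leading to (A.3) also uses that \(\hat{\boldsymbol{A}}'\) annihilates \(\boldsymbol{K}\) from the left. As this is not spelled out in the display, here is a short verification, assuming \(\boldsymbol{S}\) is positive definite, so that \(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\) acts as the identity on \(\mathcal{C}(\hat{\boldsymbol{A}}')\). Writing \(\boldsymbol{K}=n\boldsymbol{S}^{-1}(\boldsymbol{I}-\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1})\boldsymbol{W}\), where \(\boldsymbol{W}\) is a label introduced here for brevity that collects the remaining factors of \(\boldsymbol{K}\),

$$ \hat{\boldsymbol{A}}'\boldsymbol{K}=n\left(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}-\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\right)\boldsymbol{W}=\mathbf{0}. $$

Together with (A.1) this gives \(\hat{\boldsymbol{A}}'\hat{\boldsymbol{\Sigma}}^{-1}=n\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\), which is what allows \(n\boldsymbol{S}^{-1}\) to be replaced by \(\hat{\boldsymbol{\Sigma}}^{-1}\) in the second term of (A.3).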

For \(\boldsymbol{y}-\hat{\boldsymbol{y}}_{a,TS}\),

$$ \begin{array}{rll} (\boldsymbol{y}-\hat{\boldsymbol{y}}_{a,TS})'&=&\boldsymbol{y}' - \hat{\boldsymbol{\omega}}'\hat{\boldsymbol{\Sigma}}^{-1}(\boldsymbol{X}-\hat{\boldsymbol{\mu}}_x)-\hat{\boldsymbol{\mu}}'_y \\ &=&\boldsymbol{y}' (\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}}) \left(\boldsymbol{I}-\frac{1}{n}\boldsymbol{X}' \hat{\boldsymbol{\Sigma}}^{-1}(\boldsymbol{X}-\hat{\boldsymbol{\mu}}_x)\right). \end{array} $$
(A.4)

Then, applying the results (A.3) and (A.4),

$$ \begin{array}{rll} &&(\boldsymbol{y}-\hat{\boldsymbol{y}}_{a,TS})'(\hat{\boldsymbol{y}}_{a,TS}-\hat{\boldsymbol{y}}_{a,PLS})\\ &&\quad=\boldsymbol{y}' (\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}}) \left(\boldsymbol{I}-\frac{1}{n}\boldsymbol{X}' \hat{\boldsymbol{\Sigma}}^{-1}(\boldsymbol{X}-\hat{\boldsymbol{\mu}}_x)\right)\boldsymbol{X}'(\boldsymbol{I}-\boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}') \hat{\boldsymbol{\Sigma}}^{-1} \hat{\boldsymbol{\omega}}, \end{array} $$

which equals 0 if

$$ \left(\boldsymbol{I}-\frac{1}{n}\boldsymbol{X}' \hat{\boldsymbol{\Sigma}}^{-1}(\boldsymbol{X}-\hat{\boldsymbol{\mu}}_x)\right)\boldsymbol{X}'(\boldsymbol{I}-\boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}')=\mathbf{0}. $$

For the second inequality, it is sufficient to show that \(\boldsymbol{E}(\boldsymbol{y}-\hat{\boldsymbol{y}}_L)'(\hat{\boldsymbol{y}}_L-\hat{\boldsymbol{y}}_{a,TS})=0\) holds, where

$$ (\hat{\boldsymbol{y}}_L-\hat{\boldsymbol{y}}_{a,TS})=\left(\hat{\boldsymbol{\omega}}'\left(\frac{1}{n}\boldsymbol{S}\right)^{-1}(\boldsymbol{X}-\boldsymbol{X} \boldsymbol{P}_{\mathbf{1}})-\hat{\boldsymbol{\omega}}'\hat{\boldsymbol{\Sigma}}^{-1}(\boldsymbol{X}-\hat{\boldsymbol{\mu}}_x)\right)'. $$
(A.5)

Then using (A.1) in (A.5),

$$ (\hat{\boldsymbol{y}}_L-\hat{\boldsymbol{y}}_{a,TS})=\left( -\boldsymbol{P}_{\mathbf{1}}\boldsymbol{X}'\left(\frac{1}{n}\boldsymbol{S}\right)^{-1} + \boldsymbol{X}'\boldsymbol{K}+\hat{\boldsymbol{\mu}}'_x\left(\frac{1}{n}\boldsymbol{S}\right)^{-1}- \hat{\boldsymbol{\mu}}'_x\boldsymbol{K} \right)\hat{\boldsymbol{\omega}} $$
(A.6)
$$ \begin{array}{rll} &=& \boldsymbol{X}'\boldsymbol{K} \hat{\boldsymbol{\omega}}-(\boldsymbol{P}_{\mathbf{1}}\boldsymbol{X}'-\hat{\boldsymbol{\mu}}'_x)\left(\frac{1}{n}\boldsymbol{S}\right)^{-1}\hat{\boldsymbol{\omega}} \\ &=&\boldsymbol{X}'\boldsymbol{K} \hat{\boldsymbol{\omega}}-\boldsymbol{P}_{\mathbf{1}}\boldsymbol{X}'(\boldsymbol{I}-\boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}')\left(\frac{1}{n}\boldsymbol{S}\right)^{-1}\hat{\boldsymbol{\omega}}, \end{array} $$
(A.7)

where (A.7) follows from (A.6) since \(\hat{\boldsymbol{\mu}}'_x\boldsymbol{K}=\mathbf{0}\), as verified below.
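Comparing (A.6) with (A.7) shows that \(\hat{\boldsymbol{\mu}}_x=\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\boldsymbol{X}\boldsymbol{P}_{\mathbf{1}}\). Granting this form (and taking the g-inverse symmetric, as is customary), the claim reduces to the annihilation \(\hat{\boldsymbol{A}}'\boldsymbol{K}=\mathbf{0}\) verified after (A.3):

$$ \hat{\boldsymbol{\mu}}'_x\boldsymbol{K}=\boldsymbol{P}_{\mathbf{1}}\boldsymbol{X}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\,\hat{\boldsymbol{A}}'\boldsymbol{K}=\mathbf{0}. $$

Furthermore,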

$$ \begin{array}{rll} &&(\boldsymbol{y}-\hat{\boldsymbol{y}}_L)'(\hat{\boldsymbol{y}}_L-\hat{\boldsymbol{y}}_{a,TS})\\ &&\quad=\boldsymbol{y}' (\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}})(\boldsymbol{I}-\boldsymbol{X}' \boldsymbol{S}^{-1} \boldsymbol{X}(\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}}))\boldsymbol{X}'\boldsymbol{K} \hat{\boldsymbol{\omega}}\\ &&\qquad-(\boldsymbol{y}'-\boldsymbol{y}' (\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}})\boldsymbol{X}' \boldsymbol{S}^{-1}\boldsymbol{X})(\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}})\boldsymbol{P}_{\mathbf{1}}\boldsymbol{X}'(\boldsymbol{I}-\boldsymbol{S}^{-1}\hat{\boldsymbol{A}}(\hat{\boldsymbol{A}}'\boldsymbol{S}^{-1}\hat{\boldsymbol{A}})^{-}\hat{\boldsymbol{A}}')\left(\frac{1}{n}\boldsymbol{S}\right)^{-1}\hat{\boldsymbol{\omega}} \\ &&\quad=\boldsymbol{y}' (\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}})\boldsymbol{X}'(\boldsymbol{I}-\boldsymbol{S}^{-1} \boldsymbol{X}(\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}})\boldsymbol{X}')\boldsymbol{K} \hat{\boldsymbol{\omega}}, \end{array} $$

where the second equality uses \((\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}})\boldsymbol{P}_{\mathbf{1}}=\mathbf{0}\). Moreover, \(\boldsymbol{S}=\boldsymbol{X}(\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}})\boldsymbol{X}'\) gives \(\boldsymbol{I}=\boldsymbol{S}^{-1} \boldsymbol{X}(\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}})\boldsymbol{X}'\), so the last expression vanishes. Thus \((\boldsymbol{y}-\hat{\boldsymbol{y}}_L)'(\hat{\boldsymbol{y}}_L-\hat{\boldsymbol{y}}_{a,TS})=0\).
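As a last sanity check, the two projection facts used above can be verified numerically, reading \(\boldsymbol{P}_{\mathbf{1}}\) as the orthogonal projection onto the vector of ones and taking \(\boldsymbol{S}=\boldsymbol{X}(\boldsymbol{I}-\boldsymbol{P}_{\mathbf{1}})\boldsymbol{X}'\) with individuals as columns of \(\boldsymbol{X}\); both readings are assumptions about the notation here.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 4, 20
X = rng.standard_normal((p, n))   # predictors, individuals in columns (assumed layout)
P1 = np.full((n, n), 1.0 / n)     # projection onto C(1)
C = np.eye(n) - P1                # centring projection
S = X @ C @ X.T                   # centred sums-of-squares matrix

print(np.allclose(C @ P1, 0))                                  # (I - P1) P1 = 0
print(np.allclose(np.linalg.inv(S) @ X @ C @ X.T, np.eye(p)))  # S^{-1} X (I - P1) X' = I
```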

Cite this article

Li, Y., Udén, P. & von Rosen, D. A two-step PLS inspired method for linear prediction with group effect. Sankhya A 75, 96–117 (2013). https://doi.org/10.1007/s13171-012-0022-8
