Prediction of clinical outcome with microarray data: a partial least squares discriminant analysis (PLS-DA) approach

  • Original Investigation
  • Human Genetics

Abstract

Partial least squares discriminant analysis (PLS-DA) is a partial least squares regression of a set Y of binary variables, which describe the categories of a categorical variable, on a set X of predictor variables. It is a compromise between the usual discriminant analysis and a discriminant analysis on the significant principal components of the predictor variables. This technique is especially well suited to dealing with a much larger number of predictors than observations and with multicollinearity, two of the main problems encountered when analysing microarray expression data. We explore the performance of PLS-DA with published data from breast cancer (Perou et al. 2000). Three analyses were carried out: (1) before vs after chemotherapy treatment, (2) estrogen receptor positive vs negative tumours, and (3) tumour classification. We found that the performance of PLS-DA was extremely satisfactory in all cases and that the discriminant cDNA clones often had a sound biological interpretation. We conclude that PLS-DA is a powerful yet simple tool for analysing microarray data.
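
To make the definition above concrete, the following is a minimal sketch of PLS-DA, not the authors' implementation. It assumes NumPy is available, uses a standard NIPALS-style PLS2 fit with deflation, and runs on simulated two-class data with many more predictors than observations rather than on the Perou et al. (2000) data. All function names (`dummy_code`, `pls_fit`, `plsda_predict`, `simulate`), the class labels, and the simulation settings are hypothetical and chosen only for illustration.

```python
import numpy as np

def dummy_code(labels):
    """Code a categorical variable as a set Y of binary (0/1) columns."""
    classes = sorted(set(labels))
    Y = np.array([[float(lab == c) for c in classes] for lab in labels])
    return classes, Y

def pls_fit(X, Y, n_components):
    """NIPALS-style PLS2 regression of Y on X (illustrative sketch)."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    E, F = X - x_mean, Y - y_mean          # centred residual matrices
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Weight vector: dominant left singular vector of E'F
        u, s, vt = np.linalg.svd(E.T @ F, full_matrices=False)
        w = u[:, 0]
        t = E @ w                          # component scores
        tt = t @ t
        p = E.T @ t / tt                   # X loadings
        q = F.T @ t / tt                   # Y loadings
        E = E - np.outer(t, p)             # deflate X
        F = F - np.outer(t, q)             # deflate Y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    # Regression coefficients in terms of the original predictors
    B = W @ np.linalg.solve(P.T @ W, Q.T)
    return x_mean, y_mean, B

def plsda_predict(model, classes, X_new):
    """Assign each sample to the class with the largest predicted Y value."""
    x_mean, y_mean, B = model
    Y_hat = (X_new - x_mean) @ B + y_mean
    return [classes[k] for k in Y_hat.argmax(axis=1)]

def simulate(rng, n_per_class=10, n_genes=200, n_informative=10, shift=3.0):
    """Simulated expression matrix: only a few genes differ between classes."""
    labels = ["ER-"] * n_per_class + ["ER+"] * n_per_class
    X = rng.normal(size=(2 * n_per_class, n_genes))
    X[n_per_class:, :n_informative] += shift
    return X, labels

rng = np.random.default_rng(0)
X_train, lab_train = simulate(rng)         # 20 samples, 200 predictors
X_test, lab_test = simulate(rng)           # independent held-out samples
classes, Y_train = dummy_code(lab_train)
model = pls_fit(X_train, Y_train, n_components=2)
pred = plsda_predict(model, classes, X_test)
accuracy = float(np.mean([p == t for p, t in zip(pred, lab_test)]))
print(f"held-out accuracy on simulated data: {accuracy:.2f}")
```

The sketch evaluates predictions on independently simulated samples because, with far more predictors than observations, accuracy on the training samples alone would be overly optimistic. The number of components is fixed at two here purely for simplicity; in practice it would normally be chosen by cross-validation.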

References

  • Alter O, Brown PO, Botstein D (2000) Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci USA 97:10101–10106

  • Charpentier AH, Bednarek AK, Daniel RL, Hawkins KA, Laflin KJ, Gaddis S, MacLeod MC, Aldaz CM (2000) Effects of estrogen on global gene expression: identification of novel targets of estrogen action. Cancer Res 60:5977–5983

  • Datta S (2001) Exploring relationships in gene expressions: a partial least squares approach. Gene Expr 9:249–255

  • De Bruin A, Muller E, Wurm S, Caldelari R, Wyder M, Wheelock MJ, Suter MM (1999) Loss of invasiveness in squamous cell carcinoma cells overexpressing desmosomal cadherins. Cell Adhes Commun 7:13–28

  • Eisen MB, Spellman PT, Brown PO, Botstein D (1998) Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 95:14863–14868

  • Eriksson L, Johansson E, Kettaneh-Wold N, Wold S (1999) Introduction to multi- and megavariate data analysis using projection methods (PCA and PLS). Umetrics, Umea

  • Frank IE, Friedman JH (1993) A statistical view of some chemometrics regression tools. Technometrics 35:109–135

  • Gershon D (2002) Microarray technology: an array of opportunities. Nature 416:885–891

  • Good PI (2000) Permutation tests: a practical guide to resampling methods for testing hypotheses. Springer, New York

  • Gruvberger S, Ringner M, Chen Y, Panavally S, Saal LH, Borg A, Ferno M, Peterson C, Meltzer PS (2001) Estrogen receptor status in breast cancer is associated with remarkably distinct gene expression patterns. Cancer Res 61:5979–5984

  • Hastie T, Tibshirani R, Friedman JH (2001) The elements of statistical learning. Springer, New York

  • Hedenfalk IA, Ringner M, Trent JM, Borg A (2002) Gene expression in inherited breast cancer. Adv Cancer Res 84:1–34

  • Khan J, Wei JS, Ringner M, Saal LH, Ladanyi M, Westermann F, Berthold F, Schwab M, Antonescu CR, Peterson C, Meltzer PS (2001) Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat Med 7:673–679

  • Kinoshita Y, Jarell AD, Flaman JM, Foltz G, Schuster J, Sopher BL, Irvin DK, Kanning K, Kornblum HI, Nelson PS, Hieter P, Morrison RS (2001) Pescadillo, a novel cell cycle regulatory protein abnormally expressed in malignant cells. J Biol Chem 276:6656–6665

  • Knudsen S (2002) A biologist's guide to analysis of DNA microarray data. Wiley, New York

  • Kondo S, Kubota S, Shimo T, Nishida T, Yosimichi G, Eguchi T, Sugahara T, Takigawa M (2002) Connective tissue growth factor increased by hypoxia may initiate angiogenesis in collaboration with matrix metalloproteinases. Carcinogenesis 23:769–776

  • Lakhani SR, Ashworth A (2001) Microarray and histopathological analysis of tumours: the future and the past? Nat Rev Cancer 1:151–157

  • Nguyen DV, Rocke DM (2002) Tumor classification by partial least squares using microarray gene expression data. Bioinformatics 18:39–50

  • Nobori K, Ito H, Tamamori-Adachi M, Adachi S, Ono Y, Kawauchi J, Kitajima S, Marumo F, Isobe M (2002) ATF3 inhibits doxorubicin-induced apoptosis in cardiac myocytes: a novel cardioprotective role of ATF3. J Mol Cell Cardiol 34:1387–1397

  • Osborne CK (1998) Steroid hormone receptors in breast cancer management. Breast Cancer Res Treat 51:227–238

  • Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, Fluge O, Pergamenschikov A, Williams C, Zhu SX, Lonning PE, Borresen-Dale AL, Brown PO, Botstein D (2000) Molecular portraits of human breast tumours. Nature 406:747–752

  • Quackenbush J (2001) Computational analysis of microarray data. Nat Rev Genet 2:418–427

  • Shtil AA, Mandlekar S, Yu R, Walter RJ, Hagen K, Tan TH, Roninson IB, Kong AN (1999) Differential regulation of mitogen-activated protein kinases by microtubule-binding agents in human breast cancer cells. Oncogene 18:377–384

  • Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, Thorsen T, Quist H, Matese JC, Brown PO, Botstein D, Eystein Lonning P, Borresen-Dale AL (2001) Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci USA 98:10869–10874

  • Tenenhaus M (1998) La régression PLS. Editions Technip, Paris

  • Thuillier P, Brash AR, Kehrer JP, Stimmel JB, Leesnitzer LM, Yang P, Newman RA, Fischer SM (2002) Inhibition of PPAR-mediated keratinocyte differentiation by lipoxygenase inhibitors. Biochem J 366:901–910

  • van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, Friend SH (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415:530–536

  • West M, Blanchette C, Dressman H, Huang E, Ishida S, Spang R, Zuzan H, Olson JA Jr, Marks JR, Nevins JR (2001) Predicting the clinical status of human breast cancer by using gene expression profiles. Proc Natl Acad Sci USA 98:11462–11467

  • Wold S, Martens H, Wold H (1983) The multivariate calibration problem in chemistry solved by the PLS method. In: Ruhe A, Kagstrom B (eds) Proc Conf Matrix Pencils. Springer, Heidelberg, pp 286–293

Acknowledgements

We thank A. Børresen-Dale for comments. This work was funded by Action en Bioinformatique, Ministère de la Recherche of France.

Author information

Corresponding author

Correspondence to Miguel Pérez-Enciso.

Cite this article

Pérez-Enciso, M., Tenenhaus, M. Prediction of clinical outcome with microarray data: a partial least squares discriminant analysis (PLS-DA) approach. Hum Genet 112, 581–592 (2003). https://doi.org/10.1007/s00439-003-0921-9

  • DOI: https://doi.org/10.1007/s00439-003-0921-9
