PLS Regression and Hybrid Methods in Genomics Association Studies
Using data from a case-control study on schizophrenia, we demonstrate the use of PLS regression in constructing predictors of a phenotype from single nucleotide polymorphisms (SNPs). We consider the straightforward application of PLS regression as well as two hybrid methods, in which PLS regression scores are used as input for a tree-growing algorithm and a clustering algorithm, respectively. We compare these approaches with other classic predictors used in statistical learning and show that our PLS-based hybrid methods outperform both the classic predictors and straightforward PLS regression.
Key words: PLS Regression, Bagging, SNP, GWAS
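To make the first hybrid idea concrete, the following is a minimal sketch of feeding PLS scores to a tree-like classifier. It is not the authors' pipeline: it assumes a single PLS component, uses a one-split decision stump as a stand-in for a full tree-growing algorithm, and runs on a tiny invented genotype matrix coded as minor-allele counts (0, 1, 2). The names `first_pls_component`, `fit_stump`, `genotypes`, and `phenotype` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: toy genotypes, one PLS component, one-split "tree".
# All names and data here are hypothetical, not taken from the study.


def first_pls_component(X, y):
    """Return (weights, scores) for the first PLS component of X against y."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Each SNP's weight is proportional to its covariance with the phenotype.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores: projection of each centered genotype vector onto the weights.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    return w, t


def fit_stump(scores, labels):
    """Find the single split on the scores with the highest training accuracy."""
    best_acc, best_thr, best_sign = -1.0, None, None
    for thr in sorted(set(scores)):
        for sign in (1, -1):
            preds = [1 if sign * (s - thr) >= 0 else 0 for s in scores]
            acc = sum(p == l for p, l in zip(preds, labels)) / len(labels)
            if acc > best_acc:
                best_acc, best_thr, best_sign = acc, thr, sign
    return best_acc, best_thr, best_sign


# Invented toy data: 8 subjects x 4 SNPs; 1 = case, 0 = control.
genotypes = [
    [2, 1, 0, 1],
    [2, 2, 1, 0],
    [1, 2, 0, 2],
    [2, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 2],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
phenotype = [1, 1, 1, 1, 0, 0, 0, 0]

weights, scores = first_pls_component(genotypes, phenotype)
accuracy, threshold, sign = fit_stump(scores, phenotype)

print("SNP weights:", [round(v, 3) for v in weights])
print("PLS scores:", [round(s, 3) for s in scores])
print("training accuracy of the stump:", accuracy)
```

The accuracy printed here is measured on the same subjects used for fitting, so it is optimistic. A bagged variant would refit the stump (or a deeper tree) on bootstrap resamples of the scores and average the votes, and any honest comparison between predictors would need held-out data.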