Application of support vector regression to genome-assisted prediction of quantitative traits

  • Original Paper
Theoretical and Applied Genetics

Abstract

A byproduct of genome-wide association studies is the possibility of carrying out genome-enabled prediction of disease risk or of quantitative traits. This study is concerned with predicting two quantitative traits, milk yield in dairy cattle and grain yield in wheat, using dense molecular markers as predictors. Two support vector regression (SVR) models, ε-SVR and least-squares SVR, were explored and compared to a widely applied linear regression model, the Bayesian Lasso, which assumes additive marker effects. Predictive performance was measured using predictive correlation and mean squared error of prediction. Depending on the kernel function chosen, SVR can model either linear or nonlinear relationships between phenotypes and marker genotypes. For milk yield, where phenotypes were estimated breeding values of bulls (a linear combination of the data), SVR with a Gaussian radial basis function (RBF) kernel performed slightly better than SVR with a linear kernel, and was similar to the Bayesian Lasso. For the wheat data, where the phenotype was raw grain yield, the RBF kernel provided clear advantages over the linear kernel, e.g., a 17.5% increase in correlation when using the ε-SVR. SVR with an RBF kernel also compared favorably to the Bayesian Lasso in this case. It is concluded that a nonlinear RBF kernel may be an optimal choice for SVR, especially when the phenotypes to be predicted have a nonlinear dependency on genotypes, as may have been the case in the wheat data.
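The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the 0/1/2 marker coding, the `gamma` bandwidth parameter, and the toy numbers are assumptions made for illustration. Predictive correlation is taken here to be the Pearson correlation between observed and predicted phenotypes, and the ε-insensitive loss is the standard loss associated with ε-SVR.

```python
import math


def linear_kernel(x, z):
    """Inner product of two marker-genotype vectors (linear kernel)."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma):
    """Gaussian RBF kernel exp(-gamma * ||x - z||^2).

    gamma is a hypothetical bandwidth parameter that would be tuned in practice.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def epsilon_insensitive_loss(y, y_hat, epsilon):
    """Loss used by epsilon-SVR: residuals smaller than epsilon cost nothing."""
    return max(0.0, abs(y - y_hat) - epsilon)


def predictive_correlation(y, y_hat):
    """Pearson correlation between observed and predicted phenotypes."""
    n = len(y)
    mean_y = sum(y) / n
    mean_h = sum(y_hat) / n
    cov = sum((a - mean_y) * (b - mean_h) for a, b in zip(y, y_hat))
    var_y = sum((a - mean_y) ** 2 for a in y)
    var_h = sum((b - mean_h) ** 2 for b in y_hat)
    return cov / math.sqrt(var_y * var_h)


def mean_squared_error_of_prediction(y, y_hat):
    """Average squared difference between observed and predicted phenotypes."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)


if __name__ == "__main__":
    # Toy genotypes coded as 0/1/2 allele counts (an assumed coding).
    g1 = [0, 1, 2]
    g2 = [2, 1, 0]
    print("linear kernel:", linear_kernel(g1, g2))
    print("RBF kernel:", rbf_kernel(g1, g2, gamma=0.5))

    # Toy observed and predicted phenotypes for a hypothetical testing set.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.2, 1.9, 3.3, 3.8]
    print("predictive correlation:", predictive_correlation(observed, predicted))
    print("MSEP:", mean_squared_error_of_prediction(observed, predicted))
```

The linear kernel depends on genotypes only through their inner product, whereas the RBF kernel depends on the squared distance between genotype vectors, which is what lets SVR capture nonlinear genotype-phenotype relationships when that kernel is chosen.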

Acknowledgments

This work was supported by the Wisconsin Agriculture Experiment Station, Aviagen Ltd., and by grants NRICGP/USDA 2003-35205-12833, NSF DEB-0089742 and NSF DMS-044371. We thank the editor and reviewers for their insightful comments.

Author information

Corresponding author

Correspondence to Nanye Long.

Additional information

Communicated by C. Schön.

About this article

Cite this article

Long, N., Gianola, D., Rosa, G.J.M. et al. Application of support vector regression to genome-assisted prediction of quantitative traits. Theor Appl Genet 123, 1065–1074 (2011). https://doi.org/10.1007/s00122-011-1648-y

  • DOI: https://doi.org/10.1007/s00122-011-1648-y
