Flexible Modelling of Genetic Effects on Function-Valued Traits

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9649)

Abstract

Genome-wide association studies commonly examine one trait at a time. Occasionally they examine several related traits in the hope of increasing power; in such a setting, the traits generally do not vary smoothly along any axis such as time or space. For function-valued traits, however, the trait often varies smoothly along the axis of interest. For instance, for longitudinal traits such as growth curves, the axis of interest is time; for spatially varying traits such as chromatin accessibility, it is position along the genome. Although there have been efforts to perform genome-wide association studies with such function-valued traits, the statistical approaches developed for this purpose often have limitations, such as requiring the trait to behave linearly in time or space, or constraining the genetic effect itself to be constant or linear in time. Herein, we present a flexible model for this problem, the Partitioned Gaussian Process, which removes many such limitations and is especially effective as the number of time points increases. The theoretical basis of this model provides machinery for handling missing and unaligned function values, such as would occur when not all individuals are measured at the same time points. Further, we make use of algebraic re-factorizations to reduce the time complexity of our model substantially below that of the naive implementation. Finally, we apply our approach and several others to synthetic data before closing with some directions for improved modelling and statistical testing.

Notes

  1. If \(\mathbf{J}_N\) were an arbitrary matrix, the time complexity would be \(O(N^3 + T^3)\), but because the spectral decomposition of \(\mathbf{J}_N\) can be computed once and cached, the complexity becomes \(O(T^3)\). Moreover, because it is an all-ones matrix, its spectral decomposition can be computed more efficiently than in the general case.
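To make the last point of the note concrete, the following is a minimal sketch of one way the spectral decomposition of the all-ones matrix could be obtained in closed form. It is an illustration only, not the authors' implementation: the function name `all_ones_eigendecomposition`, the use of a Householder reflection to complete the eigenbasis, and the use of `functools.lru_cache` for the caching step are all assumptions made here. The sketch relies on the fact that \(\mathbf{J}_N = \mathbf{1}\mathbf{1}^\top\) has rank one, with eigenvalue \(N\) for the eigenvector \(\mathbf{1}/\sqrt{N}\) and eigenvalue \(0\) with multiplicity \(N-1\).

```python
from functools import lru_cache

import numpy as np


@lru_cache(maxsize=None)  # assumption: cache per N so the work is done once
def all_ones_eigendecomposition(n):
    """Return (eigenvalues, Q) such that Q @ diag(eigenvalues) @ Q.T is the
    n-by-n all-ones matrix, without calling a general eigensolver."""
    # Rank-one structure: one eigenvalue equal to n, the rest are zero.
    eigenvalues = np.zeros(n)
    eigenvalues[0] = n

    # Unit eigenvector for the non-zero eigenvalue.
    u = np.full(n, 1.0 / np.sqrt(n))

    # Householder reflection mapping e_1 to u. It is orthogonal, so its
    # first column is u and its remaining columns span the null space.
    w = -u
    w[0] += 1.0  # w = e_1 - u
    norm_sq = w @ w
    if norm_sq == 0.0:  # n == 1, where e_1 already equals u
        return eigenvalues, np.eye(n)
    Q = np.eye(n) - (2.0 / norm_sq) * np.outer(w, w)
    return eigenvalues, Q


eigenvalues, Q = all_ones_eigendecomposition(6)
print(np.allclose(Q @ np.diag(eigenvalues) @ Q.T, np.ones((6, 6))))
```

Forming `Q` this way costs \(O(N^2)\) operations, compared with \(O(N^3)\) for a general symmetric eigensolver. Because the cached arrays are shared between calls, callers should treat them as read-only.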

Acknowledgments

We thank Leigh Johnston, Ciprian Crainiceanu, Bobby Kleinberg and Praneeth Netrapalli for discussion; the anonymous reviewers for helpful feedback; and Carl Kadie for use of his HPC cluster code. Funding for CARe genotyping was provided by NHLBI Contract N01-HC-65226.

Author information

Corresponding authors

Correspondence to Nicolo Fusi or Jennifer Listgarten.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Fusi, N., Listgarten, J. (2016). Flexible Modelling of Genetic Effects on Function-Valued Traits. In: Singh, M. (eds) Research in Computational Molecular Biology. RECOMB 2016. Lecture Notes in Computer Science, vol 9649. Springer, Cham. https://doi.org/10.1007/978-3-319-31957-5_7

  • DOI: https://doi.org/10.1007/978-3-319-31957-5_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31956-8

  • Online ISBN: 978-3-319-31957-5

  • eBook Packages: Computer Science, Computer Science (R0)
