Modeling Dependence in Evolutionary Inference for Proteins

  • Gary Larson
  • Jeffrey L. Thorne
  • Scott Schmidler
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10812)

Abstract

Protein structure alignment is a classic problem in computational biology and is widely used to identify structural and functional similarity and to infer homology among proteins. A statistical model for protein structural evolution was previously introduced and shown to significantly improve phylogenetic inference compared to approaches that use only amino acid sequence information. Here we extend this model to account for correlated evolutionary drift among neighboring amino acid positions, resulting in a spatio-temporal model of protein structure evolution. The result is a multivariate diffusion process convolved with a spatial birth-death process, which adds little computational cost or analytical complexity relative to the site-independent model (SIM). We demonstrate that this extended, site-dependent model (SDM) yields a significant reduction of bias in estimated evolutionary distances and helps further improve phylogenetic tree reconstruction.
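The contrast between site-independent and site-dependent drift can be illustrated with a toy simulation. The sketch below is not the model from the paper: it assumes plain Brownian drift in one coordinate, a hypothetical neighbor covariance of the form sigma2 * t * rho^|i-j|, and it omits the birth-death (insertion/deletion) component entirely. The names correlated_drift, neighbor_correlation, sigma2 and rho are introduced here for illustration only.

```python
import math
import random


def correlated_drift(n_sites, t, sigma2=1.0, rho=0.0, rng=None):
    """Displacement of each site after time t under toy Brownian drift.

    Assumed (hypothetical) covariance between sites i and j:
        Cov(x_i, x_j) = sigma2 * t * rho ** abs(i - j)
    rho = 0 gives independent sites; rho > 0 couples neighbors.
    """
    rng = rng or random.Random()
    scale = math.sqrt(sigma2 * t)
    innovation = math.sqrt(1.0 - rho * rho)
    # AR(1) recursion along the chain: unit marginal variance at every
    # site, correlation rho ** k between sites k positions apart.
    noise = [rng.gauss(0.0, 1.0)]
    for _ in range(n_sites - 1):
        noise.append(rho * noise[-1] + innovation * rng.gauss(0.0, 1.0))
    return [scale * e for e in noise]


def neighbor_correlation(samples):
    """Pearson correlation between adjacent sites, pooled over samples."""
    xs, ys = [], []
    for s in samples:
        xs.extend(s[:-1])
        ys.extend(s[1:])
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    rng = random.Random(1)
    for rho in (0.0, 0.8):
        draws = [correlated_drift(50, t=1.0, rho=rho, rng=rng)
                 for _ in range(500)]
        print(f"rho={rho}: neighbor correlation "
              f"{neighbor_correlation(draws):.2f}")
```

With rho set to zero the displacements at adjacent sites are uncorrelated, which corresponds to the site-independent assumption; a positive rho makes neighboring sites drift together, which is the kind of dependence the abstract describes.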

Keywords

Protein structure, Evolution, Dynamic programming, Phylogeny, Diffusion process

Notes

Acknowledgments

This work was partially supported by NSF grant DMS-1407622 and NIH grant R01-GM090201 (S.C.S.). Jeffrey L. Thorne was supported by NIH grant GM118508. Gary Larson was partially supported by NSF training grant DMS-1045153 (S.C.S.).

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Gary Larson (1)
  • Jeffrey L. Thorne (2)
  • Scott Schmidler (3)
  1. Department of Statistical Science, Duke University, Durham, USA
  2. Departments of Biological Sciences and Statistics, North Carolina State University, Raleigh, USA
  3. Departments of Statistical Science and Computer Science, Duke University, Durham, USA