Extracting Coevolving Characters from a Tree of Species

  • Chapter
Discrete and Topological Models in Molecular Biology

Part of the book series: Natural Computing Series (NCS)

Abstract

Phylogenetic proximity has guided our understanding of the evolution of species for decades. It is now clear that the paradigm “phylogenetically close species should share similar characters” is just one facet of the complex process of evolution inherent in development and species differentiation. There is a need for novel mathematical approaches that cluster symbolic information organized into trees of characters, so as to highlight the evolutionary relations between characters and the processes by which characters coevolve. We propose a combinatorial method to do so and to derive groups of characters that appear to be correlated through their evolutionary history. The approach was first developed for protein sequences, but it has proved to be general and applicable to any list of characters describing species. In particular, one does not need to know all characters for all species to perform a coevolution analysis.

Author information

Corresponding author

Correspondence to Alessandra Carbone.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Carbone, A. (2014). Extracting Coevolving Characters from a Tree of Species. In: Jonoska, N., Saito, M. (eds) Discrete and Topological Models in Molecular Biology. Natural Computing Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40193-0_3

  • DOI: https://doi.org/10.1007/978-3-642-40193-0_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40192-3

  • Online ISBN: 978-3-642-40193-0

  • eBook Packages: Computer Science (R0)
