Irreplaceable Amino Acids and Reduced Alphabets in Short-Term and Directed Protein Evolution

  • Conference paper
Bioinformatics Research and Applications (ISBRA 2009)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5542)

Abstract

In this paper we extend the definition of codon volatility to reduced amino acid alphabets in order to characterize mutations that conserve physicochemical properties. We also define the average relative changeability of amino acids in terms of single-base codon self-substitution frequencies (identities), which are taken from an empirical codon substitution matrix [14]. We show that this index splits the amino acids into two groups: replaceable and irreplaceable. The same grouping is obtained from the size/complexity index introduced by Dufton [32]. In addition, a 71% agreement is obtained with residues in mutually persistent conserved (MPC) positions [31], which play a key role in determining fold and function. The remaining 29% can be readily explained. Finally, 75% of the residues ranked highest according to MPC positions have the highest probability of causing disease if mutated.
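The idea of extending codon volatility to a reduced alphabet can be illustrated with a minimal Python sketch. It is not the authors' implementation: the full definition is not reproduced on this page, so the code assumes one common formulation of volatility (the fraction of a codon's single-base, non-stop neighbours that encode a different amino acid) and counts a change only when the neighbour falls in a different group. The four-group alphabet `REDUCED_ALPHABET` and the names `GROUP_OF`, `neighbours`, and `volatility` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical four-letter reduced alphabet (an assumption, not the paper's).
REDUCED_ALPHABET = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGP",
    "positive": "KRH",
    "negative": "DE",
}
GROUP_OF = {aa: name for name, members in REDUCED_ALPHABET.items()
            for aa in members}


def neighbours(codon):
    """Yield the nine codons that differ from `codon` by a single base."""
    for i, base in enumerate(codon):
        for other in BASES:
            if other != base:
                yield codon[:i] + other + codon[i + 1:]


def volatility(codon, group=None):
    """Fraction of non-stop single-base neighbours whose class differs.

    With group=None each amino acid is its own class (ordinary volatility);
    with a mapping such as GROUP_OF only changes of group are counted.
    Intended for sense codons only.
    """
    group = group or {}
    own = CODON_TABLE[codon]
    sense = [CODON_TABLE[n] for n in neighbours(codon) if CODON_TABLE[n] != "*"]
    changed = sum(1 for aa in sense if group.get(aa, aa) != group.get(own, own))
    return changed / len(sense)


if __name__ == "__main__":
    print(volatility("TTT"))            # amino acid level
    print(volatility("TTT", GROUP_OF))  # reduced-alphabet level
```

Under these assumptions, the phenylalanine codon TTT has eight of nine neighbours encoding a different amino acid, but only two of nine (the Ser and Tyr neighbours) leave the hydrophobic group, so the reduced-alphabet value singles out the substitutions that alter physicochemical class.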

References

  1. Shih, A.C.-C., Hsiao, T.-C., Ho, M.-S., Li, W.-H.: Simultaneous amino acid substitutions at antigenic sites drive influenza A hemagglutinin evolution. Proc. Natl. Acad. Sci. USA 104(15), 6283–6288 (2007)

  2. Clark, L.A., Ganesan, S., Papp, S., van Vlijmen, H.W.T.: Trends in Antibody Sequence Changes during the Somatic Hypermutation Process. The Journal of Immunology 177, 333–340 (2006)

  3. Keefe, A.D., Szostak, J.W.: Functional proteins from a random-sequence library. Nature 410, 715–718 (2001)

  4. Arnold, F.H.: Design by Directed Evolution. Accounts of Chemical Research 31(3), 125–131 (1998)

  5. Orencia, M.C., Yoon, J.S., Ness, J.E., Stemmer, W.P.C., Stevens, R.C.: Predicting the emergence of antibiotic resistance by directed evolution and structural analysis. Nature Structural Biology 8(3), 238–242 (2001)

  6. Vitkup, D., Sander, C., Church, G.M.: The amino-acid mutational spectrum of human genetic disease. Genome Biology 4, R72 (2003)

  7. Liò, P., Goldman, N.: Models of molecular evolution and phylogeny. Genome Res. 8, 1233–1244 (1998)

  8. Kosiol, C., Holmes, I., Goldman, N.L.: An empirical codon model for protein sequence evolution. Mol. Biol. Evol. 24(7), 1464–1479 (2007)

  9. Yampolsky, L.Y., Stoltzfus, A.: The exchangeability of amino acids in proteins. Genetics 170, 1459–1472 (2005)

  10. Jiménez-Montaño, M.A., de la Mora-Basáñez, R., Pöschel, T.: The Hypercube Structure of the Genetic Code Explains Conservative and Non-Conservative Amino Acid Substitutions in Vivo and in Vitro. BioSystems 39, 117–125 (1996)

  11. Karasev, V.A., Sorokin, S.G.: Topological structure of the genetic code. Russian Journal of Genetics 33, 622–628 (1997)

  12. He, M.X., Petoukhov, S.V., Ricci, P.E.: Genetic code, Hamming distance and stochastic matrices. Bull. Math. Biology 66(5), 1405–1421 (2004)

  13. Hershberg, U., Shlomchik, M.J.: Differences in potential for amino acid change after mutation reveals distinct strategies for κ and λ light-chain variation. Proc. Natl. Acad. Sci. USA 103(43), 15963–15968 (2006)

  14. Schneider, A., Cannarozzi, G.M., Gonnet, G.H.: Empirical codon substitution matrix. BMC Bioinformatics 6, 134 (2005)

  15. Doron-Faigenboim, A., Pupko, T.: A combined empirical and mechanistic codon model. Mol. Biol. Evol. 24(2), 388–397 (2007)

  16. Plotkin, J.B., Dushoff, J.: Codon bias and frequency-dependent selection on the hemagglutinin epitopes of influenza A virus. Proc. Natl. Acad. Sci. USA 100(12), 7152–7157 (2003)

  17. Plotkin, J.B., Dushoff, J., Fraser, H.B.: Detecting selection using a single genome sequence of M. tuberculosis and P. falciparum. Nature 428, 942–945 (2004)

  18. Grantham, R.: Amino Acid Difference Formula to Help Explain Protein Evolution. Science 185(4154), 862–864 (1974)

  19. Miyata, T., Miyazawa, S., Yasunaga, T.: Two types of amino acid substitutions in protein evolution. J. Mol. Evol. 12, 219–236 (1979)

  20. Cannata, N., Toppo, S., Romualdi, C., Valle, G.: Simplifying amino acid alphabets by means of a branch and bound algorithm and substitution matrices. Bioinformatics 18, 1102–1108 (2002)

  21. Murphy, L.R., Wallqvist, A., Levy, R.M.: Simplified amino acid alphabets for protein fold recognition and implications for folding. Protein Eng. 13(3), 149–152 (2000)

  22. Fan, K., Wang, W.: What is the minimum number of letters required to fold a protein? J. Mol. Biol. 328, 921–926 (2003)

  23. Albatineh, A., Razeghifard, R.: Clustering Amino Acids Using Maximum Clusters Similarity. In: Doble, M., Loging, W., Malone, J., Tseng, V.S.-M. (eds.) Proc. 2008 International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC 2008), pp. 87–92. ISRST, USA (2008)

  24. Jiménez-Montaño, M.A.: On the syntactic structure of protein sequences and the concept of grammar complexity. Bull. Math. Biol. 46(4), 641–659 (1984)

  25. Zhou, H., Zhou, Y.: Quantifying the effect of burial amino acid residues on protein stability. PROTEINS: Structure, Function, and Bioinformatics 54, 315–322 (2004)

  26. Burks, E.A., Chen, G., Georgiou, G., Iverson, B.L.: In vitro scanning saturation mutagenesis of an antibody binding pocket. Proc. Natl. Acad. Sci. USA 94, 412–417 (1997)

  27. Volkenstein, M.V.: Mutations and the value of information. J. Theor. Biol. 80, 155–169 (1979)

  28. Bachinsky, A., Ratner, V.: Biomed. Zs. 18, 53 (1976) (in Russian)

  29. Dayhoff, M. (ed.): Atlas of protein sequence and structure. Nat. Biomed. Res. Found. (1972)

  30. Luo, L.F.: The degeneracy rule of genetic code. Origins of Life and Evolution of the Biosphere 18, 65–70 (1988)

  31. Friedberg, I., Margalit, H.: Persistently conserved positions in structurally similar, sequence dissimilar proteins: Roles in preserving protein fold and function. Protein Science 11, 350–360 (2002)

  32. Dufton, M.J.: Genetic code synonym quotas and amino acid complexity: Cutting the cost of proteins? J. Theor. Biol. 187, 165–173 (1997)

  33. Papentin, F.: On order and complexity. II. Application to chemical and biochemical structures. J. Theor. Biol. 95(2), 225–245 (1982)

  34. Jones, D.T., Taylor, W.R., Thornton, J.: The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8, 275–282 (1992)

  35. Tourasse, N.J., Li, W.-H.: Selective constraints, amino acid composition, and the rate of protein evolution. Mol. Biol. Evol. 17(4), 656–664 (2000)

  36. Wang, Z., Moult, J.: SNPs, protein structure, and disease. Hum. Mutat. 17, 263–270 (2001)

  37. Li, W.-H., Wu, C.-I., Luo, C.-C.: A new method for estimating synonymous and non-synonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 2, 150–174 (1985)

  38. Jiménez-Montaño, M.A., Ramos-Fernandez, A.: An empirical method to identify positively selected sites in antigenic evolution. In: Argüello-Astorga, G.R., González, R.A., Méndez Salinas, E. (eds.) e-Proc. V National Congress of Virology. Sociedad Mexicana de Bioquimica, Mexico (2007)

  39. Cargill, M., Altshuler, D., Ireland, J., Sklar, P., Ardlie, K., Patil, N., Shaw, N., Lane, C.R., Lim, E.P., Kalyanaraman, N., et al.: Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat. Genet. 22, 231–238 (1999)

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jiménez-Montaño, M.A., He, M. (2009). Irreplaceable Amino Acids and Reduced Alphabets in Short-Term and Directed Protein Evolution. In: Măndoiu, I., Narasimhan, G., Zhang, Y. (eds) Bioinformatics Research and Applications. ISBRA 2009. Lecture Notes in Computer Science, vol 5542. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01551-9_29

  • DOI: https://doi.org/10.1007/978-3-642-01551-9_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01550-2

  • Online ISBN: 978-3-642-01551-9

  • eBook Packages: Computer Science, Computer Science (R0)
