On the syntactic structure of protein sequences and the concept of grammar complexity

Bulletin of Mathematical Biology

Abstract

It is shown that the concepts of grammar complexity and syntactic structure provide a useful mathematical framework for the investigation of some current problems in protein structure. Grammar complexity gives a measure of the degree of aperiodicity of a sequence and also an optimization criterion for evaluating amino acid categorizations. Three systems of amino acid categorization are compared in relation to their value for describing molecular architecture.
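To make the idea in the abstract concrete, the following is a minimal sketch of a grammar-size estimate for a sequence. It is not the procedure used in the article, which is not visible in this preview. It swaps in a simple greedy pair-substitution heuristic (in the spirit of Re-Pair compression) to approximate the size of a small context-free grammar that generates exactly one string. The function names `count_pairs`, `grammar_size` and `recode`, the two-class residue grouping, and the example peptide are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def count_pairs(seq):
    """Count non-overlapping occurrences of each adjacent symbol pair."""
    counts = Counter()
    last_pos = {}
    for i in range(len(seq) - 1):
        pair = (seq[i], seq[i + 1])
        # In a run such as 'aaa', the pair at i overlaps the one counted at i-1.
        if last_pos.get(pair) == i - 1:
            continue
        counts[pair] += 1
        last_pos[pair] = i
    return counts


def grammar_size(sequence):
    """Heuristic grammar size: length of the start rule plus 2 per pair rule.

    Assumption: greedy replacement of the most frequent repeated pair by a
    fresh non-terminal. This gives an upper bound on grammar complexity,
    not the true minimum.
    """
    seq = list(sequence)
    n_rules = 0
    while True:
        counts = count_pairs(seq)
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:
            break  # no pair repeats, so no further compression is possible
        symbol = ("N", n_rules)  # tuple, so it cannot clash with a residue letter
        n_rules += 1
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return len(seq) + 2 * n_rules


def recode(sequence, groups):
    """Map each residue to the label of its category (a reduced alphabet)."""
    lookup = {aa: label for label, members in groups.items() for aa in members}
    return "".join(lookup[aa] for aa in sequence)


if __name__ == "__main__":
    # Hypothetical two-class grouping, for illustration only; it is not one
    # of the three categorizations compared in the article.
    groups = {"h": "AVLIMFWC", "p": "GSTYNQDEKRHP"}
    peptide = "AVKDLIESGAVKDLIETG"  # made-up sequence
    print(grammar_size(peptide), grammar_size(recode(peptide, groups)))
```

Under this heuristic a strictly periodic string compresses to a much smaller grammar than a string with no repeated pairs, which is the sense in which grammar size tracks aperiodicity. Recoding a sequence with different residue groupings and comparing the resulting sizes is one way such a measure could serve as a criterion for ranking categorizations.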

Cite this article

Jiménez-Montaño, M.A. On the syntactic structure of protein sequences and the concept of grammar complexity. Bltn Mathcal Biology 46, 641–659 (1984). https://doi.org/10.1007/BF02459508

  • DOI: https://doi.org/10.1007/BF02459508