Abstract
It is shown that the concepts of grammar complexity and syntactic structure provide a useful mathematical framework for the investigation of some current problems in protein structure. Grammar complexity gives a measure of the degree of aperiodicity of a sequence and also an optimization criterion for evaluating amino acid categorizations. Three systems of amino acid categorization are compared in relation to their value for describing molecular architecture.
Cite this article
Jiménez-Montaño, M.A. On the syntactic structure of protein sequences and the concept of grammar complexity. Bulletin of Mathematical Biology 46, 641–659 (1984). https://doi.org/10.1007/BF02459508
DOI: https://doi.org/10.1007/BF02459508