Abstract
Spatial structures of transmembrane proteins are difficult to obtain by either experimental or computational methods. Recognizing the conformations of helix-helix contacts, which provide the structural skeleton of many transmembrane proteins, is essential for modeling them. The majority of helix-helix interactions in transmembrane proteins can be accurately clustered into a few classes on the basis of their 3D shape. We propose a Stochastic Context-Free Grammar framework, combined with an evolutionary algorithm, to represent sequence-level features of these classes. The descriptors were tested on independent test sets and typically achieved areas under the ROC curve of 0.60-0.70; some reached 0.77.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dyrka, W., Nebel, J.-C., Kotulska, M. (2010). Towards 3D Modeling of Interacting TM Helix Pairs Based on Classification of Helix Pair Sequence. In: Dijkstra, T.M.H., Tsivtsivadze, E., Marchiori, E., Heskes, T. (eds) Pattern Recognition in Bioinformatics. PRIB 2010. Lecture Notes in Computer Science, vol 6282. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16001-1_6
DOI: https://doi.org/10.1007/978-3-642-16001-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16000-4
Online ISBN: 978-3-642-16001-1
eBook Packages: Computer Science (R0)