Abstract
Structural studies of proteins for motif mining and other pattern recognition techniques require abstracting the structure into simpler elements that can be matched robustly. In this study, we propose the use of bond-orientational order parameters, a well-established metric usually employed to compare atom packing in crystals and liquids. Computing a vector of orientational order parameters of residue centers in a sliding window provides a descriptor of the local structure and connectivity around each residue that is easy to calculate and compare. To test whether this representation is applicable to protein structures, we predicted the secondary structure of protein segments from these descriptors, achieving an AUC (area under the ROC curve) of 0.99. Clustering the descriptors into 6 clusters also yields an AUC of 0.93, showing that these descriptors capture and distinguish local structural information.
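As an illustration of the descriptor described in the abstract, the following is a minimal sketch of bond-orientational order parameters computed in a sliding window. It is not the authors' implementation. The function names `bond_order` and `window_descriptor`, the window half-width of 3, and the degrees l = 2, 4, 6 are all assumptions made for this example. Instead of evaluating spherical harmonics directly, the sketch uses the addition theorem, which gives Q_l^2 as the mean of the Legendre polynomial P_l over the cosines of the angles between all pairs of bonds.

```python
import numpy as np
from numpy.polynomial import legendre


def bond_order(bonds, l):
    """Steinhardt order parameter Q_l for a set of bond vectors (one per row).

    Uses the spherical-harmonic addition theorem:
    Q_l^2 = (1 / N^2) * sum_ij P_l(cos(angle between bond i and bond j)).
    """
    bonds = np.asarray(bonds, dtype=float)
    unit = bonds / np.linalg.norm(bonds, axis=1, keepdims=True)
    cosines = np.clip(unit @ unit.T, -1.0, 1.0)
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0  # coefficient vector that selects P_l
    q_squared = legendre.legval(cosines, coeffs).mean()
    return float(np.sqrt(max(q_squared, 0.0)))  # guard against tiny negatives


def window_descriptor(centers, half_width=3, degrees=(2, 4, 6)):
    """Vector of Q_l values for each residue center (hypothetical windowing).

    Bonds are drawn from residue i to every other residue in the window
    [i - half_width, i + half_width]. Residues too close to either end of
    the chain to have a full window are left as NaN.
    """
    centers = np.asarray(centers, dtype=float)
    n = len(centers)
    out = np.full((n, len(degrees)), np.nan)
    for i in range(half_width, n - half_width):
        idx = [j for j in range(i - half_width, i + half_width + 1) if j != i]
        bonds = centers[idx] - centers[i]
        out[i] = [bond_order(bonds, l) for l in degrees]
    return out
```

Calling `window_descriptor(coords)` on an (n, 3) array of residue-center coordinates returns an (n, 3) array with one row per residue. These rows could then be passed to a classifier or a clustering step, as the abstract describes.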
Keywords
- bond-orientational order
- secondary structure
- machine learning
- structural alphabet
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Meydan, C., Sezerman, O.U. (2012). Representation of Protein Secondary Structure Using Bond-Orientational Order Parameters. In: Shibuya, T., Kashima, H., Sese, J., Ahmad, S. (eds) Pattern Recognition in Bioinformatics. PRIB 2012. Lecture Notes in Computer Science, vol 7632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34123-6_17
DOI: https://doi.org/10.1007/978-3-642-34123-6_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34122-9
Online ISBN: 978-3-642-34123-6
eBook Packages: Computer Science, Computer Science (R0)
Published in cooperation with
http://www.iapr.org/
