On the Convergence of Protein Structure and Dynamics. Statistical Learning Studies of Pseudo Folding Pathways
Many algorithms that attempt to predict a protein's native structure from its sequence must generate a large set of hypotheses to ensure that nearly correct structures are included, which raises the problem of assessing the quality of alternative 3D conformations. This problem has mostly been approached by focusing on the final 3D conformation, with machine learning techniques playing a leading role. We argue in this paper that additional information for recognising native-like structures can be obtained by regarding the final conformation as the result of a generative process reminiscent of the folding process that generates structures in nature. We introduce a coarse representation of protein pseudo-folding based on binary trees and define a kernel function for assessing their similarity. Kernel-based analysis techniques empirically demonstrate a significant correlation between the information contained in pseudo-folding trees and features of native folds in a large, non-redundant set of proteins.
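To make the idea of a kernel over binary trees concrete, the following minimal sketch compares two labelled binary trees by counting the complete subtrees they share. It is an illustration only and is not the kernel defined in this paper: the `Node` class, the single-letter labels (`H`, `E`, `M`) and the function names are hypothetical, and the generic subtree-counting kernel is swapped in because the abstract does not specify the actual construction.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical binary tree node; labels must not contain spaces or parentheses."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def subtree_signatures(root: Optional[Node]) -> Counter:
    """Count every complete subtree of the tree, keyed by a canonical string."""
    counts: Counter = Counter()

    def visit(node: Optional[Node]) -> str:
        if node is None:
            return "-"
        sig = f"({node.label} {visit(node.left)} {visit(node.right)})"
        counts[sig] += 1
        return sig

    visit(root)
    return counts


def subtree_kernel(a: Node, b: Node) -> int:
    """Dot product of subtree-count vectors (illustrative, not the paper's kernel)."""
    ca, cb = subtree_signatures(a), subtree_signatures(b)
    return sum(n * cb[sig] for sig, n in ca.items())


def normalized_kernel(a: Node, b: Node) -> float:
    """Cosine-normalised kernel, so that a tree compared with itself scores 1."""
    return subtree_kernel(a, b) / math.sqrt(subtree_kernel(a, a) * subtree_kernel(b, b))


# Two toy trees: a merge node "M" joining two labelled leaves.
t1 = Node("M", Node("H"), Node("E"))
t2 = Node("M", Node("H"), Node("H"))
similarity = normalized_kernel(t1, t2)
```

Because the kernel is an inner product of count vectors, it is symmetric and positive semidefinite, which is the property that kernel-based analysis techniques rely on.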