Improving Multi-Relief for Detecting Specificity Residues from Multiple Sequence Alignments
A challenging problem in bioinformatics is the detection of residues that account for protein function specificity, both to gain deeper insight into the nature of functional specificity and to guide protein engineering experiments aimed at switching the specificity of an enzyme, regulator or transporter. The majority of state-of-the-art algorithms for this task use multiple sequence alignments (MSAs) to identify residue positions that are conserved within protein subfamilies and divergent between them. In this study, we focus on multi-RELIEF, a recent method based on this approach. We analyze and modify the two core parts of the method in order to improve its predictive performance. First, a parametric generalization of the popular RELIEF machine learning algorithm for weighting residues is introduced and incorporated in multi-RELIEF. Second, the ensemble criterion that multi-RELIEF uses to merge the weights of multiple runs is simplified. Finally, the method used by multi-RELIEF for exploiting tertiary structure information is modified by incorporating prior information describing the confidence of the original scores assigned to residues. Extensive computational experiments on six real-life datasets show that the new multi-RELIEF improves on the original method in both robustness and detection capability.
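To make the idea of weighting residue positions concrete, the following is a minimal sketch of the classic RELIEF update applied to a toy alignment with two subfamilies. It is not the parametric generalization or the ensemble procedure introduced in this study; the function names, the Hamming distance, the toy sequences and the choice to visit every sequence once are all assumptions made for illustration.

```python
def hamming(a, b):
    """Number of alignment columns at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def relief_weights(seqs, labels):
    """Classic RELIEF over categorical columns, visiting every sequence once.

    For each sequence, find its nearest hit (same subfamily) and nearest miss
    (other subfamily). A column gains weight when it differs from the miss
    and loses weight when it differs from the hit.
    """
    n_cols = len(seqs[0])
    weights = [0.0] * n_cols
    for i, seq in enumerate(seqs):
        hits = [s for j, s in enumerate(seqs) if j != i and labels[j] == labels[i]]
        misses = [s for j, s in enumerate(seqs) if labels[j] != labels[i]]
        hit = min(hits, key=lambda s: hamming(seq, s))
        miss = min(misses, key=lambda s: hamming(seq, s))
        for c in range(n_cols):
            weights[c] += (seq[c] != miss[c]) - (seq[c] != hit[c])
    # Average over the number of visited sequences.
    return [w / len(seqs) for w in weights]


# Hypothetical toy alignment (0-based columns): column 1 separates the two
# subfamilies, column 2 varies within each subfamily, columns 0 and 3 are
# fully conserved.
seqs = ["AKLG", "AKLG", "AKMG", "ARLG", "ARLG", "ARMG"]
labels = ["A", "A", "A", "B", "B", "B"]
weights = relief_weights(seqs, labels)
```

In this sketch the column that is conserved within each subfamily and divergent between them receives the highest weight, the column that varies inside a subfamily is penalized, and fully conserved columns stay at zero.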