A Consensus Approach to Predicting Protein Contact Map via Logistic Regression
Predicting protein contact maps is of great importance because it can facilitate and improve the prediction of protein 3D structure. However, prediction accuracy is known to be rather low. In this paper, a consensus contact map prediction method called LRcon is developed, which combines the prediction results from several complementary predictors using a logistic regression model. Tests on targets from the recent CASP9 experiment and on a large dataset, D856, consisting of 856 protein chains show that LRcon outperforms not only its component predictors but also simple averaging and voting schemes. For example, LRcon achieves 41.5% accuracy on the D856 dataset for the top L/10 long-range contact predictions (where L is the protein length), which is about 5% higher than its best-performing component predictor. The improvements made by LRcon are mainly attributed to the application of a consensus approach to complementary predictors and to logistic regression analysis within a machine learning framework.
Keywords: Protein contact map, CASP, Logistic regression, Machine learning
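The consensus idea described in the abstract can be illustrated with a small sketch. It is not LRcon itself: the abstract does not specify the component predictors, the input features, or the training procedure, so everything below is an assumption made for illustration. Each residue pair is represented by the scores of two hypothetical component predictors, a logistic regression is fitted by plain gradient descent, and a top L/10 accuracy is computed over long-range pairs. The function names, the toy data, and the sequence-separation threshold `min_sep=24` are all assumed values, not taken from the paper.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def consensus_score(w, b, x):
    """Combine the component predictor scores x into one contact probability."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def top_accuracy(scored_pairs, true_contacts, L, frac=10, min_sep=24):
    """Fraction of true contacts among the top L/frac long-range predictions.

    min_sep is an assumed long-range threshold on |i - j|.
    """
    long_range = [(s, i, j) for (i, j), s in scored_pairs.items()
                  if abs(i - j) >= min_sep]
    long_range.sort(reverse=True)
    top = long_range[:max(1, L // frac)]
    hits = sum(1 for _, i, j in top if (i, j) in true_contacts)
    return hits / len(top)


# Toy training set (hypothetical): each row holds the scores that two
# component predictors assign to one residue pair; y marks true contacts.
X = [[0.9, 0.8], [0.8, 0.7], [0.7, 0.9], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
y = [1, 1, 1, 0, 0, 0]
w, b = fit_logistic(X, y)

# Score residue pairs (i, j) of a hypothetical protein of length L = 20.
component_scores = {(1, 30): [0.9, 0.7], (2, 40): [0.6, 0.8],
                    (5, 10): [0.9, 0.9], (3, 50): [0.1, 0.2]}
consensus = {pair: consensus_score(w, b, s)
             for pair, s in component_scores.items()}
accuracy = top_accuracy(consensus, {(1, 30), (5, 10)}, L=20)
```

Unlike simple averaging, which gives every component predictor the same weight, the fitted coefficients `w` let more reliable predictors contribute more to the consensus score. In practice a library implementation of logistic regression would normally replace the hand-written gradient descent shown here.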