A Consensus Approach to Predicting Protein Contact Map via Logistic Regression

Conference paper in: Bioinformatics Research and Applications (ISBRA 2011)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 6674)

Abstract

Predicting a protein's contact map is of great importance because it can facilitate and improve the prediction of the protein's 3D structure. However, prediction accuracy is notoriously low. In this paper, a consensus contact map prediction method called LRcon is developed, which combines the results of several complementary predictors using a logistic regression model. Tests on targets from the recent CASP9 experiment and on a large dataset, D856, consisting of 856 protein chains show that LRcon outperforms not only its component predictors but also simple averaging and voting schemes. For example, LRcon achieves 41.5% accuracy on the D856 dataset for the top L/10 long-range contact predictions, which is about 5% higher than its best-performing component predictor. The improvements made by LRcon are mainly attributed to applying a consensus approach to complementary predictors and to logistic regression analysis within a machine learning framework.
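
The idea in the abstract can be illustrated with a small sketch. It is not the authors' LRcon implementation: the training routine (plain gradient descent with a ridge penalty), the function names, the toy scores, and the 24-residue cutoff for "long-range" (the usual CASP convention, not stated in the abstract) are all assumptions made for illustration. It shows three ways of merging per-predictor contact scores (averaging, voting, and a fitted logistic regression) and how accuracy over the top L/10 long-range predictions might be computed, with L taken to be the chain length.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=2000, ridge=1e-3):
    """Fit weights and bias by batch gradient descent on ridge-penalised log-loss.

    X holds one row per residue pair, one column per component predictor.
    The optimiser and hyperparameters are illustrative assumptions.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + ridge * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def lr_consensus(w, b, scores):
    """Consensus contact probability from the fitted logistic model."""
    return sigmoid(sum(wj * sj for wj, sj in zip(w, scores)) + b)

def average_consensus(scores):
    """Baseline: mean of the component predictor scores."""
    return sum(scores) / len(scores)

def vote_consensus(scores, threshold=0.5):
    """Baseline: fraction of predictors scoring at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

def top_long_range_accuracy(scored_pairs, true_contacts, length, min_sep=24, frac=10):
    """Fraction of true contacts among the top L/frac long-range predictions.

    scored_pairs maps (i, j) residue index pairs to a consensus score;
    min_sep=24 is the assumed long-range cutoff.
    """
    long_range = [(s, i, j) for (i, j), s in scored_pairs.items()
                  if abs(i - j) >= min_sep]
    long_range.sort(reverse=True)
    top = long_range[: max(1, length // frac)]
    if not top:
        return 0.0
    return sum((i, j) in true_contacts for _, i, j in top) / len(top)

# Toy training set: column 0 is an informative predictor, column 1 is noisy.
X_train = [[0.9, 0.4], [0.8, 0.7], [0.7, 0.2], [0.3, 0.8], [0.2, 0.3], [0.1, 0.6]]
y_train = [1, 1, 1, 0, 0, 0]
weights, bias = fit_logistic(X_train, y_train)
print(lr_consensus(weights, bias, [0.85, 0.5]), average_consensus([0.85, 0.5]))
```

Unlike averaging or voting, the fitted model can weight each component predictor differently, which is one plausible reading of why a learned combination might do better than the simple schemes when the predictors differ in reliability.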

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, JY., Chen, X. (2011). A Consensus Approach to Predicting Protein Contact Map via Logistic Regression. In: Chen, J., Wang, J., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2011. Lecture Notes in Computer Science, vol 6674. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21260-4_16

  • DOI: https://doi.org/10.1007/978-3-642-21260-4_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21259-8

  • Online ISBN: 978-3-642-21260-4

  • eBook Packages: Computer Science (R0)
