Reconstruction of 3D Structures from Protein Contact Maps

  • Marco Vassura
  • Luciano Margara
  • Filippo Medri
  • Pietro di Lena
  • Piero Fariselli
  • Rita Casadio
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4463)

Abstract

Proteins are large organic compounds made of amino acids arranged in a linear chain (primary structure). Most proteins fold into unique three-dimensional (3D) structures, interchangeably called tertiary, folded, or native structures. Discovering the tertiary structure of a protein (the Protein Folding Problem) can provide important clues about how the protein performs its function, and it is one of the most important problems in Bioinformatics. A contact map of a given protein P is a binary matrix M such that $M_{i,j} = 1$ if and only if the physical distance between amino acids i and j in the native structure is less than or equal to a pre-assigned threshold t. The contact map of each protein is a distinctive signature of its folded structure. Predicting the tertiary structure of a protein directly from its primary structure is a very complex and still unsolved problem. An alternative and probably more feasible approach is to predict the contact map of a protein from its primary structure and then to compute the tertiary structure starting from the predicted contact map. This last problem has recently been proven to be NP-hard [6]. In this paper we give a heuristic method that is able to reconstruct in a few seconds a 3D model that exactly matches the target contact map. We wish to emphasize that our method computes an exact model for the protein independently of the contact map threshold. To our knowledge, our method outperforms all other techniques in the literature [5,10,17,19] in both the quality of the solutions provided and the running times. Our experimental results are obtained on a non-redundant data set of 1760 proteins, which is by far the largest benchmark set used so far. Average running times range from 3 to 15 seconds depending on the contact map threshold and on the size of the protein.
Repeated applications of our method (starting from randomly chosen distinct initial solutions) show that the same contact map may admit (depending on the threshold) quite different 3D models. Extensive experimental results show that contact map thresholds ranging from 10 to 18 Ångström allow the reconstruction of 3D models that are very similar to the protein's native structure. Our heuristic is freely available for testing on the web at the following URL: http://vassura.web.cs.unibo.it/cmap23d/
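
The contact map definition in the abstract can be illustrated with a minimal Python sketch. It is not the authors' heuristic: it only builds a contact map from given coordinates and counts the entries on which two maps disagree. Representing each amino acid by a single 3D point, the function names `contact_map` and `mismatches`, and the toy coordinates are all assumptions made for illustration.

```python
import math

def contact_map(coords, threshold):
    """Binary matrix M with M[i][j] = 1 iff dist(coords[i], coords[j]) <= threshold.

    Assumption: each amino acid is represented by one 3D point.
    """
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) <= threshold else 0 for j in range(n)]
        for i in range(n)
    ]

def mismatches(map_a, map_b):
    """Count the pairs i < j on which two contact maps of equal size differ."""
    n = len(map_a)
    return sum(
        map_a[i][j] != map_b[i][j] for i in range(n) for j in range(i + 1, n)
    )

# Hypothetical toy chain of four points (coordinates in Ångström).
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 3.8, 0.0)]
target = contact_map(native, threshold=6.0)

# A different model: the last point is moved, yet every contact is preserved.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 3.0, 0.0)]
print(mismatches(target, contact_map(model, threshold=6.0)))  # 0 mismatches
```

In this toy case the two coordinate sets differ but produce the same contact map, which is a small-scale instance of the observation that one contact map may admit several 3D models. A model "exactly matches" the target map when the mismatch count is zero.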

References

  1. Altschul, S.F., et al.: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25(17), 3389–3402 (1997)
  2. Andreeva, A., et al.: SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Res. 32(Database issue), D226–D229 (2004)
  3. Bartoli, L., et al.: The pros and cons of predicting protein contact maps.
  4. Blumenthal, L.M.: Theory and applications of distance geometry. Chelsea, New York (1970)
  5. Bohr, J., et al.: Protein structures from distance inequalities. J. Mol. Biol. 231, 861–869 (1993)
  6. Breu, H., Kirkpatrick, D.G.: Unit disk graph recognition is NP-hard. Computational Geometry 9, 3–24 (1998)
  7. Cormen, T., Leiserson, C.E., Rivest, R.L.: Introduction to algorithms, 2nd edn. MIT Press, Cambridge (2001)
  8. Crippen, G.M., Havel, T.F.: Distance geometry and molecular conformation. John Wiley & Sons, Chichester (1988)
  9. Fariselli, P., et al.: Progress in predicting inter-residue contacts of proteins with neural networks and correlated mutations. Proteins 45(Suppl. 5), 157–162 (2001)
  10. Galaktionov, S.G., Marshall, G.R.: Properties of intraglobular contacts in proteins: an approach to prediction of tertiary structure. In: System Sciences, 1994, Vol. V: Proceedings of the Twenty-Seventh Hawaii International Conference on Biotechnology Computing, vol. 5, 4–7 Jan. 1994, pp. 326–335 (1994)
  11. de Groot, B.L., et al.: Prediction of protein conformational freedom from distance constraints. Proteins 29, 240–251 (1997)
  12. Havel, T.F.: Distance Geometry: Theory, Algorithms, and Chemical Applications. In: The Encyclopedia of Computational Chemistry (1998)
  13. Lesk, A.: Introduction to Bioinformatics. Oxford University Press, Oxford (2006)
  14. Margara, L., et al.: Reconstruction of the Protein Structures from Contact Maps. Technical report UBLCS-2006-24, University of Bologna, Department of Computer Science (October 2006)
  15. Moré, J., Wu, Z.: ε-Optimal solutions to distance geometry problems via global continuation. In: Pardalos, P.M., Shalloway, D., Xue, G. (eds.) Global Minimization of Nonconvex Energy Functions: Molecular Conformation and Protein Folding, pp. 151–168. American Mathematical Society (1995)
  16. Moré, J., Wu, Z.: Distance geometry optimization for protein structures. Journal of Global Optimization 15, 219–234 (1999)
  17. Pollastri, G., et al.: Modular DAG-RNN Architectures for Assembling Coarse Protein Structures. J. Comp. Biol. 13(3), 631–650 (2006)
  18. Saxe, J.B.: Embeddability of weighted graphs in k-space is strongly NP-hard. In: Proc. 17th Allerton Conf. Commun. Control Comput., pp. 480–489 (1979)
  19. Vendruscolo, M., Kussell, E., Domany, E.: Recovery of protein structure from contact maps. Folding and Design 2(5), 295–306 (1997)
  20. Vendruscolo, M., Domany, E.: Protein folding using contact maps. Vitam. Horm. 58, 171–212 (2000)

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Marco Vassura (1)
  • Luciano Margara (1)
  • Filippo Medri (1)
  • Pietro di Lena (1)
  • Piero Fariselli (2)
  • Rita Casadio (2)

  1. Computer Science Department
  2. Biocomputing Group, Department of Biology, University of Bologna, Italy
