Information Systems Frontiers, Volume 8, Issue 1, pp 29–36

Protein Structure from Contact Maps: A Case-Based Reasoning Approach

  • Janice Glasgow
  • Tony Kuo
  • Jim Davies


Determining the three-dimensional structure of a protein is an important step in understanding biological function. Despite advances in experimental methods (crystallography and NMR) and protein structure prediction techniques, the gap between the number of known protein sequences and determined structures continues to grow.

Approaches to protein structure prediction vary from those that apply physical principles to those that consider known amino acid sequences and previously determined protein structures. In this paper we consider a two-step approach to structure prediction: (1) predict contacts between amino acids using sequence data; (2) predict protein structure using the predicted contact maps. Our focus is on the second step of this approach. In particular, we apply a case-based reasoning framework to determine the alignment of secondary structures based on previous experiences stored in a case base, along with detailed knowledge of the chemical and physical properties of proteins. Case-based reasoning is founded on the premise that similar problems have similar solutions. Our hypothesis is that we can use previously determined structures and their contact maps to predict the structure for novel proteins from their contact maps.
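To make the notion of a contact map concrete, the following minimal sketch builds one from residue coordinates. It is illustrative only: the function name `contact_map`, the single-point-per-residue representation, the 8 Å threshold, and the toy coordinates are all assumptions made for this example and are not taken from the paper.

```python
import math


def contact_map(coords, threshold=8.0):
    """Return a symmetric boolean matrix for a list of 3D points.

    Entry [i][j] is True when residues i and j lie within `threshold`
    angstroms of each other. The 8.0 default is an assumed, commonly
    used cutoff, not a value stated in the abstract.
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            close = math.dist(coords[i], coords[j]) <= threshold
            cmap[i][j] = cmap[j][i] = close
    return cmap


# Hypothetical toy coordinates (angstroms), one point per residue,
# laid out on a straight line 3.8 angstroms apart.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
```

In this toy chain, residues two positions apart (7.6 Å) are in contact while residues three positions apart (11.4 Å) are not. Step (2) of the approach works in the opposite direction, going from such a matrix back to a three-dimensional arrangement.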

The paper presents an overview of contact maps along with the general principles behind our methodology of case-based reasoning. We discuss details of the implementation of our system and present empirical results using contact maps retrieved from the Protein Data Bank.
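The premise that similar problems have similar solutions can be sketched as a retrieve-and-reuse step over a case base of contact maps. The sketch below is an assumption-laden illustration, not the system described in the paper: the Jaccard overlap measure, the dictionary layout of a case, and the case names and solution labels are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of contacts (residue index pairs)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def retrieve(case_base, query_contacts):
    """Return the stored case whose contacts best overlap the query."""
    return max(case_base, key=lambda case: jaccard(case["contacts"], query_contacts))


# Hypothetical case base: each case pairs a set of contacts with a
# previously determined solution (here just a descriptive label).
case_base = [
    {"name": "case_A", "contacts": {(0, 4), (1, 5), (2, 6)}, "solution": "parallel packing"},
    {"name": "case_B", "contacts": {(0, 9), (1, 8), (2, 7)}, "solution": "antiparallel packing"},
]

# Contacts for a novel protein (also hypothetical).
query = {(0, 9), (1, 8), (3, 6)}
best = retrieve(case_base, query)
proposed_solution = best["solution"]  # reuse the retrieved case's solution
```

Here the query shares two of four distinct contacts with `case_B` and none with `case_A`, so `case_B` is retrieved and its solution is reused as the starting hypothesis. A full case-based reasoner would also adapt the retrieved solution to the new problem and evaluate it against domain knowledge.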


Case-based reasoning; Protein structure; Contact maps; Secondary structure; Analogy





Copyright information

© Springer Science+Business Media, Inc. 2006

Authors and Affiliations

  1. School of Computing, Queen's University, Kingston
