Ballast: A Ball-Based Algorithm for Structural Motifs

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2012)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 7262)

Abstract

Structural motifs encapsulate local sequence-structure-function relationships characteristic of related proteins, enabling the prediction of functional characteristics of new proteins, providing molecular-level insights into how those functions are performed, and supporting the development of variants specifically maintaining or perturbing function in concert with other properties. Numerous computational methods have been developed to search through databases of structures for instances of specified motifs. However, it remains an open problem how best to leverage the local geometric and chemical constraints underlying structural motifs to develop motif-finding algorithms that are both theoretically and practically efficient. We present a simple, general, efficient approach, called Ballast (Ball-based algorithm for structural motifs), to match given structural motifs to given structures. Ballast combines the best properties of previously developed methods, exploiting the composition and local geometry of a structural motif and its possible instances to filter candidate matches effectively. We show that on a wide range of motif matching problems, Ballast efficiently and effectively finds good matches, and we provide theoretical insights into why it works well. By supporting generic measures of compositional and geometric similarity, Ballast provides a powerful substrate for the development of motif matching algorithms.
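
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of filtering candidates by the composition and local geometry of a ball, not the authors' implementation. It assumes a motif and a structure are each given as lists of labeled 3D points (for example, residue type plus a representative atom coordinate). The function name `ball_filter`, the `slack` tolerance, and the choice of the first motif point as the ball center are all hypothetical.

```python
from collections import Counter
from math import dist


def ball_filter(motif, structure, slack=1.0):
    """Return indices of structure points whose surrounding ball could hold the motif.

    motif, structure: lists of (label, (x, y, z)) tuples.
    slack: hypothetical tolerance added to the ball radius.
    """
    # Assumption: the first motif point serves as the ball center.
    center_label, center_xyz = motif[0]
    # The ball must be large enough to enclose every motif point.
    radius = max(dist(center_xyz, xyz) for _, xyz in motif) + slack
    # Composition the ball has to contain: how many points of each label.
    needed = Counter(label for label, _ in motif)

    hits = []
    for i, (label, xyz) in enumerate(structure):
        # A candidate center must carry the same label as the motif center.
        if label != center_label:
            continue
        # Labels of all structure points inside the ball around this candidate.
        in_ball = Counter(l for l, p in structure if dist(xyz, p) <= radius)
        # Keep the candidate only if the ball holds every label the motif needs.
        if all(in_ball[l] >= n for l, n in needed.items()):
            hits.append(i)
    return hits
```

Candidates that survive such a filter would still need a geometric check, such as superposition against the motif, before being reported as matches; the filter only discards balls that cannot contain an instance.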

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

He, L., Vandin, F., Pandurangan, G., Bailey-Kellogg, C. (2012). Ballast: A Ball-Based Algorithm for Structural Motifs. In: Chor, B. (ed.) Research in Computational Molecular Biology. RECOMB 2012. Lecture Notes in Computer Science, vol 7262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29627-7_9

  • DOI: https://doi.org/10.1007/978-3-642-29627-7_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29626-0

  • Online ISBN: 978-3-642-29627-7

  • eBook Packages: Computer Science, Computer Science (R0)
