FAUST: An Algorithm for Extracting Functionally Relevant Templates from Protein Structures

Milik, Mariusz; Szalma, Sandor; Olszewski, Krzysztof A.

doi:10.1007/3-540-45784-4_13

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2452)

Included in the following conference series:

International Workshop on Algorithms in Bioinformatics

Abstract

FAUST (Functional Annotations Using Structural Templates) is an algorithm for extracting functionally relevant templates from protein structures and using those templates to annotate novel structures. Proteins and structural templates are represented as colored, undirected graphs with atoms as nodes and interatomic distances as edge weights. Node colors are based on the chemical identities of atoms. Edge labels are considered equivalent if the interatomic distances between the corresponding nodes (atoms) differ by less than a threshold value. We define a FAUST structural template as a common subgraph of a set of graphs corresponding to two or more functionally related proteins. Pairs of functionally related protein structures are searched for sets of chemically equivalent atoms whose interatomic distances are conserved in both structures. Structural templates resulting from such pairwise searches are then combined to maximize classification performance on a training set of non-redundant protein structures. The resulting structural template provides a new language for describing structure-function relationships in proteins. These templates are used to identify active and binding sites in protein structures. We present structural template extraction results for the highly divergent family of serine proteases. We compare FAUST templates to the standard description of active-site pattern conservation in serine proteases and demonstrate the depth of information captured by such a description. We also present preliminary results of high-throughput annotation of a protein structure database with a comprehensive library of FAUST templates.
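The pairwise search described in the abstract can be illustrated with a small sketch. The abstract does not say how the common subgraph is found, so the code below swaps in a standard technique: it builds a correspondence graph over pairs of same-colored atoms and enumerates cliques with a plain Bron-Kerbosch search. The function names, the `(color, coordinates)` atom encoding, the 0.5 tolerance, and the toy coordinates are all assumptions made for illustration; this is not the authors' implementation.

```python
from itertools import combinations
from math import dist

def correspondence_graph(a, b, tol):
    """Nodes are index pairs (i, j) of atoms with the same color.
    Two nodes are adjacent when the matched interatomic distances
    differ by less than tol (the equivalence rule from the abstract)."""
    nodes = [(i, j) for i, (ca, _) in enumerate(a)
                    for j, (cb, _) in enumerate(b) if ca == cb]
    adj = {n: set() for n in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i != k and j != l:  # each atom may be matched only once
            d_a = dist(a[i][1], a[k][1])
            d_b = dist(b[j][1], b[l][1])
            if abs(d_a - d_b) < tol:
                adj[(i, j)].add((k, l))
                adj[(k, l)].add((i, j))
    return adj

def largest_clique(adj):
    """Bron-Kerbosch without pivoting; returns one maximum clique, sorted.
    Exponential in the worst case, so suitable for small examples only."""
    best = []
    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        for v in list(p):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    expand([], set(adj), set())
    return sorted(best)

def common_template(a, b, tol=0.5):
    """Largest set of same-colored atom pairs with conserved distances."""
    return largest_clique(correspondence_graph(a, b, tol))

# Hypothetical toy "sites": (color, (x, y, z)); colors stand in for
# chemical identities. site_b contains a moved, reordered copy of the
# N/O/O triangle from site_a plus a carbon at unrelated distances.
site_a = [("N", (0, 0, 0)), ("O", (3, 0, 0)), ("O", (0, 4, 0)), ("C", (10, 10, 10))]
site_b = [("C", (5, 5, -20)), ("O", (9, 5, 5)), ("N", (5, 5, 5)), ("O", (5, 5, 8))]

print(common_template(site_a, site_b))  # [(0, 2), (1, 3), (2, 1)]
```

Each pair in the result maps an atom index in `site_a` to an atom index in `site_b`. The carbon atoms are excluded because none of their distances to the other atoms is conserved within the tolerance.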

Author information

Authors and Affiliations

Accelrys, 9685 Scranton Road, San Diego, CA 92121, USA
Mariusz Milik, Sandor Szalma & Krzysztof A. Olszewski

Editor information

Editors and Affiliations

IMIM-UPF-CRG, Dr. Aiguader 80, 08003, Barcelona, Spain
Roderic Guigó
Department of Computer Science, University of California, 95616, Davis, CA, USA
Dan Gusfield

About this paper

Cite this paper

Milik, M., Szalma, S., Olszewski, K.A. (2002). FAUST: An Algorithm for Extracting Functionally Relevant Templates from Protein Structures. In: Guigó, R., Gusfield, D. (eds) Algorithms in Bioinformatics. WABI 2002. Lecture Notes in Computer Science, vol 2452. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45784-4_13

DOI: https://doi.org/10.1007/3-540-45784-4_13
Published: 10 October 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44211-0
Online ISBN: 978-3-540-45784-8
eBook Packages: Springer Book Archive
