Volume 8, Issue 1, pp 5-27

Structure-based identification and clustering of protein families and superfamilies


Summary

We describe an approach to protein structure comparison designed to detect distantly related proteins of similar fold, where the procedure must be sufficiently flexible to take into account the elasticity of protein folds without losing specificity. Protein structures are represented as a series of secondary structure elements, where for each element a local environment describes its relations with the elements that surround it. Secondary structures are then aligned by comparing their features and local environments. The procedure is illustrated with searches of a database of 468 protein structures in order to identify proteins of similar topology to porcine pepsin, porphobilinogen deaminase and serum amyloid P-component. In all cases the searches correctly identify protein structures whose folds are similar to those of the search proteins. Multiple cross-comparisons of protein structures allow the clustering of proteins of similar fold. This is exemplified with a clustering of α/β- and β-class protein structures. We discuss applications of the comparison and clustering of three-dimensional protein structures to comparative modelling and structure-based protein design.
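To make the idea of aligning secondary structure elements by their features and local environments concrete, the following Python sketch may help. It is not the procedure of the article, whose details the summary does not give. Everything in it is an assumption chosen for illustration: the `Element` record, an environment defined as the distances to the two nearest elements, the scoring terms in `similarity`, the 5 Å tolerance, the gap penalty, and the use of a standard Needleman-Wunsch global alignment.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Element:
    """Hypothetical record for one secondary structure element."""
    kind: str      # "H" (helix) or "E" (strand)
    length: int    # number of residues
    centre: tuple  # (x, y, z) midpoint of the element, in angstroms


def environment(elements, i, k=2):
    """Assumed local environment: distances from element i to its
    k nearest elements in the same structure, in ascending order."""
    dists = sorted(
        math.dist(elements[i].centre, e.centre)
        for j, e in enumerate(elements) if j != i
    )
    return dists[:k]


def similarity(a, env_a, b, env_b, tol=5.0):
    """Illustrative score: -1 if the kinds differ, otherwise a length
    term plus an environment term, each between 0 and 1."""
    if a.kind != b.kind:
        return -1.0
    length_term = 1.0 - abs(a.length - b.length) / max(a.length, b.length)
    pairs = list(zip(env_a, env_b))
    if not pairs:
        return length_term
    # The soft tolerance lets distances differ, mimicking fold elasticity.
    env_term = sum(max(0.0, 1.0 - abs(x - y) / tol) for x, y in pairs)
    return length_term + env_term / len(pairs)


def align(p, q, gap=-0.5):
    """Needleman-Wunsch global alignment of two element lists.
    Returns (score, pairs); None in a pair marks a gap."""
    n, m = len(p), len(q)
    env_p = [environment(p, i) for i in range(n)]
    env_q = [environment(q, j) for j in range(m)]
    sim = [[similarity(p[i], env_p[i], q[j], env_q[j]) for j in range(m)]
           for i in range(n)]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sim[i - 1][j - 1],
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover the pairing.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sim[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Toy structures (invented coordinates): strand, helix, strand,
# and the same arrangement with the helix missing.
p = [Element("E", 5, (0, 0, 0)), Element("H", 10, (5, 0, 0)),
     Element("E", 6, (10, 0, 0))]
q = [p[0], p[2]]
print(align(p, q)[1])  # [(0, 0), (1, None), (2, 1)]
```

In this toy case the two strands are paired and the helix is left unmatched, which is the behaviour one would want from a comparison that tolerates insertions of whole elements. Pairwise scores of this kind, computed for every pair of structures in a set, could then feed a standard hierarchical clustering, which is one plausible reading of the multiple cross-comparisons mentioned in the summary.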