Touring Protein Space with Matt
Using the Matt structure alignment program, we take a tour of protein space, producing a hierarchical clustering scheme that divides protein structural domains into clusters based on geometric dissimilarity. Purely geometric, distance-based measures of structural similarity, such as Dali/FSSP, were already known to largely replicate hand-curated schemes such as SCOP at the family level, but it remained an open question whether any such scheme could approximate SCOP at the more distant superfamily and fold levels. We answer this question partially in the affirmative by designing a clustering scheme based on Matt that approximately matches SCOP at the superfamily level. We also discuss implications for the debate over the organization of protein fold space.
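As an illustration of the general idea only, the following minimal sketch clusters items from a pairwise dissimilarity matrix and scores the result against a reference grouping. It is not the procedure used in the paper: the abstract does not state the linkage rule, the cutoff, or how agreement with SCOP is measured. Average linkage, the threshold value, the pair-based Jaccard index, the toy dissimilarity values, and all function names are assumptions made for this example.

```python
from itertools import combinations


def average_linkage_clusters(dist, threshold):
    """Merge clusters greedily until the closest pair exceeds `threshold`.

    `dist` is a symmetric matrix (list of lists) of pairwise dissimilarities.
    Average linkage is an assumption here, not the paper's stated method.
    Returns a sorted list of sorted index lists, one per cluster.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Average dissimilarity over all cross-cluster pairs.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average dissimilarity.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(clusters)


def co_clustered_pairs(clusters):
    """Set of unordered item pairs that share a cluster."""
    return {pair for c in clusters for pair in combinations(sorted(c), 2)}


def jaccard_index(clusters_a, clusters_b):
    """Jaccard index between two partitions, computed over co-clustered pairs."""
    a, b = co_clustered_pairs(clusters_a), co_clustered_pairs(clusters_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy data: five hypothetical domains; the values are made up for illustration.
toy_dist = [
    [0.0, 1.0, 1.2, 6.0, 6.5],
    [1.0, 0.0, 0.9, 6.2, 6.1],
    [1.2, 0.9, 0.0, 5.8, 6.3],
    [6.0, 6.2, 5.8, 0.0, 1.1],
    [6.5, 6.1, 6.3, 1.1, 0.0],
]
reference = [[0, 1, 2], [3, 4]]  # stand-in for a hand-curated grouping
predicted = average_linkage_clusters(toy_dist, threshold=2.0)
score = jaccard_index(predicted, reference)
```

Lowering or raising `threshold` in a sketch like this yields finer or coarser partitions, which is one simple way to think about how a single distance-based hierarchy might be compared against a curated classification at several levels.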
Keywords: Jaccard Index Matt Family Fold Level Protein Space Protein Structural Domain