
A Review on Protein Structure Classification

Part of the Lecture Notes in Computational Vision and Biomechanics book series (LNCVB, volume 30)

Abstract

Genome projects are steadily producing massive amounts of sequence data that must be annotated in terms of structure and of molecular and biological function. Structural genomics aims to solve many protein structures efficiently and to use the solved structures to assign biological function to theoretically solved protein structures. Early on, protein structures were classified manually with success, but manual classification now struggles to stay up to date because of the high throughput of newly solved protein structures. To overcome this issue, several data mining techniques have been examined for the structural classification of proteins. This review article presents an overview of the existing classification techniques, databases, tools, and performance metrics used to evaluate protein structure classification algorithms.

Keywords

  • Protein structure
  • Classification techniques
  • Tools
  • Databases
  • Computational biology
  • Challenges


Acknowledgements

The authors would like to thank the Department of Science and Technology (DST), New Delhi (DST/INSPIRE Fellowship/2015/IF150093) for financial support of this research work under the INSPIRE Fellowship.

Author information

Corresponding author

Correspondence to N. Sajithra.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this paper

Sajithra, N., Ramyachitra, D., Manikandan, P. (2019). A Review on Protein Structure Classification. In: Pandian, D., Fernando, X., Baig, Z., Shi, F. (eds) Proceedings of the International Conference on ISMAC in Computational Vision and Bio-Engineering 2018 (ISMAC-CVB). ISMAC 2018. Lecture Notes in Computational Vision and Biomechanics, vol 30. Springer, Cham. https://doi.org/10.1007/978-3-030-00665-5_10

  • DOI: https://doi.org/10.1007/978-3-030-00665-5_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00664-8

  • Online ISBN: 978-3-030-00665-5

  • eBook Packages: Engineering, Engineering (R0)