Quantitative characterization of protein tertiary motifs

  • Original Paper
Journal of Molecular Modeling

Abstract

A quantitative feature-vector representation of the tertiary structural motifs of proteins is presented. Multiclass logistic regression and a probabilistic neural network were applied to this representation on large data sets to classify the motifs into major families of distinct motif types (including those of functional importance) with high statistical confidence. Scatter plots of random samples of these motifs were obtained by transforming the feature vectors to two dimensions with metric multidimensional scaling (MDS). The plots showed distinct clusters and shapes for different families, demonstrating the relevance of the proposed feature-vector representation for characterizing protein tertiary structural motifs. The relative importance of the features was also analyzed. The scope of the present work for investigating Nature’s prioritization and optimization of functional motif structures is highlighted.
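As a rough illustration of the probabilistic-neural-network step named in the abstract, the sketch below implements a generic Parzen-window classifier of the kind usually meant by that term. It is not the authors' implementation: the abstract does not define the features, the kernel width, or the software settings, so the three-component toy vectors, the family labels, the `sigma` value, and the function names `pnn_scores` and `pnn_classify` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def pnn_scores(train, x, sigma=0.5):
    """Return normalised class scores for feature vector x.

    train: list of (feature_vector, label) pairs.
    sigma: Gaussian kernel width (an assumed value, not from the paper).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for vec, label in train:
        # Pattern layer: one Gaussian kernel per training vector.
        sq_dist = sum((a - b) ** 2 for a, b in zip(vec, x))
        sums[label] += math.exp(-sq_dist / (2.0 * sigma ** 2))
        counts[label] += 1
    # Summation layer: average kernel response per class (equal priors assumed).
    density = {k: sums[k] / counts[k] for k in sums}
    total = sum(density.values())
    if total == 0.0:
        # Every kernel underflowed to zero: fall back to uniform scores.
        return {k: 1.0 / len(density) for k in density}
    return {k: v / total for k, v in density.items()}


def pnn_classify(train, x, sigma=0.5):
    """Decision layer: pick the class with the largest score."""
    scores = pnn_scores(train, x, sigma)
    return max(scores, key=scores.get)


# Made-up feature vectors for two hypothetical motif families.
TOY_TRAIN = [
    ((0.9, 0.1, 0.2), "family_A"),
    ((1.0, 0.2, 0.1), "family_A"),
    ((0.8, 0.0, 0.3), "family_A"),
    ((0.1, 0.9, 0.8), "family_B"),
    ((0.2, 1.0, 0.9), "family_B"),
    ((0.0, 0.8, 0.7), "family_B"),
]

if __name__ == "__main__":
    query = (0.95, 0.1, 0.2)
    print(pnn_classify(TOY_TRAIN, query), pnn_scores(TOY_TRAIN, query))
```

The normalised scores can be read as rough class-membership estimates under equal priors. In practice the kernel width would need tuning, for example by cross-validation, and the features would normally be scaled to comparable ranges first.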

Acknowledgments

The authors would like to thank the anonymous reviewers for their comments and suggestions, which helped us improve the quality and scope of the paper. Special thanks also go to the reviewer who suggested representing the structural variation in 3D structure space defined across the eigenspace. This will lead to important extensions of the present study.

Author information

Corresponding author

Correspondence to Rajani R. Joshi.

About this article

Cite this article

Joshi, R.R., Sreenath, S. Quantitative characterization of protein tertiary motifs. J Mol Model 20, 2077 (2014). https://doi.org/10.1007/s00894-014-2077-z

  • DOI: https://doi.org/10.1007/s00894-014-2077-z
