Diversity and motif conservation in protein 3D structural landscape: exploration by a new multivariate simulation method

  • Rajani R. Joshi
Original Paper


In this paper, diversity and conservation in the ‘landscape’ of random variation of protein tertiary structures are explored using quantitative feature-vector models of the major types of functionally important 3D structural motifs. For this purpose, I have deployed a recently developed multidimensional copula method of simulation based on nonparametric regression (NPR). Apart from improving the accuracy of multidimensional random sample generation, the simulation provides additional insight into the diversity of the protein structural landscape in terms of random variation in the feature vector. It shows the relative importance of several features, with biological implications, in the conservation of motifs. Mapping this landscape into a distance-preserving 2D eigenspace also shows consistent demarcation of the different motif classes and preservation of their characteristic patterns in this 2D space.
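To make the idea of copula-based random sample generation concrete, the following is a minimal sketch for two features only. It is not the NPR-based method of this paper: a plain Gaussian copula with an assumed correlation `rho` is swapped in for the dependence structure, and the marginals are taken from the empirical quantiles of the observed values. The feature names, the toy data, and the function names are all hypothetical and serve only as illustration.

```python
import random
from statistics import NormalDist


def empirical_quantile(sorted_values, u):
    """Inverse empirical CDF with linear interpolation, for u in [0, 1]."""
    pos = u * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def gaussian_copula_sample(x, y, rho, n, seed=0):
    """Draw n pairs whose marginals follow the observed x and y.

    The dependence between the two coordinates is a Gaussian copula with
    correlation rho (an assumption of this sketch, not the paper's method).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    xs, ys = sorted(x), sorted(y)
    pairs = []
    for _ in range(n):
        # Correlated standard normals via a 2x2 Cholesky factor.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        # Map to uniforms on [0, 1]: this pair is a draw from the copula.
        u1, u2 = std_normal.cdf(z1), std_normal.cdf(z2)
        # Push the uniforms through the empirical marginal quantiles.
        pairs.append((empirical_quantile(xs, u1), empirical_quantile(ys, u2)))
    return pairs


# Hypothetical toy observations of two motif features.
helix_length = [8, 10, 11, 12, 14, 15, 17, 20]
loop_angle = [1.2, 1.5, 1.4, 1.9, 2.1, 2.0, 2.6, 3.0]

samples = gaussian_copula_sample(helix_length, loop_angle, rho=0.9, n=2000)
```

The separation of marginals from dependence is the point of a copula: the same uniform pair `(u1, u2)` could be pushed through any other marginal model without changing the dependence structure. A nonparametric approach would replace the fixed Gaussian dependence with one estimated from the data.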


Protein tertiary structural motifs; Multi-dimensional feature vector; Random number generation; Copulas; Nonparametric regression; Multidimensional scaling



The author would like to thank Srijit Chakrabarty for implementing the initial version of the author’s algorithm as part of his MSc project. The version used in this work has been developed further with significant modifications.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Mathematics, Indian Institute of Technology Bombay, Mumbai, India
