Multiscale Persistent Functions for Biomolecular Structure Characterization

  • Research Methods Article
Bulletin of Mathematical Biology

Abstract

In this paper, we introduce multiscale persistent functions for biomolecular structure characterization. The essential idea is to combine our multiscale rigidity functions (MRFs) with persistent homology analysis to construct a series of multiscale persistent functions, particularly multiscale persistent entropies, for structure characterization. To clarify the fundamental idea of our method, the multiscale persistent entropy (MPE) model is discussed in detail. Mathematically, unlike previous persistent entropy models (Chintakunta et al. in Pattern Recognit 48(2):391–401, 2015; Merelli et al. in Entropy 17(10):6872–6892, 2015; Rucco et al. in: Proceedings of ECCS 2014, Springer, pp 117–128, 2016), our model incorporates a special resolution parameter, and different scales are reached by tuning its value. Physically, our MPE can be used in conformational entropy evaluation. More specifically, our method is found to incorporate a natural classification scheme, which is achieved through a density filtration of an MRF built from angular distributions. To further validate our model, a systematic comparison with the traditional entropy evaluation model is carried out. We find that our model preserves the intrinsic topological features of biomolecular data much better than traditional approaches, particularly for resolutions in the intermediate range. Moreover, by comparing with traditional entropies from various grid sizes, bond angle-based methods and a persistent homology-based support vector machine method (Cang et al. in Mol Based Math Biol 3:140–162, 2015), we find that our MPE method gives the best results in terms of average true positive rate in a classic protein structure classification test. More interestingly, all-alpha and all-beta protein classes can be clearly separated from each other with zero error only in our model.
Finally, a special protein structure index (PSI) is proposed, for the first time, to describe the “regularity” of protein structures. Basically, a protein structure is deemed regular if it has a consistent and orderly configuration. Our PSI model is tested on a database of 110 proteins; we find that structures with larger portions of loops and intrinsically disordered regions are always associated with larger PSI, meaning an irregular configuration, while proteins with larger portions of secondary structures, i.e., alpha-helix or beta-sheet, have smaller PSI. Essentially, PSI can be used to describe the “regularity” information in any system.
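
The pipeline summarized above (a density built from an angular distribution, a density filtration, then an entropy of the resulting barcode) can be illustrated with a small one-dimensional sketch. This is not the authors' implementation. It assumes a Gaussian kernel for the rigidity density, a uniform grid over angles in degrees, a superlevel-set filtration restricted to dimension 0, and the persistent entropy of the cited earlier works, i.e. the Shannon entropy of the normalized bar lengths. The function names, the sample angles and the convention of closing the essential bar at the global minimum are all hypothetical choices made for illustration.

```python
import math

def rigidity_density(angles, grid, eta):
    """Sum of Gaussian kernels centered at each angle (assumed kernel form).
    eta plays the role of the resolution parameter."""
    return [sum(math.exp(-((x - a) / eta) ** 2) for a in angles) for x in grid]

def superlevel_h0_bars(values):
    """Dimension-0 bars (birth, death) of the superlevel-set filtration of a
    1D sequence, computed with union-find and the elder rule."""
    n = len(values)
    parent = [None] * n      # None marks a vertex not yet in the filtration
    peak = {}                # root index -> birth value of its component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    bars = []
    # Sweep from the highest density value to the lowest.
    for i in sorted(range(n), key=lambda k: -values[k]):
        parent[i] = i
        peak[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component with the lower peak dies here.
                young, old = (ri, rj) if peak[ri] < peak[rj] else (rj, ri)
                if peak[young] > values[i]:
                    bars.append((peak[young], values[i]))
                parent[young] = old
    # Assumption: the essential bar is closed at the global minimum.
    bars.append((max(values), min(values)))
    return bars

def persistent_entropy(bars):
    """Shannon entropy of the normalized bar lengths."""
    lengths = [abs(b - d) for b, d in bars if b != d]
    total = sum(lengths)
    if total <= 0:
        return 0.0
    return -sum((l / total) * math.log(l / total) for l in lengths)

# Hypothetical angles in degrees; scanning eta gives one entropy per scale.
sample_angles = [-65.0, -60.0, -58.0, 120.0, 125.0]
grid = list(range(-180, 181))
for eta in (2.0, 10.0, 60.0):
    bars = superlevel_h0_bars(rigidity_density(sample_angles, grid, eta))
    print(eta, len(bars), persistent_entropy(bars))
```

Scanning `eta` changes how many density peaks survive as separate bars, which is the sense in which the resolution parameter selects a scale in this toy setting. A full treatment would use a persistent homology library and higher-dimensional data rather than this hand-rolled 1D sweep.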

References

  • Baron R, Hunenberger PH, McCammon JA (2009) Absolute single-molecule entropies from quasi-harmonic analysis of microsecond molecular dynamics: correction terms and convergence properties. J Chem Theory Comput 5(12):3150–3160

  • Baruah A, Rani P, Biswas P (2015) Conformational entropy of intrinsically disordered proteins from amino acid triads. Sci Rep 5:11740

  • Bauer U, Kerber M, Reininghaus J (2014) Distributed computation of persistent homology. In: Proceedings of the sixteenth workshop on algorithm engineering and experiments (ALENEX), 2014

  • Bendich P, Edelsbrunner H, Kerber M (2010) Computing robustness and persistence for images. IEEE Trans Vis Comput Gr 16:1251–1260

  • Biasotti S, De Floriani L, Falcidieno B, Frosini P, Giorgi D, Landi C, Papaleo L, Spagnuolo M (2008) Describing shapes by geometrical-topological properties of real functions. ACM Comput Surv 40(4):12

  • Binchi J, Merelli E, Rucco M, Petri G, Vaccarino F (2014) jHoles: a tool for understanding biological complex networks via clique weight rank persistent homology. Electron Notes Theor Comput Sci 306:5–18

  • Bowen R (1973) Topological entropy for noncompact sets. Trans Am Math Soc 184:125–136

  • Brady GP, Sharp KA (1997) Entropy in protein folding and in protein–protein interactions. Curr Opin Struct Biol 7(2):215–221

  • Brooijmans N, Kuntz ID (2003) Molecular recognition and docking algorithms. Ann Rev Biophys Biomol Struct 32(1):335–373

  • Bubenik P (2015) Statistical topological data analysis using persistence landscapes. J Mach Learn Res 16(1):77–102

  • Bubenik P, Kim PT (2007) A statistical approach to persistent homology. Homol Homot Appl 9(2):337–362

  • Cang ZX, Mu L, Wu KD, Opron K, Xia KL, Wei GW (2015) A topological approach to protein classification. Mol Based Math Biol 3:140–162

  • Carlsson G (2009) Topology and data. Bull Am Math Soc 46(2):255–308

  • Carlsson G (2014) Topological pattern recognition for point cloud data. Acta Numerica 23:289

  • Carlsson G, Ishkhanov T, Silva V, Zomorodian A (2008) On the local behavior of spaces of natural images. Int J Comput Vis 76(1):1–12

  • Carlsson G, Singh G, Zomorodian A (2009) Computing multidimensional persistence. Algorithms and computation. Springer, Berlin, pp 730–739

  • Carlsson G, Zomorodian A (2009) The theory of multidimensional persistence. Discrete Comput Geom 42(1):71–93

  • Cerri A, Fabio B, Ferri M, Frosini P, Landi C (2013) Betti numbers in multidimensional persistent homology are stable functions. Math Methods Appl Sci 36(12):1543–1557

  • Cerri A, Landi C (2013) The persistence space in multidimensional persistent homology. Discrete geometry for computer imagery. Springer, Berlin, pp 180–191

  • Chazal F, De Silva V, Oudot S (2014) Persistence stability for geometric complexes. Geometriae Dedicata 173(1):193–214

  • Chintakunta H, Gentimis T, Gonzalez-Diaz R, Jimenez MJ, Krim H (2015) An entropy-based persistence barcode. Pattern Recognit 48(2):391–401

  • Chung F (1997) Spectral graph theory. American Mathematical Society, Providence

  • Cohen-Steiner D, Edelsbrunner H, Morozov D (2006) Vines and vineyards by updating persistence in linear time. In: Proceedings of the twenty-second annual symposium on Computational geometry, ACM. pp 119–126

  • Dey TK, Li KY, Sun J, Cohen-Steiner D (2008) Computing geometry aware handle and tunnel loops in 3D models. ACM Trans Gr 27:45

  • Dey TK, Wang YS (2013) Reeb graphs: approximation and persistence. Discrete Comput Geom 49(1):46–73

  • Di Fabio B, Landi C (2011) A Mayer–Vietoris formula for persistent homology with an application to shape recognition in the presence of occlusions. Found Comput Math 11:499–527

  • Dionysus: the persistent homology software. Software available at http://www.mrzv.org/software/dionysus

  • Doig AJ, Sternberg MJE (1995) Side-chain conformational entropy in protein folding. Prot Sci 4(11):2247–2251

  • Edelsbrunner H (2010) Computational topology: an introduction. American Mathematical Society, Providence

  • Edelsbrunner H, Letscher D, Zomorodian A (2002) Topological persistence and simplification. Discrete Comput Geom 28:511–533

  • Edelsbrunner H, Mucke EP (1994) Three-dimensional alpha shapes. ACM Trans Gr 13:43–72

  • Fitter J (2003) A measure of conformational entropy change during thermal protein unfolding using neutron spectroscopy. Biophys J 84(6):3924–3930

  • Frederick KK, Marlow MS, Valentine KG, Wand AJ (2007) Conformational entropy in molecular recognition by proteins. Nature 448(7151):325–329

  • Frosini P, Landi C (2013) Persistent Betti numbers for a noise tolerant shape-based approach to image retrieval. Pattern Recognit Lett 34(8):863–872

  • Frosini P, Landi C (1999) Size theory as a topological tool for computer vision. Pattern Recognit Image Anal 9(4):596–603

  • Gameiro M, Hiraoka Y, Izumi S, Kramar M, Mischaikow K, Nanda V (2015) A topological measurement of protein compressibility. Jpn J Ind Appl Math 32(1):1–17

  • Gellman SH (1997) Introduction: molecular recognition. Chem Rev 97(5):1231–1232

  • Ghrist R (2008) Barcodes: the persistent topology of data. Bull Am Math Soc 45(1):61–75

  • Halle B (2002) Flexibility and packing in proteins. PNAS 99:1274–1279

  • Hatcher A (2001) Algebraic topology. Cambridge University Press, Cambridge

  • Horak D, Maletic S, Rajkovic M (2009) Persistent homology of complex networks. J Stat Mech Theory Exp 2009(03):P03034

  • Janin J, Sternberg MJ (2013) Protein flexibility, not disorder, is intrinsic to molecular recognition. F1000 Biol Rep 5(2):1–7

  • Kaczynski T, Mischaikow K, Mrozek M (2004) Computational homology. Springer, New York

  • Karplus M, Kushick JN (1981) Method for estimating the configurational entropy of macromolecules. Macromolecules 14(2):325–332

  • Kasson PM, Zomorodian A, Park S, Singhal N, Guibas LJ, Pande VS (2007) Persistent voids: a new structural metric for membrane fusion. Bioinformatics 23:1753–1759

  • Korkut A, Hendrickson WA (2013) Stereochemistry of polypeptide conformation in coarse grained analysis. In: Biomolecular forms and functions: a celebration of 50 years of the Ramachandran Map, World Scientific Publishing, pp 136–147

  • Lee H, Kang H, Chung MK, Kim B, Lee DS (2012) Persistent brain network homology from the perspective of dendrogram. IEEE Trans Med Imaging 31(12):2267–2277

  • Levitt M, Warshel A (1975) Computer simulation of protein folding. Nature 253(5494):694–698

  • Liu X, Xie Z, Yi DY (2012) A fast algorithm for constructing topological structure in large data. Homol Homot Appl 14:221–238

  • Marlow MS, Dogan J, Frederick KK, Valentine KG, Wand AJ (2010) The role of conformational entropy in molecular recognition by calmodulin. Nat Chem Biol 6(5):352–358

  • Merelli E, Rucco M, Sloot P, Tesei L (2015) Topological characterization of complex systems: using persistent entropy. Entropy 17(10):6872–6892

  • Mischaikow K, Mrozek M, Reiss J, Szymczak A (1999) Construction of symbolic dynamics from experimental time series. Phys Rev Lett 82:1144–1147

  • Mischaikow K, Nanda V (2013) Morse theory for filtrations and efficient computation of persistent homology. Discrete Comput Geom 50(2):330–353

  • Munkres JR (1984) Elements of algebraic topology, vol 2. Addison-Wesley, Menlo Park

  • Nanda V Perseus: the persistent homology software. Software available at http://www.sas.upenn.edu/~vnanda/perseus

  • Nguyen D, Xia KL, Wei GW (2016) Generalized flexibility–rigidity index. J Chem Phys 144(23):234106

  • Niyogi P, Smale S, Weinberger S (2011) A topological view of unsupervised learning from noisy data. SIAM J Comput 40:646–663

  • Opron K, Xia KL, Burton ZF, Wei GW (2016) Flexibility rigidity index for protein nucleic acid flexibility and fluctuation analysis. J Comput Chem 37(14):1283–1295

  • Opron K, Xia KL, Wei GW (2014) Fast and anisotropic flexibility–rigidity index for protein flexibility and fluctuation analysis. J Chem Phys 140:234105

  • Opron K, Xia KL, Wei GW (2015) Communication: capturing protein multiscale thermal fluctuations. J Chem Phys 142(21):211101

  • Pachauri D, Hinrichs C, Chung MK, Johnson SC, Singh V (2011) Topology-based kernels with application to inference problems in Alzheimer’s disease. IEEE Trans Med Imaging 30(10):1760–1770

  • Rieck B, Mara H, Leitte H (2012) Multivariate data analysis using persistence-based filtering and topological signatures. IEEE Trans Vis Comput Gr 18:2382–2391

  • Robins V (1999) Towards computing homology from finite approximations. Topol Proc 24:503–532

  • Rucco M, Castiglione F, Merelli E, Pettini M (2016) Characterisation of the idiotypic immune network through persistent entropy. In: Proceedings of ECCS 2014, Springer. pp 117–128

  • Rucco M, Gonzalez-Diaz R, Jimenez MJ, Atienza N, Cristalli C, Concettoni E, Ferrante A, Merelli E (2017) A new topological entropy-based approach for measuring similarities among piecewise linear functions. Signal Process 134:130–138

  • Sapienza PJ, Lee AL (2010) Using NMR to study fast dynamics in proteins: methods and applications. Curr Opin Pharmacol 10(6):723–730

  • Shen MY, Sali A (2006) Statistical potential for assessment and prediction of protein structures. Prot Sci 15(11):2507–2524

  • Silva VD, Ghrist R (2005) Blind swarms for coverage in 2-d. In: Proceedings of robotics: science and systems, pp 01

  • Singh G, Memoli F, Ishkhanov T, Sapiro G, Carlsson G, Ringach DL (2008) Topological analysis of population activity in visual cortex. J Vis 8(8):11.1–18

  • Stites WE, Pranata J (1995) Empirical evaluation of the influence of side chains on the conformational entropy of the polypeptide backbone. Prot Struct Funct Bioinf 22(2):132–140

  • Tausz A, Vejdemo-Johansson M, Adams H (2011) Javaplex: a research software package for persistent (co)homology. Software available at http://code.google.com/p/javaplex

  • Thompson JB, Hansma HG, Hansma PK, Plaxco KW (2002) The backbone conformational entropy of protein folding: experimental measures from atomic force microscopy. J Mol Biol 322(3):645–652

  • Trbovic N, Cho JH, Abel R, Friesner RA, Rance M, Palmer AG III (2008) Protein side-chain dynamics and residual conformational entropy. J Am Chem Soc 131(2):615–622

  • Wang B, Summa B, Pascucci V, Vejdemo-Johansson M (2011) Branching and circular features in high dimensional data. IEEE Trans Vis Comput Gr 17:1902–1911

  • Wang B, Wei GW (2016) Object-oriented persistent homology. J Comput Phys 305:276–299

  • Xia KL, Feng X, Tong YY, Wei GW (2015) Persistent homology for the quantitative prediction of fullerene stability. J Comput Chem 36:408–422

  • Xia KL, Opron K, Wei GW (2013) Multiscale multiphysics and multidomain models—flexibility and rigidity. J Chem Phys 139:194109

  • Xia KL, Opron K, Wei GW (2015) Multiscale Gaussian network model (mGNM) and multiscale anisotropic network model (mANM). J Chem Phys 143(20):204106

  • Xia KL, Wei GW (2014) Persistent homology analysis of protein structure, flexibility and folding. Int J Numer Methods Biomed Eng 30:814–844

  • Xia KL, Wei GW (2015) Multidimensional persistence in biomolecular data. J Comput Chem 36:1502–1520

  • Xia KL, Wei GW (2015) Persistent topology for cryo-EM data analysis. Int J Numer Methods Biomed Eng 31:e02719

  • Xia KL, Zhao ZX, Wei GW (2015) Multiresolution topological simplification. J Comput Biol 22:1–5

  • Yao Y, Sun J, Huang XH, Bowman GR, Singh G, Lesnick M, Guibas LJ, Pande VS, Carlsson G (2009) Topological methods for exploring low-density states in biomolecular folding pathways. J Chem Phys 130:144115

  • Zhang J, Lin M, Chen R, Wang W, Liang J (2008) Discrete state model and accurate estimation of loop entropy of RNA secondary structures. J Chem Phys 128(12):125107

  • Zhong S, Moix JM, Quirk S, Hernandez R (2006) Dihedral-angle information entropy as a gauge of secondary structure propensity. Biophys J 91(11):4014–4023

  • Zomorodian A, Carlsson G (2005) Computing persistent homology. Discrete Comput Geom 33:249–274

  • Zomorodian A, Carlsson G (2008) Localized homology. Comput Geom Theory Appl 41(3):126–148

Acknowledgements

This work was supported in part by Nanyang Technological University Startup Grant M4081842.110 and Singapore Ministry of Education Academic Research Fund Tier 1 M401110000. Zhiming Li thanks the Chinese Scholarship Council for financial support (No. 201506775038). Lin Mu’s research is based upon work supported in part by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under award number ERKJE45, and by the Laboratory Directed Research and Development program at the Oak Ridge National Laboratory, which is operated by UT-Battelle, LLC, for the U.S. Department of Energy under Contract DE-AC05-00OR22725.

Author information

Corresponding author

Correspondence to Kelin Xia.

About this article

Cite this article

Xia, K., Li, Z. & Mu, L. Multiscale Persistent Functions for Biomolecular Structure Characterization. Bull Math Biol 80, 1–31 (2018). https://doi.org/10.1007/s11538-017-0362-6

  • DOI: https://doi.org/10.1007/s11538-017-0362-6
