Persistent Homology for RNA Data Analysis

  • Protocol
  • First Online:
Homology Modeling

Part of the book series: Methods in Molecular Biology (MIMB, volume 2627)

Abstract

Molecular representations are of great importance for machine learning models in RNA data analysis. Efficient molecular descriptors or fingerprints that characterize the intrinsic structural and interactional information of RNAs can significantly boost the performance of learning models. In this paper, we introduce two types of persistent models, persistent homology and persistent spectral models, for representing RNA structures and interactions, together with their applications in RNA data analysis. Unlike traditional geometric and graph representations, persistent homology is built on simplicial complexes, which generalize graph models to higher dimensions. Hypergraphs are a further generalization of simplicial complexes, and hypergraph-based embedded persistent homology has been proposed recently. Moreover, persistent spectral models, which combine a filtration process with spectral models (spectral graphs, spectral simplicial complexes, and spectral hypergraphs), have been proposed for molecular representation. The persistent attributes of RNAs obtained from these two types of models can be combined with machine learning models to analyze RNA structure, flexibility, dynamics, and function.
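
As a rough illustration of what a filtration-based descriptor looks like, the sketch below computes only the dimension-0 persistence barcode (connected components) of a point cloud. It assumes a Vietoris-Rips-style filtration driven by a growing distance cutoff, which is one common choice and not necessarily the construction used in the chapter. The function name `h0_barcode` and the toy coordinates are hypothetical and stand in for real RNA atom positions.

```python
from itertools import combinations
from math import dist


def h0_barcode(points):
    """Return the H0 bars (birth, death) of a distance-based filtration.

    Every point is born at cutoff 0. When an edge first connects two
    separate components, one of them dies at that edge's length.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        # Find the component representative, shortening paths as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges enter the filtration in order of increasing length.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j      # merge the two components
            bars.append((0.0, length))   # one component dies here
    bars.append((0.0, float("inf")))     # the last component never dies
    return bars


# Toy coordinates (hypothetical, not real RNA atoms): two separated clusters.
toy_points = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
    (10.0, 0.0, 0.0), (11.0, 0.0, 0.0),
]
barcode = h0_barcode(toy_points)
for birth, death in barcode:
    print(f"H0 bar: [{birth}, {death})")
```

In this toy case the short bars reflect merges within each cluster, while the single long finite bar records the cutoff at which the two clusters join. Bar lengths of this kind, typically computed in higher dimensions as well, are the sort of persistent attributes that can be passed to a machine learning model.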

Author information

Corresponding author

Correspondence to Kelin Xia.

Copyright information

© 2023 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Xia, K., Liu, X., Wee, J. (2023). Persistent Homology for RNA Data Analysis. In: Filipek, S. (eds) Homology Modeling. Methods in Molecular Biology, vol 2627. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2974-1_12

  • DOI: https://doi.org/10.1007/978-1-0716-2974-1_12

  • Published:

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2973-4

  • Online ISBN: 978-1-0716-2974-1

  • eBook Packages: Springer Protocols
