Statistical Identification of Important Nodes in Biological Systems

Abstract

Biological systems can be modeled and described by biological networks, which are typical complex networks with wide real-world applications. Many problems arising in biological systems can be reduced to the identification of important nodes. For example, biomedical researchers frequently need to identify important genes that potentially lead to disease phenotypes in animals, or to explore crucial genes that are responsible for stress responsiveness in plants. To identify important nodes in a biological system, one needs to know either the network structure or behavioral data of the nodes (such as gene expression data). If the network topology is known, various centrality measures can be developed to solve the problem; if only behavioral data of the nodes are given, more sophisticated statistical methods can be employed. This paper reviews some recent works on the statistical identification of important nodes in biological systems from three aspects: 1) in general complex networks, based on complex network theory and epidemic dynamic models; 2) in biological networks, based on network motifs; and 3) in plants, based on RNA-seq data. The identification of important nodes in a complex system can be seen as a mapping from the system to a ranking score vector of its nodes; such a mapping does not necessarily have an explicit form. The three aspects reflect three typical approaches to ranking nodes in biological systems and can be integrated into one general framework. This paper also discusses some challenges and future work on related topics. The associated investigations have potential real-world applications in the control of biological systems, network medicine, and the cultivation of new crop varieties.
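As a minimal illustration of the "mapping from the system to a ranking score vector" in the simplest case where the network topology is known, the sketch below ranks the nodes of a small undirected network by degree centrality. The edge list, the gene names, and the function name `degree_ranking` are hypothetical and chosen only for illustration; degree centrality is just one of many possible centrality measures and is not claimed to be the method used in the reviewed works.

```python
from typing import Dict, List, Set, Tuple


def degree_ranking(edges: List[Tuple[str, str]]) -> List[Tuple[str, float]]:
    """Map an undirected network to a ranked list of (node, score) pairs.

    The score is normalized degree centrality: degree / (n - 1),
    where n is the number of nodes that appear in the edge list.
    """
    neighbors: Dict[str, Set[str]] = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    n = len(neighbors)
    scale = n - 1 if n > 1 else 1  # avoid division by zero for tiny graphs
    scores = {node: len(nbrs) / scale for node, nbrs in neighbors.items()}

    # Sort by descending score; break ties alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy network: geneA interacts with three other genes.
toy_edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneB", "geneC"),
]
ranking = degree_ranking(toy_edges)
for node, score in ranking:
    print(f"{node}\t{score:.3f}")
```

In this toy network the most connected node is ranked first. Other mappings, such as motif-based scores or statistics derived from expression data, would replace only the scoring step while keeping the same "system in, ranked score vector out" shape.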

Author information

Corresponding author

Correspondence to Pei Wang.

Additional information

This paper was supported by the National Natural Science Foundation of China under Grant No. 61773153, the Natural Science Foundation of Henan under Grant No. 202300410045, the Supporting Plan for Scientific and Technological Innovative Talents in Universities of Henan Province under Grant No. 20HASTIT025, and the Training Plan of Young Key Teachers in Colleges and Universities of Henan Province under Grant No. 2018GGJS021. It was also partly supported by the Supporting Grant of the Bioinformatics Center of Henan University under Grant No. 2018YLJC03.

This paper was recommended for publication by Editor GUO Jin.

About this article

Cite this article

Wang, P. Statistical Identification of Important Nodes in Biological Systems. J Syst Sci Complex (2021). https://doi.org/10.1007/s11424-021-0001-2

  • DOI: https://doi.org/10.1007/s11424-021-0001-2

Keywords

  • Biological network
  • Complex network
  • Important node
  • Network motif
  • RNA-seq