Abstract
Since the beginning of the epidemic, coronavirus disease 2019 (COVID-19) has caused many problems in public health, the economy, and even cultural and social life. Many studies have been conducted and various omics data have been published in the search for therapeutic solutions, but there is still no early diagnosis method or comprehensive treatment. In this manuscript, important genes related to COVID-19 are collected, and centrality and controllability analyses of protein-protein interaction (PPI) networks and disease-related signaling pathways are used to identify hub and driver genes involved in the formation and progression of the disease. The obtained genes are then evaluated by analyzing expression data. The results show that, in addition to the significant difference in the expression of most of these genes, their expression correlation pattern also differs between the COVID-19 and control groups. Finally, based on drug-gene interactions, drugs affecting the identified genes are presented in the form of a bipartite graph, which can be used to suggest potential drug combinations.
Introduction
More than three years after the beginning of the COVID-19 epidemic, we still do not have a complete understanding of the disease process, of early diagnosis, of appropriate treatment, or of how to deal with new strains. In addition to the many deaths it has caused during this period, COVID-19 is considered a serious risk to public health because of its effect on other diseases [1–4]. It has also challenged the social, cultural, and economic balance of societies [5–9]. To provide treatment solutions, many databases have been built and many bioinformatics studies have been conducted, yielding a variety of results [10–15]. However, because the disease is still poorly understood and the studies differ in their goals and tools, the reported processes and genes involved in the disease cover a wide range. It is therefore necessary to carry out research that integrates these results comprehensively and uses systems biology approaches, which account for how the factors affect each other, to find a useful and comprehensive treatment method. Research that considers the role of each gene in disease progression separately, without its interactions with other genes and biomolecules, is mostly less efficient and leads to more side effects than methods that model the set of factors and their mutual effects as complex networks and use network analysis tools to determine the role of each factor in the formation and progression of the disease [16–23].
Network science is an emerging field that represents complex systems as networks, in which system components are vertices and the connections between components are edges. It has become a powerful conceptual model in computational biology for understanding biological relationships at the system level [24–29]. Depending on the type and size of the network and the purpose of the research, various analytical tools from graph theory are available to determine important vertices and edges, identify motifs and modules, infer global parameters of the network, and compare the network across different conditions. One widely used tool in network analysis is controllability analysis and the identification of driver vertices [30–33].
One of the most practical ways to determine drug targets is to apply controllability algorithms to the network, with the aim of finding the smallest set of vertices, and the strategies, that can move the network from an unfavorable state (disease) to a favorable state (health) through interventions such as drugs and other therapeutic agents. Several studies have taken a systems biology perspective and analyzed the resulting bioinformatic networks in various ways, including controllability analysis and driver vertex identification, to obtain a collection of important and effective genes in COVID-19 [34–37].
Besides network analysis of omics data and the identification of influential and driver vertices to provide therapeutic solutions for COVID-19, several studies have used machine learning methods to identify signatures, and the rules relating them, for diagnosing COVID-19. For example, applying multiple machine learning methods to transcriptomic data from upper airway tissue of patients with acute respiratory illnesses led to the identification of qualitative biomarkers and quantitative rules for distinguishing COVID-19 patients from patients with other infections [38]. In another work, methylation data were first analyzed with the Monte Carlo feature selection method to obtain a feature list, and a decision tree was then used to identify methylation features and decision rules that clearly distinguish different cases [39]. Key methylation sites that show distinct patterns among COVID-19 patients of different ages have also been identified [40]. In addition, biomarkers related to COVID-19 have been classified according to the severity of the disease [41], and microRNA and methylation signatures, together with the rules between them that determine the severity of COVID-19 in different patients, have been identified using machine learning methods [42,43].
In this manuscript, to take into account the results of the various studies conducted to obtain genes related to COVID-19, genes that have been reported with a significant p value or with a large number of references in different articles were first collected. Then a PPI network was formed between the corresponding proteins, centrality analysis was performed, and proteins with many connections were identified. Next, with these proteins as targets, the vertices with the highest control power were determined in the directed network of signaling pathways related to COVID-19. Correlation and differentially expressed gene (DEG) analyses of expression data then showed that, first, most of the identified hub and driver genes are expressed at significantly different levels in the COVID-19 and control groups and, second, their expression correlation patterns change. Finally, using the known connections between existing drugs and the obtained gene set, a bipartite drug-gene network was built to determine the drugs affecting each of the obtained genes; these could serve as potential combinations of existing drugs for repurposing to treat COVID-19.
Results
Collection of genes related to COVID-19 based on literature
In the first stage, the set of genes related to COVID-19 was collected from the literature using two databases: CORMINE Medical online and DisGeNET. In the CORMINE database, related genes are ranked by p value; the 564 genes with a p value less than 0.05 were selected (Table S1). The DisGeNET database ranks related genes by number of references; the 296 genes with more than 5 references were selected (Table S2). Because some genes appear in both sets, merging them yielded 757 genes related to COVID-19 that have many references or a small p value, and these were used for further analysis.
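The merging step described above amounts to a set union of two filtered lists. The following is a minimal sketch of that logic, not the authors' code: the function names and the toy records are hypothetical, and no values are taken from CORMINE or DisGeNET. The reported counts imply that 564 + 296 − 757 = 103 genes are shared by the two sets.

```python
def select_by_pvalue(records, alpha=0.05):
    """Keep genes whose reported p value is below alpha."""
    return {gene for gene, p_value in records if p_value < alpha}


def select_by_references(records, min_refs=5):
    """Keep genes supported by more than min_refs references."""
    return {gene for gene, n_refs in records if n_refs > min_refs}


# Hypothetical toy records (placeholder gene labels, invented numbers).
pvalue_ranked = [("GENE_A", 0.001), ("GENE_B", 0.20), ("GENE_C", 0.03)]
reference_ranked = [("GENE_A", 12), ("GENE_D", 7), ("GENE_E", 2)]

by_p = select_by_pvalue(pvalue_ranked)
by_refs = select_by_references(reference_ranked)

merged = by_p | by_refs   # genes passing either filter
shared = by_p & by_refs   # genes passing both filters

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
assert len(merged) == len(by_p) + len(by_refs) - len(shared)
```

Applying the same inclusion-exclusion identity to the counts in the text gives the size of the overlap without needing the gene lists themselves.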
PPI network construction and analysis
Next, among the collected genes related to COVID-19, those with many connections to, and effects on, other genes were identified. For this purpose, a PPI network of the corresponding proteins was built from the STRING database, which covers both functional and structural relationships (Fig. 1a). The 10 proteins with the highest degree in the constructed network are indicated in the figure along with their number of connections. In addition, all proteins with a degree greater than 50 were identified.
Next, it was checked whether the identified hub proteins interact with the COVID-19 proteins. For this purpose, the human genes associated with COVID-19 that were previously obtained from a virus-human protein interaction network using gene ontology were used [44]. Of the 10 identified hub proteins, 9 have such interactions and were retained for further analysis (ACTB has none). Likewise, of the 310 proteins with a degree higher than 50, 239 have interactions and 71 do not; the two groups are listed in Table S3. The 239 proteins with a degree of more than 50 and an interaction were taken as the control targets in the signaling pathways related to COVID-19.
Controllability of signaling pathways related to COVID-19
The directed network of the COVID-19 signaling pathways captures which genes act on which others and thus specifies the functional relationships among genes during disease progression. For the controllability analysis, the signaling pathways in the KEGG database [45] were therefore used. At this stage, the directed network of the COVID-19 signaling pathway (Coronavirus disease - COVID-19 - Homo sapiens) was extracted (Fig. 1b). Next, the vertices that this directed network shares with the set of proteins having a degree higher than 50 and interacting with COVID-19 proteins (from the previous step) were taken as the control target (vertices marked in green). Then, using the target controllability algorithm with the fewest mediator vertices [33], the 10 vertices with the most power to control the target set were determined (the driver vertices are marked in the figure).
By the definition of the vertices with the highest control power, each of the obtained vertices must control a large number of the target vertices; in turn, each target vertex has many connections in the PPI network. The obtained driver vertices should therefore affect many proteins in the PPI network of COVID-19-related proteins.
IL6 is among the top 9 proteins in the PPI network and also among the top 10 drivers in the signaling pathways. A total of 18 proteins were therefore considered as the set of hub and driver proteins for further analysis. The identified driver and hub genes overlap with the signatures associated with COVID-19 that machine learning methods have identified. For example, TNF is one of the essential biomarkers of COVID-19 [38], and TNF and INS have been obtained as transcriptional biomarkers for severe COVID-19 by machine learning methods [41].
Analysis of expression data
Next, expression data were analyzed to evaluate the obtained results. For this purpose, the expression data set GSE163151 was used, which contains 404 expression profiles collected from nasal swabs and blood of healthy people and people suffering from COVID-19 and other viral and bacterial infections [46]. Of these, 31 profiles are from healthy people (control group) and 145 are from people with COVID-19. First, DEG analysis was performed between the control and COVID-19 groups. Figure 2a,b shows the principal component analysis of the two groups and the heatmap of the 50 genes with the largest expression changes [15]. Then co-expression analysis was performed on the hub and driver genes obtained in the previous steps. Figure 2c,d shows the expression correlation in the COVID-19 group and the control group, respectively. The correlation pattern of gene expression differs between the two groups; in general, the correlation is higher in the COVID-19 group.
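The co-expression comparison described above can be sketched as follows. This is an illustration only: the source does not state which correlation coefficient was used, so Pearson correlation is an assumption here, and the gene labels and expression values are hypothetical toy data rather than values from GSE163151.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(expr):
    """expr maps gene -> expression values across the samples of one group."""
    genes = sorted(expr)
    return {(g, h): pearson(expr[g], expr[h]) for g in genes for h in genes}


# Hypothetical toy data: two genes, four samples per group.
covid = {"G1": [1.0, 2.0, 3.0, 4.0], "G2": [2.0, 4.0, 6.0, 8.0]}
control = {"G1": [1.0, 2.0, 3.0, 4.0], "G2": [1.0, 3.0, 2.0, 4.0]}

covid_corr = correlation_matrix(covid)
control_corr = correlation_matrix(control)

# Change in pairwise correlation between the two groups.
delta = {pair: covid_corr[pair] - control_corr[pair] for pair in covid_corr}
```

In this toy case the G1-G2 correlation is 1.0 in the first group and 0.8 in the second, so `delta` records a shift of about 0.2 for that pair; a matrix of such shifts is one way to summarize how the correlation pattern changes between groups.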
Of the 18 selected hub and driver genes, 13 are differentially expressed between the COVID-19 and control groups with a p value less than 0.05, as shown in Table 1. The AveExpr and Log2FC of each gene are given in the second and third columns, respectively.
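The two summary columns can be illustrated with a small sketch. It assumes the usual convention for these column names, in which expression values are already on a log2 scale, Log2FC is the difference between the group means, and AveExpr is the mean over all samples; the source does not name the software used, so this convention is an assumption. The function name and the numbers are hypothetical, and the significance test that produces the p value is not shown.

```python
from statistics import mean


def summarize_gene(case_log2, control_log2):
    """Summarize one gene from log2-scale expression values.

    Log2FC: mean of the case group minus mean of the control group.
    AveExpr: mean over all samples from both groups.
    """
    log2fc = mean(case_log2) - mean(control_log2)
    ave_expr = mean(list(case_log2) + list(control_log2))
    return {"AveExpr": ave_expr, "Log2FC": log2fc}


# Hypothetical log2 expression values for a single gene.
summary = summarize_gene(case_log2=[6.0, 7.0, 8.0], control_log2=[4.0, 5.0])
```

Here the case mean is 7.0 and the control mean is 4.5, so Log2FC is 2.5 (roughly a 5.7-fold increase on the linear scale), while AveExpr is 6.0.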
Identification of drugs related to the obtained gene set
To identify potential drug combinations against COVID-19, existing drugs that affect each of the identified hub and driver genes were retrieved from DrugCentral [47] and KEGG pathways, with the genes treated as potential drug targets. The related genes and drugs are shown as a bipartite graph in Fig. 3. In addition, following the process used to find candidate drugs for lung cancer [48], the STITCH database, which records chemical-chemical and chemical-protein interactions, was used to determine five chemical compounds corresponding to each of the 18 driver and hub genes; these are presented in Figures S1 and S2.
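A bipartite drug-gene graph of this kind can be represented by two adjacency maps built from a list of (drug, gene) pairs. The sketch below is illustrative only: the function name is hypothetical and the pairs use placeholder labels, not interactions from DrugCentral, KEGG, or Fig. 3.

```python
from collections import defaultdict


def build_bipartite(pairs):
    """Build both sides of a bipartite drug-gene graph from (drug, gene) pairs."""
    drugs_by_gene = defaultdict(set)
    genes_by_drug = defaultdict(set)
    for drug, gene in pairs:
        drugs_by_gene[gene].add(drug)
        genes_by_drug[drug].add(gene)
    return dict(drugs_by_gene), dict(genes_by_drug)


# Hypothetical placeholder interactions.
pairs = [
    ("DRUG_1", "GENE_A"),
    ("DRUG_2", "GENE_A"),
    ("DRUG_2", "GENE_B"),
    ("DRUG_3", "GENE_C"),
]

drugs_by_gene, genes_by_drug = build_bipartite(pairs)

# Each edge appears once on the gene side, so this counts distinct edges.
n_edges = sum(len(drugs) for drugs in drugs_by_gene.values())

# Drugs acting on more than one gene in the set.
multi_target = sorted(d for d, genes in genes_by_drug.items() if len(genes) > 1)
```

Counting genes, drugs, and edges in such a structure gives the kind of summary reported for the graph, and drugs that hit different genes are natural candidates to consider together as a combination.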
There is considerable evidence of relationships between the identified drugs and Remdesivir, a well-known and widely used antiviral medicine that acts against the virus that causes COVID-19. These include the following. The metabolism of Remdesivir can be decreased when combined with Midostaurin [49]. The metabolism of Bortezomib, Digitoxin, Sunitinib, Ceritinib, Crizotinib, Ruxolitinib, and Nintedanib can be decreased when combined with Remdesivir [49]. The excretion of Raloxifene can be decreased when combined with Remdesivir [50]. The metabolism of Tofacitinib can be decreased when combined with Remdesivir [51]. The serum concentration of Remdesivir can be increased when it is combined with Fedratinib [52]. Details of each of the 61 drug-gene interactions shown in Fig. 3 between 14 genes and 48 drugs are given in Table S4, which specifies the signaling pathways and the mechanism of drug action on the relevant gene, along with the structure id, target class, accession, action value, and action type.
Conclusion
Because of the importance of finding a treatment for COVID-19, various omics data sets related to it have been generated, and many bioinformatics studies have been conducted; depending on the data selection and analysis method, these have sometimes produced different results. A comprehensive analysis therefore needs to consider all of the reported results.
In this paper, to build a PPI network related to COVID-19 from the results reported in different articles, a set of genes was selected that either have a large number of references confirming their association with COVID-19 or are significant in the development of COVID-19 according to the p value criterion. In the PPI network, which includes the functional and physical connections of the selected proteins, the vertices with the highest degree were selected; a change in the concentration of these proteins will therefore affect a large number of proteins related to COVID-19. The controllability analysis of the directed network of COVID-19 signaling pathways, in turn, led to the identification of driver vertices. Network control methods look for vertices that play a stimulating role: when they are affected by external signals such as drugs or any other treatment, they can hierarchically affect all network vertices and create the desired change. The two analyses were not performed independently; rather, the control target vertices are proteins that have many connections in the PPI network. Since high control power results from long paths and high degree results from many connections, the two properties are complementary. Therefore, when the controlled vertices have a high degree, the influence of the stimulating vertices in the network is amplified.
To evaluate the significance of the obtained results, expression data were analyzed. The results confirm that most of the obtained genes are expressed differently in the COVID-19 and control groups and that the expression correlation between pairs of genes also differs between the two groups. Finally, the most important drugs affecting each of the obtained genes are shown as a bipartite network, which can potentially provide new drug suggestions for the treatment of COVID-19.
The process carried out in this paper to propose drug combinations for COVID-19 first collects and filters disease-related genes based on the literature. It then performs centrality and controllability analysis on the PPI network and the related signaling pathways to identify hub and driver genes. Next, the obtained results are evaluated by analyzing expression data. Finally, appropriate drug combinations are suggested based on various drug-gene relationships. This process can serve as a general pipeline that can be considered for any other disease.
Material and methods
Hub vertices selection
Hub vertices in the PPI network are determined by the degree centrality criterion: the 10 proteins with the highest degree are selected as the hub proteins.
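Degree-based hub selection can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the network is given as an undirected edge list, ties are broken alphabetically (a choice made here, not taken from the source), and the function names and toy edges are hypothetical.

```python
from collections import defaultdict


def degrees(edges):
    """Degree of every vertex in an undirected graph given as an edge list."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)
    return {node: len(nbrs) for node, nbrs in neighbours.items()}


def top_hubs(edges, k=10):
    """The k vertices with the highest degree (ties broken by name)."""
    deg = degrees(edges)
    return sorted(deg.items(), key=lambda item: (-item[1], item[0]))[:k]


def above_threshold(edges, min_degree=50):
    """All vertices whose degree is strictly greater than min_degree."""
    return {node for node, d in degrees(edges).items() if d > min_degree}


# Hypothetical toy PPI edges with placeholder protein labels.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]

hubs = top_hubs(toy_edges, k=2)
well_connected = above_threshold(toy_edges, min_degree=1)
```

`top_hubs` corresponds to picking the 10 highest-degree proteins, while `above_threshold` corresponds to the broader set of proteins with a degree greater than 50 used as candidate control targets.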
Controllability of complex networks
A dynamical system is controllable if, with appropriate input signals, its state can be driven from any initial state to any desired final state in a finite number of steps. Exact controllability is checked with Kalman's controllability rank condition. Structural controllability, in contrast, is checked with the minimum input theorem, which builds on Lin's structural controllability theorem and relates the driver vertices to a maximum matching of the network.
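The link between maximum matching and driver vertices can be made concrete with a short sketch. Under the minimum input theorem [30], a vertex is matched if a matching edge points to it, the unmatched vertices are driver vertices, and if every vertex is matched a single driver suffices. The code below uses a simple augmenting-path matching (Kuhn's algorithm), which is an implementation choice made here for brevity; the function names and toy graphs are hypothetical. A maximum matching is generally not unique, so the driver set returned is one valid solution among possibly many.

```python
def maximum_matching(nodes, edges):
    """Maximum matching of a directed graph via augmenting paths.

    Returns a dict mapping each matched vertex (edge head) to the
    vertex (edge tail) whose outgoing edge matches it.
    Recursive, so intended for small illustrative graphs only.
    """
    successors = {n: [] for n in nodes}
    for u, v in edges:
        successors[u].append(v)
    matched_by = {}

    def try_augment(u, visited):
        for v in successors[u]:
            if v in visited:
                continue
            visited.add(v)
            # Use v if it is free, or if its current partner can be re-routed.
            if v not in matched_by or try_augment(matched_by[v], visited):
                matched_by[v] = u
                return True
        return False

    for u in nodes:
        try_augment(u, set())
    return matched_by


def driver_vertices(nodes, edges):
    """Unmatched vertices; one arbitrary vertex if the matching is perfect."""
    matched_by = maximum_matching(nodes, edges)
    unmatched = [n for n in nodes if n not in matched_by]
    return unmatched if unmatched else [nodes[0]]


chain = driver_vertices(["a", "b", "c"], [("a", "b"), ("b", "c")])
star = driver_vertices(["a", "b", "c"], [("a", "b"), ("a", "c")])
cycle = driver_vertices(["a", "b"], [("a", "b"), ("b", "a")])
```

A directed chain needs only its first vertex as a driver, whereas a star needs the hub plus one leaf because the hub can match only one of its successors; a cycle is perfectly matched and can be driven from any single vertex.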
Controllability with minimum mediator
The target controllability algorithm with minimum mediators, which is based on the lengths of the paths between vertices in the network, identifies driver vertices that can control the desired target set through the fewest intermediary vertices.
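The path-length idea can be illustrated with a simplified proxy. The sketch below is not the published algorithm of ref. [33]; it only ranks candidate vertices by how many target vertices they reach along directed paths and, as a tie-breaker, by the total number of intermediate vertices on the shortest paths to those targets. The scoring rule, function names, and toy network are all assumptions made for illustration.

```python
from collections import deque


def bfs_distances(successors, source):
    """Shortest directed path length from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in successors.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def rank_drivers(edges, targets, k=10):
    """Rank vertices by targets reached, then by fewest mediators.

    Returns (vertex, targets_reached, total_mediators) tuples, where the
    mediators on a shortest path of length d are its d - 1 inner vertices.
    """
    successors = {}
    nodes = set()
    for u, v in edges:
        successors.setdefault(u, []).append(v)
        nodes.update((u, v))
    scores = []
    for node in sorted(nodes):
        dist = bfs_distances(successors, node)
        reached = [t for t in targets if t in dist and t != node]
        mediators = sum(dist[t] - 1 for t in reached)
        scores.append((node, len(reached), mediators))
    scores.sort(key=lambda s: (-s[1], s[2], s[0]))
    return scores[:k]


# Hypothetical toy signaling network: s -> m -> t1, s -> t2, x -> t1.
toy_edges = [("s", "m"), ("m", "t1"), ("s", "t2"), ("x", "t1")]
ranking = rank_drivers(toy_edges, targets={"t1", "t2"}, k=2)
```

In the toy network, `s` reaches both targets using one mediator in total, so it ranks first, ahead of vertices that reach a single target directly.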
Control centrality
Control centrality is a centrality measure that seeks to identify the vertices that control the largest number of vertices in the whole network or in a desired target set. The 10 vertices with the highest control power in the directed network of the KEGG signaling pathway were identified.
Data availability
All data generated or analysed during this study are included in this published article and its supplementary information files.
References
Baral, S. et al. Competing health risks associated with the COVID-19 pandemic and early response: A scoping review. PLoS One 17, e0273389 (2022).
Connor, J. et al. Health risks and outcomes that disproportionately affect women during the Covid-19 pandemic: A review. Soc. Sci. Med. 266, 113364 (2020).
Moreno, C. et al. How mental health care should change as a consequence of the COVID-19 pandemic. Lancet Psychiatry 7, 813–824 (2020).
Gavin, B., Lyne, J. & McNicholas, F. Mental health and the COVID-19 pandemic. Ir. J. Psychol. Med. 37, 156–158 (2020).
Ratten, V. Coronavirus (covid-19) and entrepreneurship: Changing life and work landscape. J. Small Bus. Entrep. 32, 503–516 (2020).
Cheer, J. M. Human flourishing, tourism transformation and COVID-19: A conceptual touchstone. Tour. Geogr. 22, 514–524 (2020).
Ratten, V. Coronavirus (Covid-19) and entrepreneurship: Cultural, lifestyle and societal changes. J. Entrepreneurship Emerg. Econ. 13, 747–761 (2021).
He, H. & Harris, L. The impact of Covid-19 pandemic on corporate social responsibility and marketing philosophy. J. Bus. Res. 116, 176–182 (2020).
Mofijur, M. et al. Impact of COVID-19 on the social, economic, environmental and energy domains: Lessons learnt from a global pandemic. Sustain. Product. Consum. 26, 343–359 (2021).
Stasi, C., Fallani, S., Voller, F. & Silvestri, C. Treatment for COVID-19: An overview. Eur. J. Pharmacol. 889, 173644 (2020).
Li, X. et al. Network bioinformatics analysis provides insight into drug repurposing for COVID-19. Med. Drug Discov. 10, 100090 (2021).
Li, R. et al. Network Pharmacology and bioinformatics analyses identify intersection genes of niacin and COVID-19 as potential therapeutic targets. Brief. Bioinform. 22, 1279–1290 (2021).
Aghdam, R., Habibi, M. & Taheri, G. Using informative features in machine learning based method for COVID-19 drug repurposing. J. Cheminform. 13, 1–14 (2021).
Masoudi-Sobhanzadeh, Y., Esmaeili, H. & Masoudi-Nejad, A. A fuzzy logic-based computational method for the repurposing of drugs against COVID-19. BioImpacts BI 12, 315 (2022).
Zhang, W. et al. COVID19db: A comprehensive database platform to discover potential drugs and targets of COVID-19 at whole transcriptomic scale. Nucleic Acids Res. 50, D747–D757 (2022).
Kitano, H. Computational systems biology. Nature 420, 206–210 (2002).
Vandamme, D., Minke, B. A., Fitzmaurice, W., Kholodenko, B. N. & Kolch, W. Systems biology-embedded target validation: Improving efficacy in drug discovery. Wiley Interdiscipl. Rev.: Syst. Biol. Med. 6, 1–11 (2014).
Kinnings, S. L. et al. Drug discovery using chemical systems biology: Repositioning the safe medicine Comtan to treat multi-drug and extensively drug resistant tuberculosis. PLoS Comput. Biol. 5, e1000423 (2009).
Prathipati, P. & Mizuguchi, K. Systems biology approaches to a rational drug discovery paradigm. Curr. Top. Med. Chem. 16, 1009–1025 (2016).
Bugrim, A., Nikolskaya, T. & Nikolsky, Y. Early prediction of drug metabolism and toxicity: Systems biology approach and modeling. Drug Discov. Today 9, 127–135 (2004).
Davidov, E., Holland, J., Marple, E. & Naylor, S. Advancing drug discovery through systems biology. Drug Discov. Today 8, 175–183 (2003).
Arrell, D. & Terzic, A. Network systems biology for drug discovery. Clin. Pharmacol. Ther. 88, 120–125 (2010).
Wagner, H. J., Weber, W. & Fussenegger, M. Synthetic biology: Emerging concepts to design and advance adeno-associated viral vectors for gene therapy. Adv. Sci. 8, 2004018 (2021).
Pavlopoulos, G. A. et al. Using graph theory to analyze biological networks. BioData Min. 4, 1–27 (2011).
Aghdam, R. et al. Inferring gene regulatory networks by PCA-CMI using Hill climbing algorithm based on MIT score and SORDER method. Int. J. Biomath. 9, 1650040 (2016).
Ma’ayan, A. Insights into the organization of biochemical regulatory networks using graph theory analyses. J. Biol. Chem. 284, 5451–5455 (2009).
Giuliani, A., Krishnan, A., Zbilut, J. P. & Tomita, M. Proteins as networks: Usefulness of graph theory in protein science. Curr. Protein Pept. Sci. 9, 28–38 (2008).
Kantelis, K. F. et al. Graph theory-based simulation tools for protein structure networks. Simulat. Modell. Pract. Theory 121, 102640 (2022).
Zhou, Z. & Guang, H. Applications of graph theory in studying protein structure, dynamics, and interactions. J. Math. Chem. https://doi.org/10.1007/s10910-023-01511-6 (2023).
Liu, Y.-Y., Slotine, J.-J. & Barabási, A.-L. Controllability of complex networks. Nature 473, 167–173 (2011).
Yuan, Z., Zhao, C., Di, Z., Wang, W.-X. & Lai, Y.-C. Exact controllability of complex networks. Nat. Commun. 4, 2447 (2013).
Gao, J., Liu, Y.-Y., D’souza, R. M. & Barabási, A.-L. Target control of complex networks. Nat. Commun. 5, 5415 (2014).
Ebrahimi, A., Nowzari-Dalini, A., Jalili, M. & Masoudi-Nejad, A. Target controllability with minimal mediators in complex biological networks. Genomics 112, 4938–4944 (2020).
Popescu, V.-B., Kanhaiya, K., Năstac, D. I., Czeizler, E. & Petre, I. Network controllability solutions for computational drug repurposing using genetic algorithms. Sci. Rep. 12, 1437 (2022).
Guo, W.-F. et al. Network controllability-based algorithm to target personalized driver genes for discovering combinatorial drugs of individual patients. Nucleic Acids Res. 49, e37 (2021).
Habibi, M., Taheri, G. & Aghdam, R. A SARS-CoV-2 (COVID-19) biological network to find targets for drug repurposing. Sci. Rep. 11, 9378 (2021).
Sharma, A., Cinti, C. & Capobianco, E. Multitype network-guided target controllability in phenotypically characterized osteosarcoma: role of tumor microenvironment. Front. Immunol. 8, 918 (2017).
Zhang, Y.-H. et al. Identifying transcriptomic signatures and rules for SARS-CoV-2 infection. Front. Cell Dev. Biol. 8, 627302 (2021).
Li, Z. et al. Identifying methylation signatures and rules for COVID-19 with machine learning methods. Front. Mol. Biosci. 9, 908080 (2022).
Chen, L. et al. Identification of DNA methylation signature and rules for SARS-CoV-2 associated with age. Front. Biosci.-Landmark 27, 204 (2022).
Li, X. et al. Identification of transcriptome biomarkers for severe COVID-19 with machine learning methods. Biomolecules 12, 1735 (2022).
Ren, J., Guo, W., Feng, K., Huang, T. & Cai, Y. Identifying MicroRNA markers that predict COVID-19 severity using machine learning methods. Life 12, 1964 (2022).
Liu, Z. et al. Identification of methylation signatures and rules for predicting the severity of SARS-CoV-2 infection with machine learning methods. Front. Microbiol. 13, 1007295 (2022).
Zhang, Y. et al. Identification of COVID-19 infection-related human genes based on a random walk model in a virus–human protein interaction network. BioMed Res. Int. 2020, 1–7 (2020).
Kanehisa, M. & Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
Ng, D. L. et al. A diagnostic host response biosignature for COVID-19 from RNA profiling of nasal swabs and blood. Sci. Adv. 7, eabe5984 (2021).
Ursu, O. et al. DrugCentral: Online drug compendium. Nucleic Acids Res. 45, gkw993 (2016).
Lu, J. et al. Identification of new candidate drugs for lung cancer using chemical–chemical interactions, chemical–protein interactions and a K-means clustering algorithm. J. Biomol. Struct. Dyn. 34, 906–917 (2016).
Zhou, S.-F. Drugs behave as substrates, inhibitors and inducers of human cytochrome P450 3A4. Curr. Drug Metab. 9, 310–322 (2008).
Karlgren, M. et al. In vitro and in silico strategies to identify OATP1B1 inhibitors and predict clinical drug–drug interactions. Pharm. Res. 29, 411–426 (2012).
Dowty, M. E. et al. The pharmacokinetics, metabolism, and clearance mechanisms of tofacitinib, a janus kinase inhibitor, in humans. Drug Metab. Dispos. 42, 759–773 (2014).
Ogasawara, K. et al. Assessment of effects of repeated oral doses of fedratinib on inhibition of cytochrome P450 activities in patients with solid tumors using a cocktail approach. Cancer Chemother. Pharmacol. 86, 87–95 (2020).
Acknowledgements
The authors would like to thank Alzahra University for supporting this research.
Contributions
F.R. designed and conceptualized the project and developed analytical calculations. A.E. designed the model, and performed the simulations. A.E. drafted the manuscript and F.R. critically revised the manuscript. All authors reviewed the manuscript and gave final approval for publication.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ebrahimi, A., Roshani, F. Systems biology approaches to identify driver genes and drug combinations for treating COVID-19. Sci Rep 14, 2257 (2024). https://doi.org/10.1038/s41598-024-52484-8
DOI: https://doi.org/10.1038/s41598-024-52484-8